From patchwork Wed Jul 14 15:06:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 12377219 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 32CE1C11F69 for ; Wed, 14 Jul 2021 15:11:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1CC55613B6 for ; Wed, 14 Jul 2021 15:11:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239577AbhGNPOJ (ORCPT ); Wed, 14 Jul 2021 11:14:09 -0400 Received: from frasgout.his.huawei.com ([185.176.79.56]:3404 "EHLO frasgout.his.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239576AbhGNPOJ (ORCPT ); Wed, 14 Jul 2021 11:14:09 -0400 Received: from fraeml703-chm.china.huawei.com (unknown [172.18.147.200]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4GQ12Y4Q6hz6K64C; Wed, 14 Jul 2021 23:02:45 +0800 (CST) Received: from lhreml724-chm.china.huawei.com (10.201.108.75) by fraeml703-chm.china.huawei.com (10.206.15.52) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Wed, 14 Jul 2021 17:11:15 +0200 Received: from localhost.localdomain (10.69.192.58) by lhreml724-chm.china.huawei.com (10.201.108.75) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 16:11:13 +0100 From: John Garry To: CC: , , , , , , John Garry Subject: [PATCH 1/9] blk-mq: Change rqs check in blk_mq_free_rqs() Date: Wed, 14 Jul 2021 23:06:27 +0800 Message-ID: <1626275195-215652-2-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1626275195-215652-1-git-send-email-john.garry@huawei.com> References: <1626275195-215652-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.69.192.58] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To lhreml724-chm.china.huawei.com (10.201.108.75) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org The original code in commit 24d2f90309b23 ("blk-mq: split out tag initialization, support shared tags") would check tags->rqs is non-NULL and then dereference tags->rqs[]. Then in commit 2af8cbe30531 ("blk-mq: split tag ->rqs[] into two"), we started to dereference tags->static_rqs[], but continued to check non-NULL tags->rqs. Check tags->static_rqs as non-NULL instead, which is more logical. 
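For reference, the two request arrays involved look roughly like this (an abridged sketch of struct blk_mq_tags as found in block/blk-mq-tag.h around this time - not the full definition):

	struct blk_mq_tags {
		unsigned int		nr_tags;
		unsigned int		nr_reserved_tags;
		...
		struct request		**rqs;		/* in-flight requests, filled as tags are assigned */
		struct request		**static_rqs;	/* preallocated requests, filled at init time */
		struct list_head	page_list;	/* backing pages for static_rqs */
	};

blk_mq_free_rqs() walks the preallocated entries and calls ->exit_request() on them, so static_rqs[] is the array whose presence actually matters for the check below.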
Signed-off-by: John Garry Reviewed-by: Ming Lei --- block/blk-mq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 2c4ac51e54eb..ae28f470893c 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2348,7 +2348,7 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, { struct page *page; - if (tags->rqs && set->ops->exit_request) { + if (tags->static_rqs && set->ops->exit_request) { int i; for (i = 0; i < tags->nr_tags; i++) { From patchwork Wed Jul 14 15:06:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 12377221 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95989C11F66 for ; Wed, 14 Jul 2021 15:11:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8071E613CB for ; Wed, 14 Jul 2021 15:11:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239594AbhGNPON (ORCPT ); Wed, 14 Jul 2021 11:14:13 -0400 Received: from frasgout.his.huawei.com ([185.176.79.56]:3405 "EHLO frasgout.his.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239585AbhGNPOL (ORCPT ); Wed, 14 Jul 2021 11:14:11 -0400 Received: from fraeml702-chm.china.huawei.com (unknown [172.18.147.200]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4GQ0vk6sbZz6D97c; Wed, 14 Jul 2021 22:56:50 +0800 (CST) Received: from lhreml724-chm.china.huawei.com (10.201.108.75) by fraeml702-chm.china.huawei.com (10.206.15.51) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Wed, 14 Jul 2021 17:11:17 +0200 Received: from localhost.localdomain (10.69.192.58) by lhreml724-chm.china.huawei.com (10.201.108.75) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 16:11:15 +0100 From: John Garry To: CC: , , , , , , John Garry Subject: [PATCH 2/9] block: Rename BLKDEV_MAX_RQ -> BLKDEV_DEFAULT_RQ Date: Wed, 14 Jul 2021 23:06:28 +0800 Message-ID: <1626275195-215652-3-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1626275195-215652-1-git-send-email-john.garry@huawei.com> References: <1626275195-215652-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.69.192.58] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To lhreml724-chm.china.huawei.com (10.201.108.75) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org It is a bit confusing that there is BLKDEV_MAX_RQ and MAX_SCHED_RQ, as the name BLKDEV_MAX_RQ implies that it is always the maximum number of requests, which it is not. Rename BLKDEV_MAX_RQ to BLKDEV_DEFAULT_RQ, matching its usage - that being the default number of requests assigned when allocating a request queue.
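Condensed, the macro relationships after the rename (values as in the hunks below) are:

	#define BLKDEV_MIN_RQ		4
	#define BLKDEV_DEFAULT_RQ	128				/* default q->nr_requests */
	#define MAX_SCHED_RQ		(16 * BLKDEV_DEFAULT_RQ)	/* scheduler tag ceiling */

so BLKDEV_DEFAULT_RQ now reads as what it is - a default - while MAX_SCHED_RQ continues to express the actual upper bound used for scheduler tags.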
Signed-off-by: John Garry Reviewed-by: Ming Lei --- block/blk-core.c | 2 +- block/blk-mq-sched.c | 2 +- block/blk-mq-sched.h | 2 +- drivers/block/rbd.c | 2 +- include/linux/blkdev.h | 2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 04477697ee4b..5d71382b6131 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -579,7 +579,7 @@ struct request_queue *blk_alloc_queue(int node_id) blk_queue_dma_alignment(q, 511); blk_set_default_limits(&q->limits); - q->nr_requests = BLKDEV_MAX_RQ; + q->nr_requests = BLKDEV_DEFAULT_RQ; return q; diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index c838d81ac058..f5cb2931c20d 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -615,7 +615,7 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e) * Additionally, this is a per-hw queue depth. */ q->nr_requests = 2 * min_t(unsigned int, q->tag_set->queue_depth, - BLKDEV_MAX_RQ); + BLKDEV_DEFAULT_RQ); queue_for_each_hw_ctx(q, hctx, i) { ret = blk_mq_sched_alloc_tags(q, hctx, i); diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h index 5246ae040704..1e46be6c5178 100644 --- a/block/blk-mq-sched.h +++ b/block/blk-mq-sched.h @@ -5,7 +5,7 @@ #include "blk-mq.h" #include "blk-mq-tag.h" -#define MAX_SCHED_RQ (16 * BLKDEV_MAX_RQ) +#define MAX_SCHED_RQ (16 * BLKDEV_DEFAULT_RQ) void blk_mq_sched_assign_ioc(struct request *rq); diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 531d390902dd..d3f329749173 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -836,7 +836,7 @@ struct rbd_options { u32 alloc_hint_flags; /* CEPH_OSD_OP_ALLOC_HINT_FLAG_* */ }; -#define RBD_QUEUE_DEPTH_DEFAULT BLKDEV_MAX_RQ +#define RBD_QUEUE_DEPTH_DEFAULT BLKDEV_DEFAULT_RQ #define RBD_ALLOC_SIZE_DEFAULT (64 * 1024) #define RBD_LOCK_TIMEOUT_DEFAULT 0 /* no timeout */ #define RBD_READ_ONLY_DEFAULT false diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 3177181c4326..6a64ea23f552 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -45,7 +45,7 @@ struct blk_stat_callback; struct blk_keyslot_manager; #define BLKDEV_MIN_RQ 4 -#define BLKDEV_MAX_RQ 128 /* Default maximum */ +#define BLKDEV_DEFAULT_RQ 128 /* Must be consistent with blk_mq_poll_stats_bkt() */ #define BLK_MQ_POLL_STATS_BKTS 16 From patchwork Wed Jul 14 15:06:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 12377223 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1ADB6C11F68 for ; Wed, 14 Jul 2021 15:11:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 04B5C613CF for ; Wed, 14 Jul 2021 15:11:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239607AbhGNPON (ORCPT ); Wed, 14 Jul 2021 11:14:13 -0400 Received: from frasgout.his.huawei.com ([185.176.79.56]:3406 "EHLO frasgout.his.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239605AbhGNPON (ORCPT ); Wed, 14 Jul 2021 
11:14:13 -0400 Received: from fraeml701-chm.china.huawei.com (unknown [172.18.147.201]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4GQ0vm6HF0z6D97r; Wed, 14 Jul 2021 22:56:52 +0800 (CST) Received: from lhreml724-chm.china.huawei.com (10.201.108.75) by fraeml701-chm.china.huawei.com (10.206.15.50) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Wed, 14 Jul 2021 17:11:19 +0200 Received: from localhost.localdomain (10.69.192.58) by lhreml724-chm.china.huawei.com (10.201.108.75) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 16:11:17 +0100 From: John Garry To: CC: , , , , , , John Garry Subject: [PATCH 3/9] blk-mq: Relocate shared sbitmap resize in blk_mq_update_nr_requests() Date: Wed, 14 Jul 2021 23:06:29 +0800 Message-ID: <1626275195-215652-4-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1626275195-215652-1-git-send-email-john.garry@huawei.com> References: <1626275195-215652-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.69.192.58] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To lhreml724-chm.china.huawei.com (10.201.108.75) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org For shared sbitmap, if the call to blk_mq_tag_update_depth() was successful for any hctx when hctx->sched_tags is not set, then it would be successful for all (due to the nature in which blk_mq_tag_update_depth() fails). As such, there is no need to call blk_mq_tag_resize_shared_sbitmap() for each hctx. So relocate the call to after the hctx iteration, under the !q->elevator check, which is equivalent to !hctx->sched_tags.
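Put another way, the resulting flow in blk_mq_update_nr_requests() becomes (a simplified sketch, not the literal hunk - see the diff below):

	queue_for_each_hw_ctx(q, hctx, i) {
		if (!hctx->sched_tags)
			ret = blk_mq_tag_update_depth(hctx, &hctx->tags, nr, false);
		else
			ret = blk_mq_tag_update_depth(hctx, &hctx->sched_tags, nr, true);
		...
	}

	if (!ret) {
		q->nr_requests = nr;
		if (blk_mq_is_sbitmap_shared(set->flags)) {
			if (q->elevator)
				sbitmap_queue_resize(&q->sched_bitmap_tags,
						     nr - set->reserved_tags);
			else
				/* relocated: done once, after the iteration */
				blk_mq_tag_resize_shared_sbitmap(set, nr);
		}
	}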
Signed-off-by: John Garry --- block/blk-mq.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index ae28f470893c..56e3c6fdba60 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -3624,8 +3624,6 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr) if (!hctx->sched_tags) { ret = blk_mq_tag_update_depth(hctx, &hctx->tags, nr, false); - if (!ret && blk_mq_is_sbitmap_shared(set->flags)) - blk_mq_tag_resize_shared_sbitmap(set, nr); } else { ret = blk_mq_tag_update_depth(hctx, &hctx->sched_tags, nr, true); @@ -3643,9 +3641,14 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr) } if (!ret) { q->nr_requests = nr; - if (q->elevator && blk_mq_is_sbitmap_shared(set->flags)) - sbitmap_queue_resize(&q->sched_bitmap_tags, - nr - set->reserved_tags); + if (blk_mq_is_sbitmap_shared(set->flags)) { + if (q->elevator) { + sbitmap_queue_resize(&q->sched_bitmap_tags, + nr - set->reserved_tags); + } else { + blk_mq_tag_resize_shared_sbitmap(set, nr); + } + } } blk_mq_unquiesce_queue(q); From patchwork Wed Jul 14 15:06:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 12377225 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 52587C11F66 for ; Wed, 14 Jul 2021 15:11:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 36A57613CB for ; Wed, 14 Jul 2021 15:11:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239613AbhGNPOQ (ORCPT ); Wed, 14 Jul 2021 11:14:16 -0400 Received: from frasgout.his.huawei.com ([185.176.79.56]:3407 "EHLO frasgout.his.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239611AbhGNPOP (ORCPT ); Wed, 14 Jul 2021 11:14:15 -0400 Received: from fraeml745-chm.china.huawei.com (unknown [172.18.147.200]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4GQ0vq10X7z6D91S; Wed, 14 Jul 2021 22:56:55 +0800 (CST) Received: from lhreml724-chm.china.huawei.com (10.201.108.75) by fraeml745-chm.china.huawei.com (10.206.15.226) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 17:11:21 +0200 Received: from localhost.localdomain (10.69.192.58) by lhreml724-chm.china.huawei.com (10.201.108.75) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 16:11:19 +0100 From: John Garry To: CC: , , , , , , John Garry Subject: [PATCH 4/9] blk-mq: Add blk_mq_tag_resize_sched_shared_sbitmap() Date: Wed, 14 Jul 2021 23:06:30 +0800 Message-ID: <1626275195-215652-5-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1626275195-215652-1-git-send-email-john.garry@huawei.com> References: <1626275195-215652-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.69.192.58] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To 
lhreml724-chm.china.huawei.com (10.201.108.75) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Put the functionality to resize the sched shared sbitmap in a common function. Since the same formula is always used to resize, and it can be got from the request queue argument, so just pass the request queue pointer. Signed-off-by: John Garry --- block/blk-mq-sched.c | 3 +-- block/blk-mq-tag.c | 10 ++++++++++ block/blk-mq-tag.h | 1 + block/blk-mq.c | 3 +-- 4 files changed, 13 insertions(+), 4 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index f5cb2931c20d..1e028183557d 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -584,8 +584,7 @@ static int blk_mq_init_sched_shared_sbitmap(struct request_queue *queue) &queue->sched_breserved_tags; } - sbitmap_queue_resize(&queue->sched_bitmap_tags, - queue->nr_requests - set->reserved_tags); + blk_mq_tag_resize_sched_shared_sbitmap(queue); return 0; } diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c index 86f87346232a..55c7f1bf41c7 100644 --- a/block/blk-mq-tag.c +++ b/block/blk-mq-tag.c @@ -634,6 +634,16 @@ void blk_mq_tag_resize_shared_sbitmap(struct blk_mq_tag_set *set, unsigned int s sbitmap_queue_resize(&set->__bitmap_tags, size - set->reserved_tags); } +/* + * We always resize with q->nr_requests - q->tag_set->reserved_tags, so + * don't bother passing a size. + */ +void blk_mq_tag_resize_sched_shared_sbitmap(struct request_queue *q) +{ + sbitmap_queue_resize(&q->sched_bitmap_tags, + q->nr_requests - q->tag_set->reserved_tags); +} + /** * blk_mq_unique_tag() - return a tag that is unique queue-wide * @rq: request for which to compute a unique tag diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h index 8ed55af08427..3a7495e47fb4 100644 --- a/block/blk-mq-tag.h +++ b/block/blk-mq-tag.h @@ -48,6 +48,7 @@ extern int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx, unsigned int depth, bool can_grow); extern void blk_mq_tag_resize_shared_sbitmap(struct blk_mq_tag_set *set, unsigned int size); +extern void blk_mq_tag_resize_sched_shared_sbitmap(struct request_queue *q); extern void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool); void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn, diff --git a/block/blk-mq.c b/block/blk-mq.c index 56e3c6fdba60..b0d4197d36c7 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -3643,8 +3643,7 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr) q->nr_requests = nr; if (blk_mq_is_sbitmap_shared(set->flags)) { if (q->elevator) { - sbitmap_queue_resize(&q->sched_bitmap_tags, - nr - set->reserved_tags); + blk_mq_tag_resize_sched_shared_sbitmap(q); } else { blk_mq_tag_resize_shared_sbitmap(set, nr); } From patchwork Wed Jul 14 15:06:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 12377227 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F3FEEC11F67 for ; Wed, 14 Jul 2021 15:11:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org 
[23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DFDCD613B6 for ; Wed, 14 Jul 2021 15:11:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239652AbhGNPOU (ORCPT ); Wed, 14 Jul 2021 11:14:20 -0400 Received: from frasgout.his.huawei.com ([185.176.79.56]:3408 "EHLO frasgout.his.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239623AbhGNPOR (ORCPT ); Wed, 14 Jul 2021 11:14:17 -0400 Received: from fraeml744-chm.china.huawei.com (unknown [172.18.147.201]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4GQ0zf6NGwz6L88Y; Wed, 14 Jul 2021 23:00:14 +0800 (CST) Received: from lhreml724-chm.china.huawei.com (10.201.108.75) by fraeml744-chm.china.huawei.com (10.206.15.225) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 17:11:23 +0200 Received: from localhost.localdomain (10.69.192.58) by lhreml724-chm.china.huawei.com (10.201.108.75) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 16:11:21 +0100 From: John Garry To: CC: , , , , , , John Garry Subject: [PATCH 5/9] blk-mq: Invert check in blk_mq_update_nr_requests() Date: Wed, 14 Jul 2021 23:06:31 +0800 Message-ID: <1626275195-215652-6-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1626275195-215652-1-git-send-email-john.garry@huawei.com> References: <1626275195-215652-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.69.192.58] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To lhreml724-chm.china.huawei.com (10.201.108.75) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org It's easier to read: if (x) X; else Y; over: if (!x) Y; else X; No functional change intended. Signed-off-by: John Garry --- block/blk-mq.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index b0d4197d36c7..3ff5c15ce0eb 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -3621,18 +3621,18 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr) * If we're using an MQ scheduler, just update the scheduler * queue depth. This is similar to what the old code would do. 
*/ - if (!hctx->sched_tags) { - ret = blk_mq_tag_update_depth(hctx, &hctx->tags, nr, - false); - } else { + if (hctx->sched_tags) { ret = blk_mq_tag_update_depth(hctx, &hctx->sched_tags, - nr, true); + nr, true); if (blk_mq_is_sbitmap_shared(set->flags)) { hctx->sched_tags->bitmap_tags = &q->sched_bitmap_tags; hctx->sched_tags->breserved_tags = &q->sched_breserved_tags; } + } else { + ret = blk_mq_tag_update_depth(hctx, &hctx->tags, nr, + false); } if (ret) break; From patchwork Wed Jul 14 15:06:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 12377231 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A2F1DC11F68 for ; Wed, 14 Jul 2021 15:11:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9222D613CB for ; Wed, 14 Jul 2021 15:11:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239669AbhGNPOW (ORCPT ); Wed, 14 Jul 2021 11:14:22 -0400 Received: from frasgout.his.huawei.com ([185.176.79.56]:3409 "EHLO frasgout.his.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239620AbhGNPOT (ORCPT ); Wed, 14 Jul 2021 11:14:19 -0400 Received: from fraeml743-chm.china.huawei.com (unknown [172.18.147.200]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4GQ12l6ZtXz6K604; Wed, 14 Jul 2021 23:02:55 +0800 (CST) Received: from lhreml724-chm.china.huawei.com (10.201.108.75) by fraeml743-chm.china.huawei.com (10.206.15.224) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 17:11:25 +0200 Received: from localhost.localdomain (10.69.192.58) by lhreml724-chm.china.huawei.com (10.201.108.75) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 16:11:23 +0100 From: John Garry To: CC: , , , , , , John Garry Subject: [PATCH RFC 6/9] blk-mq: Refactor blk_mq_{alloc,free}_rqs Date: Wed, 14 Jul 2021 23:06:32 +0800 Message-ID: <1626275195-215652-7-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1626275195-215652-1-git-send-email-john.garry@huawei.com> References: <1626275195-215652-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.69.192.58] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To lhreml724-chm.china.huawei.com (10.201.108.75) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org It will be required to be able to allocate and free static requests for a tag set or a request queue (in addition to a struct blk_mq_tags), so break out the common functionality of blk_mq_{alloc,free}_rqs, passing the required static rqs ** and page list pointer to the core code. 
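After the split, the existing entry points become thin wrappers around the new helpers (as in the hunks below), with the per-tags page list and static rqs array passed through explicitly:

	void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
			     unsigned int hctx_idx)
	{
		__blk_mq_free_rqs(set, hctx_idx, tags->nr_tags, &tags->page_list,
				  tags->static_rqs);
	}

	int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
			     unsigned int hctx_idx, unsigned int depth)
	{
		return __blk_mq_alloc_rqs(set, hctx_idx, depth, &tags->page_list,
					  tags->static_rqs);
	}

A later caller can then supply a page list and static rqs array that do not live in a struct blk_mq_tags at all, which is what the following patches rely on.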
Signed-off-by: John Garry --- block/blk-mq.c | 61 +++++++++++++++++++++++++++++++++----------------- block/blk-mq.h | 7 ++++++ 2 files changed, 47 insertions(+), 21 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 3ff5c15ce0eb..801af0993044 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2310,14 +2310,15 @@ static size_t order_to_size(unsigned int order) } /* called before freeing request pool in @tags */ -static void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set, - struct blk_mq_tags *tags, unsigned int hctx_idx) +void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set, + unsigned int hctx_idx, + struct list_head *page_list) { struct blk_mq_tags *drv_tags = set->tags[hctx_idx]; struct page *page; unsigned long flags; - list_for_each_entry(page, &tags->page_list, lru) { + list_for_each_entry(page, page_list, lru) { unsigned long start = (unsigned long)page_address(page); unsigned long end = start + order_to_size(page->private); int i; @@ -2343,28 +2344,29 @@ static void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set, spin_unlock_irqrestore(&drv_tags->lock, flags); } -void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, - unsigned int hctx_idx) +void __blk_mq_free_rqs(struct blk_mq_tag_set *set, unsigned int hctx_idx, + unsigned int depth, struct list_head *page_list, + struct request **static_rqs) { struct page *page; - if (tags->static_rqs && set->ops->exit_request) { + if (static_rqs && set->ops->exit_request) { int i; - for (i = 0; i < tags->nr_tags; i++) { - struct request *rq = tags->static_rqs[i]; + for (i = 0; i < depth; i++) { + struct request *rq = static_rqs[i]; if (!rq) continue; set->ops->exit_request(set, rq, hctx_idx); - tags->static_rqs[i] = NULL; + static_rqs[i] = NULL; } } - blk_mq_clear_rq_mapping(set, tags, hctx_idx); + blk_mq_clear_rq_mapping(set, hctx_idx, page_list); - while (!list_empty(&tags->page_list)) { - page = list_first_entry(&tags->page_list, struct page, lru); + while (!list_empty(page_list)) { + page = list_first_entry(page_list, struct page, lru); list_del_init(&page->lru); /* * Remove kmemleak object previously allocated in @@ -2375,6 +2377,13 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, } } +void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, + unsigned int hctx_idx) +{ + __blk_mq_free_rqs(set, hctx_idx, tags->nr_tags, &tags->page_list, + tags->static_rqs); +} + void blk_mq_free_rq_map(struct blk_mq_tags *tags, unsigned int flags) { kfree(tags->rqs); @@ -2437,8 +2446,10 @@ static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq, return 0; } -int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, - unsigned int hctx_idx, unsigned int depth) + +int __blk_mq_alloc_rqs(struct blk_mq_tag_set *set, unsigned int hctx_idx, + unsigned int depth, struct list_head *page_list, + struct request **static_rqs) { unsigned int i, j, entries_per_page, max_order = 4; size_t rq_size, left; @@ -2448,7 +2459,7 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, if (node == NUMA_NO_NODE) node = set->numa_node; - INIT_LIST_HEAD(&tags->page_list); + INIT_LIST_HEAD(page_list); /* * rq_size is the size of the request plus driver payload, rounded @@ -2483,7 +2494,7 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, goto fail; page->private = this_order; - list_add_tail(&page->lru, &tags->page_list); + list_add_tail(&page->lru, page_list); p = page_address(page); /* @@ -2497,9 +2508,9 
@@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, for (j = 0; j < to_do; j++) { struct request *rq = p; - tags->static_rqs[i] = rq; + static_rqs[i] = rq; if (blk_mq_init_request(set, rq, hctx_idx, node)) { - tags->static_rqs[i] = NULL; + static_rqs[i] = NULL; goto fail; } @@ -2510,10 +2521,17 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, return 0; fail: - blk_mq_free_rqs(set, tags, hctx_idx); + __blk_mq_free_rqs(set, hctx_idx, depth, page_list, static_rqs); return -ENOMEM; } +int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, + unsigned int hctx_idx, unsigned int depth) +{ + return __blk_mq_alloc_rqs(set, hctx_idx, depth, &tags->page_list, + tags->static_rqs); +} + struct rq_iter_data { struct blk_mq_hw_ctx *hctx; bool has_rq; @@ -2856,12 +2874,13 @@ static bool __blk_mq_alloc_map_and_request(struct blk_mq_tag_set *set, int ret = 0; set->tags[hctx_idx] = blk_mq_alloc_rq_map(set, hctx_idx, - set->queue_depth, set->reserved_tags, flags); + set->queue_depth, + set->reserved_tags, flags); if (!set->tags[hctx_idx]) return false; ret = blk_mq_alloc_rqs(set, set->tags[hctx_idx], hctx_idx, - set->queue_depth); + set->queue_depth); if (!ret) return true; diff --git a/block/blk-mq.h b/block/blk-mq.h index d08779f77a26..1e0fbb06412b 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -54,6 +54,9 @@ void blk_mq_put_rq_ref(struct request *rq); */ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, unsigned int hctx_idx); +void __blk_mq_free_rqs(struct blk_mq_tag_set *set, unsigned int hctx_idx, + unsigned int depth, struct list_head *page_list, + struct request **static_rqs); void blk_mq_free_rq_map(struct blk_mq_tags *tags, unsigned int flags); struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set, unsigned int hctx_idx, @@ -63,6 +66,10 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set, int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, unsigned int hctx_idx, unsigned int depth); +int __blk_mq_alloc_rqs(struct blk_mq_tag_set *set, unsigned int hctx_idx, + unsigned int depth, struct list_head *page_list, + struct request **static_rqs); + /* * Internal helpers for request insertion into sw queues */ From patchwork Wed Jul 14 15:06:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 12377229 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BA3BBC11F66 for ; Wed, 14 Jul 2021 15:11:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A4308613D6 for ; Wed, 14 Jul 2021 15:11:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239654AbhGNPOX (ORCPT ); Wed, 14 Jul 2021 11:14:23 -0400 Received: from frasgout.his.huawei.com ([185.176.79.56]:3410 "EHLO frasgout.his.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239653AbhGNPOV (ORCPT ); Wed, 14 Jul 2021 11:14:21 -0400 Received: from 
fraeml742-chm.china.huawei.com (unknown [172.18.147.207]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4GQ12n6kyjz6K648; Wed, 14 Jul 2021 23:02:57 +0800 (CST) Received: from lhreml724-chm.china.huawei.com (10.201.108.75) by fraeml742-chm.china.huawei.com (10.206.15.223) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 17:11:28 +0200 Received: from localhost.localdomain (10.69.192.58) by lhreml724-chm.china.huawei.com (10.201.108.75) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 16:11:25 +0100 From: John Garry To: CC: , , , , , , John Garry Subject: [PATCH RFC 7/9] blk-mq: Allocate per tag set static rqs for shared sbitmap Date: Wed, 14 Jul 2021 23:06:33 +0800 Message-ID: <1626275195-215652-8-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1626275195-215652-1-git-send-email-john.garry@huawei.com> References: <1626275195-215652-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.69.192.58] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To lhreml724-chm.china.huawei.com (10.201.108.75) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org It is wasteful in memory terms that a full set of static rqs is allocated per hw queue for shared sbitmap, considering that the total number of requests usable at any given time across all hctx is limited by the shared sbitmap depth. As such, it is considerably more memory efficient in the case of shared sbitmap to allocate a set of static rqs per tag set, and not per hw queue. Make this change, and ensure that the hctx_idx argument to blk_mq_init_request() and __blk_mq_free_rqs() is == 0 for the case of shared sbitmap. If a driver needs to init the static rq with a hctx idx, then it cannot now use shared sbitmap - only SCSI drivers and null block use shared sbitmap today, and neither use hctx_idx for static request init or free.
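Concretely (taken from the hunks below), the tag set gains one set-wide copy of the static rqs and their backing pages:

	struct blk_mq_tag_set {
		...
		/* for shared sbitmap */
		struct request **static_rqs;
		struct list_head page_list;
		...
	};

and, when the sbitmap is shared, each hctx's tags map simply borrows those requests rather than allocating its own:

	for (i = 0; i < set->queue_depth; i++)
		set->tags[hctx_idx]->static_rqs[i] = set->static_rqs[i];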
Signed-off-by: John Garry --- block/blk-mq.c | 66 +++++++++++++++++++++++++++++++++++++----- include/linux/blk-mq.h | 4 +++ 2 files changed, 62 insertions(+), 8 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 801af0993044..764c601376c6 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2350,6 +2350,9 @@ void __blk_mq_free_rqs(struct blk_mq_tag_set *set, unsigned int hctx_idx, { struct page *page; + if (WARN_ON((hctx_idx > 0) && blk_mq_is_sbitmap_shared(set->flags))) + return; + if (static_rqs && set->ops->exit_request) { int i; @@ -2455,6 +2458,9 @@ int __blk_mq_alloc_rqs(struct blk_mq_tag_set *set, unsigned int hctx_idx, size_t rq_size, left; int node; + if (WARN_ON((hctx_idx > 0) && blk_mq_is_sbitmap_shared(set->flags))) + return -EINVAL; + node = blk_mq_hw_queue_to_node(&set->map[HCTX_TYPE_DEFAULT], hctx_idx); if (node == NUMA_NO_NODE) node = set->numa_node; @@ -2875,15 +2881,30 @@ static bool __blk_mq_alloc_map_and_request(struct blk_mq_tag_set *set, set->tags[hctx_idx] = blk_mq_alloc_rq_map(set, hctx_idx, set->queue_depth, - set->reserved_tags, flags); + set->reserved_tags, + flags); if (!set->tags[hctx_idx]) return false; - ret = blk_mq_alloc_rqs(set, set->tags[hctx_idx], hctx_idx, - set->queue_depth); + if (blk_mq_is_sbitmap_shared(flags)) { + int i; + + if (!set->static_rqs) + goto err_out; + + for (i = 0; i < set->queue_depth; i++) + set->tags[hctx_idx]->static_rqs[i] = + set->static_rqs[i]; + + return true; + } else { + ret = blk_mq_alloc_rqs(set, set->tags[hctx_idx], hctx_idx, + set->queue_depth); + } if (!ret) return true; +err_out: blk_mq_free_rq_map(set->tags[hctx_idx], flags); set->tags[hctx_idx] = NULL; return false; @@ -2895,7 +2916,8 @@ static void blk_mq_free_map_and_requests(struct blk_mq_tag_set *set, unsigned int flags = set->flags; if (set->tags && set->tags[hctx_idx]) { - blk_mq_free_rqs(set, set->tags[hctx_idx], hctx_idx); + if (!blk_mq_is_sbitmap_shared(set->flags)) + blk_mq_free_rqs(set, set->tags[hctx_idx], hctx_idx); blk_mq_free_rq_map(set->tags[hctx_idx], flags); set->tags[hctx_idx] = NULL; } @@ -3363,6 +3385,21 @@ static int __blk_mq_alloc_rq_maps(struct blk_mq_tag_set *set) { int i; + if (blk_mq_is_sbitmap_shared(set->flags)) { + gfp_t flags = GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY; + int ret; + + set->static_rqs = kcalloc_node(set->queue_depth, + sizeof(struct request *), + flags, set->numa_node); + if (!set->static_rqs) + return -ENOMEM; + ret = __blk_mq_alloc_rqs(set, 0, set->queue_depth, + &set->page_list, set->static_rqs); + if (ret < 0) + goto out_free_static_rqs; + } + for (i = 0; i < set->nr_hw_queues; i++) { if (!__blk_mq_alloc_map_and_request(set, i)) goto out_unwind; @@ -3374,7 +3411,15 @@ static int __blk_mq_alloc_rq_maps(struct blk_mq_tag_set *set) out_unwind: while (--i >= 0) blk_mq_free_map_and_requests(set, i); - + if (blk_mq_is_sbitmap_shared(set->flags)) { + __blk_mq_free_rqs(set, 0, set->queue_depth, &set->page_list, + set->static_rqs); + } +out_free_static_rqs: + if (blk_mq_is_sbitmap_shared(set->flags)) { + kfree(set->static_rqs); + set->static_rqs = NULL; + } return -ENOMEM; } @@ -3601,12 +3646,17 @@ void blk_mq_free_tag_set(struct blk_mq_tag_set *set) { int i, j; + if (blk_mq_is_sbitmap_shared(set->flags)) { + blk_mq_exit_shared_sbitmap(set); + __blk_mq_free_rqs(set, 0, set->queue_depth, &set->page_list, + set->static_rqs); + kfree(set->static_rqs); + set->static_rqs = NULL; + } + for (i = 0; i < set->nr_hw_queues; i++) blk_mq_free_map_and_requests(set, i); - if (blk_mq_is_sbitmap_shared(set->flags)) - 
blk_mq_exit_shared_sbitmap(set); - for (j = 0; j < set->nr_maps; j++) { kfree(set->map[j].mq_map); set->map[j].mq_map = NULL; diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 1d18447ebebc..68b648ab5435 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -261,6 +261,10 @@ struct blk_mq_tag_set { struct sbitmap_queue __breserved_tags; struct blk_mq_tags **tags; + /* for shared sbitmap */ + struct request **static_rqs; + struct list_head page_list; + struct mutex tag_list_lock; struct list_head tag_list; }; From patchwork Wed Jul 14 15:06:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 12377233 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8344EC11F66 for ; Wed, 14 Jul 2021 15:11:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6CC17613D3 for ; Wed, 14 Jul 2021 15:11:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239700AbhGNPO2 (ORCPT ); Wed, 14 Jul 2021 11:14:28 -0400 Received: from frasgout.his.huawei.com ([185.176.79.56]:3411 "EHLO frasgout.his.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239668AbhGNPOX (ORCPT ); Wed, 14 Jul 2021 11:14:23 -0400 Received: from fraeml741-chm.china.huawei.com (unknown [172.18.147.206]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4GQ12q6Qy4z6K640; Wed, 14 Jul 2021 23:02:59 +0800 (CST) Received: from lhreml724-chm.china.huawei.com (10.201.108.75) by fraeml741-chm.china.huawei.com (10.206.15.222) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 17:11:30 +0200 Received: from localhost.localdomain (10.69.192.58) by lhreml724-chm.china.huawei.com (10.201.108.75) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 16:11:28 +0100 From: John Garry To: CC: , , , , , , John Garry Subject: [PATCH 8/9] blk-mq: Allocate per request queue static rqs for shared sbitmap Date: Wed, 14 Jul 2021 23:06:34 +0800 Message-ID: <1626275195-215652-9-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1626275195-215652-1-git-send-email-john.garry@huawei.com> References: <1626275195-215652-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.69.192.58] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To lhreml724-chm.china.huawei.com (10.201.108.75) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Similar to allocating a full set of static rqs per hw queue per tag set for when using a shared sbitmap, it is also inefficient in memory terms to allocate a full set of static rqs per hw queue per request queue. Reduce memory usage by allocating a set of static rqs per request queue for when using a shared sbitmap, and make the per-hctx sched_tags->static_rqs[] point at them. 
Error handling for updating the number of requests in blk_mq_update_nr_requests() -> blk_mq_tag_update_depth() can get quite complicated, so allocate the full max depth of rqs at init time to try to simplify things. This will be somewhat inefficient when the request queue depth is not close to max, but generally still more efficient than the current situation. For failures in blk_mq_update_nr_requests() -> blk_mq_tag_update_depth() when shrinking request queue depth, q->nr_requests still needs to be updated. This is because some of the hctx->sched_tags may be successfully updated, and they are now smaller than q->nr_requests, which will lead to problems since a scheduler tag could be greater than the hctx->sched_tags size. For failures in blk_mq_update_nr_requests() -> blk_mq_tag_update_depth() when growing a request queue depth, q->nr_requests does not need to be updated. Signed-off-by: John Garry --- block/blk-mq-sched.c | 51 ++++++++++++++++++++++++++++++++++++++---- block/blk-mq-tag.c | 12 +++++----- block/blk-mq.c | 20 ++++++++++++++++- include/linux/blkdev.h | 4 ++++ 4 files changed, 76 insertions(+), 11 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 1e028183557d..7b5c46647820 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -538,6 +538,9 @@ static int blk_mq_sched_alloc_tags(struct request_queue *q, if (!hctx->sched_tags) return -ENOMEM; + if (blk_mq_is_sbitmap_shared(q->tag_set->flags)) + return 0; + ret = blk_mq_alloc_rqs(set, hctx->sched_tags, hctx_idx, q->nr_requests); if (ret) blk_mq_sched_free_tags(set, hctx, hctx_idx); @@ -563,8 +566,30 @@ static int blk_mq_init_sched_shared_sbitmap(struct request_queue *queue) { struct blk_mq_tag_set *set = queue->tag_set; int alloc_policy = BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags); + gfp_t flags = GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY; struct blk_mq_hw_ctx *hctx; - int ret, i; + int ret, i, j; + + /* + * In case we need to grow, allocate max we will ever need. This will + * waste memory when the request queue depth is less than the max, + * i.e. almost always. But helps keep our sanity, rather than dealing + * with error handling in blk_mq_update_nr_requests().
+ */ + queue->static_rqs = kcalloc_node(MAX_SCHED_RQ, sizeof(struct request *), + flags, queue->node); + if (!queue->static_rqs) + return -ENOMEM; + + ret = __blk_mq_alloc_rqs(set, 0, MAX_SCHED_RQ, &queue->page_list, + queue->static_rqs); + if (ret) + goto err_rqs; + + queue_for_each_hw_ctx(queue, hctx, i) { + for (j = 0; j < queue->nr_requests; j++) + hctx->sched_tags->static_rqs[j] = queue->static_rqs[j]; + } /* * Set initial depth at max so that we don't need to reallocate for @@ -575,7 +600,7 @@ static int blk_mq_init_sched_shared_sbitmap(struct request_queue *queue) MAX_SCHED_RQ, set->reserved_tags, set->numa_node, alloc_policy); if (ret) - return ret; + goto err_bitmaps; queue_for_each_hw_ctx(queue, hctx, i) { hctx->sched_tags->bitmap_tags = @@ -587,10 +612,24 @@ static int blk_mq_init_sched_shared_sbitmap(struct request_queue *queue) blk_mq_tag_resize_sched_shared_sbitmap(queue); return 0; + +err_bitmaps: + __blk_mq_free_rqs(set, 0, MAX_SCHED_RQ, &queue->page_list, + queue->static_rqs); +err_rqs: + kfree(queue->static_rqs); + queue->static_rqs = NULL; + return ret; } static void blk_mq_exit_sched_shared_sbitmap(struct request_queue *queue) { + __blk_mq_free_rqs(queue->tag_set, 0, MAX_SCHED_RQ, &queue->page_list, + queue->static_rqs); + + kfree(queue->static_rqs); + queue->static_rqs = NULL; + sbitmap_queue_free(&queue->sched_bitmap_tags); sbitmap_queue_free(&queue->sched_breserved_tags); } @@ -670,8 +709,12 @@ void blk_mq_sched_free_requests(struct request_queue *q) int i; queue_for_each_hw_ctx(q, hctx, i) { - if (hctx->sched_tags) - blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i); + if (hctx->sched_tags) { + if (blk_mq_is_sbitmap_shared(q->tag_set->flags)) { + } else { + blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i); + } + } } } diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c index 55c7f1bf41c7..e5a21907306a 100644 --- a/block/blk-mq-tag.c +++ b/block/blk-mq-tag.c @@ -592,7 +592,6 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx, if (tdepth > tags->nr_tags) { struct blk_mq_tag_set *set = hctx->queue->tag_set; struct blk_mq_tags *new; - bool ret; if (!can_grow) return -EINVAL; @@ -608,13 +607,14 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx, tags->nr_reserved_tags, set->flags); if (!new) return -ENOMEM; - ret = blk_mq_alloc_rqs(set, new, hctx->queue_num, tdepth); - if (ret) { - blk_mq_free_rq_map(new, set->flags); - return -ENOMEM; + if (!blk_mq_is_sbitmap_shared(hctx->flags)) { + if (blk_mq_alloc_rqs(set, new, hctx->queue_num, tdepth)) { + blk_mq_free_rq_map(new, set->flags); + return -ENOMEM; + } + blk_mq_free_rqs(set, *tagsptr, hctx->queue_num); } - blk_mq_free_rqs(set, *tagsptr, hctx->queue_num); blk_mq_free_rq_map(*tagsptr, set->flags); *tagsptr = new; } else { diff --git a/block/blk-mq.c b/block/blk-mq.c index 764c601376c6..443fd64239ed 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -3694,10 +3694,17 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr) ret = blk_mq_tag_update_depth(hctx, &hctx->sched_tags, nr, true); if (blk_mq_is_sbitmap_shared(set->flags)) { + int j; + hctx->sched_tags->bitmap_tags = &q->sched_bitmap_tags; hctx->sched_tags->breserved_tags = &q->sched_breserved_tags; + + for (j = 0;j < hctx->sched_tags->nr_tags; j++) { + hctx->sched_tags->static_rqs[j] = + q->static_rqs[j]; + } } } else { ret = blk_mq_tag_update_depth(hctx, &hctx->tags, nr, @@ -3708,7 +3715,18 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr) if (q->elevator && q->elevator->type->ops.depth_updated) 
q->elevator->type->ops.depth_updated(hctx); } - if (!ret) { + if (ret) { + if (blk_mq_is_sbitmap_shared(set->flags) && (q->elevator)) { + /* + * If we error'ed, then we need to revert to the + * lowest size, otherwise we may attempt to reference + * unset hctx->sched_tags->static_rqs[] + */ + q->nr_requests = min((unsigned long)nr, + q->nr_requests); + blk_mq_tag_resize_sched_shared_sbitmap(q); + } + } else { q->nr_requests = nr; if (blk_mq_is_sbitmap_shared(set->flags)) { if (q->elevator) { diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 6a64ea23f552..97e80a07adb2 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -470,6 +470,10 @@ struct request_queue { struct sbitmap_queue sched_bitmap_tags; struct sbitmap_queue sched_breserved_tags; + /* For shared sbitmap */ + struct request **static_rqs; + struct list_head page_list; + struct list_head icq_list; #ifdef CONFIG_BLK_CGROUP DECLARE_BITMAP (blkcg_pols, BLKCG_MAX_POLS); From patchwork Wed Jul 14 15:06:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 12377235 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C207C11F66 for ; Wed, 14 Jul 2021 15:11:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 264C9613F5 for ; Wed, 14 Jul 2021 15:11:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239728AbhGNPOb (ORCPT ); Wed, 14 Jul 2021 11:14:31 -0400 Received: from frasgout.his.huawei.com ([185.176.79.56]:3412 "EHLO frasgout.his.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239686AbhGNPOZ (ORCPT ); Wed, 14 Jul 2021 11:14:25 -0400 Received: from fraeml739-chm.china.huawei.com (unknown [172.18.147.226]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4GQ0w14vGsz6D97R; Wed, 14 Jul 2021 22:57:05 +0800 (CST) Received: from lhreml724-chm.china.huawei.com (10.201.108.75) by fraeml739-chm.china.huawei.com (10.206.15.220) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 17:11:32 +0200 Received: from localhost.localdomain (10.69.192.58) by lhreml724-chm.china.huawei.com (10.201.108.75) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 14 Jul 2021 16:11:30 +0100 From: John Garry To: CC: , , , , , , John Garry Subject: [PATCH 9/9] blk-mq: Clear mappings for shared sbitmap sched static rqs Date: Wed, 14 Jul 2021 23:06:35 +0800 Message-ID: <1626275195-215652-10-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1626275195-215652-1-git-send-email-john.garry@huawei.com> References: <1626275195-215652-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.69.192.58] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To lhreml724-chm.china.huawei.com (10.201.108.75) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org To 
avoid use-after-free in accessing stale requests in the driver tags rqs[], clear the mappings for the request queue static rqs. Signed-off-by: John Garry --- block/blk-mq-sched.c | 1 + block/blk-mq.h | 2 ++ 2 files changed, 3 insertions(+) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 7b5c46647820..f1cea7f3bc68 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -711,6 +711,7 @@ void blk_mq_sched_free_requests(struct request_queue *q) queue_for_each_hw_ctx(q, hctx, i) { if (hctx->sched_tags) { if (blk_mq_is_sbitmap_shared(q->tag_set->flags)) { + blk_mq_clear_rq_mapping(q->tag_set, i, &q->page_list); } else { blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i); } diff --git a/block/blk-mq.h b/block/blk-mq.h index 1e0fbb06412b..a5b7aa7a07b9 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -69,6 +69,8 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, int __blk_mq_alloc_rqs(struct blk_mq_tag_set *set, unsigned int hctx_idx, unsigned int depth, struct list_head *page_list, struct request **static_rqs); +void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set, unsigned int hctx_idx, + struct list_head *page_list); /* * Internal helpers for request insertion into sw queues