From patchwork Tue Dec 17 02:40:44 2024
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13910933
From: Yu Kuai
To: axboe@kernel.dk, akpm@linux-foundation.org, ming.lei@redhat.com,
 yang.yang@vivo.com, bvanassche@acm.org, osandov@fb.com,
 paolo.valente@linaro.org
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
 yangerkun@huawei.com
Subject: [PATCH RFC v2 1/4] block/mq-deadline: Revert "block/mq-deadline: Fix
 the tag reservation code"
Date: Tue, 17 Dec 2024 10:40:44 +0800
Message-Id: <20241217024047.1091893-2-yukuai1@huaweicloud.com>
In-Reply-To: <20241217024047.1091893-1-yukuai1@huaweicloud.com>
References: <20241217024047.1091893-1-yukuai1@huaweicloud.com>

From: Yu Kuai

This reverts commit 39823b47bbd40502632ffba90ebb34fff7c8b5e8.

1) Setting min_shallow_depth to 1 ends up setting wake_batch to 1 as
well, and this causes performance degradation in some high-concurrency
tests, for both IO bandwidth and CPU usage.

async_depth can be changed through sysfs, and its minimal value is 1.
This is why min_shallow_depth was set to 1 at initialization: to keep
the code functionally correct if async_depth is set to 1. However,
sacrificing performance in the default scenario is not acceptable.

2) dd_to_word_depth() is supposed to scale down async_depth. However,
the user can set a low nr_requests, so that sb->depth is less than
1 << sb->shift, and then dd_to_word_depth() ends up scaling async_depth
up instead.

Fixes: 39823b47bbd4 ("block/mq-deadline: Fix the tag reservation code")
Signed-off-by: Yu Kuai
---
 block/mq-deadline.c | 20 +++-----------------
 1 file changed, 3 insertions(+), 17 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 91b3789f710e..1f0d175a941e 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -487,20 +487,6 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	return rq;
 }
 
-/*
- * 'depth' is a number in the range 1..INT_MAX representing a number of
- * requests. Scale it with a factor (1 << bt->sb.shift) / q->nr_requests since
- * 1..(1 << bt->sb.shift) is the range expected by sbitmap_get_shallow().
- * Values larger than q->nr_requests have the same effect as q->nr_requests.
- */
-static int dd_to_word_depth(struct blk_mq_hw_ctx *hctx, unsigned int qdepth)
-{
-	struct sbitmap_queue *bt = &hctx->sched_tags->bitmap_tags;
-	const unsigned int nrr = hctx->queue->nr_requests;
-
-	return ((qdepth << bt->sb.shift) + nrr - 1) / nrr;
-}
-
 /*
  * Called by __blk_mq_alloc_request(). The shallow_depth value set by this
  * function is used by __blk_mq_get_tag().
@@ -517,7 +503,7 @@ static void dd_limit_depth(blk_opf_t opf, struct blk_mq_alloc_data *data)
 	 * Throttle asynchronous requests and writes such that these requests
 	 * do not block the allocation of synchronous requests.
 	 */
-	data->shallow_depth = dd_to_word_depth(data->hctx, dd->async_depth);
+	data->shallow_depth = dd->async_depth;
 }
 
 /* Called by blk_mq_update_nr_requests(). */
@@ -527,9 +513,9 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
 	struct deadline_data *dd = q->elevator->elevator_data;
 	struct blk_mq_tags *tags = hctx->sched_tags;
 
-	dd->async_depth = q->nr_requests;
+	dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
 
-	sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, 1);
+	sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
 }
 
 /* Called by blk_mq_init_hctx() and blk_mq_init_sched().
  */
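To make point 2 concrete, the arithmetic of the reverted helper can be
replayed in userspace. The sketch below is illustrative only: the
standalone dd_to_word_depth() copy takes sb->shift and nr_requests as
plain parameters, and the chosen numbers (64-bit sbitmap words,
nr_requests lowered to 16) are assumptions, not values from the commit.

#include <stdio.h>

/* Userspace copy of the arithmetic in the reverted dd_to_word_depth(). */
static int dd_to_word_depth(unsigned int shift, unsigned int nrr,
			    unsigned int qdepth)
{
	return ((qdepth << shift) + nrr - 1) / nrr;
}

int main(void)
{
	unsigned int shift = 6;		/* 1 << 6 = 64 bits per word */
	unsigned int nr_requests = 16;	/* sb->depth < 1 << sb->shift */
	unsigned int async_depth = 12;	/* 3/4 of nr_requests */

	/* Prints 48: the per-word limit is larger than the 16 tags that
	 * actually exist, so async requests are not throttled at all. */
	printf("%d\n", dd_to_word_depth(shift, nr_requests, async_depth));
	return 0;
}

In other words, instead of limiting asynchronous requests to 12 of the
16 tags, the computed per-word depth of 48 exceeds the whole bitmap and
the limit becomes a no-op.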
From patchwork Tue Dec 17 02:40:45 2024
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13910929
From: Yu Kuai
To: axboe@kernel.dk, akpm@linux-foundation.org, ming.lei@redhat.com,
 yang.yang@vivo.com, bvanassche@acm.org, osandov@fb.com,
 paolo.valente@linaro.org
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
 yangerkun@huawei.com
Subject: [PATCH v2 RFC 2/4] lib/sbitmap: fix shallow_depth tag allocation
Date: Tue, 17 Dec 2024 10:40:45 +0800
Message-Id: <20241217024047.1091893-3-yukuai1@huaweicloud.com>
In-Reply-To: <20241217024047.1091893-1-yukuai1@huaweicloud.com>
References: <20241217024047.1091893-1-yukuai1@huaweicloud.com>

From: Yu Kuai

Currently, shallow_depth is used by bfq, kyber and mq-deadline, and
they all pass in a value meant for the whole sbitmap, while sbitmap
treats the value as a limit on a single word. This means shallow_depth
never works as expected, and there are no functional tests covering it.

Considering that callers don't know which word will be used, and that
it's not possible to distinguish the last word from the previous words,
fix this problem by treating shallow_depth as a limit on the whole
sbitmap in sbitmap_find_bit().

Fixes: 00e043936e9a ("blk-mq: introduce Kyber multiqueue I/O scheduler")
Fixes: a52a69ea89dc ("block, bfq: limit tags for writes and async I/O")
Fixes: 07757588e507 ("block/mq-deadline: Reserve 25% of scheduler tags for synchronous requests")
Signed-off-by: Yu Kuai
---
 include/linux/sbitmap.h |  6 ++---
 lib/sbitmap.c           | 55 +++++++++++++++++++++--------------------
 2 files changed, 31 insertions(+), 30 deletions(-)

diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index 189140bf11fc..92e77bc13cf6 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -213,12 +213,12 @@ int sbitmap_get(struct sbitmap *sb);
  * sbitmap_get_shallow() - Try to allocate a free bit from a &struct sbitmap,
  * limiting the depth used from each word.
  * @sb: Bitmap to allocate from.
- * @shallow_depth: The maximum number of bits to allocate from a single word.
+ * @shallow_depth: The maximum number of bits to allocate from the bitmap.
  *
  * This rather specific operation allows for having multiple users with
  * different allocation limits. E.g., there can be a high-priority class that
  * uses sbitmap_get() and a low-priority class that uses sbitmap_get_shallow()
- * with a @shallow_depth of (1 << (@sb->shift - 1)). Then, the low-priority
+ * with a @shallow_depth of (@sb->depth >> 1). Then, the low-priority
  * class can only allocate half of the total bits in the bitmap, preventing it
  * from starving out the high-priority class.
  *
@@ -478,7 +478,7 @@ unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags,
  * sbitmap_queue, limiting the depth used from each word, with preemption
  * already disabled.
  * @sbq: Bitmap queue to allocate from.
- * @shallow_depth: The maximum number of bits to allocate from a single word.
+ * @shallow_depth: The maximum number of bits to allocate from the queue.
  * See sbitmap_get_shallow().
 *
 * If you call this, make sure to call sbitmap_queue_min_shallow_depth() after
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index d3412984170c..6b8b909614a5 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -208,8 +208,27 @@ static int sbitmap_find_bit_in_word(struct sbitmap_word *map,
 	return nr;
 }
 
+static unsigned int __map_depth_with_shallow(const struct sbitmap *sb,
+					     int index,
+					     unsigned int shallow_depth)
+{
+	unsigned int pre_word_bits = 0;
+
+	if (shallow_depth >= sb->depth)
+		return __map_depth(sb, index);
+
+	if (index > 0)
+		pre_word_bits += index << sb->shift;
+
+	if (shallow_depth <= pre_word_bits)
+		return 0;
+
+	return min_t(unsigned int, __map_depth(sb, index),
+		     shallow_depth - pre_word_bits);
+}
+
 static int sbitmap_find_bit(struct sbitmap *sb,
-			    unsigned int depth,
+			    unsigned int shallow_depth,
 			    unsigned int index,
 			    unsigned int alloc_hint,
 			    bool wrap)
@@ -218,12 +237,12 @@ static int sbitmap_find_bit(struct sbitmap *sb,
 	int nr = -1;
 
 	for (i = 0; i < sb->map_nr; i++) {
-		nr = sbitmap_find_bit_in_word(&sb->map[index],
-					      min_t(unsigned int,
-						    __map_depth(sb, index),
-						    depth),
-					      alloc_hint, wrap);
+		unsigned int depth = __map_depth_with_shallow(sb, index,
+							      shallow_depth);
 
+		if (depth)
+			nr = sbitmap_find_bit_in_word(&sb->map[index], depth,
+						      alloc_hint, wrap);
 		if (nr != -1) {
 			nr += index << sb->shift;
 			break;
@@ -406,27 +425,9 @@ EXPORT_SYMBOL_GPL(sbitmap_bitmap_show);
 static unsigned int sbq_calc_wake_batch(struct sbitmap_queue *sbq,
 					unsigned int depth)
 {
-	unsigned int wake_batch;
-	unsigned int shallow_depth;
-
-	/*
-	 * Each full word of the bitmap has bits_per_word bits, and there might
-	 * be a partial word. There are depth / bits_per_word full words and
-	 * depth % bits_per_word bits left over. In bitwise arithmetic:
-	 *
-	 * bits_per_word = 1 << shift
-	 * depth / bits_per_word = depth >> shift
-	 * depth % bits_per_word = depth & ((1 << shift) - 1)
-	 *
-	 * Each word can be limited to sbq->min_shallow_depth bits.
-	 */
-	shallow_depth = min(1U << sbq->sb.shift, sbq->min_shallow_depth);
-	depth = ((depth >> sbq->sb.shift) * shallow_depth +
-		 min(depth & ((1U << sbq->sb.shift) - 1), shallow_depth));
-	wake_batch = clamp_t(unsigned int, depth / SBQ_WAIT_QUEUES, 1,
-			     SBQ_WAKE_BATCH);
-
-	return wake_batch;
+	return clamp_t(unsigned int,
+		       min(depth, sbq->min_shallow_depth) / SBQ_WAIT_QUEUES,
+		       1, SBQ_WAKE_BATCH);
 }
 
 int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth,
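The intended whole-bitmap semantics can be sanity-checked outside the
kernel. The helper below is a hypothetical userspace mirror of
__map_depth_with_shallow(): it inlines the partial-word handling of
__map_depth() and takes shift and depth as plain parameters, which is
an assumption for the sketch, not kernel API; the numbers are
illustrative.

#include <stdio.h>

static unsigned int map_depth_with_shallow(unsigned int shift,
					   unsigned int total_depth,
					   unsigned int index,
					   unsigned int shallow_depth)
{
	unsigned int word_bits = 1U << shift;
	unsigned int word_depth = word_bits;
	unsigned int pre_word_bits = index << shift;

	/* The last word may be partial (inlined __map_depth()). */
	if ((index + 1) * word_bits > total_depth)
		word_depth = total_depth - index * word_bits;

	if (shallow_depth >= total_depth)
		return word_depth;
	if (shallow_depth <= pre_word_bits)
		return 0;
	return shallow_depth - pre_word_bits < word_depth ?
	       shallow_depth - pre_word_bits : word_depth;
}

int main(void)
{
	/* 256 tags, 64-bit words, shallow_depth = 192: words 0..2 stay
	 * fully usable and word 3 returns 0, so exactly 192 of the 256
	 * bits are reachable across the whole bitmap. */
	for (unsigned int index = 0; index < 4; index++)
		printf("word %u -> %u usable bits\n", index,
		       map_depth_with_shallow(6, 256, index, 192));
	return 0;
}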
From patchwork Tue Dec 17 02:40:46 2024
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13910932
From: Yu Kuai
To: axboe@kernel.dk, akpm@linux-foundation.org, ming.lei@redhat.com,
 yang.yang@vivo.com, bvanassche@acm.org, osandov@fb.com,
 paolo.valente@linaro.org
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
 yangerkun@huawei.com
Subject: [PATCH v2 3/4] block/elevator: choose none elevator for high IO
 concurrency ability disk
Date: Tue, 17 Dec 2024 10:40:46 +0800
Message-Id: <20241217024047.1091893-4-yukuai1@huaweicloud.com>
In-Reply-To: <20241217024047.1091893-1-yukuai1@huaweicloud.com>
References: <20241217024047.1091893-1-yukuai1@huaweicloud.com>

From: Yu Kuai

The maximal default nr_requests is 256. If a disk can handle more than
256 requests concurrently, using an elevator is useless in that case:
on the one hand, it limits the number of requests to 256; on the other
hand, it can't merge or sort IO, because requests are dispatched to the
disk immediately and the elevator stays empty.

For example, for nvme/megaraid with a default queue_depth of 512, we
have to change the default elevator to none, otherwise deadline loses a
lot of performance.

Signed-off-by: Yu Kuai
---
 block/elevator.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block/elevator.c b/block/elevator.c
index 7c3ba80e5ff4..4cce1e7c47d5 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -568,6 +568,17 @@ static struct elevator_type *elevator_get_default(struct request_queue *q)
 	    !blk_mq_is_shared_tags(q->tag_set->flags))
 		return NULL;
 
+	/*
+	 * If nr_requests is less than what the disk can handle, requests
+	 * will be dispatched to the disk immediately and it's useless to
+	 * use an elevator. Users should set a bigger nr_requests or limit
+	 * the disk ability manually if they really want to use an elevator.
+	 */
+	if (q->queue_depth && q->queue_depth >= BLKDEV_DEFAULT_RQ * 2)
+		return NULL;
+	if (!q->queue_depth && q->tag_set->queue_depth >= BLKDEV_DEFAULT_RQ * 2)
+		return NULL;
+
 	return elevator_find_get("mq-deadline");
 }
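The effect of the two checks can be condensed into a small userspace
sketch; default_to_none() below is a hypothetical summary of the new
logic, assuming BLKDEV_DEFAULT_RQ is 128 (its value in
include/linux/blkdev.h):

#include <stdbool.h>
#include <stdio.h>

#define BLKDEV_DEFAULT_RQ 128

/* q->queue_depth, if set, takes precedence over the tag set depth,
 * mirroring the two checks added to elevator_get_default(). */
static bool default_to_none(unsigned int queue_depth,
			    unsigned int tag_set_depth)
{
	unsigned int effective = queue_depth ? queue_depth : tag_set_depth;

	return effective >= BLKDEV_DEFAULT_RQ * 2;
}

int main(void)
{
	/* A 512-deep device defaults to none (prints 1); a 64-deep
	 * SATA-style device keeps mq-deadline (prints 0). */
	printf("%d %d\n", default_to_none(512, 0), default_to_none(0, 64));
	return 0;
}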
From patchwork Tue Dec 17 02:40:47 2024
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13910931
From: Yu Kuai
To: axboe@kernel.dk, akpm@linux-foundation.org, ming.lei@redhat.com,
 yang.yang@vivo.com, bvanassche@acm.org, osandov@fb.com,
 paolo.valente@linaro.org
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
 yangerkun@huawei.com
Subject: [PATCH RFC v2 4/4] block/mq-deadline: introduce min_async_depth
Date: Tue, 17 Dec 2024 10:40:47 +0800
Message-Id: <20241217024047.1091893-5-yukuai1@huaweicloud.com>
In-Reply-To: <20241217024047.1091893-1-yukuai1@huaweicloud.com>
References: <20241217024047.1091893-1-yukuai1@huaweicloud.com>

From: Yu Kuai

min_shallow_depth must be less than or equal to any shallow_depth
value, and it is currently 1; this changes the default wake_batch to 1,
causing performance degradation for fast disks under high concurrency.

This patch makes the following changes:

- Set the default minimal async_depth to 64, to avoid performance
  degradation in the common case. Users can set a lower value if
  necessary.
- Disable throttling of asynchronous requests by default, to prevent
  performance degradation in some special setups. Users must set a
  value for async_depth to enable it.
- If async_depth is already set, don't reset it when the user sets a
  new nr_requests.

Fixes: 07757588e507 ("block/mq-deadline: Reserve 25% of scheduler tags for synchronous requests")
Signed-off-by: Yu Kuai
---
 block/mq-deadline.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 1f0d175a941e..9be0a33985ce 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -24,6 +24,16 @@
 #include "blk-mq-debugfs.h"
 #include "blk-mq-sched.h"
 
+/*
+ * async_depth is used to reserve scheduler tags for synchronous requests,
+ * and the value will affect sbitmap wake_batch. The default minimal value
+ * is 64, because the corresponding wake_batch is 8, and a lower wake_batch
+ * may affect IO performance.
+ */
+static unsigned int min_async_depth = 64;
+module_param(min_async_depth, int, 0444);
+MODULE_PARM_DESC(min_async_depth, "The minimal number of tags available for asynchronous requests");
+
 /*
  * See Documentation/block/deadline-iosched.rst
  */
@@ -513,9 +523,12 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
 	struct deadline_data *dd = q->elevator->elevator_data;
 	struct blk_mq_tags *tags = hctx->sched_tags;
 
-	dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
+	if (q->nr_requests > min_async_depth)
+		sbitmap_queue_min_shallow_depth(&tags->bitmap_tags,
+						min_async_depth);
 
-	sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
+	if (q->nr_requests <= dd->async_depth)
+		dd->async_depth = 0;
 }
 
 /* Called by blk_mq_init_hctx() and blk_mq_init_sched().
  */
@@ -814,7 +827,7 @@ STORE_JIFFIES(deadline_write_expire_store, &dd->fifo_expire[DD_WRITE], 0, INT_MAX);
 STORE_JIFFIES(deadline_prio_aging_expire_store, &dd->prio_aging_expire, 0, INT_MAX);
 STORE_INT(deadline_writes_starved_store, &dd->writes_starved, INT_MIN, INT_MAX);
 STORE_INT(deadline_front_merges_store, &dd->front_merges, 0, 1);
-STORE_INT(deadline_async_depth_store, &dd->async_depth, 1, INT_MAX);
+STORE_INT(deadline_async_depth_store, &dd->async_depth, min_async_depth, INT_MAX);
 STORE_INT(deadline_fifo_batch_store, &dd->fifo_batch, 0, INT_MAX);
 #undef STORE_FUNCTION
 #undef STORE_INT
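Why a floor of 64 maps to a wake_batch of 8 can be replayed with the
simplified sbq_calc_wake_batch() from patch 2. This sketch assumes
SBQ_WAIT_QUEUES and SBQ_WAKE_BATCH are both 8, their values in
include/linux/sbitmap.h; the userspace mirror is illustrative, not the
kernel function:

#include <stdio.h>

#define SBQ_WAIT_QUEUES 8
#define SBQ_WAKE_BATCH  8

/* Userspace mirror of the simplified sbq_calc_wake_batch(). */
static unsigned int wake_batch(unsigned int depth,
			       unsigned int min_shallow_depth)
{
	unsigned int d = depth < min_shallow_depth ? depth : min_shallow_depth;
	unsigned int batch = d / SBQ_WAIT_QUEUES;

	if (batch < 1)
		batch = 1;
	if (batch > SBQ_WAKE_BATCH)
		batch = SBQ_WAKE_BATCH;
	return batch;
}

int main(void)
{
	/* min_shallow_depth = 1 (the reverted behavior) prints 1, i.e.
	 * waiters are woken one tag at a time; a floor of 64 keeps the
	 * full batch of 64 / 8 = 8. */
	printf("%u %u\n", wake_batch(256, 1), wake_batch(256, 64));
	return 0;
}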