From patchwork Thu Apr 17 10:58:27 2025
X-Patchwork-Submitter: Zizhi Wo
X-Patchwork-Id: 14055308
From: Zizhi Wo <wozizhi@huawei.com>
To: axboe@kernel.dk, linux-block@vger.kernel.org
Cc: yangerkun@huawei.com, yukuai3@huawei.com, wozizhi@huaweicloud.com,
    ming.lei@redhat.com, tj@kernel.org
Subject: [PATCH V2 1/7] blk-throttle: Rename tg_may_dispatch() to tg_dispatch_time()
Date: Thu, 17 Apr 2025 18:58:27 +0800
Message-ID: <20250417105833.1930283-2-wozizhi@huawei.com>
In-Reply-To: <20250417105833.1930283-1-wozizhi@huawei.com>
References: <20250417105833.1930283-1-wozizhi@huawei.com>
tg_may_dispatch() can directly indicate whether a bio can be dispatched
by returning the time to wait, so the separate "wait" output parameter
is redundant. Remove it and change the function's return type
accordingly.

Since the return value alone now tells whether the bio can be
dispatched, rename tg_may_dispatch() to tg_dispatch_time().
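As an illustration of the interface change (not part of the patch), here is
a minimal, self-contained userspace sketch of the same
out-parameter-to-return-value refactor; the function names and the budget
math are made up:

#include <stdio.h>

/* Old shape: boolean result plus a redundant "wait" out-parameter. */
static int may_dispatch(unsigned long budget, unsigned long cost,
                        unsigned long *wait)
{
        if (cost <= budget) {
                if (wait)
                        *wait = 0;
                return 1;
        }
        if (wait)
                *wait = cost - budget;
        return 0;
}

/* New shape: 0 means "dispatch now", non-zero is the time to wait. */
static unsigned long dispatch_time(unsigned long budget, unsigned long cost)
{
        return cost <= budget ? 0 : cost - budget;
}

int main(void)
{
        unsigned long wait;

        if (!may_dispatch(100, 150, &wait))
                printf("old API: wait %lu\n", wait);
        wait = dispatch_time(100, 150);
        if (wait != 0)
                printf("new API: wait %lu\n", wait);
        return 0;
}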
Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-throttle.c | 40 +++++++++++++++-------------------------
 1 file changed, 15 insertions(+), 25 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 91dab43c65ab..dc23c961c028 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -743,14 +743,13 @@ static unsigned long tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio,
 }
 
 /*
- * Returns whether one can dispatch a bio or not. Also returns approx number
- * of jiffies to wait before this bio is with-in IO rate and can be dispatched
+ * Returns approx number of jiffies to wait before this bio is with-in IO rate
+ * and can be dispatched.
  */
-static bool tg_may_dispatch(struct throtl_grp *tg, struct bio *bio,
-                            unsigned long *wait)
+static unsigned long tg_dispatch_time(struct throtl_grp *tg, struct bio *bio)
 {
         bool rw = bio_data_dir(bio);
-        unsigned long bps_wait = 0, iops_wait = 0, max_wait = 0;
+        unsigned long bps_wait, iops_wait, max_wait;
         u64 bps_limit = tg_bps_limit(tg, rw);
         u32 iops_limit = tg_iops_limit(tg, rw);
 
@@ -765,11 +764,8 @@ static bool tg_may_dispatch(struct throtl_grp *tg, struct bio *bio,
 
         /* If tg->bps = -1, then BW is unlimited */
         if ((bps_limit == U64_MAX && iops_limit == UINT_MAX) ||
-            tg->flags & THROTL_TG_CANCELING) {
-                if (wait)
-                        *wait = 0;
-                return true;
-        }
+            tg->flags & THROTL_TG_CANCELING)
+                return 0;
 
         /*
          * If previous slice expired, start a new one otherwise renew/extend
@@ -789,21 +785,15 @@ static bool tg_may_dispatch(struct throtl_grp *tg, struct bio *bio,
         bps_wait = tg_within_bps_limit(tg, bio, bps_limit);
         iops_wait = tg_within_iops_limit(tg, bio, iops_limit);
 
-        if (bps_wait + iops_wait == 0) {
-                if (wait)
-                        *wait = 0;
-                return true;
-        }
+        if (bps_wait + iops_wait == 0)
+                return 0;
 
         max_wait = max(bps_wait, iops_wait);
 
-        if (wait)
-                *wait = max_wait;
-
         if (time_before(tg->slice_end[rw], jiffies + max_wait))
                 throtl_extend_slice(tg, rw, jiffies + max_wait);
 
-        return false;
+        return max_wait;
 }
 
 static void throtl_charge_bio(struct throtl_grp *tg, struct bio *bio)
@@ -854,16 +844,16 @@ static void throtl_add_bio_tg(struct bio *bio, struct throtl_qnode *qn,
 static void tg_update_disptime(struct throtl_grp *tg)
 {
         struct throtl_service_queue *sq = &tg->service_queue;
-        unsigned long read_wait = -1, write_wait = -1, min_wait = -1, disptime;
+        unsigned long read_wait = -1, write_wait = -1, min_wait, disptime;
         struct bio *bio;
 
         bio = throtl_peek_queued(&sq->queued[READ]);
         if (bio)
-                tg_may_dispatch(tg, bio, &read_wait);
+                read_wait = tg_dispatch_time(tg, bio);
 
         bio = throtl_peek_queued(&sq->queued[WRITE]);
         if (bio)
-                tg_may_dispatch(tg, bio, &write_wait);
+                write_wait = tg_dispatch_time(tg, bio);
 
         min_wait = min(read_wait, write_wait);
         disptime = jiffies + min_wait;
@@ -941,7 +931,7 @@ static int throtl_dispatch_tg(struct throtl_grp *tg)
 
         /* Try to dispatch 75% READS and 25% WRITES */
         while ((bio = throtl_peek_queued(&sq->queued[READ])) &&
-               tg_may_dispatch(tg, bio, NULL)) {
+               tg_dispatch_time(tg, bio) == 0) {
 
                 tg_dispatch_one_bio(tg, READ);
                 nr_reads++;
@@ -951,7 +941,7 @@ static int throtl_dispatch_tg(struct throtl_grp *tg)
         }
 
         while ((bio = throtl_peek_queued(&sq->queued[WRITE])) &&
-               tg_may_dispatch(tg, bio, NULL)) {
+               tg_dispatch_time(tg, bio) == 0) {
 
                 tg_dispatch_one_bio(tg, WRITE);
                 nr_writes++;
@@ -1610,7 +1600,7 @@ static bool tg_within_limit(struct throtl_grp *tg, struct bio *bio, bool rw)
         if (tg->service_queue.nr_queued[rw])
                 return false;
 
-        return tg_may_dispatch(tg, bio, NULL);
+        return tg_dispatch_time(tg, bio) == 0;
 }
 
 bool __blk_throtl_bio(struct bio *bio)

From patchwork Thu Apr 17 10:58:28 2025
X-Patchwork-Submitter: Zizhi Wo
X-Patchwork-Id: 14055312
From: Zizhi Wo <wozizhi@huawei.com>
To: axboe@kernel.dk, linux-block@vger.kernel.org
Cc: yangerkun@huawei.com, yukuai3@huawei.com, wozizhi@huaweicloud.com,
    ming.lei@redhat.com, tj@kernel.org
Subject: [PATCH V2 2/7] blk-throttle: Refactor tg_dispatch_time by extracting tg_dispatch_bps/iops_time
Date: Thu, 17 Apr 2025 18:58:28 +0800
Message-ID: <20250417105833.1930283-3-wozizhi@huawei.com>
In-Reply-To: <20250417105833.1930283-1-wozizhi@huawei.com>
References: <20250417105833.1930283-1-wozizhi@huawei.com>
tg_dispatch_time() contains both the bps and iops throttling logic. Split
that logic into tg_dispatch_bps_time() and tg_dispatch_iops_time() to
improve code consistency and to prepare for the coming separation of the
bps and iops queues.

Besides, merge the time_before() check from the callers into
throtl_extend_slice() to make the code cleaner.
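The guard merge makes throtl_extend_slice() safe to call unconditionally.
A minimal, self-contained userspace sketch of that idea (illustrative only;
the names and values are made up):

#include <stdio.h>

static unsigned long slice_end;

/* jiffies-style wrapping comparison, as in the kernel's time_before() */
static int time_before(unsigned long a, unsigned long b)
{
        return (long)(a - b) < 0;
}

/* With the guard inside, callers no longer need their own time_before() */
static void extend_slice(unsigned long new_end)
{
        if (!time_before(slice_end, new_end))
                return;                 /* never shrink the slice */
        slice_end = new_end;
}

int main(void)
{
        slice_end = 100;
        extend_slice(80);               /* no-op: earlier than current end */
        extend_slice(150);              /* extends the slice */
        printf("slice_end=%lu\n", slice_end);  /* prints 150 */
        return 0;
}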
Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-throttle.c | 98 +++++++++++++++++++++++++-------------------
 1 file changed, 55 insertions(+), 43 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index dc23c961c028..0633ae0cce90 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -520,6 +520,9 @@ static inline void throtl_set_slice_end(struct throtl_grp *tg, bool rw,
 static inline void throtl_extend_slice(struct throtl_grp *tg, bool rw,
                                        unsigned long jiffy_end)
 {
+        if (!time_before(tg->slice_end[rw], jiffy_end))
+                return;
+
         throtl_set_slice_end(tg, rw, jiffy_end);
         throtl_log(&tg->service_queue,
                    "[%c] extend slice start=%lu end=%lu jiffies=%lu",
@@ -682,10 +685,6 @@ static unsigned long tg_within_iops_limit(struct throtl_grp *tg, struct bio *bio
         int io_allowed;
         unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
 
-        if (iops_limit == UINT_MAX) {
-                return 0;
-        }
-
         jiffy_elapsed = jiffies - tg->slice_start[rw];
 
         /* Round up to the next throttle slice, wait time must be nonzero */
@@ -711,11 +710,6 @@ static unsigned long tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio,
         unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
         unsigned int bio_size = throtl_bio_data_size(bio);
 
-        /* no need to throttle if this bio's bytes have been accounted */
-        if (bps_limit == U64_MAX || bio_flagged(bio, BIO_BPS_THROTTLED)) {
-                return 0;
-        }
-
         jiffy_elapsed = jiffy_elapsed_rnd = jiffies - tg->slice_start[rw];
 
         /* Slice has just started. Consider one slice interval */
@@ -742,6 +736,54 @@ static unsigned long tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio,
         return jiffy_wait;
 }
 
+/*
+ * If previous slice expired, start a new one otherwise renew/extend existing
+ * slice to make sure it is at least throtl_slice interval long since now.
+ * New slice is started only for empty throttle group. If there is queued bio,
+ * that means there should be an active slice and it should be extended instead.
+ */
+static void tg_update_slice(struct throtl_grp *tg, bool rw)
+{
+        if (throtl_slice_used(tg, rw) && !(tg->service_queue.nr_queued[rw]))
+                throtl_start_new_slice(tg, rw, true);
+        else
+                throtl_extend_slice(tg, rw, jiffies + tg->td->throtl_slice);
+}
+
+static unsigned long tg_dispatch_bps_time(struct throtl_grp *tg, struct bio *bio)
+{
+        bool rw = bio_data_dir(bio);
+        u64 bps_limit = tg_bps_limit(tg, rw);
+        unsigned long bps_wait;
+
+        /* no need to throttle if this bio's bytes have been accounted */
+        if (bps_limit == U64_MAX || tg->flags & THROTL_TG_CANCELING ||
+            bio_flagged(bio, BIO_BPS_THROTTLED))
+                return 0;
+
+        tg_update_slice(tg, rw);
+        bps_wait = tg_within_bps_limit(tg, bio, bps_limit);
+        throtl_extend_slice(tg, rw, jiffies + bps_wait);
+
+        return bps_wait;
+}
+
+static unsigned long tg_dispatch_iops_time(struct throtl_grp *tg, struct bio *bio)
+{
+        bool rw = bio_data_dir(bio);
+        u32 iops_limit = tg_iops_limit(tg, rw);
+        unsigned long iops_wait;
+
+        if (iops_limit == UINT_MAX || tg->flags & THROTL_TG_CANCELING)
+                return 0;
+
+        tg_update_slice(tg, rw);
+        iops_wait = tg_within_iops_limit(tg, bio, iops_limit);
+        throtl_extend_slice(tg, rw, jiffies + iops_wait);
+
+        return iops_wait;
+}
+
 /*
  * Returns approx number of jiffies to wait before this bio is with-in IO rate
  * and can be dispatched.
@@ -749,9 +791,7 @@ static unsigned long tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio,
 static unsigned long tg_dispatch_time(struct throtl_grp *tg, struct bio *bio)
 {
         bool rw = bio_data_dir(bio);
-        unsigned long bps_wait, iops_wait, max_wait;
-        u64 bps_limit = tg_bps_limit(tg, rw);
-        u32 iops_limit = tg_iops_limit(tg, rw);
+        unsigned long bps_wait, iops_wait;
 
         /*
          * Currently whole state machine of group depends on first bio
@@ -762,38 +802,10 @@ static unsigned long tg_dispatch_time(struct throtl_grp *tg, struct bio *bio)
         BUG_ON(tg->service_queue.nr_queued[rw] &&
                bio != throtl_peek_queued(&tg->service_queue.queued[rw]));
 
-        /* If tg->bps = -1, then BW is unlimited */
-        if ((bps_limit == U64_MAX && iops_limit == UINT_MAX) ||
-            tg->flags & THROTL_TG_CANCELING)
-                return 0;
-
-        /*
-         * If previous slice expired, start a new one otherwise renew/extend
-         * existing slice to make sure it is at least throtl_slice interval
-         * long since now. New slice is started only for empty throttle group.
-         * If there is queued bio, that means there should be an active
-         * slice and it should be extended instead.
-         */
-        if (throtl_slice_used(tg, rw) && !(tg->service_queue.nr_queued[rw]))
-                throtl_start_new_slice(tg, rw, true);
-        else {
-                if (time_before(tg->slice_end[rw],
-                                jiffies + tg->td->throtl_slice))
-                        throtl_extend_slice(tg, rw,
-                                        jiffies + tg->td->throtl_slice);
-        }
-
-        bps_wait = tg_within_bps_limit(tg, bio, bps_limit);
-        iops_wait = tg_within_iops_limit(tg, bio, iops_limit);
-        if (bps_wait + iops_wait == 0)
-                return 0;
-
-        max_wait = max(bps_wait, iops_wait);
-
-        if (time_before(tg->slice_end[rw], jiffies + max_wait))
-                throtl_extend_slice(tg, rw, jiffies + max_wait);
+        bps_wait = tg_dispatch_bps_time(tg, bio);
+        iops_wait = tg_dispatch_iops_time(tg, bio);
 
-        return max_wait;
+        return max(bps_wait, iops_wait);
 }
 
 static void throtl_charge_bio(struct throtl_grp *tg, struct bio *bio)
From patchwork Thu Apr 17 10:58:29 2025
X-Patchwork-Submitter: Zizhi Wo
X-Patchwork-Id: 14055310
From: Zizhi Wo <wozizhi@huawei.com>
To: axboe@kernel.dk, linux-block@vger.kernel.org
Cc: yangerkun@huawei.com, yukuai3@huawei.com, wozizhi@huaweicloud.com,
    ming.lei@redhat.com, tj@kernel.org
Subject: [PATCH V2 3/7] blk-throttle: Split throtl_charge_bio() into bps and iops functions
Date: Thu, 17 Apr 2025 18:58:29 +0800
Message-ID: <20250417105833.1930283-4-wozizhi@huawei.com>
In-Reply-To: <20250417105833.1930283-1-wozizhi@huawei.com>
References: <20250417105833.1930283-1-wozizhi@huawei.com>
Split throtl_charge_bio() into throtl_charge_bps_bio() and
throtl_charge_iops_bio() to facilitate subsequent patches that will
separately charge bps and iops after the queue separation.

Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-throttle.c | 35 ++++++++++++++++++++---------------
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 0633ae0cce90..91ee1c502b41 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -736,6 +736,20 @@ static unsigned long tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio,
         return jiffy_wait;
 }
 
+static void throtl_charge_bps_bio(struct throtl_grp *tg, struct bio *bio)
+{
+        unsigned int bio_size = throtl_bio_data_size(bio);
+
+        /* Charge the bio to the group */
+        if (!bio_flagged(bio, BIO_BPS_THROTTLED))
+                tg->bytes_disp[bio_data_dir(bio)] += bio_size;
+}
+
+static void throtl_charge_iops_bio(struct throtl_grp *tg, struct bio *bio)
+{
+        tg->io_disp[bio_data_dir(bio)]++;
+}
+
 /*
  * If previous slice expired, start a new one otherwise renew/extend existing
  * slice to make sure it is at least throtl_slice interval long since now.
@@ -808,18 +822,6 @@ static unsigned long tg_dispatch_time(struct throtl_grp *tg, struct bio *bio)
         return max(bps_wait, iops_wait);
 }
 
-static void throtl_charge_bio(struct throtl_grp *tg, struct bio *bio)
-{
-        bool rw = bio_data_dir(bio);
-        unsigned int bio_size = throtl_bio_data_size(bio);
-
-        /* Charge the bio to the group */
-        if (!bio_flagged(bio, BIO_BPS_THROTTLED))
-                tg->bytes_disp[rw] += bio_size;
-
-        tg->io_disp[rw]++;
-}
-
 /**
  * throtl_add_bio_tg - add a bio to the specified throtl_grp
  * @bio: bio to add
@@ -906,7 +908,8 @@ static void tg_dispatch_one_bio(struct throtl_grp *tg, bool rw)
         bio = throtl_pop_queued(&sq->queued[rw], &tg_to_put);
         sq->nr_queued[rw]--;
 
-        throtl_charge_bio(tg, bio);
+        throtl_charge_bps_bio(tg, bio);
+        throtl_charge_iops_bio(tg, bio);
 
         /*
          * If our parent is another tg, we just need to transfer @bio to
@@ -1633,7 +1636,8 @@ bool __blk_throtl_bio(struct bio *bio)
         while (true) {
                 if (tg_within_limit(tg, bio, rw)) {
                         /* within limits, let's charge and dispatch directly */
-                        throtl_charge_bio(tg, bio);
+                        throtl_charge_bps_bio(tg, bio);
+                        throtl_charge_iops_bio(tg, bio);
 
                         /*
                          * We need to trim slice even when bios are not being
@@ -1656,7 +1660,8 @@ bool __blk_throtl_bio(struct bio *bio)
                          * control algorithm is adaptive, and extra IO bytes
                          * will be throttled for paying the debt
                          */
-                        throtl_charge_bio(tg, bio);
+                        throtl_charge_bps_bio(tg, bio);
+                        throtl_charge_iops_bio(tg, bio);
                 } else {
                         /* if above limits, break to queue */
                         break;

From patchwork Thu Apr 17 10:58:30 2025
X-Patchwork-Submitter: Zizhi Wo
X-Patchwork-Id: 14055309
From: Zizhi Wo <wozizhi@huawei.com>
To: axboe@kernel.dk, linux-block@vger.kernel.org
Cc: yangerkun@huawei.com, yukuai3@huawei.com, wozizhi@huaweicloud.com,
    ming.lei@redhat.com, tj@kernel.org
Subject: [PATCH V2 4/7] blk-throttle: Introduce flag "BIO_TG_BPS_THROTTLED"
Date: Thu, 17 Apr 2025 18:58:30 +0800
Message-ID: <20250417105833.1930283-5-wozizhi@huawei.com>
In-Reply-To: <20250417105833.1930283-1-wozizhi@huawei.com>
References: <20250417105833.1930283-1-wozizhi@huawei.com>
Subsequent patches will split the single queue into separate bps and iops
queues. To prevent an IO that has already passed through the bps queue at
a single tg level from being counted toward the bps wait time again, we
introduce the "BIO_TG_BPS_THROTTLED" flag. Since throttle and QoS operate
at different levels, its value can safely reuse that of
"BIO_QOS_THROTTLED".

The flag is set when bps is charged and cleared when iops is charged,
because at that point the bio will either move on to the upper-level tg
or be dispatched. This patch does not involve functional changes.
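The flag's lifecycle can be pictured with a minimal, self-contained
userspace sketch (illustrative only; the kernel flag is a bio flag, here it
is just a bool): once bps is charged at a tg level, the flag suppresses a
second bps charge while the bio waits for iops, and the iops charge clears
it so the next (parent) level starts clean.

#include <stdbool.h>
#include <stdio.h>

struct fake_bio {
        bool tg_bps_throttled;  /* stands in for BIO_TG_BPS_THROTTLED */
};

static unsigned long bytes_disp, io_disp;

static void charge_bps(struct fake_bio *bio, unsigned long size)
{
        if (!bio->tg_bps_throttled) {
                bio->tg_bps_throttled = true;
                bytes_disp += size;
        }
}

static void charge_iops(struct fake_bio *bio)
{
        bio->tg_bps_throttled = false;  /* bio moves up a level or dispatches */
        io_disp++;
}

int main(void)
{
        struct fake_bio bio = { false };

        charge_bps(&bio, 4096);
        charge_bps(&bio, 4096); /* suppressed: already charged at this level */
        charge_iops(&bio);
        printf("bytes=%lu ios=%lu\n", bytes_disp, io_disp);    /* 4096 1 */
        return 0;
}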
Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-throttle.c      | 9 +++++++--
 include/linux/blk_types.h | 5 +++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 91ee1c502b41..caae2e3b7534 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -741,12 +741,16 @@ static void throtl_charge_bps_bio(struct throtl_grp *tg, struct bio *bio)
         unsigned int bio_size = throtl_bio_data_size(bio);
 
         /* Charge the bio to the group */
-        if (!bio_flagged(bio, BIO_BPS_THROTTLED))
+        if (!bio_flagged(bio, BIO_BPS_THROTTLED) &&
+            !bio_flagged(bio, BIO_TG_BPS_THROTTLED)) {
+                bio_set_flag(bio, BIO_TG_BPS_THROTTLED);
                 tg->bytes_disp[bio_data_dir(bio)] += bio_size;
+        }
 }
 
 static void throtl_charge_iops_bio(struct throtl_grp *tg, struct bio *bio)
 {
+        bio_clear_flag(bio, BIO_TG_BPS_THROTTLED);
         tg->io_disp[bio_data_dir(bio)]++;
 }
 
@@ -772,7 +776,8 @@ static unsigned long tg_dispatch_bps_time(struct throtl_grp *tg, struct bio *bio
 
         /* no need to throttle if this bio's bytes have been accounted */
         if (bps_limit == U64_MAX || tg->flags & THROTL_TG_CANCELING ||
-            bio_flagged(bio, BIO_BPS_THROTTLED))
+            bio_flagged(bio, BIO_BPS_THROTTLED) ||
+            bio_flagged(bio, BIO_TG_BPS_THROTTLED))
                 return 0;
 
         tg_update_slice(tg, rw);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index dce7615c35e7..7e0eddfaaa98 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -296,6 +296,11 @@ enum {
                                  * of this bio. */
         BIO_CGROUP_ACCT,        /* has been accounted to a cgroup */
         BIO_QOS_THROTTLED,      /* bio went through rq_qos throttle path */
+        /*
+         * This bio has undergone rate limiting at the single throtl_grp level bps
+         * queue. Since throttle and QoS are not at the same level, reuse the value.
+         */
+        BIO_TG_BPS_THROTTLED = BIO_QOS_THROTTLED,
         BIO_QOS_MERGED,         /* but went through rq_qos merge path */
         BIO_REMAPPED,
         BIO_ZONE_WRITE_PLUGGING, /* bio handled through zone write plugging */
From patchwork Thu Apr 17 10:58:31 2025
X-Patchwork-Submitter: Zizhi Wo
X-Patchwork-Id: 14055306
From: Zizhi Wo <wozizhi@huawei.com>
To: axboe@kernel.dk, linux-block@vger.kernel.org
Cc: yangerkun@huawei.com, yukuai3@huawei.com, wozizhi@huaweicloud.com,
    ming.lei@redhat.com, tj@kernel.org
Subject: [PATCH V2 5/7] blk-throttle: Split the blkthrotl queue
Date: Thu, 17 Apr 2025 18:58:31 +0800
Message-ID: <20250417105833.1930283-6-wozizhi@huawei.com>
In-Reply-To: <20250417105833.1930283-1-wozizhi@huawei.com>
References: <20250417105833.1930283-1-wozizhi@huawei.com>

This patch splits the single queue into separate bps and iops queues. Now,
an IO request must first pass through the bps queue, then the iops queue,
and finally be dispatched. Due to the queue split, the throtl add/peek/pop
functions need to be modified accordingly.

Additionally, the patch changes the logic of tg_dispatch_time(): if the
bio still needs to wait for bps, the function returns the bps wait time
directly; otherwise, it charges bps and returns the iops wait time, so
that the bio can then be placed directly into the iops queue. Note that
this may lead to more frequent updates of disptime, but the overhead is
negligible on this slow path.
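The reworked tg_dispatch_time() can be pictured as a two-stage gate. A
minimal, self-contained userspace sketch of that flow (illustrative only;
the budgets and wait values are made up, and unlike the kernel, which
charges iops later at dispatch, this sketch only tests the iops budget):

#include <stdio.h>

struct grp {
        long bps_budget;        /* bytes still allowed in this slice */
        long iops_budget;       /* ios still allowed in this slice */
};

static unsigned long dispatch_time(struct grp *g, long bio_bytes)
{
        if (bio_bytes > g->bps_budget)
                return 8;       /* made-up bps wait: bio stays in the bps queue */

        /* bps cleared: charge bps now, bio moves straight to the iops queue */
        g->bps_budget -= bio_bytes;

        /* made-up iops wait; the kernel charges iops later, at dispatch */
        return g->iops_budget > 0 ? 0 : 2;
}

int main(void)
{
        struct grp g = { .bps_budget = 4096, .iops_budget = 1 };

        printf("%lu\n", dispatch_time(&g, 4096));       /* 0: both gates clear */
        printf("%lu\n", dispatch_time(&g, 4096));       /* 8: bps exhausted */
        return 0;
}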
Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-throttle.c | 49 ++++++++++++++++++++++++++++++--------------
 block/blk-throttle.h | 3 ++-
 2 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index caae2e3b7534..1cfd226c3b39 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -143,7 +143,8 @@ static inline unsigned int throtl_bio_data_size(struct bio *bio)
 static void throtl_qnode_init(struct throtl_qnode *qn, struct throtl_grp *tg)
 {
         INIT_LIST_HEAD(&qn->node);
-        bio_list_init(&qn->bios);
+        bio_list_init(&qn->bios_bps);
+        bio_list_init(&qn->bios_iops);
         qn->tg = tg;
 }
@@ -160,7 +161,11 @@ static void throtl_qnode_init(struct throtl_qnode *qn, struct throtl_grp *tg)
 static void throtl_qnode_add_bio(struct bio *bio, struct throtl_qnode *qn,
                                  struct list_head *queued)
 {
-        bio_list_add(&qn->bios, bio);
+        if (bio_flagged(bio, BIO_TG_BPS_THROTTLED))
+                bio_list_add(&qn->bios_iops, bio);
+        else
+                bio_list_add(&qn->bios_bps, bio);
+
         if (list_empty(&qn->node)) {
                 list_add_tail(&qn->node, queued);
                 blkg_get(tg_to_blkg(qn->tg));
@@ -170,6 +175,10 @@ static void throtl_qnode_add_bio(struct bio *bio, struct throtl_qnode *qn,
 /**
  * throtl_peek_queued - peek the first bio on a qnode list
  * @queued: the qnode list to peek
+ *
+ * Always take a bio from the head of the iops queue first. If the queue
+ * is empty, we then take it from the bps queue to maintain the overall
+ * idea of fetching bios from the head.
  */
 static struct bio *throtl_peek_queued(struct list_head *queued)
 {
@@ -180,7 +189,9 @@ static struct bio *throtl_peek_queued(struct list_head *queued)
                 return NULL;
 
         qn = list_first_entry(queued, struct throtl_qnode, node);
-        bio = bio_list_peek(&qn->bios);
+        bio = bio_list_peek(&qn->bios_iops);
+        if (!bio)
+                bio = bio_list_peek(&qn->bios_bps);
         WARN_ON_ONCE(!bio);
         return bio;
 }
@@ -190,9 +201,10 @@ static struct bio *throtl_peek_queued(struct list_head *queued)
  * @queued: the qnode list to pop a bio from
  * @tg_to_put: optional out argument for throtl_grp to put
  *
- * Pop the first bio from the qnode list @queued.  After popping, the first
- * qnode is removed from @queued if empty or moved to the end of @queued so
- * that the popping order is round-robin.
+ * Pop the first bio from the qnode list @queued. Note that we firstly
+ * focus on the iops list because bios are ultimately dispatched from it.
+ * After popping, the first qnode is removed from @queued if empty or moved to
+ * the end of @queued so that the popping order is round-robin.
  *
  * When the first qnode is removed, its associated throtl_grp should be put
  * too.  If @tg_to_put is NULL, this function automatically puts it;
@@ -209,10 +221,12 @@ static struct bio *throtl_pop_queued(struct list_head *queued,
                 return NULL;
 
         qn = list_first_entry(queued, struct throtl_qnode, node);
-        bio = bio_list_pop(&qn->bios);
+        bio = bio_list_pop(&qn->bios_iops);
+        if (!bio)
+                bio = bio_list_pop(&qn->bios_bps);
         WARN_ON_ONCE(!bio);
 
-        if (bio_list_empty(&qn->bios)) {
+        if (bio_list_empty(&qn->bios_bps) && bio_list_empty(&qn->bios_iops)) {
                 list_del_init(&qn->node);
                 if (tg_to_put)
                         *tg_to_put = qn->tg;
@@ -805,12 +819,12 @@ static unsigned long tg_dispatch_iops_time(struct throtl_grp *tg, struct bio *bi
 
 /*
  * Returns approx number of jiffies to wait before this bio is with-in IO rate
- * and can be dispatched.
+ * and can be moved to other queue or dispatched.
  */
 static unsigned long tg_dispatch_time(struct throtl_grp *tg, struct bio *bio)
 {
         bool rw = bio_data_dir(bio);
-        unsigned long bps_wait, iops_wait;
+        unsigned long wait;
 
         /*
          * Currently whole state machine of group depends on first bio
@@ -821,10 +835,17 @@ static unsigned long tg_dispatch_time(struct throtl_grp *tg, struct bio *bio)
         BUG_ON(tg->service_queue.nr_queued[rw] &&
                bio != throtl_peek_queued(&tg->service_queue.queued[rw]));
 
-        bps_wait = tg_dispatch_bps_time(tg, bio);
-        iops_wait = tg_dispatch_iops_time(tg, bio);
+        wait = tg_dispatch_bps_time(tg, bio);
+        if (wait != 0)
+                return wait;
 
-        return max(bps_wait, iops_wait);
+        /*
+         * Charge bps here because @bio will be directly placed into the
+         * iops queue afterward.
+         */
+        throtl_charge_bps_bio(tg, bio);
+
+        return tg_dispatch_iops_time(tg, bio);
 }
 
 /**
@@ -913,7 +934,6 @@ static void tg_dispatch_one_bio(struct throtl_grp *tg, bool rw)
         bio = throtl_pop_queued(&sq->queued[rw], &tg_to_put);
         sq->nr_queued[rw]--;
 
-        throtl_charge_bps_bio(tg, bio);
         throtl_charge_iops_bio(tg, bio);
 
         /*
@@ -1641,7 +1661,6 @@ bool __blk_throtl_bio(struct bio *bio)
         while (true) {
                 if (tg_within_limit(tg, bio, rw)) {
                         /* within limits, let's charge and dispatch directly */
-                        throtl_charge_bps_bio(tg, bio);
                         throtl_charge_iops_bio(tg, bio);
 
                         /*
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index 7964cc041e06..5257e5c053e6 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -28,7 +28,8 @@
  */
 struct throtl_qnode {
         struct list_head        node;           /* service_queue->queued[] */
-        struct bio_list         bios;           /* queued bios */
+        struct bio_list         bios_bps;       /* queued bios for bps limit */
+        struct bio_list         bios_iops;      /* queued bios for iops limit */
         struct throtl_grp       *tg;            /* tg this qnode belongs to */
 };
From patchwork Thu Apr 17 10:58:32 2025
X-Patchwork-Submitter: Zizhi Wo
X-Patchwork-Id: 14055313
From: Zizhi Wo <wozizhi@huawei.com>
To: axboe@kernel.dk, linux-block@vger.kernel.org
Cc: yangerkun@huawei.com, yukuai3@huawei.com, wozizhi@huaweicloud.com,
    ming.lei@redhat.com, tj@kernel.org
Subject: [PATCH V2 6/7] blk-throttle: Split the service queue
Date: Thu, 17 Apr 2025 18:58:32 +0800
Message-ID: <20250417105833.1930283-7-wozizhi@huawei.com>
In-Reply-To: <20250417105833.1930283-1-wozizhi@huawei.com>
References: <20250417105833.1930283-1-wozizhi@huawei.com>
This patch splits throtl_service_queue->nr_queued into "nr_queued_bps"
and "nr_queued_iops", allowing separate accounting of bps and iops queued
bios. This prepares for future changes that need to check whether the bps
or iops queue is empty.

To keep the counters next to the queues they describe, the addition logic
is moved from throtl_add_bio_tg() to throtl_qnode_add_bio(), and,
similarly, the removal logic is moved from tg_dispatch_one_bio() to
throtl_pop_queued(). Also introduce sq_queued() to calculate the total
number of queued bios.
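The invariant behind sq_queued() can be shown with a minimal,
self-contained userspace sketch (illustrative only): the add and pop paths
are now the only places the two counters change, so their sum always
equals the number of queued bios.

#include <assert.h>
#include <stdio.h>

static unsigned int nr_queued_bps, nr_queued_iops;

static unsigned int sq_queued(void)
{
        return nr_queued_bps + nr_queued_iops;
}

/* analogue of throtl_qnode_add_bio(): the queue choice picks the counter */
static void add_bio(int bps_done)
{
        if (bps_done)
                nr_queued_iops++;
        else
                nr_queued_bps++;
}

/* analogue of throtl_pop_queued(): the iops queue is drained first */
static void pop_bio(void)
{
        if (nr_queued_iops)
                nr_queued_iops--;
        else
                nr_queued_bps--;
}

int main(void)
{
        add_bio(0);
        add_bio(1);
        assert(sq_queued() == 2);
        pop_bio();              /* takes the iops-queue bio first */
        assert(sq_queued() == 1 && nr_queued_iops == 0);
        printf("queued=%u\n", sq_queued());
        return 0;
}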
Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
---
 block/blk-throttle.c | 75 +++++++++++++++++++++++++++-----------------
 block/blk-throttle.h | 3 +-
 2 files changed, 48 insertions(+), 30 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 1cfd226c3b39..6f9f08d7e5fe 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -152,22 +152,27 @@ static void throtl_qnode_init(struct throtl_qnode *qn, struct throtl_grp *tg)
  * throtl_qnode_add_bio - add a bio to a throtl_qnode and activate it
  * @bio: bio being added
  * @qn: qnode to add bio to
- * @queued: the service_queue->queued[] list @qn belongs to
+ * @sq: the service_queue @qn belongs to
  *
- * Add @bio to @qn and put @qn on @queued if it's not already on.
+ * Add @bio to @qn and put @qn on @sq->queued if it's not already on.
  * @qn->tg's reference count is bumped when @qn is activated.  See the
  * comment on top of throtl_qnode definition for details.
  */
 static void throtl_qnode_add_bio(struct bio *bio, struct throtl_qnode *qn,
-                                 struct list_head *queued)
+                                 struct throtl_service_queue *sq)
 {
-        if (bio_flagged(bio, BIO_TG_BPS_THROTTLED))
+        bool rw = bio_data_dir(bio);
+
+        if (bio_flagged(bio, BIO_TG_BPS_THROTTLED)) {
                 bio_list_add(&qn->bios_iops, bio);
-        else
+                sq->nr_queued_iops[rw]++;
+        } else {
                 bio_list_add(&qn->bios_bps, bio);
+                sq->nr_queued_bps[rw]++;
+        }
 
         if (list_empty(&qn->node)) {
-                list_add_tail(&qn->node, queued);
+                list_add_tail(&qn->node, &sq->queued[rw]);
                 blkg_get(tg_to_blkg(qn->tg));
         }
 }
@@ -198,22 +203,24 @@ static struct bio *throtl_peek_queued(struct list_head *queued)
 
 /**
  * throtl_pop_queued - pop the first bio form a qnode list
- * @queued: the qnode list to pop a bio from
+ * @sq: the service_queue to pop a bio from
  * @tg_to_put: optional out argument for throtl_grp to put
+ * @rw: read/write
  *
- * Pop the first bio from the qnode list @queued. Note that we firstly
+ * Pop the first bio from the qnode list @sq->queued. Note that we firstly
  * focus on the iops list because bios are ultimately dispatched from it.
- * After popping, the first qnode is removed from @queued if empty or moved to
- * the end of @queued so that the popping order is round-robin.
+ * After popping, the first qnode is removed from @sq->queued if empty or
+ * moved to the end of @queued so that the popping order is round-robin.
 *
 * When the first qnode is removed, its associated throtl_grp should be put
 * too.  If @tg_to_put is NULL, this function automatically puts it;
 * otherwise, *@tg_to_put is set to the throtl_grp to put and the caller is
 * responsible for putting it.
 */
-static struct bio *throtl_pop_queued(struct list_head *queued,
-                                     struct throtl_grp **tg_to_put)
+static struct bio *throtl_pop_queued(struct throtl_service_queue *sq,
+                                     struct throtl_grp **tg_to_put, bool rw)
 {
+        struct list_head *queued = &sq->queued[rw];
         struct throtl_qnode *qn;
         struct bio *bio;
@@ -222,8 +229,12 @@ static struct bio *throtl_pop_queued(struct list_head *queued,
                 return NULL;
 
         qn = list_first_entry(queued, struct throtl_qnode, node);
         bio = bio_list_pop(&qn->bios_iops);
-        if (!bio)
+        if (!bio) {
                 bio = bio_list_pop(&qn->bios_bps);
+                sq->nr_queued_bps[rw]--;
+        } else {
+                sq->nr_queued_iops[rw]--;
+        }
         WARN_ON_ONCE(!bio);
 
         if (bio_list_empty(&qn->bios_bps) && bio_list_empty(&qn->bios_iops)) {
@@ -553,6 +564,11 @@ static bool throtl_slice_used(struct throtl_grp *tg, bool rw)
         return true;
 }
 
+static unsigned int sq_queued(struct throtl_service_queue *sq, int type)
+{
+        return sq->nr_queued_bps[type] + sq->nr_queued_iops[type];
+}
+
 static unsigned int calculate_io_allowed(u32 iops_limit,
                                          unsigned long jiffy_elapsed)
 {
@@ -682,9 +698,9 @@ static void tg_update_carryover(struct throtl_grp *tg)
         long long bytes[2] = {0};
         int ios[2] = {0};
 
-        if (tg->service_queue.nr_queued[READ])
+        if (sq_queued(&tg->service_queue, READ))
                 __tg_update_carryover(tg, READ, &bytes[READ], &ios[READ]);
-        if (tg->service_queue.nr_queued[WRITE])
+        if (sq_queued(&tg->service_queue, WRITE))
                 __tg_update_carryover(tg, WRITE, &bytes[WRITE], &ios[WRITE]);
 
         /* see comments in struct throtl_grp for meaning of these fields. */
@@ -776,7 +792,8 @@ static void throtl_charge_iops_bio(struct throtl_grp *tg, struct bio *bio)
  */
 static void tg_update_slice(struct throtl_grp *tg, bool rw)
 {
-        if (throtl_slice_used(tg, rw) && !(tg->service_queue.nr_queued[rw]))
+        if (throtl_slice_used(tg, rw) &&
+            sq_queued(&tg->service_queue, rw) == 0)
                 throtl_start_new_slice(tg, rw, true);
         else
                 throtl_extend_slice(tg, rw, jiffies + tg->td->throtl_slice);
@@ -832,7 +849,7 @@ static unsigned long tg_dispatch_time(struct throtl_grp *tg, struct bio *bio)
          * this function with a different bio if there are other bios
          * queued.
          */
-        BUG_ON(tg->service_queue.nr_queued[rw] &&
+        BUG_ON(sq_queued(&tg->service_queue, rw) &&
                bio != throtl_peek_queued(&tg->service_queue.queued[rw]));
 
         wait = tg_dispatch_bps_time(tg, bio);
@@ -872,12 +889,11 @@ static void throtl_add_bio_tg(struct bio *bio, struct throtl_qnode *qn,
          * dispatched. Mark that @tg was empty. This is automatically
          * cleared on the next tg_update_disptime().
          */
-        if (!sq->nr_queued[rw])
+        if (sq_queued(sq, rw) == 0)
                 tg->flags |= THROTL_TG_WAS_EMPTY;
 
-        throtl_qnode_add_bio(bio, qn, &sq->queued[rw]);
+        throtl_qnode_add_bio(bio, qn, sq);
 
-        sq->nr_queued[rw]++;
         throtl_enqueue_tg(tg);
 }
@@ -931,8 +947,7 @@ static void tg_dispatch_one_bio(struct throtl_grp *tg, bool rw)
          * getting released prematurely.  Remember the tg to put and put it
          * after @bio is transferred to @parent_sq.
          */
-        bio = throtl_pop_queued(&sq->queued[rw], &tg_to_put);
-        sq->nr_queued[rw]--;
+        bio = throtl_pop_queued(sq, &tg_to_put, rw);
 
         throtl_charge_iops_bio(tg, bio);
@@ -949,7 +964,7 @@ static void tg_dispatch_one_bio(struct throtl_grp *tg, bool rw)
         } else {
                 bio_set_flag(bio, BIO_BPS_THROTTLED);
                 throtl_qnode_add_bio(bio, &tg->qnode_on_parent[rw],
-                                     &parent_sq->queued[rw]);
+                                     parent_sq);
                 BUG_ON(tg->td->nr_queued[rw] <= 0);
                 tg->td->nr_queued[rw]--;
         }
@@ -1014,7 +1029,7 @@ static int throtl_select_dispatch(struct throtl_service_queue *parent_sq)
                 nr_disp += throtl_dispatch_tg(tg);
 
                 sq = &tg->service_queue;
-                if (sq->nr_queued[READ] || sq->nr_queued[WRITE])
+                if (sq_queued(sq, READ) || sq_queued(sq, WRITE))
                         tg_update_disptime(tg);
                 else
                         throtl_dequeue_tg(tg);
@@ -1067,9 +1082,11 @@ static void throtl_pending_timer_fn(struct timer_list *t)
         dispatched = false;
 
         while (true) {
+                unsigned int bio_cnt_r = sq_queued(sq, READ);
+                unsigned int bio_cnt_w = sq_queued(sq, WRITE);
+
                 throtl_log(sq, "dispatch nr_queued=%u read=%u write=%u",
-                           sq->nr_queued[READ] + sq->nr_queued[WRITE],
-                           sq->nr_queued[READ], sq->nr_queued[WRITE]);
+                           bio_cnt_r + bio_cnt_w, bio_cnt_r, bio_cnt_w);
 
                 ret = throtl_select_dispatch(sq);
                 if (ret) {
@@ -1131,7 +1148,7 @@ static void blk_throtl_dispatch_work_fn(struct work_struct *work)
 
         spin_lock_irq(&q->queue_lock);
         for (rw = READ; rw <= WRITE; rw++)
-                while ((bio = throtl_pop_queued(&td_sq->queued[rw], NULL)))
+                while ((bio = throtl_pop_queued(td_sq, NULL, rw)))
                         bio_list_add(&bio_list_on_stack, bio);
         spin_unlock_irq(&q->queue_lock);
@@ -1637,7 +1654,7 @@ void blk_throtl_cancel_bios(struct gendisk *disk)
 static bool tg_within_limit(struct throtl_grp *tg, struct bio *bio, bool rw)
 {
         /* throtl is FIFO - if bios are already queued, should queue */
-        if (tg->service_queue.nr_queued[rw])
+        if (sq_queued(&tg->service_queue, rw))
                 return false;
 
         return tg_dispatch_time(tg, bio) == 0;
@@ -1711,7 +1728,7 @@ bool __blk_throtl_bio(struct bio *bio)
                    tg->bytes_disp[rw], bio->bi_iter.bi_size,
                    tg_bps_limit(tg, rw),
                    tg->io_disp[rw], tg_iops_limit(tg, rw),
-                   sq->nr_queued[READ], sq->nr_queued[WRITE]);
+                   sq_queued(sq, READ), sq_queued(sq, WRITE));
 
         td->nr_queued[rw]++;
         throtl_add_bio_tg(bio, qn, tg);
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index 5257e5c053e6..04e92cfd0ab1 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -41,7 +41,8 @@ struct throtl_service_queue {
          * children throtl_grp's.
          */
         struct list_head        queued[2];      /* throtl_qnode [READ/WRITE] */
-        unsigned int            nr_queued[2];   /* number of queued bios */
+        unsigned int            nr_queued_bps[2];  /* number of queued bps bios */
+        unsigned int            nr_queued_iops[2]; /* number of queued iops bios */
 
         /*
          * RB tree of active children throtl_grp's, which are sorted by
From patchwork Thu Apr 17 10:58:33 2025
X-Patchwork-Submitter: Zizhi Wo
X-Patchwork-Id: 14055311
From: Zizhi Wo <wozizhi@huawei.com>
To: axboe@kernel.dk, linux-block@vger.kernel.org
Cc: yangerkun@huawei.com, yukuai3@huawei.com, wozizhi@huaweicloud.com,
    ming.lei@redhat.com, tj@kernel.org
Subject: [PATCH V2 7/7] blk-throttle: Prevent the bps restricted io from entering the bps queue again
Date: Thu, 17 Apr 2025 18:58:33 +0800
Message-ID: <20250417105833.1930283-8-wozizhi@huawei.com>
In-Reply-To: <20250417105833.1930283-1-wozizhi@huawei.com>
References: <20250417105833.1930283-1-wozizhi@huawei.com>

[BUG]
There is an issue of delayed IO dispatch caused by IO splitting. Consider
the following scenario:
1) If we set a BPS limit of 1MB/s and restrict the maximum IO size per
   dispatch to 4KB, submitting two 1MB IO requests results in completion
   times of 1s and 2s, which is expected.
2) However, if we additionally set an IOPS limit of 1,000,000/s with the
   same BPS limit of 1MB/s, submitting two 1MB IO requests again results
   in both completing in 2s, even though the IOPS constraint is easily
   met.

[CAUSE]
This issue arises because BPS and IOPS currently share the same queue in
the blkthrotl mechanism:
1) The issue does not occur when only BPS is limited, because the split
   IOs return false in blk_should_throtl() and do not go through throtl
   again.
2) For split IOs, even if they have been tagged with BIO_BPS_THROTTLED,
   they still get queued alternately in the same list due to continuous
   splitting and reordering. As a result, the two IO requests are both
   completed at the 2-second mark, causing an unintended delay.
3) It follows that in this scenario, if N 1MB IOs are issued at once, all
   of them will complete together at the N-second mark.

[FIX]
With the queue separation introduced in the previous patches, we now have
separate BPS and IOPS queues. IOs that have already passed the BPS limit
do not need to re-enter the BPS queue and can be placed directly into the
IOPS queue.

Since the queues are split, when the IOPS queue was previously empty and
a new bio is added to the first qnode in the service_queue, we also need
to update the disptime. This patch introduces the
THROTL_TG_IOPS_WAS_EMPTY flag to mark that case.
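To make the numbers in [BUG] concrete (illustrative arithmetic, assuming
each 1MB IO is split into 256 bios of 4KB):
- A 1MB/s BPS budget admits 256 of the 4KB split bios per second, i.e.
  exactly one full 1MB IO per second, so the expected completions are at
  ~1s and ~2s.
- With the shared queue, the split bios of the two IOs end up interleaved
  in the single FIFO list, so the last split of the first IO is only
  dispatched near the end of the second second, and both IOs complete at
  ~2s.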
Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-throttle.c | 53 +++++++++++++++++++++++++++++++++++++-------
 block/blk-throttle.h | 8 ++++---
 2 files changed, 50 insertions(+), 11 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 6f9f08d7e5fe..29a60ce8a344 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -163,7 +163,12 @@ static void throtl_qnode_add_bio(struct bio *bio, struct throtl_qnode *qn,
 {
         bool rw = bio_data_dir(bio);
 
-        if (bio_flagged(bio, BIO_TG_BPS_THROTTLED)) {
+        /*
+         * Split bios have already been throttled by bps, so they are
+         * directly queued into the iops path.
+         */
+        if (bio_flagged(bio, BIO_TG_BPS_THROTTLED) ||
+            bio_flagged(bio, BIO_BPS_THROTTLED)) {
                 bio_list_add(&qn->bios_iops, bio);
                 sq->nr_queued_iops[rw]++;
         } else {
@@ -894,6 +899,15 @@ static void throtl_add_bio_tg(struct bio *bio, struct throtl_qnode *qn,
 
         throtl_qnode_add_bio(bio, qn, sq);
 
+        /*
+         * Since we have split the queues, when the iops queue is
+         * previously empty and a new @bio is added into the first @qn,
+         * we also need to update the @tg->disptime.
+         */
+        if (bio_flagged(bio, BIO_BPS_THROTTLED) &&
+            bio == throtl_peek_queued(&sq->queued[rw]))
+                tg->flags |= THROTL_TG_IOPS_WAS_EMPTY;
+
         throtl_enqueue_tg(tg);
 }
@@ -921,6 +935,7 @@ static void tg_update_disptime(struct throtl_grp *tg)
 
         /* see throtl_add_bio_tg() */
         tg->flags &= ~THROTL_TG_WAS_EMPTY;
+        tg->flags &= ~THROTL_TG_IOPS_WAS_EMPTY;
 }
 
 static void start_parent_slice_with_credit(struct throtl_grp *child_tg,
@@ -1108,7 +1123,8 @@ static void throtl_pending_timer_fn(struct timer_list *t)
 
         if (parent_sq) {
                 /* @parent_sq is another throl_grp, propagate dispatch */
-                if (tg->flags & THROTL_TG_WAS_EMPTY) {
+                if (tg->flags & THROTL_TG_WAS_EMPTY ||
+                    tg->flags & THROTL_TG_IOPS_WAS_EMPTY) {
                         tg_update_disptime(tg);
                         if (!throtl_schedule_next_dispatch(parent_sq, false)) {
                                 /* window is already open, repeat dispatching */
@@ -1653,9 +1669,28 @@ void blk_throtl_cancel_bios(struct gendisk *disk)
 
 static bool tg_within_limit(struct throtl_grp *tg, struct bio *bio, bool rw)
 {
-        /* throtl is FIFO - if bios are already queued, should queue */
-        if (sq_queued(&tg->service_queue, rw))
+        struct throtl_service_queue *sq = &tg->service_queue;
+
+        /*
+         * For a split bio, we need to specifically distinguish whether the
+         * iops queue is empty.
+         */
+        if (bio_flagged(bio, BIO_BPS_THROTTLED))
+                return sq->nr_queued_iops[rw] == 0 &&
+                        tg_dispatch_iops_time(tg, bio) == 0;
+
+        /*
+         * Throtl is FIFO - if bios are already queued, should queue.
+         * If the bps queue is empty and @bio is within the bps limit, charge
+         * bps here for direct placement into the iops queue.
+         */
+        if (sq_queued(&tg->service_queue, rw)) {
+                if (sq->nr_queued_bps[rw] == 0 &&
+                    tg_dispatch_bps_time(tg, bio) == 0)
+                        throtl_charge_bps_bio(tg, bio);
+
                 return false;
+        }
 
         return tg_dispatch_time(tg, bio) == 0;
 }
@@ -1736,11 +1771,13 @@ bool __blk_throtl_bio(struct bio *bio)
 
         /*
          * Update @tg's dispatch time and force schedule dispatch if @tg
-         * was empty before @bio.  The forced scheduling isn't likely to
-         * cause undue delay as @bio is likely to be dispatched directly if
-         * its @tg's disptime is not in the future.
+         * was empty before @bio, or the iops queue is empty and @bio will
+         * add to.  The forced scheduling isn't likely to cause undue
+         * delay as @bio is likely to be dispatched directly if its @tg's
+         * disptime is not in the future.
          */
-        if (tg->flags & THROTL_TG_WAS_EMPTY) {
+        if (tg->flags & THROTL_TG_WAS_EMPTY ||
+            tg->flags & THROTL_TG_IOPS_WAS_EMPTY) {
                 tg_update_disptime(tg);
                 throtl_schedule_next_dispatch(tg->service_queue.parent_sq, true);
         }
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index 04e92cfd0ab1..6f11aaabe7e7 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -55,9 +55,11 @@ struct throtl_service_queue {
 };
 
 enum tg_state_flags {
-        THROTL_TG_PENDING = 1 << 0,     /* on parent's pending tree */
-        THROTL_TG_WAS_EMPTY = 1 << 1,   /* bio_lists[] became non-empty */
-        THROTL_TG_CANCELING = 1 << 2,   /* starts to cancel bio */
+        THROTL_TG_PENDING = 1 << 0,     /* on parent's pending tree */
+        THROTL_TG_WAS_EMPTY = 1 << 1,   /* bio_lists[] became non-empty */
+        /* iops queue is empty, and a bio is about to be enqueued to the first qnode. */
+        THROTL_TG_IOPS_WAS_EMPTY = 1 << 2,
+        THROTL_TG_CANCELING = 1 << 3,   /* starts to cancel bio */
 };
 
 struct throtl_grp {