From patchwork Tue Nov 29 03:01:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kemeng Shi X-Patchwork-Id: 13058146 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 967B8C4321E for ; Tue, 29 Nov 2022 03:01:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235299AbiK2DBz (ORCPT ); Mon, 28 Nov 2022 22:01:55 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55922 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235205AbiK2DBx (ORCPT ); Mon, 28 Nov 2022 22:01:53 -0500 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CA80C3D92C; Mon, 28 Nov 2022 19:01:52 -0800 (PST) Received: from kwepemi500016.china.huawei.com (unknown [172.30.72.54]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4NLnCn4JGkzCqj2; Tue, 29 Nov 2022 11:01:09 +0800 (CST) Received: from huawei.com (10.174.178.129) by kwepemi500016.china.huawei.com (7.221.188.220) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Tue, 29 Nov 2022 11:01:50 +0800 From: Kemeng Shi To: , , CC: , , , Subject: [PATCH v2 01/10] blk-throttle: correct stale comment in throtl_pd_init Date: Tue, 29 Nov 2022 11:01:38 +0800 Message-ID: <20221129030147.27400-2-shikemeng@huawei.com> X-Mailer: git-send-email 2.14.1.windows.1 In-Reply-To: <20221129030147.27400-1-shikemeng@huawei.com> References: <20221129030147.27400-1-shikemeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.178.129] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemi500016.china.huawei.com (7.221.188.220) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org On the default hierarchy (cgroup2), the throttle interface files don't exist in the root cgroup, so the ablity to limit the whole system by configuring root group is not existing anymore. In general, cgroup doesn't wanna be in the business of restricting resources at the system level, so correct the stale comment that we can limit whole system to we can only limit subtree. Signed-off-by: Kemeng Shi Acked-by: Tejun Heo --- block/blk-throttle.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 847721dc2b2b..59acfac87764 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -395,8 +395,9 @@ static void throtl_pd_init(struct blkg_policy_data *pd) * If on the default hierarchy, we switch to properly hierarchical * behavior where limits on a given throtl_grp are applied to the * whole subtree rather than just the group itself. e.g. If 16M - * read_bps limit is set on the root group, the whole system can't - * exceed 16M for the device. + * read_bps limit is set on a "parent" group, summary bps of + * "parent" group and its subtree groups can't exceed 16M for the + * device. * * If not on the default hierarchy, the broken flat hierarchy * behavior is retained where all throtl_grps are treated as if From patchwork Tue Nov 29 03:01:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kemeng Shi X-Patchwork-Id: 13058145 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 96F50C4332F for ; Tue, 29 Nov 2022 03:01:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235295AbiK2DBz (ORCPT ); Mon, 28 Nov 2022 22:01:55 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55928 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235290AbiK2DBy (ORCPT ); Mon, 28 Nov 2022 22:01:54 -0500 Received: from szxga08-in.huawei.com (szxga08-in.huawei.com [45.249.212.255]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7C77445084; Mon, 28 Nov 2022 19:01:53 -0800 (PST) Received: from kwepemi500016.china.huawei.com (unknown [172.30.72.53]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4NLnCs3z0Jz15Mxb; Tue, 29 Nov 2022 11:01:13 +0800 (CST) Received: from huawei.com (10.174.178.129) by kwepemi500016.china.huawei.com (7.221.188.220) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Tue, 29 Nov 2022 11:01:51 +0800 From: Kemeng Shi To: , , CC: , , , Subject: [PATCH v2 02/10] blk-throttle: Fix that bps of child could exceed bps limited in parent Date: Tue, 29 Nov 2022 11:01:39 +0800 Message-ID: <20221129030147.27400-3-shikemeng@huawei.com> X-Mailer: git-send-email 2.14.1.windows.1 In-Reply-To: <20221129030147.27400-1-shikemeng@huawei.com> References: <20221129030147.27400-1-shikemeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.178.129] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemi500016.china.huawei.com (7.221.188.220) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Consider situation as following (on the default hierarchy): HDD | root (bps limit: 4k) | child (bps limit :8k) | fio bs=8k Rate of fio is supposed to be 4k, but result is 8k. Reason is as following: Size of single IO from fio is larger than bytes allowed in one throtl_slice in child, so IOs are always queued in child group first. When queued IOs in child are dispatched to parent group, BIO_BPS_THROTTLED is set and these IOs will not be limited by tg_within_bps_limit anymore. Fix this by only set BIO_BPS_THROTTLED when the bio traversed the entire tree. There patch has no influence on situation which is not on the default hierarchy as each group is a single root group without parent. Signed-off-by: Kemeng Shi Acked-by: Tejun Heo --- block/blk-throttle.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 59acfac87764..d516a37e82fb 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -1067,7 +1067,6 @@ static void tg_dispatch_one_bio(struct throtl_grp *tg, bool rw) sq->nr_queued[rw]--; throtl_charge_bio(tg, bio); - bio_set_flag(bio, BIO_BPS_THROTTLED); /* * If our parent is another tg, we just need to transfer @bio to @@ -1080,6 +1079,7 @@ static void tg_dispatch_one_bio(struct throtl_grp *tg, bool rw) throtl_add_bio_tg(bio, &tg->qnode_on_parent[rw], parent_tg); start_parent_slice_with_credit(tg, parent_tg, rw); } else { + bio_set_flag(bio, BIO_BPS_THROTTLED); throtl_qnode_add_bio(bio, &tg->qnode_on_parent[rw], &parent_sq->queued[rw]); BUG_ON(tg->td->nr_queued[rw] <= 0); From patchwork Tue Nov 29 03:01:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kemeng Shi X-Patchwork-Id: 13058149 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CCF38C4332F for ; Tue, 29 Nov 2022 03:02:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235307AbiK2DB4 (ORCPT ); Mon, 28 Nov 2022 22:01:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55934 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235291AbiK2DBy (ORCPT ); Mon, 28 Nov 2022 22:01:54 -0500 Received: from szxga08-in.huawei.com (szxga08-in.huawei.com [45.249.212.255]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1509845097; Mon, 28 Nov 2022 19:01:54 -0800 (PST) Received: from kwepemi500016.china.huawei.com (unknown [172.30.72.57]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4NLnCt1XVRz15MpD; Tue, 29 Nov 2022 11:01:14 +0800 (CST) Received: from huawei.com (10.174.178.129) by kwepemi500016.china.huawei.com (7.221.188.220) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Tue, 29 Nov 2022 11:01:51 +0800 From: Kemeng Shi To: , , CC: , , , Subject: [PATCH v2 03/10] blk-throttle: ignore cgroup without io queued in blk_throtl_cancel_bios Date: Tue, 29 Nov 2022 11:01:40 +0800 Message-ID: <20221129030147.27400-4-shikemeng@huawei.com> X-Mailer: git-send-email 2.14.1.windows.1 In-Reply-To: <20221129030147.27400-1-shikemeng@huawei.com> References: <20221129030147.27400-1-shikemeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.178.129] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemi500016.china.huawei.com (7.221.188.220) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Ignore cgroup without io queued in blk_throtl_cancel_bios for two reasons: 1. Save cpu cycle for trying to dispatch cgroup which is no io queued. 2. Avoid non-consistent state that cgroup is inserted to service queue without THROTL_TG_PENDING set as tg_update_disptime will unconditional re-insert cgroup to service queue. If we are on the default hierarchy, IO dispatched from child in tg_dispatch_one_bio will trigger inserting cgroup to service queue without erase first and ruin the tree. Signed-off-by: Kemeng Shi Acked-by: Tejun Heo --- block/blk-throttle.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index d516a37e82fb..ee7dc1a0adfd 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -1738,7 +1738,18 @@ void blk_throtl_cancel_bios(struct gendisk *disk) * Set the flag to make sure throtl_pending_timer_fn() won't * stop until all throttled bios are dispatched. */ - blkg_to_tg(blkg)->flags |= THROTL_TG_CANCELING; + tg->flags |= THROTL_TG_CANCELING; + + /* + * Do not dispatch cgroup without THROTL_TG_PENDING or cgroup + * will be inserted to service queue without THROTL_TG_PENDING + * set in tg_update_disptime below. Then IO dispatched from + * child in tg_dispatch_one_bio will trigger double insertion + * and corrupt the tree. + */ + if (!(tg->flags & THROTL_TG_PENDING)) + continue; + /* * Update disptime after setting the above flag to make sure * throtl_select_dispatch() won't exit without dispatching. From patchwork Tue Nov 29 03:01:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kemeng Shi X-Patchwork-Id: 13058148 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 276B0C46467 for ; Tue, 29 Nov 2022 03:02:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235311AbiK2DB5 (ORCPT ); Mon, 28 Nov 2022 22:01:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55944 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235301AbiK2DB4 (ORCPT ); Mon, 28 Nov 2022 22:01:56 -0500 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BB3CD45EF1; Mon, 28 Nov 2022 19:01:54 -0800 (PST) Received: from kwepemi500016.china.huawei.com (unknown [172.30.72.53]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4NLnCq49DWzCqjJ; Tue, 29 Nov 2022 11:01:11 +0800 (CST) Received: from huawei.com (10.174.178.129) by kwepemi500016.china.huawei.com (7.221.188.220) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Tue, 29 Nov 2022 11:01:52 +0800 From: Kemeng Shi To: , , CC: , , , Subject: [PATCH v2 04/10] blk-throttle: correct calculation of wait time in tg_may_dispatch Date: Tue, 29 Nov 2022 11:01:41 +0800 Message-ID: <20221129030147.27400-5-shikemeng@huawei.com> X-Mailer: git-send-email 2.14.1.windows.1 In-Reply-To: <20221129030147.27400-1-shikemeng@huawei.com> References: <20221129030147.27400-1-shikemeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.178.129] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemi500016.china.huawei.com (7.221.188.220) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org In C language, When executing "if (expression1 && expression2)" and expression1 return false, the expression2 may not be executed. For "tg_within_bps_limit(tg, bio, bps_limit, &bps_wait) && tg_within_iops_limit(tg, bio, iops_limit, &iops_wait))", if bps is limited, tg_within_bps_limit will return false and tg_within_iops_limit will not be called. So even bps and iops are both limited, iops_wait will not be calculated and is always zero. So wait time of iops is always ignored. Fix this by always calling tg_within_bps_limit and tg_within_iops_limit to get wait time for both bps and iops. Observed that: 1. Wait time in tg_within_iops_limit/tg_within_bps_limit need always be stored as wait argument is always passed. 2. wait time is stored to zero if iops/bps is limited otherwise non-zero is stored. Simpfy tg_within_iops_limit/tg_within_bps_limit by removing wait argument and return wait time directly. Caller tg_may_dispatch checks if wait time is zero to find if iops/bps is limited. Signed-off-by: Kemeng Shi Acked-by: Tejun Heo --- block/blk-throttle.c | 38 +++++++++++++------------------------- 1 file changed, 13 insertions(+), 25 deletions(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index ee7dc1a0adfd..06e4193b064e 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -822,17 +822,15 @@ static void tg_update_carryover(struct throtl_grp *tg) tg->carryover_ios[READ], tg->carryover_ios[WRITE]); } -static bool tg_within_iops_limit(struct throtl_grp *tg, struct bio *bio, - u32 iops_limit, unsigned long *wait) +static unsigned long tg_within_iops_limit(struct throtl_grp *tg, struct bio *bio, + u32 iops_limit) { bool rw = bio_data_dir(bio); unsigned int io_allowed; unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd; if (iops_limit == UINT_MAX) { - if (wait) - *wait = 0; - return true; + return 0; } jiffy_elapsed = jiffies - tg->slice_start[rw]; @@ -842,21 +840,16 @@ static bool tg_within_iops_limit(struct throtl_grp *tg, struct bio *bio, io_allowed = calculate_io_allowed(iops_limit, jiffy_elapsed_rnd) + tg->carryover_ios[rw]; if (tg->io_disp[rw] + 1 <= io_allowed) { - if (wait) - *wait = 0; - return true; + return 0; } /* Calc approx time to dispatch */ jiffy_wait = jiffy_elapsed_rnd - jiffy_elapsed; - - if (wait) - *wait = jiffy_wait; - return false; + return jiffy_wait; } -static bool tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio, - u64 bps_limit, unsigned long *wait) +static unsigned long tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio, + u64 bps_limit) { bool rw = bio_data_dir(bio); u64 bytes_allowed, extra_bytes; @@ -865,9 +858,7 @@ static bool tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio, /* no need to throttle if this bio's bytes have been accounted */ if (bps_limit == U64_MAX || bio_flagged(bio, BIO_BPS_THROTTLED)) { - if (wait) - *wait = 0; - return true; + return 0; } jiffy_elapsed = jiffy_elapsed_rnd = jiffies - tg->slice_start[rw]; @@ -880,9 +871,7 @@ static bool tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio, bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd) + tg->carryover_bytes[rw]; if (tg->bytes_disp[rw] + bio_size <= bytes_allowed) { - if (wait) - *wait = 0; - return true; + return 0; } /* Calc approx time to dispatch */ @@ -897,9 +886,7 @@ static bool tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio, * up we did. Add that time also. */ jiffy_wait = jiffy_wait + (jiffy_elapsed_rnd - jiffy_elapsed); - if (wait) - *wait = jiffy_wait; - return false; + return jiffy_wait; } /* @@ -947,8 +934,9 @@ static bool tg_may_dispatch(struct throtl_grp *tg, struct bio *bio, jiffies + tg->td->throtl_slice); } - if (tg_within_bps_limit(tg, bio, bps_limit, &bps_wait) && - tg_within_iops_limit(tg, bio, iops_limit, &iops_wait)) { + bps_wait = tg_within_bps_limit(tg, bio, bps_limit); + iops_wait = tg_within_iops_limit(tg, bio, iops_limit); + if (bps_wait + iops_wait == 0) { if (wait) *wait = 0; return true; From patchwork Tue Nov 29 03:01:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kemeng Shi X-Patchwork-Id: 13058151 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75CA7C4332F for ; Tue, 29 Nov 2022 03:02:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235347AbiK2DCH (ORCPT ); Mon, 28 Nov 2022 22:02:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55950 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235305AbiK2DB4 (ORCPT ); Mon, 28 Nov 2022 22:01:56 -0500 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 68B85303F3; Mon, 28 Nov 2022 19:01:55 -0800 (PST) Received: from kwepemi500016.china.huawei.com (unknown [172.30.72.56]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4NLnCr1jMyzCqjL; Tue, 29 Nov 2022 11:01:12 +0800 (CST) Received: from huawei.com (10.174.178.129) by kwepemi500016.china.huawei.com (7.221.188.220) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Tue, 29 Nov 2022 11:01:53 +0800 From: Kemeng Shi To: , , CC: , , , Subject: [PATCH v2 05/10] blk-throttle: simpfy low limit reached check in throtl_tg_can_upgrade Date: Tue, 29 Nov 2022 11:01:42 +0800 Message-ID: <20221129030147.27400-6-shikemeng@huawei.com> X-Mailer: git-send-email 2.14.1.windows.1 In-Reply-To: <20221129030147.27400-1-shikemeng@huawei.com> References: <20221129030147.27400-1-shikemeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.178.129] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemi500016.china.huawei.com (7.221.188.220) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Commit c79892c557616 ("blk-throttle: add upgrade logic for LIMIT_LOW state") added upgrade logic for low limit and methioned that 1. "To determine if a cgroup exceeds its limitation, we check if the cgroup has pending request. Since cgroup is throttled according to the limit, pending request means the cgroup reaches the limit." 2. "If a cgroup has limit set for both read and write, we consider the combination of them for upgrade. The reason is read IO and write IO can interfere with each other. If we do the upgrade based in one direction IO, the other direction IO could be severly harmed." Besides, we also determine that cgroup reaches low limit if low limit is 0, see comment in throtl_tg_can_upgrade. Collect the information above, the desgin of upgrade check is as following: 1.The low limit is reached if limit is zero or io is already queued. 2.Cgroup will pass upgrade check if low limits of READ and WRITE are both reached. Simpfy the check code described above to removce repeat check and improve readability. There is no functional change. Detail equivalence proof is as following: All replaced conditions to return true are as following: condition 1 (!read_limit && !write_limit) condition 2 read_limit && sq->nr_queued[READ] && (!write_limit || sq->nr_queued[WRITE]) condition 3 write_limit && sq->nr_queued[WRITE] && (!read_limit || sq->nr_queued[READ]) Transferring condition 2 as following: read_limit && sq->nr_queued[READ] && (!write_limit || sq->nr_queued[WRITE]) is equivalent to (read_limit && sq->nr_queued[READ]) && (!write_limit || (write_limit && sq->nr_queued[WRITE])) is equivalent to condition 2.1 (read_limit && sq->nr_queued[READ] && !write_limit) || condition 2.2 (read_limit && sq->nr_queued[READ] && (write_limit && sq->nr_queued[WRITE])) Transferring condition 3 as following: write_limit && sq->nr_queued[WRITE] && (!read_limit || sq->nr_queued[READ]) is equivalent to (write_limit && sq->nr_queued[WRITE]) && (!read_limit || (read_limit && sq->nr_queued[READ])) is equivalent to condition 3.1 ((write_limit && sq->nr_queued[WRITE]) && !read_limit) || condition 3.2 ((write_limit && sq->nr_queued[WRITE]) && (read_limit && sq->nr_queued[READ])) Condition 3.2 is the same as condition 2.2, so all conditions we get to return are as following: (!read_limit && !write_limit) (1) (!read_limit && (write_limit && sq->nr_queued[WRITE])) (3.1) ((read_limit && sq->nr_queued[READ]) && !write_limit) (2.1) ((write_limit && sq->nr_queued[WRITE]) && (read_limit && sq->nr_queued[READ])) (2.2) As we can extract conditions "(a1 || a2) && (b1 || b2)" to: a1 && b1 a1 && b2 a2 && b1 ab && b2 Considering that: a1 = !read_limit a2 = read_limit && sq->nr_queued[READ] b1 = !write_limit b2 = write_limit && sq->nr_queued[WRITE] We can pack replaced conditions to (!read_limint || (read_limit && sq->nr_queued[READ])) && (!write_limit || (write_limit && sq->nr_queued[WRITE]) which is equivalent to (!read_limint || sq->nr_queued[READ]) && (!write_limit || sq->nr_queued[WRITE]) Signed-off-by: Kemeng Shi Reported-by: kernel test robot Acked-by: Tejun Heo --- block/blk-throttle.c | 31 ++++++++++++++++++------------- 1 file changed, 18 insertions(+), 13 deletions(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 06e4193b064e..4977fac56a00 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -1816,24 +1816,29 @@ static bool throtl_tg_is_idle(struct throtl_grp *tg) return ret; } -static bool throtl_tg_can_upgrade(struct throtl_grp *tg) +static bool throtl_tg_reach_low_limit(struct throtl_grp *tg, int rw) { struct throtl_service_queue *sq = &tg->service_queue; - bool read_limit, write_limit; + bool limit = tg->bps[rw][LIMIT_LOW] || tg->iops[rw][LIMIT_LOW]; /* - * if cgroup reaches low limit (if low limit is 0, the cgroup always - * reaches), it's ok to upgrade to next limit + * if low limit is zero, low limit is always reached. + * if low limit is non-zero, we can check if there is any request + * is queued to determine if low limit is reached as we throttle + * request according to limit. */ - read_limit = tg->bps[READ][LIMIT_LOW] || tg->iops[READ][LIMIT_LOW]; - write_limit = tg->bps[WRITE][LIMIT_LOW] || tg->iops[WRITE][LIMIT_LOW]; - if (!read_limit && !write_limit) - return true; - if (read_limit && sq->nr_queued[READ] && - (!write_limit || sq->nr_queued[WRITE])) - return true; - if (write_limit && sq->nr_queued[WRITE] && - (!read_limit || sq->nr_queued[READ])) + return !limit || sq->nr_queued[rw]; +} + +static bool throtl_tg_can_upgrade(struct throtl_grp *tg) +{ + /* + * cgroup reaches low limit when low limit of READ and WRITE are + * both reached, it's ok to upgrade to next limit if cgroup reaches + * low limit + */ + if (throtl_tg_reach_low_limit(tg, READ) && + throtl_tg_reach_low_limit(tg, WRITE)) return true; if (time_after_eq(jiffies, From patchwork Tue Nov 29 03:01:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kemeng Shi X-Patchwork-Id: 13058150 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9ED51C4708A for ; Tue, 29 Nov 2022 03:02:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235342AbiK2DCH (ORCPT ); Mon, 28 Nov 2022 22:02:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55952 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235306AbiK2DB4 (ORCPT ); Mon, 28 Nov 2022 22:01:56 -0500 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 126D43D92C; Mon, 28 Nov 2022 19:01:56 -0800 (PST) Received: from kwepemi500016.china.huawei.com (unknown [172.30.72.57]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4NLnCr6DsJzCqjd; Tue, 29 Nov 2022 11:01:12 +0800 (CST) Received: from huawei.com (10.174.178.129) by kwepemi500016.china.huawei.com (7.221.188.220) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Tue, 29 Nov 2022 11:01:53 +0800 From: Kemeng Shi To: , , CC: , , , Subject: [PATCH v2 06/10] blk-throttle: fix typo in comment of throtl_adjusted_limit Date: Tue, 29 Nov 2022 11:01:43 +0800 Message-ID: <20221129030147.27400-7-shikemeng@huawei.com> X-Mailer: git-send-email 2.14.1.windows.1 In-Reply-To: <20221129030147.27400-1-shikemeng@huawei.com> References: <20221129030147.27400-1-shikemeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.178.129] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemi500016.china.huawei.com (7.221.188.220) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org lapsed time -> elapsed time Signed-off-by: Kemeng Shi Acked-by: Tejun Heo --- block/blk-throttle.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 4977fac56a00..4b11099625da 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -129,7 +129,7 @@ static struct throtl_data *sq_to_td(struct throtl_service_queue *sq) /* * cgroup's limit in LIMIT_MAX is scaled if low limit is set. This scale is to * make the IO dispatch more smooth. - * Scale up: linearly scale up according to lapsed time since upgrade. For + * Scale up: linearly scale up according to elapsed time since upgrade. For * every throtl_slice, the limit scales up 1/2 .low limit till the * limit hits .max limit * Scale down: exponentially scale down if a cgroup doesn't hit its .low limit From patchwork Tue Nov 29 03:01:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kemeng Shi X-Patchwork-Id: 13058153 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 09B26C43217 for ; Tue, 29 Nov 2022 03:02:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235350AbiK2DCJ (ORCPT ); Mon, 28 Nov 2022 22:02:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55962 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235314AbiK2DB6 (ORCPT ); Mon, 28 Nov 2022 22:01:58 -0500 Received: from szxga03-in.huawei.com (szxga03-in.huawei.com [45.249.212.189]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2B10E45084; Mon, 28 Nov 2022 19:01:57 -0800 (PST) Received: from kwepemi500016.china.huawei.com (unknown [172.30.72.54]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4NLn8n0j6jzJp0N; Tue, 29 Nov 2022 10:58:33 +0800 (CST) Received: from huawei.com (10.174.178.129) by kwepemi500016.china.huawei.com (7.221.188.220) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Tue, 29 Nov 2022 11:01:54 +0800 From: Kemeng Shi To: , , CC: , , , Subject: [PATCH v2 07/10] blk-throttle: remove incorrect comment for tg_last_low_overflow_time Date: Tue, 29 Nov 2022 11:01:44 +0800 Message-ID: <20221129030147.27400-8-shikemeng@huawei.com> X-Mailer: git-send-email 2.14.1.windows.1 In-Reply-To: <20221129030147.27400-1-shikemeng@huawei.com> References: <20221129030147.27400-1-shikemeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.178.129] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemi500016.china.huawei.com (7.221.188.220) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Function tg_last_low_overflow_time is called with intermediate node as following: throtl_hierarchy_can_downgrade throtl_tg_can_downgrade tg_last_low_overflow_time throtl_hierarchy_can_upgrade throtl_tg_can_upgrade tg_last_low_overflow_time throtl_hierarchy_can_downgrade/throtl_hierarchy_can_upgrade will traverse from leaf node to sub-root node and pass traversed intermediate node to tg_last_low_overflow_time. No such limit could be found from context and implementation of tg_last_low_overflow_time, so remove this limit in comment. Signed-off-by: Kemeng Shi Acked-by: Tejun Heo --- block/blk-throttle.c | 1 - 1 file changed, 1 deletion(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 4b11099625da..3ddd8a15aa3f 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -1762,7 +1762,6 @@ static unsigned long __tg_last_low_overflow_time(struct throtl_grp *tg) return min(rtime, wtime); } -/* tg should not be an intermediate node */ static unsigned long tg_last_low_overflow_time(struct throtl_grp *tg) { struct throtl_service_queue *parent_sq; From patchwork Tue Nov 29 03:01:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kemeng Shi X-Patchwork-Id: 13058152 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9DB0AC46467 for ; Tue, 29 Nov 2022 03:02:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235336AbiK2DCI (ORCPT ); Mon, 28 Nov 2022 22:02:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55964 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235313AbiK2DB6 (ORCPT ); Mon, 28 Nov 2022 22:01:58 -0500 Received: from szxga08-in.huawei.com (szxga08-in.huawei.com [45.249.212.255]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6AB6C45097; Mon, 28 Nov 2022 19:01:57 -0800 (PST) Received: from kwepemi500016.china.huawei.com (unknown [172.30.72.57]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4NLnCx4dyFz15Mxq; Tue, 29 Nov 2022 11:01:17 +0800 (CST) Received: from huawei.com (10.174.178.129) by kwepemi500016.china.huawei.com (7.221.188.220) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Tue, 29 Nov 2022 11:01:55 +0800 From: Kemeng Shi To: , , CC: , , , Subject: [PATCH v2 08/10] blk-throttle: remove repeat check of elapsed time from last upgrade in throtl_hierarchy_can_downgrade Date: Tue, 29 Nov 2022 11:01:45 +0800 Message-ID: <20221129030147.27400-9-shikemeng@huawei.com> X-Mailer: git-send-email 2.14.1.windows.1 In-Reply-To: <20221129030147.27400-1-shikemeng@huawei.com> References: <20221129030147.27400-1-shikemeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.178.129] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemi500016.china.huawei.com (7.221.188.220) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org There is no need to check elapsed time from last upgrade for each node in hierarchy. Move this check before traversing as throtl_can_upgrade do to remove repeat check. Signed-off-by: Kemeng Shi Reported-by: kernel test robot Acked-by: Tejun Heo --- block/blk-throttle.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 3ddd8a15aa3f..94d850b57462 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -1955,8 +1955,7 @@ static bool throtl_tg_can_downgrade(struct throtl_grp *tg) * If cgroup is below low limit, consider downgrade and throttle other * cgroups */ - if (time_after_eq(now, td->low_upgrade_time + td->throtl_slice) && - time_after_eq(now, tg_last_low_overflow_time(tg) + + if (time_after_eq(now, tg_last_low_overflow_time(tg) + td->throtl_slice) && (!throtl_tg_is_idle(tg) || !list_empty(&tg_to_blkg(tg)->blkcg->css.children))) @@ -1966,6 +1965,11 @@ static bool throtl_tg_can_downgrade(struct throtl_grp *tg) static bool throtl_hierarchy_can_downgrade(struct throtl_grp *tg) { + struct throtl_data *td = tg->td; + + if (time_before(jiffies, td->low_upgrade_time + td->throtl_slice)) + return false; + while (true) { if (!throtl_tg_can_downgrade(tg)) return false; From patchwork Tue Nov 29 03:01:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kemeng Shi X-Patchwork-Id: 13058155 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E383BC4321E for ; Tue, 29 Nov 2022 03:02:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235353AbiK2DCK (ORCPT ); Mon, 28 Nov 2022 22:02:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55974 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235316AbiK2DB7 (ORCPT ); Mon, 28 Nov 2022 22:01:59 -0500 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2874E3D92C; Mon, 28 Nov 2022 19:01:58 -0800 (PST) Received: from kwepemi500016.china.huawei.com (unknown [172.30.72.53]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4NLnCy0SkqzRpWf; Tue, 29 Nov 2022 11:01:18 +0800 (CST) Received: from huawei.com (10.174.178.129) by kwepemi500016.china.huawei.com (7.221.188.220) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Tue, 29 Nov 2022 11:01:55 +0800 From: Kemeng Shi To: , , CC: , , , Subject: [PATCH v2 09/10] blk-throttle: Use more siutable time_after check for update of slice_start Date: Tue, 29 Nov 2022 11:01:46 +0800 Message-ID: <20221129030147.27400-10-shikemeng@huawei.com> X-Mailer: git-send-email 2.14.1.windows.1 In-Reply-To: <20221129030147.27400-1-shikemeng@huawei.com> References: <20221129030147.27400-1-shikemeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.178.129] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemi500016.china.huawei.com (7.221.188.220) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org There is no need to update tg->slice_start[rw] to start when they are equal already. So remove "eq" part of check before update slice_start. Signed-off-by: Kemeng Shi Acked-by: Tejun Heo --- block/blk-throttle.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 94d850b57462..4c80f2aa1e29 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -645,7 +645,7 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg, * that bandwidth. Do try to make use of that bandwidth while giving * credit. */ - if (time_after_eq(start, tg->slice_start[rw])) + if (time_after(start, tg->slice_start[rw])) tg->slice_start[rw] = start; tg->slice_end[rw] = jiffies + tg->td->throtl_slice; From patchwork Tue Nov 29 03:01:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kemeng Shi X-Patchwork-Id: 13058156 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BE472C4708B for ; Tue, 29 Nov 2022 03:02:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235356AbiK2DCK (ORCPT ); Mon, 28 Nov 2022 22:02:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55980 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235317AbiK2DB7 (ORCPT ); Mon, 28 Nov 2022 22:01:59 -0500 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CBBE445084; Mon, 28 Nov 2022 19:01:58 -0800 (PST) Received: from kwepemi500016.china.huawei.com (unknown [172.30.72.53]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4NLnCy4y5WzRpX3; Tue, 29 Nov 2022 11:01:18 +0800 (CST) Received: from huawei.com (10.174.178.129) by kwepemi500016.china.huawei.com (7.221.188.220) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Tue, 29 Nov 2022 11:01:56 +0800 From: Kemeng Shi To: , , CC: , , , Subject: [PATCH v2 10/10] blk-throttle: avoid dead code in throtl_hierarchy_can_upgrade Date: Tue, 29 Nov 2022 11:01:47 +0800 Message-ID: <20221129030147.27400-11-shikemeng@huawei.com> X-Mailer: git-send-email 2.14.1.windows.1 In-Reply-To: <20221129030147.27400-1-shikemeng@huawei.com> References: <20221129030147.27400-1-shikemeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.178.129] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemi500016.china.huawei.com (7.221.188.220) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Function throtl_hierarchy_can_upgrade will always return from while loop, so the return outside while loop is never reached. Break the loop when we traverse to root as throtl_hierarchy_can_downgrade do to avoid dead code. Signed-off-by: Kemeng Shi --- block/blk-throttle.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 4c80f2aa1e29..9946524de158 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -1854,7 +1854,7 @@ static bool throtl_hierarchy_can_upgrade(struct throtl_grp *tg) return true; tg = sq_to_tg(tg->service_queue.parent_sq); if (!tg || !tg_to_blkg(tg)->parent) - return false; + break; } return false; }