[v3] block: flush all throttled bios when deleting the cgroup

Message ID 20240817071108.1919729-1-lilingfeng@huaweicloud.com (mailing list archive)
State New, archived

Commit Message

Li Lingfeng Aug. 17, 2024, 7:11 a.m. UTC
From: Li Lingfeng <lilingfeng3@huawei.com>

When a process migrates to another cgroup and the original cgroup is deleted,
the restrictions of throttled bios cannot be removed. If the restrictions
are set too low, it will take a long time to complete these bios.

Refer to the process of deleting a disk to remove the restrictions and
issue bios when deleting the cgroup.

This makes difference on the behavior of throttled bios:
Before: the limit of the throttled bios can't be changed and the bios will
complete under this limit;
Now: the limit will be canceled and the throttled bios will be flushed
immediately.

References:
[1] https://lore.kernel.org/r/20220318130144.1066064-4-ming.lei@redhat.com
[2] https://lore.kernel.org/all/da861d63-58c6-3ca0-2535-9089993e9e28@huaweicloud.com/

Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
---
  v2->v3:
    Change "tg_cancel_bios" to "tg_flush_bios";
    Add reference of v2 to describe the background.
 block/blk-throttle.c | 68 ++++++++++++++++++++++++++++----------------
 1 file changed, 44 insertions(+), 24 deletions(-)
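
To put the "it will take a long time" claim from the commit message in numbers, here is a minimal arithmetic sketch. It is not part of the patch: `drain_seconds` and the figures in the note below are hypothetical, and the kernel's real accounting works in time slices rather than a single division.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Seconds needed to drain `queued_bytes` of already-throttled IO at a
 * fixed bytes-per-second limit, rounded up.
 * Illustrative arithmetic only; blk-throttle's slice-based accounting
 * is more involved than this.
 */
uint64_t drain_seconds(uint64_t queued_bytes, uint64_t bps_limit)
{
	assert(bps_limit > 0);
	return (queued_bytes + bps_limit - 1) / bps_limit;
}
```

For example, 64 MiB queued behind an assumed 1 KiB/s limit gives `drain_seconds(64ULL << 20, 1024)`, which is 65536 seconds, or roughly 18 hours during which the offlined group stays around. With the patch, those bios are dispatched without waiting on the deleted group's limit.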

Comments

Tejun Heo Aug. 19, 2024, 9:24 p.m. UTC | #1
Hello,

On Sat, Aug 17, 2024 at 03:11:08PM +0800, Li Lingfeng wrote:
> From: Li Lingfeng <lilingfeng3@huawei.com>
> 
> When a process migrates to another cgroup and the original cgroup is deleted,
> the restrictions of throttled bios cannot be removed. If the restrictions
> are set too low, it will take a long time to complete these bios.
> 
> Refer to the process of deleting a disk to remove the restrictions and
> issue bios when deleting the cgroup.
> 
> This makes difference on the behavior of throttled bios:
> Before: the limit of the throttled bios can't be changed and the bios will
> complete under this limit;
> Now: the limit will be canceled and the throttled bios will be flushed
> immediately.

I still don't see why this behavior is better. Wouldn't this make it easy to
escape IO limits by creating cgroups, doing a bunch of IOs and then deleting
them?

Thanks.

Li Lingfeng Aug. 20, 2024, 7:13 a.m. UTC | #2
On 2024/8/20 5:24, Tejun Heo wrote:
> Hello,
>
> On Sat, Aug 17, 2024 at 03:11:08PM +0800, Li Lingfeng wrote:
>> From: Li Lingfeng <lilingfeng3@huawei.com>
>>
>> When a process migrates to another cgroup and the original cgroup is deleted,
>> the restrictions of throttled bios cannot be removed. If the restrictions
>> are set too low, it will take a long time to complete these bios.
>>
>> Refer to the process of deleting a disk to remove the restrictions and
>> issue bios when deleting the cgroup.
>>
>> This makes difference on the behavior of throttled bios:
>> Before: the limit of the throttled bios can't be changed and the bios will
>> complete under this limit;
>> Now: the limit will be canceled and the throttled bios will be flushed
>> immediately.
> I still don't see why this behavior is better. Wouldn't this make it easy to
> escape IO limits by creating cgroups, doing a bunch of IOs and then deleting
> them?
>
> Thanks.
Yes, this actually would make it easy to escape IO limits.

As described by Yu Kuai in v2, I changed this to prevent IO hang.
And I think it may be more appropriate to remove the limits in this
scenario since the limits were set by cgroup and the cgroup has been
deleted.

Thanks.

Michal Koutný Aug. 20, 2024, 10 a.m. UTC | #3
On Mon, Aug 19, 2024 at 11:24:18AM GMT, Tejun Heo <tj@kernel.org> wrote:
> I still don't see why this behavior is better. Wouldn't this make it easy to
> escape IO limits by creating cgroups, doing a bunch of IOs and then deleting
> them?

IIUC, bios are flushed to the parent throttle group, so if there's an
ancestral limit, it should be honored. (I find this similar to memcg
reparenting.)

Mere create + set limit + delete falls under the same delegation scope,
so if that limit is bypassed, it is only shooting oneself in the foot.
Shortening the lifetime of offlined structures is beneficial, no?

Michal
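
The reading in comment #3 can be sketched as a small userspace model. It is not kernel code: `struct model_grp`, `NO_LIMIT` and `limit_after_flush` are hypothetical names, and the model assumes, as the comment does, that bios flushed out of a deleted group are still subject to every ancestor's limit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_LIMIT UINT64_MAX

/* Hypothetical stand-in for a throttle group; not struct throtl_grp. */
struct model_grp {
	struct model_grp *parent;	/* NULL for the root group */
	uint64_t bps_limit;		/* NO_LIMIT when no limit is set */
};

/*
 * Tightest limit that still applies to bios flushed out of a deleted
 * group: the deleted group's own limit is ignored, while every
 * ancestor's limit is kept.
 */
uint64_t limit_after_flush(const struct model_grp *deleted)
{
	uint64_t tightest = NO_LIMIT;
	const struct model_grp *g;

	assert(deleted != NULL);
	for (g = deleted->parent; g != NULL; g = g->parent)
		if (g->bps_limit < tightest)
			tightest = g->bps_limit;
	return tightest;
}
```

In this model, deleting a leaf limited to 1 KiB/s under a parent limited to 1 MiB/s leaves the flushed bios bounded by 1 MiB/s, so only a limit set inside the same delegation scope is bypassed.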

Tejun Heo Aug. 20, 2024, 5:06 p.m. UTC | #4
On Sat, Aug 17, 2024 at 03:11:08PM +0800, Li Lingfeng wrote:
> From: Li Lingfeng <lilingfeng3@huawei.com>
> 
> When a process migrates to another cgroup and the original cgroup is deleted,
> the restrictions of throttled bios cannot be removed. If the restrictions
> are set too low, it will take a long time to complete these bios.
> 
> Refer to the process of deleting a disk to remove the restrictions and
> issue bios when deleting the cgroup.
> 
> This makes difference on the behavior of throttled bios:
> Before: the limit of the throttled bios can't be changed and the bios will
> complete under this limit;
> Now: the limit will be canceled and the throttled bios will be flushed
> immediately.
> 
> References:
> [1] https://lore.kernel.org/r/20220318130144.1066064-4-ming.lei@redhat.com
> [2] https://lore.kernel.org/all/da861d63-58c6-3ca0-2535-9089993e9e28@huaweicloud.com/
> 
> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

Li Lingfeng Oct. 22, 2024, 2:43 a.m. UTC | #5
Friendly ping ...

Thanks

On 2024/8/17 15:11, Li Lingfeng wrote:
> From: Li Lingfeng <lilingfeng3@huawei.com>
>
> When a process migrates to another cgroup and the original cgroup is deleted,
> the restrictions of throttled bios cannot be removed. If the restrictions
> are set too low, it will take a long time to complete these bios.
>
> Refer to the process of deleting a disk to remove the restrictions and
> issue bios when deleting the cgroup.
>
> This makes difference on the behavior of throttled bios:
> Before: the limit of the throttled bios can't be changed and the bios will
> complete under this limit;
> Now: the limit will be canceled and the throttled bios will be flushed
> immediately.
>
> References:
> [1] https://lore.kernel.org/r/20220318130144.1066064-4-ming.lei@redhat.com
> [2] https://lore.kernel.org/all/da861d63-58c6-3ca0-2535-9089993e9e28@huaweicloud.com/
>
> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
> ---
>    v2->v3:
>      Change "tg_cancel_bios" to "tg_flush_bios";
>      Add reference of v2 to describe the background.
>   block/blk-throttle.c | 68 ++++++++++++++++++++++++++++----------------
>   1 file changed, 44 insertions(+), 24 deletions(-)
>
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 6943ec720f39..cf7f4912c57a 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -1526,6 +1526,42 @@ static void throtl_shutdown_wq(struct request_queue *q)
>   	cancel_work_sync(&td->dispatch_work);
>   }
>   
> +static void tg_flush_bios(struct throtl_grp *tg)
> +{
> +	struct throtl_service_queue *sq = &tg->service_queue;
> +
> +	if (tg->flags & THROTL_TG_CANCELING)
> +		return;
> +	/*
> +	 * Set the flag to make sure throtl_pending_timer_fn() won't
> +	 * stop until all throttled bios are dispatched.
> +	 */
> +	tg->flags |= THROTL_TG_CANCELING;
> +
> +	/*
> +	 * Do not dispatch cgroup without THROTL_TG_PENDING or cgroup
> +	 * will be inserted to service queue without THROTL_TG_PENDING
> +	 * set in tg_update_disptime below. Then IO dispatched from
> +	 * child in tg_dispatch_one_bio will trigger double insertion
> +	 * and corrupt the tree.
> +	 */
> +	if (!(tg->flags & THROTL_TG_PENDING))
> +		return;
> +
> +	/*
> +	 * Update disptime after setting the above flag to make sure
> +	 * throtl_select_dispatch() won't exit without dispatching.
> +	 */
> +	tg_update_disptime(tg);
> +
> +	throtl_schedule_pending_timer(sq, jiffies + 1);
> +}
> +
> +static void throtl_pd_offline(struct blkg_policy_data *pd)
> +{
> +	tg_flush_bios(pd_to_tg(pd));
> +}
> +
>   struct blkcg_policy blkcg_policy_throtl = {
>   	.dfl_cftypes		= throtl_files,
>   	.legacy_cftypes		= throtl_legacy_files,
> @@ -1533,6 +1569,7 @@ struct blkcg_policy blkcg_policy_throtl = {
>   	.pd_alloc_fn		= throtl_pd_alloc,
>   	.pd_init_fn		= throtl_pd_init,
>   	.pd_online_fn		= throtl_pd_online,
> +	.pd_offline_fn		= throtl_pd_offline,
>   	.pd_free_fn		= throtl_pd_free,
>   };
>   
> @@ -1553,32 +1590,15 @@ void blk_throtl_cancel_bios(struct gendisk *disk)
>   	 */
>   	rcu_read_lock();
>   	blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg) {
> -		struct throtl_grp *tg = blkg_to_tg(blkg);
> -		struct throtl_service_queue *sq = &tg->service_queue;
> -
> -		/*
> -		 * Set the flag to make sure throtl_pending_timer_fn() won't
> -		 * stop until all throttled bios are dispatched.
> -		 */
> -		tg->flags |= THROTL_TG_CANCELING;
> -
>   		/*
> -		 * Do not dispatch cgroup without THROTL_TG_PENDING or cgroup
> -		 * will be inserted to service queue without THROTL_TG_PENDING
> -		 * set in tg_update_disptime below. Then IO dispatched from
> -		 * child in tg_dispatch_one_bio will trigger double insertion
> -		 * and corrupt the tree.
> +		 * disk_release will call pd_offline_fn to cancel bios.
> +		 * However, disk_release can't be called if someone get
> +		 * the refcount of device and issued bios which are
> +		 * inflight after del_gendisk.
> +		 * Cancel bios here to ensure no bios are inflight after
> +		 * del_gendisk.
>   		 */
> -		if (!(tg->flags & THROTL_TG_PENDING))
> -			continue;
> -
> -		/*
> -		 * Update disptime after setting the above flag to make sure
> -		 * throtl_select_dispatch() won't exit without dispatching.
> -		 */
> -		tg_update_disptime(tg);
> -
> -		throtl_schedule_pending_timer(sq, jiffies + 1);
> +		tg_flush_bios(blkg_to_tg(blkg));
>   	}
>   	rcu_read_unlock();
>   	spin_unlock_irq(&q->queue_lock);

Jens Axboe Oct. 22, 2024, 2:47 p.m. UTC | #6
On Sat, 17 Aug 2024 15:11:08 +0800, Li Lingfeng wrote:
> When a process migrates to another cgroup and the original cgroup is deleted,
> the restrictions of throttled bios cannot be removed. If the restrictions
> are set too low, it will take a long time to complete these bios.
> 
> Refer to the process of deleting a disk to remove the restrictions and
> issue bios when deleting the cgroup.
> 
> [...]

Applied, thanks!

[1/1] block: flush all throttled bios when deleting the cgroup
      (no commit info)

Best regards,

Patch

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 6943ec720f39..cf7f4912c57a 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1526,6 +1526,42 @@  static void throtl_shutdown_wq(struct request_queue *q)
 	cancel_work_sync(&td->dispatch_work);
 }
 
+static void tg_flush_bios(struct throtl_grp *tg)
+{
+	struct throtl_service_queue *sq = &tg->service_queue;
+
+	if (tg->flags & THROTL_TG_CANCELING)
+		return;
+	/*
+	 * Set the flag to make sure throtl_pending_timer_fn() won't
+	 * stop until all throttled bios are dispatched.
+	 */
+	tg->flags |= THROTL_TG_CANCELING;
+
+	/*
+	 * Do not dispatch cgroup without THROTL_TG_PENDING or cgroup
+	 * will be inserted to service queue without THROTL_TG_PENDING
+	 * set in tg_update_disptime below. Then IO dispatched from
+	 * child in tg_dispatch_one_bio will trigger double insertion
+	 * and corrupt the tree.
+	 */
+	if (!(tg->flags & THROTL_TG_PENDING))
+		return;
+
+	/*
+	 * Update disptime after setting the above flag to make sure
+	 * throtl_select_dispatch() won't exit without dispatching.
+	 */
+	tg_update_disptime(tg);
+
+	throtl_schedule_pending_timer(sq, jiffies + 1);
+}
+
+static void throtl_pd_offline(struct blkg_policy_data *pd)
+{
+	tg_flush_bios(pd_to_tg(pd));
+}
+
 struct blkcg_policy blkcg_policy_throtl = {
 	.dfl_cftypes		= throtl_files,
 	.legacy_cftypes		= throtl_legacy_files,
@@ -1533,6 +1569,7 @@  struct blkcg_policy blkcg_policy_throtl = {
 	.pd_alloc_fn		= throtl_pd_alloc,
 	.pd_init_fn		= throtl_pd_init,
 	.pd_online_fn		= throtl_pd_online,
+	.pd_offline_fn		= throtl_pd_offline,
 	.pd_free_fn		= throtl_pd_free,
 };
 
@@ -1553,32 +1590,15 @@  void blk_throtl_cancel_bios(struct gendisk *disk)
 	 */
 	rcu_read_lock();
 	blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg) {
-		struct throtl_grp *tg = blkg_to_tg(blkg);
-		struct throtl_service_queue *sq = &tg->service_queue;
-
-		/*
-		 * Set the flag to make sure throtl_pending_timer_fn() won't
-		 * stop until all throttled bios are dispatched.
-		 */
-		tg->flags |= THROTL_TG_CANCELING;
-
 		/*
-		 * Do not dispatch cgroup without THROTL_TG_PENDING or cgroup
-		 * will be inserted to service queue without THROTL_TG_PENDING
-		 * set in tg_update_disptime below. Then IO dispatched from
-		 * child in tg_dispatch_one_bio will trigger double insertion
-		 * and corrupt the tree.
+		 * disk_release will call pd_offline_fn to cancel bios.
+		 * However, disk_release can't be called if someone get
+		 * the refcount of device and issued bios which are
+		 * inflight after del_gendisk.
+		 * Cancel bios here to ensure no bios are inflight after
+		 * del_gendisk.
 		 */
-		if (!(tg->flags & THROTL_TG_PENDING))
-			continue;
-
-		/*
-		 * Update disptime after setting the above flag to make sure
-		 * throtl_select_dispatch() won't exit without dispatching.
-		 */
-		tg_update_disptime(tg);
-
-		throtl_schedule_pending_timer(sq, jiffies + 1);
+		tg_flush_bios(blkg_to_tg(blkg));
 	}
 	rcu_read_unlock();
 	spin_unlock_irq(&q->queue_lock);