diff mbox series

blk-cgroup: don't clear stat in blkcg_reset_stats()

Message ID 20240627090856.2345018-1-lilingfeng@huaweicloud.com (mailing list archive)
State New
Headers show
Series blk-cgroup: don't clear stat in blkcg_reset_stats() | expand

Commit Message

Li Lingfeng June 27, 2024, 9:08 a.m. UTC
From: Li Lingfeng <lilingfeng3@huawei.com>

The list corruption described in commit 6da668063279 ("blk-cgroup: fix
list corruption from resetting io stat") has no effect. It's unnecessary
to fix it.

As for cgroup v1, it does not use iostat any more after commit
ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup v1"), so using
memset to clear iostat has no real effect.
As for cgroup v2, it will not call blkcg_reset_stats() to corrupt the
list.

The list of root cgroup can be used by both cgroup v1 and v2 while
non-root cgroup can't since it must be removed before switch between
cgroup v1 and v2.
So it may has effect if the list of root used by cgroup v2 was corrupted
after switching to cgroup v1, and switch back to cgroup v2 to use the
corrupted list again.
However, the root cgroup will not use the list any more after commit
ef45fe470e1e("blk-cgroup: show global disk stats in root cgroup io.stat").

Although this has no negative effect, it is not necessary. Remove the
related code.

Fixes: 6da668063279 ("blk-cgroup: fix list corruption from resetting io stat")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
---
 block/blk-cgroup.c | 24 ------------------------
 1 file changed, 24 deletions(-)

Comments

Tejun Heo June 27, 2024, 8:10 p.m. UTC | #1
Hello,

On Thu, Jun 27, 2024 at 05:08:56PM +0800, Li Lingfeng wrote:
> The list corruption described in commit 6da668063279 ("blk-cgroup: fix
> list corruption from resetting io stat") has no effect. It's unnecessary
> to fix it.

I find this paragraph a bit confusing. At the time, it was broken, right?
And if we were to memset() now, it'd break again.

> As for cgroup v1, it does not use iostat any more after commit
> ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup v1"), so using
> memset to clear iostat has no real effect.

Ah, okay, this is because we made the stats blk-throtl specific but didn't
implement ->pd_reset_stat_fn(), right?

> As for cgroup v2, it will not call blkcg_reset_stats() to corrupt the
> list.
> 
> The list of root cgroup can be used by both cgroup v1 and v2 while
> non-root cgroup can't since it must be removed before switch between
> cgroup v1 and v2.
> So it may has effect if the list of root used by cgroup v2 was corrupted
> after switching to cgroup v1, and switch back to cgroup v2 to use the
> corrupted list again.
> However, the root cgroup will not use the list any more after commit
> ef45fe470e1e("blk-cgroup: show global disk stats in root cgroup io.stat").

Hmm... I'm still having a bit of trouble following this line of argument
given that all the patch does is dropping stat clearing.

> @@ -668,7 +645,6 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css,
>  	 * anyway.  If you get hit by a race, retry.
>  	 */
>  	hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) {
> -		blkg_clear_stat(blkg);
>  		for (i = 0; i < BLKCG_MAX_POLS; i++) {
>  			struct blkcg_policy *pol = blkcg_policy[i];

The patch looks fine to me although it'd be nice to follow up with a patch
to implement ->pd_reset_stat_fn() for blk-throtl. I'm not quite following
the list corruption part of argument.

Thanks.
Waiman Long June 27, 2024, 9:03 p.m. UTC | #2
On 6/27/24 05:08, Li Lingfeng wrote:
> From: Li Lingfeng <lilingfeng3@huawei.com>
>
> The list corruption described in commit 6da668063279 ("blk-cgroup: fix
> list corruption from resetting io stat") has no effect. It's unnecessary
> to fix it.
>
> As for cgroup v1, it does not use iostat any more after commit
> ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup v1"), so using
> memset to clear iostat has no real effect.
> As for cgroup v2, it will not call blkcg_reset_stats() to corrupt the
> list.
>
> The list of root cgroup can be used by both cgroup v1 and v2 while
> non-root cgroup can't since it must be removed before switch between
> cgroup v1 and v2.
> So it may has effect if the list of root used by cgroup v2 was corrupted
> after switching to cgroup v1, and switch back to cgroup v2 to use the
> corrupted list again.
> However, the root cgroup will not use the list any more after commit
> ef45fe470e1e("blk-cgroup: show global disk stats in root cgroup io.stat").
>
> Although this has no negative effect, it is not necessary. Remove the
> related code.
You may be right that it may not be necessary in the mainline kernel, it 
does fix the issue on distros with older kernels or some stable releases 
where commit ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup 
v1") may not be present.

>
> Fixes: 6da668063279 ("blk-cgroup: fix list corruption from resetting io stat")
I don't think there should be a fixes tag or it will be backported to 
stable releases.
> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
> ---
>   block/blk-cgroup.c | 24 ------------------------
>   1 file changed, 24 deletions(-)
>
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index 37e6cc91d576..1113c398a742 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -629,29 +629,6 @@ static void blkg_iostat_set(struct blkg_iostat *dst, struct blkg_iostat *src)
>   	}
>   }
>   
> -static void __blkg_clear_stat(struct blkg_iostat_set *bis)
> -{
> -	struct blkg_iostat cur = {0};
> -	unsigned long flags;
> -
> -	flags = u64_stats_update_begin_irqsave(&bis->sync);
> -	blkg_iostat_set(&bis->cur, &cur);
> -	blkg_iostat_set(&bis->last, &cur);
> -	u64_stats_update_end_irqrestore(&bis->sync, flags);
> -}
> -
> -static void blkg_clear_stat(struct blkcg_gq *blkg)
> -{
> -	int cpu;
> -
> -	for_each_possible_cpu(cpu) {
> -		struct blkg_iostat_set *s = per_cpu_ptr(blkg->iostat_cpu, cpu);
> -
> -		__blkg_clear_stat(s);
> -	}
> -	__blkg_clear_stat(&blkg->iostat);
> -}
> -
>   static int blkcg_reset_stats(struct cgroup_subsys_state *css,
>   			     struct cftype *cftype, u64 val)
>   {
> @@ -668,7 +645,6 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css,
>   	 * anyway.  If you get hit by a race, retry.
>   	 */
>   	hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) {
> -		blkg_clear_stat(blkg);
>   		for (i = 0; i < BLKCG_MAX_POLS; i++) {
>   			struct blkcg_policy *pol = blkcg_policy[i];
>   

If you are saying that iostat is no longer used in cgroup v1, why not 
remove the blkcg_reset_stats() and its supporting functions and 
deprecate the v1 reset_stats control file. The file should still be 
there to avoid userspace regression, but it will be a nop.

Cheers,
Longman
Yu Kuai June 28, 2024, 3:14 a.m. UTC | #3
Hi,

在 2024/06/28 4:10, Tejun Heo 写道:
> Hello,
> 
> On Thu, Jun 27, 2024 at 05:08:56PM +0800, Li Lingfeng wrote:
>> The list corruption described in commit 6da668063279 ("blk-cgroup: fix
>> list corruption from resetting io stat") has no effect. It's unnecessary
>> to fix it.
> 
> I find this paragraph a bit confusing. At the time, it was broken, right?
> And if we were to memset() now, it'd break again.
> 
>> As for cgroup v1, it does not use iostat any more after commit
>> ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup v1"), so using
>> memset to clear iostat has no real effect.
> 
> Ah, okay, this is because we made the stats blk-throtl specific but didn't
> implement ->pd_reset_stat_fn(), right?

I'm afraid not... Implement pd_reset_stat_fn() or not is another
problem, this patch should be just code cleanup, not fixing any real
problems.

> 
>> As for cgroup v2, it will not call blkcg_reset_stats() to corrupt the
>> list.
>>
>> The list of root cgroup can be used by both cgroup v1 and v2 while
>> non-root cgroup can't since it must be removed before switch between
>> cgroup v1 and v2.
>> So it may has effect if the list of root used by cgroup v2 was corrupted
>> after switching to cgroup v1, and switch back to cgroup v2 to use the
>> corrupted list again.
>> However, the root cgroup will not use the list any more after commit
>> ef45fe470e1e("blk-cgroup: show global disk stats in root cgroup io.stat").
> 
> Hmm... I'm still having a bit of trouble following this line of argument
> given that all the patch does is dropping stat clearing.
> 
>> @@ -668,7 +645,6 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css,
>>   	 * anyway.  If you get hit by a race, retry.
>>   	 */
>>   	hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) {
>> -		blkg_clear_stat(blkg);
>>   		for (i = 0; i < BLKCG_MAX_POLS; i++) {
>>   			struct blkcg_policy *pol = blkcg_policy[i];
> 
> The patch looks fine to me although it'd be nice to follow up with a patch
> to implement ->pd_reset_stat_fn() for blk-throtl. I'm not quite following
> the list corruption part of argument.

The code deleted by this patch was claimed to fix a lsit corruption,
however, the list corruption does not exist now hence related code can
be removed:

1) Take a look at blk_cgroup_bio_start, now there are two conditions
before this blkg can be added to the per_cpu list:

blk_cgroup_bio_start
  if (!cgroup_subsys_on_dfl(io_cgrp_subsys))
  -> the list will always be empty for cgroup v1
   return;
  if (!cgroup_parent(blkcg->css.cgroup))
  -> the list will always be empty for root blkg
   return;
  ...
  llist_add()

2) blkcg_reset_stats can only be called from cgroup v1 api, hence
there is nothing to be cleared for blkg_clear_stat();

3) Noted that user can switch from cgroup v2 to v1, however, we found
that user must delete all the child cg to do that, hence only root blkg
can be kept after switching to v1. And root blkg is bypassed from
blk_cgroup_bio_start(), hence no problem.

Thanks,
Kuai
> 
> Thanks.
>
Li Lingfeng June 28, 2024, 3:17 a.m. UTC | #4
在 2024/6/28 5:03, Waiman Long 写道:
>
> On 6/27/24 05:08, Li Lingfeng wrote:
>> From: Li Lingfeng <lilingfeng3@huawei.com>
>>
>> The list corruption described in commit 6da668063279 ("blk-cgroup: fix
>> list corruption from resetting io stat") has no effect. It's unnecessary
>> to fix it.
>>
>> As for cgroup v1, it does not use iostat any more after commit
>> ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup v1"), so using
>> memset to clear iostat has no real effect.
>> As for cgroup v2, it will not call blkcg_reset_stats() to corrupt the
>> list.
>>
>> The list of root cgroup can be used by both cgroup v1 and v2 while
>> non-root cgroup can't since it must be removed before switch between
>> cgroup v1 and v2.
>> So it may has effect if the list of root used by cgroup v2 was corrupted
>> after switching to cgroup v1, and switch back to cgroup v2 to use the
>> corrupted list again.
>> However, the root cgroup will not use the list any more after commit
>> ef45fe470e1e("blk-cgroup: show global disk stats in root cgroup 
>> io.stat").
>>
>> Although this has no negative effect, it is not necessary. Remove the
>> related code.
> You may be right that it may not be necessary in the mainline kernel, 
> it does fix the issue on distros with older kernels or some stable 
> releases where commit ad7c3b41e86b("blk-throttle: Fix io statistics 
> for cgroup v1") may not be present.
>
>>
>> Fixes: 6da668063279 ("blk-cgroup: fix list corruption from resetting 
>> io stat")
> I don't think there should be a fixes tag or it will be backported to 
> stable releases.
OK, I will remove it.
>> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
>> ---
>>   block/blk-cgroup.c | 24 ------------------------
>>   1 file changed, 24 deletions(-)
>>
>> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
>> index 37e6cc91d576..1113c398a742 100644
>> --- a/block/blk-cgroup.c
>> +++ b/block/blk-cgroup.c
>> @@ -629,29 +629,6 @@ static void blkg_iostat_set(struct blkg_iostat 
>> *dst, struct blkg_iostat *src)
>>       }
>>   }
>>   -static void __blkg_clear_stat(struct blkg_iostat_set *bis)
>> -{
>> -    struct blkg_iostat cur = {0};
>> -    unsigned long flags;
>> -
>> -    flags = u64_stats_update_begin_irqsave(&bis->sync);
>> -    blkg_iostat_set(&bis->cur, &cur);
>> -    blkg_iostat_set(&bis->last, &cur);
>> -    u64_stats_update_end_irqrestore(&bis->sync, flags);
>> -}
>> -
>> -static void blkg_clear_stat(struct blkcg_gq *blkg)
>> -{
>> -    int cpu;
>> -
>> -    for_each_possible_cpu(cpu) {
>> -        struct blkg_iostat_set *s = per_cpu_ptr(blkg->iostat_cpu, cpu);
>> -
>> -        __blkg_clear_stat(s);
>> -    }
>> -    __blkg_clear_stat(&blkg->iostat);
>> -}
>> -
>>   static int blkcg_reset_stats(struct cgroup_subsys_state *css,
>>                    struct cftype *cftype, u64 val)
>>   {
>> @@ -668,7 +645,6 @@ static int blkcg_reset_stats(struct 
>> cgroup_subsys_state *css,
>>        * anyway.  If you get hit by a race, retry.
>>        */
>>       hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) {
>> -        blkg_clear_stat(blkg);
>>           for (i = 0; i < BLKCG_MAX_POLS; i++) {
>>               struct blkcg_policy *pol = blkcg_policy[i];
>
> If you are saying that iostat is no longer used in cgroup v1, why not 
> remove the blkcg_reset_stats() and its supporting functions and 
> deprecate the v1 reset_stats control file. The file should still be 
> there to avoid userspace regression, but it will be a nop.
I'm not sure if we can just remove blkcg_reset_stats() since
bfq_pd_reset_stats() may be called in it.

Thanks.
>
> Cheers,
> Longman
Yu Kuai June 28, 2024, 3:22 a.m. UTC | #5
Hi,

+CC Jan and Paolo for bfq part.

在 2024/06/28 5:03, Waiman Long 写道:
> 
> On 6/27/24 05:08, Li Lingfeng wrote:
>> From: Li Lingfeng <lilingfeng3@huawei.com>
>>
>> The list corruption described in commit 6da668063279 ("blk-cgroup: fix
>> list corruption from resetting io stat") has no effect. It's unnecessary
>> to fix it.
>>
>> As for cgroup v1, it does not use iostat any more after commit
>> ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup v1"), so using
>> memset to clear iostat has no real effect.
>> As for cgroup v2, it will not call blkcg_reset_stats() to corrupt the
>> list.
>>
>> The list of root cgroup can be used by both cgroup v1 and v2 while
>> non-root cgroup can't since it must be removed before switch between
>> cgroup v1 and v2.
>> So it may has effect if the list of root used by cgroup v2 was corrupted
>> after switching to cgroup v1, and switch back to cgroup v2 to use the
>> corrupted list again.
>> However, the root cgroup will not use the list any more after commit
>> ef45fe470e1e("blk-cgroup: show global disk stats in root cgroup 
>> io.stat").
>>
>> Although this has no negative effect, it is not necessary. Remove the
>> related code.
> You may be right that it may not be necessary in the mainline kernel, it 
> does fix the issue on distros with older kernels or some stable releases 
> where commit ad7c3b41e86b("blk-throttle: Fix io statistics for cgroup 
> v1") may not be present.
> 
>>
>> Fixes: 6da668063279 ("blk-cgroup: fix list corruption from resetting 
>> io stat")
> I don't think there should be a fixes tag or it will be backported to 
> stable releases.

Yes, and probably it's better to add:
Depends-on: ad7c3b41e86b ("blk-throttle: Fix io statistics for cgroup v1")
Depends-on: 0416f3be58c6 ("blk-cgroup: don't update io stat for root 
cgroup")

>> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
>> ---
>>   block/blk-cgroup.c | 24 ------------------------
>>   1 file changed, 24 deletions(-)
>>
>> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
>> index 37e6cc91d576..1113c398a742 100644
>> --- a/block/blk-cgroup.c
>> +++ b/block/blk-cgroup.c
>> @@ -629,29 +629,6 @@ static void blkg_iostat_set(struct blkg_iostat 
>> *dst, struct blkg_iostat *src)
>>       }
>>   }
>> -static void __blkg_clear_stat(struct blkg_iostat_set *bis)
>> -{
>> -    struct blkg_iostat cur = {0};
>> -    unsigned long flags;
>> -
>> -    flags = u64_stats_update_begin_irqsave(&bis->sync);
>> -    blkg_iostat_set(&bis->cur, &cur);
>> -    blkg_iostat_set(&bis->last, &cur);
>> -    u64_stats_update_end_irqrestore(&bis->sync, flags);
>> -}
>> -
>> -static void blkg_clear_stat(struct blkcg_gq *blkg)
>> -{
>> -    int cpu;
>> -
>> -    for_each_possible_cpu(cpu) {
>> -        struct blkg_iostat_set *s = per_cpu_ptr(blkg->iostat_cpu, cpu);
>> -
>> -        __blkg_clear_stat(s);
>> -    }
>> -    __blkg_clear_stat(&blkg->iostat);
>> -}
>> -
>>   static int blkcg_reset_stats(struct cgroup_subsys_state *css,
>>                    struct cftype *cftype, u64 val)
>>   {
>> @@ -668,7 +645,6 @@ static int blkcg_reset_stats(struct 
>> cgroup_subsys_state *css,
>>        * anyway.  If you get hit by a race, retry.
>>        */
>>       hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) {
>> -        blkg_clear_stat(blkg);
>>           for (i = 0; i < BLKCG_MAX_POLS; i++) {
>>               struct blkcg_policy *pol = blkcg_policy[i];
> 
> If you are saying that iostat is no longer used in cgroup v1, why not 
> remove the blkcg_reset_stats() and its supporting functions and 
> deprecate the v1 reset_stats control file. The file should still be 
> there to avoid userspace regression, but it will be a nop.

No, this is not correct, blkg->iostat is not used in cgroup v1, however,
bfq(and probably blk-throtl) has it's only stats, and they rely on
pd_reset_stats_fn() from blkcg_reset_stats() to clear the stats. I don't
think remove it is a good idea for now, bfq maintainer must agree with
this.

Thanks,
Kuai
> 
> Cheers,
> Longman
> 
> 
> .
>
diff mbox series

Patch

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 37e6cc91d576..1113c398a742 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -629,29 +629,6 @@  static void blkg_iostat_set(struct blkg_iostat *dst, struct blkg_iostat *src)
 	}
 }
 
-static void __blkg_clear_stat(struct blkg_iostat_set *bis)
-{
-	struct blkg_iostat cur = {0};
-	unsigned long flags;
-
-	flags = u64_stats_update_begin_irqsave(&bis->sync);
-	blkg_iostat_set(&bis->cur, &cur);
-	blkg_iostat_set(&bis->last, &cur);
-	u64_stats_update_end_irqrestore(&bis->sync, flags);
-}
-
-static void blkg_clear_stat(struct blkcg_gq *blkg)
-{
-	int cpu;
-
-	for_each_possible_cpu(cpu) {
-		struct blkg_iostat_set *s = per_cpu_ptr(blkg->iostat_cpu, cpu);
-
-		__blkg_clear_stat(s);
-	}
-	__blkg_clear_stat(&blkg->iostat);
-}
-
 static int blkcg_reset_stats(struct cgroup_subsys_state *css,
 			     struct cftype *cftype, u64 val)
 {
@@ -668,7 +645,6 @@  static int blkcg_reset_stats(struct cgroup_subsys_state *css,
 	 * anyway.  If you get hit by a race, retry.
 	 */
 	hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) {
-		blkg_clear_stat(blkg);
 		for (i = 0; i < BLKCG_MAX_POLS; i++) {
 			struct blkcg_policy *pol = blkcg_policy[i];