Message ID | 20230328142309.73413-1-hanjinke.666@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Series | blk-throttle: Fix io statistics for cgroup v1 |
On Tue, Mar 28, 2023 at 10:23:09PM +0800, Jinke Han wrote:
> From: Jinke Han <hanjinke.666@bytedance.com>
>
> Now the io statistics of cgroup v1 are no longer accurate. Although
> in the long run it's best that rstat is a good implementation of
> cgroup v1 io statistics. But before that, we'd better fix this issue.

Can you please expand on how the stats are wrong on v1 and how the patch
fixes it?

Thanks.
On 2023/3/30 at 2:54 AM, Tejun Heo wrote:
> On Tue, Mar 28, 2023 at 10:23:09PM +0800, Jinke Han wrote:
>> From: Jinke Han <hanjinke.666@bytedance.com>
>>
>> Now the io statistics of cgroup v1 are no longer accurate. Although
>> in the long run it's best that rstat is a good implementation of
>> cgroup v1 io statistics. But before that, we'd better fix this issue.
>
> Can you please expand on how the stats are wrong on v1 and how the patch
> fixes it?
>
> Thanks.
>

Now blkio.throttle.io_service_bytes and blkio.throttle.io_serviced become the
only stable io stats interface of cgroup v1, and these statistics are done
in the blk-throttle code. But the current code only counts the bios that are
actually throttled. When the user does not add the throttle limit, the io
stats for cgroup v1 has nothing. I fix it according to the statistical
method of v2, and made it count all ios accurately.

Thanks.
Hello,

On Thu, Mar 30, 2023 at 11:44:04AM +0800, hanjinke wrote:
> On 2023/3/30 at 2:54 AM, Tejun Heo wrote:
> > On Tue, Mar 28, 2023 at 10:23:09PM +0800, Jinke Han wrote:
> > > From: Jinke Han <hanjinke.666@bytedance.com>
> > >
> > > Now the io statistics of cgroup v1 are no longer accurate. Although
> > > in the long run it's best that rstat is a good implementation of
> > > cgroup v1 io statistics. But before that, we'd better fix this issue.
> >
> > Can you please expand on how the stats are wrong on v1 and how the patch
> > fixes it?
> >
> > Thanks.
> >
> Now blkio.throttle.io_service_bytes and blkio.throttle.io_serviced become the

"now" might be a bit too vague. Can you point to the commit which made the
change?

> only stable io stats interface of cgroup v1, and these statistics are done
> in the blk-throttle code. But the current code only counts the bios that are

Ah, okay, so the stats are now updated by blk-throtl itself but

> actually throttled. When the user does not add the throttle limit, the io
> stats for cgroup v1 has nothing. I fix it according to the statistical
> method of v2, and made it count all ios accurately.

updated only when limits are configured which can be confusing. Makes sense
to me. Can you please update the patch description accordingly?

Also, the following change:

  @@ -2033,6 +2033,9 @@ void blk_cgroup_bio_start(struct bio *bio)
  	struct blkg_iostat_set *bis;
  	unsigned long flags;

  +	if (!cgroup_subsys_on_dfl(io_cgrp_subsys))
  +		return;
  +
  	/* Root-level stats are sourced from system-wide IO stats */
  	if (!cgroup_parent(blkcg->css.cgroup))
  		return;

seems incomplete as there's an additional
cgroup_subsys_on_dfl(io_cgrp_subsys) test in the function. We probably wanna
remove that?

Thanks.
On 2023/3/31 at 9:44 AM, Tejun Heo wrote:
> Hello,
>
> On Thu, Mar 30, 2023 at 11:44:04AM +0800, hanjinke wrote:
>> On 2023/3/30 at 2:54 AM, Tejun Heo wrote:
>>> On Tue, Mar 28, 2023 at 10:23:09PM +0800, Jinke Han wrote:
>>>> From: Jinke Han <hanjinke.666@bytedance.com>
>>>>
>>>> Now the io statistics of cgroup v1 are no longer accurate. Although
>>>> in the long run it's best that rstat is a good implementation of
>>>> cgroup v1 io statistics. But before that, we'd better fix this issue.
>>>
>>> Can you please expand on how the stats are wrong on v1 and how the patch
>>> fixes it?
>>>
>>> Thanks.
>>>
>> Now blkio.throttle.io_service_bytes and blkio.throttle.io_serviced become the
>
> "now" might be a bit too vague. Can you point to the commit which made the
> change?
>
>> only stable io stats interface of cgroup v1, and these statistics are done
>> in the blk-throttle code. But the current code only counts the bios that are
>
> Ah, okay, so the stats are now updated by blk-throtl itself but
>
>> actually throttled. When the user does not add the throttle limit, the io
>> stats for cgroup v1 has nothing. I fix it according to the statistical
>> method of v2, and made it count all ios accurately.
>
> updated only when limits are configured which can be confusing. Makes sense
> to me. Can you please update the patch description accordingly?
>
> Also, the following change:
>
> @@ -2033,6 +2033,9 @@ void blk_cgroup_bio_start(struct bio *bio)
> 	struct blkg_iostat_set *bis;
> 	unsigned long flags;
>
> +	if (!cgroup_subsys_on_dfl(io_cgrp_subsys))
> +		return;
> +
> 	/* Root-level stats are sourced from system-wide IO stats */
> 	if (!cgroup_parent(blkcg->css.cgroup))
> 		return;
>
> seems incomplete as there's an additional
> cgroup_subsys_on_dfl(io_cgrp_subsys) test in the function. We probably wanna
> remove that?
>
> Thanks.
>

Okay, I will send a v2 according to your suggestion.

Thanks
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index bd50b55bdb61..677e4473e45e 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -2033,6 +2033,9 @@ void blk_cgroup_bio_start(struct bio *bio)
 	struct blkg_iostat_set *bis;
 	unsigned long flags;

+	if (!cgroup_subsys_on_dfl(io_cgrp_subsys))
+		return;
+
 	/* Root-level stats are sourced from system-wide IO stats */
 	if (!cgroup_parent(blkcg->css.cgroup))
 		return;
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 47e9d8be68f3..2be66e9430f7 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -2174,12 +2174,6 @@ bool __blk_throtl_bio(struct bio *bio)

 	rcu_read_lock();

-	if (!cgroup_subsys_on_dfl(io_cgrp_subsys)) {
-		blkg_rwstat_add(&tg->stat_bytes, bio->bi_opf,
-				bio->bi_iter.bi_size);
-		blkg_rwstat_add(&tg->stat_ios, bio->bi_opf, 1);
-	}
-
 	spin_lock_irq(&q->queue_lock);

 	throtl_update_latency_buckets(td);
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index ef4b7a4de987..d1ccbfe9f797 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -185,6 +185,15 @@ static inline bool blk_should_throtl(struct bio *bio)
 	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
 	int rw = bio_data_dir(bio);

+	if (!cgroup_subsys_on_dfl(io_cgrp_subsys)) {
+		if (!bio_flagged(bio, BIO_CGROUP_ACCT)) {
+			bio_set_flag(bio, BIO_CGROUP_ACCT);
+			blkg_rwstat_add(&tg->stat_bytes, bio->bi_opf,
+					bio->bi_iter.bi_size);
+		}
+		blkg_rwstat_add(&tg->stat_ios, bio->bi_opf, 1);
+	}
+
 	/* iops limit is always counted */
 	if (tg->has_rules_iops[rw])
 		return true;