[v5,4/4] blk-cgroup: Document the design of new lockless iostat_cpu list

Message ID 20220602185401.162937-1-longman@redhat.com (mailing list archive)
State New, archived
Series [v5,1/3] blk-cgroup: Correctly free percpu iostat_cpu in blkg on error exit

Commit Message

Waiman Long June 2, 2022, 6:54 p.m. UTC
A set of percpu lockless lists per block cgroup (blkcg) is added to
track the set of recently updated iostat_cpu structures. Add comment
in the code to document the design of this new set of lockless lists.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 block/blk-cgroup.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

Comments

Tejun Heo June 2, 2022, 7:05 p.m. UTC | #1
On Thu, Jun 02, 2022 at 02:54:01PM -0400, Waiman Long wrote:
> A set of percpu lockless lists per block cgroup (blkcg) is added to
> track the set of recently updated iostat_cpu structures. Add comment
> in the code to document the design of this new set of lockless lists.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

Waiman Long June 2, 2022, 7:12 p.m. UTC | #2
On 6/2/22 15:05, Tejun Heo wrote:
> On Thu, Jun 02, 2022 at 02:54:01PM -0400, Waiman Long wrote:
>> A set of percpu lockless lists per block cgroup (blkcg) is added to
>> track the set of recently updated iostat_cpu structures. Add comment
>> in the code to document the design of this new set of lockless lists.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
> Acked-by: Tejun Heo <tj@kernel.org>
>
> Thanks.

I have just realized that I forgot to free the percpu blkcg->lhead in
blkcg_css_free(). I will send out v6 with this fix as well as folding
this documentation patch back in. Sorry for the omission.

Thanks,
Longman
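
As an illustration of the design the patch comment documents, here is a minimal user-space sketch, not the kernel implementation. It assumes C11 atomics in place of the kernel's llist API, a fixed NR_CPUS array in place of real percpu data, a separate `queued` flag instead of the lnode.next trick, and a plain counter in place of the blkg reference count. All names (stats_init, stat_update, stat_flush, lhead as a global array) are hypothetical, and the demo is meant to be driven from a single thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS  2  /* hypothetical fixed CPU count for the sketch */
#define NR_BLKGS 4  /* hypothetical number of blkgs (one per device) */

/* Stand-in for the percpu iostat_cpu of one blkg. */
struct iostat_cpu {
	struct iostat_cpu *next; /* link in the per-CPU lockless list */
	atomic_bool queued;      /* true while the entry is on a list */
	unsigned long pending;   /* stats accumulated since last flush */
	int blkg_id;             /* owning blkg */
};

/* Stand-in for blkcg_gq: one per block device. */
struct blkg {
	atomic_int refcnt;       /* simplified reference count */
	unsigned long total;     /* flushed stats */
	struct iostat_cpu stat[NR_CPUS];
};

/* One list head per CPU, standing in for the percpu lhead of one blkcg. */
_Atomic(struct iostat_cpu *) lhead[NR_CPUS];
struct blkg blkgs[NR_BLKGS];

void stats_init(void)
{
	for (int id = 0; id < NR_BLKGS; id++)
		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			blkgs[id].stat[cpu].blkg_id = id;
}

/* Update side: record stats and queue the entry if not already queued. */
void stat_update(int cpu, int id, unsigned long bytes)
{
	assert(cpu >= 0 && cpu < NR_CPUS && id >= 0 && id < NR_BLKGS);
	struct iostat_cpu *is = &blkgs[id].stat[cpu];

	is->pending += bytes;
	if (!atomic_exchange(&is->queued, true)) {
		/* Hold the blkg while its entry sits on the list. */
		atomic_fetch_add(&blkgs[id].refcnt, 1);
		/* Lock-free push onto this CPU's list. */
		struct iostat_cpu *old = atomic_load(&lhead[cpu]);
		do {
			is->next = old;
		} while (!atomic_compare_exchange_weak(&lhead[cpu], &old, is));
	}
}

/* Flush side: detach the whole list, then visit only queued entries. */
int stat_flush(int cpu)
{
	struct iostat_cpu *is =
		atomic_exchange(&lhead[cpu], (struct iostat_cpu *)NULL);
	int visited = 0;

	while (is) {
		struct iostat_cpu *next = is->next;
		struct blkg *blkg = &blkgs[is->blkg_id];

		atomic_store(&is->queued, false);
		blkg->total += is->pending;
		is->pending = 0;
		/* Drop the reference taken on the update side. */
		atomic_fetch_sub(&blkg->refcnt, 1);
		is = next;
		visited++;
	}
	return visited;
}
```

With this bookkeeping, a flush for one CPU walks only the entries updated on that CPU since the previous flush, rather than every blkg attached to the blkcg, which is the saving the comment describes for systems with many block devices.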

Patch

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 8af97f3b2fc9..f8f27551c16a 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -60,6 +60,21 @@  static struct workqueue_struct *blkcg_punt_bio_wq;
 #define BLKG_DESTROY_BATCH_SIZE  64
 
 /*
+ * Lockless lists for tracking IO stats update
+ *
+ * New IO stats are stored in the percpu iostat_cpu within blkcg_gq (blkg).
+ * There are multiple blkg's (one for each block device) attached to each
+ * blkcg. The rstat code keeps track of which cpu has IO stats updated,
+ * but it doesn't know which blkg has the updated stats. If there are many
+ * block devices in a system, the cost of iterating all the blkg's to flush
+ * out the IO stats can be high. To reduce such overhead, a set of percpu
+ * lockless lists (lhead) per blkcg are used to track the set of recently
+ * updated iostat_cpu's since the last flush. An iostat_cpu will be put
+ * onto the lockless list on the update side [blk_cgroup_bio_start()] if
+ * not there yet and then removed when being flushed [blkcg_rstat_flush()].
+ * References to blkg are taken and then released in the process to
+ * protect against blkg removal.
+ *
  * lnode.next of the last entry in a lockless list is NULL. To enable us to
  * use lnode.next as a boolean flag to indicate its presence in a lockless
  * list, we have to make it non-NULL for all. This is done by using a