Message ID | 20230428045149.1310073-1-tao1.su@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] block: Skip destroyed blkg when restart in blkg_destroy_all() |
Hi,

On 2023/04/28 12:51, Tao Su wrote:
> Kernel hang in blkg_destroy_all() when total blkg greater than
> BLKG_DESTROY_BATCH_SIZE, because of not removing destroyed blkg in
> blkg_list. So the size of blkg_list is same after destroying a
> batch of blkg, and the infinite 'restart' occurs.
>
> Since blkg should stay on the queue list until blkg_free_workfn(),
> skip destroyed blkg when restart a new round, which will solve this
> kernel hang issue and satisfy the previous will to restart.

Please add a fix tag:

Fixes: f1c006f1c685 ("blk-cgroup: synchronize pd_free_fn() from
blkg_free_workfn() and blkcg_deactivate_policy()")

>
> Reported-by: Xiangfei Ma <xiangfeix.ma@intel.com>
> Tested-by: Xiangfei Ma <xiangfeix.ma@intel.com>
> Tested-by: Farrah Chen <farrah.chen@intel.com>
> Signed-off-by: Yu Kuai <yukuai1@huaweicloud.com>

You can remove this tag, and feel free to add:

Suggested-and-reviewed-by: Yu Kuai <yukuai3@huawei.com>

Thanks,
Kuai

> Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> ---
> v2:
> - change 'directly remove destroyed blkg' to 'skip destroyed blkg'
>
> v1:
> - https://lore.kernel.org/all/20230425075911.839539-1-tao1.su@linux.intel.com/
>
>  block/blk-cgroup.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index bd50b55bdb61..75bad5d60c9f 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -528,6 +528,9 @@ static void blkg_destroy_all(struct gendisk *disk)
>  	list_for_each_entry_safe(blkg, n, &q->blkg_list, q_node) {
>  		struct blkcg *blkcg = blkg->blkcg;
>
> +		if (hlist_unhashed(&blkg->blkcg_node))
> +			continue;
> +
>  		spin_lock(&blkcg->lock);
>  		blkg_destroy(blkg);
>  		spin_unlock(&blkcg->lock);
>
On Fri, Apr 28, 2023 at 02:12:06PM +0800, Yu Kuai wrote:
> Hi,
>
> On 2023/04/28 12:51, Tao Su wrote:
> > Kernel hang in blkg_destroy_all() when total blkg greater than
> > BLKG_DESTROY_BATCH_SIZE, because of not removing destroyed blkg in
> > blkg_list. So the size of blkg_list is same after destroying a
> > batch of blkg, and the infinite 'restart' occurs.
> >
> > Since blkg should stay on the queue list until blkg_free_workfn(),
> > skip destroyed blkg when restart a new round, which will solve this
> > kernel hang issue and satisfy the previous will to restart.
>
> Please add a fix tag:
>
> Fixes: f1c006f1c685 ("blk-cgroup: synchronize pd_free_fn() from
> blkg_free_workfn() and blkcg_deactivate_policy()")
> >
> > Reported-by: Xiangfei Ma <xiangfeix.ma@intel.com>
> > Tested-by: Xiangfei Ma <xiangfeix.ma@intel.com>
> > Tested-by: Farrah Chen <farrah.chen@intel.com>
> > Signed-off-by: Yu Kuai <yukuai1@huaweicloud.com>
>
> You can remove this tag, and feel free to add:
>
> Suggested-and-reviewed-by: Yu Kuai <yukuai3@huawei.com>

Got it, thanks!

Tao

>
> Thanks,
> Kuai
> > Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> > ---
> > v2:
> > - change 'directly remove destroyed blkg' to 'skip destroyed blkg'
> >
> > v1:
> > - https://lore.kernel.org/all/20230425075911.839539-1-tao1.su@linux.intel.com/
> >
> >  block/blk-cgroup.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> > index bd50b55bdb61..75bad5d60c9f 100644
> > --- a/block/blk-cgroup.c
> > +++ b/block/blk-cgroup.c
> > @@ -528,6 +528,9 @@ static void blkg_destroy_all(struct gendisk *disk)
> >  	list_for_each_entry_safe(blkg, n, &q->blkg_list, q_node) {
> >  		struct blkcg *blkcg = blkg->blkcg;
> > +		if (hlist_unhashed(&blkg->blkcg_node))
> > +			continue;
> > +
> >  		spin_lock(&blkcg->lock);
> >  		blkg_destroy(blkg);
> >  		spin_unlock(&blkcg->lock);
> >
>
On Fri, 28 Apr 2023 12:51:49 +0800, Tao Su wrote:
> Kernel hang in blkg_destroy_all() when total blkg greater than
> BLKG_DESTROY_BATCH_SIZE, because of not removing destroyed blkg in
> blkg_list. So the size of blkg_list is same after destroying a
> batch of blkg, and the infinite 'restart' occurs.
>
> Since blkg should stay on the queue list until blkg_free_workfn(),
> skip destroyed blkg when restart a new round, which will solve this
> kernel hang issue and satisfy the previous will to restart.
>
> [...]

Applied, thanks!

[1/1] block: Skip destroyed blkg when restart in blkg_destroy_all()
      commit: 8176080d59e6d4ff9fc97ae534063073b4f7a715

Best regards,
On Fri, Apr 28, 2023 at 11:24:20AM -0600, Jens Axboe wrote:
>
> On Fri, 28 Apr 2023 12:51:49 +0800, Tao Su wrote:
> > Kernel hang in blkg_destroy_all() when total blkg greater than
> > BLKG_DESTROY_BATCH_SIZE, because of not removing destroyed blkg in
> > blkg_list. So the size of blkg_list is same after destroying a
> > batch of blkg, and the infinite 'restart' occurs.
> >
> > Since blkg should stay on the queue list until blkg_free_workfn(),
> > skip destroyed blkg when restart a new round, which will solve this
> > kernel hang issue and satisfy the previous will to restart.
> >
> > [...]
>
> Applied, thanks!

Axboe, thanks!

Tao

>
> [1/1] block: Skip destroyed blkg when restart in blkg_destroy_all()
>       commit: 8176080d59e6d4ff9fc97ae534063073b4f7a715
>
> Best regards,
> --
> Jens Axboe
>
>
>
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index bd50b55bdb61..75bad5d60c9f 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -528,6 +528,9 @@ static void blkg_destroy_all(struct gendisk *disk)
 	list_for_each_entry_safe(blkg, n, &q->blkg_list, q_node) {
 		struct blkcg *blkcg = blkg->blkcg;
 
+		if (hlist_unhashed(&blkg->blkcg_node))
+			continue;
+
 		spin_lock(&blkcg->lock);
 		blkg_destroy(blkg);
 		spin_unlock(&blkcg->lock);