Message ID | ZFWKHRCDsaNDNXIj@slm.duckdns.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | None | expand |
Hi, > BACKGROUND > ========== > > When multiple work items are queued to a workqueue, their execution order > doesn't match the queueing order. They may get executed in any order and > simultaneously. When fully serialized execution - one by one in the queueing > order - is needed, an ordered workqueue should be used which can be created > with alloc_ordered_workqueue(). > > However, alloc_ordered_workqueue() was a later addition. Before it, an > ordered workqueue could be obtained by creating an UNBOUND workqueue with > @max_active==1. This originally was an implementation side-effect which was > broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be > ordered"). Because there were users that depended on the ordered execution, > 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered") > made workqueue allocation path to implicitly promote UNBOUND workqueues w/ > @max_active==1 to ordered workqueues. > > While this has worked okay, overloading the UNBOUND allocation interface > this way creates other issues. It's difficult to tell whether a given > workqueue actually needs to be ordered and users that legitimately want a > min concurrency level wq unexpectedly gets an ordered one instead. With > planned UNBOUND workqueue updates to improve execution locality and more > prevalence of chiplet designs which can benefit from such improvements, this > isn't a state we wanna be in forever. > > This patch series audits all callsites that create an UNBOUND workqueue w/ > @max_active==1 and converts them to alloc_ordered_workqueue() as necessary. > > WHAT TO LOOK FOR > ================ > > The conversions are from > > alloc_workqueue(WQ_UNBOUND | flags, 1, args..) > > to > > alloc_ordered_workqueue(flags, args...) > > which don't cause any functional changes. If you know that fully ordered > execution is not ncessary, please let me know. I'll drop the conversion and > instead add a comment noting the fact to reduce confusion while conversion > is in progress. > > If you aren't fully sure, it's completely fine to let the conversion > through. The behavior will stay exactly the same and we can always > reconsider later. > > As there are follow-up workqueue core changes, I'd really appreciate if the > patch can be routed through the workqueue tree w/ your acks. Thanks. > > v2: btrfs_alloc_workqueue() updated too as suggested by Wang. > > Signed-off-by: Tejun Heo <tj@kernel.org> > Cc: Wang Yugui <wangyugui@e16-tech.com> > Cc: Chris Mason <clm@fb.com> > Cc: Josef Bacik <josef@toxicpanda.com> > Cc: David Sterba <dsterba@suse.com> > Cc: linux-btrfs@vger.kernel.org > --- > Hello, > > Wang, yeah, that's a helper that can't tell whether the caller wants an > ordered wq or not, so it needs to be updated too. How does this look? > > Thanks. > > fs/btrfs/async-thread.c | 7 +++++-- > fs/btrfs/disk-io.c | 2 +- > fs/btrfs/scrub.c | 6 ++++-- > 3 files changed, 10 insertions(+), 5 deletions(-) > > --- a/fs/btrfs/async-thread.c > +++ b/fs/btrfs/async-thread.c > @@ -99,8 +99,11 @@ struct btrfs_workqueue *btrfs_alloc_work > ret->thresh = thresh; > } > > - ret->normal_wq = alloc_workqueue("btrfs-%s", flags, ret->current_active, > - name); > + if (ret->current_active == 1) > + ret->normal_wq = alloc_ordered_workqueue("btrfs-%s", flags, name); > + else > + ret->normal_wq = alloc_workqueue("btrfs-%s", flags, > + ret->current_active, name); > if (!ret->normal_wq) { > kfree(ret); > return NULL; by test, I noticed some warning caused by void workqueue_set_max_active(struct workqueue_struct *wq, int max_active) if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT)) return; so I tested again with the flowing fix diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c index 43c8995..e4b68e9 100644 --- a/fs/btrfs/async-thread.c +++ b/fs/btrfs/async-thread.c @@ -99,8 +99,11 @@ struct btrfs_workqueue *btrfs_alloc_workqueue(struct btrfs_fs_info *fs_info, ret->thresh = thresh; } - ret->normal_wq = alloc_workqueue("btrfs-%s", flags, ret->current_active, - name); + if(limit_active == 1) + ret->normal_wq = alloc_ordered_workqueue("btrfs-%s", flags, name); + else + ret->normal_wq = alloc_workqueue("btrfs-%s", flags, + ret->current_active, name); if (!ret->normal_wq) { kfree(ret); return NULL; @@ -139,7 +139,7 @@ static inline void thresh_exec_hook(struct btrfs_workqueue *wq) long pending; int need_change = 0; - if (wq->thresh == NO_THRESHOLD) + if (wq->thresh == NO_THRESHOLD || wq->limit_active == 1) return; atomic_dec(&wq->pending); we need 'limit_active' at 2nd postition, so I used 'limit_active' and 1st postition too. Best Regards Wang Yugui (wangyugui@e16-tech.com) 2023/05/06 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -2218,7 +2218,7 @@ static int btrfs_init_workqueues(struct > fs_info->qgroup_rescan_workers = > btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0); > fs_info->discard_ctl.discard_workers = > - alloc_workqueue("btrfs_discard", WQ_UNBOUND | WQ_FREEZABLE, 1); > + alloc_ordered_workqueue("btrfs_discard", WQ_FREEZABLE); > > if (!(fs_info->workers && fs_info->hipri_workers && > fs_info->delalloc_workers && fs_info->flush_workers && > --- a/fs/btrfs/scrub.c > +++ b/fs/btrfs/scrub.c > @@ -4245,8 +4245,10 @@ static noinline_for_stack int scrub_work > if (refcount_inc_not_zero(&fs_info->scrub_workers_refcnt)) > return 0; > > - scrub_workers = alloc_workqueue("btrfs-scrub", flags, > - is_dev_replace ? 1 : max_active); > + if (is_dev_replace) > + scrub_workers = alloc_ordered_workqueue("btrfs-scrub", flags); > + else > + scrub_workers = alloc_workqueue("btrfs-scrub", flags, max_active); > if (!scrub_workers) > goto fail_scrub_workers; >
On Sat, May 06, 2023 at 09:40:14AM +0800, Wang Yugui wrote: > by test, I noticed some warning caused by > void workqueue_set_max_active(struct workqueue_struct *wq, int max_active) > if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT)) > return; > > so I tested again with the flowing fix > > diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c > index 43c8995..e4b68e9 100644 > --- a/fs/btrfs/async-thread.c > +++ b/fs/btrfs/async-thread.c > @@ -99,8 +99,11 @@ struct btrfs_workqueue *btrfs_alloc_workqueue(struct btrfs_fs_info *fs_info, > ret->thresh = thresh; > } > > - ret->normal_wq = alloc_workqueue("btrfs-%s", flags, ret->current_active, > - name); > + if(limit_active == 1) > + ret->normal_wq = alloc_ordered_workqueue("btrfs-%s", flags, name); > + else > + ret->normal_wq = alloc_workqueue("btrfs-%s", flags, > + ret->current_active, name); > if (!ret->normal_wq) { > kfree(ret); > return NULL; > @@ -139,7 +139,7 @@ static inline void thresh_exec_hook(struct btrfs_workqueue *wq) > long pending; > int need_change = 0; > > - if (wq->thresh == NO_THRESHOLD) > + if (wq->thresh == NO_THRESHOLD || wq->limit_active == 1) > return; > > atomic_dec(&wq->pending); > > we need 'limit_active' at 2nd postition, so I used 'limit_active' and 1st > postition too. Oh, that most likely means that these workqueues don't need to and shouldn't be ordered. Will update the patch. Thanks.
--- a/fs/btrfs/async-thread.c +++ b/fs/btrfs/async-thread.c @@ -99,8 +99,11 @@ struct btrfs_workqueue *btrfs_alloc_work ret->thresh = thresh; } - ret->normal_wq = alloc_workqueue("btrfs-%s", flags, ret->current_active, - name); + if (ret->current_active == 1) + ret->normal_wq = alloc_ordered_workqueue("btrfs-%s", flags, name); + else + ret->normal_wq = alloc_workqueue("btrfs-%s", flags, + ret->current_active, name); if (!ret->normal_wq) { kfree(ret); return NULL; --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2218,7 +2218,7 @@ static int btrfs_init_workqueues(struct fs_info->qgroup_rescan_workers = btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0); fs_info->discard_ctl.discard_workers = - alloc_workqueue("btrfs_discard", WQ_UNBOUND | WQ_FREEZABLE, 1); + alloc_ordered_workqueue("btrfs_discard", WQ_FREEZABLE); if (!(fs_info->workers && fs_info->hipri_workers && fs_info->delalloc_workers && fs_info->flush_workers && --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -4245,8 +4245,10 @@ static noinline_for_stack int scrub_work if (refcount_inc_not_zero(&fs_info->scrub_workers_refcnt)) return 0; - scrub_workers = alloc_workqueue("btrfs-scrub", flags, - is_dev_replace ? 1 : max_active); + if (is_dev_replace) + scrub_workers = alloc_ordered_workqueue("btrfs-scrub", flags); + else + scrub_workers = alloc_workqueue("btrfs-scrub", flags, max_active); if (!scrub_workers) goto fail_scrub_workers;
BACKGROUND ========== When multiple work items are queued to a workqueue, their execution order doesn't match the queueing order. They may get executed in any order and simultaneously. When fully serialized execution - one by one in the queueing order - is needed, an ordered workqueue should be used which can be created with alloc_ordered_workqueue(). However, alloc_ordered_workqueue() was a later addition. Before it, an ordered workqueue could be obtained by creating an UNBOUND workqueue with @max_active==1. This originally was an implementation side-effect which was broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered"). Because there were users that depended on the ordered execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered") made workqueue allocation path to implicitly promote UNBOUND workqueues w/ @max_active==1 to ordered workqueues. While this has worked okay, overloading the UNBOUND allocation interface this way creates other issues. It's difficult to tell whether a given workqueue actually needs to be ordered and users that legitimately want a min concurrency level wq unexpectedly gets an ordered one instead. With planned UNBOUND workqueue updates to improve execution locality and more prevalence of chiplet designs which can benefit from such improvements, this isn't a state we wanna be in forever. This patch series audits all callsites that create an UNBOUND workqueue w/ @max_active==1 and converts them to alloc_ordered_workqueue() as necessary. WHAT TO LOOK FOR ================ The conversions are from alloc_workqueue(WQ_UNBOUND | flags, 1, args..) to alloc_ordered_workqueue(flags, args...) which don't cause any functional changes. If you know that fully ordered execution is not ncessary, please let me know. I'll drop the conversion and instead add a comment noting the fact to reduce confusion while conversion is in progress. If you aren't fully sure, it's completely fine to let the conversion through. The behavior will stay exactly the same and we can always reconsider later. As there are follow-up workqueue core changes, I'd really appreciate if the patch can be routed through the workqueue tree w/ your acks. Thanks. v2: btrfs_alloc_workqueue() updated too as suggested by Wang. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Wang Yugui <wangyugui@e16-tech.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: David Sterba <dsterba@suse.com> Cc: linux-btrfs@vger.kernel.org --- Hello, Wang, yeah, that's a helper that can't tell whether the caller wants an ordered wq or not, so it needs to be updated too. How does this look? Thanks. fs/btrfs/async-thread.c | 7 +++++-- fs/btrfs/disk-io.c | 2 +- fs/btrfs/scrub.c | 6 ++++-- 3 files changed, 10 insertions(+), 5 deletions(-)