
[01/10] btrfs: use a plain workqueue for ordered_extent processing

Message ID 20230314165910.373347-2-hch@lst.de (mailing list archive)
State New, archived
Series [01/10] btrfs: use a plain workqueue for ordered_extent processing

Commit Message

Christoph Hellwig March 14, 2023, 4:59 p.m. UTC
The endio_write_workers and endio_freespace_workers workqueues don't use
any of the ordering features in the btrfs_workqueue, so switch them to
plain Linux workqueues.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/disk-io.c      | 16 ++++++++--------
 fs/btrfs/fs.h           |  4 ++--
 fs/btrfs/ordered-data.c |  8 ++++----
 fs/btrfs/ordered-data.h |  2 +-
 fs/btrfs/super.c        |  2 --
 5 files changed, 15 insertions(+), 17 deletions(-)

Comments

Johannes Thumshirn March 16, 2023, 5:10 p.m. UTC | #1
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
David Sterba March 16, 2023, 5:31 p.m. UTC | #2
On Tue, Mar 14, 2023 at 05:59:01PM +0100, Christoph Hellwig wrote:
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -1632,8 +1632,6 @@ static void btrfs_resize_thread_pool(struct btrfs_fs_info *fs_info,
>  	btrfs_workqueue_set_max(fs_info->hipri_workers, new_pool_size);
>  	btrfs_workqueue_set_max(fs_info->delalloc_workers, new_pool_size);
>  	btrfs_workqueue_set_max(fs_info->caching_workers, new_pool_size);
> -	btrfs_workqueue_set_max(fs_info->endio_write_workers, new_pool_size);
> -	btrfs_workqueue_set_max(fs_info->endio_freespace_worker, new_pool_size);

I hadn't noticed it in the past workqueue patches, but here we lose the
connection between the thread_pool mount option and max_active per workqueue.
Christoph Hellwig March 20, 2023, 6:12 a.m. UTC | #3
On Thu, Mar 16, 2023 at 06:31:34PM +0100, David Sterba wrote:
> On Tue, Mar 14, 2023 at 05:59:01PM +0100, Christoph Hellwig wrote:
> > --- a/fs/btrfs/super.c
> > +++ b/fs/btrfs/super.c
> > @@ -1632,8 +1632,6 @@ static void btrfs_resize_thread_pool(struct btrfs_fs_info *fs_info,
> >  	btrfs_workqueue_set_max(fs_info->hipri_workers, new_pool_size);
> >  	btrfs_workqueue_set_max(fs_info->delalloc_workers, new_pool_size);
> >  	btrfs_workqueue_set_max(fs_info->caching_workers, new_pool_size);
> > -	btrfs_workqueue_set_max(fs_info->endio_write_workers, new_pool_size);
> > -	btrfs_workqueue_set_max(fs_info->endio_freespace_worker, new_pool_size);
> 
> I hadn't noticed it in the past workqueue patches, but here we lose the
> connection between the thread_pool mount option and max_active per workqueue.

Looking at the code in detail, we do, but only for the freespace-write
workqueue.

endio-write passes 2 as the thresh argument to btrfs_alloc_workqueue,
which is below DFT_THRESHOLD, so all of the max_active tracking is
disabled.

Which makes sense to me - the max_active tracking is fairly expensive,
and the cost of more active items in the workqueue code is non-existent
until they are actually used.

The history here is a bit weird.  The initial version of
endio_freespace_worker was added by Josef in commit 0cb59c995317
("Btrfs: write out free space cache").  This was back in the days of the
old btrfs workers, which weren't using workqueues at all, and the
new freespace workers passed in a thread_pool_size of 1, instead of the
fs_info->thread_pool_size used for the normal endio workers.

Qu then replaced the btrfs_workers with the btrfs_workqueue in commit
fccb5d86d8f5 ("btrfs: Replace fs_info->endio_* workqueue with btrfs_workqueue."),
which stopped looking at the thread_pool_size for both of them and
instead passed down the thread_pool_size (as the max_active local
variable) to both, while limiting thresh to 2 for endio-write but using
the default (by passing 0) for the free-space write workqueue.  That
seems like a big change to the original behavior, without much of an
explanation in the commit log.

So as far as I can tell this change (accidentally) restores behavior
that is closer, but not identical, to the 2014 behavior, and is very
sensible, but it needs to be documented much better or even be split
out into a prep patch.

This little exercise also makes me wonder what the point of the
threshold tracking in the btrfs_workqueue is to start with.  I think
the original idea, based on some of the commit logs, was to be able
to set a higher max_active than the workqueue default using
workqueue_set_max_active for HDD workloads via the thread_pool
mount option.  But given that a higher max_active in the workqueue
code has no extra resource cost until the additional workers are
created on demand, I think just wiring up workqueue_set_max_active to
btrfs_workqueue_set_max is a much better option than the separate
tracking that duplicates the core workqueue functionality.  But that
is left for another series.
Qu Wenruo March 20, 2023, 11:08 a.m. UTC | #4
On 2023/3/15 00:59, Christoph Hellwig wrote:
> The endio_write_workers and endio_freespace_workers workqueues don't use
> any of the ordering features in the btrfs_workqueue, so switch them to
> plain Linux workqueues.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   fs/btrfs/disk-io.c      | 16 ++++++++--------
>   fs/btrfs/fs.h           |  4 ++--
>   fs/btrfs/ordered-data.c |  8 ++++----
>   fs/btrfs/ordered-data.h |  2 +-
>   fs/btrfs/super.c        |  2 --
>   5 files changed, 15 insertions(+), 17 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index bb864cf2eed60f..0889eb81e71a7d 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1993,8 +1993,10 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
>   		destroy_workqueue(fs_info->rmw_workers);
>   	if (fs_info->compressed_write_workers)
>   		destroy_workqueue(fs_info->compressed_write_workers);
> -	btrfs_destroy_workqueue(fs_info->endio_write_workers);
> -	btrfs_destroy_workqueue(fs_info->endio_freespace_worker);
> +	if (fs_info->endio_write_workers)
> +		destroy_workqueue(fs_info->endio_write_workers);
> +	if (fs_info->endio_freespace_worker)
> +		destroy_workqueue(fs_info->endio_freespace_worker);
>   	btrfs_destroy_workqueue(fs_info->delayed_workers);
>   	btrfs_destroy_workqueue(fs_info->caching_workers);
>   	btrfs_destroy_workqueue(fs_info->flush_workers);
> @@ -2204,13 +2206,11 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
>   		alloc_workqueue("btrfs-endio-meta", flags, max_active);
>   	fs_info->rmw_workers = alloc_workqueue("btrfs-rmw", flags, max_active);
>   	fs_info->endio_write_workers =
> -		btrfs_alloc_workqueue(fs_info, "endio-write", flags,
> -				      max_active, 2);
> +		alloc_workqueue("btrfs-endio-write", flags, max_active);
>   	fs_info->compressed_write_workers =
>   		alloc_workqueue("btrfs-compressed-write", flags, max_active);
>   	fs_info->endio_freespace_worker =
> -		btrfs_alloc_workqueue(fs_info, "freespace-write", flags,
> -				      max_active, 0);
> +		alloc_workqueue("btrfs-freespace-write", flags, max_active);
>   	fs_info->delayed_workers =
>   		btrfs_alloc_workqueue(fs_info, "delayed-meta", flags,
>   				      max_active, 0);
> @@ -4536,9 +4536,9 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
>   	 * the final btrfs_put_ordered_extent() (which must happen at
>   	 * btrfs_finish_ordered_io() when we are unmounting).
>   	 */
> -	btrfs_flush_workqueue(fs_info->endio_write_workers);
> +	flush_workqueue(fs_info->endio_write_workers);
>   	/* Ordered extents for free space inodes. */
> -	btrfs_flush_workqueue(fs_info->endio_freespace_worker);
> +	flush_workqueue(fs_info->endio_freespace_worker);
>   	btrfs_run_delayed_iputs(fs_info);
>   
>   	cancel_work_sync(&fs_info->async_reclaim_work);
> diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
> index 20d554a0c2ac0d..276a17780f2b1b 100644
> --- a/fs/btrfs/fs.h
> +++ b/fs/btrfs/fs.h
> @@ -542,8 +542,8 @@ struct btrfs_fs_info {
>   	struct workqueue_struct *endio_meta_workers;
>   	struct workqueue_struct *rmw_workers;
>   	struct workqueue_struct *compressed_write_workers;
> -	struct btrfs_workqueue *endio_write_workers;
> -	struct btrfs_workqueue *endio_freespace_worker;
> +	struct workqueue_struct *endio_write_workers;
> +	struct workqueue_struct *endio_freespace_worker;
>   	struct btrfs_workqueue *caching_workers;
>   
>   	/*
> diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
> index 1848d0d1a9c41e..23f496f0d7b776 100644
> --- a/fs/btrfs/ordered-data.c
> +++ b/fs/btrfs/ordered-data.c
> @@ -303,7 +303,7 @@ void btrfs_add_ordered_sum(struct btrfs_ordered_extent *entry,
>   	spin_unlock_irq(&tree->lock);
>   }
>   
> -static void finish_ordered_fn(struct btrfs_work *work)
> +static void finish_ordered_fn(struct work_struct *work)
>   {
>   	struct btrfs_ordered_extent *ordered_extent;
>   
> @@ -330,7 +330,7 @@ void btrfs_mark_ordered_io_finished(struct btrfs_inode *inode,
>   {
>   	struct btrfs_ordered_inode_tree *tree = &inode->ordered_tree;
>   	struct btrfs_fs_info *fs_info = inode->root->fs_info;
> -	struct btrfs_workqueue *wq;
> +	struct workqueue_struct *wq;
>   	struct rb_node *node;
>   	struct btrfs_ordered_extent *entry = NULL;
>   	unsigned long flags;
> @@ -439,8 +439,8 @@ void btrfs_mark_ordered_io_finished(struct btrfs_inode *inode,
>   			refcount_inc(&entry->refs);
>   			trace_btrfs_ordered_extent_mark_finished(inode, entry);
>   			spin_unlock_irqrestore(&tree->lock, flags);
> -			btrfs_init_work(&entry->work, finish_ordered_fn, NULL, NULL);
> -			btrfs_queue_work(wq, &entry->work);
> +			INIT_WORK(&entry->work, finish_ordered_fn);
> +			queue_work(wq, &entry->work);
>   			spin_lock_irqsave(&tree->lock, flags);
>   		}
>   		cur += len;
> diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h
> index 18007f9c00add8..b8a92f040458f0 100644
> --- a/fs/btrfs/ordered-data.h
> +++ b/fs/btrfs/ordered-data.h
> @@ -146,7 +146,7 @@ struct btrfs_ordered_extent {
>   	/* a per root list of all the pending ordered extents */
>   	struct list_head root_extent_list;
>   
> -	struct btrfs_work work;
> +	struct work_struct work;
>   
>   	struct completion completion;
>   	struct btrfs_work flush_work;
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index d8885966e801cd..065b4fab1ee011 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -1632,8 +1632,6 @@ static void btrfs_resize_thread_pool(struct btrfs_fs_info *fs_info,
>   	btrfs_workqueue_set_max(fs_info->hipri_workers, new_pool_size);
>   	btrfs_workqueue_set_max(fs_info->delalloc_workers, new_pool_size);
>   	btrfs_workqueue_set_max(fs_info->caching_workers, new_pool_size);
> -	btrfs_workqueue_set_max(fs_info->endio_write_workers, new_pool_size);
> -	btrfs_workqueue_set_max(fs_info->endio_freespace_worker, new_pool_size);

I guess we need workqueue_set_max_active() for all our plain workqueues 
too.

Thanks,
Qu

>   	btrfs_workqueue_set_max(fs_info->delayed_workers, new_pool_size);
>   }
>
Qu Wenruo March 20, 2023, 11:35 a.m. UTC | #5
On 2023/3/20 19:08, Qu Wenruo wrote:
> 
> 
> On 2023/3/15 00:59, Christoph Hellwig wrote:
>> The endio_write_workers and endio_freespace_workers workqueues don't use
>> any of the ordering features in the btrfs_workqueue, so switch them to
>> plain Linux workqueues.
>>
>> [...]
>>
>> @@ -1632,8 +1632,6 @@ static void btrfs_resize_thread_pool(struct 
>> btrfs_fs_info *fs_info,
>>       btrfs_workqueue_set_max(fs_info->hipri_workers, new_pool_size);
>>       btrfs_workqueue_set_max(fs_info->delalloc_workers, new_pool_size);
>>       btrfs_workqueue_set_max(fs_info->caching_workers, new_pool_size);
>> -    btrfs_workqueue_set_max(fs_info->endio_write_workers, 
>> new_pool_size);
>> -    btrfs_workqueue_set_max(fs_info->endio_freespace_worker, 
>> new_pool_size);
> 
> I guess we need workqueue_set_max_active() for all our plain workqueues 
> too.

In fact, I believe we not only need to add workqueue_set_max_active() 
for the thread_pool= mount option, but also to add a test case for 
thread_pool=1 to shake out all the possible hidden bugs...

Would you mind if I send a patch adding the max_active setting for all 
plain workqueues?

Thanks,
Qu

> 
> Thanks,
> Qu
> 
>>       btrfs_workqueue_set_max(fs_info->delayed_workers, new_pool_size);
>>   }
Christoph Hellwig March 20, 2023, 12:24 p.m. UTC | #6
On Mon, Mar 20, 2023 at 07:35:45PM +0800, Qu Wenruo wrote:
> In fact, I believe we not only need to add workqueue_set_max_active() for 
> the thread_pool= mount option, but also to add a test case for thread_pool=1 
> to shake out all the possible hidden bugs...
>
> Would you mind if I send a patch adding the max_active setting for all 
> plain workqueues?

I don't think blindly doing that is a good idea.  As explained in my
reply to Dave, going all the way back to 2014, all workqueues that had
a less-than-default threshold never used it to start with.

I'm also really curious (and I might have to do some digging) what
the intended use case is, and if it actually works as-is.  I know
one of your mails mentioned higher concurrency for some HDD
workloads - do you still remember what the workloads are?  Because
I'm pretty sure it won't be needed for all workqueues, and the fact
that btrfs is the only caller of workqueue_set_max_active in the
entire kernel makes me very sceptical that we do need it everywhere.

So I'd be much happier to figure out where we needed it first, but even
if not and we want to restore historic behavior from some point in the
past we'd still only need to apply it selectively.
Qu Wenruo March 20, 2023, 11:19 p.m. UTC | #7
On 2023/3/20 20:24, Christoph Hellwig wrote:
> On Mon, Mar 20, 2023 at 07:35:45PM +0800, Qu Wenruo wrote:
>> In fact, I believe we not only need to add workqueue_set_max_active() for
>> the thread_pool= mount option, but also to add a test case for thread_pool=1
>> to shake out all the possible hidden bugs...
>>
>> Would you mind if I send a patch adding the max_active setting for all
>> plain workqueues?
> 
> I don't think blindly doing that is a good idea.  As explained in my
> reply to Dave, going all the way back to 2014, all workqueues that had
> a less-than-default threshold never used it to start with.
> 
> I'm also really curious (and I might have to do some digging) what
> the intended use case is, and if it actually works as-is.  I know
> one of your mails mentioned higher concurrency for some HDD
> workloads - do you still remember what the workloads are?

In fact, recently we had a patchset trying to add a new mount option 
to pin workqueue loads to certain CPUs.

The idea of that patchset is to limit the CPU usage for compression/csum 
generation.

This can also apply to scrub workload.


The other thing is, I also want to test whether we could really hit 
certain deadlocks related to workqueues.
Currently the thread_pool= mount option only changes the btrfs workqueues, 
not the new plain ones.

Thanks,
Qu
>  Because
> I'm pretty sure it won't be needed for all workqueues, and the fact
> that btrfs is the only caller of workqueue_set_max_active in the
> entire kernel makes me very sceptical that we do need it everywhere.
> 
> So I'd be much happier to figure out where we needed it first, but even
> if not and we want to restore historic behavior from some point in the
> past we'd still only need to apply it selectively.
Christoph Hellwig March 21, 2023, 12:48 p.m. UTC | #8
On Tue, Mar 21, 2023 at 07:19:30AM +0800, Qu Wenruo wrote:
>> I'm also really curious (and I might have to do some digging) what
>> the intended use case is, and if it actually works as-is.  I know
>> one of your workloads mentioned a higher concurrency for some HDD
>> workloads, do you still remember what the workloads are?
>
> In fact, recently we had a patchset trying to add a new mount option to 
> pin workqueue loads to certain CPUs.
>
> The idea of that patchset is to limit the CPU usage for compression/csum 
> generation.
>
> This can also apply to scrub workload.

... and is totally unrelated to using either a core workqueue max_active
value or reimplementing that in btrfs.

> The other thing is, I also want to test whether we could really hit 
> certain deadlocks related to workqueues.
> Currently the thread_pool= mount option only changes the btrfs workqueues, not 
> the new plain ones.

What kind of deadlock testing do you want, and why does it apply
to all workqueues in btrfs and not other workqueues?  Note that you'd
also need to change the btrfs_workqueue code for it to actually apply
literally everywhere.

Maybe we can start by going back to my request and actually coming
up with a definition of what thread_pool= is supposed to affect, and
how users should pick values for it.  There is a definition in the
btrfs(5) man page, but that one has been wrong at least since 2014
and the switch to use workqueues.

Patch

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index bb864cf2eed60f..0889eb81e71a7d 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1993,8 +1993,10 @@  static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
 		destroy_workqueue(fs_info->rmw_workers);
 	if (fs_info->compressed_write_workers)
 		destroy_workqueue(fs_info->compressed_write_workers);
-	btrfs_destroy_workqueue(fs_info->endio_write_workers);
-	btrfs_destroy_workqueue(fs_info->endio_freespace_worker);
+	if (fs_info->endio_write_workers)
+		destroy_workqueue(fs_info->endio_write_workers);
+	if (fs_info->endio_freespace_worker)
+		destroy_workqueue(fs_info->endio_freespace_worker);
 	btrfs_destroy_workqueue(fs_info->delayed_workers);
 	btrfs_destroy_workqueue(fs_info->caching_workers);
 	btrfs_destroy_workqueue(fs_info->flush_workers);
@@ -2204,13 +2206,11 @@  static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
 		alloc_workqueue("btrfs-endio-meta", flags, max_active);
 	fs_info->rmw_workers = alloc_workqueue("btrfs-rmw", flags, max_active);
 	fs_info->endio_write_workers =
-		btrfs_alloc_workqueue(fs_info, "endio-write", flags,
-				      max_active, 2);
+		alloc_workqueue("btrfs-endio-write", flags, max_active);
 	fs_info->compressed_write_workers =
 		alloc_workqueue("btrfs-compressed-write", flags, max_active);
 	fs_info->endio_freespace_worker =
-		btrfs_alloc_workqueue(fs_info, "freespace-write", flags,
-				      max_active, 0);
+		alloc_workqueue("btrfs-freespace-write", flags, max_active);
 	fs_info->delayed_workers =
 		btrfs_alloc_workqueue(fs_info, "delayed-meta", flags,
 				      max_active, 0);
@@ -4536,9 +4536,9 @@  void __cold close_ctree(struct btrfs_fs_info *fs_info)
 	 * the final btrfs_put_ordered_extent() (which must happen at
 	 * btrfs_finish_ordered_io() when we are unmounting).
 	 */
-	btrfs_flush_workqueue(fs_info->endio_write_workers);
+	flush_workqueue(fs_info->endio_write_workers);
 	/* Ordered extents for free space inodes. */
-	btrfs_flush_workqueue(fs_info->endio_freespace_worker);
+	flush_workqueue(fs_info->endio_freespace_worker);
 	btrfs_run_delayed_iputs(fs_info);
 
 	cancel_work_sync(&fs_info->async_reclaim_work);
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 20d554a0c2ac0d..276a17780f2b1b 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -542,8 +542,8 @@  struct btrfs_fs_info {
 	struct workqueue_struct *endio_meta_workers;
 	struct workqueue_struct *rmw_workers;
 	struct workqueue_struct *compressed_write_workers;
-	struct btrfs_workqueue *endio_write_workers;
-	struct btrfs_workqueue *endio_freespace_worker;
+	struct workqueue_struct *endio_write_workers;
+	struct workqueue_struct *endio_freespace_worker;
 	struct btrfs_workqueue *caching_workers;
 
 	/*
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 1848d0d1a9c41e..23f496f0d7b776 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -303,7 +303,7 @@  void btrfs_add_ordered_sum(struct btrfs_ordered_extent *entry,
 	spin_unlock_irq(&tree->lock);
 }
 
-static void finish_ordered_fn(struct btrfs_work *work)
+static void finish_ordered_fn(struct work_struct *work)
 {
 	struct btrfs_ordered_extent *ordered_extent;
 
@@ -330,7 +330,7 @@  void btrfs_mark_ordered_io_finished(struct btrfs_inode *inode,
 {
 	struct btrfs_ordered_inode_tree *tree = &inode->ordered_tree;
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
-	struct btrfs_workqueue *wq;
+	struct workqueue_struct *wq;
 	struct rb_node *node;
 	struct btrfs_ordered_extent *entry = NULL;
 	unsigned long flags;
@@ -439,8 +439,8 @@  void btrfs_mark_ordered_io_finished(struct btrfs_inode *inode,
 			refcount_inc(&entry->refs);
 			trace_btrfs_ordered_extent_mark_finished(inode, entry);
 			spin_unlock_irqrestore(&tree->lock, flags);
-			btrfs_init_work(&entry->work, finish_ordered_fn, NULL, NULL);
-			btrfs_queue_work(wq, &entry->work);
+			INIT_WORK(&entry->work, finish_ordered_fn);
+			queue_work(wq, &entry->work);
 			spin_lock_irqsave(&tree->lock, flags);
 		}
 		cur += len;
diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h
index 18007f9c00add8..b8a92f040458f0 100644
--- a/fs/btrfs/ordered-data.h
+++ b/fs/btrfs/ordered-data.h
@@ -146,7 +146,7 @@  struct btrfs_ordered_extent {
 	/* a per root list of all the pending ordered extents */
 	struct list_head root_extent_list;
 
-	struct btrfs_work work;
+	struct work_struct work;
 
 	struct completion completion;
 	struct btrfs_work flush_work;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d8885966e801cd..065b4fab1ee011 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1632,8 +1632,6 @@  static void btrfs_resize_thread_pool(struct btrfs_fs_info *fs_info,
 	btrfs_workqueue_set_max(fs_info->hipri_workers, new_pool_size);
 	btrfs_workqueue_set_max(fs_info->delalloc_workers, new_pool_size);
 	btrfs_workqueue_set_max(fs_info->caching_workers, new_pool_size);
-	btrfs_workqueue_set_max(fs_info->endio_write_workers, new_pool_size);
-	btrfs_workqueue_set_max(fs_info->endio_freespace_worker, new_pool_size);
 	btrfs_workqueue_set_max(fs_info->delayed_workers, new_pool_size);
 }