diff mbox series

[v3] btrfs: zoned: btrfs: zoned: use greedy gc for auto reclaim

Message ID 667291b2902cad926bbff8d5e9124166796cba32.1634204285.git.johannes.thumshirn@wdc.com (mailing list archive)
State New, archived
Series [v3] btrfs: zoned: btrfs: zoned: use greedy gc for auto reclaim

Commit Message

Johannes Thumshirn Oct. 14, 2021, 9:39 a.m. UTC
Currently auto reclaim of unusable zones reclaims the block-groups in the
order they have been added to the reclaim list.

Change this to a greedy algorithm by sorting the list so we have the
block-groups with the least amount of valid bytes reclaimed first.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

---
Changes since v2:
- Go back to the RFC state, as we must not access ->bg_list
  without taking the lock. (Nikolay)

Changes since v1:
-  Changed list_sort() comparator to 'boolean' style

Changes since RFC:
- Updated the patch description
- Don't sort the list under the spin_lock (David)
---
 fs/btrfs/block-group.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

Comments

Nikolay Borisov Oct. 14, 2021, 9:42 a.m. UTC | #1
On 14.10.21 г. 12:39, Johannes Thumshirn wrote:
> Currently auto reclaim of unusable zones reclaims the block-groups in the
> order they have been added to the reclaim list.
> 
> Change this to a greedy algorithm by sorting the list so we have the
> block-groups with the least amount of valid bytes reclaimed first.
> 
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> 
> ---
> Changes since v2:
> - Go back to the RFC state, as we must not access ->bg_list
>   without taking the lock. (Nikolay)

So following my feedback on his v2, Johannes added an assert for the
emptiness of the list, which triggered. Turns out that's due to the fact
that unpin_extent_range calls __btrfs_add_free_space_zoned, which adds a
bg via its ->bg_list to the reclaim list. Actually, accessing any of the
block groups on the reclaim_bgs list without holding unused_bgs_lock is
wrong, because even if reclaim_bgs is spliced to a local list, each
individual block group can still be accessed by other parts of the code
and its ->bg_list used to link it to some list; all of this happens under
unused_bgs_lock. So that's the reason why we need to keep the code as is.


> 
> Changes since v1:
> -  Changed list_sort() comparator to 'boolean' style
> 
> Changes since RFC:
> - Updated the patch description
> - Don't sort the list under the spin_lock (David)
> ---
>  fs/btrfs/block-group.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 7dba9028c80c..77e6224866c1 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -1,5 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0
>  
> +#include <linux/list_sort.h>
> +
>  #include "misc.h"
>  #include "ctree.h"
>  #include "block-group.h"
> @@ -1486,6 +1488,21 @@ void btrfs_mark_bg_unused(struct btrfs_block_group *bg)
>  	spin_unlock(&fs_info->unused_bgs_lock);
>  }
>  
> +/*
> + * We want block groups with a low number of used bytes to be in the beginning
> + * of the list, so they will get reclaimed first.
> + */
> +static int reclaim_bgs_cmp(void *unused, const struct list_head *a,
> +			   const struct list_head *b)
> +{
> +	const struct btrfs_block_group *bg1, *bg2;
> +
> +	bg1 = list_entry(a, struct btrfs_block_group, bg_list);
> +	bg2 = list_entry(b, struct btrfs_block_group, bg_list);
> +
> +	return bg1->used - bg2->used;
> +}
> +
>  void btrfs_reclaim_bgs_work(struct work_struct *work)
>  {
>  	struct btrfs_fs_info *fs_info =
> @@ -1510,6 +1527,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
>  	}
>  
>  	spin_lock(&fs_info->unused_bgs_lock);
> +	list_sort(NULL, &fs_info->reclaim_bgs, reclaim_bgs_cmp);
>  	while (!list_empty(&fs_info->reclaim_bgs)) {
>  		u64 zone_unusable;
>  		int ret = 0;
>
David Sterba Oct. 18, 2021, 3:39 p.m. UTC | #2
On Thu, Oct 14, 2021 at 12:42:21PM +0300, Nikolay Borisov wrote:
> 
> 
> On 14.10.21 г. 12:39, Johannes Thumshirn wrote:
> > Currently auto reclaim of unusable zones reclaims the block-groups in the
> > order they have been added to the reclaim list.
> > 
> > Change this to a greedy algorithm by sorting the list so we have the
> > block-groups with the least amount of valid bytes reclaimed first.
> > 
> > Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> > 
> > ---
> > Changes since v2:
> > - Go back to the RFC state, as we must not access ->bg_list
> >   without taking the lock. (Nikolay)
> 
> So following my feedback on his v2, Johannes added an assert for the
> emptiness of the list, which triggered. Turns out that's due to the fact
> that unpin_extent_range calls __btrfs_add_free_space_zoned, which adds a
> bg via its ->bg_list to the reclaim list. Actually, accessing any of the
> block groups on the reclaim_bgs list without holding unused_bgs_lock is
> wrong, because even if reclaim_bgs is spliced to a local list, each
> individual block group can still be accessed by other parts of the code
> and its ->bg_list used to link it to some list; all of this happens under
> unused_bgs_lock. So that's the reason why we need to keep the code as is.

Thanks, this is exactly what should appear in the changelog in case
there's an implementation we might do but can't for $reasons. I've
written a summary as a note.
David Sterba Oct. 18, 2021, 3:42 p.m. UTC | #3
On Thu, Oct 14, 2021 at 06:39:02PM +0900, Johannes Thumshirn wrote:
> Currently auto reclaim of unusable zones reclaims the block-groups in the
> order they have been added to the reclaim list.
> 
> Change this to a greedy algorithm by sorting the list so we have the
> block-groups with the least amount of valid bytes reclaimed first.
> 
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> 
> ---
> Changes since v2:
> - Go back to the RFC state, as we must not access ->bg_list
>   without taking the lock. (Nikolay)
> 
> Changes since v1:
> -  Changed list_sort() comparator to 'boolean' style
> 
> Changes since RFC:
> - Updated the patch description
> - Don't sort the list under the spin_lock (David)
> ---
>  fs/btrfs/block-group.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 7dba9028c80c..77e6224866c1 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -1,5 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0
>  
> +#include <linux/list_sort.h>
> +
>  #include "misc.h"
>  #include "ctree.h"
>  #include "block-group.h"
> @@ -1486,6 +1488,21 @@ void btrfs_mark_bg_unused(struct btrfs_block_group *bg)
>  	spin_unlock(&fs_info->unused_bgs_lock);
>  }
>  
> +/*
> + * We want block groups with a low number of used bytes to be in the beginning
> + * of the list, so they will get reclaimed first.
> + */
> +static int reclaim_bgs_cmp(void *unused, const struct list_head *a,
> +			   const struct list_head *b)
> +{
> +	const struct btrfs_block_group *bg1, *bg2;
> +
> +	bg1 = list_entry(a, struct btrfs_block_group, bg_list);
> +	bg2 = list_entry(b, struct btrfs_block_group, bg_list);
> +
> +	return bg1->used - bg2->used;

So you also reverted to v1 the compare condition, this should be < so
it's the valid stable sort condition.
Johannes Thumshirn Oct. 18, 2021, 3:51 p.m. UTC | #4
On 18/10/2021 17:42, David Sterba wrote:
> On Thu, Oct 14, 2021 at 06:39:02PM +0900, Johannes Thumshirn wrote:
>> Currently auto reclaim of unusable zones reclaims the block-groups in the
>> order they have been added to the reclaim list.
>>
>> Change this to a greedy algorithm by sorting the list so we have the
>> block-groups with the least amount of valid bytes reclaimed first.
>>
>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>>
>> ---
>> Changes since v2:
>> - Go back to the RFC state, as we must not access ->bg_list
>>   without taking the lock. (Nikolay)
>>
>> Changes since v1:
>> -  Changed list_sort() comparator to 'boolean' style
>>
>> Changes since RFC:
>> - Updated the patch description
>> - Don't sort the list under the spin_lock (David)
>> ---
>>  fs/btrfs/block-group.c | 18 ++++++++++++++++++
>>  1 file changed, 18 insertions(+)
>>
>> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
>> index 7dba9028c80c..77e6224866c1 100644
>> --- a/fs/btrfs/block-group.c
>> +++ b/fs/btrfs/block-group.c
>> @@ -1,5 +1,7 @@
>>  // SPDX-License-Identifier: GPL-2.0
>>  
>> +#include <linux/list_sort.h>
>> +
>>  #include "misc.h"
>>  #include "ctree.h"
>>  #include "block-group.h"
>> @@ -1486,6 +1488,21 @@ void btrfs_mark_bg_unused(struct btrfs_block_group *bg)
>>  	spin_unlock(&fs_info->unused_bgs_lock);
>>  }
>>  
>> +/*
>> + * We want block groups with a low number of used bytes to be in the beginning
>> + * of the list, so they will get reclaimed first.
>> + */
>> +static int reclaim_bgs_cmp(void *unused, const struct list_head *a,
>> +			   const struct list_head *b)
>> +{
>> +	const struct btrfs_block_group *bg1, *bg2;
>> +
>> +	bg1 = list_entry(a, struct btrfs_block_group, bg_list);
>> +	bg2 = list_entry(b, struct btrfs_block_group, bg_list);
>> +
>> +	return bg1->used - bg2->used;
> 
> So you also reverted to v1 the compare condition, this should be < so
> it's the valid stable sort condition.
> 

Ah damn, want me to resend with fixed commit message, subject and compare function?
David Sterba Oct. 18, 2021, 3:56 p.m. UTC | #5
On Mon, Oct 18, 2021 at 03:51:44PM +0000, Johannes Thumshirn wrote:
> On 18/10/2021 17:42, David Sterba wrote:
> > On Thu, Oct 14, 2021 at 06:39:02PM +0900, Johannes Thumshirn wrote:
> >> +/*
> >> + * We want block groups with a low number of used bytes to be in the beginning
> >> + * of the list, so they will get reclaimed first.
> >> + */
> >> +static int reclaim_bgs_cmp(void *unused, const struct list_head *a,
> >> +			   const struct list_head *b)
> >> +{
> >> +	const struct btrfs_block_group *bg1, *bg2;
> >> +
> >> +	bg1 = list_entry(a, struct btrfs_block_group, bg_list);
> >> +	bg2 = list_entry(b, struct btrfs_block_group, bg_list);
> >> +
> >> +	return bg1->used - bg2->used;
> > 
> > So you also reverted to v1 the compare condition, this should be < so
> > it's the valid stable sort condition.
> > 
> 
> Ah damn, want me to resend with fixed commit message, subject and compare function?

Not needed, simple fixes are ok. And I correct myself it should be ">"
as was in v2.
Johannes Thumshirn Oct. 18, 2021, 3:57 p.m. UTC | #6
On 18/10/2021 17:56, David Sterba wrote:
> On Mon, Oct 18, 2021 at 03:51:44PM +0000, Johannes Thumshirn wrote:
>> On 18/10/2021 17:42, David Sterba wrote:
>>> On Thu, Oct 14, 2021 at 06:39:02PM +0900, Johannes Thumshirn wrote:
>>>> +/*
>>>> + * We want block groups with a low number of used bytes to be in the beginning
>>>> + * of the list, so they will get reclaimed first.
>>>> + */
>>>> +static int reclaim_bgs_cmp(void *unused, const struct list_head *a,
>>>> +			   const struct list_head *b)
>>>> +{
>>>> +	const struct btrfs_block_group *bg1, *bg2;
>>>> +
>>>> +	bg1 = list_entry(a, struct btrfs_block_group, bg_list);
>>>> +	bg2 = list_entry(b, struct btrfs_block_group, bg_list);
>>>> +
>>>> +	return bg1->used - bg2->used;
>>>
>>> So you also reverted to v1 the compare condition, this should be < so
>>> it's the valid stable sort condition.
>>>
>>
>> Ah damn, want me to resend with fixed commit message, subject and compare function?
> 
> Not needed, simple fixes are ok. And I correct myself it should be ">"
> as was in v2.
> 

Thanks a lot.

Patch

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 7dba9028c80c..77e6224866c1 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1,5 +1,7 @@ 
 // SPDX-License-Identifier: GPL-2.0
 
+#include <linux/list_sort.h>
+
 #include "misc.h"
 #include "ctree.h"
 #include "block-group.h"
@@ -1486,6 +1488,21 @@  void btrfs_mark_bg_unused(struct btrfs_block_group *bg)
 	spin_unlock(&fs_info->unused_bgs_lock);
 }
 
+/*
+ * We want block groups with a low number of used bytes to be in the beginning
+ * of the list, so they will get reclaimed first.
+ */
+static int reclaim_bgs_cmp(void *unused, const struct list_head *a,
+			   const struct list_head *b)
+{
+	const struct btrfs_block_group *bg1, *bg2;
+
+	bg1 = list_entry(a, struct btrfs_block_group, bg_list);
+	bg2 = list_entry(b, struct btrfs_block_group, bg_list);
+
+	return bg1->used - bg2->used;
+}
+
 void btrfs_reclaim_bgs_work(struct work_struct *work)
 {
 	struct btrfs_fs_info *fs_info =
@@ -1510,6 +1527,7 @@  void btrfs_reclaim_bgs_work(struct work_struct *work)
 	}
 
 	spin_lock(&fs_info->unused_bgs_lock);
+	list_sort(NULL, &fs_info->reclaim_bgs, reclaim_bgs_cmp);
 	while (!list_empty(&fs_info->reclaim_bgs)) {
 		u64 zone_unusable;
 		int ret = 0;