
btrfs: Remove the duplicated and sometimes too cautious btrfs_can_relocate()

Message ID 20190718054857.8970-1-wqu@suse.com (mailing list archive)
State New, archived
Series btrfs: Remove the duplicated and sometimes too cautious btrfs_can_relocate()

Commit Message

Qu Wenruo July 18, 2019, 5:48 a.m. UTC
[BUG]
The following script can easily cause unexpected ENOSPC:
  umount $dev &> /dev/null
  umount $mnt &> /dev/null

  mkfs.btrfs -b 1G -m single -d single $dev -f > /dev/null

  mount $dev $mnt -o enospc_debug

  for i in $(seq -w 0 511); do
  	xfs_io -f -c "pwrite 0 1m" $mnt/inline_$i > /dev/null
  done
  sync

  btrfs balance start --full $mnt || return 1
  sync

  # This will report -ENOSPC
  btrfs balance start --full $mnt || return 1
  umount $mnt

Also, btrfs/156 can fail due to ENOSPC.

[CAUSE]
The ENOSPC is reported by btrfs_can_relocate().

btrfs_can_relocate() does the following checks:
- Whether the block group is empty
  If it is empty, we can definitely relocate this block group.
- Whether we are not the only block group and have enough space
  If so, we can relocate this block group.

The above two checks are completely OK, although one could argue they
don't make much sense, but the following check is vague and sometimes
so cautious that it causes a false ENOSPC:
- Whether we can allocate a new block group as large as the current one.
  If the previous two checks failed, this one must pass for the block
  group to be relocated.

There are several problems here:
1. We don't need to allocate as much as the source block group.
   E.g. the source block group is 1G in size but only 1M is used. We only
   need to allocate a data chunk larger than 1M to continue relocation.

2. The check in btrfs_can_relocate() is vague and cannot be as accurate
   as __btrfs_alloc_chunk().
   How could fewer than 200 lines of code do the same work as
   __btrfs_alloc_chunk()? And it's hard to maintain two different
   functions doing similar work.

3. We have a more accurate check in btrfs_inc_block_group_ro().
   btrfs_inc_block_group_ro() does a similar check, but much better.
   In btrfs_inc_block_group_ro() we do:
   * Forced chunk allocation if we're converting profiles

   * Try to mark the block group ro first
     In inc_block_group_ro(), we do a comprehensive space check to
     ensure we have enough free space for the used and reserved space
     of the block group.
     If that succeeds, we're done.

   * Force chunk allocation for more space
     If this fails, we have indeed hit ENOSPC.

   * Try to mark the block group ro again
     As we now have extra space, we can try again.
     This is the last chance: either we have enough space and succeed,
     or the newly allocated space is still not large enough and ENOSPC
     is returned.

   Such try-allocate-retry behavior (sketched below) is more accurate
   in every way than btrfs_can_relocate(), so we can rely on
   btrfs_inc_block_group_ro() and drop btrfs_can_relocate()
   completely.
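
   A rough sketch of that flow (simplified pseudocode based on the steps
   above, not the exact kernel source):

     btrfs_inc_block_group_ro(cache):
         if (converting to a different profile)
             btrfs_chunk_alloc(target_profile, CHUNK_ALLOC_FORCE);
         ret = inc_block_group_ro(cache, 0);  /* space check vs space_info */
         if (!ret)
             return 0;                        /* enough room elsewhere */
         btrfs_chunk_alloc(current_profile, CHUNK_ALLOC_FORCE);
         ret = inc_block_group_ro(cache, 0);  /* retry with the new chunk */
         return ret;                          /* ENOSPC only if still short */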

[FIX]
Since the regular balance routine already has a better ENOSPC detector,
there is no need to keep the false-alert-prone btrfs_can_relocate().

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/ctree.h       |   1 -
 fs/btrfs/extent-tree.c | 141 -----------------------------------------
 fs/btrfs/volumes.c     |   4 --
 3 files changed, 146 deletions(-)

Comments

Filipe Manana July 18, 2019, 11:16 a.m. UTC | #1
On Thu, Jul 18, 2019 at 6:50 AM Qu Wenruo <wqu@suse.com> wrote:
>
> [BUG]
> The following script can easily cause unexpected ENOSPC:
>   umount $dev &> /dev/null
>   umount $mnt &> /dev/null
>
>   mkfs.btrfs -b 1G -m single -d single $dev -f > /dev/null
>
>   mount $dev $mnt -o enospc_debug
>
>   for i in $(seq -w 0 511); do
>         xfs_io -f -c "pwrite 0 1m" $mnt/inline_$i > /dev/null
>   done
>   sync
>
>   btrfs balance start --full $mnt || return 1
>   sync
>
>   # This will report -ENOSPC
>   btrfs balance start --full $mnt || return 1
>   umount $mnt
>
> Also, btrfs/156 can also fail due to ENOSPC.

Well, that script you pasted is essentially btrfs/156.

When did the test start failing? When the test was added, it didn't
fail, did it?

>
> [CAUSE]
> The ENOSPC is reported by btrfs_can_relocate().
>
> In btrfs_can_relocate(), it does the following check:
> - If the block group is empty
>   If empty, definitely we can relocate this block group.
> - If we are not the only block group and we have enough space
>   Then we can relocate this block group.
>
> Above two checks are completely OK, although I could argue they doesn't
> make much sense, but the following check is vague and even sometimes
> too cautious to cause ENOSPC:
> - If we can allocate a new block group as large as current one.
>   If we failed previous two checks, we must pass this to relocate this
>   block group.
>
> There are several problems here:
> 1. We don't need to allocate as large as the source block group.
>    E.g. source block group is 1G sized, but only 1M used. We only need
>    to allocated a data chunk larger than 1M to continue relocation.

Right. But where does btrfs_can_relocate() make such an assumption?
It only tries to check if there's enough space for an amount that
corresponds to the amount used in the block group, that is, not the
size of the block group (unless the block group is completely full).

>
> 2. The check in btrfs_can_relocate() is vague and impossible to be as
>    accurate as __btrfs_alloc_chunk()
>    How could this less than 200 lines code do the same work as
>    __btrfs_alloc_chunk()? And it's hard to maintain two different
>    functions to do similar work.
>
> 3. We have more accurate check in btrfs_inc_block_group_ro().
>    Btrfs_inc_block_group_ro() is doing similar check but much better.
>    In btrfs_inc_block_group_ro() we do:
>    * Forced chunk allocation if we're converting
>
>    * Try to mark block group ro first
>      in inc_btrfs_block_group_ro(), we will do comprehensive space
>      check to ensure we have enough free space for the used and reserved
>      space of the block group.
>      If succeeded, we're done.
>
>    * Force chunk allocation for more space
>      If we failed here, we indeed hits ENOSPC.
>
>    * Try to mark block group ro again
>      As we have extra space, we can try again.
>      This is the last chance, either we have enough space now and
>      success, or the newly allocated space is not large enough, ENOSPC
>      is returned.
>
>    Such try-allocate-try behavior is way more accurate in every way
>    compared to btrfs_can_relocate(), we can rely on
>    btrfs_inc_block_group_ro() to replace btrfs_can_relocate()
>    completely.
>
> [FIX]
> Since regular balance routine already has a better ENOSPC detector,
> there is no need to keep the false-alert-prone btrfs_can_relocate().
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/ctree.h       |   1 -
>  fs/btrfs/extent-tree.c | 141 -----------------------------------------
>  fs/btrfs/volumes.c     |   4 --
>  3 files changed, 146 deletions(-)
>
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 0a61dff27f57..965d1e5a4af7 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -2772,7 +2772,6 @@ int btrfs_setup_space_cache(struct btrfs_trans_handle *trans);
>  int btrfs_extent_readonly(struct btrfs_fs_info *fs_info, u64 bytenr);
>  int btrfs_free_block_groups(struct btrfs_fs_info *info);
>  int btrfs_read_block_groups(struct btrfs_fs_info *info);
> -int btrfs_can_relocate(struct btrfs_fs_info *fs_info, u64 bytenr);
>  int btrfs_make_block_group(struct btrfs_trans_handle *trans,
>                            u64 bytes_used, u64 type, u64 chunk_offset,
>                            u64 size);
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 5faf057f6f37..822a4102980d 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -9774,147 +9774,6 @@ void btrfs_dec_block_group_ro(struct btrfs_block_group_cache *cache)
>         spin_unlock(&sinfo->lock);
>  }
>
> -/*
> - * Checks to see if it's even possible to relocate this block group.
> - *
> - * @return - -1 if it's not a good idea to relocate this block group, 0 if its
> - * ok to go ahead and try.
> - */
> -int btrfs_can_relocate(struct btrfs_fs_info *fs_info, u64 bytenr)
> -{
> -       struct btrfs_block_group_cache *block_group;
> -       struct btrfs_space_info *space_info;
> -       struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
> -       struct btrfs_device *device;
> -       u64 min_free;
> -       u64 dev_min = 1;
> -       u64 dev_nr = 0;
> -       u64 target;
> -       int debug;
> -       int index;
> -       int full = 0;
> -       int ret = 0;
> -
> -       debug = btrfs_test_opt(fs_info, ENOSPC_DEBUG);
> -
> -       block_group = btrfs_lookup_block_group(fs_info, bytenr);
> -
> -       /* odd, couldn't find the block group, leave it alone */
> -       if (!block_group) {
> -               if (debug)
> -                       btrfs_warn(fs_info,
> -                                  "can't find block group for bytenr %llu",
> -                                  bytenr);
> -               return -1;
> -       }
> -
> -       min_free = btrfs_block_group_used(&block_group->item);
> -
> -       /* no bytes used, we're good */
> -       if (!min_free)
> -               goto out;
> -
> -       space_info = block_group->space_info;
> -       spin_lock(&space_info->lock);
> -
> -       full = space_info->full;
> -
> -       /*
> -        * if this is the last block group we have in this space, we can't
> -        * relocate it unless we're able to allocate a new chunk below.
> -        *
> -        * Otherwise, we need to make sure we have room in the space to handle
> -        * all of the extents from this block group.  If we can, we're good
> -        */
> -       if ((space_info->total_bytes != block_group->key.offset) &&
> -           (btrfs_space_info_used(space_info, false) + min_free <
> -            space_info->total_bytes)) {
> -               spin_unlock(&space_info->lock);
> -               goto out;
> -       }
> -       spin_unlock(&space_info->lock);
> -
> -       /*
> -        * ok we don't have enough space, but maybe we have free space on our
> -        * devices to allocate new chunks for relocation, so loop through our
> -        * alloc devices and guess if we have enough space.  if this block
> -        * group is going to be restriped, run checks against the target
> -        * profile instead of the current one.
> -        */
> -       ret = -1;
> -
> -       /*
> -        * index:
> -        *      0: raid10
> -        *      1: raid1
> -        *      2: dup
> -        *      3: raid0
> -        *      4: single
> -        */
> -       target = get_restripe_target(fs_info, block_group->flags);
> -       if (target) {
> -               index = btrfs_bg_flags_to_raid_index(extended_to_chunk(target));
> -       } else {
> -               /*
> -                * this is just a balance, so if we were marked as full
> -                * we know there is no space for a new chunk
> -                */
> -               if (full) {
> -                       if (debug)
> -                               btrfs_warn(fs_info,
> -                                          "no space to alloc new chunk for block group %llu",
> -                                          block_group->key.objectid);
> -                       goto out;
> -               }
> -
> -               index = btrfs_bg_flags_to_raid_index(block_group->flags);
> -       }
> -
> -       if (index == BTRFS_RAID_RAID10) {
> -               dev_min = 4;
> -               /* Divide by 2 */
> -               min_free >>= 1;
> -       } else if (index == BTRFS_RAID_RAID1) {
> -               dev_min = 2;
> -       } else if (index == BTRFS_RAID_DUP) {
> -               /* Multiply by 2 */
> -               min_free <<= 1;
> -       } else if (index == BTRFS_RAID_RAID0) {
> -               dev_min = fs_devices->rw_devices;
> -               min_free = div64_u64(min_free, dev_min);
> -       }
> -
> -       mutex_lock(&fs_info->chunk_mutex);
> -       list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
> -               u64 dev_offset;
> -
> -               /*
> -                * check to make sure we can actually find a chunk with enough
> -                * space to fit our block group in.
> -                */
> -               if (device->total_bytes > device->bytes_used + min_free &&
> -                   !test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) {
> -                       ret = find_free_dev_extent(device, min_free,
> -                                                  &dev_offset, NULL);
> -                       if (!ret)
> -                               dev_nr++;
> -
> -                       if (dev_nr >= dev_min)
> -                               break;

And here's a bug in that code. Before breaking out of the loop, ret
should be set to 0.

In general I'm ok with the change, but would like an answer to those questions.

Thanks.

Nikolay Borisov July 18, 2019, 11:18 a.m. UTC | #2
On 18.07.19 8:48, Qu Wenruo wrote:
> [BUG]
> The following script can easily cause unexpected ENOSPC:
>   umount $dev &> /dev/null
>   umount $mnt &> /dev/null
> 
>   mkfs.btrfs -b 1G -m single -d single $dev -f > /dev/null
> 
>   mount $dev $mnt -o enospc_debug
> 
>   for i in $(seq -w 0 511); do
>   	xfs_io -f -c "pwrite 0 1m" $mnt/inline_$i > /dev/null
>   done
>   sync
> 
>   btrfs balance start --full $mnt || return 1
>   sync
> 
>   # This will report -ENOSPC
>   btrfs balance start --full $mnt || return 1
>   umount $mnt
> 
> Also, btrfs/156 can also fail due to ENOSPC.
> 
> [CAUSE]
> The ENOSPC is reported by btrfs_can_relocate().
> 
> In btrfs_can_relocate(), it does the following check:
> - If the block group is empty
>   If empty, definitely we can relocate this block group.
> - If we are not the only block group and we have enough space
>   Then we can relocate this block group.
> 
> Above two checks are completely OK, although I could argue they doesn't
> make much sense, but the following check is vague and even sometimes
> too cautious to cause ENOSPC:
> - If we can allocate a new block group as large as current one.
>   If we failed previous two checks, we must pass this to relocate this
>   block group.

btrfs_can_relocate() requires min_free to be allocatable. min_free is
defined as the used space in the block group being relocated, which I
think is correct. Also, I find the logic which adjusts min_free and
dev_min to be correct. Finally, the function checks whether the device's
free space is fragmented by trying to find a free device extent of the
appropriate size. The question is: can we really have a device that has
enough free space, yet is fragmented such that find_free_dev_extent()
fails, which results in failing the allocation? I think the answer is
no, since we allocate at chunk granularity. What am I missing?


OTOH, in btrfs_inc_block_group_ro() we only allocate a chunk if:

a) we are changing raid profiles
b) inc_block_group_ro() fails for our block group.

And for b) I'm a bit puzzled as to what the code is supposed to mean. We have: 

    num_bytes = cache->key.offset - cache->reserved - cache->pinned -
                cache->bytes_super - btrfs_block_group_used(&cache->item);
    sinfo_used = btrfs_space_info_used(sinfo, true);

    if (sinfo_used + num_bytes + min_allocable_bytes <=
        sinfo->total_bytes) {
            /* set ro */
    }

This means: if the free space in the block group plus the used space in
the space info is smaller than the total space in the space info, make
this block group RO. What's the rationale behind that?
Nikolay Borisov July 18, 2019, 11:23 a.m. UTC | #3
On 18.07.19 14:16, Filipe Manana wrote:
> On Thu, Jul 18, 2019 at 6:50 AM Qu Wenruo <wqu@suse.com> wrote:
>>
>> [BUG]
>> The following script can easily cause unexpected ENOSPC:
>>   umount $dev &> /dev/null
>>   umount $mnt &> /dev/null
>>
>>   mkfs.btrfs -b 1G -m single -d single $dev -f > /dev/null
>>
>>   mount $dev $mnt -o enospc_debug
>>
>>   for i in $(seq -w 0 511); do
>>         xfs_io -f -c "pwrite 0 1m" $mnt/inline_$i > /dev/null
>>   done
>>   sync
>>
>>   btrfs balance start --full $mnt || return 1
>>   sync
>>
>>   # This will report -ENOSPC
>>   btrfs balance start --full $mnt || return 1
>>   umount $mnt
>>
>> Also, btrfs/156 can also fail due to ENOSPC.
> 
> Well, that script you pasted is btrfs/156 essentially.
> 
> When did the test started failing? When the test was added, it didn't
> fail, did it?
> 
>>
>> [CAUSE]
>> The ENOSPC is reported by btrfs_can_relocate().
>>
>> In btrfs_can_relocate(), it does the following check:
>> - If the block group is empty
>>   If empty, definitely we can relocate this block group.
>> - If we are not the only block group and we have enough space
>>   Then we can relocate this block group.
>>
>> Above two checks are completely OK, although I could argue they doesn't
>> make much sense, but the following check is vague and even sometimes
>> too cautious to cause ENOSPC:
>> - If we can allocate a new block group as large as current one.
>>   If we failed previous two checks, we must pass this to relocate this
>>   block group.
>>
>> There are several problems here:
>> 1. We don't need to allocate as large as the source block group.
>>    E.g. source block group is 1G sized, but only 1M used. We only need
>>    to allocated a data chunk larger than 1M to continue relocation.
> 
> Right. But where does btrfs_can_relocate() do such assumption?
> It only tries to check if there's enough space for an amount that
> corresponds to the amount used in the block group, that is, not the
> size of the block group (unless the block group is completely full).
> 
>>
>> 2. The check in btrfs_can_relocate() is vague and impossible to be as
>>    accurate as __btrfs_alloc_chunk()
>>    How could this less than 200 lines code do the same work as
>>    __btrfs_alloc_chunk()? And it's hard to maintain two different
>>    functions to do similar work.
>>
>> 3. We have more accurate check in btrfs_inc_block_group_ro().
>>    Btrfs_inc_block_group_ro() is doing similar check but much better.
>>    In btrfs_inc_block_group_ro() we do:
>>    * Forced chunk allocation if we're converting
>>
>>    * Try to mark block group ro first
>>      in inc_btrfs_block_group_ro(), we will do comprehensive space
>>      check to ensure we have enough free space for the used and reserved
>>      space of the block group.
>>      If succeeded, we're done.
>>
>>    * Force chunk allocation for more space
>>      If we failed here, we indeed hits ENOSPC.
>>
>>    * Try to mark block group ro again
>>      As we have extra space, we can try again.
>>      This is the last chance, either we have enough space now and
>>      success, or the newly allocated space is not large enough, ENOSPC
>>      is returned.
>>
>>    Such try-allocate-try behavior is way more accurate in every way
>>    compared to btrfs_can_relocate(), we can rely on
>>    btrfs_inc_block_group_ro() to replace btrfs_can_relocate()
>>    completely.
>>
>> [FIX]
>> Since regular balance routine already has a better ENOSPC detector,
>> there is no need to keep the false-alert-prone btrfs_can_relocate().
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  fs/btrfs/ctree.h       |   1 -
>>  fs/btrfs/extent-tree.c | 141 -----------------------------------------
>>  fs/btrfs/volumes.c     |   4 --
>>  3 files changed, 146 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>> index 0a61dff27f57..965d1e5a4af7 100644
>> --- a/fs/btrfs/ctree.h
>> +++ b/fs/btrfs/ctree.h
>> @@ -2772,7 +2772,6 @@ int btrfs_setup_space_cache(struct btrfs_trans_handle *trans);
>>  int btrfs_extent_readonly(struct btrfs_fs_info *fs_info, u64 bytenr);
>>  int btrfs_free_block_groups(struct btrfs_fs_info *info);
>>  int btrfs_read_block_groups(struct btrfs_fs_info *info);
>> -int btrfs_can_relocate(struct btrfs_fs_info *fs_info, u64 bytenr);
>>  int btrfs_make_block_group(struct btrfs_trans_handle *trans,
>>                            u64 bytes_used, u64 type, u64 chunk_offset,
>>                            u64 size);
>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>> index 5faf057f6f37..822a4102980d 100644
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -9774,147 +9774,6 @@ void btrfs_dec_block_group_ro(struct btrfs_block_group_cache *cache)
>>         spin_unlock(&sinfo->lock);
>>  }
>>
>> -/*
>> - * Checks to see if it's even possible to relocate this block group.
>> - *
>> - * @return - -1 if it's not a good idea to relocate this block group, 0 if its
>> - * ok to go ahead and try.
>> - */
>> -int btrfs_can_relocate(struct btrfs_fs_info *fs_info, u64 bytenr)
>> -{
>> -       struct btrfs_block_group_cache *block_group;
>> -       struct btrfs_space_info *space_info;
>> -       struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
>> -       struct btrfs_device *device;
>> -       u64 min_free;
>> -       u64 dev_min = 1;
>> -       u64 dev_nr = 0;
>> -       u64 target;
>> -       int debug;
>> -       int index;
>> -       int full = 0;
>> -       int ret = 0;
>> -
>> -       debug = btrfs_test_opt(fs_info, ENOSPC_DEBUG);
>> -
>> -       block_group = btrfs_lookup_block_group(fs_info, bytenr);
>> -
>> -       /* odd, couldn't find the block group, leave it alone */
>> -       if (!block_group) {
>> -               if (debug)
>> -                       btrfs_warn(fs_info,
>> -                                  "can't find block group for bytenr %llu",
>> -                                  bytenr);
>> -               return -1;
>> -       }
>> -
>> -       min_free = btrfs_block_group_used(&block_group->item);
>> -
>> -       /* no bytes used, we're good */
>> -       if (!min_free)
>> -               goto out;
>> -
>> -       space_info = block_group->space_info;
>> -       spin_lock(&space_info->lock);
>> -
>> -       full = space_info->full;
>> -
>> -       /*
>> -        * if this is the last block group we have in this space, we can't
>> -        * relocate it unless we're able to allocate a new chunk below.
>> -        *
>> -        * Otherwise, we need to make sure we have room in the space to handle
>> -        * all of the extents from this block group.  If we can, we're good
>> -        */
>> -       if ((space_info->total_bytes != block_group->key.offset) &&
>> -           (btrfs_space_info_used(space_info, false) + min_free <
>> -            space_info->total_bytes)) {
>> -               spin_unlock(&space_info->lock);
>> -               goto out;
>> -       }
>> -       spin_unlock(&space_info->lock);
>> -
>> -       /*
>> -        * ok we don't have enough space, but maybe we have free space on our
>> -        * devices to allocate new chunks for relocation, so loop through our
>> -        * alloc devices and guess if we have enough space.  if this block
>> -        * group is going to be restriped, run checks against the target
>> -        * profile instead of the current one.
>> -        */
>> -       ret = -1;
>> -
>> -       /*
>> -        * index:
>> -        *      0: raid10
>> -        *      1: raid1
>> -        *      2: dup
>> -        *      3: raid0
>> -        *      4: single
>> -        */
>> -       target = get_restripe_target(fs_info, block_group->flags);
>> -       if (target) {
>> -               index = btrfs_bg_flags_to_raid_index(extended_to_chunk(target));
>> -       } else {
>> -               /*
>> -                * this is just a balance, so if we were marked as full
>> -                * we know there is no space for a new chunk
>> -                */
>> -               if (full) {
>> -                       if (debug)
>> -                               btrfs_warn(fs_info,
>> -                                          "no space to alloc new chunk for block group %llu",
>> -                                          block_group->key.objectid);
>> -                       goto out;
>> -               }
>> -
>> -               index = btrfs_bg_flags_to_raid_index(block_group->flags);
>> -       }
>> -
>> -       if (index == BTRFS_RAID_RAID10) {
>> -               dev_min = 4;
>> -               /* Divide by 2 */
>> -               min_free >>= 1;
>> -       } else if (index == BTRFS_RAID_RAID1) {
>> -               dev_min = 2;
>> -       } else if (index == BTRFS_RAID_DUP) {
>> -               /* Multiply by 2 */
>> -               min_free <<= 1;
>> -       } else if (index == BTRFS_RAID_RAID0) {
>> -               dev_min = fs_devices->rw_devices;
>> -               min_free = div64_u64(min_free, dev_min);
>> -       }
>> -
>> -       mutex_lock(&fs_info->chunk_mutex);
>> -       list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
>> -               u64 dev_offset;
>> -
>> -               /*
>> -                * check to make sure we can actually find a chunk with enough
>> -                * space to fit our block group in.
>> -                */
>> -               if (device->total_bytes > device->bytes_used + min_free &&
>> -                   !test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) {
>> -                       ret = find_free_dev_extent(device, min_free,
>> -                                                  &dev_offset, NULL);
>> -                       if (!ret)
>> -                               dev_nr++;
>> -
>> -                       if (dev_nr >= dev_min)
>> -                               break;
> 
> And here's a bug in that code. Before breaking out of the loop, ret
> should be set to 0.

I have looked at that and it's actually correct, because if
find_free_dev_extent() has returned successfully and dev_nr >= dev_min
is true, then we break out of the loop with ret = 0. Though this logic
is somewhat implicit and it took me some time to rationalize it; see
the annotated loop below.
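
A condensed view of that loop with the value of ret annotated (the
comments are mine, not part of the kernel source):

    list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
            if (device->total_bytes > device->bytes_used + min_free &&
                !test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) {
                    ret = find_free_dev_extent(device, min_free,
                                               &dev_offset, NULL);
                    if (!ret)
                            dev_nr++;       /* only incremented when ret == 0 */

                    if (dev_nr >= dev_min)  /* can only become true right after
                                               a successful increment, so ret is
                                               already 0 when we break */
                            break;

                    ret = -1;               /* reset before the next device */
            }
    }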

> 
> In general I'm ok with the change, but would like an answer to those questions.
> 
> Thanks.
> 
Qu Wenruo July 18, 2019, 11:35 a.m. UTC | #4
On 2019/7/18 7:16 PM, Filipe Manana wrote:
> On Thu, Jul 18, 2019 at 6:50 AM Qu Wenruo <wqu@suse.com> wrote:
>>
>> [BUG]
>> The following script can easily cause unexpected ENOSPC:
>>   umount $dev &> /dev/null
>>   umount $mnt &> /dev/null
>>
>>   mkfs.btrfs -b 1G -m single -d single $dev -f > /dev/null
>>
>>   mount $dev $mnt -o enospc_debug
>>
>>   for i in $(seq -w 0 511); do
>>         xfs_io -f -c "pwrite 0 1m" $mnt/inline_$i > /dev/null
>>   done
>>   sync
>>
>>   btrfs balance start --full $mnt || return 1
>>   sync
>>
>>   # This will report -ENOSPC
>>   btrfs balance start --full $mnt || return 1
>>   umount $mnt
>>
>> Also, btrfs/156 can also fail due to ENOSPC.
> 
> Well, that script you pasted is btrfs/156 essentially.
> 
> When did the test started failing? When the test was added, it didn't
> fail, did it?

Bisected to commit 302167c50b32 ("btrfs: don't end the transaction for
delayed refs in throttle").

But that commit itself looks pretty valid to me.

And as described, it now fails in btrfs_can_relocate(), which is not
exactly where the test case was expected to fail.
(IIRC when the test case was submitted, the error happened in
inc_block_group_ro(), hence why I added some debug messages there.)

> 
>>
>> [CAUSE]
>> The ENOSPC is reported by btrfs_can_relocate().
>>
>> In btrfs_can_relocate(), it does the following check:
>> - If the block group is empty
>>   If empty, definitely we can relocate this block group.
>> - If we are not the only block group and we have enough space
>>   Then we can relocate this block group.
>>
>> Above two checks are completely OK, although I could argue they doesn't
>> make much sense, but the following check is vague and even sometimes
>> too cautious to cause ENOSPC:
>> - If we can allocate a new block group as large as current one.
>>   If we failed previous two checks, we must pass this to relocate this
>>   block group.
>>
>> There are several problems here:
>> 1. We don't need to allocate as large as the source block group.
>>    E.g. source block group is 1G sized, but only 1M used. We only need
>>    to allocated a data chunk larger than 1M to continue relocation.
> 
> Right. But where does btrfs_can_relocate() do such assumption?
> It only tries to check if there's enough space for an amount that
> corresponds to the amount used in the block group, that is, not the
> size of the block group (unless the block group is completely full).

You're right, my description here is wrong.

I'll remove this paragraph completely.

Thanks,
Qu

Qu Wenruo July 18, 2019, 12:48 p.m. UTC | #5
On 2019/7/18 7:18 PM, Nikolay Borisov wrote:
>
>
> On 18.07.19 г. 8:48 ч., Qu Wenruo wrote:
>> [BUG]
>> The following script can easily cause unexpected ENOSPC:
>>   umount $dev &> /dev/null
>>   umount $mnt &> /dev/null
>>
>>   mkfs.btrfs -b 1G -m single -d single $dev -f > /dev/null
>>
>>   mount $dev $mnt -o enospc_debug
>>
>>   for i in $(seq -w 0 511); do
>>   	xfs_io -f -c "pwrite 0 1m" $mnt/inline_$i > /dev/null
>>   done
>>   sync
>>
>>   btrfs balance start --full $mnt || return 1
>>   sync
>>
>>   # This will report -ENOSPC
>>   btrfs balance start --full $mnt || return 1
>>   umount $mnt
>>
>> Also, btrfs/156 can also fail due to ENOSPC.
>>
>> [CAUSE]
>> The ENOSPC is reported by btrfs_can_relocate().
>>
>> In btrfs_can_relocate(), it does the following check:
>> - If the block group is empty
>>   If empty, definitely we can relocate this block group.
>> - If we are not the only block group and we have enough space
>>   Then we can relocate this block group.
>>
>> Above two checks are completely OK, although I could argue they doesn't
>> make much sense, but the following check is vague and even sometimes
>> too cautious to cause ENOSPC:
>> - If we can allocate a new block group as large as current one.
>>   If we failed previous two checks, we must pass this to relocate this
>>   block group.
>
> btrfs_can_relocate chunk requires min_free to be allocatable.
> min_free is defined as the used space in the  block group being
> relocated, which I think is correct.

Yep, you and Filipe are completely right here.

I mischecked @min_free.

But compared to the check in inc_block_group_ro(), it doesn't account
for things like block group reserved space (allocated for delayed refs
but not yet committed to the extent tree), pinned space (commit-tree
blocks not used in the current transaction) and super bytes (used by
the super block).

So the check is still not comprehensive.

> Also I find the logic which
> adjusts min_free and dev_min to also be correct. Finally the function
> checks whether the device's freespace is fragmented by trying to find a
> device chunk with the appropriate size. The question is - can we really
> have a device that has enough free space, yet is fragmented such that
> find_free_dev_extent fails which results in failing the allocation? I
> think the answer is no since we allocate in chunk granularity. What am I missing?

In fact, there is one more problem hidden here.

At the time btrfs_can_relocate() calls find_free_dev_extent(),
find_free_dev_extent() is not working properly, as it always searches
the *commit* root of the dev tree.

With an extra tree dump, I found that at that time the commit root of
the dev tree has an extra dev extent which is not in the current root.
This is the root cause of the false ENOSPC.
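
For reference, the dev extent search sets up its path roughly like this
(paraphrased from memory, not an exact quote of the source):

    /* in find_free_dev_extent_start() */
    path->search_commit_root = 1;   /* walk the dev tree's commit root */
    path->skip_locking = 1;
    ...
    ret = btrfs_search_slot(NULL, fs_info->dev_root, &key, path, 0, 0);

So a dev extent freed earlier in the current transaction still looks
allocated until the transaction commits.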

I should explain the root cause and why calling btrfs_can_relocate() at
that point is not reliable.

Anyway, for the removal part, it still makes sense.

>
>
> OTOH, in btrfs_inc_block_group_ro we only allocate a chunk if:
>
> a) we are changing raid profiles
> b) if inc_block_group_ro fails for our block group.
>
> And for b) I'm a bit puzzled as to what the code is supposed to mean. We have:
>
> num_bytes = cache->key.offset - cache->reserved - cache->pinned -
>                       cache->bytes_super - btrfs_block_group_used(&cache->item);
>           sinfo_used = btrfs_space_info_used(sinfo, true);
>
>           if (sinfo_used + num_bytes + min_allocable_bytes <=
>               sinfo->total_bytes) {
> //set ro
>
> }
>
> This means if the free space in the block group + the used space in the
> space info is smaller than the total space in
> the space info - make this block group RO. What's the rationale behind that?

Oh, this should be a check similar to the one in btrfs_can_relocate(),
but I was confused by the complexity of the check and even considered
it correct after several strange calculations.

What we really want is:

  available space in other block groups >=
      used/pinned/reserved space in this block group + some buffer (buff)

In code it should be:

  sinfo->avail - bg->avail >= bg->used + buff

Adding bg->avail to both sides, we get:

  sinfo->avail >= bg->used + bg->avail + buff

And since bg->used + bg->avail = bg->total and
sinfo->avail = sinfo->total - sinfo->used, we end up with:

  sinfo->total >= sinfo->used + bg->total + buff

Compared to the current check:

  sinfo->total >= sinfo->used + bg->avail + buff

In fact the current one is much easier to meet than the correct one.
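
For illustration only (hypothetical numbers; ignoring pinned, reserved
and super bytes): take a space_info with two 1GiB data block groups,
each with 768MiB used, and balance one of them:

  sinfo->total = 2GiB, sinfo->used = 1536MiB
  bg->total = 1GiB, bg->used = 768MiB, bg->avail = 256MiB

  current check:  1536M + 256M  (bg->avail) + buff <= 2048M  -> passes
  derived check:  1536M + 1024M (bg->total) + buff <= 2048M  -> fails

So the current check can mark the block group RO even though the other
block group only has 256MiB free for the 768MiB that needs to be
relocated (assuming there is no room to allocate a new chunk).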

I need to double check the calculation.

Thanks for pointing this out,
Qu

Patch

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 0a61dff27f57..965d1e5a4af7 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2772,7 +2772,6 @@  int btrfs_setup_space_cache(struct btrfs_trans_handle *trans);
 int btrfs_extent_readonly(struct btrfs_fs_info *fs_info, u64 bytenr);
 int btrfs_free_block_groups(struct btrfs_fs_info *info);
 int btrfs_read_block_groups(struct btrfs_fs_info *info);
-int btrfs_can_relocate(struct btrfs_fs_info *fs_info, u64 bytenr);
 int btrfs_make_block_group(struct btrfs_trans_handle *trans,
 			   u64 bytes_used, u64 type, u64 chunk_offset,
 			   u64 size);
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 5faf057f6f37..822a4102980d 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -9774,147 +9774,6 @@  void btrfs_dec_block_group_ro(struct btrfs_block_group_cache *cache)
 	spin_unlock(&sinfo->lock);
 }
 
-/*
- * Checks to see if it's even possible to relocate this block group.
- *
- * @return - -1 if it's not a good idea to relocate this block group, 0 if its
- * ok to go ahead and try.
- */
-int btrfs_can_relocate(struct btrfs_fs_info *fs_info, u64 bytenr)
-{
-	struct btrfs_block_group_cache *block_group;
-	struct btrfs_space_info *space_info;
-	struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
-	struct btrfs_device *device;
-	u64 min_free;
-	u64 dev_min = 1;
-	u64 dev_nr = 0;
-	u64 target;
-	int debug;
-	int index;
-	int full = 0;
-	int ret = 0;
-
-	debug = btrfs_test_opt(fs_info, ENOSPC_DEBUG);
-
-	block_group = btrfs_lookup_block_group(fs_info, bytenr);
-
-	/* odd, couldn't find the block group, leave it alone */
-	if (!block_group) {
-		if (debug)
-			btrfs_warn(fs_info,
-				   "can't find block group for bytenr %llu",
-				   bytenr);
-		return -1;
-	}
-
-	min_free = btrfs_block_group_used(&block_group->item);
-
-	/* no bytes used, we're good */
-	if (!min_free)
-		goto out;
-
-	space_info = block_group->space_info;
-	spin_lock(&space_info->lock);
-
-	full = space_info->full;
-
-	/*
-	 * if this is the last block group we have in this space, we can't
-	 * relocate it unless we're able to allocate a new chunk below.
-	 *
-	 * Otherwise, we need to make sure we have room in the space to handle
-	 * all of the extents from this block group.  If we can, we're good
-	 */
-	if ((space_info->total_bytes != block_group->key.offset) &&
-	    (btrfs_space_info_used(space_info, false) + min_free <
-	     space_info->total_bytes)) {
-		spin_unlock(&space_info->lock);
-		goto out;
-	}
-	spin_unlock(&space_info->lock);
-
-	/*
-	 * ok we don't have enough space, but maybe we have free space on our
-	 * devices to allocate new chunks for relocation, so loop through our
-	 * alloc devices and guess if we have enough space.  if this block
-	 * group is going to be restriped, run checks against the target
-	 * profile instead of the current one.
-	 */
-	ret = -1;
-
-	/*
-	 * index:
-	 *      0: raid10
-	 *      1: raid1
-	 *      2: dup
-	 *      3: raid0
-	 *      4: single
-	 */
-	target = get_restripe_target(fs_info, block_group->flags);
-	if (target) {
-		index = btrfs_bg_flags_to_raid_index(extended_to_chunk(target));
-	} else {
-		/*
-		 * this is just a balance, so if we were marked as full
-		 * we know there is no space for a new chunk
-		 */
-		if (full) {
-			if (debug)
-				btrfs_warn(fs_info,
-					   "no space to alloc new chunk for block group %llu",
-					   block_group->key.objectid);
-			goto out;
-		}
-
-		index = btrfs_bg_flags_to_raid_index(block_group->flags);
-	}
-
-	if (index == BTRFS_RAID_RAID10) {
-		dev_min = 4;
-		/* Divide by 2 */
-		min_free >>= 1;
-	} else if (index == BTRFS_RAID_RAID1) {
-		dev_min = 2;
-	} else if (index == BTRFS_RAID_DUP) {
-		/* Multiply by 2 */
-		min_free <<= 1;
-	} else if (index == BTRFS_RAID_RAID0) {
-		dev_min = fs_devices->rw_devices;
-		min_free = div64_u64(min_free, dev_min);
-	}
-
-	mutex_lock(&fs_info->chunk_mutex);
-	list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
-		u64 dev_offset;
-
-		/*
-		 * check to make sure we can actually find a chunk with enough
-		 * space to fit our block group in.
-		 */
-		if (device->total_bytes > device->bytes_used + min_free &&
-		    !test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) {
-			ret = find_free_dev_extent(device, min_free,
-						   &dev_offset, NULL);
-			if (!ret)
-				dev_nr++;
-
-			if (dev_nr >= dev_min)
-				break;
-
-			ret = -1;
-		}
-	}
-	if (debug && ret == -1)
-		btrfs_warn(fs_info,
-			   "no space to allocate a new chunk for block group %llu",
-			   block_group->key.objectid);
-	mutex_unlock(&fs_info->chunk_mutex);
-out:
-	btrfs_put_block_group(block_group);
-	return ret;
-}
-
 static int find_first_block_group(struct btrfs_fs_info *fs_info,
 				  struct btrfs_path *path,
 				  struct btrfs_key *key)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1c2a6e4b39da..f209127a8bc6 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3071,10 +3071,6 @@  static int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset)
 	 */
 	lockdep_assert_held(&fs_info->delete_unused_bgs_mutex);
 
-	ret = btrfs_can_relocate(fs_info, chunk_offset);
-	if (ret)
-		return -ENOSPC;
-
 	/* step one, relocate all the extents inside this chunk */
 	btrfs_scrub_pause(fs_info);
 	ret = btrfs_relocate_block_group(fs_info, chunk_offset);