[0/2] btrfs-progs: free space tree fixes

Message ID cover.1653009947.git.wqu@suse.com (mailing list archive)

Message

Qu Wenruo May 20, 2022, 1:31 a.m. UTC
I was debugging a weird behavior where the btrfs kernel code chooses not
to allocate a new data extent in an empty data block group.

While checking the free space tree, it turned out that btrfs-progs
always uses bitmaps, no matter what.

This results in a very concerning free space tree after mkfs:

  $ mkfs.btrfs  -f -m raid1 -d raid0 /dev/test/scratch[1234]
  btrfs-progs v5.17
  [...]
  Block group profiles:
    Data:             RAID0             4.00GiB
    Metadata:         RAID1           256.00MiB
    System:           RAID1             8.00MiB
  [..]

  $ btrfs ins dump-tree -t free-space /dev/test/scratch1
  btrfs-progs v5.17
  free space tree key (FREE_SPACE_TREE ROOT_ITEM 0)
  node 30441472 level 1 items 10 free space 483 generation 6 owner FREE_SPACE_TREE
  node 30441472 flags 0x1(WRITTEN) backref revision 1
  fs uuid deddccae-afd0-4160-9a12-48fe7b526fb1
  chunk uuid 68f6cf98-afe3-4f47-9797-37fd9c610219
          key (1048576 FREE_SPACE_INFO 4194304) block 30457856 gen 6
          key (475004928 FREE_SPACE_BITMAP 8388608) block 30703616 gen 5
          key (953155584 FREE_SPACE_BITMAP 8388608) block 30720000 gen 5
          key (1431306240 FREE_SPACE_BITMAP 8388608) block 30736384 gen 5
          key (1909456896 FREE_SPACE_BITMAP 8388608) block 30752768 gen 5
          key (2387607552 FREE_SPACE_BITMAP 8388608) block 30769152 gen 5
          key (2865758208 FREE_SPACE_BITMAP 8388608) block 30785536 gen 5
          key (3343908864 FREE_SPACE_BITMAP 8388608) block 30801920 gen 5
          key (3822059520 FREE_SPACE_BITMAP 8388608) block 30818304 gen 5
          key (4300210176 FREE_SPACE_BITMAP 8388608) block 30834688 gen 5
  [...]
  ^^^ So many bitmaps that an empty fs will have two levels for free
      space tree already
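
As a rough check of the dump above: every FREE_SPACE_BITMAP key has an
offset of 8388608, so one bitmap item describes 8 MiB of a block group.
The sketch below estimates how many such items a fully free block group
needs. The 4096-byte sector size and the 2048 bits per bitmap item are
assumptions (they are consistent with the 8388608 offset, but are not
stated in this mail), and the sketch_* names are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, not taken from this mail. */
#define SKETCH_SECTORSIZE   4096ULL  /* bytes per sector */
#define SKETCH_BITMAP_BITS  2048ULL  /* sectors tracked by one bitmap item */

/* Bytes of block group space described by one bitmap item. */
static uint64_t sketch_bitmap_range(void)
{
	return SKETCH_SECTORSIZE * SKETCH_BITMAP_BITS;
}

/* Number of bitmap items needed to cover a block group of bg_length bytes. */
static uint64_t sketch_num_bitmaps(uint64_t bg_length)
{
	uint64_t range = sketch_bitmap_range();

	assert(bg_length > 0);
	/* Round up so a partial tail still gets its own bitmap item. */
	return (bg_length + range - 1) / range;
}
```

Under these assumptions sketch_bitmap_range() is 8388608, and a 4 GiB
block group (the data size in the mkfs output, if that is a single block
group) would need 512 bitmap items where a single free space extent item
could describe the same range.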

Thankfully, the kernel can properly merge those bitmaps into a large
extent at mount, so the tree won't stay that scary forever.

It turns out that we never set btrfs_block_group::bitmap_high_thresh,
thus we always convert free space extents to bitmaps and waste space
unnecessarily.
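
A minimal standalone sketch of why an unset threshold forces bitmaps,
assuming the rule is "switch to bitmaps once the extent count exceeds the
number of extent items that would take as much space as the bitmaps".
The struct, the thr_* helpers and the item sizes (25-byte item header,
256-byte bitmap payload) are assumptions for illustration, not the
btrfs-progs definitions; the real set_free_space_tree_thresholds() may
differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes; hypothetical names. */
#define THR_ITEM_SIZE    25ULL               /* per-item header in a leaf */
#define THR_BITMAP_SIZE  256ULL              /* payload bytes of one bitmap item */
#define THR_BITMAP_BITS  (THR_BITMAP_SIZE * 8)

struct thr_block_group {
	uint64_t length;             /* block group size in bytes */
	uint32_t sectorsize;
	uint32_t bitmap_high_thresh; /* stays 0 if never initialized */
};

/* Compute the extent count above which bitmaps become the cheaper format. */
static void thr_set_thresholds(struct thr_block_group *bg)
{
	uint64_t bitmap_range, num_bitmaps, total_bitmap_size;

	assert(bg->length > 0 && bg->sectorsize > 0);
	bitmap_range = (uint64_t)bg->sectorsize * THR_BITMAP_BITS;
	num_bitmaps = (bg->length + bitmap_range - 1) / bitmap_range;
	total_bitmap_size = num_bitmaps * (THR_ITEM_SIZE + THR_BITMAP_SIZE);
	bg->bitmap_high_thresh = (uint32_t)(total_bitmap_size / THR_ITEM_SIZE);
}

/* Decide whether the free space extents should be converted to bitmaps. */
static bool thr_should_use_bitmaps(const struct thr_block_group *bg,
				   uint32_t extent_count)
{
	return extent_count > bg->bitmap_high_thresh;
}
```

With the threshold left at 0, thr_should_use_bitmaps() is already true for
a single free space extent, which matches the all-bitmap tree seen after
mkfs. Once thr_set_thresholds() has run on a 4 GiB block group with 4 KiB
sectors, the same single extent stays an extent item.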

Fix it by cross-porting the needed function
set_free_space_tree_thresholds() from the kernel and calling it at the
correct time.

And finally add a test case for it.

Unfortunately, even with this fixed, the kernel still shows its weird
behavior, as it's the cached un-clustered allocation code that is
causing the problem...

Qu Wenruo (2):
  btrfs-progs: properly initialize btrfs_block_group::bitmap_high_thresh
  btrfs-progs: mkfs-tests: add test case to make sure we don't create
    bitmaps for empty fs

 kernel-shared/extent-tree.c              |  2 ++
 kernel-shared/free-space-tree.c          | 29 ++++++++++++++++++++
 kernel-shared/free-space-tree.h          |  2 ++
 tests/mkfs-tests/024-fst-bitmaps/test.sh | 35 ++++++++++++++++++++++++
 4 files changed, 68 insertions(+)
 create mode 100755 tests/mkfs-tests/024-fst-bitmaps/test.sh

Comments

David Sterba May 20, 2022, 2:34 p.m. UTC | #1
On Fri, May 20, 2022 at 09:31:49AM +0800, Qu Wenruo wrote:

Good catch, thanks. free-space-tree.c is highly similar to the kernel
sources, so there are possibly more changes missing in the progs
implementation. Getting this file in sync would be desirable; going
function by function or in small updates is fine, if anybody is
interested.