mbox series

[0/3] btrfs-progs: allow tree-checker to be triggered more frequently for btrfs-convert

Message ID cover.1705135055.git.wqu@suse.com (mailing list archive)
Headers show
Series btrfs-progs: allow tree-checker to be triggered more frequently for btrfs-convert | expand

Message

Qu Wenruo Jan. 13, 2024, 8:45 a.m. UTC
We have issue #731, which is a large ext4 fs, and triggered tree-checker
very reliably.

The root cause is the bad write behavior of btrfs-convert, which always
insert inode refs/file extents/xattrs first, then the inode item.

Such bad behavior would normally trigger tree-checker, but for our
regular convert tests, it doesn't get trigger because:

- We have metadata cache
  The default size is system memory / 4, which can be very very large.
  And written tree blocks would still be cached thus no tree-checker
  triggered for those cached tree blocks.

- We won't commit transaction for every copied inodes
  This means for a lot of cases, we may just commit a large transaction
  for all inodes, further reduce the chance to trigger tree checker.

This patchset would introduce two debug environment variables:

- BTRFS_PROGS_DEBUG_METADATA_CACHE_SIZE
  Global metadata cache size, affecting all btrfs-progs tools that opens
  an unmounted btrfs.

- BTRFS_PROGS_DEBUG_BLOCKS_USED_THRESHOLD
  This only affects ext2 conversion, which determine how many bytes of
  dirty tree blocks we need before commit a transaction.

Those two variables would affect the speed of btrfs-convert greatly, and
we mostly use them for the selftests, thus they won't be documented
anyway but the source code.

With those new debug environment variables, we can reduce those values
to minimal, so that the existing convert tests are enough to trigger the
bug of issue #731.

Although the cost is, we make btrfs-convert test case 001 and 003 much
slower than usual (due to much frequent commit transaction commitment
and more IO to read tree blocks).

But I'd say, it's still worthy, as long as it can detect bad
btrfs-convert behaviors.

Qu Wenruo (3):
  btrfs-progs: convert/ext2: new debug environment variable to finetune
    transaction size
  btrfs-progs: new debug environment variable to finetune metadata cache
    size
  btrfs-progs: convert-tests: trigger tree-checker more frequently

 common/utils.c                             | 62 ++++++++++++++++++++++
 common/utils.h                             |  1 +
 convert/source-ext2.c                      |  9 +++-
 kernel-shared/extent_io.c                  |  5 +-
 tests/convert-tests/001-ext2-basic/test.sh |  7 +++
 tests/convert-tests/003-ext4-basic/test.sh |  7 +++
 6 files changed, 89 insertions(+), 2 deletions(-)

--
2.43.0

Comments

David Sterba Jan. 16, 2024, 6:37 p.m. UTC | #1
On Sat, Jan 13, 2024 at 07:15:28PM +1030, Qu Wenruo wrote:
> We have issue #731, which is a large ext4 fs, and triggered tree-checker
> very reliably.
> 
> The root cause is the bad write behavior of btrfs-convert, which always
> insert inode refs/file extents/xattrs first, then the inode item.
> 
> Such bad behavior would normally trigger tree-checker, but for our
> regular convert tests, it doesn't get trigger because:
> 
> - We have metadata cache
>   The default size is system memory / 4, which can be very very large.
>   And written tree blocks would still be cached thus no tree-checker
>   triggered for those cached tree blocks.
> 
> - We won't commit transaction for every copied inodes
>   This means for a lot of cases, we may just commit a large transaction
>   for all inodes, further reduce the chance to trigger tree checker.
> 
> This patchset would introduce two debug environment variables:
> 
> - BTRFS_PROGS_DEBUG_METADATA_CACHE_SIZE
>   Global metadata cache size, affecting all btrfs-progs tools that opens
>   an unmounted btrfs.
> 
> - BTRFS_PROGS_DEBUG_BLOCKS_USED_THRESHOLD
>   This only affects ext2 conversion, which determine how many bytes of
>   dirty tree blocks we need before commit a transaction.
> 
> Those two variables would affect the speed of btrfs-convert greatly, and
> we mostly use them for the selftests, thus they won't be documented
> anyway but the source code.

Please add some documentation for them to tests/README.md a new '###'
section after 'Verbosity, test tuning'

> Although the cost is, we make btrfs-convert test case 001 and 003 much
> slower than usual (due to much frequent commit transaction commitment
> and more IO to read tree blocks).
> 
> But I'd say, it's still worthy, as long as it can detect bad
> btrfs-convert behaviors.

Yes it's worth, the tests are run manually and before release but it
does not matter if they take more time. We can keep only the 4k and 64k
nodesize cases if the speed is really a big issue.