mbox series

[v2,0/5] btrfs-progs: check: properly generate csum for various complex combinations

Message ID 20220114005123.19426-1-wqu@suse.com (mailing list archive)
Headers show
Series btrfs-progs: check: properly generate csum for various complex combinations | expand

Message

Qu Wenruo Jan. 14, 2022, 12:51 a.m. UTC
[BUG]
Issue #420 mentions that --init-csum-tree creates extra csum for
preallocated extents.

Causing extra errors after --init-csum-tree.

[CAUSE]
Csum re-calculation code doesn't take the following factors into
consideration:

- NODATASUM inodes
- Preallocated extents
  No csum should be calculated for those data extents

- Partically written preallocated csum
  Csum should only be created for the referred part

So currently csum recalculation will just create csum for any data
extents it found, causing the mentioned errors.

[FIX]
- Bug fix for backref iteration
  Firstly fix a bug in backref code where indirect ref is not properly
  resolved.

  This allows us to use iterate_extent_inodes() to make our lives much
  easier checking the file extents.

- Code move for --init-csum-tree
  Move it to mode independent mode-common.[ch]

- Fix for --init-csum-tree
  There are some extreme corner cases to consider, in fact we can not
  really determine a range should have csum or not:
  * Preallocated, written, then hole punched range
    Should still have csum for the written (now hole) part.

  * Preallocated, then hole punched range.
    Should not have csum for the now hole part.

  * Regular written extent, then hole punched range
    Should have csum for the now hole part.

  But there is still one thing to follow, if a range is preallocated,
  it should not have csum.

  So here we go a different route, by always generate csum for the whole
  extent (as long as it's not belonging to NODATASUM inode), then remove
  csums for preallocated range.

- Fix for --init-csum-tree --init-extent-tree
  The same fix, just into a different context

- New test case

[CHANGELOG]
v2:
- Handle written extents then hole punched cases
  Now we always generate csum for a data extent as long as it doesn't
  belong to a NODATASUM inode, then remove the csum for preallocated
  range.

- Enhance test case to include hole punching and compressed extents


Qu Wenruo (5):
  btrfs-progs: backref: properly queue indirect refs
  btrfs-progs: check: move csum tree population into mode-common.[ch]
  btrfs-progs: check: don't calculate csum for preallocated file extents
  btrfs-progs: check: handle csum generation properly for
    `--init-csum-tree --init-extent-tree`
  btrfs-progs: fsck-tests: add test case for init-csum-tree

 check/main.c                                | 244 ------------
 check/mode-common.c                         | 393 ++++++++++++++++++++
 check/mode-common.h                         |   1 +
 kernel-shared/backref.c                     |  10 +-
 tests/fsck-tests/052-init-csum-tree/test.sh |  72 ++++
 5 files changed, 474 insertions(+), 246 deletions(-)
 create mode 100755 tests/fsck-tests/052-init-csum-tree/test.sh

Comments

David Sterba Jan. 14, 2022, 4:50 p.m. UTC | #1
On Fri, Jan 14, 2022 at 08:51:18AM +0800, Qu Wenruo wrote:
> [BUG]
> Issue #420 mentions that --init-csum-tree creates extra csum for
> preallocated extents.
> 
> Causing extra errors after --init-csum-tree.
> 
> [CAUSE]
> Csum re-calculation code doesn't take the following factors into
> consideration:
> 
> - NODATASUM inodes
> - Preallocated extents
>   No csum should be calculated for those data extents
> 
> - Partically written preallocated csum
>   Csum should only be created for the referred part
> 
> So currently csum recalculation will just create csum for any data
> extents it found, causing the mentioned errors.
> 
> [FIX]
> - Bug fix for backref iteration
>   Firstly fix a bug in backref code where indirect ref is not properly
>   resolved.
> 
>   This allows us to use iterate_extent_inodes() to make our lives much
>   easier checking the file extents.
> 
> - Code move for --init-csum-tree
>   Move it to mode independent mode-common.[ch]
> 
> - Fix for --init-csum-tree
>   There are some extreme corner cases to consider, in fact we can not
>   really determine a range should have csum or not:
>   * Preallocated, written, then hole punched range
>     Should still have csum for the written (now hole) part.
> 
>   * Preallocated, then hole punched range.
>     Should not have csum for the now hole part.
> 
>   * Regular written extent, then hole punched range
>     Should have csum for the now hole part.
> 
>   But there is still one thing to follow, if a range is preallocated,
>   it should not have csum.
> 
>   So here we go a different route, by always generate csum for the whole
>   extent (as long as it's not belonging to NODATASUM inode), then remove
>   csums for preallocated range.
> 
> - Fix for --init-csum-tree --init-extent-tree
>   The same fix, just into a different context
> 
> - New test case
> 
> [CHANGELOG]
> v2:
> - Handle written extents then hole punched cases
>   Now we always generate csum for a data extent as long as it doesn't
>   belong to a NODATASUM inode, then remove the csum for preallocated
>   range.
> 
> - Enhance test case to include hole punching and compressed extents

Added to devel, thanks, I guess this fixes some of the problems that I
saw when trying to implement the checksum algo switch.