[0/2] btrfs: do most metadata parentness check at endio time

Message ID cover.1663133223.git.wqu@suse.com (mailing list archive)

Message

Qu Wenruo Sept. 14, 2022, 5:32 a.m. UTC
[BACKGROUND]

Btrfs metadata and data verification are both done at endio time.

But metadata has its own extra verification, mostly related to
parentness checks, done in btrfs_read_extent_buffer() and
read_tree_block().

This is not a big deal, but if we want metadata read-repair to share
the same code base with data, we may want the metadata parentness
checks to happen at endio time as well.

[ENHANCEMENT]

This patchset will move all the parentness check code into
btrfs_validate_metadata_buffer().

As the first step, the first patch will concentrate all the existing
parentness check parameters into one structure.
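
To make the idea concrete, here is a minimal user-space sketch of what such
a structure could look like. The names (struct parent_check,
parent_check_init(), struct key) are hypothetical and chosen for
illustration only; the layout in the actual patch may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a btrfs key (assumption: not the kernel layout). */
struct key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/*
 * Hypothetical container for everything the parent knows about the
 * child tree block it points to.
 */
struct parent_check {
	uint64_t owner_root;	/* root that should own the block */
	uint64_t transid;	/* expected generation */
	struct key first_key;	/* expected first key, if known */
	bool has_first_key;	/* whether first_key is valid */
	uint8_t level;		/* expected tree level */
};

/* Fill the structure in one place instead of passing loose arguments. */
static inline struct parent_check parent_check_init(uint64_t owner_root,
						    uint64_t transid,
						    uint8_t level,
						    const struct key *first_key)
{
	struct parent_check check = {
		.owner_root = owner_root,
		.transid = transid,
		.level = level,
	};

	if (first_key) {
		check.first_key = *first_key;
		check.has_first_key = true;
	}
	return check;
}
```

A caller would build one such structure per read and hand it down as a
single pointer, which is what makes it easy to carry further later on.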

Then the second patch will pass all the parentness info into btrfs_bio,
storing it in the space shared with the data csum, so at endio time we
can grab all the metadata parentness info and do the verification.
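
As a rough illustration of the space sharing, the sketch below overlays
the parentness info on an inline csum buffer with a union. Everything here
is an assumption made for the example: struct read_bio, the is_metadata
flag, the helper names, the abbreviated struct parent_check and the
64-byte inline csum size are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define INLINE_CSUM_SIZE 64	/* assumed size of the inline csum area */

/* Abbreviated, hypothetical parentness info. */
struct parent_check {
	uint64_t owner_root;
	uint64_t transid;
	uint8_t level;
};

/* Hypothetical stand-in for the per-read bio wrapper. */
struct read_bio {
	bool is_metadata;
	union {
		/* Data reads: expected checksums live here. */
		uint8_t csum_inline[INLINE_CSUM_SIZE];
		/* Metadata reads: the same bytes carry parentness info. */
		struct parent_check parent_check;
	};
};

/* Submission side: attach the info before the read is issued. */
static void bio_set_parent_check(struct read_bio *bio,
				 const struct parent_check *check)
{
	bio->is_metadata = true;
	bio->parent_check = *check;
}

/* Endio side: fetch the info back; NULL for data reads. */
static const struct parent_check *bio_parent_check(const struct read_bio *bio)
{
	return bio->is_metadata ? &bio->parent_check : NULL;
}
```

The point of the union is that a given read is either data or metadata,
so the two users never need the space at the same time and the bio does
not grow.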

This means the following mismatches at read time will be rejected
directly:

- First key mismatch
- Transid mismatch
- Level mismatch
- Owner root mismatch
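
The four checks listed above can be sketched as one function run against
the freshly read block. This is a simplified user-space model, not the
kernel code: the type and function names are hypothetical, every mismatch
is reported as -EIO for brevity, and the special cases the real code has
to handle are left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified key (assumption: not the kernel layout). */
struct key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* What the parent expects (hypothetical structure). */
struct parent_check {
	uint64_t owner_root;
	uint64_t transid;
	struct key first_key;
	bool has_first_key;
	uint8_t level;
};

/* What was actually read from disk (simplified header fields). */
struct tree_block {
	uint64_t owner;
	uint64_t generation;
	uint8_t level;
	struct key first_key;
};

static bool key_equal(const struct key *a, const struct key *b)
{
	return a->objectid == b->objectid && a->type == b->type &&
	       a->offset == b->offset;
}

/* Return 0 if the block matches its parent's expectations, -EIO if not. */
static int check_parentness(const struct tree_block *eb,
			    const struct parent_check *check)
{
	if (eb->level != check->level)
		return -EIO;	/* level mismatch */
	if (check->has_first_key &&
	    !key_equal(&eb->first_key, &check->first_key))
		return -EIO;	/* first key mismatch */
	if (eb->generation != check->transid)
		return -EIO;	/* transid mismatch */
	if (eb->owner != check->owner_root)
		return -EIO;	/* owner root mismatch */
	return 0;
}
```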

Since all the read-time parentness checks are now done at endio,
btrfs_read_extent_buffer() can do less verification work for new extent
buffers that are going to be read from disk.

But please note that we still do the parentness check for cached extent
buffers, so that a cached/stale extent buffer previously read through
another parent tree block does not cause problems.
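
The two paths can be modelled roughly as below. This is only a toy
sketch: the on_disk argument stands in for a real read, the validation
after the copy stands in for the endio handler, and all names and the
abbreviated structures are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Abbreviated, hypothetical parentness info. */
struct parent_check {
	uint64_t transid;
	uint8_t level;
};

struct buffer {
	bool uptodate;		/* contents already valid in memory */
	uint64_t generation;
	uint8_t level;
};

static int check_parentness(const struct buffer *eb,
			    const struct parent_check *check)
{
	if (eb->level != check->level || eb->generation != check->transid)
		return -EIO;
	return 0;
}

/* on_disk models what a real read would fetch into the buffer. */
static int read_buffer(struct buffer *eb, const struct buffer *on_disk,
		       const struct parent_check *check)
{
	int ret;

	/* Cached path: no I/O is issued, so no read-repair can start. */
	if (eb->uptodate)
		return check_parentness(eb, check);

	/* Fresh read: fetch the contents, then validate as endio would. */
	eb->generation = on_disk->generation;
	eb->level = on_disk->level;
	ret = check_parentness(eb, check);
	if (ret == 0)
		eb->uptodate = true;
	return ret;
}
```

A second caller arriving with different expectations for an already
cached buffer is caught by the first branch, without touching the disk.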

Thankfully that part will not trigger read-repair, thus it won't affect
us for now.

[TODO]
Make metadata and data share the same code base to do read-repair.

The main blocker here is that we have a lot of pending patches changing
the read-repair code, so this would cause conflicts with the already
lengthy queue of pending patches.

Thus the refactor part is sent out first; once read-repair has settled
down, I can work on the unified read-repair code.

Qu Wenruo (2):
  btrfs: concentrate all tree block parentness check parameters into one
    structure
  btrfs: move tree block parentness check into validate_extent_buffer()

 fs/btrfs/backref.c      |  15 +++--
 fs/btrfs/ctree.c        |  28 +++++----
 fs/btrfs/disk-io.c      | 125 +++++++++++++++++++++++++++-------------
 fs/btrfs/disk-io.h      |  36 ++++++++++--
 fs/btrfs/extent-tree.c  |  12 ++--
 fs/btrfs/extent_io.c    |  18 ++++--
 fs/btrfs/extent_io.h    |   5 +-
 fs/btrfs/print-tree.c   |  13 +++--
 fs/btrfs/qgroup.c       |  18 ++++--
 fs/btrfs/relocation.c   |  11 +++-
 fs/btrfs/tree-log.c     |  25 +++++---
 fs/btrfs/tree-mod-log.c |   9 ++-
 fs/btrfs/volumes.h      |  25 +++++++-
 13 files changed, 248 insertions(+), 92 deletions(-)

Comments

David Sterba Nov. 11, 2022, 4:07 p.m. UTC | #1
On Wed, Sep 14, 2022 at 01:32:49PM +0800, Qu Wenruo wrote:
> Qu Wenruo (2):
>   btrfs: concentrate all tree block parentness check parameters into one
>     structure
>   btrfs: move tree block parentness check into validate_extent_buffer()

As this is only 2 patches and they apply with minimal conflicts I've
added it to misc-next, thanks.