mbox series

[v2,00/39] btrfs: qgroup: Use backref cache based backref walk for commit roots

Message ID 20200326083316.48847-1-wqu@suse.com (mailing list archive)
Headers show
Series btrfs: qgroup: Use backref cache based backref walk for commit roots | expand

Message

Qu Wenruo March 26, 2020, 8:32 a.m. UTC
This patchset is based on misc-5.7 branch.

The branch can be fetched from github for review/testing.
https://github.com/adam900710/linux/tree/backref_cache_all

The patchset survives all the existing qgroup/volume/replace/balance tests.


=== BACKGROUND ===
One of the biggest problem for qgroup is its performance impact.
Although we have improved it in since v5.0 kernel, there is still
something slowing down qgroup, the backref walk.

Before this patchset, we use btrfs_find_all_roots() to iterate all roots
referring to one extent.
That function is doing a pretty good job, but it doesn't has any cache,
which means even we're looking up the same extent, we still need to do
the full backref walk.

On the other hand, relocation is doing its own backref cache, and
provides a much faster backref walk.

So the patchset is mostly trying to make qgroup backref walk (at least
commit root backref walk) to use the same mechanism provided by
relocation.

=== BENCHMARK ===
For the performance improvement, the last patch has a benchmark.
The following content is completely copied from that patch:
------
Here is a small script to test it:

  mkfs.btrfs -f $dev
  mount $dev -o space_cache=v2 $mnt

  btrfs subvolume create $mnt/src

  for ((i = 0; i < 64; i++)); do
          for (( j = 0; j < 16; j++)); do
                  xfs_io -f -c "pwrite 0 2k" $mnt/src/file_inline_$(($i * 16 + $j)) > /dev/null
          done
          xfs_io -f -c "pwrite 0 1M" $mnt/src/file_reg_$i > /dev/null
          sync
          btrfs subvol snapshot $mnt/src $mnt/snapshot_$i
  done
  sync

  btrfs quota enable $mnt
  btrfs quota rescan -w $mnt

Here is the benchmark for above small tests.
The performance material is the total execution time of get_old_roots()
for patched kernel (*), and find_all_roots() for original kernel.

*: With CONFIG_BTRFS_FS_CHECK_INTEGRITY disabled, as get_old_roots()
   will call find_all_roots() to verify the result if that config is
   enabled.

		|  Number of calls | Total exec time |
------------------------------------------------------
find_all_roots()|  732		   | 529991034ns
get_old_roots() |  732		   | 127998312ns
------------------------------------------------------
diff		|  0.00 %	   | -75.8 %
------


=== PATCHSET STRUCTURE ===
Patch 01~14 are refactors of relocation backref.
Patch 15~31 are code move.
Patch 32 is the patch that is already in misc-next.
Patch 33 is the final preparation for qgroup backref.
Patch 34~40 are the qgroup backref cache implementation.

=== CHANGELOG ===
v1:
- Use btrfs_backref_ prefix for exported structure/function
- Add one extra patch to rename backref_(node/edge/cache)
  The renaming itself is not small, thus better to do the rename
  first then move them to backref.[ch].
- Add extra Reviewed-by tags.

v2:
- Rebased to misc-next branch
- Add new reviewed-by tags from v1.

Qu Wenruo (39):
  btrfs: backref: Introduce the skeleton of btrfs_backref_iter
  btrfs: backref: Implement btrfs_backref_iter_next()
  btrfs: relocation: Use btrfs_backref_iter infrastructure
  btrfs: relocation: Rename mark_block_processed() and
    __mark_block_processed()
  btrfs: relocation: Add backref_cache::pending_edge and
    backref_cache::useless_node members
  btrfs: relocation: Add backref_cache::fs_info member
  btrfs: relocation: Make reloc root search specific for relocation
    backref cache
  btrfs: relocation: Refactor direct tree backref processing into its
    own function
  btrfs: relocation: Refactor indirect tree backref processing into its
    own function
  btrfs: relocation: Use wrapper to replace open-coded edge linking
  btrfs: relocation: Specify essential members for alloc_backref_node()
  btrfs: relocation: Remove the open-coded goto loop for breadth-first
    search
  btrfs: relocation: Refactor the finishing part of upper linkage into
    finish_upper_links()
  btrfs: relocation: Refactor the useless nodes handling into its own
    function
  btrfs: relocation: Add btrfs_ prefix for backref_node/edge/cache
  btrfs: Move btrfs_backref_(node|edge|cache) structures to backref.h
  btrfs: Rename tree_entry to simple_node and export it
  btrfs: Rename backref_cache_init() to btrfs_backref_cache_init() and
    move it to backref.c
  btrfs: Rename alloc_backref_node() to btrfs_backref_alloc_node() and
    move it backref.c
  btrfs: Rename alloc_backref_edge() to btrfs_backref_alloc_edge() and
    move it backref.c
  btrfs: Rename link_backref_edge() to btrfs_backref_link_edge() and
    move it backref.h
  btrfs: Rename free_backref_(node|edge) to
    btrfs_backref_free_(node|edge) and move them to backref.h
  btrfs: Rename drop_backref_node() to btrfs_backref_drop_node() and
    move its needed facilities to backref.h
  btrfs: Rename remove_backref_node() to btrfs_backref_cleanup_node()
    and move it to backref.c
  btrfs: Rename backref_cache_cleanup() to btrfs_backref_release_cache()
    and move it to backref.c
  btrfs: Rename backref_tree_panic() to btrfs_backref_panic(), and move
    it to backref.c
  btrfs: Rename should_ignore_root() to btrfs_should_ignore_reloc_root()
    and export it
  btrfs: relocation: Open-code read_fs_root() for
    handle_indirect_tree_backref()
  btrfs: Rename handle_one_tree_block() to btrfs_backref_add_tree_node()
    and move it to backref.c
  btrfs: Rename finish_upper_links() to
    btrfs_backref_finish_upper_links() and move it to backref.c
  btrfs: relocation: Move error handling of build_backref_tree() to
    backref.c
  btrfs: backref: Only ignore reloc roots for indrect backref resolve if
    the backref cache is for reloction purpose
  btrfs: qgroup: Introduce qgroup backref cache
  btrfs: qgroup: Introduce qgroup_backref_cache_build() function
  btrfs: qgroup: Introduce a function to iterate through backref_cache
    to find all parents for specified node
  btrfs: qgroup: Introduce helpers to get needed tree block info
  btrfs: qgroup: Introduce verification for function to ensure old roots
    ulist matches btrfs_find_all_roots() result
  btrfs: qgroup: Introduce a new function to get old_roots ulist using
    backref cache
  btrfs: qgroup: Use backref cache to speed up old_roots search

 fs/btrfs/backref.c    |  808 ++++++++++++++++++++++++++++
 fs/btrfs/backref.h    |  319 +++++++++++
 fs/btrfs/ctree.h      |    5 +
 fs/btrfs/disk-io.c    |    1 +
 fs/btrfs/misc.h       |   54 ++
 fs/btrfs/qgroup.c     |  516 +++++++++++++++++-
 fs/btrfs/relocation.c | 1187 ++++++++---------------------------------
 7 files changed, 1925 insertions(+), 965 deletions(-)

Comments

David Sterba March 27, 2020, 3:51 p.m. UTC | #1
On Thu, Mar 26, 2020 at 04:32:37PM +0800, Qu Wenruo wrote:
> This patchset is based on misc-5.7 branch.
> 
> The branch can be fetched from github for review/testing.
> https://github.com/adam900710/linux/tree/backref_cache_all
> 
> The patchset survives all the existing qgroup/volume/replace/balance tests.

Thanks for the rebase, the whole patchset passed fstests so I'll start
merging it. The backref cache (patches 33-39) still need review but
because the tests pass it can be in for-next. The cleanup part 1-32
seems safe so that'll go to misc-next soonish, once I go through the
patches, there are some minor style issues.
David Sterba April 2, 2020, 4:18 p.m. UTC | #2
I went through the patches, overall like the patch separation made it
easy. Thanks.

There are some things that I fixed or updated.

On Thu, Mar 26, 2020 at 04:32:37PM +0800, Qu Wenruo wrote:
> Qu Wenruo (39):
>   btrfs: backref: Introduce the skeleton of btrfs_backref_iter
>   btrfs: backref: Implement btrfs_backref_iter_next()
>   btrfs: relocation: Use btrfs_backref_iter infrastructure
>   btrfs: relocation: Rename mark_block_processed() and
>     __mark_block_processed()
>   btrfs: relocation: Add backref_cache::pending_edge and
>     backref_cache::useless_node members
>   btrfs: relocation: Add backref_cache::fs_info member
>   btrfs: relocation: Make reloc root search specific for relocation
>     backref cache
>   btrfs: relocation: Refactor direct tree backref processing into its
>     own function
>   btrfs: relocation: Refactor indirect tree backref processing into its
>     own function
>   btrfs: relocation: Use wrapper to replace open-coded edge linking
>   btrfs: relocation: Specify essential members for alloc_backref_node()
>   btrfs: relocation: Remove the open-coded goto loop for breadth-first
>     search
>   btrfs: relocation: Refactor the finishing part of upper linkage into
>     finish_upper_links()
>   btrfs: relocation: Refactor the useless nodes handling into its own
>     function
>   btrfs: relocation: Add btrfs_ prefix for backref_node/edge/cache
>   btrfs: Move btrfs_backref_(node|edge|cache) structures to backref.h
>   btrfs: Rename tree_entry to simple_node and export it
>   btrfs: Rename backref_cache_init() to btrfs_backref_cache_init() and
>     move it to backref.c
>   btrfs: Rename alloc_backref_node() to btrfs_backref_alloc_node() and
>     move it backref.c
>   btrfs: Rename alloc_backref_edge() to btrfs_backref_alloc_edge() and
>     move it backref.c
>   btrfs: Rename link_backref_edge() to btrfs_backref_link_edge() and
>     move it backref.h
>   btrfs: Rename free_backref_(node|edge) to
>     btrfs_backref_free_(node|edge) and move them to backref.h
>   btrfs: Rename drop_backref_node() to btrfs_backref_drop_node() and
>     move its needed facilities to backref.h
>   btrfs: Rename remove_backref_node() to btrfs_backref_cleanup_node()
>     and move it to backref.c
>   btrfs: Rename backref_cache_cleanup() to btrfs_backref_release_cache()
>     and move it to backref.c
>   btrfs: Rename backref_tree_panic() to btrfs_backref_panic(), and move
>     it to backref.c
>   btrfs: Rename should_ignore_root() to btrfs_should_ignore_reloc_root()
>     and export it
>   btrfs: relocation: Open-code read_fs_root() for
>     handle_indirect_tree_backref()
>   btrfs: Rename handle_one_tree_block() to btrfs_backref_add_tree_node()
>     and move it to backref.c
>   btrfs: Rename finish_upper_links() to
>     btrfs_backref_finish_upper_links() and move it to backref.c
>   btrfs: relocation: Move error handling of build_backref_tree() to
>     backref.c
>   btrfs: backref: Only ignore reloc roots for indrect backref resolve if
>     the backref cache is for reloction purpose

This subject line is way too long and also quite hard to grasp what's
the patch actually doing. The other subjects about moving functions are
too long as well, I understand you want to put the new name there too,
but it's IMHO not necessary. When the function is 'renamed and moved',
the details are in the patch. So the final list of subject lines I got
to:

  btrfs: backref: introduce the skeleton of btrfs_backref_iter
  btrfs: backref: implement btrfs_backref_iter_next()
  btrfs: reloc: use btrfs_backref_iter infrastructure
  btrfs: reloc: rename mark_block_processed and __mark_block_processed
  btrfs: reloc: add backref_cache::pending_edge and backref_cache::useless_node
  btrfs: reloc: add backref_cache::fs_info member
  btrfs: reloc: make reloc root search-specific for relocation backref cache
  btrfs: reloc: refactor direct tree backref processing into its own function
  btrfs: reloc: refactor indirect tree backref processing into its own function
  btrfs: reloc: use wrapper to replace open-coded edge linking
  btrfs: reloc: pass essential members for alloc_backref_node()
  btrfs: reloc: remove the open-coded goto loop for breadth-first search
  btrfs: reloc: refactor finishing part of upper linkage into finish_upper_links()
  btrfs: reloc: refactor useless nodes handling into its own function
  btrfs: reloc: add btrfs_ prefix for backref_node/edge/cache
  btrfs: move btrfs_backref_(node|edge|cache) structures to backref.h
  btrfs: rename tree_entry to simple_node and export it
  btrfs: rename and move backref_cache_init()
  btrfs: rename and move alloc_backref_node()
  btrfs: rename and move alloc_backref_edge()
  btrfs: rename and move link_backref_edge()
  btrfs: rename and move free_backref_(node|edge)
  btrfs: rename and move drop_backref_node()
  btrfs: rename and move remove_backref_node()
  btrfs: rename and move backref_cache_cleanup()
  btrfs: rename and move backref_tree_panic()
  btrfs: rename and move should_ignore_root()
  btrfs: reloc: open-code read_fs_root() for handle_indirect_tree_backref()
  btrfs: rename and move handle_one_tree_block()
  btrfs: rename and move finish_upper_links()
  btrfs: reloc: move error handling of build_backref_tree() to backref.c
  btrfs: backref: distinguish reloc and non-reloc use of indirect resolution

For a cleanup series I'd really like to see more focus on making the
code also look better, namely when the comments are moved/updated.

There's a common style that the old comments don't follow, eg. no
capital letter at the beginning, or not using the full line width. There
are also grammar mistakes or spelling typos. It's ok to fix that on the
fly.

The function comments should go to the .c file, not the headers (eg.
btrfs_backref_finish_upper_links, btrfs_backref_add_tree_node,
btrfs_backref_cleanup_node).

When you add something to the end of a header file, please keep an empty
line before the last #endif.

For the static inlines I want to do another round, most of them are
acceptable so I'll look for some clear examples where it's misused. A
quick grep over the code base shows there are many so it would be a
wider cleanup.
David Sterba April 3, 2020, 3:44 p.m. UTC | #3
On Thu, Mar 26, 2020 at 04:32:37PM +0800, Qu Wenruo wrote:
> Qu Wenruo (39):
>   btrfs: backref: Introduce the skeleton of btrfs_backref_iter
>   btrfs: backref: Implement btrfs_backref_iter_next()
>   btrfs: relocation: Use btrfs_backref_iter infrastructure
>   btrfs: relocation: Rename mark_block_processed() and
>     __mark_block_processed()
>   btrfs: relocation: Add backref_cache::pending_edge and
>     backref_cache::useless_node members
>   btrfs: relocation: Add backref_cache::fs_info member
>   btrfs: relocation: Make reloc root search specific for relocation
>     backref cache
>   btrfs: relocation: Refactor direct tree backref processing into its
>     own function
>   btrfs: relocation: Refactor indirect tree backref processing into its
>     own function
>   btrfs: relocation: Use wrapper to replace open-coded edge linking
>   btrfs: relocation: Specify essential members for alloc_backref_node()
>   btrfs: relocation: Remove the open-coded goto loop for breadth-first
>     search
>   btrfs: relocation: Refactor the finishing part of upper linkage into
>     finish_upper_links()
>   btrfs: relocation: Refactor the useless nodes handling into its own
>     function
>   btrfs: relocation: Add btrfs_ prefix for backref_node/edge/cache
>   btrfs: Move btrfs_backref_(node|edge|cache) structures to backref.h
>   btrfs: Rename tree_entry to simple_node and export it
>   btrfs: Rename backref_cache_init() to btrfs_backref_cache_init() and
>     move it to backref.c
>   btrfs: Rename alloc_backref_node() to btrfs_backref_alloc_node() and
>     move it backref.c
>   btrfs: Rename alloc_backref_edge() to btrfs_backref_alloc_edge() and
>     move it backref.c
>   btrfs: Rename link_backref_edge() to btrfs_backref_link_edge() and
>     move it backref.h
>   btrfs: Rename free_backref_(node|edge) to
>     btrfs_backref_free_(node|edge) and move them to backref.h
>   btrfs: Rename drop_backref_node() to btrfs_backref_drop_node() and
>     move its needed facilities to backref.h
>   btrfs: Rename remove_backref_node() to btrfs_backref_cleanup_node()
>     and move it to backref.c
>   btrfs: Rename backref_cache_cleanup() to btrfs_backref_release_cache()
>     and move it to backref.c
>   btrfs: Rename backref_tree_panic() to btrfs_backref_panic(), and move
>     it to backref.c
>   btrfs: Rename should_ignore_root() to btrfs_should_ignore_reloc_root()
>     and export it
>   btrfs: relocation: Open-code read_fs_root() for
>     handle_indirect_tree_backref()
>   btrfs: Rename handle_one_tree_block() to btrfs_backref_add_tree_node()
>     and move it to backref.c
>   btrfs: Rename finish_upper_links() to
>     btrfs_backref_finish_upper_links() and move it to backref.c
>   btrfs: relocation: Move error handling of build_backref_tree() to
>     backref.c
>   btrfs: backref: Only ignore reloc roots for indrect backref resolve if
>     the backref cache is for reloction purpose

Patches 1-32 are in misc-next.