mbox series

[v2,00/10] btrfs: zoned: write-time activation of metadata block group

Message ID cover.1690823282.git.naohiro.aota@wdc.com (mailing list archive)
Headers show
Series btrfs: zoned: write-time activation of metadata block group | expand

Message

Naohiro Aota July 31, 2023, 5:17 p.m. UTC
In the current implementation, block groups are activated at
reservation time to ensure that all reserved bytes can be written to
an active metadata block group. However, this approach has proven to
be less efficient, as it activates block groups more frequently than
necessary, putting pressure on the active zone resource and leading to
potential issues such as early ENOSPC or hung_task.

Another drawback of the current method is that it hampers metadata
over-commit, and necessitates additional flush operations and block
group allocations, resulting in decreased overall performance.

Actually, we don't need so many active metadata block groups because
there is only one sequential metadata write stream.

So, this series introduces a write-time activation of metadata and
system block group. This involves reserving at least one active block
group specifically for a metadata and system block group. When the
write goes into a new block group, it should have allocated all the
regions in the current active block group. So, we can wait for IOs to
fill the space, and then switch to a new block group.

Switching to the write-time activation solves the above issue and will
lead to better performance.

* Performance

There is a significant difference with a workload (buffered write without
sync) because we re-enable metadata over-commit.

before the patch:  741.00 MB/sec
after the patch:  1430.27 MB/sec (+ 93%)

* Organization

Patches 1-5 are preparation patches involves meta_write_pointer check.

Patches 6 and 7 are the main part of this series, implementing the
write-time activation.

Patches 8-10 addresses code for reserve time activation: counting fresh
block group as zone_unusable, activating a block group on allocation,
and disabling metadata over-commit.

* Changes

- v2
  - Introduce a struct to consolidate extent buffer write context
    (btrfs_eb_write_context)
  - Change return type of btrfs_check_meta_write_pointer to int
  - Calculate the reservation count only when it sees DUP BG
  - Drop unnecessary BG lock

Naohiro Aota (10):
  btrfs: introduce struct to consolidate extent buffer write context
  btrfs: zoned: introduce block_group context to btrfs_eb_write_context
  btrfs: zoned: return int from btrfs_check_meta_write_pointer
  btrfs: zoned: defer advancing meta_write_pointer
  btrfs: zoned: update meta_write_pointer on zone finish
  btrfs: zoned: reserve zones for an active metadata/system block group
  btrfs: zoned: activate metadata block group on write time
  btrfs: zoned: no longer count fresh BG region as zone unusable
  btrfs: zoned: don't activate non-DATA BG on allocation
  btrfs: zoned: re-enable metadata over-commit for zoned mode

 fs/btrfs/block-group.c      |  13 ++-
 fs/btrfs/extent-tree.c      |   8 +-
 fs/btrfs/extent_io.c        |  48 +++++----
 fs/btrfs/extent_io.h        |   6 ++
 fs/btrfs/free-space-cache.c |   8 +-
 fs/btrfs/fs.h               |   9 ++
 fs/btrfs/space-info.c       |  34 +-----
 fs/btrfs/zoned.c            | 201 +++++++++++++++++++++++++++---------
 fs/btrfs/zoned.h            |  20 +---
 9 files changed, 216 insertions(+), 131 deletions(-)