[0/3] btrfs: zoned: implement ZONE_RESET space_info reclaiming

Message ID cover.1731309514.git.naohiro.aota@wdc.com (mailing list archive)

Naohiro Aota Nov. 11, 2024, 7:46 a.m. UTC
There is a longstanding early ENOSPC issue in zoned mode. When there
are heavy write operations on a nearly full file system, freeing up
space and resetting the zones often cannot keep up with the write speed.
That results in an early ENOSPC. For example, running the following fio
script, which repeatedly overwrites a 15 GB file on a 20 GB file system,
results in the ENOSPC shown below.

Fio script:

  [test]
  filename=/mnt/scratch/test
  readwrite=write
  ioengine=libaio
  direct=1
  loops=10
  filesize=15G
  bs=128k

Result:

  BTRFS info (device nvme0n1): cannot satisfy tickets, dumping space info
  BTRFS info (device nvme0n1): space_info DATA has 0 free, is full
  BTRFS info (device nvme0n1): space_info total=20535312384, used=16106127360, pinned=0, reserved=0, may_use=0, readonly=0 zone_unusable=4429185024
  BTRFS info (device nvme0n1): failing ticket with 131072 bytes
  BTRFS info (device nvme0n1): space_info DATA has 0 free, is full
  BTRFS info (device nvme0n1): space_info total=20535312384, used=16106127360, pinned=0, reserved=0, may_use=0, readonly=0 zone_unusable=4429185024
  BTRFS info (device nvme0n1): global_block_rsv: size 25870336 reserved 25853952
  BTRFS info (device nvme0n1): trans_block_rsv: size 0 reserved 0
  BTRFS info (device nvme0n1): chunk_block_rsv: size 0 reserved 0
  BTRFS info (device nvme0n1): delayed_block_rsv: size 0 reserved 0
  BTRFS info (device nvme0n1): delayed_refs_rsv: size 0 reserved 0
  fio: io_u error on file /mnt/scratch/test: No space left on device: write offset=13287555072, buflen=131072
  fio: pid=869, err=28/file:io_u.c:1962, func=io_u error, error=No space left on device
  ...
  Run status group 0 (all jobs):
    WRITE: bw=113MiB/s (118MB/s), 113MiB/s-113MiB/s (118MB/s-118MB/s), io=27.4GiB (29.4GB), run=248965-248965msec

As the result shows, fio fails after writing only 27 GB. Instead, it
should be able to write 150 GB by freeing the overwritten regions. The
space_info dump shows 4.1 GB of zone_unusable in the DATA space. While
this space will eventually be freed after a transaction commit and zone
reset, the dump shows that btrfs is too slow to reuse the zone_unusable
space.
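
The numbers in the dump can be checked with a few lines of arithmetic.
The sketch below is only an illustration: the variable names are made up
here, it assumes free space is simply the total minus every other
counter printed in the dump, and it assumes fio's "G" suffix means GiB.

```python
# Values copied from the space_info dump above (bytes).
total = 20535312384
used = 16106127360
zone_unusable = 4429185024
pinned = reserved = may_use = readonly = 0

GiB = 1 << 30

# Assumption: free space is what remains after subtracting all counters.
free = total - (used + pinned + reserved + may_use + readonly + zone_unusable)

# Assumption: "filesize=15G" means 15 GiB, so 10 loops should write 150 GiB.
expected_io = 10 * 15 * GiB

print(f"free          = {free} bytes")
print(f"used          = {used / GiB:.3f} GiB")
print(f"zone_unusable = {zone_unusable / GiB:.3f} GiB")
print(f"expected io   = {expected_io / GiB:.0f} GiB")
```

Under these assumptions, used plus zone_unusable accounts for the whole
DATA space, which matches the "has 0 free, is full" message.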

There are several reasons for hitting ENOSPC early, and this series
addresses only one of them: unusable block groups are not reclaimed fast
enough. This series introduces a new space_info reclaim method,
ZONE_RESET. That method picks a block group in the unused list and
sends a ZONE_RESET command to free up and reuse the zone_unusable space.

In this first implementation, ZONE_RESET is only applied to a block
group whose region is fully zone_unusable. Reclaiming partially
zone_unusable block groups could be implemented later.
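
The selection rule described above can be modeled in userspace as
follows. This is a toy sketch, not the kernel code in patch 3: the
BlockGroup class, its fields, and both function names are hypothetical
stand-ins, and the zone reset is modeled by simply clearing the
zone_unusable counter.

```python
from dataclasses import dataclass

MiB = 1 << 20


@dataclass
class BlockGroup:
    """Hypothetical stand-in for a block group; not the kernel struct."""
    start: int
    length: int
    used: int = 0
    zone_unusable: int = 0


def pick_resettable(unused_list):
    """Return the first block group whose whole region is zone_unusable."""
    for bg in unused_list:
        if bg.used == 0 and bg.zone_unusable == bg.length:
            return bg
    return None


def reclaim_by_zone_reset(unused_list):
    """Model one ZONE_RESET reclaim step; return the bytes made reusable."""
    bg = pick_resettable(unused_list)
    if bg is None:
        return 0
    reclaimed = bg.zone_unusable
    bg.zone_unusable = 0  # models the effect of resetting the zone
    return reclaimed


# The first group is only partially zone_unusable and is skipped; the
# second is fully zone_unusable and gets reclaimed.
groups = [
    BlockGroup(start=0, length=256 * MiB, zone_unusable=128 * MiB),
    BlockGroup(start=256 * MiB, length=256 * MiB, zone_unusable=256 * MiB),
]
print(reclaim_by_zone_reset(groups) // MiB, "MiB reclaimed")
```

The partially zone_unusable group is left untouched, which corresponds
to the case deferred to later work.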

Patches 1 and 2 are preparation for patch 3 and make no functional
change. Patch 3 introduces the new space_info reclaim method
ZONE_RESET described above.

Follow-up series will fully fix the ENOSPC issue with the above fio
script. One will separate the space_info of regular data and relocation
data. Another will rework zone resetting of deleted block groups so
that the empty zone bit is set early.

Naohiro Aota (3):
  btrfs: introduce btrfs_return_free_space()
  btrfs: drop fs_info argument from btrfs_update_space_info_*
  btrfs: zoned: reclaim unused zone by zone resetting

 fs/btrfs/block-group.c       |  16 ++----
 fs/btrfs/block-rsv.c         |  10 +---
 fs/btrfs/delalloc-space.c    |   2 +-
 fs/btrfs/delayed-ref.c       |   5 +-
 fs/btrfs/extent-tree.c       |  35 +++---------
 fs/btrfs/inode.c             |   2 +-
 fs/btrfs/space-info.c        |  64 +++++++++++++++++----
 fs/btrfs/space-info.h        |  15 +++--
 fs/btrfs/transaction.c       |   3 +-
 fs/btrfs/zoned.c             | 104 +++++++++++++++++++++++++++++++++++
 fs/btrfs/zoned.h             |   7 +++
 include/trace/events/btrfs.h |   1 +
 12 files changed, 197 insertions(+), 67 deletions(-)