[v6,0/5] btrfs: subpage + zoned fixes

Message ID cover.1716445070.git.wqu@suse.com (mailing list archive)

Qu Wenruo May 23, 2024, 7:05 a.m. UTC
[CHANGELOG]
v6:
- Use unsigned int for bitmap related members

- One extra ASSERT() to make sure our bit range never exceeds the bitmap

- One extra ASSERT() for btrfs_run_delalloc_range() returning >0 case

- "dealloc" typo fix

- Small changes inside writepage_delalloc() main loop to make it a
  little easier to read

v5:
- Enhance the commit message on why we should not clear page dirty
  inside extent_write_locked_range()

- Reorder the patches so that there is no temporary list based solution
  for delalloc ranges

v4:
- Rebased to the latest for-next branch
  Thankfully no conflicts at all.

- Include all the previous preparation patches
  It turns out I split the preparation into other series and even got
  myself confused.

- Use the correct commit message from my local branch
  It turns out Josef is totally correct: the problem I described in
  "btrfs: do not clear page dirty inside extent_write_locked_range()" is
  really confusing, as it has direct IO involved. My local branch was
  already using a much better commit message and I just forgot it.

v3:
- Use the minimal fsstress workload with trace_printk() output to
  explain the bug better

v2:
- Update the commit message for the first patch
  There was something wrong with the ASCII art of the memory layout.

[REPO]
https://github.com/adam900710/linux/tree/subpage_delalloc

If running subpage with zoned devices (TCMU emulated HDD, 64K or 16K
page size with 4K sectorsize), btrfs can easily hit various bugs:

- ASSERT()s related to submitting a page range which has no ordered
  extent (OE) coverage

- Various reserved space leaks, and some OEs that never finish

There are two major causes:

- run_delalloc_cow() is not subpage compatible
  Several different problems are involved:

  * extent_write_locked_range() would try to submit dirty pages beyond
    the specified subpage range
    Thus hitting some ASSERT() that a dirty range has no corresponding
    OE.

  * extent_write_locked_range() would unlock the whole page while
    we're only triggered for a subpage range
    Thus causing unexpected pages to be unlocked.

  This is addressed by patches 1~3 by:

  * Limiting the submission range to follow the subpage ranges

  * Making the page unlocking part also subpage compatible, and always
    locking all delalloc subpage ranges covering the current page.

- Some dirty ranges are not submitted, thus their OEs never finish
  This happens because extent_write_locked_range() can clear the full
  page dirty flag even if we're only submitting part of the dirty
  ranges, causing the page dirty flag to desync from the subpage dirty
  flags.

  Then later __extent_writepage_io() would skip a non-dirty page, as it
  only checks the full page dirty flag, not the subpage bitmaps.

  This is addressed by patches 4~5.
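
To illustrate the two problems above, here is a minimal userspace toy
model. It is NOT the btrfs code: struct toy_page and all toy_*() helpers
and TOY_* constants are made up for this sketch, assuming a 64K page at a
known file offset with 4K sectors. It only shows the intended behavior:
submission is clamped to the requested range within the page, the bit
range is asserted to stay inside the bitmap, and the page level dirty
flag is cleared only once no subpage dirty bit remains.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only (hypothetical names): one 64K page, 4K sectors. */
#define TOY_PAGE_SIZE   65536u
#define TOY_SECTORSIZE  4096u
#define TOY_NR_SECTORS  (TOY_PAGE_SIZE / TOY_SECTORSIZE)

struct toy_page {
	unsigned long long start;   /* file offset of the page */
	unsigned int dirty_bitmap;  /* one dirty bit per sector */
	bool page_dirty;            /* page level dirty flag */
};

/* Mark a sector aligned range inside the page as dirty. */
static void toy_set_dirty(struct toy_page *p, unsigned long long start,
			  unsigned int len)
{
	unsigned int first, nbits, i;

	assert(start >= p->start);
	first = (unsigned int)((start - p->start) / TOY_SECTORSIZE);
	nbits = len / TOY_SECTORSIZE;
	/* The bit range must never exceed the bitmap. */
	assert(first + nbits <= TOY_NR_SECTORS);

	for (i = first; i < first + nbits; i++)
		p->dirty_bitmap |= 1u << i;
	if (nbits)
		p->page_dirty = true;
}

/*
 * "Submit" the dirty sectors inside [start, start + len), clamped to the
 * page. Returns the number of sectors submitted.
 */
static unsigned int toy_submit_range(struct toy_page *p,
				     unsigned long long start,
				     unsigned long long len)
{
	unsigned long long page_end = p->start + TOY_PAGE_SIZE;
	unsigned long long end = start + len;
	unsigned int first, last, i, submitted = 0;

	/* Never touch anything outside the page or the requested range. */
	if (start < p->start)
		start = p->start;
	if (end > page_end)
		end = page_end;
	if (start >= end)
		return 0;

	first = (unsigned int)((start - p->start) / TOY_SECTORSIZE);
	last = (unsigned int)((end - p->start) / TOY_SECTORSIZE);
	assert(last <= TOY_NR_SECTORS);

	for (i = first; i < last; i++) {
		if (p->dirty_bitmap & (1u << i)) {
			p->dirty_bitmap &= ~(1u << i);
			submitted++;
		}
	}

	/* Clear the page flag only when no sector is left dirty. */
	if (p->dirty_bitmap == 0)
		p->page_dirty = false;
	return submitted;
}

/* A later writeback pass should look at the bitmap, not only the flag. */
static bool toy_page_needs_writeback(const struct toy_page *p)
{
	return p->dirty_bitmap != 0;
}
```

With two dirty ranges in one page, submitting only the first one leaves
both the page flag and the remaining bitmap bit set, so a later pass
still picks up the second range instead of skipping the page.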


Qu Wenruo (5):
  btrfs: make __extent_writepage_io() to write specified range only
  btrfs: subpage: introduce helpers to handle subpage delalloc locking
  btrfs: lock subpage ranges in one go for writepage_delalloc()
  btrfs: do not clear page dirty inside extent_write_locked_range()
  btrfs: make extent_write_locked_range() to handle subpage writeback
    correctly

 fs/btrfs/extent_io.c | 132 +++++++++++++++++++++++++++++++------
 fs/btrfs/subpage.c   | 150 +++++++++++++++++++++++++++++++++++++++++--
 fs/btrfs/subpage.h   |  10 ++-
 3 files changed, 266 insertions(+), 26 deletions(-)