[0/7] btrfs: synchronous (but super simple) read-repair rework

Message ID: cover.1653270322.git.wqu@suse.com

Qu Wenruo May 23, 2022, 1:48 a.m. UTC
This is the initial RFC version revived, based on Christoph's
cleanup series.

The branch can be fetched from my repo:
https://github.com/adam900710/linux/tree/read_repair

The core idea of the revived read-repair rests on the following assumptions:

- Read-repair is already a cold path
- Multiple corruptions in a single read are even rarer in the real world

With the above two assumptions combined, it is safe to sacrifice
read-repair performance by making read-repair completely synchronous.
(The original code also works sector-by-sector, but in an asynchronous
way.)

Now the read-repair is done on a sector-by-sector basis:

1) Try to read the next mirror (if there is one)
2) Verify the csum (if any)
3) If read failed or csum mismatched, go back to 1)

All reads (from the next mirror) and writes (to the previous bad mirror)
are done synchronously.
This means we wait for the read, then also wait for the write.
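
The loop above can be sketched in plain userspace C. This is only an
illustration of the control flow, not the code from the series: the
toy_* names, the in-memory "mirrors", the toy checksum and the tiny
sector size are all hypothetical stand-ins, and rewriting every mirror
that was found bad is an assumption about the write-back policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 16	/* toy value, chosen only to keep the example small */
#define MAX_MIRRORS 4	/* arbitrary upper bound for this sketch */

/* Hypothetical in-memory stand-in for one copy (mirror) of a sector. */
struct toy_mirror {
	uint8_t data[SECTOR_SIZE];
	bool io_error;		/* simulates a failed read */
};

/* Toy checksum; a stand-in for the real data csum. */
static uint32_t toy_csum(const uint8_t *buf)
{
	uint32_t sum = 0;

	for (int i = 0; i < SECTOR_SIZE; i++)
		sum = sum * 31 + buf[i];
	return sum;
}

/* "Synchronous read": returns only once the data is copied or has failed. */
static bool toy_read(const struct toy_mirror *m, uint8_t *out)
{
	if (m->io_error)
		return false;
	memcpy(out, m->data, SECTOR_SIZE);
	return true;
}

/*
 * Repair one sector whose read from @failed_mirror already failed.
 * @expected_csum may be NULL when the data has no csum.
 *
 * Returns the index of the mirror that supplied good data, or -1 if all
 * other mirrors are bad too. On success @out holds the good data and every
 * mirror found bad on the way has been rewritten with it.
 */
static int toy_read_repair_sector(struct toy_mirror *mirrors, int nr_mirrors,
				  int failed_mirror,
				  const uint32_t *expected_csum, uint8_t *out)
{
	assert(nr_mirrors > 0 && nr_mirrors <= MAX_MIRRORS);
	assert(failed_mirror >= 0 && failed_mirror < nr_mirrors);

	for (int i = 1; i < nr_mirrors; i++) {
		int cur = (failed_mirror + i) % nr_mirrors;

		/* 1) Try to read the next mirror. */
		if (!toy_read(&mirrors[cur], out))
			continue;	/* 3) read failed, go back to 1) */

		/* 2) Verify the csum (if any). */
		if (expected_csum && toy_csum(out) != *expected_csum)
			continue;	/* 3) csum mismatch, go back to 1) */

		/* Good copy found: write it back to the bad mirrors. */
		for (int j = 0; j < i; j++) {
			int bad = (failed_mirror + j) % nr_mirrors;

			memcpy(mirrors[bad].data, out, SECTOR_SIZE);
			mirrors[bad].io_error = false;
		}
		return cur;
	}
	return -1;
}
```

A caller would fill an array of struct toy_mirror, compute the expected
csum from known-good data, and call toy_read_repair_sector() with the
index of the mirror that failed first. Since every step completes before
the next one starts, there is no per-sector state to keep around between
completions, which is what keeps the loop short.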

This is no doubt slow, but that should be fine: for corrupted data,
the priority is correctness, not performance.
Not to mention that this performance penalty only applies to the cold path.


The advantage of this method is that the helper, btrfs_read_repair_sector(),
is less than 100 lines and straightforward to read and maintain.
And like all the later read-repair versions, it gets rid of
btrfs_inode::failure_io_tree completely.

And since that helper only needs to manage the content of the page,
there is no need to bother with page status updates, so it can easily be
applied to any endio context (both buffered and direct IO paths).

Unfortunately, since that helper is so simple, there is no need to
introduce a btrfs_read_repair_ctl structure, so the argument list of
that helper is a little longer.

Cc: Christoph Hellwig <hch@lst.de>

Christoph Hellwig (1):
  btrfs: add a btrfs_map_bio_wait helper

Qu Wenruo (6):
  btrfs: save the original bi_iter into btrfs_bio for buffered read
  btrfs: make repair_io_failure available outside of extent_io.c
  btrfs: add new read repair infrastructure
  btrfs: use the new read repair code for buffered reads
  btrfs: use the new read repair code for direct I/O
  btrfs: remove io_failure_record infrastructure completely

 fs/btrfs/Makefile            |   2 +-
 fs/btrfs/btrfs_inode.h       |   5 -
 fs/btrfs/extent-io-tree.h    |  15 --
 fs/btrfs/extent_io.c         | 424 +++--------------------------------
 fs/btrfs/extent_io.h         |  27 +--
 fs/btrfs/inode.c             |  54 ++---
 fs/btrfs/read-repair.c       |  74 ++++++
 fs/btrfs/read-repair.h       |  13 ++
 fs/btrfs/volumes.c           |  21 ++
 fs/btrfs/volumes.h           |   2 +
 include/trace/events/btrfs.h |   1 -
 11 files changed, 164 insertions(+), 474 deletions(-)
 create mode 100644 fs/btrfs/read-repair.c
 create mode 100644 fs/btrfs/read-repair.h