mbox series

[v2,00/15] btrfs: read repair/direct I/O improvements

Message ID cover.1587072977.git.osandov@fb.com (mailing list archive)
Headers show
Series btrfs: read repair/direct I/O improvements | expand

Message

Omar Sandoval April 16, 2020, 9:46 p.m. UTC
From: Omar Sandoval <osandov@fb.com>

Hi,

This is version 2 of my series of fixes, cleanups, and improvements to
direct I/O and read repair. It's preparation for adding read repair to
my RWF_ENCODED series [1], but it can go in independently.

Patch 1 adds a new bio helper needed for patch 4. Patches 2 and 3 are
direct I/O error handling fixes. Patch 4 is a buffered read repair fix.
Patch 5 is a buffered read repair improvement. Patches 6-9 are trivial
cleanups. Patch 10 converts direct I/O to use refcount_t, which would've
helped catch the bug fixed by patch 2. Patches 11-14 drastically
simplify the direct I/O code. Patch 15 unifies buffered and direct I/O
read repair, which also makes direct I/O repair actually do validation
for large failed reads instead of rewriting the whole thing.

Overall, this is net about -350 lines of code and actually makes direct
I/O more functional.

Note that this series causes btrfs/142 to fail. This is a bug in the
test, as it assumes that direct I/O doesn't do read validation. This is
fixed by the fstests patch "btrfs/14{2,3}: use dm-dust instead of
fail_make_request" which I sent up yesterday [2].

Jens and Christoph are cc'd for patches 1 and 4. Instead of looking at
the bio internals like I did in v1, I added a new
bio_for_each_bvec_all() helper.

Changes from v1 [3]:

* Added patch 1 with bio_for_each_bvec_all() helper
* Dropped patch 8 which moved struct definition
* Fixed performance regression [4] in patch 13 caused by accidentally
  making all direct I/O submission asynchronous
* Fixed uninitialized assert in patch 12
* Fixed misplaced assert in btrfs_check_read_dio_bio in patch 13
* Added reviewed-bys
* Refactored btrfs_submit_direct() and btrfs_submit_direct_hook() in
  patch 2 (I didn't add Nikolay's reviewed-by to that one because the
  new patch looks fairly different from patch 1 in v1)
* Rewrapped csum calculations in patch 10 for easier readability
* Clarified several commit messages and comments

This series is based on misc-next.

Thanks!

1: https://lore.kernel.org/linux-fsdevel/cover.1582930832.git.osandov@fb.com/
2: https://lore.kernel.org/fstests/d992390752c612acd0893ee3db929e77affded2b.1586983958.git.osandov@fb.com/
3: https://lore.kernel.org/linux-btrfs/cover.1583789410.git.osandov@fb.com/
4: https://lore.kernel.org/lkml/20200331090145.GH11705@shao2-debian/

Omar Sandoval (15):
  block: add bio_for_each_bvec_all()
  btrfs: fix error handling when submitting direct I/O bio
  btrfs: fix double __endio_write_update_ordered in direct I/O
  btrfs: look at full bi_io_vec for repair decision
  btrfs: don't do repair validation for checksum errors
  btrfs: clarify btrfs_lookup_bio_sums documentation
  btrfs: rename __readpage_endio_check to check_data_csum
  btrfs: make btrfs_check_repairable() static
  btrfs: kill btrfs_dio_private->private
  btrfs: convert btrfs_dio_private->pending_bios to refcount_t
  btrfs: put direct I/O checksums in btrfs_dio_private instead of bio
  btrfs: get rid of one layer of bios in direct I/O
  btrfs: simplify direct I/O read repair
  btrfs: get rid of endio_repair_workers
  btrfs: unify buffered and direct I/O read repair

 .clang-format                   |   1 +
 Documentation/block/biovecs.rst |   2 +
 fs/btrfs/btrfs_inode.h          |  26 +-
 fs/btrfs/ctree.h                |   1 -
 fs/btrfs/disk-io.c              |   8 +-
 fs/btrfs/disk-io.h              |   1 -
 fs/btrfs/extent_io.c            | 152 ++++---
 fs/btrfs/extent_io.h            |  19 +-
 fs/btrfs/file-item.c            |  11 +-
 fs/btrfs/inode.c                | 745 ++++++++------------------------
 include/linux/bio.h             |   8 +
 11 files changed, 312 insertions(+), 662 deletions(-)

Comments

David Sterba April 17, 2020, 11:03 a.m. UTC | #1
On Thu, Apr 16, 2020 at 02:46:10PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> Hi,
> 
> This is version 2 of my series of fixes, cleanups, and improvements to
> direct I/O and read repair. It's preparation for adding read repair to
> my RWF_ENCODED series [1], but it can go in independently.
> 
> Patch 1 adds a new bio helper needed for patch 4. Patches 2 and 3 are
> direct I/O error handling fixes. Patch 4 is a buffered read repair fix.
> Patch 5 is a buffered read repair improvement. Patches 6-9 are trivial
> cleanups. Patch 10 converts direct I/O to use refcount_t, which would've
> helped catch the bug fixed by patch 2. Patches 11-14 drastically
> simplify the direct I/O code. Patch 15 unifies buffered and direct I/O
> read repair, which also makes direct I/O repair actually do validation
> for large failed reads instead of rewriting the whole thing.
> 
> Overall, this is net about -350 lines of code and actually makes direct
> I/O more functional.
> 
> Note that this series causes btrfs/142 to fail. This is a bug in the
> test, as it assumes that direct I/O doesn't do read validation. This is
> fixed by the fstests patch "btrfs/14{2,3}: use dm-dust instead of
> fail_make_request" which I sent up yesterday [2].
> 
> Jens and Christoph are cc'd for patches 1 and 4. Instead of looking at
> the bio internals like I did in v1, I added a new
> bio_for_each_bvec_all() helper.
> 
> Changes from v1 [3]:
> 
> * Added patch 1 with bio_for_each_bvec_all() helper
> * Dropped patch 8 which moved struct definition
> * Fixed performance regression [4] in patch 13 caused by accidentally
>   making all direct I/O submission asynchronous
> * Fixed uninitialized assert in patch 12
> * Fixed misplaced assert in btrfs_check_read_dio_bio in patch 13
> * Added reviewed-bys
> * Refactored btrfs_submit_direct() and btrfs_submit_direct_hook() in
>   patch 2 (I didn't add Nikolay's reviewed-by to that one because the
>   new patch looks fairly different from patch 1 in v1)
> * Rewrapped csum calculations in patch 10 for easier readability
> * Clarified several commit messages and comments

Thanks, I haven't read the changes yet but fstests passed. There are
some conflicts with the dio-iomap patches, all seem to be in the
expected range and solvable.
David Sterba April 21, 2020, 11:46 p.m. UTC | #2
On Thu, Apr 16, 2020 at 02:46:10PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> Omar Sandoval (15):
>   block: add bio_for_each_bvec_all()
>   btrfs: fix error handling when submitting direct I/O bio
>   btrfs: fix double __endio_write_update_ordered in direct I/O
>   btrfs: look at full bi_io_vec for repair decision
>   btrfs: don't do repair validation for checksum errors
>   btrfs: clarify btrfs_lookup_bio_sums documentation
>   btrfs: rename __readpage_endio_check to check_data_csum
>   btrfs: make btrfs_check_repairable() static
>   btrfs: kill btrfs_dio_private->private
>   btrfs: convert btrfs_dio_private->pending_bios to refcount_t
>   btrfs: put direct I/O checksums in btrfs_dio_private instead of bio
>   btrfs: get rid of one layer of bios in direct I/O
>   btrfs: simplify direct I/O read repair
>   btrfs: get rid of endio_repair_workers
>   btrfs: unify buffered and direct I/O read repair

Thanks to all who did the reviews, I'm adding the patchset to misc-next.
There are some minor updates, in changelogs or in code. There are also
some comments that might lead to more fixups but I don't think it's
too serious to hold the patches unmerged.

If there are futher comments or requests for clarification, changelog
updates, please let me know, I'll do that directly. We need to give it a
test and also to provide a base for the dio-iomap patchset.