
[v4,0/9] btrfs: compression: refactor and enhancement preparing for subpage compression support

Message ID 20210617051450.206704-1-wqu@suse.com (mailing list archive)

Message

Qu Wenruo June 17, 2021, 5:14 a.m. UTC
There are quite a few problems in the compression code:

- Weird compressed_bio::pending_bios dance
  If we just want to prevent compressed_bio from being freed halfway,
  we have saner methods, just like btrfs_subpage::readers.

  So here we fix it by introducing compressed_bio::pending_sectors to
  do the job.

- BUG_ON()s inside btrfs_submit_compressed_*()
  Even if they are just ENOMEM, we should handle them.
  With pending_sectors introduced, we have a way to finish the
  compressed_bio all by ourselves, as long as we haven't submitted the
  last bio.

  If the last bio is already submitted, then endio will handle it well.

- Duplicated code for compressed bio allocation and submission
  A small refactor can handle it.

- Stripe boundary is checked every time one page is added
  This is overkill.
  Just learn from the extent_io.c refactor, which uses bio_ctrl to do
  the boundary check only once for each bio.

  In the compression context we don't need the extra checks done in
  extent_io.c, so we don't need the bio_ctrl structure and can afford
  to do the check locally.

- Dead code removal
  One dead comment and a new zombie function,
  btrfs_bio_fits_in_stripe(), can be removed now.

Changelog:
v2:
- Rebased to latest misc-next

- Fix a bug in btrfs_submit_compressed_write() where zoned writes were
  not taken into consideration

- Reuse the existing chunk mapping of btrfs_get_chunk_map()

v3:
- Fix a bug where zoned devices can't even pass btrfs/001
  This is because at endio time, bi_size for zoned devices is always 0.
  We have to use bio_for_each_segment_all() to calculate the real bio
  size instead.
  In theory it can also happen on non-zoned devices, but no test case
  (with "-o compress") catches it except btrfs/011.

- Fix a btrfs/011 hang when tested with "-o compress"
  This was caused by checking two atomic values without protection;
  checking two atomic values together is no longer atomic.

  In fact, with compressed_bio::io_sectors introduced, pending_bios is
  only used to wait for any pending bio to finish in the error path.

  Thus dec_and_test_compressed_bio() only needs to check whether
  io_sectors is zero

- Fix an error in the error handling path, where we may hang due to a
  missing wake_up() in dec_and_test_compressed_bio()

v4:
- Use formal words for BUG_ON() removal patch titles

- Remove compressed_bio::pending_bios
  As compressed_bio::pending_sectors can replace it completely

- Remove unnecessary comments and BUG_ON()s

- Use wait_var_event() APIs to reduce the memory overhead

- Update comments to follow the same schema for moved comments

Qu Wenruo (9):
  btrfs: remove a dead comment for btrfs_decompress_bio()
  btrfs: introduce compressed_bio::pending_sectors to trace compressed
    bio more elegantly
  btrfs: handle errors properly inside btrfs_submit_compressed_read()
  btrfs: handle errors properly inside btrfs_submit_compressed_write()
  btrfs: introduce submit_compressed_bio() for compression
  btrfs: introduce alloc_compressed_bio() for compression
  btrfs: make btrfs_submit_compressed_read() to determine stripe
    boundary at bio allocation time
  btrfs: make btrfs_submit_compressed_write() to determine stripe
    boundary at bio allocation time
  btrfs: remove unused function btrfs_bio_fits_in_stripe()

 fs/btrfs/compression.c | 587 +++++++++++++++++++++++------------------
 fs/btrfs/compression.h |   4 +-
 fs/btrfs/ctree.h       |   2 -
 fs/btrfs/inode.c       |  42 ---
 4 files changed, 336 insertions(+), 299 deletions(-)

Comments

David Sterba June 17, 2021, 4:47 p.m. UTC | #1
On Thu, Jun 17, 2021 at 01:14:41PM +0800, Qu Wenruo wrote:
> There are quite a few problems in the compression code:
> [...]

I went through it several times; the changes are scary, but the overall
direction is IMHO the right one, not to mention it fixes the difficult
BUG_ON()s.

I'll put it in for-next once it passes a few rounds of fstests. Taking
it into 5.14 could be risky if we don't have enough review and testing;
time is almost up before the code freeze.
Qu Wenruo June 17, 2021, 10:46 p.m. UTC | #2
On 2021/6/18 12:47 AM, David Sterba wrote:
>> [...]
>
> I went through it several times, the changes are scary, but the overall
> direction is IMHO the right one, not to say it's fixing the difficult
> BUG_ONs.
>
> I'll put it to for-next once it passes a few rounds of fstests. Taking
> it to 5.14 could be risky if we don't have enough review and testing,
> time is almost up before the code freeze.
>
Please don't put it into 5.14.

It's really a preparation for subpage compression support.
However, we don't even have subpage queued for v5.14, thus I'm not in
a hurry.

Thanks,
Qu
David Sterba June 22, 2021, 11:14 a.m. UTC | #3
On Fri, Jun 18, 2021 at 06:46:51AM +0800, Qu Wenruo wrote:
> > [...]
> Please don't put it into 5.14.
> 
> It's really a preparation for subpage compression support.
> However we don't even have subpage queued for v5.14, thus I'm not in a
> hurry.

Ok, no problem. As the merge window is close, I'll keep the compression
and subpage work out of for-next until rc1 is out; the timestamped
for-next snapshot branches can contain it so we can start testing.
Qu Wenruo June 22, 2021, 11:50 a.m. UTC | #4
On 2021/6/22 7:14 PM, David Sterba wrote:
> On Fri, Jun 18, 2021 at 06:46:51AM +0800, Qu Wenruo wrote:
>>> [...]
>>
>> Please don't put it into 5.14.
>> [...]
>
> Ok, no problem. As merge window is close I'll keep the compression and
> subpage out of for-next until rc1 is out, the timestamped for-next
> snapshots branches could contain it so we can start testing.
>
Sounds great.

Although the biggest challenge for testing is the hardware.

I guess even with subpage merged into upstream in v5.14, it won't
really get much testing from the real world, unlike x86_64, which
everyone on the mailing list is testing on.

Although we have cheap ARM SoC boards these days, there aren't really
many good candidates for us to utilize.

Either they don't have fast enough CPUs (2x 2GHz+ A72 or more), don't
have even a single PCIe lane for NVMe, or don't have good enough
upstream kernel support.

So far the RPi CM4 would be a pretty good candidate, and I'm already
using it, but without overclocking and a good heatsink its CPU can be
a little bit slow, and the PCIe 2.0 x1 lane is far from enough.

But I totally understand how difficult it can be even for kernel
developers to set all the small things up.

Like TTY, libvirt, edk2 firmware for VMs, the RPi-specific boot
sequence, etc.

Thus even after subpage gets merged, I still think we need many more
rounds of upstream releases to really get some feedback.

Thanks,
Qu
David Sterba June 22, 2021, 12:41 p.m. UTC | #5
On Tue, Jun 22, 2021 at 07:50:38PM +0800, Qu Wenruo wrote:
> [...]
>
> Thus even after subpage gets merged, I still think we need many more
> rounds of upstream releases to really get some feedback.

Back in January I got hold of a decent arm64 machine and tested the
64K page size build in a VM. The hardware should be good enough for
running at least three in parallel, and I think there were more
identical physical machines available, so we can use those for testing
once all the prep work is done. Right now all people involved know
that the subpage support may be buggy or incomplete, so we can rely on
external testing for some time.