diff mbox series

btrfs: do not assume the full page range is not dirty in extent_writepage_io()

Message ID 6239c8bb8ce319c2940d6469ffcb7f5f6641db79.1726011300.git.wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: do not assume the full page range is not dirty in extent_writepage_io() | expand

Commit Message

Qu Wenruo Sept. 11, 2024, 12:21 a.m. UTC
The function extent_writepage_io() will submit the dirty sectors inside
the page for the write.

But recently to co-operate with the incoming subpage compression
enhancement, a new bitmap is introduced to
btrfs_bio_ctrl::submit_bitmap, to only avoid a subset of the dirty
range.

This is because we can have the following cases with 64K page size:

    0      16K       32K       48K       64K
    |      |/////////|         |/|
                                 52K

For range [16K, 32K), we queue the dirty range for compression, which is
ran in a delayed workqueue.
Then for range [48K, 52K), we go through the regular submission path.

In that case, our btrfs_bio_ctrl::submit_bitmap will exclude the range
[16K, 32K).

The dirty flags for the range [16K, 32K) is only cleared when the
compression is done, by the extent_clear_unlock_delalloc() call inside
submit_one_async_extent().

This patch fix the false alert by removing the
btrfs_folio_assert_not_dirty() check, since it's no longer correct for
subpage compression cases.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
This should be the last patch before the enablement of sector perfect
compression support for subpage.

Locally I have already been testing the sector perfect compression, and
that's the last fix.

All the subpage related fixes can be applied out of order as long as the
final enablement patch is at the end of the queue, but for now
my local branch has the following patch order (git log sequence):

 btrfs: allow compression even if the range is not page aligned
 btrfs: do not assume the full page range is not dirty in extent_writepage_io()
 btrfs: make extent_range_clear_diryt_for_io() to handle sector size < page size cases
 btrfs: wait for writeback if sector size is smaller than page size
 btrfs: compression: add an ASSERT() to ensure the read-in length is sane
 btrfs: zstd: make the compression path to handle sector size < page size
 btrfs: zlib: make the compression path to handle sector size < page size
---
 fs/btrfs/extent_io.c | 2 --
 1 file changed, 2 deletions(-)

Comments

David Sterba Sept. 17, 2024, 4:59 p.m. UTC | #1
On Wed, Sep 11, 2024 at 09:51:02AM +0930, Qu Wenruo wrote:
> The function extent_writepage_io() will submit the dirty sectors inside
> the page for the write.
> 
> But recently to co-operate with the incoming subpage compression
> enhancement, a new bitmap is introduced to
> btrfs_bio_ctrl::submit_bitmap, to only avoid a subset of the dirty
> range.
> 
> This is because we can have the following cases with 64K page size:
> 
>     0      16K       32K       48K       64K
>     |      |/////////|         |/|
>                                  52K
> 
> For range [16K, 32K), we queue the dirty range for compression, which is
> ran in a delayed workqueue.
> Then for range [48K, 52K), we go through the regular submission path.
> 
> In that case, our btrfs_bio_ctrl::submit_bitmap will exclude the range
> [16K, 32K).
> 
> The dirty flags for the range [16K, 32K) is only cleared when the
> compression is done, by the extent_clear_unlock_delalloc() call inside
> submit_one_async_extent().
> 
> This patch fix the false alert by removing the
> btrfs_folio_assert_not_dirty() check, since it's no longer correct for
> subpage compression cases.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> This should be the last patch before the enablement of sector perfect
> compression support for subpage.
> 
> Locally I have already been testing the sector perfect compression, and
> that's the last fix.
> 
> All the subpage related fixes can be applied out of order as long as the
> final enablement patch is at the end of the queue, but for now
> my local branch has the following patch order (git log sequence):
> 
>  btrfs: allow compression even if the range is not page aligned
>  btrfs: do not assume the full page range is not dirty in extent_writepage_io()
>  btrfs: make extent_range_clear_diryt_for_io() to handle sector size < page size cases
>  btrfs: wait for writeback if sector size is smaller than page size
>  btrfs: compression: add an ASSERT() to ensure the read-in length is sane
>  btrfs: zstd: make the compression path to handle sector size < page size
>  btrfs: zlib: make the compression path to handle sector size < page size

That's great news. I don't think there's anything else of comparable
size missing from the subpage support.

We're now in the middle of merge window but as the pull request has been
merged there's nothing blocking for-next so you can post the patches and
then add them.
Qu Wenruo Sept. 17, 2024, 9:23 p.m. UTC | #2
在 2024/9/18 02:29, David Sterba 写道:
> On Wed, Sep 11, 2024 at 09:51:02AM +0930, Qu Wenruo wrote:
>> The function extent_writepage_io() will submit the dirty sectors inside
>> the page for the write.
>>
>> But recently to co-operate with the incoming subpage compression
>> enhancement, a new bitmap is introduced to
>> btrfs_bio_ctrl::submit_bitmap, to only avoid a subset of the dirty
>> range.
>>
>> This is because we can have the following cases with 64K page size:
>>
>>      0      16K       32K       48K       64K
>>      |      |/////////|         |/|
>>                                   52K
>>
>> For range [16K, 32K), we queue the dirty range for compression, which is
>> ran in a delayed workqueue.
>> Then for range [48K, 52K), we go through the regular submission path.
>>
>> In that case, our btrfs_bio_ctrl::submit_bitmap will exclude the range
>> [16K, 32K).
>>
>> The dirty flags for the range [16K, 32K) is only cleared when the
>> compression is done, by the extent_clear_unlock_delalloc() call inside
>> submit_one_async_extent().
>>
>> This patch fix the false alert by removing the
>> btrfs_folio_assert_not_dirty() check, since it's no longer correct for
>> subpage compression cases.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> This should be the last patch before the enablement of sector perfect
>> compression support for subpage.
>>
>> Locally I have already been testing the sector perfect compression, and
>> that's the last fix.
>>
>> All the subpage related fixes can be applied out of order as long as the
>> final enablement patch is at the end of the queue, but for now
>> my local branch has the following patch order (git log sequence):
>>
>>   btrfs: allow compression even if the range is not page aligned
>>   btrfs: do not assume the full page range is not dirty in extent_writepage_io()
>>   btrfs: make extent_range_clear_diryt_for_io() to handle sector size < page size cases
>>   btrfs: wait for writeback if sector size is smaller than page size
>>   btrfs: compression: add an ASSERT() to ensure the read-in length is sane
>>   btrfs: zstd: make the compression path to handle sector size < page size
>>   btrfs: zlib: make the compression path to handle sector size < page size
>
> That's great news. I don't think there's anything else of comparable
> size missing from the subpage support.

Well, one more submitted series, which slightly reworks the delalloc
locking inside a page, to fix a possible double folio unlock which leads
to some hang:

https://lore.kernel.org/linux-btrfs/cover.1726441226.git.wqu@suse.com/

Then we're able to enable the feature:

https://lore.kernel.org/linux-btrfs/05299dac9e4379aff3f24986b4ca3fafb04d5d6a.1726527472.git.wqu@suse.com/


>
> We're now in the middle of merge window but as the pull request has been
> merged there's nothing blocking for-next so you can post the patches and
> then add them.

Sure, but IIRC none of the small fixes gets reviewed, I'm still waiting
for feedback.

Appreciate any review on those fixes, especially for those compression
path ones (the first 3), as they are really small fixes and will still
be needed even if we later change how we submit compressed writes.

Thanks,
Qu
David Sterba Sept. 19, 2024, 1:07 p.m. UTC | #3
On Wed, Sep 18, 2024 at 06:53:32AM +0930, Qu Wenruo wrote:
> >>   btrfs: allow compression even if the range is not page aligned
> >>   btrfs: do not assume the full page range is not dirty in extent_writepage_io()
> >>   btrfs: make extent_range_clear_diryt_for_io() to handle sector size < page size cases
> >>   btrfs: wait for writeback if sector size is smaller than page size
> >>   btrfs: compression: add an ASSERT() to ensure the read-in length is sane
> >>   btrfs: zstd: make the compression path to handle sector size < page size
> >>   btrfs: zlib: make the compression path to handle sector size < page size
> >
> > That's great news. I don't think there's anything else of comparable
> > size missing from the subpage support.
> 
> Well, one more submitted series, which slightly reworks the delalloc
> locking inside a page, to fix a possible double folio unlock which leads
> to some hang:
> 
> https://lore.kernel.org/linux-btrfs/cover.1726441226.git.wqu@suse.com/
> 
> Then we're able to enable the feature:
> 
> https://lore.kernel.org/linux-btrfs/05299dac9e4379aff3f24986b4ca3fafb04d5d6a.1726527472.git.wqu@suse.com/
> 
> > We're now in the middle of merge window but as the pull request has been
> > merged there's nothing blocking for-next so you can post the patches and
> > then add them.
> 
> Sure, but IIRC none of the small fixes gets reviewed, I'm still waiting
> for feedback.

I've read it, from high level POV it all seems OK. I haven't given it a
rev-by yet because I'm not sure there could be some corner case left,
with the page and bitmap state handling. But if you don't get any other
feedback please add the patches to for-next so we can get early testing.

> Appreciate any review on those fixes, especially for those compression
> path ones (the first 3), as they are really small fixes and will still
> be needed even if we later change how we submit compressed writes.

Well, yeah small changes with potentially big consequences. I have the
patches in my misc-next to check the default case.
diff mbox series

Patch

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 644e00d5b0f8..982bb469046f 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1391,8 +1391,6 @@  static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
 			goto out;
 		submitted_io = true;
 	}
-
-	btrfs_folio_assert_not_dirty(fs_info, folio, start, len);
 out:
 	/*
 	 * If we didn't submitted any sector (>= i_size), folio dirty get