[04/31] ext4: Convert ext4_finish_bio() to use folios

Message ID	20230126202415.1682629-5-willy@infradead.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-fsdevel-owner@vger.kernel.org> From: "Matthew Wilcox (Oracle)" <willy@infradead.org> To: "Theodore Tso" <tytso@mit.edu>, Andreas Dilger <adilger.kernel@dilger.ca> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>, linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios Date: Thu, 26 Jan 2023 20:23:48 +0000 Message-Id: <20230126202415.1682629-5-willy@infradead.org> In-Reply-To: <20230126202415.1682629-1-willy@infradead.org> References: <20230126202415.1682629-1-willy@infradead.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	Convert most of ext4 to folios \| expand [00/31] Convert most of ext4 to folios [01/31] fs: Add FGP_WRITEBEGIN [02/31] fscrypt: Add some folio helper functions [03/31] ext4: Convert ext4_bio_write_page() to use a folio [04/31] ext4: Convert ext4_finish_bio() to use folios [05/31] ext4: Convert ext4_writepage() to use a folio [06/31] ext4: Turn mpage_process_page() into mpage_process_folio() [07/31] ext4: Convert mpage_submit_page() to mpage_submit_folio() [08/31] ext4: Convert ext4_bio_write_page() to ext4_bio_write_folio() [09/31] ext4: Convert ext4_readpage_inline() to take a folio [10/31] ext4: Convert ext4_convert_inline_data_to_extent() to use a folio [11/31] ext4: Convert ext4_try_to_write_inline_data() to use a folio [12/31] ext4: Convert ext4_da_convert_inline_data_to_extent() to use a folio [13/31] ext4: Convert ext4_da_write_inline_data_begin() to use a folio [14/31] ext4: Convert ext4_read_inline_page() to ext4_read_inline_folio() [15/31] ext4: Convert ext4_write_inline_data_end() to use a folio [16/31] ext4: Convert ext4_write_begin() to use a folio [17/31] ext4: Convert ext4_write_end() to use a folio [18/31] ext4: Use a folio in ext4_journalled_write_end() [19/31] ext4: Convert ext4_journalled_zero_new_buffers() to use a folio [20/31] ext4: Convert __ext4_block_zero_page_range() to use a folio [21/31] ext4: Convert __ext4_journalled_writepage() to take a folio [22/31] ext4: Convert ext4_page_nomap_can_writeout() to take a folio [23/31] ext4: Use a folio in ext4_da_write_begin() [24/31] ext4: Convert ext4_mpage_readpages() to work on folios [25/31] ext4: Convert ext4_block_write_begin() to take a folio [26/31] ext4: Convert ext4_writepage() to take a folio [27/31] ext4: Use a folio in ext4_page_mkwrite() [28/31] ext4: Use a folio iterator in __read_end_io() [29/31] ext4: Convert mext_page_mkuptodate() to take a folio [30/31] ext4: Convert pagecache_read() to use a folio [31/31] ext4: Use a folio in ext4_read_merkle_tree_page

Matthew Wilcox Jan. 26, 2023, 8:23 p.m. UTC

Prepare ext4 to support large folios in the page writeback path.
Also set the actual error in the mapping, not just -EIO.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/ext4/page-io.c | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

Ritesh Harjani (IBM) March 6, 2023, 9:10 a.m. UTC | #1

"Matthew Wilcox (Oracle)" <willy@infradead.org> writes:

> Prepare ext4 to support large folios in the page writeback path.

Sure. I am guessing for ext4 to completely support large folio
requires more work like fscrypt bounce page handling doesn't
yet support folios right?

Could you please give a little background on what all would be required
to add large folio support in ext4 buffered I/O path?
(I mean ofcourse other than saying move ext4 to iomap ;))

What I was interested in was, what other components in particular for
e.g. fscrypt, fsverity, ext4's xyz component needs large folio support?

And how should one go about in adding this support? So can we move
ext4's read path to have large folio support to get started?
Have you already identified what all is missing from this path to
convert it?

> Also set the actual error in the mapping, not just -EIO.

Right. I looked at the history and I think it always just had EIO.
I think setting the actual err in mapping_set_error() is the right thing
to do here.

>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

w.r.t this patch series. I reviewed the mechanical changes & error paths
which converts ext4 ext4_finish_bio() to use folio.

The changes looks good to me from that perspective. Feel free to add -
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

> ---
>  fs/ext4/page-io.c | 32 ++++++++++++++++----------------
>  1 file changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
> index 982791050892..fd6c0dca24b9 100644
> --- a/fs/ext4/page-io.c
> +++ b/fs/ext4/page-io.c
> @@ -99,30 +99,30 @@ static void buffer_io_error(struct buffer_head *bh)
>
>  static void ext4_finish_bio(struct bio *bio)
>  {
> -	struct bio_vec *bvec;
> -	struct bvec_iter_all iter_all;
> +	struct folio_iter fi;
>
> -	bio_for_each_segment_all(bvec, bio, iter_all) {
> -		struct page *page = bvec->bv_page;
> -		struct page *bounce_page = NULL;
> +	bio_for_each_folio_all(fi, bio) {
> +		struct folio *folio = fi.folio;
> +		struct folio *io_folio = NULL;
>  		struct buffer_head *bh, *head;
> -		unsigned bio_start = bvec->bv_offset;
> -		unsigned bio_end = bio_start + bvec->bv_len;
> +		size_t bio_start = fi.offset;
> +		size_t bio_end = bio_start + fi.length;
>  		unsigned under_io = 0;
>  		unsigned long flags;
>
> -		if (fscrypt_is_bounce_page(page)) {
> -			bounce_page = page;
> -			page = fscrypt_pagecache_page(bounce_page);
> +		if (fscrypt_is_bounce_folio(folio)) {
> +			io_folio = folio;
> +			folio = fscrypt_pagecache_folio(folio);
>  		}
>
>  		if (bio->bi_status) {
> -			SetPageError(page);
> -			mapping_set_error(page->mapping, -EIO);
> +			int err = blk_status_to_errno(bio->bi_status);
> +			folio_set_error(folio);
> +			mapping_set_error(folio->mapping, err);
>  		}
> -		bh = head = page_buffers(page);
> +		bh = head = folio_buffers(folio);
>  		/*
> -		 * We check all buffers in the page under b_uptodate_lock
> +		 * We check all buffers in the folio under b_uptodate_lock
>  		 * to avoid races with other end io clearing async_write flags
>  		 */
>  		spin_lock_irqsave(&head->b_uptodate_lock, flags);
> @@ -141,8 +141,8 @@ static void ext4_finish_bio(struct bio *bio)
>  		} while ((bh = bh->b_this_page) != head);
>  		spin_unlock_irqrestore(&head->b_uptodate_lock, flags);
>  		if (!under_io) {
> -			fscrypt_free_bounce_page(bounce_page);
> -			end_page_writeback(page);
> +			fscrypt_free_bounce_page(&io_folio->page);

Could you please help understand what would it take to convert bounce
page in fscrypt to folio?

Today, we allocate 32 bounce pages of order 0 via mempool in
    fscrypt_initialize()
    <...>
        fscrypt_bounce_page_pool =
            mempool_create_page_pool(num_prealloc_crypto_pages, 0);
    <...>

And IIUC, we might need to add some support for having higher order
pages in the pool so that one can allocate a folio->_folio_order
folio from this pool for bounce page to support large folio.
Is that understanding correct? Your thoughts on this please?

-ritesh

Theodore Ts'o March 14, 2023, 10:08 p.m. UTC | #2

On Thu, Jan 26, 2023 at 08:23:48PM +0000, Matthew Wilcox (Oracle) wrote:
> Prepare ext4 to support large folios in the page writeback path.
> Also set the actual error in the mapping, not just -EIO.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Theodore Ts'o <tytso@mit.edu>

Matthew Wilcox March 23, 2023, 3:26 a.m. UTC | #3

On Mon, Mar 06, 2023 at 02:40:55PM +0530, Ritesh Harjani wrote:
> "Matthew Wilcox (Oracle)" <willy@infradead.org> writes:
> 
> > Prepare ext4 to support large folios in the page writeback path.
> 
> Sure. I am guessing for ext4 to completely support large folio
> requires more work like fscrypt bounce page handling doesn't
> yet support folios right?
> 
> Could you please give a little background on what all would be required
> to add large folio support in ext4 buffered I/O path?
> (I mean ofcourse other than saying move ext4 to iomap ;))
> 
> What I was interested in was, what other components in particular for
> e.g. fscrypt, fsverity, ext4's xyz component needs large folio support?
> 
> And how should one go about in adding this support? So can we move
> ext4's read path to have large folio support to get started?
> Have you already identified what all is missing from this path to
> convert it?

Honestly, I don't know what else needs to be done beyond this patch
series.  I can point at some stuff and say "This doesn't work", but in
general, you have to just enable it and see what breaks.  A lot of the
buffer_head code is not large-folio safe right now, so that's somewhere
to go and look.  Or maybe we "just" convert to iomap, and never bother
fixing the bufferhead code for large folios.

> > Also set the actual error in the mapping, not just -EIO.
> 
> Right. I looked at the history and I think it always just had EIO.
> I think setting the actual err in mapping_set_error() is the right thing
> to do here.
> 
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> 
> w.r.t this patch series. I reviewed the mechanical changes & error paths
> which converts ext4 ext4_finish_bio() to use folio.
> 
> The changes looks good to me from that perspective. Feel free to add -
> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

Thanks!

Darrick J. Wong March 23, 2023, 2:51 p.m. UTC | #4

On Thu, Mar 23, 2023 at 03:26:43AM +0000, Matthew Wilcox wrote:
> On Mon, Mar 06, 2023 at 02:40:55PM +0530, Ritesh Harjani wrote:
> > "Matthew Wilcox (Oracle)" <willy@infradead.org> writes:
> > 
> > > Prepare ext4 to support large folios in the page writeback path.
> > 
> > Sure. I am guessing for ext4 to completely support large folio
> > requires more work like fscrypt bounce page handling doesn't
> > yet support folios right?
> > 
> > Could you please give a little background on what all would be required
> > to add large folio support in ext4 buffered I/O path?
> > (I mean ofcourse other than saying move ext4 to iomap ;))
> > 
> > What I was interested in was, what other components in particular for
> > e.g. fscrypt, fsverity, ext4's xyz component needs large folio support?
> > 
> > And how should one go about in adding this support? So can we move
> > ext4's read path to have large folio support to get started?
> > Have you already identified what all is missing from this path to
> > convert it?
> 
> Honestly, I don't know what else needs to be done beyond this patch
> series.  I can point at some stuff and say "This doesn't work", but in
> general, you have to just enable it and see what breaks.  A lot of the
> buffer_head code is not large-folio safe right now, so that's somewhere
> to go and look.  Or maybe we "just" convert to iomap, and never bother
> fixing the bufferhead code for large folios.

Yes.  Let's leave bufferheads in the legacy doo-doo-dooooo basement
instead of wasting more time on them.  Ideally we'd someday run all the
filesystems through:

bufferheads -> iomap with bufferheads -> iomap with folios -> iomap with
large folios -> retire to somewhere cheaper than Hawaii

--D

> > > Also set the actual error in the mapping, not just -EIO.
> > 
> > Right. I looked at the history and I think it always just had EIO.
> > I think setting the actual err in mapping_set_error() is the right thing
> > to do here.
> > 
> > >
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > 
> > w.r.t this patch series. I reviewed the mechanical changes & error paths
> > which converts ext4 ext4_finish_bio() to use folio.
> > 
> > The changes looks good to me from that perspective. Feel free to add -
> > Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> 
> Thanks!

Matthew Wilcox March 23, 2023, 3:30 p.m. UTC | #5

On Thu, Mar 23, 2023 at 07:51:09AM -0700, Darrick J. Wong wrote:
> On Thu, Mar 23, 2023 at 03:26:43AM +0000, Matthew Wilcox wrote:
> > On Mon, Mar 06, 2023 at 02:40:55PM +0530, Ritesh Harjani wrote:
> > > "Matthew Wilcox (Oracle)" <willy@infradead.org> writes:
> > > 
> > > > Prepare ext4 to support large folios in the page writeback path.
> > > 
> > > Sure. I am guessing for ext4 to completely support large folio
> > > requires more work like fscrypt bounce page handling doesn't
> > > yet support folios right?
> > > 
> > > Could you please give a little background on what all would be required
> > > to add large folio support in ext4 buffered I/O path?
> > > (I mean ofcourse other than saying move ext4 to iomap ;))
> > > 
> > > What I was interested in was, what other components in particular for
> > > e.g. fscrypt, fsverity, ext4's xyz component needs large folio support?
> > > 
> > > And how should one go about in adding this support? So can we move
> > > ext4's read path to have large folio support to get started?
> > > Have you already identified what all is missing from this path to
> > > convert it?
> > 
> > Honestly, I don't know what else needs to be done beyond this patch
> > series.  I can point at some stuff and say "This doesn't work", but in
> > general, you have to just enable it and see what breaks.  A lot of the
> > buffer_head code is not large-folio safe right now, so that's somewhere
> > to go and look.  Or maybe we "just" convert to iomap, and never bother
> > fixing the bufferhead code for large folios.
> 
> Yes.  Let's leave bufferheads in the legacy doo-doo-dooooo basement
> instead of wasting more time on them.  Ideally we'd someday run all the
> filesystems through:
> 
> bufferheads -> iomap with bufferheads -> iomap with folios -> iomap with
> large folios -> retire to somewhere cheaper than Hawaii

Places cheaper than Hawaii probably aren't as pretty as Hawaii though :-(

XFS is fine because it uses xfs_buf, but if we don't add support for
large folios to bufferheads, we can't support LBA size > PAGE_SIZE even
to read the superblock.  Maybe that's fine ... only filesystems which
don't use sb_bread() get to support LBA size > PAGE_SIZE.

I really want to see a cheaper abstraction for accessing the block device
than BHs.  Or xfs_buf for that matter.

Christoph Hellwig March 27, 2023, 12:57 a.m. UTC | #6

On Thu, Mar 23, 2023 at 07:51:09AM -0700, Darrick J. Wong wrote:
> Yes.  Let's leave bufferheads in the legacy doo-doo-dooooo basement
> instead of wasting more time on them.  Ideally we'd someday run all the
> filesystems through:
> 
> bufferheads -> iomap with bufferheads -> iomap with folios -> iomap with
> large folios -> retire to somewhere cheaper than Hawaii

For a lot of the legacy stuff (and with that I don't mean ext4) we'd
really need volunteers to do any work other than typo fixing and
cosmetic cleanups.  I suspect just dropping many of them is the only
thing we can do long term.

But even if we do the above for the data path for all file systems
remaining in tree, we still have buffer_heads for metadata.  And
I think buffered_heads really are more or less the right abstraction
there anyway.  And for these existing file systems we also do not
care about using large folios for metadata caching anyway.  The
only nasty part is that these buffer_heads use the same mapping
as all block device access, so we'll need to find a way to use
large folios for the block device mapping for the LBA size > page size
case, while supporting buffer heads for those file systems.  Nothing
unsolvable, but a bit tricky.

Christoph Hellwig March 27, 2023, 12:58 a.m. UTC | #7

On Thu, Mar 23, 2023 at 03:30:38PM +0000, Matthew Wilcox wrote:
> I really want to see a cheaper abstraction for accessing the block device
> than BHs.  Or xfs_buf for that matter.

You literally can just use the bdev page cache using the normal page
cache helpers.  It's not quite what most of these file systems expect,
though, especially for the block size < PAGE_SIZE case.

[04/31] ext4: Convert ext4_finish_bio() to use folios

Commit Message

Comments

Patch