Message ID | 20240926140121.203821-1-kernel@pankajraghav.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
Series | fs/writeback: convert wbc_account_cgroup_owner to take a folio | expand |
On Thu, Sep 26, 2024 at 04:01:21PM +0200, Pankaj Raghav (Samsung) wrote: > Convert wbc_account_cgroup_owner() to take a folio instead of a page, > and convert all callers to pass a folio directly except f2fs. > > Convert the page to folio for all the callers from f2fs as they were the > only callers calling wbc_account_cgroup_owner() with a page. As f2fs is > already in the process of converting to folios, these call sites might > also soon be calling wbc_account_cgroup_owner() with a folio directly in > the future. I was hoping for more from f2fs. I still don't have an answer from them whether they're going to support large folios. There's all kinds of crud already in these functions like: f2fs_set_bio_crypt_ctx(bio, fio->page->mapping->host, page_folio(fio->page)->index, fio, GFP_NOIO); and this patch is making it worse, not better. A series of patches which at least started to spread folios throughout f2fs would be better. I think that struct f2fs_io_info should have its page converted to a folio, for example. Although maybe not; perhaps this structure can carry data which doesn't belong to a folio that came from the page cache. It's very hard to tell because f2fs is so mind-numbingly complex and riddled with stupid abstraction layers. But I don't know what the f2fs maintainers have planned. And they won't tell me despite many times of asking.
On Thu, Sep 26, 2024 at 04:01:21PM +0200, Pankaj Raghav (Samsung) wrote: > From: Pankaj Raghav <p.raghav@samsung.com> > > Most of the callers of wbc_account_cgroup_owner() are converting a folio > to page before calling the function. wbc_account_cgroup_owner() is > converting the page back to a folio to call mem_cgroup_css_from_folio(). > > Convert wbc_account_cgroup_owner() to take a folio instead of a page, > and convert all callers to pass a folio directly except f2fs. > > Convert the page to folio for all the callers from f2fs as they were the > only callers calling wbc_account_cgroup_owner() with a page. As f2fs is > already in the process of converting to folios, these call sites might > also soon be calling wbc_account_cgroup_owner() with a folio directly in > the future. > > No functional changes. Only compile tested. > > Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> From cgroup writeback POV: Acked-by: Tejun Heo <tj@kernel.org> Thanks.
On 09/26, Matthew Wilcox wrote: > On Thu, Sep 26, 2024 at 04:01:21PM +0200, Pankaj Raghav (Samsung) wrote: > > Convert wbc_account_cgroup_owner() to take a folio instead of a page, > > and convert all callers to pass a folio directly except f2fs. > > > > Convert the page to folio for all the callers from f2fs as they were the > > only callers calling wbc_account_cgroup_owner() with a page. As f2fs is > > already in the process of converting to folios, these call sites might > > also soon be calling wbc_account_cgroup_owner() with a folio directly in > > the future. > > I was hoping for more from f2fs. I still don't have an answer from them > whether they're going to support large folios. There's all kinds of > crud already in these functions like: > > f2fs_set_bio_crypt_ctx(bio, fio->page->mapping->host, > page_folio(fio->page)->index, fio, GFP_NOIO); > > and this patch is making it worse, not better. A series of patches > which at least started to spread folios throughout f2fs would be better. > I think that struct f2fs_io_info should have its page converted to > a folio, for example. Although maybe not; perhaps this structure can > carry data which doesn't belong to a folio that came from the page cache. > It's very hard to tell because f2fs is so mind-numbingly complex and > riddled with stupid abstraction layers. Hah, I don't think it's too complex at all tho, there's a somewhat complexity to support file-based encryption, compression, and fsverity, which are useful for Android users. Well, I don't see any strong needs to support large folio, but some requests exist which was why we had to do some conversion. > > But I don't know what the f2fs maintainers have planned. And they won't > tell me despite many times of asking.
On 2024/10/1 6:58, Jaegeuk Kim wrote: > On 09/26, Matthew Wilcox wrote: >> On Thu, Sep 26, 2024 at 04:01:21PM +0200, Pankaj Raghav (Samsung) wrote: >>> Convert wbc_account_cgroup_owner() to take a folio instead of a page, >>> and convert all callers to pass a folio directly except f2fs. >>> >>> Convert the page to folio for all the callers from f2fs as they were the >>> only callers calling wbc_account_cgroup_owner() with a page. As f2fs is >>> already in the process of converting to folios, these call sites might >>> also soon be calling wbc_account_cgroup_owner() with a folio directly in >>> the future. >> >> I was hoping for more from f2fs. I still don't have an answer from them >> whether they're going to support large folios. There's all kinds of >> crud already in these functions like: >> >> f2fs_set_bio_crypt_ctx(bio, fio->page->mapping->host, >> page_folio(fio->page)->index, fio, GFP_NOIO); >> >> and this patch is making it worse, not better. A series of patches >> which at least started to spread folios throughout f2fs would be better. >> I think that struct f2fs_io_info should have its page converted to >> a folio, for example. Although maybe not; perhaps this structure can >> carry data which doesn't belong to a folio that came from the page cache. >> It's very hard to tell because f2fs is so mind-numbingly complex and >> riddled with stupid abstraction layers. > > Hah, I don't think it's too complex at all tho, there's a somewhat complexity to > support file-based encryption, compression, and fsverity, which are useful I agree w/ Jaegeuk. > for Android users. Well, I don't see any strong needs to support large folio, > but some requests exist which was why we had to do some conversion. > >> >> But I don't know what the f2fs maintainers have planned. And they won't >> tell me despite many times of asking. I supported large folio in f2fs by using a hacking way /w iomap fwk, it can only be enabled in very limited condition, after some seqread tests, I can see performance gain in server environment, but none in android device, and in addition, there is a memory leak bug which can cause out-of-memory issue. Unlucky, I have no slots to dig into these issues recently. Thanks,
On Thu, Sep 26, 2024 at 04:01:21PM +0200, Pankaj Raghav (Samsung) wrote: > From: Pankaj Raghav <p.raghav@samsung.com> > > Most of the callers of wbc_account_cgroup_owner() are converting a folio > to page before calling the function. wbc_account_cgroup_owner() is > converting the page back to a folio to call mem_cgroup_css_from_folio(). > > Convert wbc_account_cgroup_owner() to take a folio instead of a page, > and convert all callers to pass a folio directly except f2fs. > > Convert the page to folio for all the callers from f2fs as they were the > only callers calling wbc_account_cgroup_owner() with a page. As f2fs is > already in the process of converting to folios, these call sites might > also soon be calling wbc_account_cgroup_owner() with a folio directly in > the future. > > No functional changes. Only compile tested. > > Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> > --- > fs/btrfs/extent_io.c | 7 +++---- > fs/btrfs/inode.c | 2 +- If you ever want to continue with the conversion then for the btrfs part Acked-by: David Sterba <dsterba@suse.com>
On Thu, 26 Sep 2024 16:01:21 +0200, Pankaj Raghav (Samsung) wrote: > Most of the callers of wbc_account_cgroup_owner() are converting a folio > to page before calling the function. wbc_account_cgroup_owner() is > converting the page back to a folio to call mem_cgroup_css_from_folio(). > > Convert wbc_account_cgroup_owner() to take a folio instead of a page, > and convert all callers to pass a folio directly except f2fs. > > [...] Applied to the vfs.misc branch of the vfs/vfs.git tree. Patches in the vfs.misc branch should appear in linux-next soon. Please report any outstanding bugs that were missed during review in a new review to the original patch series allowing us to drop it. It's encouraged to provide Acked-bys and Reviewed-bys even though the patch has now been applied. If possible patch trailers will be updated. Note that commit hashes shown below are subject to change due to rebase, trailer updates or similar. If in doubt, please check the listed branch. tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git branch: vfs.misc [1/1] fs/writeback: convert wbc_account_cgroup_owner to take a folio https://git.kernel.org/vfs/vfs/c/30dac24e14b5
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst index 69af2173555fb..064012ea6f366 100644 --- a/Documentation/admin-guide/cgroup-v2.rst +++ b/Documentation/admin-guide/cgroup-v2.rst @@ -2945,7 +2945,7 @@ following two functions. a queue (device) has been associated with the bio and before submission. - wbc_account_cgroup_owner(@wbc, @page, @bytes) + wbc_account_cgroup_owner(@wbc, @folio, @bytes) Should be called for each data segment being written out. While this function doesn't care exactly when it's called during the writeback session, it's the easiest and most diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 39c9677c47d5a..4667d1e034e0e 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -785,7 +785,7 @@ static void submit_extent_folio(struct btrfs_bio_ctrl *bio_ctrl, } if (bio_ctrl->wbc) - wbc_account_cgroup_owner(bio_ctrl->wbc, &folio->page, + wbc_account_cgroup_owner(bio_ctrl->wbc, folio, len); size -= len; @@ -1707,7 +1707,7 @@ static noinline_for_stack void write_one_eb(struct extent_buffer *eb, ret = bio_add_folio(&bbio->bio, folio, eb->len, eb->start - folio_pos(folio)); ASSERT(ret); - wbc_account_cgroup_owner(wbc, folio_page(folio, 0), eb->len); + wbc_account_cgroup_owner(wbc, folio, eb->len); folio_unlock(folio); } else { int num_folios = num_extent_folios(eb); @@ -1721,8 +1721,7 @@ static noinline_for_stack void write_one_eb(struct extent_buffer *eb, folio_start_writeback(folio); ret = bio_add_folio(&bbio->bio, folio, eb->folio_size, 0); ASSERT(ret); - wbc_account_cgroup_owner(wbc, folio_page(folio, 0), - eb->folio_size); + wbc_account_cgroup_owner(wbc, folio, eb->folio_size); wbc->nr_to_write -= folio_nr_pages(folio); folio_unlock(folio); } diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index edac499fd83d2..eb64f04755c23 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1729,7 +1729,7 @@ static bool run_delalloc_compressed(struct btrfs_inode *inode, * need full accuracy. Just account the whole thing * against the first page. */ - wbc_account_cgroup_owner(wbc, &locked_folio->page, + wbc_account_cgroup_owner(wbc, locked_folio, cur_end - start); async_chunk[i].locked_folio = locked_folio; locked_folio = NULL; diff --git a/fs/buffer.c b/fs/buffer.c index 1fc9a50def0b5..32bd0f4c42236 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -2803,7 +2803,7 @@ static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh, bio->bi_iter.bi_sector = bh->b_blocknr * (bh->b_size >> 9); bio->bi_write_hint = write_hint; - __bio_add_page(bio, bh->b_page, bh->b_size, bh_offset(bh)); + bio_add_folio_nofail(bio, bh->b_folio, bh->b_size, bh_offset(bh)); bio->bi_end_io = end_bio_bh_io_sync; bio->bi_private = bh; @@ -2813,7 +2813,7 @@ static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh, if (wbc) { wbc_init_bio(wbc, bio); - wbc_account_cgroup_owner(wbc, bh->b_page, bh->b_size); + wbc_account_cgroup_owner(wbc, bh->b_folio, bh->b_size); } submit_bio(bio); diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c index ad5543866d215..b7b9261fec3b5 100644 --- a/fs/ext4/page-io.c +++ b/fs/ext4/page-io.c @@ -421,7 +421,7 @@ static void io_submit_add_bh(struct ext4_io_submit *io, io_submit_init_bio(io, bh); if (!bio_add_folio(io->io_bio, io_folio, bh->b_size, bh_offset(bh))) goto submit_and_retry; - wbc_account_cgroup_owner(io->io_wbc, &folio->page, bh->b_size); + wbc_account_cgroup_owner(io->io_wbc, folio, bh->b_size); io->io_next_block++; } diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 94f7b084f6016..e3ce763cce18f 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -711,7 +711,8 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio) } if (fio->io_wbc && !is_read_io(fio->op)) - wbc_account_cgroup_owner(fio->io_wbc, fio->page, PAGE_SIZE); + wbc_account_cgroup_owner(fio->io_wbc, page_folio(fio->page), + PAGE_SIZE); inc_page_count(fio->sbi, is_read_io(fio->op) ? __read_io_type(page) : WB_DATA_TYPE(fio->page, false)); @@ -911,7 +912,8 @@ int f2fs_merge_page_bio(struct f2fs_io_info *fio) } if (fio->io_wbc) - wbc_account_cgroup_owner(fio->io_wbc, fio->page, PAGE_SIZE); + wbc_account_cgroup_owner(fio->io_wbc, page_folio(fio->page), + PAGE_SIZE); inc_page_count(fio->sbi, WB_DATA_TYPE(page, false)); @@ -1011,7 +1013,8 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio) } if (fio->io_wbc) - wbc_account_cgroup_owner(fio->io_wbc, fio->page, PAGE_SIZE); + wbc_account_cgroup_owner(fio->io_wbc, page_folio(fio->page), + PAGE_SIZE); io->last_block_in_bio = fio->new_blkaddr; diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index d8bec3c1bb1fa..2391b09f4cede 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -890,17 +890,16 @@ EXPORT_SYMBOL_GPL(wbc_detach_inode); /** * wbc_account_cgroup_owner - account writeback to update inode cgroup ownership * @wbc: writeback_control of the writeback in progress - * @page: page being written out + * @folio: folio being written out * @bytes: number of bytes being written out * - * @bytes from @page are about to written out during the writeback + * @bytes from @folio are about to written out during the writeback * controlled by @wbc. Keep the book for foreign inode detection. See * wbc_detach_inode(). */ -void wbc_account_cgroup_owner(struct writeback_control *wbc, struct page *page, +void wbc_account_cgroup_owner(struct writeback_control *wbc, struct folio *folio, size_t bytes) { - struct folio *folio; struct cgroup_subsys_state *css; int id; @@ -913,7 +912,6 @@ void wbc_account_cgroup_owner(struct writeback_control *wbc, struct page *page, if (!wbc->wb || wbc->no_cgroup_owner) return; - folio = page_folio(page); css = mem_cgroup_css_from_folio(folio); /* dead cgroups shouldn't contribute to inode ownership arbitration */ if (!(css->flags & CSS_ONLINE)) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 11ea747228aee..3d1fae7d3a64e 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -1833,7 +1833,7 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, if (ifs) atomic_add(len, &ifs->write_bytes_pending); wpc->ioend->io_size += len; - wbc_account_cgroup_owner(wbc, &folio->page, len); + wbc_account_cgroup_owner(wbc, folio, len); return 0; } diff --git a/fs/mpage.c b/fs/mpage.c index b5b5ddf9d513d..82aecf3727437 100644 --- a/fs/mpage.c +++ b/fs/mpage.c @@ -606,7 +606,7 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc, * the confused fail path above (OOM) will be very confused when * it finds all bh marked clean (i.e. it will not write anything) */ - wbc_account_cgroup_owner(wbc, &folio->page, folio_size(folio)); + wbc_account_cgroup_owner(wbc, folio, folio_size(folio)); length = first_unmapped << blkbits; if (!bio_add_folio(bio, folio, length, 0)) { bio = mpage_bio_submit_write(bio); diff --git a/include/linux/writeback.h b/include/linux/writeback.h index d6db822e4bb30..641a057e04132 100644 --- a/include/linux/writeback.h +++ b/include/linux/writeback.h @@ -217,7 +217,7 @@ void wbc_attach_and_unlock_inode(struct writeback_control *wbc, struct inode *inode) __releases(&inode->i_lock); void wbc_detach_inode(struct writeback_control *wbc); -void wbc_account_cgroup_owner(struct writeback_control *wbc, struct page *page, +void wbc_account_cgroup_owner(struct writeback_control *wbc, struct folio *folio, size_t bytes); int cgroup_writeback_by_id(u64 bdi_id, int memcg_id, enum wb_reason reason, struct wb_completion *done); @@ -324,7 +324,7 @@ static inline void wbc_init_bio(struct writeback_control *wbc, struct bio *bio) } static inline void wbc_account_cgroup_owner(struct writeback_control *wbc, - struct page *page, size_t bytes) + struct folio *folio, size_t bytes) { }