Message ID | 20230315123233.121593-2-p.raghav@samsung.com
State      | New
Series     | convert page_endio to folio_endio
On Wed, Mar 15, 2023 at 01:32:31PM +0100, Pankaj Raghav wrote:
> page_endio() already works on folios by converting a page into a folio as
> the first step. Convert page_endio to folio_endio by taking a folio as the
> first parameter.
>
> Instead of doing a page to folio conversion in the page_endio()
> function, the consumers of this API do this conversion and call the
> folio_endio() function in this patch.
> The following patches will convert the consumers of this API to use native
> folio functions to pass to folio_endio().
>
> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>

Perhaps the hard question is which tree this would go through?

  Luis
On 3/15/23 13:32, Pankaj Raghav wrote:
> page_endio() already works on folios by converting a page into a folio as
> the first step. Convert page_endio to folio_endio by taking a folio as the
> first parameter.
>
> Instead of doing a page to folio conversion in the page_endio()
> function, the consumers of this API do this conversion and call the
> folio_endio() function in this patch.
> The following patches will convert the consumers of this API to use native
> folio functions to pass to folio_endio().
>
> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
> ---
>  drivers/block/zram/zram_drv.c | 2 +-
>  fs/mpage.c                    | 2 +-
>  fs/orangefs/inode.c           | 2 +-
>  include/linux/pagemap.h       | 2 +-
>  mm/filemap.c                  | 8 +++-----
>  5 files changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index aa490da3cef2..f441251c9138 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -610,7 +610,7 @@ static void zram_page_end_io(struct bio *bio)
>  {
>  	struct page *page = bio_first_page_all(bio);
>
> -	page_endio(page, op_is_write(bio_op(bio)),
> +	folio_endio(page_folio(page), op_is_write(bio_op(bio)),
>  			blk_status_to_errno(bio->bi_status));
>  	bio_put(bio);
>  }
> diff --git a/fs/mpage.c b/fs/mpage.c
> index 22b9de5ddd68..40e86e839e77 100644
> --- a/fs/mpage.c
> +++ b/fs/mpage.c
> @@ -50,7 +50,7 @@ static void mpage_end_io(struct bio *bio)
>
>  	bio_for_each_segment_all(bv, bio, iter_all) {
>  		struct page *page = bv->bv_page;
> -		page_endio(page, bio_op(bio),
> +		folio_endio(page_folio(page), bio_op(bio),
>  				blk_status_to_errno(bio->bi_status));
>  	}
>

Can't this be converted to use 'bio_for_each_folio_all()' instead of
bio_for_each_segment_all()?

Cheers,

Hannes
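For illustration, a minimal sketch of what Hannes's suggestion could look
like when applied to the patch as posted (this is not code from the thread;
bio_for_each_folio_all() and struct folio_iter are the existing helpers from
include/linux/bio.h):

static void mpage_end_io(struct bio *bio)
{
	struct folio_iter fi;

	/* Walk the bio folio by folio instead of segment by segment. */
	bio_for_each_folio_all(fi, bio) {
		/*
		 * op_is_write() turns the opcode into the bool that
		 * folio_endio() expects as its second argument.
		 */
		folio_endio(fi.folio, op_is_write(bio_op(bio)),
			    blk_status_to_errno(bio->bi_status));
	}

	bio_put(bio);
}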
Can we take a step back and figure out if page_endio is a good idea to
start with?

The zram usage seems clearly wrong to me. zram is a block driver and does
not own the pages, so it shouldn't touch any of the page state. It seems
like this mostly operates on its own pages allocated using alloc_page, so
the harm might not be horrible, at least.

orangefs uses it on readahead pages, with ret known for the whole
iteration. So one quick loop for the success case and one for the failure
case would look simpler and more obvious.

mpage really should use separate end_io handlers for read vs write, as
most other aops do.

So overall I'd be happier to just kill the helper.
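As a sketch only (not a patch from the thread) of what "not touching the
page state" would mean for zram, the end_io handler would shrink to just
releasing the bio, assuming nothing ever waits on those driver-owned pages:

static void zram_page_end_io(struct bio *bio)
{
	/*
	 * The page was allocated by zram itself via alloc_page() for
	 * partial-IO writeback, so no uptodate/writeback state needs to
	 * be updated on completion; just release the bio.
	 */
	bio_put(bio);
}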
Hi Christoph,

On 2023-03-15 15:56, Christoph Hellwig wrote:
> Can we take a step back and figure out if page_endio is a good
> idea to start with?
>
> The zram usage seems clearly wrong to me. zram is a block driver
> and does not own the pages, so it shouldn't touch any of the page
> state. It seems like this mostly operates on its own
> pages allocated using alloc_page, so the harm might not be horrible,
> at least.
>

It looks like this endio function is called when alloc_page is used (for
partial IO) to trigger writeback from user space
(`echo "idle" > /sys/block/zram0/writeback`).

I don't understand when you say the harm might not be horrible if we
don't call folio_endio here. Do you think it is safe to just remove the
call to the folio_endio function?

> orangefs uses it on readahead pages, with ret known for the whole
> iteration. So one quick loop for the success case and one for the
> failure case would look simpler and more obvious.
>

Got it. Something like this?

@@ -266,18 +266,23 @@ static void orangefs_readahead(struct readahead_control *rac)
 	iov_iter_xarray(&iter, ITER_DEST, i_pages, offset,
			readahead_length(rac));

 	/* read in the pages. */
-	if ((ret = wait_for_direct_io(ORANGEFS_IO_READ, inode,
-	    &offset, &iter, readahead_length(rac),
-	    inode->i_size, NULL, NULL, rac->file)) < 0)
+	if ((ret = wait_for_direct_io(ORANGEFS_IO_READ, inode, &offset, &iter,
+				      readahead_length(rac), inode->i_size,
+				      NULL, NULL, rac->file)) < 0) {
 		gossip_debug(GOSSIP_FILE_DEBUG,
-			"%s: wait_for_direct_io failed. \n", __func__);
-	else
-		ret = 0;
+			     "%s: wait_for_direct_io failed. \n", __func__);

-	/* clean up. */
-	while ((page = readahead_page(rac))) {
-		page_endio(page, false, ret);
-		put_page(page);
+		while ((folio = readahead_folio(rac))) {
+			folio_clear_uptodate(folio);
+			folio_set_error(folio);
+			folio_unlock(folio);
+		}
+		return;
+	}
+
+	while ((folio = readahead_folio(rac))) {
+		folio_mark_uptodate(folio);
+		folio_unlock(folio);
 	}
 }

> mpage really should use separate end_io handlers for read vs write,
> as most other aops do.
>

I don't know if this is the right abstraction to do the switch, but let
me know what you think:

@@ -59,7 +59,8 @@ static void mpage_end_io(struct bio *bio)
 static struct bio *mpage_bio_submit(struct bio *bio)
 {
-	bio->bi_end_io = mpage_end_io;
+	bio->bi_end_io = op_is_write(bio_op(bio)) ? mpage_write_end_io :
+						    mpage_read_end_io;
 	guard_bio_eod(bio);
 	submit_bio(bio);
 	return NULL;

And mpage_{write,read}_end_io will iterate over the folios and call the
respective functions.

> So overall I'd be happier to just kill the helper.
On Thu, Mar 16, 2023 at 11:04:54AM +0100, Pankaj Raghav wrote:
> It looks like this endio function is called when alloc_page is used (for
> partial IO) to trigger writeback from user space
> (`echo "idle" > /sys/block/zram0/writeback`).

Yes.

> I don't understand when you say the harm might not be horrible if we
> don't call folio_endio here. Do you think it is safe to just remove the
> call to the folio_endio function?

I suspect so. It doesn't seem like the involved pages are ever locked or
have writeback set, so it should be fine.

> +		while ((folio = readahead_folio(rac))) {
> +			folio_clear_uptodate(folio);
> +			folio_set_error(folio);
> +			folio_unlock(folio);
> +		}
> +		return;
> +	}
> +
> +	while ((folio = readahead_folio(rac))) {
> +		folio_mark_uptodate(folio);
> +		folio_unlock(folio);
> 	}
> }

Looks good.

> @@ -59,7 +59,8 @@ static void mpage_end_io(struct bio *bio)
>  static struct bio *mpage_bio_submit(struct bio *bio)
>  {
> -	bio->bi_end_io = mpage_end_io;
> +	bio->bi_end_io = op_is_write(bio_op(bio)) ? mpage_write_end_io :
> +						    mpage_read_end_io;
>  	guard_bio_eod(bio);
>  	submit_bio(bio);
>  	return NULL;
>
> And mpage_{write,read}_end_io will iterate over the folios and call the
> respective functions.

Yes, although I'd do it with a good old if/else and with fewer braces.
It might make sense to split mpage_bio_submit as well, though.
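One possible shape for the split Christoph hints at, sketched here as an
illustration (the mpage_read_end_io/mpage_write_end_io names come from the
snippet quoted above; the two submit helpers are hypothetical):

static struct bio *mpage_bio_submit_read(struct bio *bio)
{
	bio->bi_end_io = mpage_read_end_io;
	guard_bio_eod(bio);
	submit_bio(bio);
	return NULL;
}

static struct bio *mpage_bio_submit_write(struct bio *bio)
{
	bio->bi_end_io = mpage_write_end_io;
	guard_bio_eod(bio);
	submit_bio(bio);
	return NULL;
}

With that split, each call site already knows whether it is reading or
writing, so the bio_op() check disappears from the submission path.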
On Thu, Mar 16, 2023 at 11:04:54AM +0100, Pankaj Raghav wrote:
> -	/* clean up. */
> -	while ((page = readahead_page(rac))) {
> -		page_endio(page, false, ret);
> -		put_page(page);
> +		while ((folio = readahead_folio(rac))) {
> +			folio_clear_uptodate(folio);
> +			folio_set_error(folio);
> +			folio_unlock(folio);
> +		}
> +		return;
> +	}
> +
> +	while ((folio = readahead_folio(rac))) {
> +		folio_mark_uptodate(folio);
> +		folio_unlock(folio);
> 	}

readahead_folio() is a bit too heavy-weight for that, IMO. I'd do this as:

	while ((folio = readahead_folio(rac))) {
		if (!ret)
			folio_mark_uptodate(folio);
		folio_unlock(folio);
	}

(there's no need to call folio_set_error(), nor folio_clear_uptodate())
On 2023-03-17 16:31, Matthew Wilcox wrote:
>> +
>> +	while ((folio = readahead_folio(rac))) {
>> +		folio_mark_uptodate(folio);
>> +		folio_unlock(folio);
>> 	}
>
> readahead_folio() is a bit too heavy-weight for that, IMO. I'd do this
> as:
>
> 	while ((folio = readahead_folio(rac))) {
> 		if (!ret)
> 			folio_mark_uptodate(folio);
> 		folio_unlock(folio);
> 	}
>

This looks good.

> (there's no need to call folio_set_error(), nor folio_clear_uptodate())

I am trying to understand why these calls are not needed for the error
case. I see a similar pattern, e.g. in iomap_finish_folio_read(), where we
call these functions in the error case.

If we don't need to call these anymore, can the mpage code also be
shortened like this:

-static void mpage_end_io(struct bio *bio)
+static void mpage_read_end_io(struct bio *bio)
 {
-	struct bio_vec *bv;
-	struct bvec_iter_all iter_all;
+	struct folio_iter fi;
+	int err = blk_status_to_errno(bio->bi_status);

-	bio_for_each_segment_all(bv, bio, iter_all) {
-		struct page *page = bv->bv_page;
-		page_endio(page, bio_op(bio),
-			blk_status_to_errno(bio->bi_status));
+	bio_for_each_folio_all(fi, bio) {
+		struct folio *folio = fi.folio;
+
+		if (!err)
+			folio_mark_uptodate(folio);
+		folio_unlock(folio);
+	}
+
+	bio_put(bio);
+}
+
+static void mpage_write_end_io(struct bio *bio)
+{
	....
+
Hi Willy,

On 2023-03-17 17:14, Pankaj Raghav wrote:
> On 2023-03-17 16:31, Matthew Wilcox wrote:
>>> +
>>> +	while ((folio = readahead_folio(rac))) {
>>> +		folio_mark_uptodate(folio);
>>> +		folio_unlock(folio);
>>> 	}
>>
>> readahead_folio() is a bit too heavy-weight for that, IMO. I'd do this
>> as:
>>
>> 	while ((folio = readahead_folio(rac))) {
>> 		if (!ret)
>> 			folio_mark_uptodate(folio);
>> 		folio_unlock(folio);
>> 	}
>>
>
> This looks good.
>
>> (there's no need to call folio_set_error(), nor folio_clear_uptodate())
>
> I am trying to understand why these calls are not needed for the error
> case. I see a similar pattern, e.g. in iomap_finish_folio_read(), where
> we call these functions in the error case.
>

I am planning to send the next version. It would be great if I could get
a rationale for your statement that folio_set_error() and
folio_clear_uptodate() are not needed.

> If we don't need to call these anymore, can the mpage code also be
> shortened like this:
>
> -static void mpage_end_io(struct bio *bio)
> +static void mpage_read_end_io(struct bio *bio)
>  {
> -	struct bio_vec *bv;
> -	struct bvec_iter_all iter_all;
> +	struct folio_iter fi;
> +	int err = blk_status_to_errno(bio->bi_status);
>
> -	bio_for_each_segment_all(bv, bio, iter_all) {
> -		struct page *page = bv->bv_page;
> -		page_endio(page, bio_op(bio),
> -			blk_status_to_errno(bio->bi_status));
> +	bio_for_each_folio_all(fi, bio) {
> +		struct folio *folio = fi.folio;
> +
> +		if (!err)
> +			folio_mark_uptodate(folio);
> +		folio_unlock(folio);
> +	}
> +
> +	bio_put(bio);
> +}
> +
> +static void mpage_write_end_io(struct bio *bio)
> +{
> 	....
> +
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index aa490da3cef2..f441251c9138 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -610,7 +610,7 @@ static void zram_page_end_io(struct bio *bio)
 {
 	struct page *page = bio_first_page_all(bio);

-	page_endio(page, op_is_write(bio_op(bio)),
+	folio_endio(page_folio(page), op_is_write(bio_op(bio)),
 			blk_status_to_errno(bio->bi_status));
 	bio_put(bio);
 }
diff --git a/fs/mpage.c b/fs/mpage.c
index 22b9de5ddd68..40e86e839e77 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -50,7 +50,7 @@ static void mpage_end_io(struct bio *bio)

 	bio_for_each_segment_all(bv, bio, iter_all) {
 		struct page *page = bv->bv_page;
-		page_endio(page, bio_op(bio),
+		folio_endio(page_folio(page), bio_op(bio),
 				blk_status_to_errno(bio->bi_status));
 	}

diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c
index aefdf1d3be7c..b12d099510ea 100644
--- a/fs/orangefs/inode.c
+++ b/fs/orangefs/inode.c
@@ -276,7 +276,7 @@ static void orangefs_readahead(struct readahead_control *rac)

 	/* clean up. */
 	while ((page = readahead_page(rac))) {
-		page_endio(page, false, ret);
+		folio_endio(page_folio(page), false, ret);
 		put_page(page);
 	}
 }
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index fdcd595d2294..80eab64b834f 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1076,7 +1076,7 @@ int filemap_migrate_folio(struct address_space *mapping, struct folio *dst,
 #else
 #define filemap_migrate_folio NULL
 #endif
-void page_endio(struct page *page, bool is_write, int err);
+void folio_endio(struct folio *folio, bool is_write, int err);
 void folio_end_private_2(struct folio *folio);
 void folio_wait_private_2(struct folio *folio);
diff --git a/mm/filemap.c b/mm/filemap.c
index a34abfe8c654..a89940f74974 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1626,13 +1626,11 @@ void folio_end_writeback(struct folio *folio)
 EXPORT_SYMBOL(folio_end_writeback);

 /*
- * After completing I/O on a page, call this routine to update the page
+ * After completing I/O on a folio, call this routine to update the folio
  * flags appropriately
  */
-void page_endio(struct page *page, bool is_write, int err)
+void folio_endio(struct folio *folio, bool is_write, int err)
 {
-	struct folio *folio = page_folio(page);
-
 	if (!is_write) {
 		if (!err) {
 			folio_mark_uptodate(folio);
@@ -1653,7 +1651,7 @@ void folio_endio(struct folio *folio, bool is_write, int err)
 		folio_end_writeback(folio);
 	}
 }
-EXPORT_SYMBOL_GPL(page_endio);
+EXPORT_SYMBOL_GPL(folio_endio);

 /**
  * __folio_lock - Get a lock on the folio, assuming we need to sleep to get it.
page_endio() already works on folios by converting a page into a folio as
the first step. Convert page_endio to folio_endio by taking a folio as the
first parameter.

Instead of doing a page to folio conversion in the page_endio() function,
the consumers of this API do this conversion and call the folio_endio()
function in this patch.
The following patches will convert the consumers of this API to use native
folio functions to pass to folio_endio().

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
 drivers/block/zram/zram_drv.c | 2 +-
 fs/mpage.c                    | 2 +-
 fs/orangefs/inode.c           | 2 +-
 include/linux/pagemap.h       | 2 +-
 mm/filemap.c                  | 8 +++-----
 5 files changed, 7 insertions(+), 9 deletions(-)
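To illustrate what the promised follow-up conversions might look like, here
is a sketch (not part of this patch) of the zram caller using the existing
bio_first_folio_all() helper instead of going through page_folio():

static void zram_page_end_io(struct bio *bio)
{
	/*
	 * bio_first_folio_all() is the folio counterpart of
	 * bio_first_page_all(), so no page-to-folio conversion is
	 * needed at the call site.
	 */
	struct folio *folio = bio_first_folio_all(bio);

	folio_endio(folio, op_is_write(bio_op(bio)),
		    blk_status_to_errno(bio->bi_status));
	bio_put(bio);
}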