diff mbox series

[RFCv3,6/7] iomap: Optimize iomap_read_folio

Message ID a01641c22af0856fa2b19ab00a6660706056666d.1714046808.git.ritesh.list@gmail.com (mailing list archive)
State Superseded, archived
Headers show
Series ext2 iomap changes and iomap improvements | expand

Commit Message

Ritesh Harjani (IBM) April 25, 2024, 1:28 p.m. UTC
iomap_readpage_iter() handles "uptodate blocks" and "not uptodate blocks"
within a folio separately. This makes iomap_read_folio() to call into
->iomap_begin() to request for extent mapping even though it might already
have an extent which is not fully processed.

This happens when we either have a large folio or with bs < ps. In these
cases we can have sub blocks which can be uptodate (say for e.g. due to
previous writes). With iomap_read_folio_iter(), this is handled more
efficiently by not calling ->iomap_begin() call until all the sub blocks
with the current folio are processed.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
cc: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 fs/iomap/buffered-io.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

Comments

Matthew Wilcox April 25, 2024, 1:53 p.m. UTC | #1
On Thu, Apr 25, 2024 at 06:58:50PM +0530, Ritesh Harjani (IBM) wrote:
> +static loff_t iomap_read_folio_iter(const struct iomap_iter *iter,
> +		struct iomap_readpage_ctx *ctx)
> +{
> +	struct folio *folio = ctx->cur_folio;
> +	size_t pos = offset_in_folio(folio, iter->pos);

"pos" is position in file.  You should call this 'offset'.

> +	loff_t length = min_t(loff_t, folio_size(folio) - pos,
> +			      iomap_length(iter));
> +	loff_t done, ret;
> +
> +	for (done = 0; done < length; done += ret) {
> +		ret = iomap_readpage_iter(iter, ctx, done);
> +		if (ret <= 0)
> +			return ret;
> +	}
Christoph Hellwig April 26, 2024, 6:53 a.m. UTC | #2
On Thu, Apr 25, 2024 at 06:58:50PM +0530, Ritesh Harjani (IBM) wrote:
> iomap_readpage_iter() handles "uptodate blocks" and "not uptodate blocks"
> within a folio separately. This makes iomap_read_folio() to call into
> ->iomap_begin() to request for extent mapping even though it might already
> have an extent which is not fully processed.
> 
> This happens when we either have a large folio or with bs < ps. In these
> cases we can have sub blocks which can be uptodate (say for e.g. due to
> previous writes). With iomap_read_folio_iter(), this is handled more
> efficiently by not calling ->iomap_begin() call until all the sub blocks
> with the current folio are processed.

Maybe throw in a sentence here that this copies what
iomap_readahead_iter already does?

Otherwise this looks good to me modulo the offset comment from willy.
Ritesh Harjani (IBM) April 26, 2024, 8:50 a.m. UTC | #3
Christoph Hellwig <hch@infradead.org> writes:

> On Thu, Apr 25, 2024 at 06:58:50PM +0530, Ritesh Harjani (IBM) wrote:
>> iomap_readpage_iter() handles "uptodate blocks" and "not uptodate blocks"
>> within a folio separately. This makes iomap_read_folio() to call into
>> ->iomap_begin() to request for extent mapping even though it might already
>> have an extent which is not fully processed.
>> 
>> This happens when we either have a large folio or with bs < ps. In these
>> cases we can have sub blocks which can be uptodate (say for e.g. due to
>> previous writes). With iomap_read_folio_iter(), this is handled more
>> efficiently by not calling ->iomap_begin() call until all the sub blocks
>> with the current folio are processed.
>
> Maybe throw in a sentence here that this copies what
> iomap_readahead_iter already does?

Does this sound any better?

iomap_read_folio_iter() handles multiple sub blocks within a given
folio but it's implementation logic is similar to how
iomap_readahead_iter() handles multiple folios within a single mapped
extent. Both of them iterate over a given range of folio/mapped extent
and call iomap_readpage_iter() for reading.


>
> Otherwise this looks good to me modulo the offset comment from willy.

Yes, I will address willy's comment too. 
Thanks for the review!

-ritesh
Christoph Hellwig April 27, 2024, 4:44 a.m. UTC | #4
On Fri, Apr 26, 2024 at 02:20:05PM +0530, Ritesh Harjani wrote:
> iomap_read_folio_iter() handles multiple sub blocks within a given
> folio but it's implementation logic is similar to how
> iomap_readahead_iter() handles multiple folios within a single mapped
> extent. Both of them iterate over a given range of folio/mapped extent
> and call iomap_readpage_iter() for reading.

Sounds good.
diff mbox series

Patch

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 9f79c82d1f73..0a4269095ae2 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -444,6 +444,24 @@  static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 	return pos - orig_pos + plen;
 }
 
+static loff_t iomap_read_folio_iter(const struct iomap_iter *iter,
+		struct iomap_readpage_ctx *ctx)
+{
+	struct folio *folio = ctx->cur_folio;
+	size_t pos = offset_in_folio(folio, iter->pos);
+	loff_t length = min_t(loff_t, folio_size(folio) - pos,
+			      iomap_length(iter));
+	loff_t done, ret;
+
+	for (done = 0; done < length; done += ret) {
+		ret = iomap_readpage_iter(iter, ctx, done);
+		if (ret <= 0)
+			return ret;
+	}
+
+	return done;
+}
+
 int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops)
 {
 	struct iomap_iter iter = {
@@ -459,7 +477,7 @@  int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops)
 	trace_iomap_readpage(iter.inode, 1);
 
 	while ((ret = iomap_iter(&iter, ops)) > 0)
-		iter.processed = iomap_readpage_iter(&iter, &ctx, 0);
+		iter.processed = iomap_read_folio_iter(&iter, &ctx);
 
 	if (ret < 0)
 		folio_set_error(folio);