[RFCv3,6/7] iomap: Optimize iomap_read_folio

Message ID	a01641c22af0856fa2b19ab00a6660706056666d.1714046808.git.ritesh.list@gmail.com (mailing list archive)
State	Superseded, archived
Headers	show Received: from mail-pf1-f175.google.com (mail-pf1-f175.google.com [209.85.210.175]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A7EF4149C43; Thu, 25 Apr 2024 13:29:26 +0000 (UTC) From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> To: linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Matthew Wilcox <willy@infradead.org>, "Darrick J . Wong" <djwong@kernel.org>, Ojaswin Mujoo <ojaswin@linux.ibm.com>, Ritesh Harjani <ritesh.list@gmail.com>, Jan Kara <jack@suse.cz> Subject: [RFCv3 6/7] iomap: Optimize iomap_read_folio Date: Thu, 25 Apr 2024 18:58:50 +0530 Message-ID: <a01641c22af0856fa2b19ab00a6660706056666d.1714046808.git.ritesh.list@gmail.com> In-Reply-To: <cover.1714046808.git.ritesh.list@gmail.com> References: <cover.1714046808.git.ritesh.list@gmail.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	ext2 iomap changes and iomap improvements \| expand [RFCv3,0/7] ext2 iomap changes and iomap improvements [RFCv3,1/7] ext2: Remove comment related to journal handle [RFCv3,2/7] ext2: Convert ext2 regular file buffered I/O to use iomap [RFCv3,3/7] ext2: Enable large folio support [RFCv3,4/7] ext2: Implement seq counter for validating cached iomap [RFCv3,5/7] iomap: Fix iomap_adjust_read_range for plen calculation [RFCv3,6/7] iomap: Optimize iomap_read_folio [RFCv3,7/7] iomap: Optimize data access patterns for filesystems with indirect mappings

Message ID

a01641c22af0856fa2b19ab00a6660706056666d.1714046808.git.ritesh.list@gmail.com (mailing list archive)

State

Superseded, archived

Headers

From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
To: linux-ext4@vger.kernel.org,
	linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org,
	Matthew Wilcox <willy@infradead.org>,
	"Darrick J . Wong" <djwong@kernel.org>,
	Ojaswin Mujoo <ojaswin@linux.ibm.com>,
	Ritesh Harjani <ritesh.list@gmail.com>,
	Jan Kara <jack@suse.cz>
Subject: [RFCv3 6/7] iomap: Optimize iomap_read_folio
Date: Thu, 25 Apr 2024 18:58:50 +0530
Message-ID: 
 <a01641c22af0856fa2b19ab00a6660706056666d.1714046808.git.ritesh.list@gmail.com>
In-Reply-To: <cover.1714046808.git.ritesh.list@gmail.com>
References: <cover.1714046808.git.ritesh.list@gmail.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

ext2 iomap changes and iomap improvements | expand

Commit Message

Ritesh Harjani (IBM) April 25, 2024, 1:28 p.m. UTC

iomap_readpage_iter() handles "uptodate blocks" and "not uptodate blocks"
within a folio separately. This makes iomap_read_folio() to call into
->iomap_begin() to request for extent mapping even though it might already
have an extent which is not fully processed.

This happens when we either have a large folio or with bs < ps. In these
cases we can have sub blocks which can be uptodate (say for e.g. due to
previous writes). With iomap_read_folio_iter(), this is handled more
efficiently by not calling ->iomap_begin() call until all the sub blocks
with the current folio are processed.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
cc: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 fs/iomap/buffered-io.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

Comments

Matthew Wilcox April 25, 2024, 1:53 p.m. UTC | #1

On Thu, Apr 25, 2024 at 06:58:50PM +0530, Ritesh Harjani (IBM) wrote:
> +static loff_t iomap_read_folio_iter(const struct iomap_iter *iter,
> +		struct iomap_readpage_ctx *ctx)
> +{
> +	struct folio *folio = ctx->cur_folio;
> +	size_t pos = offset_in_folio(folio, iter->pos);

"pos" is position in file.  You should call this 'offset'.

> +	loff_t length = min_t(loff_t, folio_size(folio) - pos,
> +			      iomap_length(iter));
> +	loff_t done, ret;
> +
> +	for (done = 0; done < length; done += ret) {
> +		ret = iomap_readpage_iter(iter, ctx, done);
> +		if (ret <= 0)
> +			return ret;
> +	}

Christoph Hellwig April 26, 2024, 6:53 a.m. UTC | #2

On Thu, Apr 25, 2024 at 06:58:50PM +0530, Ritesh Harjani (IBM) wrote:
> iomap_readpage_iter() handles "uptodate blocks" and "not uptodate blocks"
> within a folio separately. This makes iomap_read_folio() to call into
> ->iomap_begin() to request for extent mapping even though it might already
> have an extent which is not fully processed.
> 
> This happens when we either have a large folio or with bs < ps. In these
> cases we can have sub blocks which can be uptodate (say for e.g. due to
> previous writes). With iomap_read_folio_iter(), this is handled more
> efficiently by not calling ->iomap_begin() call until all the sub blocks
> with the current folio are processed.

Maybe throw in a sentence here that this copies what
iomap_readahead_iter already does?

Otherwise this looks good to me modulo the offset comment from willy.

Ritesh Harjani (IBM) April 26, 2024, 8:50 a.m. UTC | #3

Christoph Hellwig <hch@infradead.org> writes:

> On Thu, Apr 25, 2024 at 06:58:50PM +0530, Ritesh Harjani (IBM) wrote:
>> iomap_readpage_iter() handles "uptodate blocks" and "not uptodate blocks"
>> within a folio separately. This makes iomap_read_folio() to call into
>> ->iomap_begin() to request for extent mapping even though it might already
>> have an extent which is not fully processed.
>> 
>> This happens when we either have a large folio or with bs < ps. In these
>> cases we can have sub blocks which can be uptodate (say for e.g. due to
>> previous writes). With iomap_read_folio_iter(), this is handled more
>> efficiently by not calling ->iomap_begin() call until all the sub blocks
>> with the current folio are processed.
>
> Maybe throw in a sentence here that this copies what
> iomap_readahead_iter already does?

Does this sound any better?

iomap_read_folio_iter() handles multiple sub blocks within a given
folio but it's implementation logic is similar to how
iomap_readahead_iter() handles multiple folios within a single mapped
extent. Both of them iterate over a given range of folio/mapped extent
and call iomap_readpage_iter() for reading.

>
> Otherwise this looks good to me modulo the offset comment from willy.

Yes, I will address willy's comment too. 
Thanks for the review!

-ritesh

Christoph Hellwig April 27, 2024, 4:44 a.m. UTC | #4

On Fri, Apr 26, 2024 at 02:20:05PM +0530, Ritesh Harjani wrote:
> iomap_read_folio_iter() handles multiple sub blocks within a given
> folio but it's implementation logic is similar to how
> iomap_readahead_iter() handles multiple folios within a single mapped
> extent. Both of them iterate over a given range of folio/mapped extent
> and call iomap_readpage_iter() for reading.

Sounds good.

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 9f79c82d1f73..0a4269095ae2 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -444,6 +444,24 @@  static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 	return pos - orig_pos + plen;
 }
 
+static loff_t iomap_read_folio_iter(const struct iomap_iter *iter,
+		struct iomap_readpage_ctx *ctx)
+{
+	struct folio *folio = ctx->cur_folio;
+	size_t pos = offset_in_folio(folio, iter->pos);
+	loff_t length = min_t(loff_t, folio_size(folio) - pos,
+			      iomap_length(iter));
+	loff_t done, ret;
+
+	for (done = 0; done < length; done += ret) {
+		ret = iomap_readpage_iter(iter, ctx, done);
+		if (ret <= 0)
+			return ret;
+	}
+
+	return done;
+}
+
 int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops)
 {
 	struct iomap_iter iter = {
@@ -459,7 +477,7 @@  int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops)
 	trace_iomap_readpage(iter.inode, 1);
 
 	while ((ret = iomap_iter(&iter, ops)) > 0)
-		iter.processed = iomap_readpage_iter(&iter, &ctx, 0);
+		iter.processed = iomap_read_folio_iter(&iter, &ctx);
 
 	if (ret < 0)
 		folio_set_error(folio);

[RFCv3,6/7] iomap: Optimize iomap_read_folio

Commit Message

Comments

Patch