
[11/13] iomap: make buffered writes work with RWF_UNCACHED

Message ID 20241108174505.1214230-12-axboe@kernel.dk (mailing list archive)
State New
Series [01/13] mm/filemap: change filemap_create_folio() to take a struct kiocb

Commit Message

Jens Axboe Nov. 8, 2024, 5:43 p.m. UTC
Add iomap buffered write support for RWF_UNCACHED. If RWF_UNCACHED is
set for a write, mark the folios being written as uncached. Writeback
completion will then drop those folios. The write_iter handler simply
kicks off writeback for the folios, and writeback completion will take
care of the rest.

See the similar patch for the generic filemap handling for performance
results; those were in fact done on XFS using this patch.
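
For reference, a minimal userspace sketch of how a caller would request
an uncached buffered write once this series is applied. The RWF_UNCACHED
fallback value below is an assumption taken from the uapi patch in this
series and is not in mainline headers at the time of posting, so treat it
as illustrative only:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_UNCACHED
/* Assumed value; must match the uapi definition added by this series. */
#define RWF_UNCACHED	0x00000080
#endif

int main(void)
{
	char buf[4096];
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
	int fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
	ssize_t ret;

	if (fd < 0) {
		perror("open");
		return 1;
	}
	memset(buf, 0xaa, sizeof(buf));

	/*
	 * Buffered write that asks the kernel not to keep the data cached:
	 * the data goes through the page cache as usual, writeback is
	 * kicked off before return, and completion drops the folios.
	 */
	ret = pwritev2(fd, &iov, 1, 0, RWF_UNCACHED);
	if (ret < 0)
		perror("pwritev2(RWF_UNCACHED)");
	close(fd);
	return ret == (ssize_t)sizeof(buf) ? 0 : 1;
}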

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/iomap/buffered-io.c | 12 +++++++++++-
 include/linux/iomap.h  |  3 ++-
 2 files changed, 13 insertions(+), 2 deletions(-)

Comments

Matthew Wilcox Nov. 8, 2024, 6:46 p.m. UTC | #1
On Fri, Nov 08, 2024 at 10:43:34AM -0700, Jens Axboe wrote:
> +++ b/fs/iomap/buffered-io.c
> @@ -959,6 +959,8 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
>  		}
>  		if (iter->iomap.flags & IOMAP_F_STALE)
>  			break;
> +		if (iter->flags & IOMAP_UNCACHED)
> +			folio_set_uncached(folio);

This seems like it'd convert an existing page cache folio into being
uncached?  Is this just leftover from a previous version or is that a
design decision you made?
Jens Axboe Nov. 8, 2024, 7:26 p.m. UTC | #2
On 11/8/24 11:46 AM, Matthew Wilcox wrote:
> On Fri, Nov 08, 2024 at 10:43:34AM -0700, Jens Axboe wrote:
>> +++ b/fs/iomap/buffered-io.c
>> @@ -959,6 +959,8 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
>>  		}
>>  		if (iter->iomap.flags & IOMAP_F_STALE)
>>  			break;
>> +		if (iter->flags & IOMAP_UNCACHED)
>> +			folio_set_uncached(folio);
> 
> This seems like it'd convert an existing page cache folio into being
> uncached?  Is this just leftover from a previous version or is that a
> design decision you made?

I'll see if we can improve that. Currently both the read and write sides
drop whatever they touch. We could feasibly just have it drop
newly instantiated pages - iow, uncached just won't create new persistent
folios, but it'll happily use the ones that are there already.
Jens Axboe Nov. 8, 2024, 7:49 p.m. UTC | #3
On 11/8/24 12:26 PM, Jens Axboe wrote:
> On 11/8/24 11:46 AM, Matthew Wilcox wrote:
>> On Fri, Nov 08, 2024 at 10:43:34AM -0700, Jens Axboe wrote:
>>> +++ b/fs/iomap/buffered-io.c
>>> @@ -959,6 +959,8 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
>>>  		}
>>>  		if (iter->iomap.flags & IOMAP_F_STALE)
>>>  			break;
>>> +		if (iter->flags & IOMAP_UNCACHED)
>>> +			folio_set_uncached(folio);
>>
>> This seems like it'd convert an existing page cache folio into being
>> uncached?  Is this just leftover from a previous version or is that a
>> design decision you made?
> 
> I'll see if we can improve that. Currently both the read and write sides
> drop whatever they touch. We could feasibly just have it drop
> newly instantiated pages - iow, uncached just won't create new persistent
> folios, but it'll happily use the ones that are there already.

Well, that was nonsense on the read side - it deliberately only prunes
entries that have uncached set. For the write side, this is a bit
trickier. We'd essentially need to know if the folio populated by
write_begin was found in the page cache, or created from new. Any way we
can do that? One way is to change ->write_begin() so it takes a kiocb
rather than a file, but that's an amount of churn I'd rather avoid!
Maybe there's a way I'm just not seeing?
Matthew Wilcox Nov. 8, 2024, 8:07 p.m. UTC | #4
On Fri, Nov 08, 2024 at 12:49:58PM -0700, Jens Axboe wrote:
> On 11/8/24 12:26 PM, Jens Axboe wrote:
> > On 11/8/24 11:46 AM, Matthew Wilcox wrote:
> >> On Fri, Nov 08, 2024 at 10:43:34AM -0700, Jens Axboe wrote:
> >>> +++ b/fs/iomap/buffered-io.c
> >>> @@ -959,6 +959,8 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
> >>>  		}
> >>>  		if (iter->iomap.flags & IOMAP_F_STALE)
> >>>  			break;
> >>> +		if (iter->flags & IOMAP_UNCACHED)
> >>> +			folio_set_uncached(folio);
> >>
> >> This seems like it'd convert an existing page cache folio into being
> >> uncached?  Is this just leftover from a previous version or is that a
> >> design decision you made?
> > 
> > I'll see if we can improve that. Currently both the read and write sides
> > drop whatever they touch. We could feasibly just have it drop
> > newly instantiated pages - iow, uncached just won't create new persistent
> > folios, but it'll happily use the ones that are there already.
> 
> Well, that was nonsense on the read side - it deliberately only prunes
> entries that have uncached set. For the write side, this is a bit
> trickier. We'd essentially need to know if the folio populated by
> write_begin was found in the page cache, or created from new. Any way we
> can do that? One way is to change ->write_begin() so it takes a kiocb
> rather than a file, but that's an amount of churn I'd rather avoid!
> Maybe there's a way I'm just not seeing?

Umm.  We can solve it for iomap with a new FGP_UNCACHED flag and
checking IOMAP_UNCACHED in iomap_get_folio().  Not sure how we solve it
for other filesystems though.  Any filesystem which uses FGP_NOWAIT has
_a_ solution, but eg btrfs will need to plumb through a third boolean
flag (or, more efficiently, just start passing FGP flags to
prepare_one_folio()).
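
A rough sketch of that suggestion, hedged accordingly: FGP_UNCACHED is
hypothetical here, and the idea would be that __filemap_get_folio() only
marks a folio uncached on the FGP_CREAT path, i.e. when it actually
allocates a new one, so folios already resident in the page cache stay
cached. Apart from that flag, this mirrors the current iomap_get_folio():

struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos, size_t len)
{
	fgf_t fgp = FGP_WRITEBEGIN | FGP_NOFS;

	if (iter->flags & IOMAP_NOWAIT)
		fgp |= FGP_NOWAIT;
	if (iter->flags & IOMAP_UNCACHED)
		fgp |= FGP_UNCACHED;	/* hypothetical: mark only newly created folios */
	fgp |= fgf_set_order(len);

	return __filemap_get_folio(iter->inode->i_mapping, pos >> PAGE_SHIFT,
			fgp, mapping_gfp_mask(iter->inode->i_mapping));
}

With something like that in place, the unconditional folio_set_uncached()
in iomap_write_iter() from this patch could presumably go away.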
Jens Axboe Nov. 8, 2024, 8:18 p.m. UTC | #5
On 11/8/24 1:07 PM, Matthew Wilcox wrote:
> On Fri, Nov 08, 2024 at 12:49:58PM -0700, Jens Axboe wrote:
>> On 11/8/24 12:26 PM, Jens Axboe wrote:
>>> On 11/8/24 11:46 AM, Matthew Wilcox wrote:
>>>> On Fri, Nov 08, 2024 at 10:43:34AM -0700, Jens Axboe wrote:
>>>>> +++ b/fs/iomap/buffered-io.c
>>>>> @@ -959,6 +959,8 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
>>>>>  		}
>>>>>  		if (iter->iomap.flags & IOMAP_F_STALE)
>>>>>  			break;
>>>>> +		if (iter->flags & IOMAP_UNCACHED)
>>>>> +			folio_set_uncached(folio);
>>>>
>>>> This seems like it'd convert an existing page cache folio into being
>>>> uncached?  Is this just leftover from a previous version or is that a
>>>> design decision you made?
>>>
>>> I'll see if we can improve that. Currently both the read and write sides
>>> drop whatever they touch. We could feasibly just have it drop
>>> newly instantiated pages - iow, uncached just won't create new persistent
>>> folios, but it'll happily use the ones that are there already.
>>
>> Well, that was nonsense on the read side - it deliberately only prunes
>> entries that have uncached set. For the write side, this is a bit
>> trickier. We'd essentially need to know if the folio populated by
>> write_begin was found in the page cache, or created from new. Any way we
>> can do that? One way is to change ->write_begin() so it takes a kiocb
>> rather than a file, but that's an amount of churn I'd rather avoid!
>> Maybe there's a way I'm just not seeing?
> 
> Umm.  We can solve it for iomap with a new FGP_UNCACHED flag and
> checking IOMAP_UNCACHED in iomap_get_folio().  Not sure how we solve it
> for other filesystems though.  Any filesystem which uses FGP_NOWAIT has
> _a_ solution, but eg btrfs will need to plumb through a third boolean
> flag (or, more efficiently, just start passing FGP flags to
> prepare_one_folio()).

Yeah, that's true - forgot we already have the IOMAP_UNCACHED flag there
and it's available at folio creation time. Thanks, I'll start with that.

Patch

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index ef0b68bccbb6..609256885094 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -959,6 +959,8 @@  static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 		}
 		if (iter->iomap.flags & IOMAP_F_STALE)
 			break;
+		if (iter->flags & IOMAP_UNCACHED)
+			folio_set_uncached(folio);
 
 		offset = offset_in_folio(folio, pos);
 		if (bytes > folio_size(folio) - offset)
@@ -1023,8 +1025,9 @@  ssize_t
 iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
 		const struct iomap_ops *ops, void *private)
 {
+	struct address_space *mapping = iocb->ki_filp->f_mapping;
 	struct iomap_iter iter = {
-		.inode		= iocb->ki_filp->f_mapping->host,
+		.inode		= mapping->host,
 		.pos		= iocb->ki_pos,
 		.len		= iov_iter_count(i),
 		.flags		= IOMAP_WRITE,
@@ -1034,12 +1037,19 @@  iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
 
 	if (iocb->ki_flags & IOCB_NOWAIT)
 		iter.flags |= IOMAP_NOWAIT;
+	if (iocb->ki_flags & IOCB_UNCACHED)
+		iter.flags |= IOMAP_UNCACHED;
 
 	while ((ret = iomap_iter(&iter, ops)) > 0)
 		iter.processed = iomap_write_iter(&iter, i);
 
 	if (unlikely(iter.pos == iocb->ki_pos))
 		return ret;
+	if (iocb->ki_flags & IOCB_UNCACHED) {
+		/* kick off uncached writeback, completion will drop it */
+		__filemap_fdatawrite_range(mapping, iocb->ki_pos, iter.pos,
+						WB_SYNC_NONE);
+	}
 	ret = iter.pos - iocb->ki_pos;
 	iocb->ki_pos = iter.pos;
 	return ret;
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index f61407e3b121..89b24fbb1399 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -173,8 +173,9 @@  struct iomap_folio_ops {
 #define IOMAP_NOWAIT		(1 << 5) /* do not block */
 #define IOMAP_OVERWRITE_ONLY	(1 << 6) /* only pure overwrites allowed */
 #define IOMAP_UNSHARE		(1 << 7) /* unshare_file_range */
+#define IOMAP_UNCACHED		(1 << 8) /* uncached IO */
 #ifdef CONFIG_FS_DAX
-#define IOMAP_DAX		(1 << 8) /* DAX mapping */
+#define IOMAP_DAX		(1 << 9) /* DAX mapping */
 #else
 #define IOMAP_DAX		0
 #endif /* CONFIG_FS_DAX */