diff mbox series

[14/17] iomap: make buffered writes work with RWF_UNCACHED

Message ID 20241114152743.2381672-16-axboe@kernel.dk (mailing list archive)
State New, archived
Headers show
Series Uncached buffered IO | expand

Commit Message

Jens Axboe Nov. 14, 2024, 3:25 p.m. UTC
Add iomap buffered write support for RWF_UNCACHED. If RWF_UNCACHED is
set for a write, mark the folios being written as uncached. Then
writeback completion will drop the pages. The write_iter handler simply
kicks off writeback for the pages, and writeback completion will take
care of the rest.

This still needs the user of the iomap buffered write helpers to call
iocb_uncached_write() upon successful issue of the writes.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/iomap/buffered-io.c | 15 +++++++++++++--
 include/linux/iomap.h  |  8 +++++++-
 2 files changed, 20 insertions(+), 3 deletions(-)

Comments

Christoph Hellwig Dec. 12, 2024, 5:50 a.m. UTC | #1
On Thu, Nov 14, 2024 at 08:25:18AM -0700, Jens Axboe wrote:
> +	if (iocb->ki_flags & IOCB_UNCACHED)
> +		iter.flags |= IOMAP_UNCACHED;
>  
> -	while ((ret = iomap_iter(&iter, ops)) > 0)
> +	while ((ret = iomap_iter(&iter, ops)) > 0) {
> +		if (iocb->ki_flags & IOCB_UNCACHED)
> +			iter.iomap.flags |= IOMAP_F_UNCACHED;

iomap.flags and the IOMAP_F_* namespace is used to communicate flags
from the file system to the iomap core, so this looks wrong.

>  	size_t poff = offset_in_folio(folio, pos);
>  	int error;
>  
> +	if (folio_test_uncached(folio))
> +		wpc->iomap.flags |= IOMAP_F_UNCACHED;

I guess this is what actually makes it work.  Note that with the iomap
zoned series I posted yesteday things change a bit here in that the flags
in the wpc are decouple from the iomap flags, and this would now become
a wpc only flag as it isn't really a fs to iomap cummunication, but
based on iomap / page cache state.
Darrick J. Wong Dec. 12, 2024, 6:26 a.m. UTC | #2
On Wed, Dec 11, 2024 at 09:50:19PM -0800, Christoph Hellwig wrote:
> On Thu, Nov 14, 2024 at 08:25:18AM -0700, Jens Axboe wrote:
> > +	if (iocb->ki_flags & IOCB_UNCACHED)
> > +		iter.flags |= IOMAP_UNCACHED;
> >  
> > -	while ((ret = iomap_iter(&iter, ops)) > 0)
> > +	while ((ret = iomap_iter(&iter, ops)) > 0) {
> > +		if (iocb->ki_flags & IOCB_UNCACHED)
> > +			iter.iomap.flags |= IOMAP_F_UNCACHED;
> 
> iomap.flags and the IOMAP_F_* namespace is used to communicate flags
> from the file system to the iomap core, so this looks wrong.

Not entirely true -- IOMAP_F_SIZE_CHANGED is used to communicate state
from iomap to gfs2, and IOMAP_F_STALE is set/checked only by the iomap
core.  iomap.h even says as much.

Though given that there's a 4-byte gap in struct iomap between flags and
the bdev pointer (at least on 64-bit) maybe we should make a separate
field for these iomap state bits?

> >  	size_t poff = offset_in_folio(folio, pos);
> >  	int error;
> >  
> > +	if (folio_test_uncached(folio))
> > +		wpc->iomap.flags |= IOMAP_F_UNCACHED;
> 
> I guess this is what actually makes it work.  Note that with the iomap
> zoned series I posted yesteday things change a bit here in that the flags
> in the wpc are decouple from the iomap flags, and this would now become
> a wpc only flag as it isn't really a fs to iomap cummunication, but
> based on iomap / page cache state.

Hrmmm I'll go take a second look at that patch in the morning in case a
better idea comes along.

--D
Christoph Hellwig Dec. 12, 2024, 6:31 a.m. UTC | #3
On Wed, Dec 11, 2024 at 10:26:41PM -0800, Darrick J. Wong wrote:
> > iomap.flags and the IOMAP_F_* namespace is used to communicate flags
> > from the file system to the iomap core, so this looks wrong.
> 
> Not entirely true -- IOMAP_F_SIZE_CHANGED is used to communicate state
> from iomap to gfs2, and IOMAP_F_STALE is set/checked only by the iomap
> core.  iomap.h even says as much.

Indeed, some of the non-initial additions already broke this.  And now
that you mentioned it I ran into that before because it was in the way
of some further constifycation I attempted in fs/iomap/.

> Though given that there's a 4-byte gap in struct iomap between flags and
> the bdev pointer (at least on 64-bit) maybe we should make a separate
> field for these iomap state bits?

Probably.  Preferably in a way that isn't too painful for Jens, though.
diff mbox series

Patch

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index ef0b68bccbb6..2f2a5db04a68 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -603,6 +603,8 @@  struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos, size_t len)
 
 	if (iter->flags & IOMAP_NOWAIT)
 		fgp |= FGP_NOWAIT;
+	if (iter->flags & IOMAP_UNCACHED)
+		fgp |= FGP_UNCACHED;
 	fgp |= fgf_set_order(len);
 
 	return __filemap_get_folio(iter->inode->i_mapping, pos >> PAGE_SHIFT,
@@ -1023,8 +1025,9 @@  ssize_t
 iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
 		const struct iomap_ops *ops, void *private)
 {
+	struct address_space *mapping = iocb->ki_filp->f_mapping;
 	struct iomap_iter iter = {
-		.inode		= iocb->ki_filp->f_mapping->host,
+		.inode		= mapping->host,
 		.pos		= iocb->ki_pos,
 		.len		= iov_iter_count(i),
 		.flags		= IOMAP_WRITE,
@@ -1034,9 +1037,14 @@  iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
 
 	if (iocb->ki_flags & IOCB_NOWAIT)
 		iter.flags |= IOMAP_NOWAIT;
+	if (iocb->ki_flags & IOCB_UNCACHED)
+		iter.flags |= IOMAP_UNCACHED;
 
-	while ((ret = iomap_iter(&iter, ops)) > 0)
+	while ((ret = iomap_iter(&iter, ops)) > 0) {
+		if (iocb->ki_flags & IOCB_UNCACHED)
+			iter.iomap.flags |= IOMAP_F_UNCACHED;
 		iter.processed = iomap_write_iter(&iter, i);
+	}
 
 	if (unlikely(iter.pos == iocb->ki_pos))
 		return ret;
@@ -1770,6 +1778,9 @@  static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
 	size_t poff = offset_in_folio(folio, pos);
 	int error;
 
+	if (folio_test_uncached(folio))
+		wpc->iomap.flags |= IOMAP_F_UNCACHED;
+
 	if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, pos)) {
 new_ioend:
 		error = iomap_submit_ioend(wpc, 0);
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index f61407e3b121..0a88043676f2 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -74,9 +74,14 @@  struct vm_fault;
  * IOMAP_F_STALE indicates that the iomap is not valid any longer and the file
  * range it covers needs to be remapped by the high level before the operation
  * can proceed.
+ *
+ * IOMAP_F_UNCACHED is set to indicate that writes to the page cache (and
+ * hence writeback) will result in folios being evicted as soon as the
+ * updated bytes are written back to the storage.
  */
 #define IOMAP_F_SIZE_CHANGED	(1U << 8)
 #define IOMAP_F_STALE		(1U << 9)
+#define IOMAP_F_UNCACHED	(1U << 10)
 
 /*
  * Flags from 0x1000 up are for file system specific usage:
@@ -173,8 +178,9 @@  struct iomap_folio_ops {
 #define IOMAP_NOWAIT		(1 << 5) /* do not block */
 #define IOMAP_OVERWRITE_ONLY	(1 << 6) /* only pure overwrites allowed */
 #define IOMAP_UNSHARE		(1 << 7) /* unshare_file_range */
+#define IOMAP_UNCACHED		(1 << 8) /* uncached IO */
 #ifdef CONFIG_FS_DAX
-#define IOMAP_DAX		(1 << 8) /* DAX mapping */
+#define IOMAP_DAX		(1 << 9) /* DAX mapping */
 #else
 #define IOMAP_DAX		0
 #endif /* CONFIG_FS_DAX */