Message ID | 20241108174505.1214230-12-axboe@kernel.dk (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [01/13] mm/filemap: change filemap_create_folio() to take a struct kiocb | expand |
On Fri, Nov 08, 2024 at 10:43:34AM -0700, Jens Axboe wrote: > +++ b/fs/iomap/buffered-io.c > @@ -959,6 +959,8 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i) > } > if (iter->iomap.flags & IOMAP_F_STALE) > break; > + if (iter->flags & IOMAP_UNCACHED) > + folio_set_uncached(folio); This seems like it'd convert an existing page cache folio into being uncached? Is this just leftover from a previous version or is that a design decision you made?
On 11/8/24 11:46 AM, Matthew Wilcox wrote: > On Fri, Nov 08, 2024 at 10:43:34AM -0700, Jens Axboe wrote: >> +++ b/fs/iomap/buffered-io.c >> @@ -959,6 +959,8 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i) >> } >> if (iter->iomap.flags & IOMAP_F_STALE) >> break; >> + if (iter->flags & IOMAP_UNCACHED) >> + folio_set_uncached(folio); > > This seems like it'd convert an existing page cache folio into being > uncached? Is this just leftover from a previous version or is that a > design decision you made? I'll see if we can improve that. Currently both the read and write side do drop whatever it touches. We could feasibly just have it drop newly instantiated pages - iow, uncached just won't create new persistent folios, but it'll happily use the ones that are there already.
On 11/8/24 12:26 PM, Jens Axboe wrote: > On 11/8/24 11:46 AM, Matthew Wilcox wrote: >> On Fri, Nov 08, 2024 at 10:43:34AM -0700, Jens Axboe wrote: >>> +++ b/fs/iomap/buffered-io.c >>> @@ -959,6 +959,8 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i) >>> } >>> if (iter->iomap.flags & IOMAP_F_STALE) >>> break; >>> + if (iter->flags & IOMAP_UNCACHED) >>> + folio_set_uncached(folio); >> >> This seems like it'd convert an existing page cache folio into being >> uncached? Is this just leftover from a previous version or is that a >> design decision you made? > > I'll see if we can improve that. Currently both the read and write side > do drop whatever it touches. We could feasibly just have it drop > newly instantiated pages - iow, uncached just won't create new persistent > folios, but it'll happily use the ones that are there already. Well that was nonsense on the read side, it deliberately only prunes entries that has uncached set. For the write side, this is a bit trickier. We'd essentially need to know if the folio populated by write_begin was found in the page cache, or create from new. Any way we can do that? One way is to change ->write_begin() so it takes a kiocb rather than a file, but that's an amount of churn I'd rather avoid! Maybe there's a way I'm just not seeing?
On Fri, Nov 08, 2024 at 12:49:58PM -0700, Jens Axboe wrote: > On 11/8/24 12:26 PM, Jens Axboe wrote: > > On 11/8/24 11:46 AM, Matthew Wilcox wrote: > >> On Fri, Nov 08, 2024 at 10:43:34AM -0700, Jens Axboe wrote: > >>> +++ b/fs/iomap/buffered-io.c > >>> @@ -959,6 +959,8 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i) > >>> } > >>> if (iter->iomap.flags & IOMAP_F_STALE) > >>> break; > >>> + if (iter->flags & IOMAP_UNCACHED) > >>> + folio_set_uncached(folio); > >> > >> This seems like it'd convert an existing page cache folio into being > >> uncached? Is this just leftover from a previous version or is that a > >> design decision you made? > > > > I'll see if we can improve that. Currently both the read and write side > > do drop whatever it touches. We could feasibly just have it drop > > newly instantiated pages - iow, uncached just won't create new persistent > > folios, but it'll happily use the ones that are there already. > > Well that was nonsense on the read side, it deliberately only prunes > entries that has uncached set. For the write side, this is a bit > trickier. We'd essentially need to know if the folio populated by > write_begin was found in the page cache, or create from new. Any way we > can do that? One way is to change ->write_begin() so it takes a kiocb > rather than a file, but that's an amount of churn I'd rather avoid! > Maybe there's a way I'm just not seeing? Umm. We can solve it for iomap with a new FGP_UNCACHED flag and checking IOMAP_UNCACHED in iomap_get_folio(). Not sure how we solve it for other filesystems though. Any filesystem which uses FGP_NOWAIT has _a_ solution, but eg btrfs will need to plumb through a third boolean flag (or, more efficiently, just start passing FGP flags to prepare_one_folio()).
On 11/8/24 1:07 PM, Matthew Wilcox wrote: > On Fri, Nov 08, 2024 at 12:49:58PM -0700, Jens Axboe wrote: >> On 11/8/24 12:26 PM, Jens Axboe wrote: >>> On 11/8/24 11:46 AM, Matthew Wilcox wrote: >>>> On Fri, Nov 08, 2024 at 10:43:34AM -0700, Jens Axboe wrote: >>>>> +++ b/fs/iomap/buffered-io.c >>>>> @@ -959,6 +959,8 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i) >>>>> } >>>>> if (iter->iomap.flags & IOMAP_F_STALE) >>>>> break; >>>>> + if (iter->flags & IOMAP_UNCACHED) >>>>> + folio_set_uncached(folio); >>>> >>>> This seems like it'd convert an existing page cache folio into being >>>> uncached? Is this just leftover from a previous version or is that a >>>> design decision you made? >>> >>> I'll see if we can improve that. Currently both the read and write side >>> do drop whatever it touches. We could feasibly just have it drop >>> newly instantiated pages - iow, uncached just won't create new persistent >>> folios, but it'll happily use the ones that are there already. >> >> Well that was nonsense on the read side, it deliberately only prunes >> entries that has uncached set. For the write side, this is a bit >> trickier. We'd essentially need to know if the folio populated by >> write_begin was found in the page cache, or create from new. Any way we >> can do that? One way is to change ->write_begin() so it takes a kiocb >> rather than a file, but that's an amount of churn I'd rather avoid! >> Maybe there's a way I'm just not seeing? > > Umm. We can solve it for iomap with a new FGP_UNCACHED flag and > checking IOMAP_UNCACHED in iomap_get_folio(). Not sure how we solve it > for other filesystems though. Any filesystem which uses FGP_NOWAIT has > _a_ solution, but eg btrfs will need to plumb through a third boolean > flag (or, more efficiently, just start passing FGP flags to > prepare_one_folio()). Yeah that's true, forgot we already have the IOMAP_UNCACHED flag there and it's available in creation. Thanks, I'll start with that.
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index ef0b68bccbb6..609256885094 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -959,6 +959,8 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i) } if (iter->iomap.flags & IOMAP_F_STALE) break; + if (iter->flags & IOMAP_UNCACHED) + folio_set_uncached(folio); offset = offset_in_folio(folio, pos); if (bytes > folio_size(folio) - offset) @@ -1023,8 +1025,9 @@ ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i, const struct iomap_ops *ops, void *private) { + struct address_space *mapping = iocb->ki_filp->f_mapping; struct iomap_iter iter = { - .inode = iocb->ki_filp->f_mapping->host, + .inode = mapping->host, .pos = iocb->ki_pos, .len = iov_iter_count(i), .flags = IOMAP_WRITE, @@ -1034,12 +1037,19 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i, if (iocb->ki_flags & IOCB_NOWAIT) iter.flags |= IOMAP_NOWAIT; + if (iocb->ki_flags & IOCB_UNCACHED) + iter.flags |= IOMAP_UNCACHED; while ((ret = iomap_iter(&iter, ops)) > 0) iter.processed = iomap_write_iter(&iter, i); if (unlikely(iter.pos == iocb->ki_pos)) return ret; + if (iocb->ki_flags & IOCB_UNCACHED) { + /* kick off uncached writeback, completion will drop it */ + __filemap_fdatawrite_range(mapping, iocb->ki_pos, iter.pos, + WB_SYNC_NONE); + } ret = iter.pos - iocb->ki_pos; iocb->ki_pos = iter.pos; return ret; diff --git a/include/linux/iomap.h b/include/linux/iomap.h index f61407e3b121..89b24fbb1399 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -173,8 +173,9 @@ struct iomap_folio_ops { #define IOMAP_NOWAIT (1 << 5) /* do not block */ #define IOMAP_OVERWRITE_ONLY (1 << 6) /* only pure overwrites allowed */ #define IOMAP_UNSHARE (1 << 7) /* unshare_file_range */ +#define IOMAP_UNCACHED (1 << 8) /* uncached IO */ #ifdef CONFIG_FS_DAX -#define IOMAP_DAX (1 << 8) /* DAX mapping */ +#define IOMAP_DAX (1 << 9) /* DAX mapping */ #else #define IOMAP_DAX 0 #endif /* CONFIG_FS_DAX */
Add iomap buffered write support for RWF_UNCACHED. If RWF_UNCACHED is set for a write, mark the folios being written with drop_writeback. Then writeback completion will drop the pages. The write_iter handler simply kicks off writeback for the pages, and writeback completion will take care of the rest. See the similar patch for the generic filemap handling for performance results, those were in fact done on XFS using this patch. Signed-off-by: Jens Axboe <axboe@kernel.dk> --- fs/iomap/buffered-io.c | 12 +++++++++++- include/linux/iomap.h | 3 ++- 2 files changed, 13 insertions(+), 2 deletions(-)