Message ID | 20240914140613.2334139-1-wangkefeng.wang@huawei.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [-next] tmpfs: fault in smaller chunks if large folio allocation not allowed | expand |
On Sat, Sep 14, 2024 at 10:06:13PM +0800, Kefeng Wang wrote: > +++ b/mm/shmem.c > @@ -3228,6 +3228,7 @@ static ssize_t shmem_file_write_iter(struct kiocb *iocb, struct iov_iter *from) > { > struct file *file = iocb->ki_filp; > struct inode *inode = file->f_mapping->host; > + pgoff_t index = iocb->ki_pos >> PAGE_SHIFT; > ssize_t ret; > > inode_lock(inode); > @@ -3240,6 +3241,10 @@ static ssize_t shmem_file_write_iter(struct kiocb *iocb, struct iov_iter *from) > ret = file_update_time(file); > if (ret) > goto unlock; > + > + if (!shmem_allowable_huge_orders(inode, NULL, index, 0, false)) > + iocb->ki_flags |= IOCB_NO_LARGE_CHUNK; Wouldn't it be better to call mapping_set_folio_order_range() so we don't need this IOCB flag?
On 2024/9/15 18:40, Matthew Wilcox wrote: > On Sat, Sep 14, 2024 at 10:06:13PM +0800, Kefeng Wang wrote: >> +++ b/mm/shmem.c >> @@ -3228,6 +3228,7 @@ static ssize_t shmem_file_write_iter(struct kiocb *iocb, struct iov_iter *from) >> { >> struct file *file = iocb->ki_filp; >> struct inode *inode = file->f_mapping->host; >> + pgoff_t index = iocb->ki_pos >> PAGE_SHIFT; >> ssize_t ret; >> >> inode_lock(inode); >> @@ -3240,6 +3241,10 @@ static ssize_t shmem_file_write_iter(struct kiocb *iocb, struct iov_iter *from) >> ret = file_update_time(file); >> if (ret) >> goto unlock; >> + >> + if (!shmem_allowable_huge_orders(inode, NULL, index, 0, false)) >> + iocb->ki_flags |= IOCB_NO_LARGE_CHUNK; > > Wouldn't it be better to call mapping_set_folio_order_range() so we > don't need this IOCB flag? > I think it before, but the comment of mapping_set_folio_order_range() said, "The filesystem should call this function in its inode constructor to indicate which base size (min) and maximum size (max) of folio the VFS can use to cache the contents of the file. This should only be used if the filesystem needs special handling of folio sizes (ie there is something the core cannot know). Do not tune it based on, eg, i_size. Context: This should not be called while the inode is active as it is non-atomic." and dynamically modify mappings without protect maybe a risk. Or adding a new __generic_perform_write() with a chunk_size argument to avoid to use a IOCB flag?
diff --git a/include/linux/fs.h b/include/linux/fs.h index 1e25267e2e48..d642e323354b 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -346,6 +346,8 @@ struct readahead_control; #define IOCB_DIO_CALLER_COMP (1 << 22) /* kiocb is a read or write operation submitted by fs/aio.c. */ #define IOCB_AIO_RW (1 << 23) +/* fault int small chunks(PAGE_SIZE) from userspace */ +#define IOCB_NO_LARGE_CHUNK (1 << 24) /* for use in trace events */ #define TRACE_IOCB_STRINGS \ diff --git a/mm/filemap.c b/mm/filemap.c index 3e46ca45e13d..27e73f2680ef 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -4132,9 +4132,14 @@ ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i) loff_t pos = iocb->ki_pos; struct address_space *mapping = file->f_mapping; const struct address_space_operations *a_ops = mapping->a_ops; - size_t chunk = mapping_max_folio_size(mapping); long status = 0; ssize_t written = 0; + size_t chunk; + + if (iocb->ki_flags & IOCB_NO_LARGE_CHUNK) + chunk = PAGE_SIZE; + else + chunk = mapping_max_folio_size(mapping); do { struct folio *folio; diff --git a/mm/shmem.c b/mm/shmem.c index 530fe8cfcc21..2909bead774b 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -3228,6 +3228,7 @@ static ssize_t shmem_file_write_iter(struct kiocb *iocb, struct iov_iter *from) { struct file *file = iocb->ki_filp; struct inode *inode = file->f_mapping->host; + pgoff_t index = iocb->ki_pos >> PAGE_SHIFT; ssize_t ret; inode_lock(inode); @@ -3240,6 +3241,10 @@ static ssize_t shmem_file_write_iter(struct kiocb *iocb, struct iov_iter *from) ret = file_update_time(file); if (ret) goto unlock; + + if (!shmem_allowable_huge_orders(inode, NULL, index, 0, false)) + iocb->ki_flags |= IOCB_NO_LARGE_CHUNK; + ret = generic_perform_write(iocb, from); unlock: inode_unlock(inode);
The tmpfs supports large folio, but there is some configurable options to enable/disable large folio allocation, and for huge=within_size, large folio only allowabled if it fully within i_size, so there is performance issue when perform write without large folio, the issue is similar to commit 4e527d5841e2 ("iomap: fault in smaller chunks for non-large folio mappings"). Fix it by checking whether it allows large folio allocation or not before perform write. Fixes: 9aac777aaf94 ("filemap: Convert generic_perform_write() to support large folios") Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> --- include/linux/fs.h | 2 ++ mm/filemap.c | 7 ++++++- mm/shmem.c | 5 +++++ 3 files changed, 13 insertions(+), 1 deletion(-)