Message ID | 20240405180038.2618624-1-willy@infradead.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: Convert pagecache_isize_extended to use a folio | expand |
On 05.04.24 20:00, Matthew Wilcox (Oracle) wrote: > Remove four hidden calls to compound_head(). Also exit early if the > filesystem block size is >= PAGE_SIZE instead of just equal to PAGE_SIZE. > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > Reviewed-by: Pankaj Raghav <p.raghav@samsung.com> > --- > mm/truncate.c | 36 +++++++++++++++++------------------- > 1 file changed, 17 insertions(+), 19 deletions(-) > > diff --git a/mm/truncate.c b/mm/truncate.c > index 725b150e47ac..e99085bf3d34 100644 > --- a/mm/truncate.c > +++ b/mm/truncate.c > @@ -764,15 +764,15 @@ EXPORT_SYMBOL(truncate_setsize); > * @from: original inode size > * @to: new inode size > * > - * Handle extension of inode size either caused by extending truncate or by > - * write starting after current i_size. We mark the page straddling current > - * i_size RO so that page_mkwrite() is called on the nearest write access to > - * the page. This way filesystem can be sure that page_mkwrite() is called on > - * the page before user writes to the page via mmap after the i_size has been > - * changed. > + * Handle extension of inode size either caused by extending truncate or > + * by write starting after current i_size. We mark the page straddling > + * current i_size RO so that page_mkwrite() is called on the first > + * write access to the page. The filesystem will update its per-block > + * information before user writes to the page via mmap after the i_size > + * has been changed. Did you intend not to s/page/folio/ ? > * > * The function must be called after i_size is updated so that page fault > - * coming after we unlock the page will already see the new i_size. > + * coming after we unlock the folio will already see the new i_size. > * The function must be called while we still hold i_rwsem - this not only > * makes sure i_size is stable but also that userspace cannot observe new > * i_size value before we are prepared to store mmap writes at new inode size. > @@ -781,31 +781,29 @@ void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to) Reviewed-by: David Hildenbrand <david@redhat.com>
On Tue, Apr 09, 2024 at 03:39:35PM +0200, David Hildenbrand wrote: > On 05.04.24 20:00, Matthew Wilcox (Oracle) wrote: > > + * Handle extension of inode size either caused by extending truncate or > > + * by write starting after current i_size. We mark the page straddling > > + * current i_size RO so that page_mkwrite() is called on the first > > + * write access to the page. The filesystem will update its per-block > > + * information before user writes to the page via mmap after the i_size > > + * has been changed. > > Did you intend not to s/page/folio/ ? I did! I think here we're talking about page faults, and we can control the RO property on a per-PTE (ie per-page) basis. Do you think it'd be clearer if we talked about folios here? We're in a bit of an interesting spot because filesystems generally don't have folios which overhang the end of the file by more than a page. It can happen if truncate fails to split a folio. > Reviewed-by: David Hildenbrand <david@redhat.com> Thanks!
On 09.04.24 16:09, Matthew Wilcox wrote: > On Tue, Apr 09, 2024 at 03:39:35PM +0200, David Hildenbrand wrote: >> On 05.04.24 20:00, Matthew Wilcox (Oracle) wrote: >>> + * Handle extension of inode size either caused by extending truncate or >>> + * by write starting after current i_size. We mark the page straddling >>> + * current i_size RO so that page_mkwrite() is called on the first >>> + * write access to the page. The filesystem will update its per-block >>> + * information before user writes to the page via mmap after the i_size >>> + * has been changed. >> >> Did you intend not to s/page/folio/ ? > > I did! I think here we're talking about page faults, and we can control > the RO property on a per-PTE (ie per-page) basis. Do you think it'd be > clearer if we talked about folios here? Good point! It might have been clearer to me if that paragraph would talk about PTEs mapping folio pages, whereby the relevant PTEs are R/O such that we will catch first write access to these folio pages. But re-reading it now, it's clearer to me taht the focus is on per-page handling. (although it might boil down to per-folio tracking in some cases, staring at filemap_page_mkwrite()). > We're in a bit of an interesting > spot because filesystems generally don't have folios which overhang the > end of the file by more than a page. It can happen if truncate fails > to split a folio. Right. I remember FALLOCATE_PUNCHOLE is in a similar position if splitting fails (cannot free up memory but have to zero it out IIRC).
diff --git a/mm/truncate.c b/mm/truncate.c index 725b150e47ac..e99085bf3d34 100644 --- a/mm/truncate.c +++ b/mm/truncate.c @@ -764,15 +764,15 @@ EXPORT_SYMBOL(truncate_setsize); * @from: original inode size * @to: new inode size * - * Handle extension of inode size either caused by extending truncate or by - * write starting after current i_size. We mark the page straddling current - * i_size RO so that page_mkwrite() is called on the nearest write access to - * the page. This way filesystem can be sure that page_mkwrite() is called on - * the page before user writes to the page via mmap after the i_size has been - * changed. + * Handle extension of inode size either caused by extending truncate or + * by write starting after current i_size. We mark the page straddling + * current i_size RO so that page_mkwrite() is called on the first + * write access to the page. The filesystem will update its per-block + * information before user writes to the page via mmap after the i_size + * has been changed. * * The function must be called after i_size is updated so that page fault - * coming after we unlock the page will already see the new i_size. + * coming after we unlock the folio will already see the new i_size. * The function must be called while we still hold i_rwsem - this not only * makes sure i_size is stable but also that userspace cannot observe new * i_size value before we are prepared to store mmap writes at new inode size. @@ -781,31 +781,29 @@ void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to) { int bsize = i_blocksize(inode); loff_t rounded_from; - struct page *page; - pgoff_t index; + struct folio *folio; WARN_ON(to > inode->i_size); - if (from >= to || bsize == PAGE_SIZE) + if (from >= to || bsize >= PAGE_SIZE) return; /* Page straddling @from will not have any hole block created? */ rounded_from = round_up(from, bsize); if (to <= rounded_from || !(rounded_from & (PAGE_SIZE - 1))) return; - index = from >> PAGE_SHIFT; - page = find_lock_page(inode->i_mapping, index); - /* Page not cached? Nothing to do */ - if (!page) + folio = filemap_lock_folio(inode->i_mapping, from / PAGE_SIZE); + /* Folio not cached? Nothing to do */ + if (IS_ERR(folio)) return; /* - * See clear_page_dirty_for_io() for details why set_page_dirty() + * See folio_clear_dirty_for_io() for details why folio_mark_dirty() * is needed. */ - if (page_mkclean(page)) - set_page_dirty(page); - unlock_page(page); - put_page(page); + if (folio_mkclean(folio)) + folio_mark_dirty(folio); + folio_unlock(folio); + folio_put(folio); } EXPORT_SYMBOL(pagecache_isize_extended);