mm: Convert pagecache_isize_extended to use a folio

Message ID	20240405180038.2618624-1-willy@infradead.org (mailing list archive)
State	New
Headers	show Received: from casper.infradead.org (casper.infradead.org [90.155.50.34]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8ED3917167C for <linux-fsdevel@vger.kernel.org>; Fri, 5 Apr 2024 18:00:43 +0000 (UTC) From: "Matthew Wilcox (Oracle)" <willy@infradead.org> To: Andrew Morton <akpm@linux-foundation.org> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Pankaj Raghav <p.raghav@samsung.com> Subject: [PATCH] mm: Convert pagecache_isize_extended to use a folio Date: Fri, 5 Apr 2024 19:00:36 +0100 Message-ID: <20240405180038.2618624-1-willy@infradead.org> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	mm: Convert pagecache_isize_extended to use a folio \| expand mm: Convert pagecache_isize_extended to use a folio

Message ID

20240405180038.2618624-1-willy@infradead.org (mailing list archive)

State

New

Headers

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>,
	linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org,
	Pankaj Raghav <p.raghav@samsung.com>
Subject: [PATCH] mm: Convert pagecache_isize_extended to use a folio
Date: Fri,  5 Apr 2024 19:00:36 +0100
Message-ID: <20240405180038.2618624-1-willy@infradead.org>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

mm: Convert pagecache_isize_extended to use a folio | expand

Commit Message

Matthew Wilcox April 5, 2024, 6 p.m. UTC

Remove four hidden calls to compound_head().  Also exit early if the
filesystem block size is >= PAGE_SIZE instead of just equal to PAGE_SIZE.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
---
 mm/truncate.c | 36 +++++++++++++++++-------------------
 1 file changed, 17 insertions(+), 19 deletions(-)

Comments

David Hildenbrand April 9, 2024, 1:39 p.m. UTC | #1

On 05.04.24 20:00, Matthew Wilcox (Oracle) wrote:
> Remove four hidden calls to compound_head().  Also exit early if the
> filesystem block size is >= PAGE_SIZE instead of just equal to PAGE_SIZE.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
> ---
>   mm/truncate.c | 36 +++++++++++++++++-------------------
>   1 file changed, 17 insertions(+), 19 deletions(-)
> 
> diff --git a/mm/truncate.c b/mm/truncate.c
> index 725b150e47ac..e99085bf3d34 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -764,15 +764,15 @@ EXPORT_SYMBOL(truncate_setsize);
>    * @from:	original inode size
>    * @to:		new inode size
>    *
> - * Handle extension of inode size either caused by extending truncate or by
> - * write starting after current i_size. We mark the page straddling current
> - * i_size RO so that page_mkwrite() is called on the nearest write access to
> - * the page.  This way filesystem can be sure that page_mkwrite() is called on
> - * the page before user writes to the page via mmap after the i_size has been
> - * changed.
> + * Handle extension of inode size either caused by extending truncate or
> + * by write starting after current i_size.  We mark the page straddling
> + * current i_size RO so that page_mkwrite() is called on the first
> + * write access to the page.  The filesystem will update its per-block
> + * information before user writes to the page via mmap after the i_size
> + * has been changed.

Did you intend not to s/page/folio/ ?

>    *
>    * The function must be called after i_size is updated so that page fault
> - * coming after we unlock the page will already see the new i_size.
> + * coming after we unlock the folio will already see the new i_size.
>    * The function must be called while we still hold i_rwsem - this not only
>    * makes sure i_size is stable but also that userspace cannot observe new
>    * i_size value before we are prepared to store mmap writes at new inode size.
> @@ -781,31 +781,29 @@ void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to)


Reviewed-by: David Hildenbrand <david@redhat.com>

Matthew Wilcox April 9, 2024, 2:09 p.m. UTC | #2

On Tue, Apr 09, 2024 at 03:39:35PM +0200, David Hildenbrand wrote:
> On 05.04.24 20:00, Matthew Wilcox (Oracle) wrote:
> > + * Handle extension of inode size either caused by extending truncate or
> > + * by write starting after current i_size.  We mark the page straddling
> > + * current i_size RO so that page_mkwrite() is called on the first
> > + * write access to the page.  The filesystem will update its per-block
> > + * information before user writes to the page via mmap after the i_size
> > + * has been changed.
> 
> Did you intend not to s/page/folio/ ?

I did!  I think here we're talking about page faults, and we can control
the RO property on a per-PTE (ie per-page) basis.  Do you think it'd be
clearer if we talked about folios here?  We're in a bit of an interesting
spot because filesystems generally don't have folios which overhang the
end of the file by more than a page.  It can happen if truncate fails
to split a folio.

> Reviewed-by: David Hildenbrand <david@redhat.com>

Thanks!

David Hildenbrand April 9, 2024, 2:27 p.m. UTC | #3

On 09.04.24 16:09, Matthew Wilcox wrote:
> On Tue, Apr 09, 2024 at 03:39:35PM +0200, David Hildenbrand wrote:
>> On 05.04.24 20:00, Matthew Wilcox (Oracle) wrote:
>>> + * Handle extension of inode size either caused by extending truncate or
>>> + * by write starting after current i_size.  We mark the page straddling
>>> + * current i_size RO so that page_mkwrite() is called on the first
>>> + * write access to the page.  The filesystem will update its per-block
>>> + * information before user writes to the page via mmap after the i_size
>>> + * has been changed.
>>
>> Did you intend not to s/page/folio/ ?
> 
> I did!  I think here we're talking about page faults, and we can control
> the RO property on a per-PTE (ie per-page) basis.  Do you think it'd be
> clearer if we talked about folios here? 

Good point! It might have been clearer to me if that paragraph would 
talk about PTEs mapping folio pages, whereby the relevant PTEs are R/O 
such that we will catch first write access to these folio pages.

But re-reading it now, it's clearer to me taht the focus is on per-page 
handling. (although it might boil down to per-folio tracking in some 
cases, staring at filemap_page_mkwrite()).

> We're in a bit of an interesting
> spot because filesystems generally don't have folios which overhang the
> end of the file by more than a page.  It can happen if truncate fails
> to split a folio.

Right. I remember FALLOCATE_PUNCHOLE is in a similar position if 
splitting fails (cannot free up memory but have to zero it out IIRC).

diff --git a/mm/truncate.c b/mm/truncate.c
index 725b150e47ac..e99085bf3d34 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -764,15 +764,15 @@  EXPORT_SYMBOL(truncate_setsize);
  * @from:	original inode size
  * @to:		new inode size
  *
- * Handle extension of inode size either caused by extending truncate or by
- * write starting after current i_size. We mark the page straddling current
- * i_size RO so that page_mkwrite() is called on the nearest write access to
- * the page.  This way filesystem can be sure that page_mkwrite() is called on
- * the page before user writes to the page via mmap after the i_size has been
- * changed.
+ * Handle extension of inode size either caused by extending truncate or
+ * by write starting after current i_size.  We mark the page straddling
+ * current i_size RO so that page_mkwrite() is called on the first
+ * write access to the page.  The filesystem will update its per-block
+ * information before user writes to the page via mmap after the i_size
+ * has been changed.
  *
  * The function must be called after i_size is updated so that page fault
- * coming after we unlock the page will already see the new i_size.
+ * coming after we unlock the folio will already see the new i_size.
  * The function must be called while we still hold i_rwsem - this not only
  * makes sure i_size is stable but also that userspace cannot observe new
  * i_size value before we are prepared to store mmap writes at new inode size.
@@ -781,31 +781,29 @@  void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to)
 {
 	int bsize = i_blocksize(inode);
 	loff_t rounded_from;
-	struct page *page;
-	pgoff_t index;
+	struct folio *folio;
 
 	WARN_ON(to > inode->i_size);
 
-	if (from >= to || bsize == PAGE_SIZE)
+	if (from >= to || bsize >= PAGE_SIZE)
 		return;
 	/* Page straddling @from will not have any hole block created? */
 	rounded_from = round_up(from, bsize);
 	if (to <= rounded_from || !(rounded_from & (PAGE_SIZE - 1)))
 		return;
 
-	index = from >> PAGE_SHIFT;
-	page = find_lock_page(inode->i_mapping, index);
-	/* Page not cached? Nothing to do */
-	if (!page)
+	folio = filemap_lock_folio(inode->i_mapping, from / PAGE_SIZE);
+	/* Folio not cached? Nothing to do */
+	if (IS_ERR(folio))
 		return;
 	/*
-	 * See clear_page_dirty_for_io() for details why set_page_dirty()
+	 * See folio_clear_dirty_for_io() for details why folio_mark_dirty()
 	 * is needed.
 	 */
-	if (page_mkclean(page))
-		set_page_dirty(page);
-	unlock_page(page);
-	put_page(page);
+	if (folio_mkclean(folio))
+		folio_mark_dirty(folio);
+	folio_unlock(folio);
+	folio_put(folio);
 }
 EXPORT_SYMBOL(pagecache_isize_extended);

mm: Convert pagecache_isize_extended to use a folio

Commit Message

Comments

Patch