netfs: fix test for whether we can skip read when writing beyond EOF

Message ID 20210613233345.113565-1-jlayton@kernel.org (mailing list archive)
State New, archived
Series netfs: fix test for whether we can skip read when writing beyond EOF

Commit Message

Jeff Layton June 13, 2021, 11:33 p.m. UTC
It's not sufficient to skip reading when the pos is beyond the EOF.
There may be data at the head of the page that we need to fill in
before the write.

Add a new helper function that corrects and clarifies the logic of
when we can skip reads, and have it only zero out the part of the page
that won't have data copied in for the write.

Finally, don't set the page Uptodate after zeroing. It's not up to date
since the write data won't have been copied in yet.

Fixes: e1b1240c1ff5f ("netfs: Add write_begin helper")
Reported-by: Andrew W Elble <aweits@rit.edu>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/netfs/read_helper.c | 53 ++++++++++++++++++++++++++++++++----------
 1 file changed, 41 insertions(+), 12 deletions(-)

Comments

Matthew Wilcox June 14, 2021, 12:08 a.m. UTC | #1
On Sun, Jun 13, 2021 at 07:33:45PM -0400, Jeff Layton wrote:
> +static bool prep_noread_page(struct page *page, loff_t pos, size_t len)
>  {
> -	unsigned int i;
> +	struct inode *inode = page->mapping->host;
> +	loff_t i_size = i_size_read(inode);
> +	pgoff_t index = pos / thp_size(page);
> +	size_t offset = offset_in_page(pos);

offset_in_thp(page, pos);

With that change:

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>

David Howells June 14, 2021, 10:04 a.m. UTC | #2
Jeff Layton <jlayton@kernel.org> wrote:

> +/**
> + * prep_noread_page - prep a page for writing without reading first

It's a static function, so I'm not sure it needs the kernel doc marker.

It also needs prefixing with "netfs_".

> +	/* pos beyond last page in the file */
> +	if (index > ((i_size - 1) / thp_size(page)))
> +		goto zero_out;

thp_size() is not a constant, so this gets you a DIV instruction.

Why not:

	if (page_offset(page) >= i_size)

or maybe:

	if (pos - offset >= i_size)
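
As a rough check that the subtraction form is a drop-in replacement for the division form, the following userspace sketch models both. The function names are hypothetical, and `size` stands in for `thp_size(page)` under the assumption that it is a power of two.

```c
#include <assert.h>
#include <stdbool.h>

/* Division form from the original patch; size models thp_size(page). */
static bool beyond_last_page_div(long long pos, long long i_size, long long size)
{
	long long index = pos / size;

	return index > (i_size - 1) / size;
}

/* Subtraction form: compare the start of the page against i_size. */
static bool beyond_last_page_sub(long long pos, long long i_size, long long size)
{
	long long offset = pos & (size - 1);	/* assumes size is a power of two */

	assert(size > 0 && (size & (size - 1)) == 0);
	return pos - offset >= i_size;
}
```

For any non-empty file the two forms agree, because the page start is a multiple of `size`. They differ only for `i_size == 0`: the division form computes `-1 / size`, which truncates to 0 in C, so a write into the first page is not caught, whereas the subtraction form returns true for every non-negative pos.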

> +	zero_user_segments(page, 0, offset, offset + len, thp_size(page));

If you're going to leave a hole in the file, this will break afs, so this
patch needs to deal with that too (basically if copied < len, then the
remainder needs clearing, give or take len being trimmed to the end of the
page).  I can look at adding that.
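
The hole being described can be pictured with a small userspace model of the two-segment zeroing. This is only a sketch: `model_zero_segments` and `model_prep_page` are hypothetical names, the buffer stands in for the page, and clamping the write range to the page is an assumption added here, not something the quoted line does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace model of zeroing two segments of a page buffer:
 * [start1, end1) and [start2, end2).  Not the kernel implementation.
 */
static void model_zero_segments(unsigned char *page, size_t size,
				size_t start1, size_t end1,
				size_t start2, size_t end2)
{
	assert(start1 <= end1 && end1 <= size);
	assert(start2 <= end2 && end2 <= size);

	if (end1 > start1)
		memset(page + start1, 0, end1 - start1);
	if (end2 > start2)
		memset(page + start2, 0, end2 - start2);
}

/* Zero everything except the range the write is expected to fill. */
static void model_prep_page(unsigned char *page, size_t size,
			    size_t offset, size_t len)
{
	size_t end = offset + len;

	assert(offset <= size);
	if (end > size)		/* assumption: clamp the write to the page */
		end = size;
	model_zero_segments(page, size, 0, offset, end, size);
}
```

After this runs, the bytes in [offset, offset + len) are left untouched on the expectation that the copy will fill them. If the copy is short, the tail of that range has been neither zeroed nor written, which is the gap that needs clearing or reading.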

Matthew Wilcox <willy@infradead.org> wrote:

> > +	size_t offset = offset_in_page(pos);
> 
> offset_in_thp(page, pos);

I can make this change too.

(btw, can offset_in_thp() have its second arg renamed to 'pos', not just 'p'?
 'p' is normally used to indicate a pointer of some sort).

David Howells June 14, 2021, 10:19 a.m. UTC | #3
David Howells <dhowells@redhat.com> wrote:

> > +	zero_user_segments(page, 0, offset, offset + len, thp_size(page));
> 
> If you're going to leave a hole in the file, this will break afs, so this
> patch needs to deal with that too (basically if copied < len, then the
> remainder needs clearing, give or take len being trimmed to the end of the
> page).  I can look at adding that.

Clearing or reading.

David

David Howells June 14, 2021, 10:23 a.m. UTC | #4
Jeff Layton <jlayton@kernel.org> wrote:

> +	/* full page write */
> +	if (offset == 0 && len >= thp_size(page))
> +		goto zero_out;

Why not just return?

David Howells <dhowells@redhat.com> wrote:

> Why not:
> 
> 	if (page_offset(page) >= i_size)

And if I switch to this, then:

	/* Zero-length file */
	if (i_size == 0)
		goto zero_out;

this is redundant.

David

Jeff Layton June 14, 2021, 11:35 a.m. UTC | #5
On Mon, 2021-06-14 at 11:04 +0100, David Howells wrote:
> Jeff Layton <jlayton@kernel.org> wrote:
> 
> > +/**
> > + * prep_noread_page - prep a page for writing without reading first
> 
> It's a static function, so I'm not sure it needs the kernel doc marker.
> 
> It also needs prefixing with "netfs_".
> 

I added the comment since the logic here is somewhat complex. It didn't
need to be a kerneldoc header, but I figured that didn't hurt anything.

> > +	/* pos beyond last page in the file */
> > +	if (index > ((i_size - 1) / thp_size(page)))
> > +		goto zero_out;
> 
> thp_size() is not a constant, so this gets you a DIV instruction.
> 

Ugh, ok.

> Why not:
> 
> 	if (page_offset(page) >= i_size)
> 

That doesn't handle THP's correctly. It's just a PAGE_SIZE shift.

> or maybe:
> 
> 	if (pos - offset >= i_size)
> 

That might work.

> > +	zero_user_segments(page, 0, offset, offset + len, thp_size(page));
> 
> If you're going to leave a hole in the file, this will break afs, so this
> patch needs to deal with that too (basically if copied < len, then the
> remainder needs clearing, give or take len being trimmed to the end of the
> page).  I can look at adding that.
> 

I think we have to contend with that in write_end. Basically if the copy
is short, then we probably want to pretend it was a zero length copy and
let generic_perform_write handle it as such. See commit b9de313cf05fe
where Al fixed some sketchy error handling in ceph_write_end along those
lines.
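
A minimal userspace sketch of that idea, assuming the only page state that matters is an uptodate flag. `struct model_page` and `model_write_end` are hypothetical names; a real write_end does considerably more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the page state that matters here. */
struct model_page {
	bool uptodate;
};

/*
 * Return the number of bytes write_end should report as copied.
 * A short copy into a page that was never read leaves stale bytes,
 * so report 0 and let the caller retry.
 */
static size_t model_write_end(struct model_page *page, size_t len, size_t copied)
{
	assert(copied <= len);

	if (!page->uptodate) {
		if (copied < len)
			return 0;
		/* In this model every byte is now either zeroed or written. */
		page->uptodate = true;
	}
	return copied;
}
```

Reporting zero bytes means the caller treats the attempt as if nothing was copied, rather than exposing a page with an unfilled region. A page that was already up to date can accept a short copy as-is.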

> Matthew Wilcox <willy@infradead.org> wrote:
> 
> > > +	size_t offset = offset_in_page(pos);
> > 
> > offset_in_thp(page, pos);
> 
> I can make this change too.
> 

Thanks.

> (btw, can offset_in_thp() have its second arg renamed to 'pos', not just 'p'?
>  'p' is normally used to indicate a pointer of some sort).
>

David Howells June 14, 2021, 11:45 a.m. UTC | #6
Jeff Layton <jlayton@kernel.org> wrote:

> > Why not:
> > 
> > 	if (page_offset(page) >= i_size)
> > 
> 
> That doesn't handle THP's correctly. It's just a PAGE_SIZE shift.

I asked Willy about that one and he said it will.  Now, granted, the code
doesn't seem to do that, but possibly he has a patch for it?

David

Matthew Wilcox June 14, 2021, 11:51 a.m. UTC | #7
On Mon, Jun 14, 2021 at 11:04:53AM +0100, David Howells wrote:
> (btw, can offset_in_thp() have its second arg renamed to 'pos', not just 'p'?
>  'p' is normally used to indicate a pointer of some sort).

the argument is sometimes a pointer.  for example:

arch/arm64/kernel/mte.c:                offset = offset_in_page(addr);
fs/jbd2/commit.c:               (void *)(addr + offset_in_page(bh->b_data)), bh->b_size);

yes, those are offset_in_page(), not offset_in_thp(), but i'll bet
you a cadbury's creme egg that we find someone who needs to use
offset_in_thp() (or offset_in_folio()) on a pointer within three years.

Matthew Wilcox June 14, 2021, 11:58 a.m. UTC | #8
On Mon, Jun 14, 2021 at 12:45:34PM +0100, David Howells wrote:
> Jeff Layton <jlayton@kernel.org> wrote:
> 
> > > Why not:
> > > 
> > > 	if (page_offset(page) >= i_size)
> > > 
> > 
> > That doesn't handle THP's correctly. It's just a PAGE_SIZE shift.
> 
> I asked Willy about that one and he said it will.  Now, granted, the code
> doesn't seem to do that, but possibly he has a patch for it?

a THP has its index in units of PAGE_SIZE.  If you have an order-4
page at file offset 32 * PAGE_SIZE, it will have page->index set to 32.
So shifting by PAGE_SIZE is correct.  This contrasts with the insanity
of hugetlbfs which has its index in units of the hpage_size.

The only thing is that you have to pass around the head page.  tail->index
is meaningless.  But you should always be passing around the head page
unless there's a really good reason not to (eg vmf->page where we really
need to know which subpage the fault was in).
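
The index arithmetic above can be sketched in userspace as follows. The `model_` names and the 4 KiB base page size are assumptions for illustration only; the struct is a stand-in for a head page and is not the kernel's `struct page`.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

/* Hypothetical head-page descriptor: index is in PAGE_SIZE units. */
struct model_head_page {
	unsigned long index;	/* file offset / PAGE_SIZE */
	unsigned int order;	/* compound page covers 1 << order base pages */
};

/* File position of the first byte of the (head) page. */
static unsigned long long model_page_offset(const struct model_head_page *p)
{
	return (unsigned long long)p->index << MODEL_PAGE_SHIFT;
}

/* Size in bytes of the whole compound page. */
static size_t model_thp_size(const struct model_head_page *p)
{
	return MODEL_PAGE_SIZE << p->order;
}

/* Offset of pos within the compound page. */
static size_t model_offset_in_thp(const struct model_head_page *p,
				  unsigned long long pos)
{
	/* Assumes the head page index is naturally aligned to its order. */
	assert((p->index & ((1UL << p->order) - 1)) == 0);
	return (size_t)(pos & (model_thp_size(p) - 1));
}
```

For an order-4 page with index 32, the page offset is 32 * PAGE_SIZE and the page spans 16 * PAGE_SIZE bytes, so subtracting the in-page offset from any pos inside it gives back the same page offset.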

David Howells June 14, 2021, 12:25 p.m. UTC | #9
Here's my take on Jeff's patch.

David
---
commit 2821dcb8a99b5da3b914cfc57ba6010635719b12
Author: Jeff Layton <jlayton@kernel.org>
Date:   Sun Jun 13 19:33:45 2021 -0400

    netfs: fix test for whether we can skip read when writing beyond EOF
    
    It's not sufficient to skip reading when the pos is beyond the EOF.
    There may be data at the head of the page that we need to fill in
    before the write.
    
    Add a new helper function that corrects and clarifies the logic of
    when we can skip reads, and have it only zero out the part of the page
    that won't have data copied in for the write.
    
    Finally, don't set the page Uptodate after zeroing. It's not up to date
    since the write data won't have been copied in yet.
    
    [DH made the following changes:
    
     - Prefixed the new function with "netfs_".
    
     - Don't call zero_user_segments() for a full-page write.
    
     - Altered the beyond-last-page check to avoid a DIV instruction and got
       rid of then-redundant zero-length file check.
    
     - Made afs handle a short copy (didn't matter before as the page was fully
       set up before the copy).  Now it returns 0 from afs_write_end() if the
       page was not uptodate as commit b9de313cf05fe08fa59efaf19756ec5283af672a
       does for ceph.
    
     - Made afs handle len indicating a write that extended over the end of the
       page allocated.]
    
    Fixes: e1b1240c1ff5f ("netfs: Add write_begin helper")
    Reported-by: Andrew W Elble <aweits@rit.edu>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: David Howells <dhowells@redhat.com>
    Link: https://lore.kernel.org/r/340984.1623666185@warthog.procyon.org.uk/T/#m5e9b155310973bc71cf101d3dea6aced492bfd49
---
 fs/afs/write.c         |   23 ++++++++++++++++------
 fs/netfs/read_helper.c |   50 ++++++++++++++++++++++++++++++++++++-------------
 2 files changed, 54 insertions(+), 19 deletions(-)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index 4e90e094d770..15fdc3f8a3ae 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -28,7 +28,8 @@ int afs_set_page_dirty(struct page *page)
 }
 
 /*
- * prepare to perform part of a write to a page
+ * Prepare to perform part of a write to a page.  Note that len may extend
+ * beyond the end of the page.
  */
 int afs_write_begin(struct file *file, struct address_space *mapping,
 		    loff_t pos, unsigned len, unsigned flags,
@@ -55,7 +56,8 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
 		return ret;
 
 	index = page->index;
-	from = pos - index * PAGE_SIZE;
+	from = offset_in_thp(page, pos);
+	len = min_t(size_t, len, thp_size(page) - from);
 	to = from + len;
 
 try_again:
@@ -106,7 +108,8 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
 }
 
 /*
- * finalise part of a write to a page
+ * Finalise part of a write to a page.  Note that len may extend beyond the end
+ * of the page.
  */
 int afs_write_end(struct file *file, struct address_space *mapping,
 		  loff_t pos, unsigned len, unsigned copied,
@@ -115,13 +118,23 @@ int afs_write_end(struct file *file, struct address_space *mapping,
 	struct afs_vnode *vnode = AFS_FS_I(file_inode(file));
 	struct page *page = thp_head(subpage);
 	unsigned long priv;
-	unsigned int f, from = pos & (thp_size(page) - 1);
+	unsigned int f, from = offset_in_thp(page, pos);
 	unsigned int t, to = from + copied;
+	unsigned int l = min_t(size_t, len, thp_size(page) - from);
 	loff_t i_size, write_end_pos;
 
 	_enter("{%llx:%llu},{%lx}",
 	       vnode->fid.vid, vnode->fid.vnode, page->index);
 
+	if (!PageUptodate(page)) {
+		if (copied < l) {
+			copied = 0;
+			goto out;
+		}
+
+		SetPageUptodate(page);
+	}
+
 	if (copied == 0)
 		goto out;
 
@@ -137,8 +150,6 @@ int afs_write_end(struct file *file, struct address_space *mapping,
 		fscache_update_cookie(afs_vnode_cache(vnode), NULL, &write_end_pos);
 	}
 
-	ASSERT(PageUptodate(page));
-
 	if (PagePrivate(page)) {
 		priv = page_private(page);
 		f = afs_page_dirty_from(page, priv);
diff --git a/fs/netfs/read_helper.c b/fs/netfs/read_helper.c
index 725614625ed4..0c4e68ae8127 100644
--- a/fs/netfs/read_helper.c
+++ b/fs/netfs/read_helper.c
@@ -1011,12 +1011,43 @@ int netfs_readpage(struct file *file,
 }
 EXPORT_SYMBOL(netfs_readpage);
 
-static void netfs_clear_thp(struct page *page)
+/**
+ * prep_noread_page - prep a page for writing without reading first
+ * @page: page being prepared
+ * @pos: starting position for the write
+ * @len: length of write
+ *
+ * In some cases, write_begin doesn't need to read at all:
+ * - full page write
+ * - file is currently zero-length
+ * - write that lies in a page that is completely beyond EOF
+ * - write that covers the page from start to EOF or beyond it
+ *
+ * If any of these criteria are met, then zero out the unwritten parts
+ * of the page and return true. Otherwise, return false.
+ */
+static noinline bool netfs_prep_noread_page(struct page *page, loff_t pos, size_t len)
 {
-	unsigned int i;
+	struct inode *inode = page->mapping->host;
+	loff_t i_size = i_size_read(inode);
+	size_t offset = offset_in_thp(page, pos);
+
+	/* Full page write */
+	if (offset == 0 && len >= thp_size(page))
+		return true;
+
+	/* pos beyond last page in the file */
+	if (pos - offset >= i_size)
+		goto zero_out;
+
+	/* Write that covers from the start of the page  to EOF or beyond */
+	if (offset == 0 && (pos + len) >= i_size)
+		goto zero_out;
 
-	for (i = 0; i < thp_nr_pages(page); i++)
-		clear_highpage(page + i);
+	return false;
+zero_out:
+	zero_user_segments(page, 0, offset, offset + len, thp_size(page));
+	return true;
 }
 
 /**
@@ -1024,7 +1055,7 @@ static void netfs_clear_thp(struct page *page)
  * @file: The file to read from
  * @mapping: The mapping to read from
  * @pos: File position at which the write will begin
- * @len: The length of the write in this page
+ * @len: The length of the write (may extend beyond the end of the page chosen)
  * @flags: AOP_* flags
  * @_page: Where to put the resultant page
  * @_fsdata: Place for the netfs to store a cookie
@@ -1061,8 +1092,6 @@ int netfs_write_begin(struct file *file, struct address_space *mapping,
 	struct inode *inode = file_inode(file);
 	unsigned int debug_index = 0;
 	pgoff_t index = pos >> PAGE_SHIFT;
-	int pos_in_page = pos & ~PAGE_MASK;
-	loff_t size;
 	int ret;
 
 	DEFINE_READAHEAD(ractl, file, NULL, mapping, index);
@@ -1090,13 +1119,8 @@ int netfs_write_begin(struct file *file, struct address_space *mapping,
 	 * within the cache granule containing the EOF, in which case we need
 	 * to preload the granule.
 	 */
-	size = i_size_read(inode);
 	if (!ops->is_cache_enabled(inode) &&
-	    ((pos_in_page == 0 && len == thp_size(page)) ||
-	     (pos >= size) ||
-	     (pos_in_page == 0 && (pos + len) >= size))) {
-		netfs_clear_thp(page);
-		SetPageUptodate(page);
+	    netfs_prep_noread_page(page, pos, len)) {
 		netfs_stat(&netfs_n_rh_write_zskip);
 		goto have_page_no_wait;
 	}

Jeff Layton June 14, 2021, 12:31 p.m. UTC | #10
On Mon, 2021-06-14 at 13:25 +0100, David Howells wrote:
> Here's my take on Jeff's patch.
> 
> David
> ---
> commit 2821dcb8a99b5da3b914cfc57ba6010635719b12
> Author: Jeff Layton <jlayton@kernel.org>
> Date:   Sun Jun 13 19:33:45 2021 -0400
> 
>     netfs: fix test for whether we can skip read when writing beyond EOF
>     
>     It's not sufficient to skip reading when the pos is beyond the EOF.
>     There may be data at the head of the page that we need to fill in
>     before the write.
>     
>     Add a new helper function that corrects and clarifies the logic of
>     when we can skip reads, and have it only zero out the part of the page
>     that won't have data copied in for the write.
>     
>     Finally, don't set the page Uptodate after zeroing. It's not up to date
>     since the write data won't have been copied in yet.
>     
>     [DH made the following changes:
>     
>      - Prefixed the new function with "netfs_".
>     
>      - Don't call zero_user_segments() for a full-page write.
>     
>      - Altered the beyond-last-page check to avoid a DIV instruction and got
>        rid of then-redundant zero-length file check.
>     
>      - Made afs handle a short copy (didn't matter before as the page was fully
>        set up before the copy).  Now it returns 0 from afs_write_end() if the
>        page was not uptodate as commit b9de313cf05fe08fa59efaf19756ec5283af672a
>        does for ceph.
>     
>      - Made afs handle len indicating a write that extended over the end of the
>        page allocated.]
>     
>     Fixes: e1b1240c1ff5f ("netfs: Add write_begin helper")
>     Reported-by: Andrew W Elble <aweits@rit.edu>
>     Signed-off-by: Jeff Layton <jlayton@kernel.org>
>     Signed-off-by: David Howells <dhowells@redhat.com>
>     Link: https://lore.kernel.org/r/340984.1623666185@warthog.procyon.org.uk/T/#m5e9b155310973bc71cf101d3dea6aced492bfd49
> ---
>  fs/afs/write.c         |   23 ++++++++++++++++------
>  fs/netfs/read_helper.c |   50 ++++++++++++++++++++++++++++++++++++-------------
>  2 files changed, 54 insertions(+), 19 deletions(-)
> 
> diff --git a/fs/afs/write.c b/fs/afs/write.c
> index 4e90e094d770..15fdc3f8a3ae 100644
> --- a/fs/afs/write.c
> +++ b/fs/afs/write.c
> @@ -28,7 +28,8 @@ int afs_set_page_dirty(struct page *page)
>  }
>  
>  /*
> - * prepare to perform part of a write to a page
> + * Prepare to perform part of a write to a page.  Note that len may extend
> + * beyond the end of the page.
>   */
>  int afs_write_begin(struct file *file, struct address_space *mapping,
>  		    loff_t pos, unsigned len, unsigned flags,
> @@ -55,7 +56,8 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
>  		return ret;
>  
>  	index = page->index;
> -	from = pos - index * PAGE_SIZE;
> +	from = offset_in_thp(page, pos);
> +	len = min_t(size_t, len, thp_size(page) - from);
>  	to = from + len;
>  
>  try_again:
> @@ -106,7 +108,8 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
>  }
>  
>  /*
> - * finalise part of a write to a page
> + * Finalise part of a write to a page.  Note that len may extend beyond the end
> + * of the page.
>   */
>  int afs_write_end(struct file *file, struct address_space *mapping,
>  		  loff_t pos, unsigned len, unsigned copied,
> @@ -115,13 +118,23 @@ int afs_write_end(struct file *file, struct address_space *mapping,
>  	struct afs_vnode *vnode = AFS_FS_I(file_inode(file));
>  	struct page *page = thp_head(subpage);
>  	unsigned long priv;
> -	unsigned int f, from = pos & (thp_size(page) - 1);
> +	unsigned int f, from = offset_in_thp(page, pos);
>  	unsigned int t, to = from + copied;
> +	unsigned int l = min_t(size_t, len, thp_size(page) - from);
>  	loff_t i_size, write_end_pos;
>  
>  	_enter("{%llx:%llu},{%lx}",
>  	       vnode->fid.vid, vnode->fid.vnode, page->index);
>  
> +	if (!PageUptodate(page)) {
> +		if (copied < l) {
> +			copied = 0;
> +			goto out;
> +		}
> +
> +		SetPageUptodate(page);
> +	}
> +
>  	if (copied == 0)
>  		goto out;
>  
> @@ -137,8 +150,6 @@ int afs_write_end(struct file *file, struct address_space *mapping,
>  		fscache_update_cookie(afs_vnode_cache(vnode), NULL, &write_end_pos);
>  	}
>  
> -	ASSERT(PageUptodate(page));
> -
>  	if (PagePrivate(page)) {
>  		priv = page_private(page);
>  		f = afs_page_dirty_from(page, priv);

AFS changes should probably go into a separate patch.

> diff --git a/fs/netfs/read_helper.c b/fs/netfs/read_helper.c
> index 725614625ed4..0c4e68ae8127 100644
> --- a/fs/netfs/read_helper.c
> +++ b/fs/netfs/read_helper.c
> @@ -1011,12 +1011,43 @@ int netfs_readpage(struct file *file,
>  }
>  EXPORT_SYMBOL(netfs_readpage);
>  
> -static void netfs_clear_thp(struct page *page)
> +/**
> + * prep_noread_page - prep a page for writing without reading first

nit: fix function name here too?

> + * @page: page being prepared
> + * @pos: starting position for the write
> + * @len: length of write
> + *
> + * In some cases, write_begin doesn't need to read at all:
> + * - full page write
> + * - file is currently zero-length
> + * - write that lies in a page that is completely beyond EOF
> + * - write that covers the page from start to EOF or beyond it
> + *
> + * If any of these criteria are met, then zero out the unwritten parts
> + * of the page and return true. Otherwise, return false.
> + */
> +static noinline bool netfs_prep_noread_page(struct page *page, loff_t pos, size_t len)
>  {
> -	unsigned int i;
> +	struct inode *inode = page->mapping->host;
> +	loff_t i_size = i_size_read(inode);
> +	size_t offset = offset_in_thp(page, pos);
> +
> +	/* Full page write */
> +	if (offset == 0 && len >= thp_size(page))
> +		return true;
> +
> +	/* pos beyond last page in the file */
> +	if (pos - offset >= i_size)
> +		goto zero_out;
> +
> +	/* Write that covers from the start of the page  to EOF or beyond */
> +	if (offset == 0 && (pos + len) >= i_size)
> +		goto zero_out;
>  
> -	for (i = 0; i < thp_nr_pages(page); i++)
> -		clear_highpage(page + i);
> +	return false;
> +zero_out:
> +	zero_user_segments(page, 0, offset, offset + len, thp_size(page));
> +	return true;
>  }
>  
>  /**
> @@ -1024,7 +1055,7 @@ static void netfs_clear_thp(struct page *page)
>   * @file: The file to read from
>   * @mapping: The mapping to read from
>   * @pos: File position at which the write will begin
> - * @len: The length of the write in this page
> + * @len: The length of the write (may extend beyond the end of the page chosen)
>   * @flags: AOP_* flags
>   * @_page: Where to put the resultant page
>   * @_fsdata: Place for the netfs to store a cookie
> @@ -1061,8 +1092,6 @@ int netfs_write_begin(struct file *file, struct address_space *mapping,
>  	struct inode *inode = file_inode(file);
>  	unsigned int debug_index = 0;
>  	pgoff_t index = pos >> PAGE_SHIFT;
> -	int pos_in_page = pos & ~PAGE_MASK;
> -	loff_t size;
>  	int ret;
>  
>  	DEFINE_READAHEAD(ractl, file, NULL, mapping, index);
> @@ -1090,13 +1119,8 @@ int netfs_write_begin(struct file *file, struct address_space *mapping,
>  	 * within the cache granule containing the EOF, in which case we need
>  	 * to preload the granule.
>  	 */
> -	size = i_size_read(inode);
>  	if (!ops->is_cache_enabled(inode) &&
> -	    ((pos_in_page == 0 && len == thp_size(page)) ||
> -	     (pos >= size) ||
> -	     (pos_in_page == 0 && (pos + len) >= size))) {
> -		netfs_clear_thp(page);
> -		SetPageUptodate(page);
> +	    netfs_prep_noread_page(page, pos, len)) {
>  		netfs_stat(&netfs_n_rh_write_zskip);
>  		goto have_page_no_wait;
>  	}
> 

The rest looks good to me.

Matthew Wilcox June 14, 2021, 12:54 p.m. UTC | #11
On Mon, Jun 14, 2021 at 01:25:32PM +0100, David Howells wrote:
> +	/* Full page write */
> +	if (offset == 0 && len >= thp_size(page))
> +		return true;
> +
> +	/* pos beyond last page in the file */
> +	if (pos - offset >= i_size)
> +		goto zero_out;
> +
> +	/* Write that covers from the start of the page  to EOF or beyond */

spurious double space between page and to.

> @@ -1090,13 +1119,8 @@ int netfs_write_begin(struct file *file, struct address_space *mapping,
>  	 * within the cache granule containing the EOF, in which case we need
>  	 * to preload the granule.
>  	 */
> -	size = i_size_read(inode);
>  	if (!ops->is_cache_enabled(inode) &&
> -	    ((pos_in_page == 0 && len == thp_size(page)) ||
> -	     (pos >= size) ||
> -	     (pos_in_page == 0 && (pos + len) >= size))) {
> -		netfs_clear_thp(page);
> -		SetPageUptodate(page);
> +	    netfs_prep_noread_page(page, pos, len)) {

I don't like the name ... netfs_skip_page_read()?
Patch

diff --git a/fs/netfs/read_helper.c b/fs/netfs/read_helper.c
index 725614625ed4..fcf3a8a09a00 100644
--- a/fs/netfs/read_helper.c
+++ b/fs/netfs/read_helper.c
@@ -1011,12 +1011,48 @@  int netfs_readpage(struct file *file,
 }
 EXPORT_SYMBOL(netfs_readpage);
 
-static void netfs_clear_thp(struct page *page)
+/**
+ * prep_noread_page - prep a page for writing without reading first
+ * @page: page being prepared
+ * @pos: starting position for the write
+ * @len: length of write
+ *
+ * In some cases, write_begin doesn't need to read at all:
+ * - full page write
+ * - file is currently zero-length
+ * - write that lies in a page that is completely beyond EOF
+ * - write that covers the page from start to EOF or beyond it
+ *
+ * If any of these criteria are met, then zero out the unwritten parts
+ * of the page and return true. Otherwise, return false.
+ */
+static bool prep_noread_page(struct page *page, loff_t pos, size_t len)
 {
-	unsigned int i;
+	struct inode *inode = page->mapping->host;
+	loff_t i_size = i_size_read(inode);
+	pgoff_t index = pos / thp_size(page);
+	size_t offset = offset_in_page(pos);
+
+	/* full page write */
+	if (offset == 0 && len >= thp_size(page))
+		goto zero_out;
 
-	for (i = 0; i < thp_nr_pages(page); i++)
-		clear_highpage(page + i);
+	/* zero-length file */
+	if (i_size == 0)
+		goto zero_out;
+
+	/* pos beyond last page in the file */
+	if (index > ((i_size - 1) / thp_size(page)))
+		goto zero_out;
+
+	/* write that covers the whole page from start to EOF or beyond it */
+	if (offset == 0 && (pos + len) >= i_size)
+		goto zero_out;
+
+	return false;
+zero_out:
+	zero_user_segments(page, 0, offset, offset + len, thp_size(page));
+	return true;
 }
 
 /**
@@ -1061,8 +1097,6 @@  int netfs_write_begin(struct file *file, struct address_space *mapping,
 	struct inode *inode = file_inode(file);
 	unsigned int debug_index = 0;
 	pgoff_t index = pos >> PAGE_SHIFT;
-	int pos_in_page = pos & ~PAGE_MASK;
-	loff_t size;
 	int ret;
 
 	DEFINE_READAHEAD(ractl, file, NULL, mapping, index);
@@ -1090,13 +1124,8 @@  int netfs_write_begin(struct file *file, struct address_space *mapping,
 	 * within the cache granule containing the EOF, in which case we need
 	 * to preload the granule.
 	 */
-	size = i_size_read(inode);
 	if (!ops->is_cache_enabled(inode) &&
-	    ((pos_in_page == 0 && len == thp_size(page)) ||
-	     (pos >= size) ||
-	     (pos_in_page == 0 && (pos + len) >= size))) {
-		netfs_clear_thp(page);
-		SetPageUptodate(page);
+	    prep_noread_page(page, pos, len)) {
 		netfs_stat(&netfs_n_rh_write_zskip);
 		goto have_page_no_wait;
 	}