
[01/67] mm: Stop filemap_read() from grabbing a superfluous page

Message ID 163456863216.2614702.6384850026368833133.stgit@warthog.procyon.org.uk
State New, archived
Series fscache: Rewrite index API and management system

Commit Message

David Howells Oct. 18, 2021, 2:50 p.m. UTC
Under some circumstances, filemap_read() will allocate sufficient pages to
read to the end of the file, call readahead/readpages on them and copy the
data over - and then it will allocate another page at the EOF and call
readpage on that and then ignore it.  This is unnecessary and a waste of
time and resources.

filemap_read() *does* check for this, but only after it has already done
the allocation and I/O.  Fix this by checking before calling
filemap_get_pages() also.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
cc: Matthew Wilcox (Oracle) <willy@infradead.org>
cc: linux-mm@kvack.org
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/160588481358.3465195.16552616179674485179.stgit@warthog.procyon.org.uk/
---

 mm/filemap.c |    4 ++++
 1 file changed, 4 insertions(+)
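
The effect described in the commit message can be sketched with a small user-space model. This is only an illustration under stated assumptions, not kernel code: `sim_file`, `sim_get_pages()`, `sim_read()` and `sim_count_fetches()` are hypothetical names, one fetch stands in for one `filemap_get_pages()` call, and the loop condition is assumed to mirror the `ki_pos < isize` test that the review comments refer to. The model counts how often the fetch step runs when a reader keeps calling read until it gets 0 back.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_PAGE_SIZE 4096L

/* Hypothetical stand-in for an inode plus a kiocb position. */
struct sim_file {
	long isize;        /* file size in bytes */
	long pos;          /* current read position */
	int  fetch_calls;  /* how often the "get pages" step ran */
};

/* Stand-in for filemap_get_pages(): it only counts invocations. */
static void sim_get_pages(struct sim_file *f)
{
	f->fetch_calls++;
}

/*
 * Model of a single read call; returns the number of bytes "copied".
 * With early_check set, EOF is tested before any pages are fetched.
 */
static long sim_read(struct sim_file *f, long count, bool early_check)
{
	long copied = 0;

	assert(count > 0);

	do {
		long chunk, in_page;

		/* The added check: bail out before allocating anything. */
		if (early_check && f->pos >= f->isize)
			break;

		sim_get_pages(f);

		/* The pre-existing check: runs only after the fetch. */
		if (f->pos >= f->isize)
			break;

		/* Copy up to the end of the page, the file, or the buffer. */
		in_page = SIM_PAGE_SIZE - (f->pos % SIM_PAGE_SIZE);
		chunk = f->isize - f->pos;
		if (chunk > in_page)
			chunk = in_page;
		if (chunk > count)
			chunk = count;

		f->pos += chunk;
		count -= chunk;
		copied += chunk;
	} while (count > 0 && f->pos < f->isize);

	return copied;
}

/* Read a whole file with a fixed buffer until a read call returns 0. */
static int sim_count_fetches(long isize, long bufsize, bool early_check)
{
	struct sim_file f = { .isize = isize, .pos = 0, .fetch_calls = 0 };

	while (sim_read(&f, bufsize, early_check) > 0)
		;
	return f.fetch_calls;
}
```

In this model, the final read call that starts at EOF costs one fetch without the early check and none with it. Every other call behaves the same either way.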

Comments

Jeff Layton Oct. 19, 2021, 5:13 p.m. UTC | #1
On Mon, 2021-10-18 at 15:50 +0100, David Howells wrote:
> Under some circumstances, filemap_read() will allocate sufficient pages to
> read to the end of the file, call readahead/readpages on them and copy the
> data over - and then it will allocate another page at the EOF and call
> readpage on that and then ignore it.  This is unnecessary and a waste of
> time and resources.
> 
> filemap_read() *does* check for this, but only after it has already done
> the allocation and I/O.  Fix this by checking before calling
> filemap_get_pages() also.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
> cc: Matthew Wilcox (Oracle) <willy@infradead.org>
> cc: linux-mm@kvack.org
> cc: linux-fsdevel@vger.kernel.org
> Link: https://lore.kernel.org/r/160588481358.3465195.16552616179674485179.stgit@warthog.procyon.org.uk/
> ---
> 
>  mm/filemap.c |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index dae481293b5d..c0cdc44c844e 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2625,6 +2625,10 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
>  		if ((iocb->ki_flags & IOCB_WAITQ) && already_read)
>  			iocb->ki_flags |= IOCB_NOWAIT;
>  
> +		isize = i_size_read(inode);
> +		if (unlikely(iocb->ki_pos >= isize))
> +			goto put_pages;
> +
>  		error = filemap_get_pages(iocb, iter, &pvec);
>  		if (error < 0)
>  			break;
> 
> 

I would wager that it's worth checking for this. I imagine read calls
beyond EOF are common enough that it's probably helpful to optimize that
case:

Acked-by: Jeff Layton <jlayton@redhat.com>

Matthew Wilcox Oct. 19, 2021, 6:28 p.m. UTC | #2
On Mon, Oct 18, 2021 at 03:50:32PM +0100, David Howells wrote:
> @@ -2625,6 +2625,10 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
>  		if ((iocb->ki_flags & IOCB_WAITQ) && already_read)
>  			iocb->ki_flags |= IOCB_NOWAIT;
>  
> +		isize = i_size_read(inode);
> +		if (unlikely(iocb->ki_pos >= isize))
> +			goto put_pages;
> +

Is there a good reason to assign to isize here?  I'd rather not,
because it complicates analysis, and a later change might look at
the isize read here, not realising it was a racy use.  So I'd
rather see:

		if (unlikely(iocb->ki_pos >= i_size_read(inode)))
			goto put_pages;

David Howells Oct. 19, 2021, 6:48 p.m. UTC | #3
Matthew Wilcox <willy@infradead.org> wrote:

> > +		isize = i_size_read(inode);
> > +		if (unlikely(iocb->ki_pos >= isize))
> > +			goto put_pages;
> > +
> 
> Is there a good reason to assign to isize here?  I'd rather not,
> because it complicates analysis, and a later change might look at
> the isize read here, not realising it was a racy use.  So I'd
> rather see:

If we don't set isize, the loop will never end.  Actually, maybe we can just
break out at that point rather than going to put_pages.

David

Matthew Wilcox Oct. 19, 2021, 8:04 p.m. UTC | #4
On Tue, Oct 19, 2021 at 07:48:15PM +0100, David Howells wrote:
> Matthew Wilcox <willy@infradead.org> wrote:
> 
> > > +		isize = i_size_read(inode);
> > > +		if (unlikely(iocb->ki_pos >= isize))
> > > +			goto put_pages;
> > > +
> > 
> > Is there a good reason to assign to isize here?  I'd rather not,
> > because it complicates analysis, and a later change might look at
> > the isize read here, not realising it was a racy use.  So I'd
> > rather see:
> 
> If we don't set isize, the loop will never end.  Actually, maybe we can just
> break out at that point rather than going to put_pages.

Umm, yes, of course.  Sorry.

It makes more sense to just break because we haven't got any pages,
so putting pages that we haven't got seems unnecessary.
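
The termination argument in this exchange can be modelled in user space. The sketch below is an illustration, not the kernel loop: it assumes the loop's closing condition compares the position with the local `isize`, as David's reply implies, and uses `LONG_MAX` as a stand-in for an `isize` that was never refreshed. `sim_eof_iterations()`, `enum eof_variant` and the `max_iter` safety cap are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <limits.h>

enum eof_variant {
	GOTO_WITHOUT_ASSIGN,  /* one-line check, still jumping to put_pages */
	GOTO_WITH_ASSIGN,     /* the hunk as posted */
	BREAK_WITHOUT_ASSIGN  /* one-line check that breaks out of the loop */
};

/*
 * Count loop iterations for a read that starts at or beyond EOF.
 * max_iter is a safety cap so that a non-terminating variant returns.
 */
static int sim_eof_iterations(long pos, long real_isize,
			      enum eof_variant variant, int max_iter)
{
	/* Stand-in for a local isize that was never refreshed. */
	long isize = LONG_MAX;
	int iterations = 0;

	assert(pos >= real_isize && max_iter > 0);

	do {
		iterations++;

		if (variant == BREAK_WITHOUT_ASSIGN) {
			if (pos >= real_isize)
				break;	/* skips the loop condition entirely */
		} else {
			if (variant == GOTO_WITH_ASSIGN)
				isize = real_isize;
			if (pos >= real_isize)
				goto put_pages;
		}

		/* Page fetch and copy would go here; not reached at EOF. */
put_pages:
		;	/* nothing to release in this model */
	} while (iterations < max_iter && pos < isize);

	return iterations;
}
```

With a jump to `put_pages`, the loop condition is still evaluated, so it only terminates if `isize` was assigned first. With `break`, the condition is never consulted and the assignment is not needed.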

Patch

diff --git a/mm/filemap.c b/mm/filemap.c
index dae481293b5d..c0cdc44c844e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2625,6 +2625,10 @@  ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
 		if ((iocb->ki_flags & IOCB_WAITQ) && already_read)
 			iocb->ki_flags |= IOCB_NOWAIT;
 
+		isize = i_size_read(inode);
+		if (unlikely(iocb->ki_pos >= isize))
+			goto put_pages;
+
 		error = filemap_get_pages(iocb, iter, &pvec);
 		if (error < 0)
 			break;