[4/4] mm: use the cached page for filemap_fault

Message ID: 20181130195812.19536-5-josef@toxicpanda.com
State: New, archived
Headers:
  Return-Path: <linux-fsdevel-owner@kernel.org>
  From: Josef Bacik <josef@toxicpanda.com>
  To: kernel-team@fb.com, hannes@cmpxchg.org,
          linux-kernel@vger.kernel.org, tj@kernel.org, david@fromorbit.com,
          akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org,
          linux-mm@kvack.org, riel@redhat.com, jack@suse.cz
  Subject: [PATCH 4/4] mm: use the cached page for filemap_fault
  Date: Fri, 30 Nov 2018 14:58:12 -0500
  Message-Id: <20181130195812.19536-5-josef@toxicpanda.com>
  In-Reply-To: <20181130195812.19536-1-josef@toxicpanda.com>
  References: <20181130195812.19536-1-josef@toxicpanda.com>
  Sender: linux-fsdevel-owner@vger.kernel.org
  Precedence: bulk
Series: drop the mmap_sem when doing IO in the fault path
  [0/4,V4] drop the mmap_sem when doing IO in the fault path
  [1/4] mm: infrastructure for page fault page caching
  [2/4] filemap: kill page_cache_read usage in filemap_fault
  [3/4] filemap: drop the mmap_sem for all blocking operations
  [4/4] mm: use the cached page for filemap_fault

Commit Message

Josef Bacik Nov. 30, 2018, 7:58 p.m. UTC

If we drop the mmap_sem we have to redo the vma lookup which requires
redoing the fault handler.  Chances are we will just come back to the
same page, so save this page in our vmf->cached_page and reuse it in the
next loop through the fault handler.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 mm/filemap.c | 45 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 43 insertions(+), 2 deletions(-)
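
Note: the stash-and-revalidate idea from the commit message can be modelled
in userspace roughly as follows. This is a sketch only, not the kernel code:
the model_* types and functions are hypothetical stand-ins for struct page,
struct address_space and struct vm_fault, the refcount and lock are plain
fields rather than real primitives, and the killable-lock path is left out.
It shows a page being kept across a retry, then accepted only if it still
belongs to the same mapping at the same offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct address_space. */
struct model_mapping {
	int id;
};

/* Hypothetical stand-in for struct page. */
struct model_page {
	struct model_mapping *mapping;	/* NULL once truncated or evicted */
	unsigned long index;		/* page offset within the mapping */
	int refcount;
	bool locked;
};

/* Hypothetical stand-in for struct vm_fault. */
struct model_fault {
	struct model_mapping *mapping;
	unsigned long pgoff;
	struct model_page *cached_page;
};

/* Retry path: keep the page reference in the fault state instead of dropping it. */
static void model_stash_page(struct model_fault *vmf, struct model_page *page)
{
	vmf->cached_page = page;
}

/*
 * Next pass through the handler: take the stashed page, lock it, and check
 * that it still matches the faulting mapping and offset.  Returns the locked
 * page on a match.  Otherwise unlocks it, drops the reference and returns
 * NULL, so the caller falls back to a normal page cache lookup.
 */
static struct model_page *model_take_cached_page(struct model_fault *vmf)
{
	struct model_page *page = vmf->cached_page;

	if (!page)
		return NULL;

	vmf->cached_page = NULL;
	assert(!page->locked);	/* single-threaded model: nobody else holds it */
	page->locked = true;

	if (page->mapping == vmf->mapping && page->index == vmf->pgoff)
		return page;

	page->locked = false;
	page->refcount--;
	return NULL;
}
```

In this model the stale case (mapping cleared between the two passes) costs
one lock/unlock and one reference drop before the regular lookup runs.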

Comments

Andrew Morton Dec. 4, 2018, 10:50 p.m. UTC | #1

On Fri, 30 Nov 2018 14:58:12 -0500 Josef Bacik <josef@toxicpanda.com> wrote:

> If we drop the mmap_sem we have to redo the vma lookup which requires
> redoing the fault handler.  Chances are we will just come back to the
> same page, so save this page in our vmf->cached_page and reuse it in the
> next loop through the fault handler.
> 

Is this really worthwhile?  Rerunning the fault handler is rare (we
hope) and a single pagecache lookup is fast.

Some performance testing results would be helpful here.  It's
practically obligatory when claiming a performance improvement.

Josef Bacik Dec. 5, 2018, 2:58 p.m. UTC | #2

On Tue, Dec 04, 2018 at 02:50:34PM -0800, Andrew Morton wrote:
> On Fri, 30 Nov 2018 14:58:12 -0500 Josef Bacik <josef@toxicpanda.com> wrote:
> 
> > If we drop the mmap_sem we have to redo the vma lookup which requires
> > redoing the fault handler.  Chances are we will just come back to the
> > same page, so save this page in our vmf->cached_page and reuse it in the
> > next loop through the fault handler.
> > 
> 
> Is this really worthwhile?  Rerunning the fault handler is rare (we
> hope) and a single pagecache lookup is fast.
> 
> Some performance testing results would be helpful here.  It's
> practically obligatory when claiming a performance improvement.
> 
> 

Honestly the big thing is just not doing IO under the mmap_sem.  I had this
infrastructure originally for the mkwrite portion of these patches that I
dropped, because I was worried about the page being messed with after we did all
the mkwrite work.  However since I'm not doing that anymore there's less of a
need for it.  I have no performance numbers for this, just seemed like a good
idea since we are likely to just have the page again, and this keeps us from
evicting the page right away and causing more thrashing.

I'll try and set something up to see if there's a difference.  If there's no
difference do you want me to drop this?  Thanks,

Josef
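
A starting point for such a test might look like the sketch below. It is not
something posted in this thread: the function name time_file_faults and the
/tmp scratch path are made up for illustration. It writes a scratch file,
asks the kernel to drop the cached pages with posix_fadvise (a best-effort
hint that may do nothing, for example on tmpfs), maps the file and times one
read per page. Readahead and fault-around mean it does not isolate the
VM_FAULT_RETRY path, so any comparison would need runs with and without the
patch under memory pressure.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/*
 * Create an unlinked scratch file of npages pages, hint the kernel to drop
 * its page cache, map it read-only and read one byte from every page.
 * Returns the number of pages read back with the expected fill byte, or -1
 * on error (npages == 0 is an error because mmap rejects a zero length).
 * The elapsed time of the read loop is stored in *elapsed_ns.
 */
static long time_file_faults(size_t npages, int64_t *elapsed_ns)
{
	char path[] = "/tmp/fault-bench-XXXXXX";	/* hypothetical scratch location */
	long pagesz = sysconf(_SC_PAGESIZE);
	struct timespec start, end;
	volatile unsigned char *map;
	long touched = 0;
	char *buf;
	int fd;

	if (pagesz <= 0)
		return -1;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the file disappears once fd is closed */

	buf = malloc((size_t)pagesz);
	if (!buf) {
		close(fd);
		return -1;
	}
	memset(buf, 0x5a, (size_t)pagesz);
	for (size_t i = 0; i < npages; i++) {
		if (write(fd, buf, (size_t)pagesz) != pagesz) {
			free(buf);
			close(fd);
			return -1;
		}
	}
	free(buf);

	/* Flush dirty pages, then ask for them to be dropped (best effort). */
	fsync(fd);
	posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

	map = mmap(NULL, npages * (size_t)pagesz, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (size_t i = 0; i < npages; i++) {
		/* The first access to a page not yet mapped goes through the fault path. */
		if (map[i * (size_t)pagesz] == 0x5a)
			touched++;
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	*elapsed_ns = (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL +
		      (end.tv_nsec - start.tv_nsec);

	munmap((void *)map, npages * (size_t)pagesz);
	close(fd);
	return touched;
}
```

Calling time_file_faults(npages, &ns) for a file larger than available
memory, on kernels with and without this patch, would give a first
indication of whether keeping the page makes a measurable difference.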

Jan Kara Dec. 7, 2018, 11:03 a.m. UTC | #3

On Wed 05-12-18 09:58:10, Josef Bacik wrote:
> On Tue, Dec 04, 2018 at 02:50:34PM -0800, Andrew Morton wrote:
> > On Fri, 30 Nov 2018 14:58:12 -0500 Josef Bacik <josef@toxicpanda.com> wrote:
> > 
> > > If we drop the mmap_sem we have to redo the vma lookup which requires
> > > redoing the fault handler.  Chances are we will just come back to the
> > > same page, so save this page in our vmf->cached_page and reuse it in the
> > > next loop through the fault handler.
> > > 
> > 
> > Is this really worthwhile?  Rerunning the fault handler is rare (we
> > hope) and a single pagecache lookup is fast.
> > 
> > Some performance testing results would be helpful here.  It's
> > practically obligatory when claiming a performance improvement.
> > 
> > 
> 
> Honestly the big thing is just not doing IO under the mmap_sem.  I had this
> infrastructure originally for the mkwrite portion of these patches that I
> dropped, because I was worried about the page being messed with after we did all
> the mkwrite work.  However since I'm not doing that anymore there's less of a
> need for it.  I have no performance numbers for this, just seemed like a good
> idea since we are likely to just have the page again, and this keeps us from
> evicting the page right away and causing more thrashing.
> 
> I'll try and set something up to see if there's a difference.  If there's no
> difference do you want me to drop this?  Thanks,

If there's no difference, I'd like to drop this as well. It just
complicates the fault state handling which is already complex enough.

								Honza

diff --git a/mm/filemap.c b/mm/filemap.c
index 5e76b24b2a0f..d4385b704e04 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2392,6 +2392,35 @@  static struct file *do_async_mmap_readahead(struct vm_area_struct *vma,
 	return fpin;
 }
 
+static int vmf_has_cached_page(struct vm_fault *vmf, struct page **page)
+{
+	struct page *cached_page = vmf->cached_page;
+	struct mm_struct *mm = vmf->vma->vm_mm;
+	struct address_space *mapping = vmf->vma->vm_file->f_mapping;
+	pgoff_t offset = vmf->pgoff;
+
+	if (!cached_page)
+		return 0;
+
+	if (vmf->flags & FAULT_FLAG_KILLABLE) {
+		int ret = lock_page_killable(cached_page);
+		if (ret) {
+			up_read(&mm->mmap_sem);
+			return ret;
+		}
+	} else
+		lock_page(cached_page);
+	vmf->cached_page = NULL;
+	if (cached_page->mapping == mapping &&
+	    cached_page->index == offset) {
+		*page = cached_page;
+	} else {
+		unlock_page(cached_page);
+		put_page(cached_page);
+	}
+	return 0;
+}
+
 /**
  * filemap_fault - read in file data for page fault handling
  * @vmf:	struct vm_fault containing details of the fault
@@ -2425,13 +2454,24 @@  vm_fault_t filemap_fault(struct vm_fault *vmf)
 	struct inode *inode = mapping->host;
 	pgoff_t offset = vmf->pgoff;
 	pgoff_t max_off;
-	struct page *page;
+	struct page *page = NULL;
 	vm_fault_t ret = 0;
 
 	max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
 	if (unlikely(offset >= max_off))
 		return VM_FAULT_SIGBUS;
 
+	/*
+	 * We may have read in the page already and have a page from an earlier
+	 * loop.  If so we need to see if this page is still valid, and if not
+	 * do the whole dance over again.
+	 */
+	error = vmf_has_cached_page(vmf, &page);
+	if (error)
+		goto out_retry;
+	if (page)
+		goto have_cached_page;
+
 	/*
 	 * Do we have something in the page cache already?
 	 */
@@ -2492,6 +2532,7 @@  vm_fault_t filemap_fault(struct vm_fault *vmf)
 		put_page(page);
 		goto retry_find;
 	}
+have_cached_page:
 	VM_BUG_ON_PAGE(page->index != offset, page);
 
 	/*
@@ -2558,7 +2599,7 @@  vm_fault_t filemap_fault(struct vm_fault *vmf)
 	 * page.
 	 */
 	if (page)
-		put_page(page);
+		vmf->cached_page = page;
 	if (fpin)
 		fput(fpin);
 	return ret | VM_FAULT_RETRY;
