diff mbox series

[RFC] mm: add a vma to vmacache when addr overlaps the vma range

Message ID 20181016134712.18123-1-richard.weiyang@gmail.com (mailing list archive)
State New, archived
Headers show
Series [RFC] mm: add a vma to vmacache when addr overlaps the vma range | expand

Commit Message

Wei Yang Oct. 16, 2018, 1:47 p.m. UTC
vmacache will cache the latest visited vma to improve the performance of
find_vma(). While current implementation would cache a vma even its range
doesn't overlap with addr.

This entry in vmacache will be a dummy entry, since the vmacache_find()
only returns a vma when its range overlap with addr.

This patch avoid to add a vma to vmacache when the range doesn't
overlap.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---

Based on my understanding, this change would put more accurate vma entry in the
cache, which means reduce unnecessary vmacache update and vmacache find.

But the test result is not as expected. From the original changelog, I don't
see the explanation to add this non-overlap entry into the vmacache, so
curious about why this performs a little better than putting an overlapped
entry.

Below is the test result for building kernel in two cases:

         make -j4                   make -j8
base-line:

real    6m15.947s          real    5m11.684s
user    21m14.481s         user    27m23.471s
sys     2m34.407s          sys     3m13.233s

real    6m16.089s          real    5m11.445s
user    21m18.295s         user    27m24.045s
sys     2m35.551s          sys     3m13.443s

real    6m16.239s          real    5m11.218s
user    21m17.590s         user    27m19.133s
sys     2m35.252s          sys     3m12.684s

patched:

real    6m15.416s          real    5m10.810s
user    21m21.800s         user    27m25.223s
sys     2m33.398s          sys     3m14.784s

real    6m15.114s          real    5m12.285s
user    21m19.986s         user    27m32.055s
sys     2m34.718s          sys     3m13.107s


real    6m16.206s          real    5m11.509s
user    21m22.557s         user    27m28.265s
sys     2m35.637s          sys     3m12.747s


---
 mm/mmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Matthew Wilcox Oct. 16, 2018, 1:54 p.m. UTC | #1
On Tue, Oct 16, 2018 at 09:47:12PM +0800, Wei Yang wrote:
> Based on my understanding, this change would put more accurate vma entry in the
> cache, which means reduce unnecessary vmacache update and vmacache find.
> 
> But the test result is not as expected. From the original changelog, I don't
> see the explanation to add this non-overlap entry into the vmacache, so
> curious about why this performs a little better than putting an overlapped
> entry.

What makes you think this performs any differently for this test-case?
The numbers all seem to fall within a "reasonable variation" range to me.
You're going to need to do some statistics (with a much larger sample
size) to know whether there's any difference at all.

> Below is the test result for building kernel in two cases:
> 
>          make -j4                   make -j8
> base-line:
> 
> real    6m15.947s          real    5m11.684s
> user    21m14.481s         user    27m23.471s
> sys     2m34.407s          sys     3m13.233s
> 
> real    6m16.089s          real    5m11.445s
> user    21m18.295s         user    27m24.045s
> sys     2m35.551s          sys     3m13.443s
> 
> real    6m16.239s          real    5m11.218s
> user    21m17.590s         user    27m19.133s
> sys     2m35.252s          sys     3m12.684s
> 
> patched:
> 
> real    6m15.416s          real    5m10.810s
> user    21m21.800s         user    27m25.223s
> sys     2m33.398s          sys     3m14.784s
> 
> real    6m15.114s          real    5m12.285s
> user    21m19.986s         user    27m32.055s
> sys     2m34.718s          sys     3m13.107s
> 
> 
> real    6m16.206s          real    5m11.509s
> user    21m22.557s         user    27m28.265s
> sys     2m35.637s          sys     3m12.747s
> 
> 
> ---
>  mm/mmap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 2e0daf666f42..dda495d84862 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2214,7 +2214,7 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
>  			rb_node = rb_node->rb_right;
>  	}
>  
> -	if (vma)
> +	if (vma && vma->vm_start <= addr)
>  		vmacache_update(addr, vma);
>  	return vma;
>  }
> -- 
> 2.15.1
>
Wei Yang Oct. 16, 2018, 2:17 p.m. UTC | #2
On Tue, Oct 16, 2018 at 06:54:04AM -0700, Matthew Wilcox wrote:
>On Tue, Oct 16, 2018 at 09:47:12PM +0800, Wei Yang wrote:
>> Based on my understanding, this change would put more accurate vma entry in the
>> cache, which means reduce unnecessary vmacache update and vmacache find.
>> 
>> But the test result is not as expected. From the original changelog, I don't
>> see the explanation to add this non-overlap entry into the vmacache, so
>> curious about why this performs a little better than putting an overlapped
>> entry.
>
>What makes you think this performs any differently for this test-case?
>The numbers all seem to fall within a "reasonable variation" range to me.
>You're going to need to do some statistics (with a much larger sample
>size) to know whether there's any difference at all.
>

Matthew,

Thanks for your comment.

I use this test-case because I have little experience in performance
test and I see the original author lists the hit-rate improvement in
kernel build test.

I am thinking to evaluate the cache hit-rate, while I don't know how to
gather the statistic.

If you could give me some hint on the statistics gathering or a more
proper test case, I would appreciate a lot.

BTW, I don't get your point on "a much larger sample size". To map a
larger memory area? I lost at this point.
diff mbox series

Patch

diff --git a/mm/mmap.c b/mm/mmap.c
index 2e0daf666f42..dda495d84862 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2214,7 +2214,7 @@  struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
 			rb_node = rb_node->rb_right;
 	}
 
-	if (vma)
+	if (vma && vma->vm_start <= addr)
 		vmacache_update(addr, vma);
 	return vma;
 }