diff mbox series

[v4,17/66] mmap: Change zeroing of maple tree in __vma_adjust

Message ID 20211201142918.921493-18-Liam.Howlett@oracle.com (mailing list archive)
State New
Series Introducing the Maple Tree

Commit Message

Liam R. Howlett Dec. 1, 2021, 2:29 p.m. UTC
Only write to the maple tree if we are not inserting or the insert isn't
going to overwrite the area to clear.  This avoids spanning writes and
node coalescing when unnecessary.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
 mm/mmap.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

Comments

Vlastimil Babka Jan. 12, 2022, 2:55 p.m. UTC | #1
On 12/1/21 15:29, Liam Howlett wrote:
> Only write to the maple tree if we are not inserting or the insert isn't
> going to overwrite the area to clear.  This avoids spanning writes and
> node coalescing when unnecessary.
> 
> Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
> ---
>  mm/mmap.c | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 93ed19b2c6ce..c5f92666d145 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -615,6 +615,7 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>  	bool vma_changed = false;
>  	long adjust_next = 0;
>  	int remove_next = 0;
> +	unsigned long old_start;
>  
>  	if (next && !insert) {
>  		struct vm_area_struct *exporter = NULL, *importer = NULL;
> @@ -740,25 +741,29 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>  			vma_interval_tree_remove(next, root);
>  	}
>  
> +	old_start = vma->vm_start;
>  	if (start != vma->vm_start) {
> -		if (vma->vm_start < start)
> -			vma_mt_szero(mm, vma->vm_start, start);
> -		else
> -			vma_changed = true;
> +		vma_changed = true;

This says vma_changed = true even if vma is shrinking, so below we might do
an unnecessary vma_store(), no?

>  		vma->vm_start = start;
>  	}
>  	if (end != vma->vm_end) {
> -		if (vma->vm_end > end)
> -			vma_mt_szero(mm, end, vma->vm_end);
> -		else
> +		if (vma->vm_end > end) {

In contrast to the above, here vma_changed is only set when expanding
('vma_expand' would be a more descriptive name maybe?). So why are the two
cases handled differently, am I missing something?

> +			if (!insert || (insert && (insert->vm_start != end)))

Note: thanks to lazy evaluation, "insert &&" should be unnecessary?
More importantly: is "insert->vm_start == end" a guarantee that insert
covers the whole interval from end to vma->vm_end? Probably yes, but a
VM_WARN_ON would be in order?

> +				vma_mt_szero(mm, end, vma->vm_end);

I guess it can't happen that insert would cover a later part of this
interval, so we could zero only between end and insert->vm_start?

> +		} else
>  			vma_changed = true;

Same nit about { } block as previously.

>  		vma->vm_end = end;
>  		if (!next)
>  			mm->highest_vm_end = vm_end_gap(vma);
>  	}
>  
> -	if (vma_changed)
> +	if (vma_changed) {
>  		vma_store(mm, vma);
> +		if (old_start < start) {
> +			if (insert && (insert->vm_start != old_start))
> +				vma_mt_szero(mm, old_start, start);

This condition looks actively wrong, no zeroing at all if insert is NULL?

> +		}
> +	}
>  
>  	vma->vm_pgoff = pgoff;
>  	if (adjust_next) {
Liam R. Howlett Jan. 17, 2022, 8:02 p.m. UTC | #2
* Vlastimil Babka <vbabka@suse.cz> [220112 09:56]:
> On 12/1/21 15:29, Liam Howlett wrote:
> > Only write to the maple tree if we are not inserting or the insert isn't
> > going to overwrite the area to clear.  This avoids spanning writes and
> > node coalescing when unnecessary.
> > 
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
> > ---
> >  mm/mmap.c | 21 +++++++++++++--------
> >  1 file changed, 13 insertions(+), 8 deletions(-)
> > 
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 93ed19b2c6ce..c5f92666d145 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -615,6 +615,7 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,

This function is the worst.

> >  	bool vma_changed = false;
> >  	long adjust_next = 0;
> >  	int remove_next = 0;
> > +	unsigned long old_start;
> >  
> >  	if (next && !insert) {
> >  		struct vm_area_struct *exporter = NULL, *importer = NULL;
> > @@ -740,25 +741,29 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
> >  			vma_interval_tree_remove(next, root);
> >  	}
> >  
> > +	old_start = vma->vm_start;
> >  	if (start != vma->vm_start) {
> > -		if (vma->vm_start < start)
> > -			vma_mt_szero(mm, vma->vm_start, start);
> > -		else
> > -			vma_changed = true;
> > +		vma_changed = true;
> 
> This says vma_changed = true even if vma is shrinking, so below we might do
> an unnecessary vma_store(), no?

I think you are correct.  This should be handled like the vm_end case,
as it was handled before: only zero if an insert won't be overwriting
the area being zeroed, and only store in the vma->vm_start > start case.

> 
> >  		vma->vm_start = start;
> >  	}
> >  	if (end != vma->vm_end) {
> > -		if (vma->vm_end > end)
> > -			vma_mt_szero(mm, end, vma->vm_end);
> > -		else
> > +		if (vma->vm_end > end) {
> 
> In contrast to the above, here vma_changed is only set when expanding
> ('vma_expand' would be a more descriptive name maybe?). So why are the two
> cases handled differently, am I missing something?

I should not have altered the above case to be so different.

> 
> > +			if (!insert || (insert && (insert->vm_start != end)))
> 
> Note: thanks to lazy evaluation, "insert &&" should be unnecessary?
> More importantly: is "insert->vm_start == end" a guarantee that insert
> covers the whole interval from end to vma->vm_end? Probably yes, but a
> VM_WARN_ON would be in order?

Yes, the insert will cover the whole interval from end to vma->vm_end,
otherwise it's a split followed by a vma_adjust().  I will add a
VM_WARN_ON for good measure here and above.  I will also fix the
conditions as you pointed out.
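The end-side handling being agreed on here can be modelled outside the kernel.  The sketch below is a toy, not the real patch: the struct and helper names only mimic the kernel's, `vma_mt_szero()` just records the range it would have zeroed, and the invariant the proposed VM_WARN_ON would check is noted as a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the kernel structure; only vm_start/vm_end matter here. */
struct vm_area_struct {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* Record the last zeroed range instead of touching a real maple tree. */
static unsigned long zero_start, zero_end;
static bool zeroed;

static void vma_mt_szero(unsigned long start, unsigned long end)
{
	zero_start = start;
	zero_end = end;
	zeroed = true;
}

/*
 * Model of the end-side block after the review: when the vma shrinks
 * from the back, zero the vacated range only if no insert covers it.
 * "insert->vm_start == end" implies insert spans the whole interval
 * [end, vma->vm_end) -- in the kernel this would be asserted with
 * something like VM_WARN_ON(insert && insert->vm_end != vma->vm_end).
 * The redundant "insert &&" inside the condition is dropped, since
 * short-circuit evaluation already guards the dereference.
 */
static bool adjust_end(struct vm_area_struct *vma, unsigned long end,
		       const struct vm_area_struct *insert)
{
	bool vma_changed = false;

	zeroed = false;
	if (end != vma->vm_end) {
		if (vma->vm_end > end) {
			if (!insert || insert->vm_start != end)
				vma_mt_szero(end, vma->vm_end);
		} else {
			vma_changed = true;
		}
		vma->vm_end = end;
	}
	return vma_changed;
}
```

With this model, shrinking with no insert records a zeroed range, shrinking under a covering insert records nothing, and growing only flags the vma for a store (which covers the new range by itself).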


> 
> > +				vma_mt_szero(mm, end, vma->vm_end);
> 
> I guess it can't happen that insert would cover a later part of this
> interval, so we could zero only between end and insert->vm_start?

Well, it doesn't happen right now.  I believe insert is always passed
as NULL except in the vma_split() case.  If you think about how this
works with the rbtree: it cannot split a vma by partially overwriting
the middle of a vma, as that would break the linked list, the vma
start/end would not match what the tree expects, etc.

> 
> > +		} else
> >  			vma_changed = true;
> 
> Same nit about { } block as previously.

ack

> 
> >  		vma->vm_end = end;
> >  		if (!next)
> >  			mm->highest_vm_end = vm_end_gap(vma);
> >  	}
> >  
> > -	if (vma_changed)
> > +	if (vma_changed) {
> >  		vma_store(mm, vma);
> > +		if (old_start < start) {
> > +			if (insert && (insert->vm_start != old_start))
> > +				vma_mt_szero(mm, old_start, start);
> 
> This condition looks actively wrong, no zeroing at all if insert is NULL?

I think you are correct, but it did work because we never shrink from
the front today.  The insert is only used on splits, so this path was
never taken.  The resizing of a vma is done in shift_arg_pages(), which
grows at the lower end and then shrinks the back, and in mremap, which
grows the back but leaves the start the same.  The only other places are
vma_merge() cases 3 and 8, which also don't need zeroing here as they
expand other VMAs over it: case 3 is when a new vma can be merged into
the next; case 8 is when a vma is overwriting an older vma and can be
combined into the next vma.

Moving this code back to the first change block simplifies the statement
here to just a store.
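The reworked start-side handling described above can also be modelled outside the kernel.  As before, this is a toy sketch under assumptions, not the real implementation: the struct and helper names only approximate the kernel's, and `vma_mt_szero()` merely records the range it would have zeroed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the kernel structure; only vm_start/vm_end matter here. */
struct vm_area_struct {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* Record the last zeroed range instead of touching a real maple tree. */
static unsigned long zero_start, zero_end;
static bool zeroed;

static void vma_mt_szero(unsigned long start, unsigned long end)
{
	zero_start = start;
	zero_end = end;
	zeroed = true;
}

/*
 * Model of the start-side block once it mirrors the vm_end case again:
 * when shrinking from the front, zero the vacated range only if no
 * insert will overwrite it; when growing toward lower addresses, just
 * mark the vma for a store, which covers the new range on its own.
 */
static bool adjust_start(struct vm_area_struct *vma, unsigned long start,
			 const struct vm_area_struct *insert)
{
	bool vma_changed = false;

	zeroed = false;
	if (start != vma->vm_start) {
		if (vma->vm_start < start) {
			/* Shrinking from the front. */
			if (!insert || insert->vm_start != vma->vm_start)
				vma_mt_szero(vma->vm_start, start);
		} else {
			/* Growing toward lower addresses. */
			vma_changed = true;
		}
		vma->vm_start = start;
	}
	return vma_changed;
}
```

This keeps the two blocks symmetric: each direction zeroes only a genuinely vacated range, and the store alone handles expansion.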

> 
> > +		}
> > +	}
> >  
> >  	vma->vm_pgoff = pgoff;
> >  	if (adjust_next) {
>

Patch

diff --git a/mm/mmap.c b/mm/mmap.c
index 93ed19b2c6ce..c5f92666d145 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -615,6 +615,7 @@  int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
 	bool vma_changed = false;
 	long adjust_next = 0;
 	int remove_next = 0;
+	unsigned long old_start;
 
 	if (next && !insert) {
 		struct vm_area_struct *exporter = NULL, *importer = NULL;
@@ -740,25 +741,29 @@  int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
 			vma_interval_tree_remove(next, root);
 	}
 
+	old_start = vma->vm_start;
 	if (start != vma->vm_start) {
-		if (vma->vm_start < start)
-			vma_mt_szero(mm, vma->vm_start, start);
-		else
-			vma_changed = true;
+		vma_changed = true;
 		vma->vm_start = start;
 	}
 	if (end != vma->vm_end) {
-		if (vma->vm_end > end)
-			vma_mt_szero(mm, end, vma->vm_end);
-		else
+		if (vma->vm_end > end) {
+			if (!insert || (insert && (insert->vm_start != end)))
+				vma_mt_szero(mm, end, vma->vm_end);
+		} else
 			vma_changed = true;
 		vma->vm_end = end;
 		if (!next)
 			mm->highest_vm_end = vm_end_gap(vma);
 	}
 
-	if (vma_changed)
+	if (vma_changed) {
 		vma_store(mm, vma);
+		if (old_start < start) {
+			if (insert && (insert->vm_start != old_start))
+				vma_mt_szero(mm, old_start, start);
+		}
+	}
 
 	vma->vm_pgoff = pgoff;
 	if (adjust_next) {