diff mbox series

mm/mmap: Add case 9 in vma_merge()

Message ID 20240218085028.3294332-1-yajun.deng@linux.dev (mailing list archive)
State New
Headers show
Series mm/mmap: Add case 9 in vma_merge() | expand

Commit Message

Yajun Deng Feb. 18, 2024, 8:50 a.m. UTC
If the prev vma exists and the end is less than the end of prev, we
can return NULL immediately. This reduces unnecessary operations.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
---
 mm/mmap.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Lorenzo Stoakes Feb. 18, 2024, 11:03 p.m. UTC | #1
On Sun, Feb 18, 2024 at 04:50:28PM +0800, Yajun Deng wrote:
> If the prev vma exists and the end is less than the end of prev, we
> can return NULL immediately. This reduces unnecessary operations.
>
> Signed-off-by: Yajun Deng <yajun.deng@linux.dev>

Adding Vlastimil, while get_maintainers.pl might not show it very clearly,
myself, Vlastimil and Liam often work with vma_merge() so it's handy to cc
us on these if you can!

> ---
>  mm/mmap.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 8f176027583c..b738849321c0 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -827,7 +827,7 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags,
>   *
>   *     ****             ****                   ****
>   *    PPPPPPNNNNNN    PPPPPPNNNNNN       PPPPPPCCCCCC
> - *    cannot merge    might become       might become
> + *    cannot merge 9  might become       might become

While I welcome your interest here :) I am not a fan of the 'case' approach
to this function as-is and plan to heavily refactor this when I get a chance.

But at any rate, an early-exit situation is not a merge case, merge cases
describe cases where we _can_ merge, so we can drop this case 9 stuff (this
is not your fault, it's understandable why you would label this, this
function is just generally unclear).

>   *                    PPNNNNNNNNNN       PPPPPPPPPPCC
>   *    mmap, brk or    case 4 below       case 5 below
>   *    mremap move:
> @@ -890,6 +890,9 @@ static struct vm_area_struct
>  	if (vm_flags & VM_SPECIAL)
>  		return NULL;
>
> +	if (prev && end < prev->vm_end) /* case 9 */
> +		return NULL;
> +

I need to get back into vma_merge() head space, but I don't actually think
a caller that's behaving correctly should ever do this. I know the ASCII
diagram above lists it as a thing that can happen, but I think we
implicitly avoid this from the way we invoke callers. Either prev == vma as
per vma_merge_extend(), or the loops that invoke vma_merge_new_vma()
wouldn't permit this to occur.

Let me look into it more deeply + reply again a bit later, I mean we could
perhaps do with asserting this somehow, but I don't think it's useful to do
an early exit for something that ostensibly _shouldn't_ happen.

>  	/* Does the input range span an existing VMA? (cases 5 - 8) */
>  	curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end);
>
> --
> 2.25.1
>
Yajun Deng Feb. 20, 2024, 3 a.m. UTC | #2
On 2024/2/19 07:03, Lorenzo Stoakes wrote:
> On Sun, Feb 18, 2024 at 04:50:28PM +0800, Yajun Deng wrote:
>> If the prev vma exists and the end is less than the end of prev, we
>> can return NULL immediately. This reduces unnecessary operations.
>>
>> Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
> Adding Vlastimil, while get_maintainers.pl might not show it very clearly,
> myself, Vlastimil and Liam often work with vma_merge() so it's handy to cc
> us on these if you can!
Okay.
>> ---
>>   mm/mmap.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 8f176027583c..b738849321c0 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -827,7 +827,7 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags,
>>    *
>>    *     ****             ****                   ****
>>    *    PPPPPPNNNNNN    PPPPPPNNNNNN       PPPPPPCCCCCC
>> - *    cannot merge    might become       might become
>> + *    cannot merge 9  might become       might become
> While I welcome your interest here :) I am not a fan of the 'case' approach
> to this function as-is and plan to heavily refactor this when I get a chance.
>
> But at any rate, an early-exit situation is not a merge case, merge cases
> describe cases where we _can_ merge, so we can drop this case 9 stuff (this
> is not your fault, it's understandable why you would label this, this
> function is just generally unclear).

Yes, it's not a merge case. I label this to make it easier to understand.

>>    *                    PPNNNNNNNNNN       PPPPPPPPPPCC
>>    *    mmap, brk or    case 4 below       case 5 below
>>    *    mremap move:
>> @@ -890,6 +890,9 @@ static struct vm_area_struct
>>   	if (vm_flags & VM_SPECIAL)
>>   		return NULL;
>>
>> +	if (prev && end < prev->vm_end) /* case 9 */
>> +		return NULL;
>> +
> I need to get back into vma_merge() head space, but I don't actually think
> a caller that's behaving correctly should ever do this. I know the ASCII
> diagram above lists it as a thing that can happen, but I think we
> implicitly avoid this from the way we invoke callers. Either prev == vma as
> per vma_merge_extend(), or the loops that invoke vma_merge_new_vma()
> wouldn't permit this to occur.
No, it will actually happen. That's why I submitted this patch.
> Let me look into it more deeply + reply again a bit later, I mean we could
> perhaps do with asserting this somehow, but I don't think it's useful to do
> an early exit for something that ostensibly _shouldn't_ happen.
>
>>   	/* Does the input range span an existing VMA? (cases 5 - 8) */
>>   	curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end);
>>
>> --
>> 2.25.1
>>
Liam R. Howlett Feb. 20, 2024, 6:10 p.m. UTC | #3
* Yajun Deng <yajun.deng@linux.dev> [240219 22:00]:
> 
> On 2024/2/19 07:03, Lorenzo Stoakes wrote:
> > On Sun, Feb 18, 2024 at 04:50:28PM +0800, Yajun Deng wrote:
> > > If the prev vma exists and the end is less than the end of prev, we
> > > can return NULL immediately. This reduces unnecessary operations.
> > > 
> > > Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
> > Adding Vlastimil, while get_maintainers.pl might not show it very clearly,
> > myself, Vlastimil and Liam often work with vma_merge() so it's handy to cc
> > us on these if you can!
> Okay.
> > > ---
> > >   mm/mmap.c | 5 ++++-
> > >   1 file changed, 4 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > index 8f176027583c..b738849321c0 100644
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -827,7 +827,7 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags,
> > >    *
> > >    *     ****             ****                   ****
> > >    *    PPPPPPNNNNNN    PPPPPPNNNNNN       PPPPPPCCCCCC
> > > - *    cannot merge    might become       might become
> > > + *    cannot merge 9  might become       might become
> > While I welcome your interest here :) I am not a fan of the 'case' approach
> > to this function as-is and plan to heavily refactor this when I get a chance.
> > 
> > But at any rate, an early-exit situation is not a merge case, merge cases
> > describe cases where we _can_ merge, so we can drop this case 9 stuff (this
> > is not your fault, it's understandable why you would label this, this
> > function is just generally unclear).
> 
> Yes, it's not a merge case. I label this to make it easier to understand.

But it isn't.  It's not a case at all, it's a failure to merge.

> 
> > >    *                    PPNNNNNNNNNN       PPPPPPPPPPCC
> > >    *    mmap, brk or    case 4 below       case 5 below
> > >    *    mremap move:
> > > @@ -890,6 +890,9 @@ static struct vm_area_struct
> > >   	if (vm_flags & VM_SPECIAL)
> > >   		return NULL;
> > > 
> > > +	if (prev && end < prev->vm_end) /* case 9 */
> > > +		return NULL;
> > > +
> > I need to get back into vma_merge() head space, but I don't actually think
> > a caller that's behaving correctly should ever do this. I know the ASCII
> > diagram above lists it as a thing that can happen, but I think we
> > implicitly avoid this from the way we invoke callers. Either prev == vma as
> > per vma_merge_extend(), or the loops that invoke vma_merge_new_vma()
> > wouldn't permit this to occur.
> No, it will actually happen. That's why I submitted this patch.

Can you elaborate on where it happens?  I mean, you seem to have already
looked into it but haven't shared what you found of where it reduces the
unnecessary operations.

Such a detail should also be added to the commit log so that, when the
call sites change, this check could be dropped - or be seen as
necessary.
Lorenzo Stoakes Feb. 20, 2024, 9 p.m. UTC | #4
On Tue, Feb 20, 2024 at 11:00:30AM +0800, Yajun Deng wrote:
>
> On 2024/2/19 07:03, Lorenzo Stoakes wrote:
[snip]
>
> Yes, it's not a merge case. I label this to make it easier to understand.

OK, I guess I have to be more explicit + less soft here to avoid confusion
as you seem not to be paying attention to what I have said - We can't have
this in the patch, full stop.

I (+ Liam) have already explained above as to why, but to emphasise - each
case number refers to a merge case consistently throughout. Arbitrarily
adding a new case label to describe one of the many early exit conditions
proactively HURTS understanding.

>
> > >    *                    PPNNNNNNNNNN       PPPPPPPPPPCC
> > >    *    mmap, brk or    case 4 below       case 5 below
> > >    *    mremap move:
> > > @@ -890,6 +890,9 @@ static struct vm_area_struct
> > >   	if (vm_flags & VM_SPECIAL)
> > >   		return NULL;
> > >
> > > +	if (prev && end < prev->vm_end) /* case 9 */
> > > +		return NULL;
> > > +
> > I need to get back into vma_merge() head space, but I don't actually think
> > a caller that's behaving correctly should ever do this. I know the ASCII
> > diagram above lists it as a thing that can happen, but I think we
> > implicitly avoid this from the way we invoke callers. Either prev == vma as
> > per vma_merge_extend(), or the loops that invoke vma_merge_new_vma()
> > wouldn't permit this to occur.
> No, it will actually happen. That's why I submitted this patch.

You aren't explaining any situation where this would happen. As Liam says,
this is something you have to provide.

I have taken a moment to look into this and I am afraid I don't feel this
patch makes sense.

Firstly, let's assume you're right and we can reach this function with end
< prev->vm_end:

1. curr will be NULL as find_vma_intersection(mm, prev->vm_end, end) will
   always find nothing since end < prev->vm_end.

2. We discover next by using vma_lookup(mm, end). This will always be NULL
   since no VMA starts at end (it is < prev->vm_end so within prev).

3. Therefore next will always be NULL.

4. Therefore the only situation in which the function would proceed is that
   checked in the 'if (prev)' block, but that checks whether addr ==
   prev->vm_end, but since end < prev->vm_end, it can't [we explicitly
   check for addr >= end in a VM_WARN_ON()].

Therefore - we will always abort in this case, and your early check is
really not that useful - it's not something that is likely to come up
(actually I don't think that it can come up, we'll come on to that), and so
being very slightly delayed in exiting is not a great gain.

You are then also introducing a fairly useless branch for everybody else
for - if it even exists - a very rare scenario. I do not think this is a
good RoI.

As to whether this can happen - I have dug a bit into callers:

1. vma_merge_extend() always specifies vma->vm_end as the start explicitly
   to extend the VMA so this scenario isn't possible.

2. Both callers of vma_merge_new_vma() are trying to insert a new VMA and
   explicitly look for a prev VMA and thus should never trigger this
   scenario.

This leaves vma_modify(), and again I can't see a case where prev would not
actually be the previous VMA, with start/end set accordingly.

I am happy to be corrected/embarrassed if I'm missed something out here
(vma_merge() is a great function for creating confusion + causing unlikely
scenarios), so please do provide details of such a case if you can find
one.

TL;DR:

- The case 9 stuff is completely wrong.
- I do not think this patch is useful even if the scenario you describe
  arises.
- I can't see how the scenario you describe could arise.

So overall, unless you can provide compelling evidence for both this
scenario actually occurring in practice AND the need for an early exit,
this patch is a no-go.

In addition, if you were to find such, you'd really really need to beef out
the commit message, which is far too short, and frankly incorrect at this
point - if you perform a branch which 99.9999% of the time is not taken,
you are not 'reducing unnecessary operations' you are creating them.

If you could find compelling evidence to support this patch and send this
as a v2 then I'd consider it, but for the patch in its current form:

NACK.
Lorenzo Stoakes Feb. 20, 2024, 10:10 p.m. UTC | #5
On Tue, Feb 20, 2024 at 09:00:15PM +0000, Lorenzo Stoakes wrote:
> On Tue, Feb 20, 2024 at 11:00:30AM +0800, Yajun Deng wrote:
> >
> > On 2024/2/19 07:03, Lorenzo Stoakes wrote:
> [snip]
> >
> > Yes, it's not a merge case. I label this to make it easier to understand.
>
> OK, I guess I have to be more explicit + less soft here to avoid confusion
> as you seem not to be paying attention to what I have said - We can't have
> this in the patch, full stop.
>
> I (+ Liam) have already explained above as to why, but to emphasise - each
> case number refers to a merge case consistently throughout. Arbitrarily
> adding a new case label to describe one of the many early exit conditions
> proactively HURTS understanding.
>
> >
> > > >    *                    PPNNNNNNNNNN       PPPPPPPPPPCC
> > > >    *    mmap, brk or    case 4 below       case 5 below
> > > >    *    mremap move:
> > > > @@ -890,6 +890,9 @@ static struct vm_area_struct
> > > >   	if (vm_flags & VM_SPECIAL)
> > > >   		return NULL;
> > > >
> > > > +	if (prev && end < prev->vm_end) /* case 9 */
> > > > +		return NULL;
> > > > +
> > > I need to get back into vma_merge() head space, but I don't actually think
> > > a caller that's behaving correctly should ever do this. I know the ASCII
> > > diagram above lists it as a thing that can happen, but I think we
> > > implicitly avoid this from the way we invoke callers. Either prev == vma as
> > > per vma_merge_extend(), or the loops that invoke vma_merge_new_vma()
> > > wouldn't permit this to occur.
> > No, it will actually happen. That's why I submitted this patch.
>
> You aren't explaining any situation where this would happen. As Liam says,
> this is something you have to provide.
>
> I have taken a moment to look into this and I am afraid I don't feel this
> patch makes sense.
>
> Firstly, let's assume you're right and we can reach this function with end
> < prev->vm_end:
>
> 1. curr will be NULL as find_vma_intersection(mm, prev->vm_end, end) will
>    always find nothing since end < prev->vm_end.
>
> 2. We discover next by using vma_lookup(mm, end). This will always be NULL
>    since no VMA starts at end (it is < prev->vm_end so within prev).
>
> 3. Therefore next will always be NULL.
>
> 4. Therefore the only situation in which the function would proceed is that
>    checked in the 'if (prev)' block, but that checks whether addr ==
>    prev->vm_end, but since end < prev->vm_end, it can't [we explicitly
>    check for addr >= end in a VM_WARN_ON()].

OK I say below about feel fre to embarrass me if I am mistaken, but I can
go ahead and do embarrass myself directly :)

Apologies - the above is not correct, vma_lookup() doesn't look for an
exact match so will assign next = prev in this scenario.

However the vm_pgoff check will cause this case to fail also.

>
> Therefore - we will always abort in this case, and your early check is
> really not that useful - it's not something that is likely to come up
> (actually I don't think that it can come up, we'll come on to that), and so
> being very slightly delayed in exiting is not a great gain.
>
> You are then also introducing a fairly useless branch for everybody else
> for - if it even exists - a very rare scenario. I do not think this is a
> good RoI.
>
> As to whether this can happen - I have dug a bit into callers:
>
> 1. vma_merge_extend() always specifies vma->vm_end as the start explicitly
>    to extend the VMA so this scenario isn't possible.
>
> 2. Both callers of vma_merge_new_vma() are trying to insert a new VMA and
>    explicitly look for a prev VMA and thus should never trigger this
>    scenario.
>
> This leaves vma_modify(), and again I can't see a case where prev would not
> actually be the previous VMA, with start/end set accordingly.
>
> I am happy to be corrected/embarrassed if I'm missed something out here
> (vma_merge() is a great function for creating confusion + causing unlikely
> scenarios), so please do provide details of such a case if you can find
> one.

Also, I discovered that I can *ahem* reliably reproduce this scenario, so
apologies, you were right to say it can arise (but it would have been
useful for you to give details!)

I can repro it in mprotect_fixup(), quite reliably, so it might be worth
digging into why this scenario is happening.

>
> TL;DR:
>
> - The case 9 stuff is completely wrong.
> - I do not think this patch is useful even if the scenario you describe
>   arises.
> - I can't see how the scenario you describe could arise.
>
> So overall, unless you can provide compelling evidence for both this
> scenario actually occurring in practice AND the need for an early exit,
> this patch is a no-go.
>
> In addition, if you were to find such, you'd really really need to beef out
> the commit message, which is far too short, and frankly incorrect at this
> point - if you perform a branch which 99.9999% of the time is not taken,
> you are not 'reducing unnecessary operations' you are creating them.
>
> If you could find compelling evidence to support this patch and send this
> as a v2 then I'd consider it, but for the patch in its current form:
>
> NACK.

I'll stick with the NACK (though slightly less assuredly), as the case 9
stuff has to go, that is the real bugbear here for me. It's more review
feedback, but the commit message needs beefing up in any case too.

I'm still not hugely convinced this is a worthwhile early exit, since we
don't treat any other 'can't merge' case specially here.

I think the more interesting thing to do here is to figure out why
mprotect_fix() (or perhaps others) is even trying the merge under these
circumstances.

Again as I said in the first reply, thanks for your efforts looking at
this! The nacking + such is simply an effort to ensure we are being
extremely careful around this highly delicate function.

Thanks, Lorenzo
diff mbox series

Patch

diff --git a/mm/mmap.c b/mm/mmap.c
index 8f176027583c..b738849321c0 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -827,7 +827,7 @@  can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags,
  *
  *     ****             ****                   ****
  *    PPPPPPNNNNNN    PPPPPPNNNNNN       PPPPPPCCCCCC
- *    cannot merge    might become       might become
+ *    cannot merge 9  might become       might become
  *                    PPNNNNNNNNNN       PPPPPPPPPPCC
  *    mmap, brk or    case 4 below       case 5 below
  *    mremap move:
@@ -890,6 +890,9 @@  static struct vm_area_struct
 	if (vm_flags & VM_SPECIAL)
 		return NULL;
 
+	if (prev && end < prev->vm_end) /* case 9 */
+		return NULL;
+
 	/* Does the input range span an existing VMA? (cases 5 - 8) */
 	curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end);