Message ID | 20240218085028.3294332-1-yajun.deng@linux.dev (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm/mmap: Add case 9 in vma_merge() | expand |
On Sun, Feb 18, 2024 at 04:50:28PM +0800, Yajun Deng wrote: > If the prev vma exists and the end is less than the end of prev, we > can return NULL immediately. This reduces unnecessary operations. > > Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Adding Vlastimil, while get_maintainers.pl might not show it very clearly, myself, Vlastimil and Liam often work with vma_merge() so it's handy to cc us on these if you can! > --- > mm/mmap.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/mm/mmap.c b/mm/mmap.c > index 8f176027583c..b738849321c0 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -827,7 +827,7 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags, > * > * **** **** **** > * PPPPPPNNNNNN PPPPPPNNNNNN PPPPPPCCCCCC > - * cannot merge might become might become > + * cannot merge 9 might become might become While I welcome your interest here :) I am not a fan of the 'case' approach to this function as-is and plan to heavily refactor this when I get a chance. But at any rate, an early-exit situation is not a merge case, merge cases describe cases where we _can_ merge, so we can drop this case 9 stuff (this is not your fault, it's understandable why you would label this, this function is just generally unclear). > * PPNNNNNNNNNN PPPPPPPPPPCC > * mmap, brk or case 4 below case 5 below > * mremap move: > @@ -890,6 +890,9 @@ static struct vm_area_struct > if (vm_flags & VM_SPECIAL) > return NULL; > > + if (prev && end < prev->vm_end) /* case 9 */ > + return NULL; > + I need to get back into vma_merge() head space, but I don't actually think a caller that's behaving correctly should ever do this. I know the ASCII diagram above lists it as a thing that can happen, but I think we implicitly avoid this from the way we invoke callers. Either prev == vma as per vma_merge_extend(), or the loops that invoke vma_merge_new_vma() wouldn't permit this to occur. Let me look into it more deeply + reply again a bit later, I mean we could perhaps do with asserting this somehow, but I don't think it's useful to do an early exit for something that ostensibly _shouldn't_ happen. > /* Does the input range span an existing VMA? (cases 5 - 8) */ > curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end); > > -- > 2.25.1 >
On 2024/2/19 07:03, Lorenzo Stoakes wrote: > On Sun, Feb 18, 2024 at 04:50:28PM +0800, Yajun Deng wrote: >> If the prev vma exists and the end is less than the end of prev, we >> can return NULL immediately. This reduces unnecessary operations. >> >> Signed-off-by: Yajun Deng <yajun.deng@linux.dev> > Adding Vlastimil, while get_maintainers.pl might not show it very clearly, > myself, Vlastimil and Liam often work with vma_merge() so it's handy to cc > us on these if you can! Okay. >> --- >> mm/mmap.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/mm/mmap.c b/mm/mmap.c >> index 8f176027583c..b738849321c0 100644 >> --- a/mm/mmap.c >> +++ b/mm/mmap.c >> @@ -827,7 +827,7 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags, >> * >> * **** **** **** >> * PPPPPPNNNNNN PPPPPPNNNNNN PPPPPPCCCCCC >> - * cannot merge might become might become >> + * cannot merge 9 might become might become > While I welcome your interest here :) I am not a fan of the 'case' approach > to this function as-is and plan to heavily refactor this when I get a chance. > > But at any rate, an early-exit situation is not a merge case, merge cases > describe cases where we _can_ merge, so we can drop this case 9 stuff (this > is not your fault, it's understandable why you would label this, this > function is just generally unclear). Yes, it's not a merge case. I label this to make it easier to understand. >> * PPNNNNNNNNNN PPPPPPPPPPCC >> * mmap, brk or case 4 below case 5 below >> * mremap move: >> @@ -890,6 +890,9 @@ static struct vm_area_struct >> if (vm_flags & VM_SPECIAL) >> return NULL; >> >> + if (prev && end < prev->vm_end) /* case 9 */ >> + return NULL; >> + > I need to get back into vma_merge() head space, but I don't actually think > a caller that's behaving correctly should ever do this. I know the ASCII > diagram above lists it as a thing that can happen, but I think we > implicitly avoid this from the way we invoke callers. Either prev == vma as > per vma_merge_extend(), or the loops that invoke vma_merge_new_vma() > wouldn't permit this to occur. No, it will actually happen. That's why I submitted this patch. > Let me look into it more deeply + reply again a bit later, I mean we could > perhaps do with asserting this somehow, but I don't think it's useful to do > an early exit for something that ostensibly _shouldn't_ happen. > >> /* Does the input range span an existing VMA? (cases 5 - 8) */ >> curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end); >> >> -- >> 2.25.1 >>
* Yajun Deng <yajun.deng@linux.dev> [240219 22:00]: > > On 2024/2/19 07:03, Lorenzo Stoakes wrote: > > On Sun, Feb 18, 2024 at 04:50:28PM +0800, Yajun Deng wrote: > > > If the prev vma exists and the end is less than the end of prev, we > > > can return NULL immediately. This reduces unnecessary operations. > > > > > > Signed-off-by: Yajun Deng <yajun.deng@linux.dev> > > Adding Vlastimil, while get_maintainers.pl might not show it very clearly, > > myself, Vlastimil and Liam often work with vma_merge() so it's handy to cc > > us on these if you can! > Okay. > > > --- > > > mm/mmap.c | 5 ++++- > > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > > > diff --git a/mm/mmap.c b/mm/mmap.c > > > index 8f176027583c..b738849321c0 100644 > > > --- a/mm/mmap.c > > > +++ b/mm/mmap.c > > > @@ -827,7 +827,7 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags, > > > * > > > * **** **** **** > > > * PPPPPPNNNNNN PPPPPPNNNNNN PPPPPPCCCCCC > > > - * cannot merge might become might become > > > + * cannot merge 9 might become might become > > While I welcome your interest here :) I am not a fan of the 'case' approach > > to this function as-is and plan to heavily refactor this when I get a chance. > > > > But at any rate, an early-exit situation is not a merge case, merge cases > > describe cases where we _can_ merge, so we can drop this case 9 stuff (this > > is not your fault, it's understandable why you would label this, this > > function is just generally unclear). > > Yes, it's not a merge case. I label this to make it easier to understand. But it isn't. It's not a case at all, it's a failure to merge. > > > > * PPNNNNNNNNNN PPPPPPPPPPCC > > > * mmap, brk or case 4 below case 5 below > > > * mremap move: > > > @@ -890,6 +890,9 @@ static struct vm_area_struct > > > if (vm_flags & VM_SPECIAL) > > > return NULL; > > > > > > + if (prev && end < prev->vm_end) /* case 9 */ > > > + return NULL; > > > + > > I need to get back into vma_merge() head space, but I don't actually think > > a caller that's behaving correctly should ever do this. I know the ASCII > > diagram above lists it as a thing that can happen, but I think we > > implicitly avoid this from the way we invoke callers. Either prev == vma as > > per vma_merge_extend(), or the loops that invoke vma_merge_new_vma() > > wouldn't permit this to occur. > No, it will actually happen. That's why I submitted this patch. Can you elaborate on where it happens? I mean, you seem to have already looked into it but haven't shared what you found of where it reduces the unnecessary operations. Such a detail should also be added to the commit log so that, when the call sites change, this check could be dropped - or be seen as necessary.
On Tue, Feb 20, 2024 at 11:00:30AM +0800, Yajun Deng wrote: > > On 2024/2/19 07:03, Lorenzo Stoakes wrote: [snip] > > Yes, it's not a merge case. I label this to make it easier to understand. OK, I guess I have to be more explicit + less soft here to avoid confusion as you seem not to be paying attention to what I have said - We can't have this in the patch, full stop. I (+ Liam) have already explained above as to why, but to emphasise - each case number refers to a merge case consistently throughout. Arbitrarily adding a new case label to describe one of the many early exit conditions proactively HURTS understanding. > > > > * PPNNNNNNNNNN PPPPPPPPPPCC > > > * mmap, brk or case 4 below case 5 below > > > * mremap move: > > > @@ -890,6 +890,9 @@ static struct vm_area_struct > > > if (vm_flags & VM_SPECIAL) > > > return NULL; > > > > > > + if (prev && end < prev->vm_end) /* case 9 */ > > > + return NULL; > > > + > > I need to get back into vma_merge() head space, but I don't actually think > > a caller that's behaving correctly should ever do this. I know the ASCII > > diagram above lists it as a thing that can happen, but I think we > > implicitly avoid this from the way we invoke callers. Either prev == vma as > > per vma_merge_extend(), or the loops that invoke vma_merge_new_vma() > > wouldn't permit this to occur. > No, it will actually happen. That's why I submitted this patch. You aren't explaining any situation where this would happen. As Liam says, this is something you have to provide. I have taken a moment to look into this and I am afraid I don't feel this patch makes sense. Firstly, let's assume you're right and we can reach this function with end < prev->vm_end: 1. curr will be NULL as find_vma_intersection(mm, prev->vm_end, end) will always find nothing since end < prev->vm_end. 2. We discover next by using vma_lookup(mm, end). This will always be NULL since no VMA starts at end (it is < prev->vm_end so within prev). 3. Therefore next will always be NULL. 4. Therefore the only situation in which the function would proceed is that checked in the 'if (prev)' block, but that checks whether addr == prev->vm_end, but since end < prev->vm_end, it can't [we explicitly check for addr >= end in a VM_WARN_ON()]. Therefore - we will always abort in this case, and your early check is really not that useful - it's not something that is likely to come up (actually I don't think that it can come up, we'll come on to that), and so being very slightly delayed in exiting is not a great gain. You are then also introducing a fairly useless branch for everybody else for - if it even exists - a very rare scenario. I do not think this is a good RoI. As to whether this can happen - I have dug a bit into callers: 1. vma_merge_extend() always specifies vma->vm_end as the start explicitly to extend the VMA so this scenario isn't possible. 2. Both callers of vma_merge_new_vma() are trying to insert a new VMA and explicitly look for a prev VMA and thus should never trigger this scenario. This leaves vma_modify(), and again I can't see a case where prev would not actually be the previous VMA, with start/end set accordingly. I am happy to be corrected/embarrassed if I'm missed something out here (vma_merge() is a great function for creating confusion + causing unlikely scenarios), so please do provide details of such a case if you can find one. TL;DR: - The case 9 stuff is completely wrong. - I do not think this patch is useful even if the scenario you describe arises. - I can't see how the scenario you describe could arise. So overall, unless you can provide compelling evidence for both this scenario actually occurring in practice AND the need for an early exit, this patch is a no-go. In addition, if you were to find such, you'd really really need to beef out the commit message, which is far too short, and frankly incorrect at this point - if you perform a branch which 99.9999% of the time is not taken, you are not 'reducing unnecessary operations' you are creating them. If you could find compelling evidence to support this patch and send this as a v2 then I'd consider it, but for the patch in its current form: NACK.
On Tue, Feb 20, 2024 at 09:00:15PM +0000, Lorenzo Stoakes wrote: > On Tue, Feb 20, 2024 at 11:00:30AM +0800, Yajun Deng wrote: > > > > On 2024/2/19 07:03, Lorenzo Stoakes wrote: > [snip] > > > > Yes, it's not a merge case. I label this to make it easier to understand. > > OK, I guess I have to be more explicit + less soft here to avoid confusion > as you seem not to be paying attention to what I have said - We can't have > this in the patch, full stop. > > I (+ Liam) have already explained above as to why, but to emphasise - each > case number refers to a merge case consistently throughout. Arbitrarily > adding a new case label to describe one of the many early exit conditions > proactively HURTS understanding. > > > > > > > * PPNNNNNNNNNN PPPPPPPPPPCC > > > > * mmap, brk or case 4 below case 5 below > > > > * mremap move: > > > > @@ -890,6 +890,9 @@ static struct vm_area_struct > > > > if (vm_flags & VM_SPECIAL) > > > > return NULL; > > > > > > > > + if (prev && end < prev->vm_end) /* case 9 */ > > > > + return NULL; > > > > + > > > I need to get back into vma_merge() head space, but I don't actually think > > > a caller that's behaving correctly should ever do this. I know the ASCII > > > diagram above lists it as a thing that can happen, but I think we > > > implicitly avoid this from the way we invoke callers. Either prev == vma as > > > per vma_merge_extend(), or the loops that invoke vma_merge_new_vma() > > > wouldn't permit this to occur. > > No, it will actually happen. That's why I submitted this patch. > > You aren't explaining any situation where this would happen. As Liam says, > this is something you have to provide. > > I have taken a moment to look into this and I am afraid I don't feel this > patch makes sense. > > Firstly, let's assume you're right and we can reach this function with end > < prev->vm_end: > > 1. curr will be NULL as find_vma_intersection(mm, prev->vm_end, end) will > always find nothing since end < prev->vm_end. > > 2. We discover next by using vma_lookup(mm, end). This will always be NULL > since no VMA starts at end (it is < prev->vm_end so within prev). > > 3. Therefore next will always be NULL. > > 4. Therefore the only situation in which the function would proceed is that > checked in the 'if (prev)' block, but that checks whether addr == > prev->vm_end, but since end < prev->vm_end, it can't [we explicitly > check for addr >= end in a VM_WARN_ON()]. OK I say below about feel fre to embarrass me if I am mistaken, but I can go ahead and do embarrass myself directly :) Apologies - the above is not correct, vma_lookup() doesn't look for an exact match so will assign next = prev in this scenario. However the vm_pgoff check will cause this case to fail also. > > Therefore - we will always abort in this case, and your early check is > really not that useful - it's not something that is likely to come up > (actually I don't think that it can come up, we'll come on to that), and so > being very slightly delayed in exiting is not a great gain. > > You are then also introducing a fairly useless branch for everybody else > for - if it even exists - a very rare scenario. I do not think this is a > good RoI. > > As to whether this can happen - I have dug a bit into callers: > > 1. vma_merge_extend() always specifies vma->vm_end as the start explicitly > to extend the VMA so this scenario isn't possible. > > 2. Both callers of vma_merge_new_vma() are trying to insert a new VMA and > explicitly look for a prev VMA and thus should never trigger this > scenario. > > This leaves vma_modify(), and again I can't see a case where prev would not > actually be the previous VMA, with start/end set accordingly. > > I am happy to be corrected/embarrassed if I'm missed something out here > (vma_merge() is a great function for creating confusion + causing unlikely > scenarios), so please do provide details of such a case if you can find > one. Also, I discovered that I can *ahem* reliably reproduce this scenario, so apologies, you were right to say it can arise (but it would have been useful for you to give details!) I can repro it in mprotect_fixup(), quite reliably, so it might be worth digging into why this scenario is happening. > > TL;DR: > > - The case 9 stuff is completely wrong. > - I do not think this patch is useful even if the scenario you describe > arises. > - I can't see how the scenario you describe could arise. > > So overall, unless you can provide compelling evidence for both this > scenario actually occurring in practice AND the need for an early exit, > this patch is a no-go. > > In addition, if you were to find such, you'd really really need to beef out > the commit message, which is far too short, and frankly incorrect at this > point - if you perform a branch which 99.9999% of the time is not taken, > you are not 'reducing unnecessary operations' you are creating them. > > If you could find compelling evidence to support this patch and send this > as a v2 then I'd consider it, but for the patch in its current form: > > NACK. I'll stick with the NACK (though slightly less assuredly), as the case 9 stuff has to go, that is the real bugbear here for me. It's more review feedback, but the commit message needs beefing up in any case too. I'm still not hugely convinced this is a worthwhile early exit, since we don't treat any other 'can't merge' case specially here. I think the more interesting thing to do here is to figure out why mprotect_fix() (or perhaps others) is even trying the merge under these circumstances. Again as I said in the first reply, thanks for your efforts looking at this! The nacking + such is simply an effort to ensure we are being extremely careful around this highly delicate function. Thanks, Lorenzo
diff --git a/mm/mmap.c b/mm/mmap.c index 8f176027583c..b738849321c0 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -827,7 +827,7 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags, * * **** **** **** * PPPPPPNNNNNN PPPPPPNNNNNN PPPPPPCCCCCC - * cannot merge might become might become + * cannot merge 9 might become might become * PPNNNNNNNNNN PPPPPPPPPPCC * mmap, brk or case 4 below case 5 below * mremap move: @@ -890,6 +890,9 @@ static struct vm_area_struct if (vm_flags & VM_SPECIAL) return NULL; + if (prev && end < prev->vm_end) /* case 9 */ + return NULL; + /* Does the input range span an existing VMA? (cases 5 - 8) */ curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end);
If the prev vma exists and the end is less than the end of prev, we can return NULL immediately. This reduces unnecessary operations. Signed-off-by: Yajun Deng <yajun.deng@linux.dev> --- mm/mmap.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)