Message ID | 20250127075527.16614-2-richard.weiyang@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | vma: fix unmapped_area() | expand |
You have a subject line of 'fix gap check for unmapped_area with VM_GROWSDOWN'. I'm not sure this is quite accurate. I don't really have time to do a deep dive (again, this is why it's so important to give a decent commit message - explaining under what _real world_ circumstances this will be used etc.). But anyway, it seems it will only be the case if MMF_TOPDOWN is not set in the mm flags, which usually requires an mmap compatibility mode to achieve unless the arch otherwise forces it. And these arches would be ones where the stack grows UP, right? Or at least ones where this is possible? So already we're into specifics - either arches that grow the stack up, or ones that intentionally use the old mmap compatibility mode are affected. This happens in: [ pretty much all unmapped area callers ] -> vm_unmapped_area() -> unmapped_area() (if !(info->flags & VM_UNMAPPED_AREA_TOPDOWN) Where VM_UNMAPPED_AREA_TOPDOWN is only not set in the circumstances mentioned above. So, for this issue you claim is the case to happen, you have to: 1. Either be using a stack grows up arch, or enabling an mmap() compatibility mode. 2. Also set MAP_GROWSDOWN on the mmap() call, which is translated to VM_GROWSDOWN. We are already far from 'fix gap check for VM_GROWSDOWN' right? I mean I don't have the time to absolutely dive into the guts of this, but I assume this is correct right? I'm not saying we shouldn't address this, but it's VITAL to clarify what exactly it is you're tackling. On Mon, Jan 27, 2025 at 07:55:26AM +0000, Wei Yang wrote: > Current unmapped_area() may fail to get the available area. > > For example, we have a vma range like below: > > 0123456789abcdef > m m A m m I don't understand this diagram at all. What is going on here? What is 'm' what is 'A' what are these values? page offsets * 0x1000? is that a page of memory allocated at each 'm'? Is A somehow an address under consideration? You _have_ to add a key and explanation here, my mind reading skills are much diminished in my old age... :P > > Let's assume start_gap is 0x2000 and stack_guard_gap is 0x1000. And now > we are looking for free area with size 0x1000 within [0x2000, 0xd000]. How can start_gap be 0x2000 when it is only ever 0x1000 at most and only applicable in x86-64 anyway? > It'd be good if you'd shown this on the diagram somehow? Like this: <---------> 0123456789abcdef m m A m m But then I'm confused as to what A is once again. Ideally you'd actually provide what the struct vm_unmapped_area_info fields actually are with other parameters and _work through_ an example. Also you're now talking about a stack but you haven't mentioned the word 'stack' anywhere in any part of this series afaict. 'Fix case where the arch grows stacks upwards or we are in legacy mmap mode but still want to map a grows-down stack' is a LOT more specific than 'fix unmapped_area()'. > The unmapped_area_topdown() could find address at 0x8000, while > unmapped_area() fails. OK you need to WORK THROUGH why this is. You're putting all the work on me as a reviewer to go check to see if this is indeed the case. It's not a fair distribution of work. > > In original code before commit 3499a13168da ("mm/mmap: use maple tree > for unmapped_area{_topdown}"), the logic is: > > * find a gap with total length, including alignment > * adjust start address with alignment > > What we do now is: > > * find a gap with total length, including alignment > * adjust start address with alignment > * then compare the left range with total length What is 'left range'? This explanation is really hard to follow. > > This is not necessary to compare with total length again after start > address adjusted. > > Also, it is not correct to minus 1 here. This may not trigger an issue > in real world, since address are usually aligned with page size. You aren't saying why. Also the VMA's start is _always_ page-aligned. Presumably the minus 1 is intentionally to amke it an inclusive value once offset by length? > > Fixes: 58c5d0d6d522 ("mm/mmap: regression fix for unmapped_area{_topdown}") Fixes it how? > Signed-off-by: Wei Yang <richard.weiyang@gmail.com> > CC: Liam R. Howlett <Liam.Howlett@Oracle.com> > CC: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > CC: Vlastimil Babka <vbabka@suse.cz> > CC: Jann Horn <jannh@google.com> > CC: Rick Edgecombe <rick.p.edgecombe@intel.com> > Cc: <stable@vger.kernel.org> > --- > mm/vma.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/vma.c b/mm/vma.c > index 3f45a245e31b..d82fdbc710b0 100644 > --- a/mm/vma.c > +++ b/mm/vma.c > @@ -2668,7 +2668,7 @@ unsigned long unmapped_area(struct vm_unmapped_area_info *info) > gap += (info->align_offset - gap) & info->align_mask; > tmp = vma_next(&vmi); > if (tmp && (tmp->vm_flags & VM_STARTGAP_FLAGS)) { /* Avoid prev check if possible */ > - if (vm_start_gap(tmp) < gap + length - 1) { > + if (vm_start_gap(tmp) < gap + info->length) { Have already spent all morning on this :) Sigh. OK so let's expand this (again - this is the kind of thing you should do for a tricky change like this). info->start_gap is set based on stack_guard_placement() and is either PAGE_SIZE (0x1000) if VM_SHADOW_STACK is set or 0. This is only applicable in x86-64. The align mask is likely to be PAGE_SIZE - 1 but can vary. length = info->length + align_mask + start_gap This takes into account the worst possible alignment overhead accounting for any following VMA also. gap is equal to the current start of the range under consideration (via vma_iter_addr) and as well institutes the appropriate alignment and accounts for start_gap for any _prior_ VMA. Then we try to find the vm_start_gap() for a candidate _next_ VMA, which if a stack, uses stack_guard_gap() to SUBTRACT the stack guard gap from vma->vm_start or account for the shadow stack. Then we finally have: if (next->vm_start - stack gap < start_of_range + [preceeding gap/alignment requirements] + [worst case length] - 1) { Try again } Or: start_of_range >= next->vm_start - stack gap + [preceeding gap/alignment requirements] + [worst case length] + 1 start of range v |[preceeding gap][ VMA, worst length ][following gap]<stack gap> Surely the + 1 (which is the -1 in the original calculation) is simply accounting for the fact that the start of the range is _inclusive_? You're proposing eliminating entirely the after-this-VMA requirements... why? This just seems incorrect to me? You really need to argue the case better if this has some validity, otherwise I think it's just wrong? > low_limit = tmp->vm_end; > vma_iter_reset(&vmi); > goto retry; > -- > 2.34.1 >
On Mon, Jan 27, 2025 at 12:08:04PM +0000, Lorenzo Stoakes wrote: >You have a subject line of 'fix gap check for unmapped_area with >VM_GROWSDOWN'. I'm not sure this is quite accurate. > >I don't really have time to do a deep dive (again, this is why it's so >important to give a decent commit message - explaining under what _real >world_ circumstances this will be used etc.). > >But anyway, it seems it will only be the case if MMF_TOPDOWN is not set in >the mm flags, which usually requires an mmap compatibility mode to achieve >unless the arch otherwise forces it. > >And these arches would be ones where the stack grows UP, right? Or at least >ones where this is possible? > >So already we're into specifics - either arches that grow the stack up, or >ones that intentionally use the old mmap compatibility mode are affected. > >This happens in: > >[ pretty much all unmapped area callers ] >-> vm_unmapped_area() >-> unmapped_area() (if !(info->flags & VM_UNMAPPED_AREA_TOPDOWN) > >Where VM_UNMAPPED_AREA_TOPDOWN is only not set in the circumstances >mentioned above. > >So, for this issue you claim is the case to happen, you have to: > >1. Either be using a stack grows up arch, or enabling an mmap() >compatibility mode. >2. Also set MAP_GROWSDOWN on the mmap() call, which is translated to >VM_GROWSDOWN. > >We are already far from 'fix gap check for VM_GROWSDOWN' right? I mean I >don't have the time to absolutely dive into the guts of this, but I assume >this is correct right? > >I'm not saying we shouldn't address this, but it's VITAL to clarify what >exactly it is you're tackling. > Thanks for taking a look. If my understanding is correct, your concern here is the case here never happen in real world. We are searching a gap bottom-up, while the vma wants top-down. This maybe possible to me. Here is my understanding, (but maybe not correct). We have two separate flags affecting the search: * mm->flags: MMF_TOPDOWN or not * vma->vm_flags: VM_GROWSDOWN or VM_GROWSUP To me, they are independent. For mm->flags, arch_pick_mmap_layout() could set/clear MMF_TOPDOWN it based on the result of mmap_is_legacy(). Even we provide a sysctl file /proc/sys/vm/legacy_vm_layout for configuration. For vma->vm_flags, for general, VM_STACK is set to VM_GROWSDOWN by default. And we use the flag in __bprm_mm_init() and setup_arg_pages(). So to me the case is real and not a very specific one. But maybe I missed some background. Would you mind telling me the miss part, if it is not too time wasting?
On Tue, Jan 28, 2025 at 03:30:30AM +0000, Wei Yang wrote: > On Mon, Jan 27, 2025 at 12:08:04PM +0000, Lorenzo Stoakes wrote: > >You have a subject line of 'fix gap check for unmapped_area with > >VM_GROWSDOWN'. I'm not sure this is quite accurate. > > > >I don't really have time to do a deep dive (again, this is why it's so > >important to give a decent commit message - explaining under what _real > >world_ circumstances this will be used etc.). > > > >But anyway, it seems it will only be the case if MMF_TOPDOWN is not set in > >the mm flags, which usually requires an mmap compatibility mode to achieve > >unless the arch otherwise forces it. > > > >And these arches would be ones where the stack grows UP, right? Or at least > >ones where this is possible? > > > >So already we're into specifics - either arches that grow the stack up, or > >ones that intentionally use the old mmap compatibility mode are affected. > > > >This happens in: > > > >[ pretty much all unmapped area callers ] > >-> vm_unmapped_area() > >-> unmapped_area() (if !(info->flags & VM_UNMAPPED_AREA_TOPDOWN) > > > >Where VM_UNMAPPED_AREA_TOPDOWN is only not set in the circumstances > >mentioned above. > > > >So, for this issue you claim is the case to happen, you have to: > > > >1. Either be using a stack grows up arch, or enabling an mmap() > >compatibility mode. > >2. Also set MAP_GROWSDOWN on the mmap() call, which is translated to > >VM_GROWSDOWN. > > > >We are already far from 'fix gap check for VM_GROWSDOWN' right? I mean I > >don't have the time to absolutely dive into the guts of this, but I assume > >this is correct right? > > > >I'm not saying we shouldn't address this, but it's VITAL to clarify what > >exactly it is you're tackling. > > > > Thanks for taking a look. > > If my understanding is correct, your concern here is the case here never > happen in real world. No, it's not, re-read what I've said. I'm saying you have completely failed to be specific in your commit message about how, what where and why the alleged issue happens, forcing me to spend my entire morning to figure out what on earth it is you're trying to do. I also (but you've clipped it) say I think your solution is just wrong and that there isn't an issue here. I also (but you've clipped it) say you utterly fail to explain what on earth you are doing. I also (but you've clipped it) say you assume the start_gap can be 0x2000 even though it can only ever be 0 or 0x1000. I have taken a great deal of time to specifically critique this even though you've given me little to work on. I simply do not have time to do this and in future if your commit messages are this bad I will just reject your series. > > We are searching a gap bottom-up, while the vma wants top-down. > > This maybe possible to me. Here is my understanding, (but maybe not correct). > > We have two separate flags affecting the search: > > * mm->flags: MMF_TOPDOWN or not > * vma->vm_flags: VM_GROWSDOWN or VM_GROWSUP > > To me, they are independent. You are simply reiterating things I've told you but you failed to mention in your series. > > For mm->flags, arch_pick_mmap_layout() could set/clear MMF_TOPDOWN it based on > the result of mmap_is_legacy(). Even we provide a sysctl file > /proc/sys/vm/legacy_vm_layout for configuration. Yes I know. > > > For vma->vm_flags, for general, VM_STACK is set to VM_GROWSDOWN by default. > And we use the flag in __bprm_mm_init() and setup_arg_pages(). Yes I know. > > So to me the case is real and not a very specific one. It's literally very specific to the point where you have to enable a compatibility mode. As you've just said. Sorry, this is unacceptable for _core_ mm work. This is not an area where you can provide nebulous and vague commit messages. > > But maybe I missed some background. Would you mind telling me the miss part, > if it is not too time wasting? > > -- > Wei Yang > Help you, Help me You have clipped the part where I point out that I think your 'solution' to the alleged 'problem' is incorrect. Honestly, drop this series Wei. The finding unmapped area code is necessarily conservative and fuzzy as it accounts for -worst case- alignment and providing appropriate gaps. It failing to find a gap in very specific and awkward circumstances across the vastness of the virtual address space isn't a bug.
diff --git a/mm/vma.c b/mm/vma.c index 3f45a245e31b..d82fdbc710b0 100644 --- a/mm/vma.c +++ b/mm/vma.c @@ -2668,7 +2668,7 @@ unsigned long unmapped_area(struct vm_unmapped_area_info *info) gap += (info->align_offset - gap) & info->align_mask; tmp = vma_next(&vmi); if (tmp && (tmp->vm_flags & VM_STARTGAP_FLAGS)) { /* Avoid prev check if possible */ - if (vm_start_gap(tmp) < gap + length - 1) { + if (vm_start_gap(tmp) < gap + info->length) { low_limit = tmp->vm_end; vma_iter_reset(&vmi); goto retry;
Current unmapped_area() may fail to get the available area. For example, we have a vma range like below: 0123456789abcdef m m A m m Let's assume start_gap is 0x2000 and stack_guard_gap is 0x1000. And now we are looking for free area with size 0x1000 within [0x2000, 0xd000]. The unmapped_area_topdown() could find address at 0x8000, while unmapped_area() fails. In original code before commit 3499a13168da ("mm/mmap: use maple tree for unmapped_area{_topdown}"), the logic is: * find a gap with total length, including alignment * adjust start address with alignment What we do now is: * find a gap with total length, including alignment * adjust start address with alignment * then compare the left range with total length This is not necessary to compare with total length again after start address adjusted. Also, it is not correct to minus 1 here. This may not trigger an issue in real world, since address are usually aligned with page size. Fixes: 58c5d0d6d522 ("mm/mmap: regression fix for unmapped_area{_topdown}") Signed-off-by: Wei Yang <richard.weiyang@gmail.com> CC: Liam R. Howlett <Liam.Howlett@Oracle.com> CC: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> CC: Vlastimil Babka <vbabka@suse.cz> CC: Jann Horn <jannh@google.com> CC: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: <stable@vger.kernel.org> --- mm/vma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)