Message ID | 20241031170138.1935115-1-Liam.Howlett@oracle.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | vma: Detect infinite loop in vma tree | expand |
* Liam R. Howlett <Liam.Howlett@oracle.com> [241031 13:01]: > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com> > > There have been no reported infinite loops in the tree, but checking the > detection of an infinite loop during validation is simple enough. Add > the detection to the validate_mm() function so that error reports are > clear and don't just report stalls. > > This does not protect against internal maple tree issues, but it does > detect too many vmas being returned from the tree. I should add: This patch does have the downside of making the count difference less useful (we will only see one extra as apposed to what may exist). > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> > Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: Jann Horn <jannh@google.com> > --- > mm/vma.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/mm/vma.c b/mm/vma.c > index 68138e8c153e..60ed8cc187ad 100644 > --- a/mm/vma.c > +++ b/mm/vma.c > @@ -615,7 +615,8 @@ void validate_mm(struct mm_struct *mm) > anon_vma_unlock_read(anon_vma); > } > #endif > - i++; > + if (++i > mm->map_count) > + break; > } > if (i != mm->map_count) { > pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i); > -- > 2.43.0 >
On 10/31/24 18:01, Liam R. Howlett wrote: > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com> > > There have been no reported infinite loops in the tree, but checking the > detection of an infinite loop during validation is simple enough. Add > the detection to the validate_mm() function so that error reports are > clear and don't just report stalls. > > This does not protect against internal maple tree issues, but it does > detect too many vmas being returned from the tree. > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> > Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: Jann Horn <jannh@google.com> > --- > mm/vma.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/mm/vma.c b/mm/vma.c > index 68138e8c153e..60ed8cc187ad 100644 > --- a/mm/vma.c > +++ b/mm/vma.c > @@ -615,7 +615,8 @@ void validate_mm(struct mm_struct *mm) > anon_vma_unlock_read(anon_vma); > } > #endif > - i++; > + if (++i > mm->map_count) > + break; Would it make sense to allow some slack so that the error below can distinguish better between off-by-one/few error from a complete corruption? And in that case assign some special value to "i" (-1?) to make it clear this was triggered? > } > if (i != mm->map_count) { > pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i);
* Vlastimil Babka <vbabka@suse.cz> [241031 13:07]: > On 10/31/24 18:01, Liam R. Howlett wrote: > > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com> > > > > There have been no reported infinite loops in the tree, but checking the > > detection of an infinite loop during validation is simple enough. Add > > the detection to the validate_mm() function so that error reports are > > clear and don't just report stalls. > > > > This does not protect against internal maple tree issues, but it does > > detect too many vmas being returned from the tree. > > > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> > > Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > > Cc: Vlastimil Babka <vbabka@suse.cz> > > Cc: Jann Horn <jannh@google.com> > > --- > > mm/vma.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/mm/vma.c b/mm/vma.c > > index 68138e8c153e..60ed8cc187ad 100644 > > --- a/mm/vma.c > > +++ b/mm/vma.c > > @@ -615,7 +615,8 @@ void validate_mm(struct mm_struct *mm) > > anon_vma_unlock_read(anon_vma); > > } > > #endif > > - i++; > > + if (++i > mm->map_count) > > + break; > > Would it make sense to allow some slack so that the error below can > distinguish better between off-by-one/few error from a complete corruption? > > And in that case assign some special value to "i" (-1?) to make it clear > this was triggered? Yes, probably. 10 would be plenty. In recent memory I cannot think of an example that we exceeded 7 munmap()'s in a single operation - although it is easily possible to do. I like the idea of -1 too, at least someone would come to inspect where it came from at that point. > > > } > > if (i != mm->map_count) { > > pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i); >
On Thu, Oct 31, 2024 at 01:13:28PM -0400, Liam R. Howlett wrote: > * Vlastimil Babka <vbabka@suse.cz> [241031 13:07]: > > On 10/31/24 18:01, Liam R. Howlett wrote: > > > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com> > > > > > > There have been no reported infinite loops in the tree, but checking the > > > detection of an infinite loop during validation is simple enough. Add > > > the detection to the validate_mm() function so that error reports are > > > clear and don't just report stalls. > > > > > > This does not protect against internal maple tree issues, but it does > > > detect too many vmas being returned from the tree. > > > > > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> > > > Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > > > Cc: Vlastimil Babka <vbabka@suse.cz> > > > Cc: Jann Horn <jannh@google.com> > > > --- > > > mm/vma.c | 3 ++- > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/mm/vma.c b/mm/vma.c > > > index 68138e8c153e..60ed8cc187ad 100644 > > > --- a/mm/vma.c > > > +++ b/mm/vma.c > > > @@ -615,7 +615,8 @@ void validate_mm(struct mm_struct *mm) > > > anon_vma_unlock_read(anon_vma); > > > } > > > #endif > > > - i++; > > > + if (++i > mm->map_count) > > > + break; > > > > Would it make sense to allow some slack so that the error below can > > distinguish better between off-by-one/few error from a complete corruption? > > > > And in that case assign some special value to "i" (-1?) to make it clear > > this was triggered? > > Yes, probably. 10 would be plenty. In recent memory I cannot think of > an example that we exceeded 7 munmap()'s in a single operation - > although it is easily possible to do. > > I like the idea of -1 too, at least someone would come to inspect where > it came from at that point. Hm this feels a little arbitrary though... I mean can we race with map_count at this point or is everything locked down such that no munmaps() can happen? Otherwise it feels a bit whack-a-mole. I agree with Vlastimil though it'd be nice to sort of differentiate, but if we _absolutely can only iterate mm->map_count times_ here, it might be worth letting a few more go, then in the next bit of code... > > > > > > } > > > if (i != mm->map_count) { > > > pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i); > > ...here which does indeed imply that i literally cannot be anything but mm->map_count, I mean I guess we already get to see here how far off we are. So yeah something like letting it go 10 more times (maybe like #define UNUSUALLY_BAD_CORRUPTION_COUNT 10 or something I don't know naming is hard) just so we can pick that out would be nice. But I do like the general idea here though!
* Lorenzo Stoakes <lorenzo.stoakes@oracle.com> [241031 13:22]: > On Thu, Oct 31, 2024 at 01:13:28PM -0400, Liam R. Howlett wrote: > > * Vlastimil Babka <vbabka@suse.cz> [241031 13:07]: > > > On 10/31/24 18:01, Liam R. Howlett wrote: > > > > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com> > > > > > > > > There have been no reported infinite loops in the tree, but checking the > > > > detection of an infinite loop during validation is simple enough. Add > > > > the detection to the validate_mm() function so that error reports are > > > > clear and don't just report stalls. > > > > > > > > This does not protect against internal maple tree issues, but it does > > > > detect too many vmas being returned from the tree. > > > > > > > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> > > > > Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > > > > Cc: Vlastimil Babka <vbabka@suse.cz> > > > > Cc: Jann Horn <jannh@google.com> > > > > --- > > > > mm/vma.c | 3 ++- > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/mm/vma.c b/mm/vma.c > > > > index 68138e8c153e..60ed8cc187ad 100644 > > > > --- a/mm/vma.c > > > > +++ b/mm/vma.c > > > > @@ -615,7 +615,8 @@ void validate_mm(struct mm_struct *mm) > > > > anon_vma_unlock_read(anon_vma); > > > > } > > > > #endif > > > > - i++; > > > > + if (++i > mm->map_count) > > > > + break; > > > > > > Would it make sense to allow some slack so that the error below can > > > distinguish better between off-by-one/few error from a complete corruption? > > > > > > And in that case assign some special value to "i" (-1?) to make it clear > > > this was triggered? > > > > Yes, probably. 10 would be plenty. In recent memory I cannot think of > > an example that we exceeded 7 munmap()'s in a single operation - > > although it is easily possible to do. > > > > I like the idea of -1 too, at least someone would come to inspect where > > it came from at that point. > > Hm this feels a little arbitrary though... I mean can we race with > map_count at this point or is everything locked down such that no munmaps() > can happen? > > Otherwise it feels a bit whack-a-mole. > > I agree with Vlastimil though it'd be nice to sort of differentiate, but if > we _absolutely can only iterate mm->map_count times_ here, it might be > worth letting a few more go, then in the next bit of code... this is done under the write lock, otherwise it would vary already. > > > > > > > > > > } > > > > if (i != mm->map_count) { > > > > pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i); > > > > > ...here which does indeed imply that i literally cannot be anything but > mm->map_count, I mean I guess we already get to see here how far off we > are. > > So yeah something like letting it go 10 more times (maybe like #define > UNUSUALLY_BAD_CORRUPTION_COUNT 10 or something I don't know naming is hard) > just so we can pick that out would be nice. > > But I do like the general idea here though! This may not ever produce anything - if we are in the maple tree code and never reaching a leaf to return anything then we will still loop forever. But it is something that is easy enough to detect and stop - which may make a syzbot report a bit easier to swallow.
diff --git a/mm/vma.c b/mm/vma.c index 68138e8c153e..60ed8cc187ad 100644 --- a/mm/vma.c +++ b/mm/vma.c @@ -615,7 +615,8 @@ void validate_mm(struct mm_struct *mm) anon_vma_unlock_read(anon_vma); } #endif - i++; + if (++i > mm->map_count) + break; } if (i != mm->map_count) { pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i);