diff mbox series

vma: Detect infinite loop in vma tree

Message ID 20241031170138.1935115-1-Liam.Howlett@oracle.com (mailing list archive)
State New
Headers show
Series vma: Detect infinite loop in vma tree | expand

Commit Message

Liam R. Howlett Oct. 31, 2024, 5:01 p.m. UTC
From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>

There have been no reported infinite loops in the tree, but checking the
detection of an infinite loop during validation is simple enough.  Add
the detection to the validate_mm() function so that error reports are
clear and don't just report stalls.

This does not protect against internal maple tree issues, but it does
detect too many vmas being returned from the tree.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jann Horn <jannh@google.com>
---
 mm/vma.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Liam R. Howlett Oct. 31, 2024, 5:04 p.m. UTC | #1
* Liam R. Howlett <Liam.Howlett@oracle.com> [241031 13:01]:
> From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
> 
> There have been no reported infinite loops in the tree, but checking the
> detection of an infinite loop during validation is simple enough.  Add
> the detection to the validate_mm() function so that error reports are
> clear and don't just report stalls.
> 
> This does not protect against internal maple tree issues, but it does
> detect too many vmas being returned from the tree.

I should add:

This patch does have the downside of making the count difference less
useful (we will only see one extra as apposed to what may exist).

> 
> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Jann Horn <jannh@google.com>
> ---
>  mm/vma.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/vma.c b/mm/vma.c
> index 68138e8c153e..60ed8cc187ad 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -615,7 +615,8 @@ void validate_mm(struct mm_struct *mm)
>  			anon_vma_unlock_read(anon_vma);
>  		}
>  #endif
> -		i++;
> +		if (++i > mm->map_count)
> +			break;
>  	}
>  	if (i != mm->map_count) {
>  		pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i);
> -- 
> 2.43.0
>
Vlastimil Babka Oct. 31, 2024, 5:06 p.m. UTC | #2
On 10/31/24 18:01, Liam R. Howlett wrote:
> From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
> 
> There have been no reported infinite loops in the tree, but checking the
> detection of an infinite loop during validation is simple enough.  Add
> the detection to the validate_mm() function so that error reports are
> clear and don't just report stalls.
> 
> This does not protect against internal maple tree issues, but it does
> detect too many vmas being returned from the tree.
> 
> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Jann Horn <jannh@google.com>
> ---
>  mm/vma.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/vma.c b/mm/vma.c
> index 68138e8c153e..60ed8cc187ad 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -615,7 +615,8 @@ void validate_mm(struct mm_struct *mm)
>  			anon_vma_unlock_read(anon_vma);
>  		}
>  #endif
> -		i++;
> +		if (++i > mm->map_count)
> +			break;

Would it make sense to allow some slack so that the error below can
distinguish better between off-by-one/few error from a complete corruption?

And in that case assign some special value to "i" (-1?) to make it clear
this was triggered?

>  	}
>  	if (i != mm->map_count) {
>  		pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i);
Liam R. Howlett Oct. 31, 2024, 5:13 p.m. UTC | #3
* Vlastimil Babka <vbabka@suse.cz> [241031 13:07]:
> On 10/31/24 18:01, Liam R. Howlett wrote:
> > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
> > 
> > There have been no reported infinite loops in the tree, but checking the
> > detection of an infinite loop during validation is simple enough.  Add
> > the detection to the validate_mm() function so that error reports are
> > clear and don't just report stalls.
> > 
> > This does not protect against internal maple tree issues, but it does
> > detect too many vmas being returned from the tree.
> > 
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> > Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > Cc: Vlastimil Babka <vbabka@suse.cz>
> > Cc: Jann Horn <jannh@google.com>
> > ---
> >  mm/vma.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/vma.c b/mm/vma.c
> > index 68138e8c153e..60ed8cc187ad 100644
> > --- a/mm/vma.c
> > +++ b/mm/vma.c
> > @@ -615,7 +615,8 @@ void validate_mm(struct mm_struct *mm)
> >  			anon_vma_unlock_read(anon_vma);
> >  		}
> >  #endif
> > -		i++;
> > +		if (++i > mm->map_count)
> > +			break;
> 
> Would it make sense to allow some slack so that the error below can
> distinguish better between off-by-one/few error from a complete corruption?
> 
> And in that case assign some special value to "i" (-1?) to make it clear
> this was triggered?

Yes, probably.  10 would be plenty.  In recent memory I cannot think of
an example that we exceeded 7 munmap()'s in a single operation -
although it is easily possible to do.

I like the idea of -1 too, at least someone would come to inspect where
it came from at that point.

> 
> >  	}
> >  	if (i != mm->map_count) {
> >  		pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i);
>
Lorenzo Stoakes Oct. 31, 2024, 5:21 p.m. UTC | #4
On Thu, Oct 31, 2024 at 01:13:28PM -0400, Liam R. Howlett wrote:
> * Vlastimil Babka <vbabka@suse.cz> [241031 13:07]:
> > On 10/31/24 18:01, Liam R. Howlett wrote:
> > > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
> > >
> > > There have been no reported infinite loops in the tree, but checking the
> > > detection of an infinite loop during validation is simple enough.  Add
> > > the detection to the validate_mm() function so that error reports are
> > > clear and don't just report stalls.
> > >
> > > This does not protect against internal maple tree issues, but it does
> > > detect too many vmas being returned from the tree.
> > >
> > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> > > Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > > Cc: Vlastimil Babka <vbabka@suse.cz>
> > > Cc: Jann Horn <jannh@google.com>
> > > ---
> > >  mm/vma.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/vma.c b/mm/vma.c
> > > index 68138e8c153e..60ed8cc187ad 100644
> > > --- a/mm/vma.c
> > > +++ b/mm/vma.c
> > > @@ -615,7 +615,8 @@ void validate_mm(struct mm_struct *mm)
> > >  			anon_vma_unlock_read(anon_vma);
> > >  		}
> > >  #endif
> > > -		i++;
> > > +		if (++i > mm->map_count)
> > > +			break;
> >
> > Would it make sense to allow some slack so that the error below can
> > distinguish better between off-by-one/few error from a complete corruption?
> >
> > And in that case assign some special value to "i" (-1?) to make it clear
> > this was triggered?
>
> Yes, probably.  10 would be plenty.  In recent memory I cannot think of
> an example that we exceeded 7 munmap()'s in a single operation -
> although it is easily possible to do.
>
> I like the idea of -1 too, at least someone would come to inspect where
> it came from at that point.

Hm this feels a little arbitrary though... I mean can we race with
map_count at this point or is everything locked down such that no munmaps()
can happen?

Otherwise it feels a bit whack-a-mole.

I agree with Vlastimil though it'd be nice to sort of differentiate, but if
we _absolutely can only iterate mm->map_count times_ here, it might be
worth letting a few more go, then in the next bit of code...

>
> >
> > >  	}
> > >  	if (i != mm->map_count) {
> > >  		pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i);
> >

...here which does indeed imply that i literally cannot be anything but
mm->map_count, I mean I guess we already get to see here how far off we
are.

So yeah something like letting it go 10 more times (maybe like #define
UNUSUALLY_BAD_CORRUPTION_COUNT 10 or something I don't know naming is hard)
just so we can pick that out would be nice.

But I do like the general idea here though!
Liam R. Howlett Oct. 31, 2024, 5:30 p.m. UTC | #5
* Lorenzo Stoakes <lorenzo.stoakes@oracle.com> [241031 13:22]:
> On Thu, Oct 31, 2024 at 01:13:28PM -0400, Liam R. Howlett wrote:
> > * Vlastimil Babka <vbabka@suse.cz> [241031 13:07]:
> > > On 10/31/24 18:01, Liam R. Howlett wrote:
> > > > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
> > > >
> > > > There have been no reported infinite loops in the tree, but checking the
> > > > detection of an infinite loop during validation is simple enough.  Add
> > > > the detection to the validate_mm() function so that error reports are
> > > > clear and don't just report stalls.
> > > >
> > > > This does not protect against internal maple tree issues, but it does
> > > > detect too many vmas being returned from the tree.
> > > >
> > > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> > > > Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > > > Cc: Vlastimil Babka <vbabka@suse.cz>
> > > > Cc: Jann Horn <jannh@google.com>
> > > > ---
> > > >  mm/vma.c | 3 ++-
> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/mm/vma.c b/mm/vma.c
> > > > index 68138e8c153e..60ed8cc187ad 100644
> > > > --- a/mm/vma.c
> > > > +++ b/mm/vma.c
> > > > @@ -615,7 +615,8 @@ void validate_mm(struct mm_struct *mm)
> > > >  			anon_vma_unlock_read(anon_vma);
> > > >  		}
> > > >  #endif
> > > > -		i++;
> > > > +		if (++i > mm->map_count)
> > > > +			break;
> > >
> > > Would it make sense to allow some slack so that the error below can
> > > distinguish better between off-by-one/few error from a complete corruption?
> > >
> > > And in that case assign some special value to "i" (-1?) to make it clear
> > > this was triggered?
> >
> > Yes, probably.  10 would be plenty.  In recent memory I cannot think of
> > an example that we exceeded 7 munmap()'s in a single operation -
> > although it is easily possible to do.
> >
> > I like the idea of -1 too, at least someone would come to inspect where
> > it came from at that point.
> 
> Hm this feels a little arbitrary though... I mean can we race with
> map_count at this point or is everything locked down such that no munmaps()
> can happen?
> 
> Otherwise it feels a bit whack-a-mole.
> 
> I agree with Vlastimil though it'd be nice to sort of differentiate, but if
> we _absolutely can only iterate mm->map_count times_ here, it might be
> worth letting a few more go, then in the next bit of code...

this is done under the write lock, otherwise it would vary already.

> 
> >
> > >
> > > >  	}
> > > >  	if (i != mm->map_count) {
> > > >  		pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i);
> > >
> 
> ...here which does indeed imply that i literally cannot be anything but
> mm->map_count, I mean I guess we already get to see here how far off we
> are.
> 
> So yeah something like letting it go 10 more times (maybe like #define
> UNUSUALLY_BAD_CORRUPTION_COUNT 10 or something I don't know naming is hard)
> just so we can pick that out would be nice.
> 
> But I do like the general idea here though!

This may not ever produce anything - if we are in the maple tree code
and never reaching a leaf to return anything then we will still loop
forever.  But it is something that is easy enough to detect and stop -
which may make a syzbot report a bit easier to swallow.
diff mbox series

Patch

diff --git a/mm/vma.c b/mm/vma.c
index 68138e8c153e..60ed8cc187ad 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -615,7 +615,8 @@  void validate_mm(struct mm_struct *mm)
 			anon_vma_unlock_read(anon_vma);
 		}
 #endif
-		i++;
+		if (++i > mm->map_count)
+			break;
 	}
 	if (i != mm->map_count) {
 		pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i);