diff mbox series

[v2] mm/mmap: Don't unlock VMAs in remap_file_pages()

Message ID 20201215155441.1497432-1-Liam.Howlett@Oracle.com (mailing list archive)
State New, archived
Headers show
Series [v2] mm/mmap: Don't unlock VMAs in remap_file_pages() | expand

Commit Message

Liam Howlett Dec. 15, 2020, 3:54 p.m. UTC
do_mmap() will unlock the necessary VMAs.  There is also a bug in the
loop which will evaluate as false and not unlock any VMAs anyways.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
---
 mm/mmap.c | 18 +-----------------
 1 file changed, 1 insertion(+), 17 deletions(-)

Comments

David Hildenbrand Dec. 16, 2020, 2:58 p.m. UTC | #1
On 15.12.20 16:54, Liam R. Howlett wrote:
> do_mmap() will unlock the necessary VMAs.  There is also a bug in the
> loop which will evaluate as false and not unlock any VMAs anyways.

If there is a BUG, do we have a Fixes: tag? Also

1. Can we fix the bug separately first?
2. Can we have a better description on what the bug actually is
"evaluate as false"? What is the result of the bug?

CCing some people that might know if this is actually a sane change.
Skimming over do_mmap(), it's not immediately clear to me that
"do_mmap() will unlock the necessary VMAs".

> 
> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> ---
>  mm/mmap.c | 18 +-----------------
>  1 file changed, 1 insertion(+), 17 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 5c8b4485860de..f7fecb77f84fd 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -3025,25 +3025,9 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
>  
>  	flags &= MAP_NONBLOCK;
>  	flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
> -	if (vma->vm_flags & VM_LOCKED) {
> -		struct vm_area_struct *tmp;
> +	if (vma->vm_flags & VM_LOCKED)
>  		flags |= MAP_LOCKED;
>  
> -		/* drop PG_Mlocked flag for over-mapped range */
> -		for (tmp = vma; tmp->vm_start >= start + size;
> -				tmp = tmp->vm_next) {
> -			/*
> -			 * Split pmd and munlock page on the border
> -			 * of the range.
> -			 */
> -			vma_adjust_trans_huge(tmp, start, start + size, 0);
> -
> -			munlock_vma_pages_range(tmp,
> -					max(tmp->vm_start, start),
> -					min(tmp->vm_end, start + size));
> -		}
> -	}
> -
>  	file = get_file(vma->vm_file);
>  	ret = do_mmap(vma->vm_file, start, size,
>  			prot, flags, pgoff, &populate, NULL);
>
Liam Howlett Dec. 16, 2020, 8:42 p.m. UTC | #2
Thank you for looking at this.  I appreciate the scrutiny.

* David Hildenbrand <david@redhat.com> [201216 09:58]:
> On 15.12.20 16:54, Liam R. Howlett wrote:
> > do_mmap() will unlock the necessary VMAs.  There is also a bug in the
> > loop which will evaluate as false and not unlock any VMAs anyways.
> 
> If there is a BUG, do we have a Fixes: tag? Also

The bug would never show up as it is masked by do_mmap() unlocking the
necessary range.  Although there is a bug in this code, the code does
not cause an issue as it won't execute so should I have a Fixes tag?
The code works and what I've done is remove a chunk of code that never
runs.

> 
> 1. Can we fix the bug separately first?

I think it is safer to remove unexecuted code than enable it and then
remove it.

> 2. Can we have a better description on what the bug actually is
> "evaluate as false"? What is the result of the bug?

The bug is in the for loop test expression that I removed in the patch.
Here is the long explaination of why the loop has never run.


Line 2982: if (start + size <= start
Line 2983: 	goto out;

size is positive.

Line 2992: vma = find_vma(mm, start);
Look up the first VMA which satisfies start < vm_end

Line 2997: if (start < vma->vm_start)
Line 2998: 	goto out;

So now vma->vm_start >= start.
If vma->vm_start > start, then there are no VMAs in that area, otherwise
it would have been returned by find_vma().
So we can say that vma->vm_start == start.

Line 3033: for (tmp = vma; tmp->vm_start >= start + size;
Line 3034:                 tmp = tmp->vm_next) {
This is the for loop with the error in the test expression.

tmp->vm_start == start which cannot be >= (start + size).

I believe the intention was to loop through vmas in the range of start
to (start + size) and unlock them.


The result of the bug is no VMA is unlocked in this fuction.  But that
doesn't matter as they are unlocked later in the call chain - which is
why this code works as intended.


> 
> CCing some people that might know if this is actually a sane change.
> Skimming over do_mmap(), it's not immediately clear to me that
> "do_mmap() will unlock the necessary VMAs".

Ah, yes.  That is understandable.

do_mmap() L1583 -> mmap_region() L1752 -> munmap_vma_range() ->
do_munmap() -> __do_munmap() loop at 2891 to unlock the range.

Would you like me to add this call chain to the changelog?

> 
> > 
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> > ---
> >  mm/mmap.c | 18 +-----------------
> >  1 file changed, 1 insertion(+), 17 deletions(-)
> > 
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 5c8b4485860de..f7fecb77f84fd 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -3025,25 +3025,9 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
> >  
> >  	flags &= MAP_NONBLOCK;
> >  	flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
> > -	if (vma->vm_flags & VM_LOCKED) {
> > -		struct vm_area_struct *tmp;
> > +	if (vma->vm_flags & VM_LOCKED)
> >  		flags |= MAP_LOCKED;
> >  
> > -		/* drop PG_Mlocked flag for over-mapped range */
> > -		for (tmp = vma; tmp->vm_start >= start + size;
         This should probably be less than ---^

> > -				tmp = tmp->vm_next) {
> > -			/*
> > -			 * Split pmd and munlock page on the border
> > -			 * of the range.
> > -			 */
> > -			vma_adjust_trans_huge(tmp, start, start + size, 0);
> > -
> > -			munlock_vma_pages_range(tmp,
> > -					max(tmp->vm_start, start),
> > -					min(tmp->vm_end, start + size));
> > -		}
> > -	}
> > -
> >  	file = get_file(vma->vm_file);
> >  	ret = do_mmap(vma->vm_file, start, size,
> >  			prot, flags, pgoff, &populate, NULL);
> > 
> 
> 
> -- 
> Thanks,
> 
> David / dhildenb
> 
>
Hugh Dickins Dec. 16, 2020, 9:33 p.m. UTC | #3
On Wed, 16 Dec 2020, Liam R. Howlett wrote:
> 
> Thank you for looking at this.  I appreciate the scrutiny.
> 
> * David Hildenbrand <david@redhat.com> [201216 09:58]:
> > On 15.12.20 16:54, Liam R. Howlett wrote:
> > > do_mmap() will unlock the necessary VMAs.  There is also a bug in the
> > > loop which will evaluate as false and not unlock any VMAs anyways.
> > 
> > If there is a BUG, do we have a Fixes: tag? Also
> 
> The bug would never show up as it is masked by do_mmap() unlocking the
> necessary range.  Although there is a bug in this code, the code does
> not cause an issue as it won't execute so should I have a Fixes tag?
> The code works and what I've done is remove a chunk of code that never
> runs.
> 
> > 
> > 1. Can we fix the bug separately first?
> 
> I think it is safer to remove unexecuted code than enable it and then
> remove it.

Agreed.

> 
> > 2. Can we have a better description on what the bug actually is
> > "evaluate as false"? What is the result of the bug?
> 
> The bug is in the for loop test expression that I removed in the patch.
> Here is the long explaination of why the loop has never run.
> 
> 
> Line 2982: if (start + size <= start
> Line 2983: 	goto out;
> 
> size is positive.
> 
> Line 2992: vma = find_vma(mm, start);
> Look up the first VMA which satisfies start < vm_end
> 
> Line 2997: if (start < vma->vm_start)
> Line 2998: 	goto out;
> 
> So now vma->vm_start >= start.
> If vma->vm_start > start, then there are no VMAs in that area, otherwise
> it would have been returned by find_vma().
> So we can say that vma->vm_start == start.
> 
> Line 3033: for (tmp = vma; tmp->vm_start >= start + size;
> Line 3034:                 tmp = tmp->vm_next) {
> This is the for loop with the error in the test expression.
> 
> tmp->vm_start == start which cannot be >= (start + size).
> 
> I believe the intention was to loop through vmas in the range of start
> to (start + size) and unlock them.
> 
> 
> The result of the bug is no VMA is unlocked in this fuction.  But that
> doesn't matter as they are unlocked later in the call chain - which is
> why this code works as intended.

Yes.

> 
> 
> > 
> > CCing some people that might know if this is actually a sane change.
> > Skimming over do_mmap(), it's not immediately clear to me that
> > "do_mmap() will unlock the necessary VMAs".
> 
> Ah, yes.  That is understandable.
> 
> do_mmap() L1583 -> mmap_region() L1752 -> munmap_vma_range() ->
> do_munmap() -> __do_munmap() loop at 2891 to unlock the range.
> 
> Would you like me to add this call chain to the changelog?

I don't think you need to add that: do_mmap(MAP_FIXED) simply has to
be able to munlock the range, much else would be broken if it did not.

> 
> > 
> > > 
> > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>

Acked-by: Hugh Dickins <hughd@google.com>

This is indeed a sane change.  I stumbled over that mistaken code
back in the days of PageTeam shmem huge pages, when syzkaller hit
a VM_BUG_ON_PAGE because of it; deleted the block as you have in v2;
then it fell off our radar when updating to PageCompound huge pages -
when Salman noticed as you have that the loop was ineffectual anyway.
It's just good to delete this dead code and confusion.

Though, in the course of writing that paragraph, I have come to wonder:
how did syzkaller hit a VM_BUG_ON_PAGE in code that is never executed??
Was something else different back then, or are we overlooking a case?

But whatever, the block is redundant and your v2 patch is good.

> > > ---
> > >  mm/mmap.c | 18 +-----------------
> > >  1 file changed, 1 insertion(+), 17 deletions(-)
> > > 
> > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > index 5c8b4485860de..f7fecb77f84fd 100644
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -3025,25 +3025,9 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
> > >  
> > >  	flags &= MAP_NONBLOCK;
> > >  	flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
> > > -	if (vma->vm_flags & VM_LOCKED) {
> > > -		struct vm_area_struct *tmp;
> > > +	if (vma->vm_flags & VM_LOCKED)
> > >  		flags |= MAP_LOCKED;
> > >  
> > > -		/* drop PG_Mlocked flag for over-mapped range */
> > > -		for (tmp = vma; tmp->vm_start >= start + size;
>          This should probably be less than ---^
> 
> > > -				tmp = tmp->vm_next) {
> > > -			/*
> > > -			 * Split pmd and munlock page on the border
> > > -			 * of the range.
> > > -			 */
> > > -			vma_adjust_trans_huge(tmp, start, start + size, 0);
> > > -
> > > -			munlock_vma_pages_range(tmp,
> > > -					max(tmp->vm_start, start),
> > > -					min(tmp->vm_end, start + size));
> > > -		}
> > > -	}
> > > -
> > >  	file = get_file(vma->vm_file);
> > >  	ret = do_mmap(vma->vm_file, start, size,
> > >  			prot, flags, pgoff, &populate, NULL);
> > > 
> > 
> > 
> > -- 
> > Thanks,
> > 
> > David / dhildenb
David Hildenbrand Dec. 17, 2020, 10:05 a.m. UTC | #4
On 16.12.20 21:42, Liam R. Howlett wrote:
> 
> Thank you for looking at this.  I appreciate the scrutiny.
> 
> * David Hildenbrand <david@redhat.com> [201216 09:58]:
>> On 15.12.20 16:54, Liam R. Howlett wrote:
>>> do_mmap() will unlock the necessary VMAs.  There is also a bug in the
>>> loop which will evaluate as false and not unlock any VMAs anyways.
>>
>> If there is a BUG, do we have a Fixes: tag? Also
> 
> The bug would never show up as it is masked by do_mmap() unlocking the
> necessary range.  Although there is a bug in this code, the code does
> not cause an issue as it won't execute so should I have a Fixes tag?
> The code works and what I've done is remove a chunk of code that never
> runs.
> 

Ok I see. The use of "bug" here is misleading. The unnecessary code is
simply not doing what it promised to do without doing any harm.

>>
>> 1. Can we fix the bug separately first?
> 
> I think it is safer to remove unexecuted code than enable it and then
> remove it.

I agree, as it is not actually a bug.

> 
>> 2. Can we have a better description on what the bug actually is
>> "evaluate as false"? What is the result of the bug?
> 
> The bug is in the for loop test expression that I removed in the patch.
> Here is the long explaination of why the loop has never run.
> 
> 
> Line 2982: if (start + size <= start
> Line 2983: 	goto out;
> 
> size is positive.
> 
> Line 2992: vma = find_vma(mm, start);
> Look up the first VMA which satisfies start < vm_end
> 
> Line 2997: if (start < vma->vm_start)
> Line 2998: 	goto out;
> 
> So now vma->vm_start >= start.
> If vma->vm_start > start, then there are no VMAs in that area, otherwise
> it would have been returned by find_vma().
> So we can say that vma->vm_start == start.
> 
> Line 3033: for (tmp = vma; tmp->vm_start >= start + size;
> Line 3034:                 tmp = tmp->vm_next) {
> This is the for loop with the error in the test expression.
> 
> tmp->vm_start == start which cannot be >= (start + size).
> 
> I believe the intention was to loop through vmas in the range of start
> to (start + size) and unlock them.
> 
> 
> The result of the bug is no VMA is unlocked in this fuction.  But that
> doesn't matter as they are unlocked later in the call chain - which is
> why this code works as intended.
> 

Thanks for clarifying!

> 
>>
>> CCing some people that might know if this is actually a sane change.
>> Skimming over do_mmap(), it's not immediately clear to me that
>> "do_mmap() will unlock the necessary VMAs".
> 
> Ah, yes.  That is understandable.
> 
> do_mmap() L1583 -> mmap_region() L1752 -> munmap_vma_range() ->
> do_munmap() -> __do_munmap() loop at 2891 to unlock the range.
> 
> Would you like me to add this call chain to the changelog?

Yes please, in a simplified form.

I suggest something like the following patch description:

"do_mmap(MAP_FIXED) will already unlock pages via munmap_vma_range(). We
can remove the superfluous manual unlocking in remap_file_pages().

Note that the manual unlocking is even incorrect, as it might miss
unlocking some pages - no harm done.
"
diff mbox series

Patch

diff --git a/mm/mmap.c b/mm/mmap.c
index 5c8b4485860de..f7fecb77f84fd 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3025,25 +3025,9 @@  SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 
 	flags &= MAP_NONBLOCK;
 	flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE;
-	if (vma->vm_flags & VM_LOCKED) {
-		struct vm_area_struct *tmp;
+	if (vma->vm_flags & VM_LOCKED)
 		flags |= MAP_LOCKED;
 
-		/* drop PG_Mlocked flag for over-mapped range */
-		for (tmp = vma; tmp->vm_start >= start + size;
-				tmp = tmp->vm_next) {
-			/*
-			 * Split pmd and munlock page on the border
-			 * of the range.
-			 */
-			vma_adjust_trans_huge(tmp, start, start + size, 0);
-
-			munlock_vma_pages_range(tmp,
-					max(tmp->vm_start, start),
-					min(tmp->vm_end, start + size));
-		}
-	}
-
 	file = get_file(vma->vm_file);
 	ret = do_mmap(vma->vm_file, start, size,
 			prot, flags, pgoff, &populate, NULL);