mm: mempolicy: fix the wrong return value and potential pages leak of mbind

Message ID 1572454731-3925-1-git-send-email-yang.shi@linux.alibaba.com (mailing list archive)
State New, archived

Commit Message

Yang Shi Oct. 30, 2019, 4:58 p.m. UTC
Commit d883544515aa ("mm: mempolicy: make the behavior consistent
when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified") fixed the return
value of mbind() for a couple of corner cases, but it also altered the
errno in some other cases.  For example, mbind() should return -EFAULT
when part or all of the memory range specified by nodemask and maxnode
points outside your accessible address space, or when there is an
unmapped hole in the memory range specified by addr and len.

Fix this by preserving the errno returned by queue_pages_range().
In addition, the pagelist may not be empty even though
queue_pages_range() returns an error.  Put those pages back on the LRU:
mbind_range() is not called to actually apply the policy, so the pages
should not be migrated.  This was also the behavior before the
problematic commit.

Reported-by: Li Xinhai <lixinhai.lxh@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org> v4.19 and v5.2+
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
 mm/mempolicy.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)
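
The errno change described above can be illustrated with a small user-space model. This is only a sketch, not kernel code: queue_pages_range_stub(), do_mbind_old(), do_mbind_new() and show_errno_handling() are hypothetical names invented for the illustration, and the model covers only the `ret < 0` branch of do_mbind().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for queue_pages_range(): instead of walking page
 * tables, it simply returns the result the caller asks it to simulate
 * (for example 0, -EIO or -EFAULT).
 */
int queue_pages_range_stub(int simulated_result)
{
	return simulated_result;
}

/* Model of the error handling before the patch: every error becomes -EIO. */
long do_mbind_old(int simulated_result)
{
	long err = 0;
	int ret = queue_pages_range_stub(simulated_result);

	if (ret < 0)
		err = -EIO;	/* the original errno is lost here */
	return err;
}

/* Model of the error handling after the patch: the errno is preserved. */
long do_mbind_new(int simulated_result)
{
	long err = 0;
	int ret = queue_pages_range_stub(simulated_result);

	if (ret < 0)
		err = ret;	/* -EFAULT stays -EFAULT, -EIO stays -EIO */
	return err;
}

/* Simulate an unmapped hole (-EFAULT) and compare both variants. */
void show_errno_handling(void)
{
	long old_err = do_mbind_old(-EFAULT);
	long new_err = do_mbind_new(-EFAULT);

	printf("old: %ld, new: %ld\n", old_err, new_err);
	assert(old_err == -EIO);
	assert(new_err == -EFAULT);
}
```

Calling show_errno_handling() from a main() prints the value each variant would hand back to user space: the pre-patch model reports -EIO for the simulated hole, while the patched model reports -EFAULT.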

Comments

Yang Shi Oct. 30, 2019, 6:14 p.m. UTC | #1
On 10/30/19 9:58 AM, Yang Shi wrote:
> The commit d883544515aa ("mm: mempolicy: make the behavior consistent
> when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified") fixed the return
> value of mbind() for a couple of corner cases.  But, it altered the
> errno for some other cases, for example, mbind() should return -EFAULT
> when part or all of the memory range specified by nodemask and maxnode
> points  outside your accessible address space, or there was an unmapped
> hole in the specified memory range specified by addr and len.
>
> Fixed this by preserving the errno returned by queue_pages_range().
> And, the pagelist may be not empty even though queue_pages_range()
> returns error, put the pages back to LRU since mbind_range() is not called
> to really apply the policy so those pages should not be migrated, this
> is also the old behavior before the problematic commit.
I forgot the Fixes tag:

Fixes: d883544515aa ("mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified")

> Reported-by: Li Xinhai <lixinhai.lxh@gmail.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: <stable@vger.kernel.org> v4.19 and v5.2+
> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
> ---
>   mm/mempolicy.c | 14 +++++++++-----
>   1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 4ae967b..e08c941 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -672,7 +672,9 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
>    * 1 - there is unmovable page, but MPOL_MF_MOVE* & MPOL_MF_STRICT were
>    *     specified.
>    * 0 - queue pages successfully or no misplaced page.
> - * -EIO - there is misplaced page and only MPOL_MF_STRICT was specified.
> + * errno - i.e. misplaced pages with MPOL_MF_STRICT specified (-EIO) or
> + *         memory range specified by nodemask and maxnode points outside
> + *         your accessible address space (-EFAULT)
>    */
>   static int
>   queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
> @@ -1286,7 +1288,7 @@ static long do_mbind(unsigned long start, unsigned long len,
>   			  flags | MPOL_MF_INVERT, &pagelist);
>   
>   	if (ret < 0) {
> -		err = -EIO;
> +		err = ret;
>   		goto up_out;
>   	}
>   
> @@ -1305,10 +1307,12 @@ static long do_mbind(unsigned long start, unsigned long len,
>   
>   		if ((ret > 0) || (nr_failed && (flags & MPOL_MF_STRICT)))
>   			err = -EIO;
> -	} else
> -		putback_movable_pages(&pagelist);
> -
> +	} else {
>   up_out:
> +		if (!list_empty(&pagelist))
> +			putback_movable_pages(&pagelist);
> +	}
> +
>   	up_write(&mm->mmap_sem);
>   mpol_out:
>   	mpol_put(new);
Li Xinhai Oct. 31, 2019, 1:53 a.m. UTC | #2
On 2019-10-31 at 02:14 Yang Shi wrote:
>
>
>On 10/30/19 9:58 AM, Yang Shi wrote:
>> The commit d883544515aa ("mm: mempolicy: make the behavior consistent
>> when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified") fixed the return
>> value of mbind() for a couple of corner cases.  But, it altered the
>> errno for some other cases, for example, mbind() should return -EFAULT
>> when part or all of the memory range specified by nodemask and maxnode
>> points  outside your accessible address space, or there was an unmapped
>> hole in the specified memory range specified by addr and len.
>>
>> Fixed this by preserving the errno returned by queue_pages_range().
>> And, the pagelist may be not empty even though queue_pages_range()
>> returns error, put the pages back to LRU since mbind_range() is not called
>> to really apply the policy so those pages should not be migrated, this
>> is also the old behavior before the problematic commit.
>Forgot fixes tag.
>
>Fixes: d883544515aa ("mm: mempolicy: make the behavior consistent when
>MPOL_MF_MOVE* and MPOL_MF_STRICT were specified")
> 
Looks good to me.
Reviewed-by: Li Xinhai <lixinhai.lxh@gmail.com>

>> Reported-by: Li Xinhai <lixinhai.lxh@gmail.com>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Mel Gorman <mgorman@techsingularity.net>
>> Cc: <stable@vger.kernel.org> v4.19 and v5.2+
>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>> ---
>>   mm/mempolicy.c | 14 +++++++++-----
>>   1 file changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> index 4ae967b..e08c941 100644
>> --- a/mm/mempolicy.c
>> +++ b/mm/mempolicy.c
>> @@ -672,7 +672,9 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
>>    * 1 - there is unmovable page, but MPOL_MF_MOVE* & MPOL_MF_STRICT were
>>    *     specified.
>>    * 0 - queue pages successfully or no misplaced page.
>> - * -EIO - there is misplaced page and only MPOL_MF_STRICT was specified.
>> + * errno - i.e. misplaced pages with MPOL_MF_STRICT specified (-EIO) or
>> + *         memory range specified by nodemask and maxnode points outside
>> + *         your accessible address space (-EFAULT)
>>    */
>>   static int
>>   queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
>> @@ -1286,7 +1288,7 @@ static long do_mbind(unsigned long start, unsigned long len,
>>     flags | MPOL_MF_INVERT, &pagelist);
>>  
>>   if (ret < 0) {
>> -	err = -EIO;
>> +	err = ret;
>>   goto up_out;
>>   }
>>  
>> @@ -1305,10 +1307,12 @@ static long do_mbind(unsigned long start, unsigned long len,
>>  
>>   if ((ret > 0) || (nr_failed && (flags & MPOL_MF_STRICT)))
>>   err = -EIO;
>> -	} else
>> -	putback_movable_pages(&pagelist);
>> -
>> +	} else {
>>   up_out:
>> +	if (!list_empty(&pagelist))
>> +	putback_movable_pages(&pagelist);
>> +	}
>> +
>>   up_write(&mm->mmap_sem);
>>   mpol_out:
>>   mpol_put(new);
>
Andrew Morton Oct. 31, 2019, 4:31 a.m. UTC | #3
On Wed, 30 Oct 2019 11:14:58 -0700 Yang Shi <yang.shi@linux.alibaba.com> wrote:

> On 10/30/19 9:58 AM, Yang Shi wrote:
> > The commit d883544515aa ("mm: mempolicy: make the behavior consistent
> > when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified") fixed the return
> > value of mbind() for a couple of corner cases.  But, it altered the
> > errno for some other cases, for example, mbind() should return -EFAULT
> > when part or all of the memory range specified by nodemask and maxnode
> > points  outside your accessible address space, or there was an unmapped
> > hole in the specified memory range specified by addr and len.
> >
> > Fixed this by preserving the errno returned by queue_pages_range().
> > And, the pagelist may be not empty even though queue_pages_range()
> > returns error, put the pages back to LRU since mbind_range() is not called
> > to really apply the policy so those pages should not be migrated, this
> > is also the old behavior before the problematic commit.
> Forgot fixes tag.
> 
> Fixes: d883544515aa ("mm: mempolicy: make the behavior consistent when 
> MPOL_MF_MOVE* and MPOL_MF_STRICT were specified")

What's the relationship between this patch and
http://lkml.kernel.org/r/201910291756045288126@gmail.com?
Li Xinhai Oct. 31, 2019, 5:28 a.m. UTC | #4
On 2019-10-31 at 12:31 Andrew Morton wrote:
>On Wed, 30 Oct 2019 11:14:58 -0700 Yang Shi <yang.shi@linux.alibaba.com> wrote:
>
>> On 10/30/19 9:58 AM, Yang Shi wrote:
>> > The commit d883544515aa ("mm: mempolicy: make the behavior consistent
>> > when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified") fixed the return
>> > value of mbind() for a couple of corner cases.  But, it altered the
>> > errno for some other cases, for example, mbind() should return -EFAULT
>> > when part or all of the memory range specified by nodemask and maxnode
>> > points  outside your accessible address space, or there was an unmapped
>> > hole in the specified memory range specified by addr and len.
>> >
>> > Fixed this by preserving the errno returned by queue_pages_range().
>> > And, the pagelist may be not empty even though queue_pages_range()
>> > returns error, put the pages back to LRU since mbind_range() is not called
>> > to really apply the policy so those pages should not be migrated, this
>> > is also the old behavior before the problematic commit.
>> Forgot fixes tag.
>>
>> Fixes: d883544515aa ("mm: mempolicy: make the behavior consistent when
>> MPOL_MF_MOVE* and MPOL_MF_STRICT were specified")
>
>What's the relationship between this patch and
>http://lkml.kernel.org/r/201910291756045288126@gmail.com?
> 

They address different issues. While fixing the unmapped hole issue described
in the link you quoted, I found that d883544515aa hides -EFAULT from user space.

That is now fixed by the patch in this mail thread. I will explain the other
one in its own thread.

- Xinhai
Yang Shi Oct. 31, 2019, 3:47 p.m. UTC | #5
On 10/30/19 9:31 PM, Andrew Morton wrote:
> On Wed, 30 Oct 2019 11:14:58 -0700 Yang Shi <yang.shi@linux.alibaba.com> wrote:
>
>> On 10/30/19 9:58 AM, Yang Shi wrote:
>>> The commit d883544515aa ("mm: mempolicy: make the behavior consistent
>>> when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified") fixed the return
>>> value of mbind() for a couple of corner cases.  But, it altered the
>>> errno for some other cases, for example, mbind() should return -EFAULT
>>> when part or all of the memory range specified by nodemask and maxnode
>>> points  outside your accessible address space, or there was an unmapped
>>> hole in the specified memory range specified by addr and len.
>>>
>>> Fixed this by preserving the errno returned by queue_pages_range().
>>> And, the pagelist may be not empty even though queue_pages_range()
>>> returns error, put the pages back to LRU since mbind_range() is not called
>>> to really apply the policy so those pages should not be migrated, this
>>> is also the old behavior before the problematic commit.
>> Forgot fixes tag.
>>
>> Fixes: d883544515aa ("mm: mempolicy: make the behavior consistent when
>> MPOL_MF_MOVE* and MPOL_MF_STRICT were specified")
> What's the relationship between this patch and
> http://lkml.kernel.org/r/201910291756045288126@gmail.com?

They are unrelated. Commit d883544515aa ("mm: mempolicy: make the
behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were
specified") mistakenly overrode the -EFAULT return value of
queue_pages_range() with -EIO and failed to put a non-empty pagelist
back. This patch aims to fix both issues.

I think Li Xinhai found the return value override problem while
debugging his patch.
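
The second issue, the pagelist that is not put back, can be sketched in the same spirit. The model below is an assumption-laden illustration rather than kernel code: struct pagelist_model, queue_pages_model(), putback_model(), leaked_pages_after_error() and check_putback_model() are hypothetical names, and a plain counter stands in for the list of pages isolated from the LRU.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: how many pages are currently isolated from the LRU. */
struct pagelist_model {
	int isolated;
};

/*
 * Stand-in for queue_pages_range(): isolates `queued` pages and then
 * reports `result`, e.g. -EFAULT when the walk hits an unmapped hole
 * after some pages were already queued.
 */
int queue_pages_model(struct pagelist_model *list, int queued, int result)
{
	list->isolated += queued;
	return result;
}

/* Stand-in for putback_movable_pages(): returns every page to the LRU. */
void putback_model(struct pagelist_model *list)
{
	list->isolated = 0;
}

/*
 * Walk the error path once and return how many pages remain isolated.
 * putback_on_error == false models the pre-patch jump past the putback;
 * putback_on_error == true models the patched up_out path.
 */
int leaked_pages_after_error(bool putback_on_error)
{
	struct pagelist_model list = { .isolated = 0 };
	int ret = queue_pages_model(&list, 3, -EFAULT);

	if (ret < 0 && putback_on_error && list.isolated > 0)
		putback_model(&list);

	return list.isolated;
}

/* Sanity check of the model: only the patched path leaves nothing behind. */
void check_putback_model(void)
{
	assert(leaked_pages_after_error(false) == 3);
	assert(leaked_pages_after_error(true) == 0);
}
```

In this model the three pages queued before the error stay isolated when the putback is skipped, and none remain once the error path puts the list back, which mirrors the list_empty() check and putback_movable_pages() call the patch adds under up_out.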
Patch

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 4ae967b..e08c941 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -672,7 +672,9 @@  static int queue_pages_test_walk(unsigned long start, unsigned long end,
  * 1 - there is unmovable page, but MPOL_MF_MOVE* & MPOL_MF_STRICT were
  *     specified.
  * 0 - queue pages successfully or no misplaced page.
- * -EIO - there is misplaced page and only MPOL_MF_STRICT was specified.
+ * errno - i.e. misplaced pages with MPOL_MF_STRICT specified (-EIO) or
+ *         memory range specified by nodemask and maxnode points outside
+ *         your accessible address space (-EFAULT)
  */
 static int
 queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
@@ -1286,7 +1288,7 @@  static long do_mbind(unsigned long start, unsigned long len,
 			  flags | MPOL_MF_INVERT, &pagelist);
 
 	if (ret < 0) {
-		err = -EIO;
+		err = ret;
 		goto up_out;
 	}
 
@@ -1305,10 +1307,12 @@  static long do_mbind(unsigned long start, unsigned long len,
 
 		if ((ret > 0) || (nr_failed && (flags & MPOL_MF_STRICT)))
 			err = -EIO;
-	} else
-		putback_movable_pages(&pagelist);
-
+	} else {
 up_out:
+		if (!list_empty(&pagelist))
+			putback_movable_pages(&pagelist);
+	}
+
 	up_write(&mm->mmap_sem);
 mpol_out:
 	mpol_put(new);