diff mbox series

mm/hugetlb.c: fix unnecessary address expansion of pmd sharing

Message ID 20201229042125.2663029-1-lixinhai.lxh@gmail.com (mailing list archive)
State New, archived
Headers show
Series mm/hugetlb.c: fix unnecessary address expansion of pmd sharing | expand

Commit Message

Li Xinhai Dec. 29, 2020, 4:21 a.m. UTC
The current code would unnecessarily expand the address range. Consider
one example, (start, end) = (1G-2M, 3G+2M), and  (vm_start, vm_end) =
(1G-4M, 3G+4M), the expected adjustment should be keep (1G-2M, 3G+2M)
without expand. But the current result will be (1G-4M, 3G+4M). Actually,
the range (1G-4M, 1G) and (3G, 3G+4M) would never been involved in pmd
sharing.

After this patch, if pud aligned *start across vm_start, then we know the
*start and vm_start are in same pud_index, and vm_start is not pud
aligned, so don't adjust *start. Same logic applied to *end.

Fixes: commit 75802ca66354 ("mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible")
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
---
 mm/hugetlb.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

Comments

Andrew Morton Dec. 29, 2020, 7:21 p.m. UTC | #1
On Tue, 29 Dec 2020 12:21:25 +0800 Li Xinhai <lixinhai.lxh@gmail.com> wrote:

> The current code would unnecessarily expand the address range. Consider
> one example, (start, end) = (1G-2M, 3G+2M), and  (vm_start, vm_end) =
> (1G-4M, 3G+4M), the expected adjustment should be keep (1G-2M, 3G+2M)
> without expand. But the current result will be (1G-4M, 3G+4M). Actually,
> the range (1G-4M, 1G) and (3G, 3G+4M) would never been involved in pmd
> sharing.
> 
> After this patch, if pud aligned *start across vm_start, then we know the
> *start and vm_start are in same pud_index, and vm_start is not pud
> aligned, so don't adjust *start. Same logic applied to *end.

Thanks. What are the user-visible runtime effects of this issue?
Mike Kravetz Dec. 29, 2020, 9:20 p.m. UTC | #2
On 12/28/20 8:21 PM, Li Xinhai wrote:
> The current code would unnecessarily expand the address range. Consider
> one example, (start, end) = (1G-2M, 3G+2M), and  (vm_start, vm_end) =
> (1G-4M, 3G+4M), the expected adjustment should be keep (1G-2M, 3G+2M)
> without expand. But the current result will be (1G-4M, 3G+4M). Actually,
> the range (1G-4M, 1G) and (3G, 3G+4M) would never been involved in pmd
> sharing.
> 
> After this patch, if pud aligned *start across vm_start, then we know the
> *start and vm_start are in same pud_index, and vm_start is not pud
> aligned, so don't adjust *start. Same logic applied to *end.
> 
> Fixes: commit 75802ca66354 ("mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible")
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Peter Xu <peterx@redhat.com>
> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>

Thank you.  That does indeed fix an issue in the current code.

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

Andrew,
You asked about user-visible runtime effects.  The 'adjusted range' is used
for calls to mmu notifiers and cache(tlb) flushing.  Since the current code
unnecessarily expands the range in some cases, more entries than necessary
would be flushed.  This would/could result in performance degradation.
However, this is highly dependent on the user runtime.  Is there a combination
of vma layout and calls to actually hit this issue?  If the issue is hit, will
those entries unnecessarily flushed be used again and need to be unnecessarily
reloaded?
Peter Xu Dec. 30, 2020, 6:42 p.m. UTC | #3
On Tue, Dec 29, 2020 at 12:21:25PM +0800, Li Xinhai wrote:
> The current code would unnecessarily expand the address range. Consider
> one example, (start, end) = (1G-2M, 3G+2M), and  (vm_start, vm_end) =
> (1G-4M, 3G+4M), the expected adjustment should be keep (1G-2M, 3G+2M)
> without expand. But the current result will be (1G-4M, 3G+4M). Actually,
> the range (1G-4M, 1G) and (3G, 3G+4M) would never been involved in pmd
> sharing.
> 
> After this patch, if pud aligned *start across vm_start, then we know the
> *start and vm_start are in same pud_index, and vm_start is not pud
> aligned, so don't adjust *start. Same logic applied to *end.
> 
> Fixes: commit 75802ca66354 ("mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible")
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Peter Xu <peterx@redhat.com>
> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
> ---
>  mm/hugetlb.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index cbf32d2824fd..d1e9ea55b7e6 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5249,11 +5249,16 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
>  	a_end = ALIGN(*end, PUD_SIZE);
>  
>  	/*
> -	 * Intersect the range with the vma range, since pmd sharing won't be
> -	 * across vma after all
> +	 * If the PUD aligned address across vma range, then it means the
> +	 * vm_start/vm_end is not PUD aligned. In that case, we must don't
> +	 * adjust range because pmd sharing is not possbile at the start and/or
> +	 * end part of vma.
>  	 */
> -	*start = max(vma->vm_start, a_start);
> -	*end = min(vma->vm_end, a_end);
> +	if (a_start >= vma->vm_start)
> +		*start = a_start;
> +
> +	if (a_end <= vma->vm_end)
> +		*end = a_end;
>  }

Looks correct, thanks.

Reviewed-by: Peter Xu <peterx@redhat.com>
Mike Kravetz Dec. 31, 2020, 5:56 p.m. UTC | #4
On 12/29/20 1:20 PM, Mike Kravetz wrote:
> On 12/28/20 8:21 PM, Li Xinhai wrote:
>> The current code would unnecessarily expand the address range. Consider
>> one example, (start, end) = (1G-2M, 3G+2M), and  (vm_start, vm_end) =
>> (1G-4M, 3G+4M), the expected adjustment should be keep (1G-2M, 3G+2M)
>> without expand. But the current result will be (1G-4M, 3G+4M). Actually,
>> the range (1G-4M, 1G) and (3G, 3G+4M) would never been involved in pmd
>> sharing.
>>
>> After this patch, if pud aligned *start across vm_start, then we know the
>> *start and vm_start are in same pud_index, and vm_start is not pud
>> aligned, so don't adjust *start. Same logic applied to *end.
>>
>> Fixes: commit 75802ca66354 ("mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible")
>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> Cc: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
> 
> Thank you.  That does indeed fix an issue in the current code.
> 
> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

Upon further thought, this patch also expands the passed range when not
necessary.  Consider the example (start, end) = (1G-6M, 1G-4M), and 
(vm_start, vm_end) = (1G, 1G-2M).  This patch would adjust the range to
(1G, 1G-4M).  However, no adjustment should be performed as no sharing
is possible.

Below is proposed code to address the issue.  I'm not sending a formal
patch yet as I would like comments on the code first.  It is not a critical
issue and any fix can wait a bit.

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 7e89f31d7ef8..41ccec617f74 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5264,16 +5264,19 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
 	if (!(vma->vm_flags & VM_MAYSHARE))
 		return;
 
-	/* Extend the range to be PUD aligned for a worst case scenario */
-	a_start = ALIGN_DOWN(*start, PUD_SIZE);
-	a_end = ALIGN(*end, PUD_SIZE);
-
 	/*
-	 * Intersect the range with the vma range, since pmd sharing won't be
-	 * across vma after all
+	 * Check if start and end are within a PUD aligned range of the
+	 * vma.  If they are, then adjust to PUD alignment.
 	 */
-	*start = max(vma->vm_start, a_start);
-	*end = min(vma->vm_end, a_end);
+	a_start = ALIGN_DOWN(*start, PUD_SIZE);
+	a_end = ALIGN(*start, PUD_SIZE);
+	if (range_in_vma(vma, a_start, a_end))
+		*start = a_start;
+
+	a_start = ALIGN_DOWN(*end, PUD_SIZE);
+	a_end = ALIGN(*end, PUD_SIZE);
+	if (range_in_vma(vma, a_start, a_end))
+		*end = a_end;
 }
 
 /*
Li Xinhai Jan. 2, 2021, 11:56 a.m. UTC | #5
On 1/1/21 1:56 AM, Mike Kravetz wrote:
> On 12/29/20 1:20 PM, Mike Kravetz wrote:
>> On 12/28/20 8:21 PM, Li Xinhai wrote:
>>> The current code would unnecessarily expand the address range. Consider
>>> one example, (start, end) = (1G-2M, 3G+2M), and  (vm_start, vm_end) =
>>> (1G-4M, 3G+4M), the expected adjustment should be keep (1G-2M, 3G+2M)
>>> without expand. But the current result will be (1G-4M, 3G+4M). Actually,
>>> the range (1G-4M, 1G) and (3G, 3G+4M) would never been involved in pmd
>>> sharing.
>>>
>>> After this patch, if pud aligned *start across vm_start, then we know the
>>> *start and vm_start are in same pud_index, and vm_start is not pud
>>> aligned, so don't adjust *start. Same logic applied to *end.
>>>
>>> Fixes: commit 75802ca66354 ("mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible")
>>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>>> Cc: Peter Xu <peterx@redhat.com>
>>> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
>>
>> Thank you.  That does indeed fix an issue in the current code.
>>
>> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
> 
> Upon further thought, this patch also expands the passed range when not
> necessary.  Consider the example (start, end) = (1G-6M, 1G-4M), and
> (vm_start, vm_end) = (1G, 1G-2M).  This patch would adjust the range to
> (1G, 1G-4M).  However, no adjustment should be performed as no sharing
> is possible.
>
correct, my previous patch did not fully fix the issue.

Above example maybe typo for vm_start, vm_end. The issue didn't fixed by 
my patch would be with another example,
(vm_start, vm_end) = (1G-8M, 1G+2M), (start, end) = (1G-6M, 1G-4M), end 
should not be adjusted to 1G, although after adjust it still below vm_end.

> Below is proposed code to address the issue.  I'm not sending a formal
> patch yet as I would like comments on the code first.  It is not a critical
> issue and any fix can wait a bit.
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 7e89f31d7ef8..41ccec617f74 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5264,16 +5264,19 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
>   	if (!(vma->vm_flags & VM_MAYSHARE))
>   		return;
>   
> -	/* Extend the range to be PUD aligned for a worst case scenario */
> -	a_start = ALIGN_DOWN(*start, PUD_SIZE);
> -	a_end = ALIGN(*end, PUD_SIZE);
> -
>   	/*
> -	 * Intersect the range with the vma range, since pmd sharing won't be
> -	 * across vma after all
> +	 * Check if start and end are within a PUD aligned range of the
> +	 * vma.  If they are, then adjust to PUD alignment.
>   	 */
> -	*start = max(vma->vm_start, a_start);
> -	*end = min(vma->vm_end, a_end);
> +	a_start = ALIGN_DOWN(*start, PUD_SIZE);
> +	a_end = ALIGN(*start, PUD_SIZE);
> +	if (range_in_vma(vma, a_start, a_end))
> +		*start = a_start;
> +
> +	a_start = ALIGN_DOWN(*end, PUD_SIZE);
> +	a_end = ALIGN(*end, PUD_SIZE);
> +	if (range_in_vma(vma, a_start, a_end))
> +		*end = a_end;
>   }
>   
>   /*
> 
Now, this fully fixed the issue.

One thing to be sure is that the (start, end) as input parameter must 
already within vma's range, although the range_in_vma test() can cover 
the out of range cases.

Reviewed-by: Li Xinhai <lixinhai.lxh@gmail.com>
Mike Kravetz Jan. 4, 2021, 3:55 a.m. UTC | #6
On 1/2/21 3:56 AM, Li Xinhai wrote:
> 
> 
> On 1/1/21 1:56 AM, Mike Kravetz wrote:
>> On 12/29/20 1:20 PM, Mike Kravetz wrote:
>>> On 12/28/20 8:21 PM, Li Xinhai wrote:
>>>> The current code would unnecessarily expand the address range. Consider
>>>> one example, (start, end) = (1G-2M, 3G+2M), and  (vm_start, vm_end) =
>>>> (1G-4M, 3G+4M), the expected adjustment should be keep (1G-2M, 3G+2M)
>>>> without expand. But the current result will be (1G-4M, 3G+4M). Actually,
>>>> the range (1G-4M, 1G) and (3G, 3G+4M) would never been involved in pmd
>>>> sharing.
>>>>
>>>> After this patch, if pud aligned *start across vm_start, then we know the
>>>> *start and vm_start are in same pud_index, and vm_start is not pud
>>>> aligned, so don't adjust *start. Same logic applied to *end.
>>>>
>>>> Fixes: commit 75802ca66354 ("mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible")
>>>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>>>> Cc: Peter Xu <peterx@redhat.com>
>>>> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
>>>
>>> Thank you.  That does indeed fix an issue in the current code.
>>>
>>> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
>>
>> Upon further thought, this patch also expands the passed range when not
>> necessary.  Consider the example (start, end) = (1G-6M, 1G-4M), and
>> (vm_start, vm_end) = (1G, 1G-2M).  This patch would adjust the range to
>> (1G, 1G-4M).  However, no adjustment should be performed as no sharing
>> is possible.
>>
> correct, my previous patch did not fully fix the issue.
> 
> Above example maybe typo for vm_start, vm_end. The issue didn't fixed by my patch would be with another example,
> (vm_start, vm_end) = (1G-8M, 1G+2M), (start, end) = (1G-6M, 1G-4M), end should not be adjusted to 1G, although after adjust it still below vm_end.
> 

Sorry, I did incorrectly write that example.  It should have read:

Consider the example (start, end) = (2G-6M, 2G-4M), and
(vm_start, vm_end) = (2G, 2G-2M).  This patch would adjust the range to
(2G, 2G-4M).  However, no adjustment should be performed as no sharing
is possible.

>> Below is proposed code to address the issue.  I'm not sending a formal
>> patch yet as I would like comments on the code first.  It is not a critical
>> issue and any fix can wait a bit.

> Now, this fully fixed the issue.
> 
> One thing to be sure is that the (start, end) as input parameter must already within vma's range, although the range_in_vma test() can cover the out of range cases.
> 
> Reviewed-by: Li Xinhai <lixinhai.lxh@gmail.com>

Thanks for taking a look.

I believe the only case where your patch produced incorrect results is
when the range was within a vma that was smaller than PUD_SIZE.  Do you
agree?

If that is the case, then how about just adding the following to your patch?
I think this is simpler and faster than the 'range_in_vma' checking I proposed.

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 49990c0a02a3..716d1e58a7ae 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5261,7 +5261,9 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
 {
 	unsigned long a_start, a_end;
 
-	if (!(vma->vm_flags & VM_MAYSHARE))
+	/* Quick check for vma capable of pmd sharing */
+	if (!(vma->vm_flags & VM_MAYSHARE) ||
+	    (vma->vm_start - vma->vm_end) < PUD_SIZE)
 		return;
 
 	/* Extend the range to be PUD aligned for a worst case scenario */
Li Xinhai Jan. 4, 2021, 7:10 a.m. UTC | #7
On 1/4/21 11:55 AM, Mike Kravetz wrote:
> I believe the only case where your patch produced incorrect results is
> when the range was within a vma that was smaller than PUD_SIZE.  Do you
> agree?
> 
Not exactly. We need to consider for vma which span at least one
PUD_SIZE after align its vm_start and vm_end.

> If that is the case, then how about just adding the following to your patch?
> I think this is simpler and faster than the 'range_in_vma' checking I proposed.
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 49990c0a02a3..716d1e58a7ae 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5261,7 +5261,9 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
>   {
>   	unsigned long a_start, a_end;
>   
> -	if (!(vma->vm_flags & VM_MAYSHARE))
> +	/* Quick check for vma capable of pmd sharing */
> +	if (!(vma->vm_flags & VM_MAYSHARE) ||
> +	    (vma->vm_start - vma->vm_end) < PUD_SIZE)
>   		return;
>   
>   	/* Extend the range to be PUD aligned for a worst case scenario */

Your suggestion is good, we are able to simplify the code a bit, with
less comparison, and maybe easier to follow. I will send a new patch to
review.
Mike Kravetz Jan. 4, 2021, 6:59 p.m. UTC | #8
On 1/3/21 11:10 PM, Li Xinhai wrote:
> 
> 
> On 1/4/21 11:55 AM, Mike Kravetz wrote:
>> I believe the only case where your patch produced incorrect results is
>> when the range was within a vma that was smaller than PUD_SIZE.  Do you
>> agree?
>>
> Not exactly. We need to consider for vma which span at least one
> PUD_SIZE after align its vm_start and vm_end.
> 

I know that I provided an incorrect example again.  Sorry (again)!

Can you provide an example where adding the simple check for vma size less
than PUD_SIZE to your original patch will not work.  The logic in your V2
patch is correct.  However, I am having a hard time finding a problem with
this simpler approach.
Li Xinhai Jan. 5, 2021, 2:10 a.m. UTC | #9
On 1/5/21 2:59 AM, Mike Kravetz wrote:
> On 1/3/21 11:10 PM, Li Xinhai wrote:
>>
>>
>> On 1/4/21 11:55 AM, Mike Kravetz wrote:
>>> I believe the only case where your patch produced incorrect results is
>>> when the range was within a vma that was smaller than PUD_SIZE.  Do you
>>> agree?
>>>
>> Not exactly. We need to consider for vma which span at least one
>> PUD_SIZE after align its vm_start and vm_end.
>>
> 
> I know that I provided an incorrect example again.  Sorry (again)!
> 
> Can you provide an example where adding the simple check for vma size less
> than PUD_SIZE to your original patch will not work.  The logic in your V2
> patch is correct.  However, I am having a hard time finding a problem with
> this simpler approach.
> 


Thanks for checking. An example like this:
(vm_start, vm_end) = (2G - 6M, 2G+8M), so this vma is bigger than
PUD_SIZE, and the check for vma size bigger than PUD_SIZE will pass.

With (start, end) = (2G-4M, 2G-2M), the previous patch will not adjust
start(because adjust it from (2G-4M) to 1G will exceeding vm_start), but
it will adjust end from (2G-2M) to 2G (because 2G still below vm_end).

The adjustment of end is incorrect, because (2G-6M, 2G) range of vma
is not allowed for PMD sharing(i.e., that range do not fully occupy
PUD_SIZE). If we make the wrong adjustment, then we will unnecessarily
impact to range (2G-2M, 2G).
Li Xinhai Jan. 5, 2021, 2:38 a.m. UTC | #10
On 1/5/21 10:10 AM, Li Xinhai wrote:
> 
> 
> On 1/5/21 2:59 AM, Mike Kravetz wrote:
>> On 1/3/21 11:10 PM, Li Xinhai wrote:
>>>
>>>
>>> On 1/4/21 11:55 AM, Mike Kravetz wrote:
>>>> I believe the only case where your patch produced incorrect results is
>>>> when the range was within a vma that was smaller than PUD_SIZE.  Do you
>>>> agree?
>>>>
>>> Not exactly. We need to consider for vma which span at least one
>>> PUD_SIZE after align its vm_start and vm_end.
>>>
>>
>> I know that I provided an incorrect example again.  Sorry (again)!
>>
>> Can you provide an example where adding the simple check for vma size 
>> less
>> than PUD_SIZE to your original patch will not work.  The logic in your V2
>> patch is correct.  However, I am having a hard time finding a problem 
>> with
>> this simpler approach.
>>
> 
> 
> Thanks for checking. An example like this:
> (vm_start, vm_end) = (2G - 6M, 2G+8M), so this vma is bigger than
> PUD_SIZE, and the check for vma size bigger than PUD_SIZE will pass.
> 
> With (start, end) = (2G-4M, 2G-2M), the previous patch will not adjust
> start(because adjust it from (2G-4M) to 1G will exceeding vm_start), but
> it will adjust end from (2G-2M) to 2G (because 2G still below vm_end).
> 
> The adjustment of end is incorrect, because (2G-6M, 2G) range of vma
> is not allowed for PMD sharing(i.e., that range do not fully occupy
> PUD_SIZE). If we make the wrong adjustment, then we will unnecessarily
> impact to range (2G-2M, 2G).
> 
> 
sorry, the above example is not correct for vma size, update it as
below:

(vm_start, vm_end) = (2G - 600M, 2G+800M), so this vma is bigger than
PUD_SIZE, and the check for vma size bigger than PUD_SIZE will pass.

With (start, end) = (2G-4M, 2G-2M), the previous patch will not adjust
start(because adjust it from (2G-4M) to 1G will exceeding vm_start), but
it will adjust end from (2G-2M) to 2G (because 2G still below vm_end).

The adjustment of end is incorrect, because (2G-600M, 2G) range of vma
is not allowed for PMD sharing(i.e., that range do not fully occupy
PUD_SIZE). If we make the wrong adjustment, then we will unnecessarily
impact to range (2G-2M, 2G).
Mike Kravetz Jan. 5, 2021, 6:20 p.m. UTC | #11
On 1/4/21 6:38 PM, Li Xinhai wrote:
> 
> 
> On 1/5/21 10:10 AM, Li Xinhai wrote:
>>
>>
>> On 1/5/21 2:59 AM, Mike Kravetz wrote:
>>> On 1/3/21 11:10 PM, Li Xinhai wrote:
>>>>
>>>>
>>>> On 1/4/21 11:55 AM, Mike Kravetz wrote:
>>>>> I believe the only case where your patch produced incorrect results is
>>>>> when the range was within a vma that was smaller than PUD_SIZE.  Do you
>>>>> agree?
>>>>>
>>>> Not exactly. We need to consider for vma which span at least one
>>>> PUD_SIZE after align its vm_start and vm_end.
>>>>
>>>
>>> I know that I provided an incorrect example again.  Sorry (again)!
>>>
>>> Can you provide an example where adding the simple check for vma size less
>>> than PUD_SIZE to your original patch will not work.  The logic in your V2
>>> patch is correct.  However, I am having a hard time finding a problem with
>>> this simpler approach.
>>>
>>
>>
>> Thanks for checking. An example like this:
>> (vm_start, vm_end) = (2G - 6M, 2G+8M), so this vma is bigger than
>> PUD_SIZE, and the check for vma size bigger than PUD_SIZE will pass.
>>
>> With (start, end) = (2G-4M, 2G-2M), the previous patch will not adjust
>> start(because adjust it from (2G-4M) to 1G will exceeding vm_start), but
>> it will adjust end from (2G-2M) to 2G (because 2G still below vm_end).
>>
>> The adjustment of end is incorrect, because (2G-6M, 2G) range of vma
>> is not allowed for PMD sharing(i.e., that range do not fully occupy
>> PUD_SIZE). If we make the wrong adjustment, then we will unnecessarily
>> impact to range (2G-2M, 2G).
>>
>>
> sorry, the above example is not correct for vma size, update it as
> below:
> 
> (vm_start, vm_end) = (2G - 600M, 2G+800M), so this vma is bigger than
> PUD_SIZE, and the check for vma size bigger than PUD_SIZE will pass.
> 
> With (start, end) = (2G-4M, 2G-2M), the previous patch will not adjust
> start(because adjust it from (2G-4M) to 1G will exceeding vm_start), but
> it will adjust end from (2G-2M) to 2G (because 2G still below vm_end).
> 
> The adjustment of end is incorrect, because (2G-600M, 2G) range of vma
> is not allowed for PMD sharing(i.e., that range do not fully occupy
> PUD_SIZE). If we make the wrong adjustment, then we will unnecessarily
> impact to range (2G-2M, 2G).

Thanks you for the example!

This routine is conceptually simple, but for some reason there have been
multiple problems in the implementation.
diff mbox series

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index cbf32d2824fd..d1e9ea55b7e6 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5249,11 +5249,16 @@  void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
 	a_end = ALIGN(*end, PUD_SIZE);
 
 	/*
-	 * Intersect the range with the vma range, since pmd sharing won't be
-	 * across vma after all
+	 * If the PUD aligned address across vma range, then it means the
+	 * vm_start/vm_end is not PUD aligned. In that case, we must don't
+	 * adjust range because pmd sharing is not possbile at the start and/or
+	 * end part of vma.
 	 */
-	*start = max(vma->vm_start, a_start);
-	*end = min(vma->vm_end, a_end);
+	if (a_start >= vma->vm_start)
+		*start = a_start;
+
+	if (a_end <= vma->vm_end)
+		*end = a_end;
 }
 
 /*