diff mbox series

[v2] mm/page_vma_mapped.c: Detect mismatched pfn of hugetlbfs page in pfn_in_hpage()

Message ID 1578579963-6075-1-git-send-email-lixinhai.lxh@gmail.com (mailing list archive)
State New, archived
Headers show
Series [v2] mm/page_vma_mapped.c: Detect mismatched pfn of hugetlbfs page in pfn_in_hpage() | expand

Commit Message

Li Xinhai Jan. 9, 2020, 2:26 p.m. UTC
check_pte is called for hugetlbfs page and comparing pfn in pfn_in_page,
where pfn is compared in range [hpage_pfn, hpage_pfn+HPAGE_PMD_NR). For
hugetlbfs page, pfn must equal to hpage_pfn, so catch it as bug if they
are not equal.

Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/page_vma_mapped.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Michal Hocko Jan. 9, 2020, 3 p.m. UTC | #1
[Cc Mike who is hugetlb maintainer]

On Thu 09-01-20 14:26:02, Li Xinhai wrote:
> check_pte is called for hugetlbfs page and comparing pfn in pfn_in_page,
> where pfn is compared in range [hpage_pfn, hpage_pfn+HPAGE_PMD_NR). For
> hugetlbfs page, pfn must equal to hpage_pfn, so catch it as bug if they
> are not equal.

Why do we need this? Would that help to debug any past bug?

> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  mm/page_vma_mapped.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> index eff4b45..713aaed 100644
> --- a/mm/page_vma_mapped.c
> +++ b/mm/page_vma_mapped.c
> @@ -55,6 +55,7 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw)
>  static inline bool pfn_in_hpage(struct page *hpage, unsigned long pfn)
>  {
>  	unsigned long hpage_pfn = page_to_pfn(hpage);
> +	VM_BUG_ON_PAGE(PageHuge(hpage) && pfn != hpage_pfn, hpage);
>  
>  	/* THP can be referenced by any subpage */
>  	return pfn >= hpage_pfn && pfn - hpage_pfn < hpage_nr_pages(hpage);
> -- 
> 1.8.3.1
>
Mike Kravetz Jan. 9, 2020, 5:09 p.m. UTC | #2
On 1/9/20 7:00 AM, Michal Hocko wrote:
> [Cc Mike who is hugetlb maintainer]
> 
> On Thu 09-01-20 14:26:02, Li Xinhai wrote:
>> check_pte is called for hugetlbfs page and comparing pfn in pfn_in_page,
>> where pfn is compared in range [hpage_pfn, hpage_pfn+HPAGE_PMD_NR). For
>> hugetlbfs page, pfn must equal to hpage_pfn, so catch it as bug if they
>> are not equal.
> 
> Why do we need this? Would that help to debug any past bug?

I have the same question as Michal.  Did you see an issue where such a check
could help?

There is nothing wrong with the proposed code.  However, it would be good to
know the reason for adding it.  I started looking at all possible code paths
where this could apply.  However, it would be better if you pointed out the
concern that caused you to create the patch.
Li Xinhai Jan. 9, 2020, 10:48 p.m. UTC | #3
On 2020-01-10 at 01:09 Mike Kravetz wrote:
>On 1/9/20 7:00 AM, Michal Hocko wrote:
>> [Cc Mike who is hugetlb maintainer]
>>
>> On Thu 09-01-20 14:26:02, Li Xinhai wrote:
>>> check_pte is called for hugetlbfs page and comparing pfn in pfn_in_page,
>>> where pfn is compared in range [hpage_pfn, hpage_pfn+HPAGE_PMD_NR). For
>>> hugetlbfs page, pfn must equal to hpage_pfn, so catch it as bug if they
>>> are not equal.
>>
>> Why do we need this? Would that help to debug any past bug?
>
>I have the same question as Michal.  Did you see an issue where such a check
>could help?
>
>There is nothing wrong with the proposed code.  However, it would be good to
>know the reason for adding it.  I started looking at all possible code paths
>where this could apply.  However, it would be better if you pointed out the
>concern that caused you to create the patch.
>--
>Mike Kravetz
> 
oops, I didn't write the code correctly. I should wrote it as 

	if (pfn >= hpage_pfn && pfn - hpage_pfn < hpage_nr_pages(hpage)) {
		VM_BUG_ON_PAGE(PageHuge(hpage) && pfn != hpage_pfn, hpage);
		return true;
	}

	return false;

hpage_nr_pages(hpage) give us HPAGE_PMD_NR for THP and hugetlbfs page,
but remapping PTE to a differrnt hugetlbfs page still allowed, so put the BUG code
into this condition is necessary. By this way, if it was not a exact match for PageHuge,
then it is a bug.

>>
>>> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
>>> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>> ---
>>>  mm/page_vma_mapped.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>> index eff4b45..713aaed 100644
>>> --- a/mm/page_vma_mapped.c
>>> +++ b/mm/page_vma_mapped.c
>>> @@ -55,6 +55,7 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw)
>>>  static inline bool pfn_in_hpage(struct page *hpage, unsigned long pfn)
>>>  {
>>>  unsigned long hpage_pfn = page_to_pfn(hpage);
>>> +	VM_BUG_ON_PAGE(PageHuge(hpage) && pfn != hpage_pfn, hpage);
>>> 
>>>  /* THP can be referenced by any subpage */
>>>  return pfn >= hpage_pfn && pfn - hpage_pfn < hpage_nr_pages(hpage);
>>> --
>>> 1.8.3.1
Mike Kravetz Jan. 9, 2020, 11 p.m. UTC | #4
On 1/9/20 2:48 PM, Li Xinhai wrote:
> oops, I didn't write the code correctly. I should wrote it as 
> 
> 	if (pfn >= hpage_pfn && pfn - hpage_pfn < hpage_nr_pages(hpage)) {
> 		VM_BUG_ON_PAGE(PageHuge(hpage) && pfn != hpage_pfn, hpage);
> 		return true;
> 	}
> 
> 	return false;
> 
> hpage_nr_pages(hpage) give us HPAGE_PMD_NR for THP and hugetlbfs page,
> but remapping PTE to a differrnt hugetlbfs page still allowed, so put the BUG code
> into this condition is necessary. By this way, if it was not a exact match for PageHuge,
> then it is a bug.

Thank you.  I think we all agree on what the proposed code is doing.
However, we would like to know why you believe this code should be added.
For example,
- Did you actually encounter this situation (PageHuge(hpage) && pfn !=
  hpage_pfn)?
- Did you discover some code path where we are likely to encounter this
  situation?
- Some other reason?
Li Xinhai Jan. 10, 2020, 2:52 a.m. UTC | #5
On 2020-01-10 at 07:00 Mike Kravetz wrote:
>On 1/9/20 2:48 PM, Li Xinhai wrote:
>> oops, I didn't write the code correctly. I should wrote it as
>>
>> if (pfn >= hpage_pfn && pfn - hpage_pfn < hpage_nr_pages(hpage)) {
>> VM_BUG_ON_PAGE(PageHuge(hpage) && pfn != hpage_pfn, hpage);
>> return true;
>> }
>>
>> return false;
>>
>> hpage_nr_pages(hpage) give us HPAGE_PMD_NR for THP and hugetlbfs page,
>> but remapping PTE to a differrnt hugetlbfs page still allowed, so put the BUG code
>> into this condition is necessary. By this way, if it was not a exact match for PageHuge,
>> then it is a bug.
>
>Thank you.  I think we all agree on what the proposed code is doing.
>However, we would like to know why you believe this code should be added.
>For example,
>- Did you actually encounter this situation (PageHuge(hpage) && pfn !=
>  hpage_pfn)?
>- Did you discover some code path where we are likely to encounter this
>  situation?
>- Some other reason? 

I didn't actually encounter this condition.

There are two ways for faulty code,
1. one is from the 'hpage', it could be head or tail page of hugetlbfs (I see that
current code make sure always call with head page as you mentioned). Luckly,
we catch the tail page case as BUG at begining of this mapped_walk(the
page_hstate(page) return NULL for tail page).
2. The other is from the content stored in the PTE, wihch we used as 'pfn' and
compare with 'hpage'.

Current code matches 'pfn' and 'hpage' like below:
- normal 4k page: hpage_pfn <= pfn < hpage_pfn + 1
- THP, hugetlbfs page:  hpage_pfn <= pfn < hpage_pfn + HPAGE_PMD_NR
we need do exact match for normal 4K page and hugetlbfs page, and range
match for THP.

>--
>Mike Kravetz
Michal Hocko Jan. 10, 2020, 6:22 a.m. UTC | #6
On Fri 10-01-20 10:52:56, Li Xinhai wrote:
> On 2020-01-10 at 07:00 Mike Kravetz wrote:
> >On 1/9/20 2:48 PM, Li Xinhai wrote:
> >> oops, I didn't write the code correctly. I should wrote it as
> >>
> >> if (pfn >= hpage_pfn && pfn - hpage_pfn < hpage_nr_pages(hpage)) {
> >> VM_BUG_ON_PAGE(PageHuge(hpage) && pfn != hpage_pfn, hpage);
> >> return true;
> >> }
> >>
> >> return false;
> >>
> >> hpage_nr_pages(hpage) give us HPAGE_PMD_NR for THP and hugetlbfs page,
> >> but remapping PTE to a differrnt hugetlbfs page still allowed, so put the BUG code
> >> into this condition is necessary. By this way, if it was not a exact match for PageHuge,
> >> then it is a bug.
> >
> >Thank you.  I think we all agree on what the proposed code is doing.
> >However, we would like to know why you believe this code should be added.
> >For example,
> >- Did you actually encounter this situation (PageHuge(hpage) && pfn !=
> >  hpage_pfn)?
> >- Did you discover some code path where we are likely to encounter this
> >  situation?
> >- Some other reason? 
> 
> I didn't actually encounter this condition.
> 
> There are two ways for faulty code,
> 1. one is from the 'hpage', it could be head or tail page of hugetlbfs (I see that
> current code make sure always call with head page as you mentioned). Luckly,
> we catch the tail page case as BUG at begining of this mapped_walk(the
> page_hstate(page) return NULL for tail page).
> 2. The other is from the content stored in the PTE, wihch we used as 'pfn' and
> compare with 'hpage'.
> 
> Current code matches 'pfn' and 'hpage' like below:
> - normal 4k page: hpage_pfn <= pfn < hpage_pfn + 1
> - THP, hugetlbfs page:  hpage_pfn <= pfn < hpage_pfn + HPAGE_PMD_NR
> we need do exact match for normal 4K page and hugetlbfs page, and range
> match for THP.

This still doesn't really explain why to add the BUG_ON, I am afraid.
pfn_in_hpage is called from the vma walk. check_pte is reponsible to
check the page table mapping so the input to pfn_in_hpage should be
already sanitized. If it is not then that should be fixed and {VM_}BUG_ON
is not the best way to do such a sanitization IMHO. First of all this is
all under locks so crashing would likely mean a follow up problems.
On the other hand a failure can be handled gracefully AFAICS.

That being said I still do not see how this is going to help with
anything. Please note that adding {VM}_BUG_ON as general asserts is
strongly discouraged. Those should be added only when there is a
data corruption detected (and then it should likely be BUG_ON rather
than VM_BUG_ON) that cannot be handled gracefully or when it
considerably improves debugability of very subtle problems.
Li Xinhai Jan. 10, 2020, 7:11 a.m. UTC | #7
On 2020-01-10 at 14:22 Michal Hocko wrote:
>On Fri 10-01-20 10:52:56, Li Xinhai wrote:
>> On 2020-01-10 at 07:00 Mike Kravetz wrote:
>> >On 1/9/20 2:48 PM, Li Xinhai wrote:
>> >> oops, I didn't write the code correctly. I should wrote it as
>> >>
>> >> if (pfn >= hpage_pfn && pfn - hpage_pfn < hpage_nr_pages(hpage)) {
>> >> VM_BUG_ON_PAGE(PageHuge(hpage) && pfn != hpage_pfn, hpage);
>> >> return true;
>> >> }
>> >>
>> >> return false;
>> >>
>> >> hpage_nr_pages(hpage) give us HPAGE_PMD_NR for THP and hugetlbfs page,
>> >> but remapping PTE to a differrnt hugetlbfs page still allowed, so put the BUG code
>> >> into this condition is necessary. By this way, if it was not a exact match for PageHuge,
>> >> then it is a bug.
>> >
>> >Thank you.  I think we all agree on what the proposed code is doing.
>> >However, we would like to know why you believe this code should be added.
>> >For example,
>> >- Did you actually encounter this situation (PageHuge(hpage) && pfn !=
>> >  hpage_pfn)?
>> >- Did you discover some code path where we are likely to encounter this
>> >  situation?
>> >- Some other reason?
>>
>> I didn't actually encounter this condition.
>>
>> There are two ways for faulty code,
>> 1. one is from the 'hpage', it could be head or tail page of hugetlbfs (I see that
>> current code make sure always call with head page as you mentioned). Luckly,
>> we catch the tail page case as BUG at begining of this mapped_walk(the
>> page_hstate(page) return NULL for tail page).
>> 2. The other is from the content stored in the PTE, wihch we used as 'pfn' and
>> compare with 'hpage'.
>>
>> Current code matches 'pfn' and 'hpage' like below:
>> - normal 4k page: hpage_pfn <= pfn < hpage_pfn + 1
>> - THP, hugetlbfs page:  hpage_pfn <= pfn < hpage_pfn + HPAGE_PMD_NR
>> we need do exact match for normal 4K page and hugetlbfs page, and range
>> match for THP.
>
>This still doesn't really explain why to add the BUG_ON, I am afraid.
>pfn_in_hpage is called from the vma walk. check_pte is reponsible to
>check the page table mapping so the input to pfn_in_hpage should be
>already sanitized. If it is not then that should be fixed and {VM_}BUG_ON
>is not the best way to do such a sanitization IMHO. First of all this is
>all under locks so crashing would likely mean a follow up problems.
>On the other hand a failure can be handled gracefully AFAICS.
>
>That being said I still do not see how this is going to help with
>anything. Please note that adding {VM}_BUG_ON as general asserts is
>strongly discouraged. Those should be added only when there is a
>data corruption detected (and then it should likely be BUG_ON rather
>than VM_BUG_ON) that cannot be handled gracefully or when it
>considerably improves debugability of very subtle problems.
> 

pfn_in_hpage() has purpose to check all those three types of page, not
just THP as implied in its name. Change name as below would be clear

static inline bool pfn_is_matched(struct page *page, unsigned long pfn)
{
	unsigned long page_pfn = page_to_pfn(page);

	/* normal page and hugetlbfs page */
	if (!PageCompound(page) || PageHuge(page))
		return page_pfn == pfn;

	/* THP can be referenced by any subpage */
	return pfn >= page_pfn && pfn - page_pfn < hpage_nr_pages(page);
}

check_pte is doing right thing and helps to collect 'pfn' from both normal PTE
and migration entry. No chnages to it.

>--
>Michal Hocko
>SUSE Labs
diff mbox series

Patch

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index eff4b45..713aaed 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -55,6 +55,7 @@  static bool map_pte(struct page_vma_mapped_walk *pvmw)
 static inline bool pfn_in_hpage(struct page *hpage, unsigned long pfn)
 {
 	unsigned long hpage_pfn = page_to_pfn(hpage);
+	VM_BUG_ON_PAGE(PageHuge(hpage) && pfn != hpage_pfn, hpage);
 
 	/* THP can be referenced by any subpage */
 	return pfn >= hpage_pfn && pfn - hpage_pfn < hpage_nr_pages(hpage);