mm/debug_vm_pgtable: Fix kernel crash with page table validate

Message ID 20200608062739.378902-1-aneesh.kumar@linux.ibm.com (mailing list archive)
State New, archived

Commit Message

Aneesh Kumar K.V June 8, 2020, 6:27 a.m. UTC
Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled at build
time yet have no THP support at runtime, depending on the platform.
For example, with 4K PAGE_SIZE, ppc64 supports THP only with radix
translation.

This results in the crash below when running with hash translation
and 4K PAGE_SIZE.

kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140!
cpu 0x61: Vector: 700 (Program Check) at [c000000ff948f860]
    pc: c0000000018810f8: debug_vm_pgtable+0x480/0x8b0
    lr: c0000000018810ec: debug_vm_pgtable+0x474/0x8b0
...
[c000000ff948faf0] c000000001880fec debug_vm_pgtable+0x374/0x8b0 (unreliable)
[c000000ff948fbf0] c000000000011648 do_one_initcall+0x98/0x4f0
[c000000ff948fcd0] c000000001843928 kernel_init_freeable+0x330/0x3fc
[c000000ff948fdb0] c0000000000122ac kernel_init+0x24/0x148
[c000000ff948fe20] c00000000000cc44 ret_from_kernel_thread+0x5c/0x78

Fix this by checking for THP support with has_transparent_hugepage()
before running the PMD tests.

Cc: anshuman.khandual@arm.com
Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/debug_vm_pgtable.c | 3 +++
 1 file changed, 3 insertions(+)
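
The gap between the build-time option and runtime support can be modelled in plain userspace C. The sketch below is only an illustration, not kernel code: every `model_*` and `MODEL_*` name is hypothetical, and the helper returns -1 where, by assumption, the real huge-PMD helper would hit BUG() on a platform without THP support.

```c
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_TRANSPARENT_HUGEPAGE=y. */
#define MODEL_CONFIG_THP 1

/* Hypothetical platform selector; only radix supports THP in this model. */
enum model_mmu { MODEL_MMU_RADIX, MODEL_MMU_HASH_4K };

/* Stand-in for has_transparent_hugepage(): the answer depends on the platform. */
static bool model_has_thp(enum model_mmu mmu)
{
	return mmu == MODEL_MMU_RADIX;
}

/* Stand-in for a huge-PMD helper: -1 models the BUG(), 0 means success. */
static int model_pmd_helper(enum model_mmu mmu)
{
	return model_has_thp(mmu) ? 0 : -1;
}

/* Before the patch: only the build-time option gates the test. */
static int model_pmd_tests_unguarded(enum model_mmu mmu)
{
#if MODEL_CONFIG_THP
	return model_pmd_helper(mmu);
#else
	return 0;
#endif
}

/* After the patch: return early when runtime support is missing. */
static int model_pmd_tests_guarded(enum model_mmu mmu)
{
#if MODEL_CONFIG_THP
	if (!model_has_thp(mmu))
		return 0;	/* tests skipped, helper never called */
	return model_pmd_helper(mmu);
#else
	return 0;
#endif
}
```

In this model, calling `model_pmd_tests_unguarded(MODEL_MMU_HASH_4K)` reaches the failing helper, while `model_pmd_tests_guarded(MODEL_MMU_HASH_4K)` skips it, which mirrors the early return the patch adds.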

Comments

Anshuman Khandual June 8, 2020, 11:01 a.m. UTC | #1
Hi Aneesh,

On 06/08/2020 11:57 AM, Aneesh Kumar K.V wrote:
> Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled but
> no THP support enabled based on platforms. For ex: with 4K
> PAGE_SIZE ppc64 supports THP only with radix translation.

Good catch, never hit this before.

> 
> This results in below crash when running with hash translation and
> 4K PAGE_SIZE.
> 
> kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140!
> cpu 0x61: Vector: 700 (Program Check) at [c000000ff948f860]
>     pc: c0000000018810f8: debug_vm_pgtable+0x480/0x8b0
>     lr: c0000000018810ec: debug_vm_pgtable+0x474/0x8b0
> ...
> [c000000ff948faf0] c000000001880fec debug_vm_pgtable+0x374/0x8b0 (unreliable)
> [c000000ff948fbf0] c000000000011648 do_one_initcall+0x98/0x4f0
> [c000000ff948fcd0] c000000001843928 kernel_init_freeable+0x330/0x3fc
> [c000000ff948fdb0] c0000000000122ac kernel_init+0x24/0x148
> [c000000ff948fe20] c00000000000cc44 ret_from_kernel_thread+0x5c/0x78
> 
> Check for THP support correctly

Makes sense. Is this the only configuration that hits the problem?

> 
> Cc: anshuman.khandual@arm.com
> Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers")
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 188c18908964..e60151c5e997 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -61,6 +61,9 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
>  {
>  	pmd_t pmd = pfn_pmd(pfn, prot);
>  
> +	if (!has_transparent_hugepage())
> +		return;
> +

We should add this check to pud_basic_tests() as well.

>  	WARN_ON(!pmd_same(pmd, pmd));
>  	WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
>  	WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
> 

The subject line here should mention the correct THP support
detection that fixes the problem. Perhaps something like this
("Fix kernel crash with correct THP support check") or similar.

- Anshuman
Aneesh Kumar K.V June 8, 2020, 11:16 a.m. UTC | #2
On 6/8/20 4:31 PM, Anshuman Khandual wrote:
> Hi Aneesh,
> 
> On 06/08/2020 11:57 AM, Aneesh Kumar K.V wrote:
>> Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled but
>> no THP support enabled based on platforms. For ex: with 4K
>> PAGE_SIZE ppc64 supports THP only with radix translation.
> 
> Good catch, never hit this before.
> 
>>
>> This results in below crash when running with hash translation and
>> 4K PAGE_SIZE.
>>
>> kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140!
>> cpu 0x61: Vector: 700 (Program Check) at [c000000ff948f860]
>>      pc: c0000000018810f8: debug_vm_pgtable+0x480/0x8b0
>>      lr: c0000000018810ec: debug_vm_pgtable+0x474/0x8b0
>> ...
>> [c000000ff948faf0] c000000001880fec debug_vm_pgtable+0x374/0x8b0 (unreliable)
>> [c000000ff948fbf0] c000000000011648 do_one_initcall+0x98/0x4f0
>> [c000000ff948fcd0] c000000001843928 kernel_init_freeable+0x330/0x3fc
>> [c000000ff948fdb0] c0000000000122ac kernel_init+0x24/0x148
>> [c000000ff948fe20] c00000000000cc44 ret_from_kernel_thread+0x5c/0x78
>>
>> Check for THP support correctly
> 
> Makes sense, is this the only configuration which hit the problem ?

4K hash ppc64 is the only config, I guess.

> 
>>
>> Cc: anshuman.khandual@arm.com
>> Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers")
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>   mm/debug_vm_pgtable.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>> index 188c18908964..e60151c5e997 100644
>> --- a/mm/debug_vm_pgtable.c
>> +++ b/mm/debug_vm_pgtable.c
>> @@ -61,6 +61,9 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
>>   {
>>   	pmd_t pmd = pfn_pmd(pfn, prot);
>>   
>> +	if (!has_transparent_hugepage())
>> +		return;
>> +
> 
> We should also add this check to pud_basic_tests() as well.


Do we have a function that checks for runtime support for PUD-level THP?
ppc64 doesn't do PUD-level THP yet, so we have
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=n.

Are you suggesting we do the same check for PUD-level THP too?


> 
>>   	WARN_ON(!pmd_same(pmd, pmd));
>>   	WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
>>   	WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
>>
> 
> The subject line here should mention about correct THP support
> detection which fixes the problem. Probably something like this
> or similar ("Fix kernel crash with correct THP support check").


Not sure about that. This fixes a kernel crash in the page table validation code.


-aneesh
Anshuman Khandual June 8, 2020, 12:15 p.m. UTC | #3
On 06/08/2020 04:46 PM, Aneesh Kumar K.V wrote:
> On 6/8/20 4:31 PM, Anshuman Khandual wrote:
>> Hi Aneesh,
>>
>> On 06/08/2020 11:57 AM, Aneesh Kumar K.V wrote:
>>> Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled but
>>> no THP support enabled based on platforms. For ex: with 4K
>>> PAGE_SIZE ppc64 supports THP only with radix translation.
>>
>> Good catch, never hit this before.
>>
>>>
>>> This results in below crash when running with hash translation and
>>> 4K PAGE_SIZE.
>>>
>>> kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140!
>>> cpu 0x61: Vector: 700 (Program Check) at [c000000ff948f860]
>>>      pc: c0000000018810f8: debug_vm_pgtable+0x480/0x8b0
>>>      lr: c0000000018810ec: debug_vm_pgtable+0x474/0x8b0
>>> ...
>>> [c000000ff948faf0] c000000001880fec debug_vm_pgtable+0x374/0x8b0 (unreliable)
>>> [c000000ff948fbf0] c000000000011648 do_one_initcall+0x98/0x4f0
>>> [c000000ff948fcd0] c000000001843928 kernel_init_freeable+0x330/0x3fc
>>> [c000000ff948fdb0] c0000000000122ac kernel_init+0x24/0x148
>>> [c000000ff948fe20] c00000000000cc44 ret_from_kernel_thread+0x5c/0x78
>>>
>>> Check for THP support correctly
>>
>> Makes sense, is this the only configuration which hit the problem ?
> 
> 4K hash ppc64 is the only config i guess.

Okay.

> 
>>
>>>
>>> Cc: anshuman.khandual@arm.com
>>> Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers")
>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>> ---
>>>   mm/debug_vm_pgtable.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>>> index 188c18908964..e60151c5e997 100644
>>> --- a/mm/debug_vm_pgtable.c
>>> +++ b/mm/debug_vm_pgtable.c
>>> @@ -61,6 +61,9 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
>>>   {
>>>       pmd_t pmd = pfn_pmd(pfn, prot);
>>>   
>>> +    if (!has_transparent_hugepage())
>>> +        return;
>>> +
>>
>> We should also add this check to pud_basic_tests() as well.
> 
> 
> Do we have a function that check for runtime support for pud level THP? ppc64 don't do pud level THP yet. So  we have 
> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=n

I believe we don't have such a generic function. Please correct me if I am
missing something here.

> 
> are you suggesting we do the same check for pud level THP too?

Yes. Regardless of CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD, could there be
any THP at the PUD level when has_transparent_hugepage() returns false? The
current dependency between the THP and PUD THP configs seems somewhat
confusing, but having this check at the PUD level should protect against
similar problems. A quick test on x86 (after adding this check at the PUD
level) does not indicate any problem on the normal path.
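
As a rough illustration of the suggestion, the same early return can sit at the top of both the PMD and PUD test functions. This is a userspace sketch under stated assumptions, not the actual kernel change: the `sketch_*` names are hypothetical, and a plain global flag stands in for whatever has_transparent_hugepage() reports on a given platform.

```c
#include <stdbool.h>

/* Hypothetical runtime switch standing in for platform THP support. */
static bool sketch_thp_available;

/* Stand-in for has_transparent_hugepage(). */
static bool sketch_has_transparent_hugepage(void)
{
	return sketch_thp_available;
}

/* Returns true if the PMD checks ran, false if they were skipped. */
static bool sketch_pmd_basic_tests(void)
{
	if (!sketch_has_transparent_hugepage())
		return false;
	/* ... PMD helper checks would run here ... */
	return true;
}

/* Same guard at the PUD level, as proposed in this reply. */
static bool sketch_pud_basic_tests(void)
{
	if (!sketch_has_transparent_hugepage())
		return false;
	/* ... PUD helper checks would run here ... */
	return true;
}
```

With `sketch_thp_available` left false, both functions skip their checks; setting it to true lets both run, which corresponds to the unchanged "normal path" mentioned above.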

> 
> 
>>
>>>       WARN_ON(!pmd_same(pmd, pmd));
>>>       WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
>>>       WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
>>>
>>
>> The subject line here should mention about correct THP support
>> detection which fixes the problem. Probably something like this
>> or similar ("Fix kernel crash with correct THP support check").
> 
> 
> Not sure about that. This fix a kernel crash with page table validate code.

What this fixes is already clear from the prefix itself, "mm/debug_vm_pgtable:",
which makes "page table validate" somewhat redundant. Instead, the subject
could describe the method of the fix, i.e. "via correct THP support check".
Nonetheless, it is just a small nit.

Patch

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 188c18908964..e60151c5e997 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -61,6 +61,9 @@  static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
 {
 	pmd_t pmd = pfn_pmd(pfn, prot);
 
+	if (!has_transparent_hugepage())
+		return;
+
 	WARN_ON(!pmd_same(pmd, pmd));
 	WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
 	WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));