diff mbox series

[v3,12/12] mm/debug_vm_pgtable: Fix corrupted page flag

Message ID 20210719130613.334901-13-gshan@redhat.com (mailing list archive)
State New
Headers show
Series mm/debug_vm_pgtable: Enhancements | expand

Commit Message

Gavin Shan July 19, 2021, 1:06 p.m. UTC
In page table entry modifying tests, set_xxx_at() are used to populate
the page table entries. On ARM64, PG_arch_1 is set to the target page
flag if execution permission is given. The page flag is kept when the
page is free'd to buddy's free area list. However, it will trigger page
checking failure when it's pulled from the buddy's free area list, as
the following warning messages indicate.

   BUG: Bad page state in process memhog  pfn:08000
   page:0000000015c0a628 refcount:0 mapcount:0 \
        mapping:0000000000000000 index:0x1 pfn:0x8000
   flags: 0x7ffff8000000800(arch_1|node=0|zone=0|lastcpupid=0xfffff)
   raw: 07ffff8000000800 dead000000000100 dead000000000122 0000000000000000
   raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
   page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag(s) set

This fixes the issue by clearing PG_arch_1 through flush_dcache_page()
after set_xxx_at() is called.

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 mm/debug_vm_pgtable.c | 32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)

Comments

Anshuman Khandual July 21, 2021, 10:18 a.m. UTC | #1
On 7/19/21 6:36 PM, Gavin Shan wrote:
> In page table entry modifying tests, set_xxx_at() are used to populate
> the page table entries. On ARM64, PG_arch_1 is set to the target page
> flag if execution permission is given. The page flag is kept when the
> page is free'd to buddy's free area list. However, it will trigger page
> checking failure when it's pulled from the buddy's free area list, as
> the following warning messages indicate.
> 
>    BUG: Bad page state in process memhog  pfn:08000
>    page:0000000015c0a628 refcount:0 mapcount:0 \
>         mapping:0000000000000000 index:0x1 pfn:0x8000
>    flags: 0x7ffff8000000800(arch_1|node=0|zone=0|lastcpupid=0xfffff)
>    raw: 07ffff8000000800 dead000000000100 dead000000000122 0000000000000000
>    raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
>    page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag(s) set
> 
> This fixes the issue by clearing PG_arch_1 through flush_dcache_page()
> after set_xxx_at() is called.

Could you please add comments before each flush_dcache_page() instance
explaining why this is needed for arm64 platforms with relevant PG_arch_1
context and how this does not have any adverse effect on other platforms ?
It should be easy for some one looking at this code after a while to figure
out from where flush_dcache_page() came from.
Gavin Shan July 21, 2021, 12:03 p.m. UTC | #2
Hi Anshuman,

On 7/21/21 8:18 PM, Anshuman Khandual wrote:
> On 7/19/21 6:36 PM, Gavin Shan wrote:
>> In page table entry modifying tests, set_xxx_at() are used to populate
>> the page table entries. On ARM64, PG_arch_1 is set to the target page
>> flag if execution permission is given. The page flag is kept when the
>> page is free'd to buddy's free area list. However, it will trigger page
>> checking failure when it's pulled from the buddy's free area list, as
>> the following warning messages indicate.
>>
>>     BUG: Bad page state in process memhog  pfn:08000
>>     page:0000000015c0a628 refcount:0 mapcount:0 \
>>          mapping:0000000000000000 index:0x1 pfn:0x8000
>>     flags: 0x7ffff8000000800(arch_1|node=0|zone=0|lastcpupid=0xfffff)
>>     raw: 07ffff8000000800 dead000000000100 dead000000000122 0000000000000000
>>     raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
>>     page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag(s) set
>>
>> This fixes the issue by clearing PG_arch_1 through flush_dcache_page()
>> after set_xxx_at() is called.
> 
> Could you please add comments before each flush_dcache_page() instance
> explaining why this is needed for arm64 platforms with relevant PG_arch_1
> context and how this does not have any adverse effect on other platforms ?
> It should be easy for some one looking at this code after a while to figure
> out from where flush_dcache_page() came from.
> 

Good point. I will improve chage log to include the commit ID in v4 where the
page flag (PG_arch_1) is used and explain how. In that case, it's much clearer
to understand the reason why we need flush_dcache_page() after set_xxx_at() on
ARM64.

Thanks,
Gavin
Anshuman Khandual July 22, 2021, 3:51 a.m. UTC | #3
Small nit:

s/Fix corrupted page flag/Fix page flag corruption on arm64/

On 7/21/21 5:33 PM, Gavin Shan wrote:
> Hi Anshuman,
> 
> On 7/21/21 8:18 PM, Anshuman Khandual wrote:
>> On 7/19/21 6:36 PM, Gavin Shan wrote:
>>> In page table entry modifying tests, set_xxx_at() are used to populate
>>> the page table entries. On ARM64, PG_arch_1 is set to the target page
>>> flag if execution permission is given. The page flag is kept when the
>>> page is free'd to buddy's free area list. However, it will trigger page
>>> checking failure when it's pulled from the buddy's free area list, as
>>> the following warning messages indicate.
>>>
>>>     BUG: Bad page state in process memhog  pfn:08000
>>>     page:0000000015c0a628 refcount:0 mapcount:0 \
>>>          mapping:0000000000000000 index:0x1 pfn:0x8000
>>>     flags: 0x7ffff8000000800(arch_1|node=0|zone=0|lastcpupid=0xfffff)
>>>     raw: 07ffff8000000800 dead000000000100 dead000000000122 0000000000000000
>>>     raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
>>>     page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag(s) set
>>>
>>> This fixes the issue by clearing PG_arch_1 through flush_dcache_page()
>>> after set_xxx_at() is called.
>>
>> Could you please add comments before each flush_dcache_page() instance
>> explaining why this is needed for arm64 platforms with relevant PG_arch_1
>> context and how this does not have any adverse effect on other platforms ?
>> It should be easy for some one looking at this code after a while to figure
>> out from where flush_dcache_page() came from.
>>
> 
> Good point. I will improve chage log to include the commit ID in v4 where the
> page flag (PG_arch_1) is used and explain how. In that case, it's much clearer
> to understand the reason why we need flush_dcache_page() after set_xxx_at() on
> ARM64.

But also some in code comments where flush_dcache_page() is being called.
Gavin Shan July 22, 2021, 6:54 a.m. UTC | #4
Hi Anshuman,

On 7/22/21 1:51 PM, Anshuman Khandual wrote:
> Small nit:
> 
> s/Fix corrupted page flag/Fix page flag corruption on arm64/
> 

Sure, The replacement will be applied in v4, thanks!

> On 7/21/21 5:33 PM, Gavin Shan wrote:
>> Hi Anshuman,
>>
>> On 7/21/21 8:18 PM, Anshuman Khandual wrote:
>>> On 7/19/21 6:36 PM, Gavin Shan wrote:
>>>> In page table entry modifying tests, set_xxx_at() are used to populate
>>>> the page table entries. On ARM64, PG_arch_1 is set to the target page
>>>> flag if execution permission is given. The page flag is kept when the
>>>> page is free'd to buddy's free area list. However, it will trigger page
>>>> checking failure when it's pulled from the buddy's free area list, as
>>>> the following warning messages indicate.
>>>>
>>>>      BUG: Bad page state in process memhog  pfn:08000
>>>>      page:0000000015c0a628 refcount:0 mapcount:0 \
>>>>           mapping:0000000000000000 index:0x1 pfn:0x8000
>>>>      flags: 0x7ffff8000000800(arch_1|node=0|zone=0|lastcpupid=0xfffff)
>>>>      raw: 07ffff8000000800 dead000000000100 dead000000000122 0000000000000000
>>>>      raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
>>>>      page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag(s) set
>>>>
>>>> This fixes the issue by clearing PG_arch_1 through flush_dcache_page()
>>>> after set_xxx_at() is called.
>>>
>>> Could you please add comments before each flush_dcache_page() instance
>>> explaining why this is needed for arm64 platforms with relevant PG_arch_1
>>> context and how this does not have any adverse effect on other platforms ?
>>> It should be easy for some one looking at this code after a while to figure
>>> out from where flush_dcache_page() came from.
>>>
>>
>> Good point. I will improve chage log to include the commit ID in v4 where the
>> page flag (PG_arch_1) is used and explain how. In that case, it's much clearer
>> to understand the reason why we need flush_dcache_page() after set_xxx_at() on
>> ARM64.
> 
> But also some in code comments where flush_dcache_page() is being called.
> 

Yes, I will add some comments where flush_dcache_page() is called in v4.

Thanks,
Gavin
diff mbox series

Patch

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 4f7bf1c9724a..de9def6f7ce1 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -29,6 +29,8 @@ 
 #include <linux/start_kernel.h>
 #include <linux/sched/mm.h>
 #include <linux/io.h>
+
+#include <asm/cacheflush.h>
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 
@@ -118,6 +120,7 @@  static void __init pte_basic_tests(struct pgtable_debug_args *args, int idx)
 
 static void __init pte_advanced_tests(struct pgtable_debug_args *args)
 {
+	struct page *page;
 	pte_t pte;
 
 	/*
@@ -127,13 +130,16 @@  static void __init pte_advanced_tests(struct pgtable_debug_args *args)
 	 */
 
 	pr_debug("Validating PTE advanced\n");
-	if (args->pte_pfn == ULONG_MAX) {
+	page = (args->pte_pfn != ULONG_MAX) ?
+	       pfn_to_page(args->pte_pfn) : NULL;
+	if (!page) {
 		pr_debug("%s: Skipped\n", __func__);
 		return;
 	}
 
 	pte = pfn_pte(args->pte_pfn, args->page_prot);
 	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
+	flush_dcache_page(page);
 	ptep_set_wrprotect(args->mm, args->vaddr, args->ptep);
 	pte = ptep_get(args->ptep);
 	WARN_ON(pte_write(pte));
@@ -145,6 +151,7 @@  static void __init pte_advanced_tests(struct pgtable_debug_args *args)
 	pte = pte_wrprotect(pte);
 	pte = pte_mkclean(pte);
 	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
+	flush_dcache_page(page);
 	pte = pte_mkwrite(pte);
 	pte = pte_mkdirty(pte);
 	ptep_set_access_flags(args->vma, args->vaddr, args->ptep, pte, 1);
@@ -157,6 +164,7 @@  static void __init pte_advanced_tests(struct pgtable_debug_args *args)
 	pte = pfn_pte(args->pte_pfn, args->page_prot);
 	pte = pte_mkyoung(pte);
 	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
+	flush_dcache_page(page);
 	ptep_test_and_clear_young(args->vma, args->vaddr, args->ptep);
 	pte = ptep_get(args->ptep);
 	WARN_ON(pte_young(pte));
@@ -215,6 +223,7 @@  static void __init pmd_basic_tests(struct pgtable_debug_args *args, int idx)
 
 static void __init pmd_advanced_tests(struct pgtable_debug_args *args)
 {
+	struct page *page;
 	pmd_t pmd;
 	unsigned long vaddr = (args->vaddr & HPAGE_PMD_MASK);
 
@@ -222,7 +231,9 @@  static void __init pmd_advanced_tests(struct pgtable_debug_args *args)
 		return;
 
 	pr_debug("Validating PMD advanced\n");
-	if (args->pmd_pfn == ULONG_MAX) {
+	page = (args->pmd_pfn != ULONG_MAX) ?
+	       pfn_to_page(args->pmd_pfn) : NULL;
+	if (!page) {
 		pr_debug("%s: Skipped\n", __func__);
 		return;
 	}
@@ -231,6 +242,7 @@  static void __init pmd_advanced_tests(struct pgtable_debug_args *args)
 
 	pmd = pfn_pmd(args->pmd_pfn, args->page_prot);
 	set_pmd_at(args->mm, vaddr, args->pmdp, pmd);
+	flush_dcache_page(page);
 	pmdp_set_wrprotect(args->mm, vaddr, args->pmdp);
 	pmd = READ_ONCE(*(args->pmdp));
 	WARN_ON(pmd_write(pmd));
@@ -242,6 +254,7 @@  static void __init pmd_advanced_tests(struct pgtable_debug_args *args)
 	pmd = pmd_wrprotect(pmd);
 	pmd = pmd_mkclean(pmd);
 	set_pmd_at(args->mm, vaddr, args->pmdp, pmd);
+	flush_dcache_page(page);
 	pmd = pmd_mkwrite(pmd);
 	pmd = pmd_mkdirty(pmd);
 	pmdp_set_access_flags(args->vma, vaddr, args->pmdp, pmd, 1);
@@ -254,6 +267,7 @@  static void __init pmd_advanced_tests(struct pgtable_debug_args *args)
 	pmd = pmd_mkhuge(pfn_pmd(args->pmd_pfn, args->page_prot));
 	pmd = pmd_mkyoung(pmd);
 	set_pmd_at(args->mm, vaddr, args->pmdp, pmd);
+	flush_dcache_page(page);
 	pmdp_test_and_clear_young(args->vma, vaddr, args->pmdp);
 	pmd = READ_ONCE(*(args->pmdp));
 	WARN_ON(pmd_young(pmd));
@@ -340,6 +354,7 @@  static void __init pud_basic_tests(struct pgtable_debug_args *args, int idx)
 
 static void __init pud_advanced_tests(struct pgtable_debug_args *args)
 {
+	struct page *page;
 	unsigned long vaddr = (args->vaddr & HPAGE_PUD_MASK);
 	pud_t pud;
 
@@ -347,13 +362,16 @@  static void __init pud_advanced_tests(struct pgtable_debug_args *args)
 		return;
 
 	pr_debug("Validating PUD advanced\n");
-	if (args->pud_pfn == ULONG_MAX) {
+	page = (args->pud_pfn != ULONG_MAX) ?
+	       pfn_to_page(args->pud_pfn) : NULL;
+	if (!page) {
 		pr_debug("%s: Skipped\n", __func__);
 		return;
 	}
 
 	pud = pfn_pud(args->pud_pfn, args->page_prot);
 	set_pud_at(args->mm, vaddr, args->pudp, pud);
+	flush_dcache_page(page);
 	pudp_set_wrprotect(args->mm, vaddr, args->pudp);
 	pud = READ_ONCE(*(args->pudp));
 	WARN_ON(pud_write(pud));
@@ -367,6 +385,7 @@  static void __init pud_advanced_tests(struct pgtable_debug_args *args)
 	pud = pud_wrprotect(pud);
 	pud = pud_mkclean(pud);
 	set_pud_at(args->mm, vaddr, args->pudp, pud);
+	flush_dcache_page(page);
 	pud = pud_mkwrite(pud);
 	pud = pud_mkdirty(pud);
 	pudp_set_access_flags(args->vma, vaddr, args->pudp, pud, 1);
@@ -382,6 +401,7 @@  static void __init pud_advanced_tests(struct pgtable_debug_args *args)
 	pud = pfn_pud(args->pud_pfn, args->page_prot);
 	pud = pud_mkyoung(pud);
 	set_pud_at(args->mm, vaddr, args->pudp, pud);
+	flush_dcache_page(page);
 	pudp_test_and_clear_young(args->vma, vaddr, args->pudp);
 	pud = READ_ONCE(*(args->pudp));
 	WARN_ON(pud_young(pud));
@@ -596,10 +616,13 @@  static void __init pgd_populate_tests(struct pgtable_debug_args *args) { }
 
 static void __init pte_clear_tests(struct pgtable_debug_args *args)
 {
+	struct page *page;
 	pte_t pte;
 
 	pr_debug("Validating PTE clear\n");
-	if (args->pte_pfn == ULONG_MAX) {
+	page = (args->pte_pfn != ULONG_MAX) ?
+	       pfn_to_page(args->pte_pfn) : NULL;
+	if (!page) {
 		pr_debug("%s: Skipped\n", __func__);
 		return;
 	}
@@ -609,6 +632,7 @@  static void __init pte_clear_tests(struct pgtable_debug_args *args)
 	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
 #endif
 	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
+	flush_dcache_page(page);
 	barrier();
 	pte_clear(args->mm, args->vaddr, args->ptep);
 	pte = ptep_get(args->ptep);