diff mbox series

mm/debug_vm_pgtable: Drop RANDOM_ORVALUE trick

Message ID 20240523132139.289719-1-peterx@redhat.com (mailing list archive)
State New
Headers show
Series mm/debug_vm_pgtable: Drop RANDOM_ORVALUE trick | expand

Commit Message

Peter Xu May 23, 2024, 1:21 p.m. UTC
Macro RANDOM_ORVALUE was used to make sure the pgtable entry will be
populated with !none data in clear tests.

The RANDOM_ORVALUE tried to cover mostly all the bits in a pgtable entry,
even if there's no discussion on whether all the bits will be vaild.  Both
S390 and PPC64 have their own masks to avoid touching some bits.  Now it's
the turn for x86_64.

The issue is there's a recent report from Mikhail Gavrilov showing that
this can cause a warning with the newly added pte set check in commit
8430557fc5 on writable v.s. userfaultfd-wp bit, even though the check
itself was valid, the random pte is not.  We can choose to mask more bits
out.

However the need to have such random bits setup is questionable, as now
it's already guaranteed to be true on below:

  - For pte level, the pgtable entry will be installed with value from
    pfn_pte(), where pfn points to a valid page.  Hence the pte will be
    !none already if populated with pfn_pte().

  - For upper-than-pte level, the pgtable entry should contain a directory
    entry always, which is also !none.

All the cases look like good enough to test a pxx_clear() helper.  Instead
of extending the bitmask, drop the "set random bits" trick completely.  Add
some warning guards to make sure the entries will be !none before clear().

Cc: David Hildenbrand <david@redhat.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Link: https://lore.kernel.org/r/CABXGCsMB9A8-X+Np_Q+fWLURYL_0t3Y-MdoNabDM-Lzk58-DGA@mail.gmail.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/debug_vm_pgtable.c | 31 +++++--------------------------
 1 file changed, 5 insertions(+), 26 deletions(-)

Comments

David Hildenbrand May 23, 2024, 7:03 p.m. UTC | #1
On 23.05.24 15:21, Peter Xu wrote:
> Macro RANDOM_ORVALUE was used to make sure the pgtable entry will be
> populated with !none data in clear tests.
> 
> The RANDOM_ORVALUE tried to cover mostly all the bits in a pgtable entry,
> even if there's no discussion on whether all the bits will be vaild.  Both
> S390 and PPC64 have their own masks to avoid touching some bits.  Now it's
> the turn for x86_64.
> 
> The issue is there's a recent report from Mikhail Gavrilov showing that
> this can cause a warning with the newly added pte set check in commit
> 8430557fc5 on writable v.s. userfaultfd-wp bit, even though the check
> itself was valid, the random pte is not.  We can choose to mask more bits
> out.
> 
> However the need to have such random bits setup is questionable, as now
> it's already guaranteed to be true on below:
> 
>    - For pte level, the pgtable entry will be installed with value from
>      pfn_pte(), where pfn points to a valid page.  Hence the pte will be
>      !none already if populated with pfn_pte().
> 
>    - For upper-than-pte level, the pgtable entry should contain a directory
>      entry always, which is also !none.
> 
> All the cases look like good enough to test a pxx_clear() helper.  Instead
> of extending the bitmask, drop the "set random bits" trick completely.  Add
> some warning guards to make sure the entries will be !none before clear().
> 
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Cc: Gavin Shan <gshan@redhat.com>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
> Link: https://lore.kernel.org/r/CABXGCsMB9A8-X+Np_Q+fWLURYL_0t3Y-MdoNabDM-Lzk58-DGA@mail.gmail.com
> Signed-off-by: Peter Xu <peterx@redhat.com>

Acked-by: David Hildenbrand <david@redhat.com>
Pasha Tatashin May 23, 2024, 11:23 p.m. UTC | #2
On Thu, May 23, 2024 at 9:21 AM Peter Xu <peterx@redhat.com> wrote:
>
> Macro RANDOM_ORVALUE was used to make sure the pgtable entry will be
> populated with !none data in clear tests.
>
> The RANDOM_ORVALUE tried to cover mostly all the bits in a pgtable entry,
> even if there's no discussion on whether all the bits will be vaild.  Both
> S390 and PPC64 have their own masks to avoid touching some bits.  Now it's
> the turn for x86_64.
>
> The issue is there's a recent report from Mikhail Gavrilov showing that
> this can cause a warning with the newly added pte set check in commit
> 8430557fc5 on writable v.s. userfaultfd-wp bit, even though the check
> itself was valid, the random pte is not.  We can choose to mask more bits
> out.
>
> However the need to have such random bits setup is questionable, as now
> it's already guaranteed to be true on below:
>
>   - For pte level, the pgtable entry will be installed with value from
>     pfn_pte(), where pfn points to a valid page.  Hence the pte will be
>     !none already if populated with pfn_pte().
>
>   - For upper-than-pte level, the pgtable entry should contain a directory
>     entry always, which is also !none.
>
> All the cases look like good enough to test a pxx_clear() helper.  Instead
> of extending the bitmask, drop the "set random bits" trick completely.  Add
> some warning guards to make sure the entries will be !none before clear().
>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Cc: Gavin Shan <gshan@redhat.com>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
> Link: https://lore.kernel.org/r/CABXGCsMB9A8-X+Np_Q+fWLURYL_0t3Y-MdoNabDM-Lzk58-DGA@mail.gmail.com
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  mm/debug_vm_pgtable.c | 31 +++++--------------------------
>  1 file changed, 5 insertions(+), 26 deletions(-)

Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
diff mbox series

Patch

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index f1c9a2c5abc0..866eddb6cfda 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -40,22 +40,7 @@ 
  * Please refer Documentation/mm/arch_pgtable_helpers.rst for the semantics
  * expectations that are being validated here. All future changes in here
  * or the documentation need to be in sync.
- *
- * On s390 platform, the lower 4 bits are used to identify given page table
- * entry type. But these bits might affect the ability to clear entries with
- * pxx_clear() because of how dynamic page table folding works on s390. So
- * while loading up the entries do not change the lower 4 bits. It does not
- * have affect any other platform. Also avoid the 62nd bit on ppc64 that is
- * used to mark a pte entry.
  */
-#define S390_SKIP_MASK		GENMASK(3, 0)
-#if __BITS_PER_LONG == 64
-#define PPC64_SKIP_MASK		GENMASK(62, 62)
-#else
-#define PPC64_SKIP_MASK		0x0
-#endif
-#define ARCH_SKIP_MASK (S390_SKIP_MASK | PPC64_SKIP_MASK)
-#define RANDOM_ORVALUE (GENMASK(BITS_PER_LONG - 1, 0) & ~ARCH_SKIP_MASK)
 #define RANDOM_NZVALUE	GENMASK(7, 0)
 
 struct pgtable_debug_args {
@@ -511,8 +496,7 @@  static void __init pud_clear_tests(struct pgtable_debug_args *args)
 		return;
 
 	pr_debug("Validating PUD clear\n");
-	pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
-	WRITE_ONCE(*args->pudp, pud);
+	WARN_ON(pud_none(pud));
 	pud_clear(args->pudp);
 	pud = READ_ONCE(*args->pudp);
 	WARN_ON(!pud_none(pud));
@@ -548,8 +532,7 @@  static void __init p4d_clear_tests(struct pgtable_debug_args *args)
 		return;
 
 	pr_debug("Validating P4D clear\n");
-	p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
-	WRITE_ONCE(*args->p4dp, p4d);
+	WARN_ON(p4d_none(p4d));
 	p4d_clear(args->p4dp);
 	p4d = READ_ONCE(*args->p4dp);
 	WARN_ON(!p4d_none(p4d));
@@ -582,8 +565,7 @@  static void __init pgd_clear_tests(struct pgtable_debug_args *args)
 		return;
 
 	pr_debug("Validating PGD clear\n");
-	pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE);
-	WRITE_ONCE(*args->pgdp, pgd);
+	WARN_ON(pgd_none(pgd));
 	pgd_clear(args->pgdp);
 	pgd = READ_ONCE(*args->pgdp);
 	WARN_ON(!pgd_none(pgd));
@@ -634,10 +616,8 @@  static void __init pte_clear_tests(struct pgtable_debug_args *args)
 	if (WARN_ON(!args->ptep))
 		return;
 
-#ifndef CONFIG_RISCV
-	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
-#endif
 	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
+	WARN_ON(pte_none(pte));
 	flush_dcache_page(page);
 	barrier();
 	ptep_clear(args->mm, args->vaddr, args->ptep);
@@ -650,8 +630,7 @@  static void __init pmd_clear_tests(struct pgtable_debug_args *args)
 	pmd_t pmd = READ_ONCE(*args->pmdp);
 
 	pr_debug("Validating PMD clear\n");
-	pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE);
-	WRITE_ONCE(*args->pmdp, pmd);
+	WARN_ON(pmd_none(pmd));
 	pmd_clear(args->pmdp);
 	pmd = READ_ONCE(*args->pmdp);
 	WARN_ON(!pmd_none(pmd));