diff mbox series

[RFC,v2,13/15] arm64: mm: Guard page table writes with kpkeys

Message ID 20250108103250.3188419-14-kevin.brodsky@arm.com (mailing list archive)
State New
Headers show
Series pkeys-based page table hardening | expand

Commit Message

Kevin Brodsky Jan. 8, 2025, 10:32 a.m. UTC
When CONFIG_KPKEYS_HARDENED_PGTABLES is enabled, page tables (both
user and kernel) are mapped with a privileged pkey in the linear
mapping. As a result, they can only be written under the
kpkeys_hardened_pgtables guard, which sets POR_EL1 appropriately to
allow such writes.

Use this guard wherever page tables genuinely need to be written,
keeping its scope as small as possible (so that POR_EL1 is reset as
fast as possible). Where atomics are involved, the guard's scope
encompasses the whole loop to avoid switching POR_EL1 unnecessarily.

This patch is a no-op if CONFIG_KPKEYS_HARDENED_PGTABLES is disabled
(default).

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 19 +++++++++++++++++--
 arch/arm64/mm/fault.c            |  2 ++
 2 files changed, 19 insertions(+), 2 deletions(-)

Comments

Qi Zheng Jan. 9, 2025, 7:17 a.m. UTC | #1
Hi Kevin,

On 2025/1/8 18:32, Kevin Brodsky wrote:
> When CONFIG_KPKEYS_HARDENED_PGTABLES is enabled, page tables (both
> user and kernel) are mapped with a privileged pkey in the linear
> mapping. As a result, they can only be written under the
> kpkeys_hardened_pgtables guard, which sets POR_EL1 appropriately to
> allow such writes.
> 
> Use this guard wherever page tables genuinely need to be written,
> keeping its scope as small as possible (so that POR_EL1 is reset as
> fast as possible). Where atomics are involved, the guard's scope
> encompasses the whole loop to avoid switching POR_EL1 unnecessarily.
> 
> This patch is a no-op if CONFIG_KPKEYS_HARDENED_PGTABLES is disabled
> (default).
> 
> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
> ---
>   arch/arm64/include/asm/pgtable.h | 19 +++++++++++++++++--
>   arch/arm64/mm/fault.c            |  2 ++
>   2 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index f8dac6673887..0d60a49dc234 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -39,6 +39,7 @@
>   #include <linux/mm_types.h>
>   #include <linux/sched.h>
>   #include <linux/page_table_check.h>
> +#include <linux/kpkeys.h>
>   
>   #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>   #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
> @@ -314,6 +315,7 @@ static inline pte_t pte_clear_uffd_wp(pte_t pte)
>   
>   static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
>   {
> +	guard(kpkeys_hardened_pgtables)();
>   	WRITE_ONCE(*ptep, pte);
>   }
>   
> @@ -758,6 +760,7 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
>   	}
>   #endif /* __PAGETABLE_PMD_FOLDED */
>   
> +	guard(kpkeys_hardened_pgtables)();
>   	WRITE_ONCE(*pmdp, pmd);
>   
>   	if (pmd_valid(pmd)) {

I noticed a long time ago that set_pte/set_pmd/... was implemented
separately by each architecture without a unified entry point. This
makes it difficult to add some hooks for them.

Taking set_pte() as an example, is it possible to do the following:

1) add a generic set_pte() in include/asm-generic/tlb.h (Or other more
    appropriate files)

static inline void set_pte(pte_t *ptep, pte_t pte)
{
	arch_set_pte(ptep, pte);
}

2) let each architecture include this file and rename the original
    set_pte() to arch_set_pte().

3) then we can add hooks for generic set_pte():

static inline void set_pte(pte_t *ptep, pte_t pte)
{
	guard(kpkeys_hardened_pgtables)();
	arch_set_pte(ptep, pte);
}

4) in this way, the architecture that supports
    ARCH_HAS_KPKEYS_HARDENED_PGTABLES only needs to implement
    the kpkeys_hardened_pgtables(), otherwise it is no-op.

Just some immature ideas, and the related set/clear interfaces
are currently quite messy. ;)

Of course, this does not affect the feature to be implemented
in this patch series.

Thanks!
Kevin Brodsky Jan. 10, 2025, 2:05 p.m. UTC | #2
On 09/01/2025 08:17, Qi Zheng wrote:
> [...]
>
>> @@ -314,6 +315,7 @@ static inline pte_t pte_clear_uffd_wp(pte_t pte)
>>     static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
>>   {
>> +    guard(kpkeys_hardened_pgtables)();
>>       WRITE_ONCE(*ptep, pte);
>>   }
>>   @@ -758,6 +760,7 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
>>       }
>>   #endif /* __PAGETABLE_PMD_FOLDED */
>>   +    guard(kpkeys_hardened_pgtables)();
>>       WRITE_ONCE(*pmdp, pmd);
>>         if (pmd_valid(pmd)) {
>
> I noticed a long time ago that set_pte/set_pmd/... was implemented
> separately by each architecture without a unified entry point. This
> makes it difficult to add some hooks for them.
>
> Taking set_pte() as an example, is it possible to do the following:
>
> 1) add a generic set_pte() in include/asm-generic/tlb.h (Or other more
>    appropriate files)
>
> static inline void set_pte(pte_t *ptep, pte_t pte)
> {
>     arch_set_pte(ptep, pte);
> }
>
> 2) let each architecture include this file and rename the original
>    set_pte() to arch_set_pte().
>
> 3) then we can add hooks for generic set_pte():
>
> static inline void set_pte(pte_t *ptep, pte_t pte)
> {
>     guard(kpkeys_hardened_pgtables)();
>     arch_set_pte(ptep, pte);
> }
>
> 4) in this way, the architecture that supports
>    ARCH_HAS_KPKEYS_HARDENED_PGTABLES only needs to implement
>    the kpkeys_hardened_pgtables(), otherwise it is no-op.

Thanks for chiming in, it is an interesting idea for sure. I think the
issue here might be the benefit/churn ratio, because this would simply
be adding a layer of (generic) function calls without unifying any
existing arch code. Unfortunately, unlike the pagetable_alloc/tlb stuff,
the majority of what happens in the page table modifiers is arch-specific.

For set_p*(), we could potentially have it call a generic __set_p*()
that does a WRITE_ONCE(), which is what most architectures do (in
addition to various other things, like DSB/ISB on arm64). Adding the
kpkeys_hardened_pgtables guard there would be better, as it reduces the
"privileged" window to just the write itself. However for the other
modifiers, say ptep_get_and_clear(), the implementation seems to vary
wildly between arch's, including the atomic operation itself. Any
unification of those seems therefore difficult.

That said, I'd be happy to look into adding that generic layer on top
(i.e. generic set_pte() calls arch_set_pte()) if there is enough
consensus that the churn is justified. We could potentially do a mix of
both as well (arch-defined set_pte() calls generic __set_pte(), while
generic ptep_get_and_clear() calls arch_ptep_get_and_clear()).

- Kevin
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f8dac6673887..0d60a49dc234 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -39,6 +39,7 @@ 
 #include <linux/mm_types.h>
 #include <linux/sched.h>
 #include <linux/page_table_check.h>
+#include <linux/kpkeys.h>
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
@@ -314,6 +315,7 @@  static inline pte_t pte_clear_uffd_wp(pte_t pte)
 
 static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
 {
+	guard(kpkeys_hardened_pgtables)();
 	WRITE_ONCE(*ptep, pte);
 }
 
@@ -758,6 +760,7 @@  static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 	}
 #endif /* __PAGETABLE_PMD_FOLDED */
 
+	guard(kpkeys_hardened_pgtables)();
 	WRITE_ONCE(*pmdp, pmd);
 
 	if (pmd_valid(pmd)) {
@@ -825,6 +828,7 @@  static inline void set_pud(pud_t *pudp, pud_t pud)
 		return;
 	}
 
+	guard(kpkeys_hardened_pgtables)();
 	WRITE_ONCE(*pudp, pud);
 
 	if (pud_valid(pud)) {
@@ -906,6 +910,7 @@  static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 		return;
 	}
 
+	guard(kpkeys_hardened_pgtables)();
 	WRITE_ONCE(*p4dp, p4d);
 	dsb(ishst);
 	isb();
@@ -1033,6 +1038,7 @@  static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
 		return;
 	}
 
+	guard(kpkeys_hardened_pgtables)();
 	WRITE_ONCE(*pgdp, pgd);
 	dsb(ishst);
 	isb();
@@ -1233,6 +1239,7 @@  static inline int __ptep_test_and_clear_young(struct vm_area_struct *vma,
 {
 	pte_t old_pte, pte;
 
+	guard(kpkeys_hardened_pgtables)();
 	pte = __ptep_get(ptep);
 	do {
 		old_pte = pte;
@@ -1279,7 +1286,10 @@  static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address, pte_t *ptep)
 {
-	pte_t pte = __pte(xchg_relaxed(&pte_val(*ptep), 0));
+	pte_t pte;
+
+	scoped_guard(kpkeys_hardened_pgtables)
+		pte = __pte(xchg_relaxed(&pte_val(*ptep), 0));
 
 	page_table_check_pte_clear(mm, pte);
 
@@ -1322,7 +1332,10 @@  static inline pte_t __get_and_clear_full_ptes(struct mm_struct *mm,
 static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
 					    unsigned long address, pmd_t *pmdp)
 {
-	pmd_t pmd = __pmd(xchg_relaxed(&pmd_val(*pmdp), 0));
+	pmd_t pmd;
+
+	scoped_guard(kpkeys_hardened_pgtables)
+		pmd = __pmd(xchg_relaxed(&pmd_val(*pmdp), 0));
 
 	page_table_check_pmd_clear(mm, pmd);
 
@@ -1336,6 +1349,7 @@  static inline void ___ptep_set_wrprotect(struct mm_struct *mm,
 {
 	pte_t old_pte;
 
+	guard(kpkeys_hardened_pgtables)();
 	do {
 		old_pte = pte;
 		pte = pte_wrprotect(pte);
@@ -1416,6 +1430,7 @@  static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
 		unsigned long address, pmd_t *pmdp, pmd_t pmd)
 {
 	page_table_check_pmd_set(vma->vm_mm, pmdp, pmd);
+	guard(kpkeys_hardened_pgtables)();
 	return __pmd(xchg_relaxed(&pmd_val(*pmdp), pmd_val(pmd)));
 }
 #endif
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index ef63651099a9..ab45047155b9 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -220,6 +220,8 @@  int __ptep_set_access_flags(struct vm_area_struct *vma,
 	if (pte_same(pte, entry))
 		return 0;
 
+	guard(kpkeys_hardened_pgtables)();
+
 	/* only preserve the access flags and write permission */
 	pte_val(entry) &= PTE_RDONLY | PTE_AF | PTE_WRITE | PTE_DIRTY;