Message ID: 20250108103250.3188419-14-kevin.brodsky@arm.com (mailing list archive)
State: New
Series: pkeys-based page table hardening
Hi Kevin,

On 2025/1/8 18:32, Kevin Brodsky wrote:
> When CONFIG_KPKEYS_HARDENED_PGTABLES is enabled, page tables (both
> user and kernel) are mapped with a privileged pkey in the linear
> mapping. As a result, they can only be written under the
> kpkeys_hardened_pgtables guard, which sets POR_EL1 appropriately to
> allow such writes.
>
> Use this guard wherever page tables genuinely need to be written,
> keeping its scope as small as possible (so that POR_EL1 is reset as
> fast as possible). Where atomics are involved, the guard's scope
> encompasses the whole loop to avoid switching POR_EL1 unnecessarily.
>
> This patch is a no-op if CONFIG_KPKEYS_HARDENED_PGTABLES is disabled
> (default).
>
> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
> ---
>  arch/arm64/include/asm/pgtable.h | 19 +++++++++++++++++--
>  arch/arm64/mm/fault.c            |  2 ++
>  2 files changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index f8dac6673887..0d60a49dc234 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -39,6 +39,7 @@
>  #include <linux/mm_types.h>
>  #include <linux/sched.h>
>  #include <linux/page_table_check.h>
> +#include <linux/kpkeys.h>
>
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
> @@ -314,6 +315,7 @@ static inline pte_t pte_clear_uffd_wp(pte_t pte)
>
>  static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
>  {
> +	guard(kpkeys_hardened_pgtables)();
>  	WRITE_ONCE(*ptep, pte);
>  }
>
> @@ -758,6 +760,7 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
>  	}
>  #endif /* __PAGETABLE_PMD_FOLDED */
>
> +	guard(kpkeys_hardened_pgtables)();
>  	WRITE_ONCE(*pmdp, pmd);
>
>  	if (pmd_valid(pmd)) {

I noticed a long time ago that set_pte/set_pmd/... was implemented
separately by each architecture without a unified entry point. This
makes it difficult to add hooks for them.

Taking set_pte() as an example, is it possible to do the following:

1) Add a generic set_pte() in include/asm-generic/tlb.h (or another
more appropriate file):

static inline void set_pte(pte_t *ptep, pte_t pte)
{
	arch_set_pte(ptep, pte);
}

2) Let each architecture include this file and rename its original
set_pte() to arch_set_pte().

3) Then we can add hooks to the generic set_pte():

static inline void set_pte(pte_t *ptep, pte_t pte)
{
	guard(kpkeys_hardened_pgtables)();
	arch_set_pte(ptep, pte);
}

4) In this way, an architecture that supports
ARCH_HAS_KPKEYS_HARDENED_PGTABLES only needs to implement
kpkeys_hardened_pgtables(); otherwise it is a no-op.

Just some rough ideas, and the related set/clear interfaces are
currently quite messy. ;)

Of course, this does not affect the feature to be implemented in this
patch series.

Thanks!
On 09/01/2025 08:17, Qi Zheng wrote:
> [...]
>
>> @@ -314,6 +315,7 @@ static inline pte_t pte_clear_uffd_wp(pte_t pte)
>>  static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
>>  {
>> +	guard(kpkeys_hardened_pgtables)();
>>  	WRITE_ONCE(*ptep, pte);
>>  }
>> @@ -758,6 +760,7 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
>>  	}
>>  #endif /* __PAGETABLE_PMD_FOLDED */
>> +	guard(kpkeys_hardened_pgtables)();
>>  	WRITE_ONCE(*pmdp, pmd);
>>  	if (pmd_valid(pmd)) {
>
> I noticed a long time ago that set_pte/set_pmd/... was implemented
> separately by each architecture without a unified entry point. This
> makes it difficult to add hooks for them.
>
> Taking set_pte() as an example, is it possible to do the following:
>
> 1) Add a generic set_pte() in include/asm-generic/tlb.h (or another
> more appropriate file):
>
> static inline void set_pte(pte_t *ptep, pte_t pte)
> {
> 	arch_set_pte(ptep, pte);
> }
>
> 2) Let each architecture include this file and rename its original
> set_pte() to arch_set_pte().
>
> 3) Then we can add hooks to the generic set_pte():
>
> static inline void set_pte(pte_t *ptep, pte_t pte)
> {
> 	guard(kpkeys_hardened_pgtables)();
> 	arch_set_pte(ptep, pte);
> }
>
> 4) In this way, an architecture that supports
> ARCH_HAS_KPKEYS_HARDENED_PGTABLES only needs to implement
> kpkeys_hardened_pgtables(); otherwise it is a no-op.

Thanks for chiming in, it is an interesting idea for sure. I think the
issue here might be the benefit/churn ratio, because this would simply
be adding a layer of (generic) function calls without unifying any
existing arch code. Unfortunately, unlike the pagetable_alloc/tlb
stuff, the majority of what happens in the page table modifiers is
arch-specific.

For set_p*(), we could potentially have it call a generic __set_p*()
that does a WRITE_ONCE(), which is what most architectures do (in
addition to various other things, like DSB/ISB on arm64). Adding the
kpkeys_hardened_pgtables guard there would be better, as it reduces
the "privileged" window to just the write itself. However, for the
other modifiers, say ptep_get_and_clear(), the implementation varies
wildly between architectures, including the atomic operation itself.
Any unification of those therefore seems difficult.

That said, I'd be happy to look into adding that generic layer on top
(i.e. generic set_pte() calls arch_set_pte()) if there is enough
consensus that the churn is justified. We could potentially do a mix
of both as well (arch-defined set_pte() calls generic __set_pte(),
while generic ptep_get_and_clear() calls arch_ptep_get_and_clear()).

- Kevin
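To make the "mix of both" layering concrete, here is a minimal sketch
of how the two directions could fit together. All names beyond
set_pte() are assumptions for illustration (arch_ptep_get_and_clear()
and the generic __set_pte() shown here are hypothetical, not an API
this series defines, and a real series would need to avoid colliding
with arm64's existing __set_pte()):

/* Generic raw write; the privileged window covers only the store. */
static inline void __set_pte(pte_t *ptep, pte_t pte)
{
	guard(kpkeys_hardened_pgtables)();
	WRITE_ONCE(*ptep, pte);
}

/* Arch-defined set_pte() keeps its own ordering, calls __set_pte(). */
static inline void set_pte(pte_t *ptep, pte_t pte)
{
	__set_pte(ptep, pte);
	dsb(ishst);	/* arch-specific barriers stay in arch code */
}

/* Generic wrapper around the arch-specific atomic clear. */
static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
				       unsigned long addr, pte_t *ptep)
{
	guard(kpkeys_hardened_pgtables)();
	return arch_ptep_get_and_clear(mm, addr, ptep);
}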
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f8dac6673887..0d60a49dc234 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -39,6 +39,7 @@
 #include <linux/mm_types.h>
 #include <linux/sched.h>
 #include <linux/page_table_check.h>
+#include <linux/kpkeys.h>
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
@@ -314,6 +315,7 @@ static inline pte_t pte_clear_uffd_wp(pte_t pte)
 
 static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
 {
+	guard(kpkeys_hardened_pgtables)();
 	WRITE_ONCE(*ptep, pte);
 }
 
@@ -758,6 +760,7 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 	}
 #endif /* __PAGETABLE_PMD_FOLDED */
 
+	guard(kpkeys_hardened_pgtables)();
 	WRITE_ONCE(*pmdp, pmd);
 
 	if (pmd_valid(pmd)) {
@@ -825,6 +828,7 @@ static inline void set_pud(pud_t *pudp, pud_t pud)
 		return;
 	}
 
+	guard(kpkeys_hardened_pgtables)();
 	WRITE_ONCE(*pudp, pud);
 
 	if (pud_valid(pud)) {
@@ -906,6 +910,7 @@ static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 		return;
 	}
 
+	guard(kpkeys_hardened_pgtables)();
 	WRITE_ONCE(*p4dp, p4d);
 	dsb(ishst);
 	isb();
@@ -1033,6 +1038,7 @@ static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
 		return;
 	}
 
+	guard(kpkeys_hardened_pgtables)();
 	WRITE_ONCE(*pgdp, pgd);
 	dsb(ishst);
 	isb();
@@ -1233,6 +1239,7 @@ static inline int __ptep_test_and_clear_young(struct vm_area_struct *vma,
 {
 	pte_t old_pte, pte;
 
+	guard(kpkeys_hardened_pgtables)();
 	pte = __ptep_get(ptep);
 	do {
 		old_pte = pte;
@@ -1279,7 +1286,10 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address, pte_t *ptep)
 {
-	pte_t pte = __pte(xchg_relaxed(&pte_val(*ptep), 0));
+	pte_t pte;
+
+	scoped_guard(kpkeys_hardened_pgtables)
+		pte = __pte(xchg_relaxed(&pte_val(*ptep), 0));
 
 	page_table_check_pte_clear(mm, pte);
 
@@ -1322,7 +1332,10 @@ static inline pte_t __get_and_clear_full_ptes(struct mm_struct *mm,
 static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
 					    unsigned long address, pmd_t *pmdp)
 {
-	pmd_t pmd = __pmd(xchg_relaxed(&pmd_val(*pmdp), 0));
+	pmd_t pmd;
+
+	scoped_guard(kpkeys_hardened_pgtables)
+		pmd = __pmd(xchg_relaxed(&pmd_val(*pmdp), 0));
 
 	page_table_check_pmd_clear(mm, pmd);
 
@@ -1336,6 +1349,7 @@ static inline void ___ptep_set_wrprotect(struct mm_struct *mm,
 {
 	pte_t old_pte;
 
+	guard(kpkeys_hardened_pgtables)();
 	do {
 		old_pte = pte;
 		pte = pte_wrprotect(pte);
@@ -1416,6 +1430,7 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
 		unsigned long address, pmd_t *pmdp, pmd_t pmd)
 {
 	page_table_check_pmd_set(vma->vm_mm, pmdp, pmd);
+	guard(kpkeys_hardened_pgtables)();
 	return __pmd(xchg_relaxed(&pmd_val(*pmdp), pmd_val(pmd)));
 }
 #endif
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index ef63651099a9..ab45047155b9 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -220,6 +220,8 @@ int __ptep_set_access_flags(struct vm_area_struct *vma,
 	if (pte_same(pte, entry))
 		return 0;
 
+	guard(kpkeys_hardened_pgtables)();
+
 	/* only preserve the access flags and write permission */
 	pte_val(entry) &= PTE_RDONLY | PTE_AF | PTE_WRITE | PTE_DIRTY;
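The diff uses both guard forms from <linux/cleanup.h>: guard() keeps
the guard alive until the end of the enclosing scope, while
scoped_guard() covers only the statement or block that follows, which
is why the xchg_relaxed()-based helpers use it to confine the
privileged window to the exchange itself. A small illustration of the
difference (the surrounding function and the unprivileged helper are
placeholders):

static void example(pte_t *ptep, pte_t pte)
{
	scoped_guard(kpkeys_hardened_pgtables)
		WRITE_ONCE(*ptep, pte);	/* guard dropped right after this */

	do_unprivileged_work();		/* POR_EL1 already reset here */

	guard(kpkeys_hardened_pgtables)();
	WRITE_ONCE(*ptep, pte);		/* guard held to end of scope */
}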
When CONFIG_KPKEYS_HARDENED_PGTABLES is enabled, page tables (both
user and kernel) are mapped with a privileged pkey in the linear
mapping. As a result, they can only be written under the
kpkeys_hardened_pgtables guard, which sets POR_EL1 appropriately to
allow such writes.

Use this guard wherever page tables genuinely need to be written,
keeping its scope as small as possible (so that POR_EL1 is reset as
fast as possible). Where atomics are involved, the guard's scope
encompasses the whole loop to avoid switching POR_EL1 unnecessarily.

This patch is a no-op if CONFIG_KPKEYS_HARDENED_PGTABLES is disabled
(default).

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 19 +++++++++++++++++--
 arch/arm64/mm/fault.c            |  2 ++
 2 files changed, 19 insertions(+), 2 deletions(-)
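For reference, a guard like this can be built on the scope-based
cleanup infrastructure in <linux/cleanup.h>. Below is a minimal sketch
of one possible definition; kpkeys_set_level(),
kpkeys_restore_pkey_reg() and KPKEYS_LVL_PGTABLES are assumed names
for illustration, not necessarily the API introduced earlier in this
series:

#include <linux/cleanup.h>

/*
 * Sketch: raise the kpkeys level on entry (which updates POR_EL1 on
 * arm64) and restore the saved pkey register value when the guard
 * goes out of scope. Helper names are assumptions.
 */
DEFINE_LOCK_GUARD_0(kpkeys_hardened_pgtables,
		    _T->pkey_reg = kpkeys_set_level(KPKEYS_LVL_PGTABLES),
		    kpkeys_restore_pkey_reg(_T->pkey_reg),
		    u64 pkey_reg)

With a definition along these lines, guard(kpkeys_hardened_pgtables)();
enables page-table writes for the rest of the scope, and the destructor
restores the previous POR_EL1 automatically, which is what keeps the
privileged window small.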