Message ID: 20220130211838.8382-22-rick.p.edgecombe@intel.com (mailing list archive)
State: New
Series: Shadow stacks for userspace
On 1/30/22 13:18, Rick Edgecombe wrote:
> In change_pte_range(), when a PTE is changed for prot_numa, _PAGE_RW is
> preserved to avoid the additional write fault after the NUMA hinting
> fault. However, pte_write() now includes both normal writable and
> shadow stack (RW=0, Dirty=1) PTEs, but the latter does not have
> _PAGE_RW and has no need to preserve it.

This series creates an interesting situation: it causes a logical
disconnection between things that were tightly coupled before. For
instance, before this series, _PAGE_RW=1 and "writable" really were
synonyms. They meant the same thing. One of the complexities in this
series is differentiating the two. For instance, a shadow stack page
can be written to, even though it has _PAGE_RW=0.

This particular patch seems to be hacking around the problem that a
p*_mkwrite() doesn't work on shadow stack PTEs/PMDs.

First, that makes me wonder what *actually* happens if we do a plain
pte_mkwrite() on a shadow stack PTE. I *think* it will take the
[Write=0,Dirty=1] PTE and do

	pte = pte_set_flags(pte, _PAGE_RW);

so we'll end up with [Write=1,Dirty=1], which is bad.

Let's say pte_mkwrite() can't be fixed. We should probably make it
VM_BUG_ON() if it's ever asked to muck with a shadow stack PTE.

It's also weird because we have this pte_write()==1 PTE in a !VM_WRITE
VMA. Then, we're trying to pte_mkwrite() under this !VM_WRITE VMA.

	pte_write()   <-- returns true for a shadow stack PTE!
	pte_mkwrite() <-- illegal on a shadow stack PTE

I need to think about this a little more. I don't have a solution.
But, as-is, it seems untenable. The rules are just too
counterintuitive to live with.
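[Editor's note: to make the suggested guard concrete, here is a minimal
sketch of the VM_BUG_ON() idea raised above. It is not part of this
series; the inline RW=0/Dirty=1 test is an assumption about how a shadow
stack PTE could be detected, not the series' actual helper.]

	/*
	 * Sketch only: refuse to "make writable" a PTE that is already
	 * using the shadow stack encoding (Write=0, Dirty=1), since
	 * setting _PAGE_RW on it would silently turn it into an
	 * ordinary writable PTE [Write=1, Dirty=1].
	 */
	static inline pte_t pte_mkwrite(pte_t pte)
	{
		VM_BUG_ON((pte_flags(pte) & (_PAGE_RW | _PAGE_DIRTY)) ==
			  _PAGE_DIRTY);

		return pte_set_flags(pte, _PAGE_RW);
	}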
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1c7167e6f223..01375e39b52b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1750,6 +1750,13 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 		return 0;
 
 	preserve_write = prot_numa && pmd_write(*pmd);
+
+	/*
+	 * Preserve only normal writable huge PMD, but not shadow
+	 * stack (RW=0, Dirty=1).
+	 */
+	if (is_shadow_stack_mapping(vma->vm_flags))
+		preserve_write = false;
 	ret = 1;
 
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
diff --git a/mm/mprotect.c b/mm/mprotect.c
index b0012c13a00e..faac710f0891 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -77,6 +77,13 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 			pte_t ptent;
 			bool preserve_write = prot_numa && pte_write(oldpte);
 
+			/*
+			 * Preserve only normal writable PTE, but not shadow
+			 * stack (RW=0, Dirty=1).
+			 */
+			if (is_shadow_stack_mapping(vma->vm_flags))
+				preserve_write = false;
+
 			/*
 			 * Avoid trapping faults against the zero or KSM
 			 * pages. See similar comment in change_huge_pmd.
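[Editor's note: the hunks above call is_shadow_stack_mapping(), which is
defined elsewhere in the series. As a rough sketch of what it likely
does, assuming the series marks shadow stack VMAs with a dedicated
vm_flags bit (called VM_SHADOW_STACK here as an assumption):]

	/*
	 * Sketch, not the series' definition: a shadow stack mapping is
	 * identified purely by its VMA flag, so both hunks can clear
	 * preserve_write without looking at the PTE/PMD bits at all.
	 */
	static inline bool is_shadow_stack_mapping(vm_flags_t vm_flags)
	{
		return vm_flags & VM_SHADOW_STACK;
	}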