Message ID | 1473788797-10879-6-git-send-email-catalin.marinas@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Sep 13, 2016 at 06:46:35PM +0100, Catalin Marinas wrote: > When TTBR0_EL1 is set to the reserved page, an erroneous kernel access > to user space would generate a translation fault. This patch adds the > checks for the software-set PSR_PAN_BIT to emulate a permission fault > and report it accordingly. Why do we need to treat this case as a permission fault? It looks like a fault that wasn't a deliberate uaccess (which thus doesn't have a fixup handler) should have do_page_fault call __do_kernel_fault, where we should die(). Is this just for more consistent reporting of erroneous accesses to user memory? Or does this fix some other issue? > This patch also updates the description of the synchronous external > aborts on translation table walks. This sounds fine, though it might be better as a separate/preparatory patch. > -static inline bool is_permission_fault(unsigned int esr) > +static inline bool is_permission_fault(unsigned int esr, struct pt_regs *regs) > { > unsigned int ec = ESR_ELx_EC(esr); > unsigned int fsc_type = esr & ESR_ELx_FSC_TYPE; > > - return (ec == ESR_ELx_EC_DABT_CUR && fsc_type == ESR_ELx_FSC_PERM) || > - (ec == ESR_ELx_EC_IABT_CUR && fsc_type == ESR_ELx_FSC_PERM); > + if (ec != ESR_ELx_EC_DABT_CUR && ec != ESR_ELx_EC_IABT_CUR) > + return false; > + > + if (system_uses_ttbr0_pan()) > + return fsc_type == ESR_ELx_FSC_FAULT && > + (regs->pstate & PSR_PAN_BIT); > + else > + return fsc_type == ESR_ELx_FSC_PERM; > } Since the entry code will clear the PAN bit in the SPSR when we see the reserved ASID, faults in EFI runtime services will still be correctly reported, though that's somewhat subtle (and it took me a while to convince myself that was the case). Also, I think that faults elsewhere may be misreported, e.g. in cold entry to kernel paths (idle, hotplug) where we have a global mapping in TTBR0, and it looks like we're using the reserved ASID. I'm not immediately sure how to distinguish those cases. Thanks, Mark.
On Fri, Sep 16, 2016 at 12:33:25PM +0100, Mark Rutland wrote: > On Tue, Sep 13, 2016 at 06:46:35PM +0100, Catalin Marinas wrote: > > When TTBR0_EL1 is set to the reserved page, an erroneous kernel access > > to user space would generate a translation fault. This patch adds the > > checks for the software-set PSR_PAN_BIT to emulate a permission fault > > and report it accordingly. > > Why do we need to treat this case as a permission fault? It looks like a > fault that wasn't a deliberate uaccess (which thus doesn't have a fixup > handler) should have do_page_fault call __do_kernel_fault, where we > should die(). We don't always check the exception table for a dedicated uaccess. The only cases where the exception table is checked is when down_read_trylock(mmap_sem) fails or CONFIG_DEBUG_VM is enabled. This is a slight optimisation of the fast path of the page fault handling. So, without is_permission_fault() (which much quicker than search_exception_tables()) the kernel would keep restarting the same instruction. > > -static inline bool is_permission_fault(unsigned int esr) > > +static inline bool is_permission_fault(unsigned int esr, struct pt_regs *regs) > > { > > unsigned int ec = ESR_ELx_EC(esr); > > unsigned int fsc_type = esr & ESR_ELx_FSC_TYPE; > > > > - return (ec == ESR_ELx_EC_DABT_CUR && fsc_type == ESR_ELx_FSC_PERM) || > > - (ec == ESR_ELx_EC_IABT_CUR && fsc_type == ESR_ELx_FSC_PERM); > > + if (ec != ESR_ELx_EC_DABT_CUR && ec != ESR_ELx_EC_IABT_CUR) > > + return false; > > + > > + if (system_uses_ttbr0_pan()) > > + return fsc_type == ESR_ELx_FSC_FAULT && > > + (regs->pstate & PSR_PAN_BIT); > > + else > > + return fsc_type == ESR_ELx_FSC_PERM; > > } > > Since the entry code will clear the PAN bit in the SPSR when we see the > reserved ASID, It actually sets the PAN bit in regs->pstate when it sees the reserved ASID. > faults in EFI runtime services will still be correctly reported, > though that's somewhat subtle (and it took me a while to convince > myself that was the case). The EFI runtime services use a non-reserved ASID, so a fault taken while executing EFI services would clear the PAN bit in regs->pstate. Such faults would not trigger is_permission_fault() == true above. > Also, I think that faults elsewhere may be misreported, e.g. in cold > entry to kernel paths (idle, hotplug) where we have a global mapping in > TTBR0, and it looks like we're using the reserved ASID. > > I'm not immediately sure how to distinguish those cases. We don't normally expect faults on those paths. If we get one, they are usually fatal, so the kernel still dies but with a slightly misleading message. We could improve this I think if we can distinguished between reserved_ttbr0 after swapper (set on exception entry) and the reserved TTBR0_EL1 pointing to empty_zero_page (and not merging Ard's patch). Is it worth having such distinction? I'm not convinced, the only thing you probably save is not printing "Accessing user space memory outside uaccess.h routines" during fault (there may be other ways to mark these special cases maybe by checking the current thread_info but it needs more thinking).
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 05d2bd776c69..2be46e6794e1 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -268,13 +268,19 @@ out: return fault; } -static inline bool is_permission_fault(unsigned int esr) +static inline bool is_permission_fault(unsigned int esr, struct pt_regs *regs) { unsigned int ec = ESR_ELx_EC(esr); unsigned int fsc_type = esr & ESR_ELx_FSC_TYPE; - return (ec == ESR_ELx_EC_DABT_CUR && fsc_type == ESR_ELx_FSC_PERM) || - (ec == ESR_ELx_EC_IABT_CUR && fsc_type == ESR_ELx_FSC_PERM); + if (ec != ESR_ELx_EC_DABT_CUR && ec != ESR_ELx_EC_IABT_CUR) + return false; + + if (system_uses_ttbr0_pan()) + return fsc_type == ESR_ELx_FSC_FAULT && + (regs->pstate & PSR_PAN_BIT); + else + return fsc_type == ESR_ELx_FSC_PERM; } static bool is_el0_instruction_abort(unsigned int esr) @@ -314,7 +320,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, mm_flags |= FAULT_FLAG_WRITE; } - if (is_permission_fault(esr) && (addr < USER_DS)) { + if (addr < USER_DS && is_permission_fault(esr, regs)) { /* regs->orig_addr_limit may be 0 if we entered from EL0 */ if (regs->orig_addr_limit == KERNEL_DS) die("Accessing user space memory with fs=KERNEL_DS", regs, esr); @@ -506,10 +512,10 @@ static const struct fault_info { { do_bad, SIGBUS, 0, "unknown 17" }, { do_bad, SIGBUS, 0, "unknown 18" }, { do_bad, SIGBUS, 0, "unknown 19" }, - { do_bad, SIGBUS, 0, "synchronous abort (translation table walk)" }, - { do_bad, SIGBUS, 0, "synchronous abort (translation table walk)" }, - { do_bad, SIGBUS, 0, "synchronous abort (translation table walk)" }, - { do_bad, SIGBUS, 0, "synchronous abort (translation table walk)" }, + { do_bad, SIGBUS, 0, "synchronous external abort (translation table walk)" }, + { do_bad, SIGBUS, 0, "synchronous external abort (translation table walk)" }, + { do_bad, SIGBUS, 0, "synchronous external abort (translation table walk)" }, + { do_bad, SIGBUS, 0, "synchronous external abort (translation table walk)" }, { do_bad, SIGBUS, 0, "synchronous parity error" }, { do_bad, SIGBUS, 0, "unknown 25" }, { do_bad, SIGBUS, 0, "unknown 26" },
When TTBR0_EL1 is set to the reserved page, an erroneous kernel access to user space would generate a translation fault. This patch adds the checks for the software-set PSR_PAN_BIT to emulate a permission fault and report it accordingly. This patch also updates the description of the synchronous external aborts on translation table walks. Cc: Will Deacon <will.deacon@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> --- arch/arm64/mm/fault.c | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-)