Message ID | 20220919103845.100809-1-wangkefeng.wang@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] ARM: mm: Provide better fault message for permission fault | expand |
On Mon, Sep 19, 2022 at 06:38:45PM +0800, Kefeng Wang wrote: > If there is a permission fault in __do_kernel_fault(), we only > print the generic "paging request" message which don't show > read, write or excute information, let's provide better fault > message for them. I don't like this change. With CPUs that do not have the ability to relocate the vectors to 0xffff0000, the vectors live at address 0, so NULL pointer dereferences can produce permission faults. I would much rather we did something similar to what x86 does: pr_alert("#PF: %s %s in %s mode\n", (error_code & X86_PF_USER) ? "user" : "supervisor", (error_code & X86_PF_INSTR) ? "instruction fetch" : (error_code & X86_PF_WRITE) ? "write access" : "read access", user_mode(regs) ? "user" : "kernel"); As we already print whether we're in user or kernel mode in the register dump, there's no need to repeat that. I think we just need an extra line to decode the FSR PF and write bits.
On 2022/9/26 18:13, Russell King (Oracle) wrote: > On Mon, Sep 19, 2022 at 06:38:45PM +0800, Kefeng Wang wrote: >> If there is a permission fault in __do_kernel_fault(), we only >> print the generic "paging request" message which don't show >> read, write or excute information, let's provide better fault >> message for them. > I don't like this change. With CPUs that do not have the ability to > relocate the vectors to 0xffff0000, the vectors live at address 0, > so NULL pointer dereferences can produce permission faults. The __do_user_fault(), do_DataAbort() and do_PrefetchAbort() shows the FSR when printing, we could do it in die_kernel_fault(), and which will be easy for us to check whether the page fault is permision fault, --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -111,8 +111,8 @@ static void die_kernel_fault(const char *msg, struct mm_struct *mm, { bust_spinlocks(1); pr_alert("8<--- cut here ---\n"); - pr_alert("Unable to handle kernel %s at virtual address %08lx\n", - msg, addr); + pr_alert("Unable to handle kernel %s (0x%08x) at virtual address %08lx\n", + msg, fsr, addr); show_pte(KERN_ALERT, mm, addr); die("Oops", regs, fsr); or, > I would much rather we did something similar to what x86 does: > > pr_alert("#PF: %s %s in %s mode\n", > (error_code & X86_PF_USER) ? "user" : "supervisor", > (error_code & X86_PF_INSTR) ? "instruction fetch" : > (error_code & X86_PF_WRITE) ? "write access" : > "read access", > user_mode(regs) ? "user" : "kernel"); > > As we already print whether we're in user or kernel mode in the > register dump, there's no need to repeat that. I think we just > need an extra line to decode the FSR PF and write bits. We could decode the FSR register, diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index 46cccd6bf705..406e0210c3c5 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -113,6 +113,10 @@ static void die_kernel_fault(const char *msg, struct mm_struct *mm, pr_alert("8<--- cut here ---\n"); pr_alert("Unable to handle kernel %s at virtual address %08lx\n", msg, addr); + pr_alert("FSR: 0x%08x, LNX_PF = %u, CM = %u, WnR = %u\n", fsr, + (fsr & FSR_LNX_PF) >> FSR_LNX_PF_SHIFT, + (fsr & FSR_CM) >> FSR_CM_SHIFT, + (fsr & FSR_WRITE) >> FSR_WRITE_SHIFT); show_pte(KERN_ALERT, mm, addr); die("Oops", regs, fsr); diff --git a/arch/arm/mm/fault.h b/arch/arm/mm/fault.h index 83b5ab32d7a4..18f882aa2b32 100644 --- a/arch/arm/mm/fault.h +++ b/arch/arm/mm/fault.h @@ -5,9 +5,12 @@ /* * Fault status register encodings. We steal bit 31 for our own purposes. */ -#define FSR_LNX_PF (1 << 31) -#define FSR_CM (1 << 13) -#define FSR_WRITE (1 << 11) +#define FSR_LNX_PF_SHIFT (31) +#define FSR_LNX_PF (1 << FSR_LNX_PF_SHIFT) +#define FSR_CM_SHIFT (13) +#define FSR_CM (1 << FSR_CM_SHIFT) +#define FSR_WRITE_SHIFT (11) +#define FSR_WRITE (1 << FSR_WRITE_SHIFT) #define FSR_FS4 (1 << 10) #define FSR_FS3_0 (15) #define FSR_FS5_0 (0x3f) What's your option ? >
On Mon, Sep 26, 2022 at 09:26:50PM +0800, Kefeng Wang wrote: > > On 2022/9/26 18:13, Russell King (Oracle) wrote: > > On Mon, Sep 19, 2022 at 06:38:45PM +0800, Kefeng Wang wrote: > > > If there is a permission fault in __do_kernel_fault(), we only > > > print the generic "paging request" message which don't show > > > read, write or excute information, let's provide better fault > > > message for them. > > I don't like this change. With CPUs that do not have the ability to > > relocate the vectors to 0xffff0000, the vectors live at address 0, > > so NULL pointer dereferences can produce permission faults. > The __do_user_fault(), do_DataAbort() and do_PrefetchAbort() shows > the FSR when printing, we could do it in die_kernel_fault(), and > which will be easy for us to check whether the page fault is permision > fault, We print the hex value already in the oops: Internal error: Oops: (fsr hex value) [#num] ...
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index 46cccd6bf705..387c87112d80 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -105,6 +105,19 @@ static inline bool is_write_fault(unsigned int fsr) return (fsr & FSR_WRITE) && !(fsr & FSR_CM); } +static inline bool is_permission_fault(unsigned int fsr) +{ + int fs = fsr_fs(fsr); +#ifdef CONFIG_ARM_LPAE + if ((fs & FS_PERM_NOLL_MASK) == FS_PERM_NOLL) + return true; +#else + if (fs == FS_L1_PERM || fs == FS_L2_PERM) + return true; +#endif + return false; +} + static void die_kernel_fault(const char *msg, struct mm_struct *mm, unsigned long addr, unsigned int fsr, struct pt_regs *regs) @@ -137,7 +150,14 @@ __do_kernel_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr, /* * No handler, we'll have to terminate things with extreme prejudice. */ - if (addr < PAGE_SIZE) { + if (is_permission_fault(fsr)) { + if (fsr & FSR_WRITE) + msg = "write to read-only memory"; + else if (fsr & FSR_LNX_PF) + msg = "execute from non-executable memory"; + else + msg = "read from unreadable memory"; + } else if (addr < PAGE_SIZE) { msg = "NULL pointer dereference"; } else { if (kfence_handle_page_fault(addr, is_write_fault(fsr), regs)) @@ -204,19 +224,6 @@ void do_bad_area(unsigned long addr, unsigned int fsr, struct pt_regs *regs) #define VM_FAULT_BADMAP ((__force vm_fault_t)0x010000) #define VM_FAULT_BADACCESS ((__force vm_fault_t)0x020000) -static inline bool is_permission_fault(unsigned int fsr) -{ - int fs = fsr_fs(fsr); -#ifdef CONFIG_ARM_LPAE - if ((fs & FS_PERM_NOLL_MASK) == FS_PERM_NOLL) - return true; -#else - if (fs == FS_L1_PERM || fs == FS_L2_PERM) - return true; -#endif - return false; -} - static vm_fault_t __kprobes __do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int flags, unsigned long vma_flags, struct pt_regs *regs)
If there is a permission fault in __do_kernel_fault(), we only print the generic "paging request" message which don't show read, write or excute information, let's provide better fault message for them. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> --- v2: fix !CONFIG_MMU built arch/arm/mm/fault.c | 35 +++++++++++++++++++++-------------- 1 file changed, 21 insertions(+), 14 deletions(-)