Message ID | 20210726141141.2839385-5-arnd@kernel.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | ARM: remove set_fs callers and implementation | expand |
On Mon, Jul 26, 2021 at 04:11:35PM +0200, Arnd Bergmann wrote: > From: Arnd Bergmann <arnd@arndb.de> > > The system call number is used in a a couple of places, in particular > ptrace, seccomp and /proc/<pid>/syscall. *thread necromancy* Hi! So, it seems like the seccomp selftests broke in a few places due to this change (back in v5.15). I really thought kernelci.org was running the seccomp tests, but it seems like the coverage is spotty. Specifically, the syscall_restart selftest fails, as well as syscall_errno and syscall_faked (both via seccomp and PTRACE), starting with this patch. > The last one apparently never worked reliably on ARM for tasks that are > not currently getting traced. > > Storing the syscall number in the normal entry path makes it work, > as well as allowing us to see if the current system call is for OABI > compat mode, which is the next thing I want to hook into. > > Since the thread_info->syscall field is not just the number any more, it > is now renamed to abi_syscall. In kernels that enable both OABI and EABI, > the upper bits of this field encode 0x900000 (__NR_OABI_SYSCALL_BASE) > for OABI tasks, while normal EABI tasks do not set the upper bits. This > makes it possible to implement the in_oabi_syscall() helper later. > > All other users of thread_info->syscall go through the syscall_get_nr() > helper, which in turn filters out the ABI bits. While I've reproducing the bisect done by mediatek, I'm still poking around in here to figure out what's gone wrong. There was a recent patch to fix this, but it looks like it's not complete: https://lore.kernel.org/all/20230724121655.7894-1-lecopzer.chen@mediatek.com/ With the above applied, syscall_errno and syscall_faked start working again, but not the syscall_restart test. > Note that the ABI information is lost with PTRACE_SET_SYSCALL, so one > cannot set the internal number to a particular version, but this was > already the case. We could change it to let gdb encode the ABI type along > with the syscall in a CONFIG_OABI_COMPAT-enabled kernel, but that itself > would be a (backwards-compatible) ABI change, so I don't do it here. > > Signed-off-by: Arnd Bergmann <arnd@arndb.de> Another issue of note, which may just be "by design" for arm32, is that an invalid syscall (or, at least, a negative syscall) results in SIGILL, rather than a errno=ENOSYS failure. This seems to have been true at least as far back as v5.8 (where this was cleaned up for at least arm64 and s390). There was a seccomp test added for it in v5.9, but it has been failing for arm32 since then. :( I mention this because the behavior of the syscall_restart test looks like an invalid syscall: on restart a SIGILL is caught instead of the syscall correctly continuing. Anyway, I'll keep debugging this, but figured I'd mention it in case anyone else had been seeing issues in here. -Kees
On Thu, Aug 03, 2023 at 04:17:24PM -0700, Kees Cook wrote: > Anyway, I'll keep debugging this, but figured I'd mention it in case > anyone else had been seeing issues in here. Okay, I think I have a working patch now. Sent here: https://lore.kernel.org/lkml/20230804071045.never.134-kees@kernel.org/
On Fri, Aug 4, 2023, at 01:17, Kees Cook wrote: > On Mon, Jul 26, 2021 at 04:11:35PM +0200, Arnd Bergmann wrote: >> From: Arnd Bergmann <arnd@arndb.de> >> >> The system call number is used in a a couple of places, in particular >> ptrace, seccomp and /proc/<pid>/syscall. > > *thread necromancy* > > Hi! > > So, it seems like the seccomp selftests broke in a few places due to > this change (back in v5.15). I really thought kernelci.org was running > the seccomp tests, but it seems like the coverage is spotty. > > Specifically, the syscall_restart selftest fails, as well as syscall_errno > and syscall_faked (both via seccomp and PTRACE), starting with this patch. Thanks for tracking this down! >> The last one apparently never worked reliably on ARM for tasks that are >> not currently getting traced. >> >> Storing the syscall number in the normal entry path makes it work, >> as well as allowing us to see if the current system call is for OABI >> compat mode, which is the next thing I want to hook into. >> >> Since the thread_info->syscall field is not just the number any more, it >> is now renamed to abi_syscall. In kernels that enable both OABI and EABI, >> the upper bits of this field encode 0x900000 (__NR_OABI_SYSCALL_BASE) >> for OABI tasks, while normal EABI tasks do not set the upper bits. This >> makes it possible to implement the in_oabi_syscall() helper later. >> >> All other users of thread_info->syscall go through the syscall_get_nr() >> helper, which in turn filters out the ABI bits. > > While I've reproducing the bisect done by mediatek, I'm still poking > around in here to figure out what's gone wrong. There was a recent patch > to fix this, but it looks like it's not complete: > https://lore.kernel.org/all/20230724121655.7894-1-lecopzer.chen@mediatek.com/ > > With the above applied, syscall_errno and syscall_faked start working > again, but not the syscall_restart test. Right, I also see you addressed this better in your follow-up patch, I'll comment there. >> Note that the ABI information is lost with PTRACE_SET_SYSCALL, so one >> cannot set the internal number to a particular version, but this was >> already the case. We could change it to let gdb encode the ABI type along >> with the syscall in a CONFIG_OABI_COMPAT-enabled kernel, but that itself >> would be a (backwards-compatible) ABI change, so I don't do it here. >> >> Signed-off-by: Arnd Bergmann <arnd@arndb.de> > > Another issue of note, which may just be "by design" for arm32, is that > an invalid syscall (or, at least, a negative syscall) results in SIGILL, > rather than a errno=ENOSYS failure. This seems to have been true at least > as far back as v5.8 (where this was cleaned up for at least arm64 and > s390). There was a seccomp test added for it in v5.9, but it has been > failing for arm32 since then. :( > > I mention this because the behavior of the syscall_restart test looks > like an invalid syscall: on restart a SIGILL is caught instead of the > syscall correctly continuing. The odd arm behavior came up on IRC recently, and I saw that this was what arm has always done, but I could not figure out why this is done. I tried to see where s390 and arm64 changed the behavior but can't find it. Do you have the commit IDs? Arnd
diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h index fd02761ba06c..f055e846a5cc 100644 --- a/arch/arm/include/asm/syscall.h +++ b/arch/arm/include/asm/syscall.h @@ -22,7 +22,10 @@ extern const unsigned long sys_call_table[]; static inline int syscall_get_nr(struct task_struct *task, struct pt_regs *regs) { - return task_thread_info(task)->syscall; + if (IS_ENABLED(CONFIG_AEABI) && !IS_ENABLED(CONFIG_OABI_COMPAT)) + return task_thread_info(task)->abi_syscall; + + return task_thread_info(task)->abi_syscall & __NR_SYSCALL_MASK; } static inline void syscall_rollback(struct task_struct *task, diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h index 70d4cbc49ae1..17c56051747b 100644 --- a/arch/arm/include/asm/thread_info.h +++ b/arch/arm/include/asm/thread_info.h @@ -62,7 +62,7 @@ struct thread_info { unsigned long stack_canary; #endif struct cpu_context_save cpu_context; /* cpu context */ - __u32 syscall; /* syscall number */ + __u32 abi_syscall; /* ABI type and syscall nr */ __u8 used_cp[16]; /* thread used copro */ unsigned long tp_value[2]; /* TLS registers */ #ifdef CONFIG_CRUNCH diff --git a/arch/arm/include/uapi/asm/unistd.h b/arch/arm/include/uapi/asm/unistd.h index ae7749e15726..a1149911464c 100644 --- a/arch/arm/include/uapi/asm/unistd.h +++ b/arch/arm/include/uapi/asm/unistd.h @@ -15,6 +15,7 @@ #define _UAPI__ASM_ARM_UNISTD_H #define __NR_OABI_SYSCALL_BASE 0x900000 +#define __NR_SYSCALL_MASK 0x0fffff #if defined(__thumb__) || defined(__ARM_EABI__) #define __NR_SYSCALL_BASE 0 diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c index 70993af22d80..a0945b898ca3 100644 --- a/arch/arm/kernel/asm-offsets.c +++ b/arch/arm/kernel/asm-offsets.c @@ -48,6 +48,7 @@ int main(void) DEFINE(TI_CPU, offsetof(struct thread_info, cpu)); DEFINE(TI_CPU_DOMAIN, offsetof(struct thread_info, cpu_domain)); DEFINE(TI_CPU_SAVE, offsetof(struct thread_info, cpu_context)); + DEFINE(TI_ABI_SYSCALL, offsetof(struct thread_info, abi_syscall)); DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp)); DEFINE(TI_TP_VALUE, offsetof(struct thread_info, tp_value)); DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate)); diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S index 7f0b7aba1498..e837af90cd44 100644 --- a/arch/arm/kernel/entry-common.S +++ b/arch/arm/kernel/entry-common.S @@ -226,6 +226,7 @@ ENTRY(vector_swi) /* saved_psr and saved_pc are now dead */ uaccess_disable tbl + get_thread_info tsk adr tbl, sys_call_table @ load syscall table pointer @@ -237,13 +238,17 @@ ENTRY(vector_swi) * get the old ABI syscall table address. */ bics r10, r10, #0xff000000 + strne r10, [tsk, #TI_ABI_SYSCALL] + streq scno, [tsk, #TI_ABI_SYSCALL] eorne scno, r10, #__NR_OABI_SYSCALL_BASE ldrne tbl, =sys_oabi_call_table #elif !defined(CONFIG_AEABI) bic scno, scno, #0xff000000 @ mask off SWI op-code + str scno, [tsk, #TI_ABI_SYSCALL] eor scno, scno, #__NR_SYSCALL_BASE @ check OS number +#else + str scno, [tsk, #TI_ABI_SYSCALL] #endif - get_thread_info tsk /* * Reload the registers that may have been corrupted on entry to * the syscall assembly (by tracing or context tracking.) @@ -288,7 +293,6 @@ ENDPROC(vector_swi) * context switches, and waiting for our parent to respond. */ __sys_trace: - mov r1, scno add r0, sp, #S_OFF bl syscall_trace_enter mov scno, r0 diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c index 2771e682220b..d886ea8910cb 100644 --- a/arch/arm/kernel/ptrace.c +++ b/arch/arm/kernel/ptrace.c @@ -25,6 +25,7 @@ #include <linux/tracehook.h> #include <linux/unistd.h> +#include <asm/syscall.h> #include <asm/traps.h> #define CREATE_TRACE_POINTS @@ -811,7 +812,8 @@ long arch_ptrace(struct task_struct *child, long request, break; case PTRACE_SET_SYSCALL: - task_thread_info(child)->syscall = data; + task_thread_info(child)->abi_syscall = data & + __NR_SYSCALL_MASK; ret = 0; break; @@ -880,14 +882,14 @@ static void tracehook_report_syscall(struct pt_regs *regs, if (dir == PTRACE_SYSCALL_EXIT) tracehook_report_syscall_exit(regs, 0); else if (tracehook_report_syscall_entry(regs)) - current_thread_info()->syscall = -1; + current_thread_info()->abi_syscall = -1; regs->ARM_ip = ip; } -asmlinkage int syscall_trace_enter(struct pt_regs *regs, int scno) +asmlinkage int syscall_trace_enter(struct pt_regs *regs) { - current_thread_info()->syscall = scno; + int scno; if (test_thread_flag(TIF_SYSCALL_TRACE)) tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER); @@ -898,11 +900,11 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs, int scno) return -1; #else /* XXX: remove this once OABI gets fixed */ - secure_computing_strict(current_thread_info()->syscall); + secure_computing_strict(syscall_get_nr(current, regs)); #endif /* Tracer or seccomp may have changed syscall. */ - scno = current_thread_info()->syscall; + scno = syscall_get_nr(current, regs); if (test_thread_flag(TIF_SYSCALL_TRACEPOINT)) trace_sys_enter(regs, scno);