Message ID | 20150909193025.GA29244@www.outflux.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wednesday 09 September 2015 12:30:27 Kees Cook wrote: > The syscall ABI is inconsistent on aarch64 compat, so at least we should > document it in the seccomp_bpf tests. > > Signed-off-by: Kees Cook <keescook@chromium.org> Can you explain in what way the ABI is inconsistent here? > --- > Can someone with access to native aarch64 double-check this for me? I > think we need to change these tests to pass if it's expected, but the > compat behavior seems bad. It means compat code will break under an > aarch64 kernel, when dealing with syscalls, like through seccomp. > --- > tools/testing/selftests/seccomp/seccomp_bpf.c | 15 +++++++++++++-- > 1 file changed, 13 insertions(+), 2 deletions(-) > > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c > index 770f47adf295..866ff42e000d 100644 > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c > @@ -33,6 +33,10 @@ > #include <unistd.h> > #include <sys/syscall.h> > > +#if defined(__aarch64__) && !defined(__NR_poll) > +# define __NR_poll 0x49 > +#endif I don't understand this: 0x49 is __NR_ppoll on arm64 and all architectures that use asm-generic/unistd.h, not __NR_poll, which is no longer used there. If this is intentional, it at least needs a comment to explain the situation, and be extended to all other architectures that do not have a poll() system call. The arm32 version of sys_poll should be available as 168 in both native and compat mode. Arnd
On Wed, Sep 9, 2015 at 1:08 PM, Arnd Bergmann <arnd@arndb.de> wrote: > On Wednesday 09 September 2015 12:30:27 Kees Cook wrote: >> The syscall ABI is inconsistent on aarch64 compat, so at least we should >> document it in the seccomp_bpf tests. >> >> Signed-off-by: Kees Cook <keescook@chromium.org> > > Can you explain in what way the ABI is inconsistent here? > >> --- >> Can someone with access to native aarch64 double-check this for me? I >> think we need to change these tests to pass if it's expected, but the >> compat behavior seems bad. It means compat code will break under an >> aarch64 kernel, when dealing with syscalls, like through seccomp. >> --- >> tools/testing/selftests/seccomp/seccomp_bpf.c | 15 +++++++++++++-- >> 1 file changed, 13 insertions(+), 2 deletions(-) >> >> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c >> index 770f47adf295..866ff42e000d 100644 >> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c >> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c >> @@ -33,6 +33,10 @@ >> #include <unistd.h> >> #include <sys/syscall.h> >> >> +#if defined(__aarch64__) && !defined(__NR_poll) >> +# define __NR_poll 0x49 >> +#endif > > I don't understand this: 0x49 is __NR_ppoll on arm64 and all architectures > that use asm-generic/unistd.h, not __NR_poll, which is no longer used there. Ah-ha, okay, that explains part of my struggle. :) > If this is intentional, it at least needs a comment to explain the > situation, and be extended to all other architectures that do not have > a poll() system call. > > The arm32 version of sys_poll should be available as 168 in both native > and compat mode. Does ppoll still get interrupted like poll to require a restart_syscall call? Regardless, the primary problem is this (emphasis added): >> + * - native ARM does _not_ expose true syscall. >> + * - compat ARM on ARM64 _does_ expose true syscall. When you ptrace or seccomp an arm32 binary under and arm32 kernel, restart_syscall is invisible. When you ptrace or seccomp an arm32 binary under and arm64 kernel, suddenly it's visible. This means, for example, seccomp filters will break under an arm64 kernel. (And apologies if I'm not remembering pieces of this correctly, I don't have access to arm64 hardware at the moment, which is why I'm reaching out for some help on this... I'm trying to close out this old thread: https://lkml.org/lkml/2015/1/20/778 ) -Kees
On Wednesday 09 September 2015 13:52:39 Kees Cook wrote: > On Wed, Sep 9, 2015 at 1:08 PM, Arnd Bergmann <arnd@arndb.de> wrote: > > On Wednesday 09 September 2015 12:30:27 Kees Cook wrote: > > If this is intentional, it at least needs a comment to explain the > > situation, and be extended to all other architectures that do not have > > a poll() system call. > > > > The arm32 version of sys_poll should be available as 168 in both native > > and compat mode. > > Does ppoll still get interrupted like poll to require a restart_syscall call? Yes, the only difference between the two is the optional signal mask argument in ppoll. > Regardless, the primary problem is this (emphasis added): > > >> + * - native ARM does _not_ expose true syscall. > >> + * - compat ARM on ARM64 _does_ expose true syscall. > > When you ptrace or seccomp an arm32 binary under and arm32 kernel, > restart_syscall is invisible. When you ptrace or seccomp an arm32 > binary under and arm64 kernel, suddenly it's visible. This means, for > example, seccomp filters will break under an arm64 kernel. Ok, I see. > (And apologies if I'm not remembering pieces of this correctly, I > don't have access to arm64 hardware at the moment, which is why I'm > reaching out for some help on this... I'm trying to close out this old > thread: https://lkml.org/lkml/2015/1/20/778 ) FWIW, it should not be too hard to get an image that runs in qemu-system-arm64. I suppose the same problem exists on all restartable system calls (e.g. nanosleep, select, recvmsg, clock_nanosleep, ...)? I also believe there are various architectures that cannot see the original system call number for a restarted syscall, in particular when the syscall number argument is in the same register as the return code of the syscall and gets overwritten on the way out of the kernel. Is this the problem you are seeing? If so, we should find a solution that works on all such architectures. Arnd
On Wed, Sep 9, 2015 at 2:20 PM, Arnd Bergmann <arnd@arndb.de> wrote: > On Wednesday 09 September 2015 13:52:39 Kees Cook wrote: >> On Wed, Sep 9, 2015 at 1:08 PM, Arnd Bergmann <arnd@arndb.de> wrote: >> > On Wednesday 09 September 2015 12:30:27 Kees Cook wrote: > >> > If this is intentional, it at least needs a comment to explain the >> > situation, and be extended to all other architectures that do not have >> > a poll() system call. >> > >> > The arm32 version of sys_poll should be available as 168 in both native >> > and compat mode. >> >> Does ppoll still get interrupted like poll to require a restart_syscall call? > > Yes, the only difference between the two is the optional signal mask > argument in ppoll. Okay, good. I'll respin the patch to use ppoll (which was Bamvor's original suggestion to me, IIRC). >> Regardless, the primary problem is this (emphasis added): >> >> >> + * - native ARM does _not_ expose true syscall. >> >> + * - compat ARM on ARM64 _does_ expose true syscall. >> >> When you ptrace or seccomp an arm32 binary under and arm32 kernel, >> restart_syscall is invisible. When you ptrace or seccomp an arm32 >> binary under and arm64 kernel, suddenly it's visible. This means, for >> example, seccomp filters will break under an arm64 kernel. > > Ok, I see. > >> (And apologies if I'm not remembering pieces of this correctly, I >> don't have access to arm64 hardware at the moment, which is why I'm >> reaching out for some help on this... I'm trying to close out this old >> thread: https://lkml.org/lkml/2015/1/20/778 ) > > FWIW, it should not be too hard to get an image that runs in > qemu-system-arm64. Yeah, that's my next project on this front. > I suppose the same problem exists on all restartable system calls > (e.g. nanosleep, select, recvmsg, clock_nanosleep, ...)? As far as I know, yes. I just happened to set up the testing around poll as it was easiest to trigger for me. > I also believe there are various architectures that cannot see the > original system call number for a restarted syscall, in particular > when the syscall number argument is in the same register as the > return code of the syscall and gets overwritten on the way out of > the kernel. Is this the problem you are seeing? If so, we should > find a solution that works on all such architectures. I don't think there's any one way things operate, unfortunately. I need to map out the behaviors, since sometimes ptrace sees things differently from seccomp (which leads to no end of confusion on my part). I will try to generate a comparison across several architectures. -Kees
On 09/10/2015 04:30 AM, Kees Cook wrote: > The syscall ABI is inconsistent on aarch64 compat, so at least we should > document it in the seccomp_bpf tests. > > Signed-off-by: Kees Cook <keescook@chromium.org> > --- > Can someone with access to native aarch64 double-check this for me? I > think we need to change these tests to pass if it's expected, but the > compat behavior seems bad. It means compat code will break under an > aarch64 kernel, when dealing with syscalls, like through seccomp. > --- > tools/testing/selftests/seccomp/seccomp_bpf.c | 15 +++++++++++++-- > 1 file changed, 13 insertions(+), 2 deletions(-) > > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c > index 770f47adf295..866ff42e000d 100644 > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c > @@ -33,6 +33,10 @@ > #include <unistd.h> > #include <sys/syscall.h> > > +#if defined(__aarch64__) && !defined(__NR_poll) > +# define __NR_poll 0x49 > +#endif > + > #include "test_harness.h" > > #ifndef PR_SET_PTRACER > @@ -2124,10 +2128,17 @@ TEST(syscall_restart) > ASSERT_EQ(SIGTRAP, WSTOPSIG(status)); > ASSERT_EQ(PTRACE_EVENT_SECCOMP, (status >> 16)); > ASSERT_EQ(0, ptrace(PTRACE_GETEVENTMSG, child_pid, NULL, &msg)); > - ASSERT_EQ(0x200, msg); > + > + /* > + * FIXME: > + * - native ARM does not expose true syscall. > + * - compat ARM on ARM64 does expose true syscall. > + * - native ARM64 hides true syscall even from seccomp. Are you sure about the last line? The kernel pushes __NR_compat_restart_syscall to w7 in compat mode, while __NR_restart_syscall to x8 in native mode. But it is the only difference, as far as I understand, in terms of restarting a system call. So the behavior should be basically the same. -Takahiro AKASHI > + */ > + ASSERT_EQ(0x200, msg); /* This will fail on native arm64. */ > ret = get_syscall(_metadata, child_pid); > #if defined(__arm__) > - /* FIXME: ARM does not expose true syscall in registers. */ > + /* This will fail on arm64 in compat mode. */ > EXPECT_EQ(__NR_poll, ret); > #else > EXPECT_EQ(__NR_restart_syscall, ret); >
On Thu, Sep 10, 2015 at 3:35 AM, AKASHI Takahiro <takahiro.akashi@linaro.org> wrote: > On 09/10/2015 04:30 AM, Kees Cook wrote: >> >> The syscall ABI is inconsistent on aarch64 compat, so at least we should >> document it in the seccomp_bpf tests. >> >> Signed-off-by: Kees Cook <keescook@chromium.org> >> --- >> Can someone with access to native aarch64 double-check this for me? I >> think we need to change these tests to pass if it's expected, but the >> compat behavior seems bad. It means compat code will break under an >> aarch64 kernel, when dealing with syscalls, like through seccomp. >> --- >> tools/testing/selftests/seccomp/seccomp_bpf.c | 15 +++++++++++++-- >> 1 file changed, 13 insertions(+), 2 deletions(-) >> >> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c >> b/tools/testing/selftests/seccomp/seccomp_bpf.c >> index 770f47adf295..866ff42e000d 100644 >> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c >> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c >> @@ -33,6 +33,10 @@ >> #include <unistd.h> >> #include <sys/syscall.h> >> >> +#if defined(__aarch64__) && !defined(__NR_poll) >> +# define __NR_poll 0x49 >> +#endif >> + >> #include "test_harness.h" >> >> #ifndef PR_SET_PTRACER >> @@ -2124,10 +2128,17 @@ TEST(syscall_restart) >> ASSERT_EQ(SIGTRAP, WSTOPSIG(status)); >> ASSERT_EQ(PTRACE_EVENT_SECCOMP, (status >> 16)); >> ASSERT_EQ(0, ptrace(PTRACE_GETEVENTMSG, child_pid, NULL, &msg)); >> - ASSERT_EQ(0x200, msg); >> + >> + /* >> + * FIXME: >> + * - native ARM does not expose true syscall. >> + * - compat ARM on ARM64 does expose true syscall. >> + * - native ARM64 hides true syscall even from seccomp. > > > Are you sure about the last line? > The kernel pushes __NR_compat_restart_syscall to w7 in compat mode, while > __NR_restart_syscall to x8 in native mode. But it is the only difference, > as far as I understand, in terms of restarting a system call. > So the behavior should be basically the same. I've spent a little more time looking at this, and no, that last statement is wrong. My current testing shows that seccomp always sees the "true" syscall, which is what we want (matches all other architectures). However, the other two statements continue to appear true: there is a register visibility difference. During the restart_syscall syscall, this routine behaves different on native arm and compat arm: int get_syscall(pid_t tracee) { struct iovec iov; struct pt_regs regs; iov.iov_base = ®s; iov.iov_len = sizeof(regs); assert(0 == ptrace(PTRACE_GETREGSET, tracee, NT_PRSTATUS, &iov)); return regs.ARM_r7; } On native ARM, this returns the restarted syscall (in my test case, nanosleep). On compat ARM, this returns restart_syscall syscall. Either we should fix native ARM to do what all other architectures do (show restart_syscall syscall) or we should fix compat ARM to behave like native ARM. -Kees
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c index 770f47adf295..866ff42e000d 100644 --- a/tools/testing/selftests/seccomp/seccomp_bpf.c +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c @@ -33,6 +33,10 @@ #include <unistd.h> #include <sys/syscall.h> +#if defined(__aarch64__) && !defined(__NR_poll) +# define __NR_poll 0x49 +#endif + #include "test_harness.h" #ifndef PR_SET_PTRACER @@ -2124,10 +2128,17 @@ TEST(syscall_restart) ASSERT_EQ(SIGTRAP, WSTOPSIG(status)); ASSERT_EQ(PTRACE_EVENT_SECCOMP, (status >> 16)); ASSERT_EQ(0, ptrace(PTRACE_GETEVENTMSG, child_pid, NULL, &msg)); - ASSERT_EQ(0x200, msg); + + /* + * FIXME: + * - native ARM does not expose true syscall. + * - compat ARM on ARM64 does expose true syscall. + * - native ARM64 hides true syscall even from seccomp. + */ + ASSERT_EQ(0x200, msg); /* This will fail on native arm64. */ ret = get_syscall(_metadata, child_pid); #if defined(__arm__) - /* FIXME: ARM does not expose true syscall in registers. */ + /* This will fail on arm64 in compat mode. */ EXPECT_EQ(__NR_poll, ret); #else EXPECT_EQ(__NR_restart_syscall, ret);
The syscall ABI is inconsistent on aarch64 compat, so at least we should document it in the seccomp_bpf tests. Signed-off-by: Kees Cook <keescook@chromium.org> --- Can someone with access to native aarch64 double-check this for me? I think we need to change these tests to pass if it's expected, but the compat behavior seems bad. It means compat code will break under an aarch64 kernel, when dealing with syscalls, like through seccomp. --- tools/testing/selftests/seccomp/seccomp_bpf.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)