diff mbox

selftests/seccomp: build on aarch64, document ABI

Message ID 20150909193025.GA29244@www.outflux.net (mailing list archive)
State New, archived
Headers show

Commit Message

Kees Cook Sept. 9, 2015, 7:30 p.m. UTC
The syscall ABI is inconsistent on aarch64 compat, so at least we should
document it in the seccomp_bpf tests.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
Can someone with access to native aarch64 double-check this for me? I
think we need to change these tests to pass if it's expected, but the
compat behavior seems bad. It means compat code will break under an
aarch64 kernel, when dealing with syscalls, like through seccomp.
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

Comments

Arnd Bergmann Sept. 9, 2015, 8:08 p.m. UTC | #1
On Wednesday 09 September 2015 12:30:27 Kees Cook wrote:
> The syscall ABI is inconsistent on aarch64 compat, so at least we should
> document it in the seccomp_bpf tests.
> 
> Signed-off-by: Kees Cook <keescook@chromium.org>

Can you explain in what way the ABI is inconsistent here?

> ---
> Can someone with access to native aarch64 double-check this for me? I
> think we need to change these tests to pass if it's expected, but the
> compat behavior seems bad. It means compat code will break under an
> aarch64 kernel, when dealing with syscalls, like through seccomp.
> ---
>  tools/testing/selftests/seccomp/seccomp_bpf.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 770f47adf295..866ff42e000d 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -33,6 +33,10 @@
>  #include <unistd.h>
>  #include <sys/syscall.h>
>  
> +#if defined(__aarch64__) && !defined(__NR_poll)
> +# define __NR_poll 0x49
> +#endif

I don't understand this: 0x49 is __NR_ppoll on arm64 and all architectures
that use asm-generic/unistd.h, not __NR_poll, which is no longer used there.

If this is intentional, it at least needs a comment to explain the
situation, and be extended to all other architectures that do not have
a poll() system call.

The arm32 version of sys_poll should be available as 168 in both native
and compat mode.

	Arnd
Kees Cook Sept. 9, 2015, 8:52 p.m. UTC | #2
On Wed, Sep 9, 2015 at 1:08 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday 09 September 2015 12:30:27 Kees Cook wrote:
>> The syscall ABI is inconsistent on aarch64 compat, so at least we should
>> document it in the seccomp_bpf tests.
>>
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>
> Can you explain in what way the ABI is inconsistent here?
>
>> ---
>> Can someone with access to native aarch64 double-check this for me? I
>> think we need to change these tests to pass if it's expected, but the
>> compat behavior seems bad. It means compat code will break under an
>> aarch64 kernel, when dealing with syscalls, like through seccomp.
>> ---
>>  tools/testing/selftests/seccomp/seccomp_bpf.c | 15 +++++++++++++--
>>  1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
>> index 770f47adf295..866ff42e000d 100644
>> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
>> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
>> @@ -33,6 +33,10 @@
>>  #include <unistd.h>
>>  #include <sys/syscall.h>
>>
>> +#if defined(__aarch64__) && !defined(__NR_poll)
>> +# define __NR_poll 0x49
>> +#endif
>
> I don't understand this: 0x49 is __NR_ppoll on arm64 and all architectures
> that use asm-generic/unistd.h, not __NR_poll, which is no longer used there.

Ah-ha, okay, that explains part of my struggle. :)

> If this is intentional, it at least needs a comment to explain the
> situation, and be extended to all other architectures that do not have
> a poll() system call.
>
> The arm32 version of sys_poll should be available as 168 in both native
> and compat mode.

Does ppoll still get interrupted like poll to require a restart_syscall call?

Regardless, the primary problem is this (emphasis added):

>> +        * - native ARM does _not_ expose true syscall.
>> +        * - compat ARM on ARM64 _does_ expose true syscall.

When you ptrace or seccomp an arm32 binary under and arm32 kernel,
restart_syscall is invisible. When you ptrace or seccomp an arm32
binary under and arm64 kernel, suddenly it's visible. This means, for
example, seccomp filters will break under an arm64 kernel.

(And apologies if I'm not remembering pieces of this correctly, I
don't have access to arm64 hardware at the moment, which is why I'm
reaching out for some help on this... I'm trying to close out this old
thread: https://lkml.org/lkml/2015/1/20/778 )

-Kees
Arnd Bergmann Sept. 9, 2015, 9:20 p.m. UTC | #3
On Wednesday 09 September 2015 13:52:39 Kees Cook wrote:
> On Wed, Sep 9, 2015 at 1:08 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Wednesday 09 September 2015 12:30:27 Kees Cook wrote:

> > If this is intentional, it at least needs a comment to explain the
> > situation, and be extended to all other architectures that do not have
> > a poll() system call.
> >
> > The arm32 version of sys_poll should be available as 168 in both native
> > and compat mode.
> 
> Does ppoll still get interrupted like poll to require a restart_syscall call?

Yes, the only difference between the two is the optional signal mask
argument in ppoll.

> Regardless, the primary problem is this (emphasis added):
> 
> >> +        * - native ARM does _not_ expose true syscall.
> >> +        * - compat ARM on ARM64 _does_ expose true syscall.
> 
> When you ptrace or seccomp an arm32 binary under and arm32 kernel,
> restart_syscall is invisible. When you ptrace or seccomp an arm32
> binary under and arm64 kernel, suddenly it's visible. This means, for
> example, seccomp filters will break under an arm64 kernel.

Ok, I see.

> (And apologies if I'm not remembering pieces of this correctly, I
> don't have access to arm64 hardware at the moment, which is why I'm
> reaching out for some help on this... I'm trying to close out this old
> thread: https://lkml.org/lkml/2015/1/20/778 )

FWIW, it should not be too hard to get an image that runs in
qemu-system-arm64.

I suppose the same problem exists on all restartable system calls
(e.g. nanosleep, select, recvmsg, clock_nanosleep, ...)?

I also believe there are various architectures that cannot see the
original system call number for a restarted syscall, in particular
when the syscall number argument is in the same register as the
return code of the syscall and gets overwritten on the way out of 
the kernel. Is this the problem you are seeing? If so, we should
find a solution that works on all such architectures.

	Arnd
Kees Cook Sept. 9, 2015, 10:03 p.m. UTC | #4
On Wed, Sep 9, 2015 at 2:20 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday 09 September 2015 13:52:39 Kees Cook wrote:
>> On Wed, Sep 9, 2015 at 1:08 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>> > On Wednesday 09 September 2015 12:30:27 Kees Cook wrote:
>
>> > If this is intentional, it at least needs a comment to explain the
>> > situation, and be extended to all other architectures that do not have
>> > a poll() system call.
>> >
>> > The arm32 version of sys_poll should be available as 168 in both native
>> > and compat mode.
>>
>> Does ppoll still get interrupted like poll to require a restart_syscall call?
>
> Yes, the only difference between the two is the optional signal mask
> argument in ppoll.

Okay, good. I'll respin the patch to use ppoll (which was Bamvor's
original suggestion to me, IIRC).

>> Regardless, the primary problem is this (emphasis added):
>>
>> >> +        * - native ARM does _not_ expose true syscall.
>> >> +        * - compat ARM on ARM64 _does_ expose true syscall.
>>
>> When you ptrace or seccomp an arm32 binary under and arm32 kernel,
>> restart_syscall is invisible. When you ptrace or seccomp an arm32
>> binary under and arm64 kernel, suddenly it's visible. This means, for
>> example, seccomp filters will break under an arm64 kernel.
>
> Ok, I see.
>
>> (And apologies if I'm not remembering pieces of this correctly, I
>> don't have access to arm64 hardware at the moment, which is why I'm
>> reaching out for some help on this... I'm trying to close out this old
>> thread: https://lkml.org/lkml/2015/1/20/778 )
>
> FWIW, it should not be too hard to get an image that runs in
> qemu-system-arm64.

Yeah, that's my next project on this front.

> I suppose the same problem exists on all restartable system calls
> (e.g. nanosleep, select, recvmsg, clock_nanosleep, ...)?

As far as I know, yes. I just happened to set up the testing around
poll as it was easiest to trigger for me.

> I also believe there are various architectures that cannot see the
> original system call number for a restarted syscall, in particular
> when the syscall number argument is in the same register as the
> return code of the syscall and gets overwritten on the way out of
> the kernel. Is this the problem you are seeing? If so, we should
> find a solution that works on all such architectures.

I don't think there's any one way things operate, unfortunately. I
need to map out the behaviors, since sometimes ptrace sees things
differently from seccomp (which leads to no end of confusion on my
part). I will try to generate a comparison across several
architectures.

-Kees
AKASHI Takahiro Sept. 10, 2015, 10:35 a.m. UTC | #5
On 09/10/2015 04:30 AM, Kees Cook wrote:
> The syscall ABI is inconsistent on aarch64 compat, so at least we should
> document it in the seccomp_bpf tests.
>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
> Can someone with access to native aarch64 double-check this for me? I
> think we need to change these tests to pass if it's expected, but the
> compat behavior seems bad. It means compat code will break under an
> aarch64 kernel, when dealing with syscalls, like through seccomp.
> ---
>   tools/testing/selftests/seccomp/seccomp_bpf.c | 15 +++++++++++++--
>   1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 770f47adf295..866ff42e000d 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -33,6 +33,10 @@
>   #include <unistd.h>
>   #include <sys/syscall.h>
>
> +#if defined(__aarch64__) && !defined(__NR_poll)
> +# define __NR_poll 0x49
> +#endif
> +
>   #include "test_harness.h"
>
>   #ifndef PR_SET_PTRACER
> @@ -2124,10 +2128,17 @@ TEST(syscall_restart)
>   	ASSERT_EQ(SIGTRAP, WSTOPSIG(status));
>   	ASSERT_EQ(PTRACE_EVENT_SECCOMP, (status >> 16));
>   	ASSERT_EQ(0, ptrace(PTRACE_GETEVENTMSG, child_pid, NULL, &msg));
> -	ASSERT_EQ(0x200, msg);
> +
> +	/*
> +	 * FIXME:
> +	 * - native ARM does not expose true syscall.
> +	 * - compat ARM on ARM64 does expose true syscall.
> +	 * - native ARM64 hides true syscall even from seccomp.

Are you sure about the last line?
The kernel pushes __NR_compat_restart_syscall to w7 in compat mode, while
__NR_restart_syscall to x8 in native mode. But it is the only difference,
as far as I understand, in terms of restarting a system call.
So the behavior should be basically the same.

-Takahiro AKASHI

> +	 */
> +	ASSERT_EQ(0x200, msg);	/* This will fail on native arm64. */
>   	ret = get_syscall(_metadata, child_pid);
>   #if defined(__arm__)
> -	/* FIXME: ARM does not expose true syscall in registers. */
> +	/* This will fail on arm64 in compat mode. */
>   	EXPECT_EQ(__NR_poll, ret);
>   #else
>   	EXPECT_EQ(__NR_restart_syscall, ret);
>
Kees Cook Oct. 6, 2015, 5:42 p.m. UTC | #6
On Thu, Sep 10, 2015 at 3:35 AM, AKASHI Takahiro
<takahiro.akashi@linaro.org> wrote:
> On 09/10/2015 04:30 AM, Kees Cook wrote:
>>
>> The syscall ABI is inconsistent on aarch64 compat, so at least we should
>> document it in the seccomp_bpf tests.
>>
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>> ---
>> Can someone with access to native aarch64 double-check this for me? I
>> think we need to change these tests to pass if it's expected, but the
>> compat behavior seems bad. It means compat code will break under an
>> aarch64 kernel, when dealing with syscalls, like through seccomp.
>> ---
>>   tools/testing/selftests/seccomp/seccomp_bpf.c | 15 +++++++++++++--
>>   1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c
>> b/tools/testing/selftests/seccomp/seccomp_bpf.c
>> index 770f47adf295..866ff42e000d 100644
>> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
>> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
>> @@ -33,6 +33,10 @@
>>   #include <unistd.h>
>>   #include <sys/syscall.h>
>>
>> +#if defined(__aarch64__) && !defined(__NR_poll)
>> +# define __NR_poll 0x49
>> +#endif
>> +
>>   #include "test_harness.h"
>>
>>   #ifndef PR_SET_PTRACER
>> @@ -2124,10 +2128,17 @@ TEST(syscall_restart)
>>         ASSERT_EQ(SIGTRAP, WSTOPSIG(status));
>>         ASSERT_EQ(PTRACE_EVENT_SECCOMP, (status >> 16));
>>         ASSERT_EQ(0, ptrace(PTRACE_GETEVENTMSG, child_pid, NULL, &msg));
>> -       ASSERT_EQ(0x200, msg);
>> +
>> +       /*
>> +        * FIXME:
>> +        * - native ARM does not expose true syscall.
>> +        * - compat ARM on ARM64 does expose true syscall.
>> +        * - native ARM64 hides true syscall even from seccomp.
>
>
> Are you sure about the last line?
> The kernel pushes __NR_compat_restart_syscall to w7 in compat mode, while
> __NR_restart_syscall to x8 in native mode. But it is the only difference,
> as far as I understand, in terms of restarting a system call.
> So the behavior should be basically the same.

I've spent a little more time looking at this, and no, that last
statement is wrong. My current testing shows that seccomp always sees
the "true" syscall, which is what we want (matches all other
architectures).

However, the other two statements continue to appear true: there is a
register visibility difference. During the restart_syscall syscall,
this routine behaves different on native arm and compat arm:

int get_syscall(pid_t tracee)
{
        struct iovec iov;
        struct pt_regs regs;

        iov.iov_base = &regs;
        iov.iov_len = sizeof(regs);
        assert(0 == ptrace(PTRACE_GETREGSET, tracee, NT_PRSTATUS, &iov));

        return regs.ARM_r7;
}

On native ARM, this returns the restarted syscall (in my test case,
nanosleep). On compat ARM, this returns restart_syscall syscall.

Either we should fix native ARM to do what all other architectures do
(show restart_syscall syscall) or we should fix compat ARM to behave
like native ARM.

-Kees
diff mbox

Patch

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 770f47adf295..866ff42e000d 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -33,6 +33,10 @@ 
 #include <unistd.h>
 #include <sys/syscall.h>
 
+#if defined(__aarch64__) && !defined(__NR_poll)
+# define __NR_poll 0x49
+#endif
+
 #include "test_harness.h"
 
 #ifndef PR_SET_PTRACER
@@ -2124,10 +2128,17 @@  TEST(syscall_restart)
 	ASSERT_EQ(SIGTRAP, WSTOPSIG(status));
 	ASSERT_EQ(PTRACE_EVENT_SECCOMP, (status >> 16));
 	ASSERT_EQ(0, ptrace(PTRACE_GETEVENTMSG, child_pid, NULL, &msg));
-	ASSERT_EQ(0x200, msg);
+
+	/*
+	 * FIXME:
+	 * - native ARM does not expose true syscall.
+	 * - compat ARM on ARM64 does expose true syscall.
+	 * - native ARM64 hides true syscall even from seccomp.
+	 */
+	ASSERT_EQ(0x200, msg);	/* This will fail on native arm64. */
 	ret = get_syscall(_metadata, child_pid);
 #if defined(__arm__)
-	/* FIXME: ARM does not expose true syscall in registers. */
+	/* This will fail on arm64 in compat mode. */
 	EXPECT_EQ(__NR_poll, ret);
 #else
 	EXPECT_EQ(__NR_restart_syscall, ret);