diff mbox

[v2] seccomp: not compatible with ARM OABI

Message ID 20131107174746.GA22344@www.outflux.net (mailing list archive)
State New, archived
Headers show

Commit Message

Kees Cook Nov. 7, 2013, 5:47 p.m. UTC
Make sure that seccomp filter won't be built when ARM OABI is in use,
since there is work needed to distinguish calling conventions. Until
that is done (which is likely never since OABI is deprecated), make
sure seccomp filter is unavailable in the OABI world.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
v2:
 - toggle availability via HAVE_ARCH_SECCOMP_FILTER; James Hogan.
---
 arch/arm/Kconfig |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Andy Lutomirski Nov. 7, 2013, 6:16 p.m. UTC | #1
On Thu, Nov 7, 2013 at 9:47 AM, Kees Cook <keescook@chromium.org> wrote:
> Make sure that seccomp filter won't be built when ARM OABI is in use,
> since there is work needed to distinguish calling conventions. Until
> that is done (which is likely never since OABI is deprecated), make
> sure seccomp filter is unavailable in the OABI world.
>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
> v2:
>  - toggle availability via HAVE_ARCH_SECCOMP_FILTER; James Hogan.
> ---
>  arch/arm/Kconfig |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 0a1dc697333c..a0a8590f3609 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -23,7 +23,7 @@ config ARM
>         select HARDIRQS_SW_RESEND
>         select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL
>         select HAVE_ARCH_KGDB
> -       select HAVE_ARCH_SECCOMP_FILTER
> +       select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT)
>         select HAVE_ARCH_TRACEHOOK
>         select HAVE_BPF_JIT
>         select HAVE_CONTEXT_TRACKING
> @@ -1735,6 +1735,11 @@ config OABI_COMPAT
>           in memory differs between the legacy ABI and the new ARM EABI
>           (only for non "thumb" binaries). This option adds a tiny
>           overhead to all syscalls and produces a slightly larger kernel.
> +
> +         The seccomp filter system will not be available when this is
> +         selected, since there is no way yet to sensibly distinguish
> +         between calling conventions during filtering.
> +
>           If you know you'll be using only pure EABI user space then you
>           can say N here. If this option is not selected and you attempt
>           to execute a legacy ABI binary then the result will be
> --
> 1.7.9.5
>
>

FWIW, OABI-only (i.e. !AEABI, as opposed to OABI_COMPAT) is, in
principle, supportable -- userspace would just have to know that, if
build for OABI, the calling convention is different.

I doubt this is worth supporting, though, and, if no one complains
about your patch for a couple releases, then that would mean we could
get away with adding AUDIT_ARCH_ARM_OABI or something (maybe for
seccomp only) if needed.

--Andy

> --
> Kees Cook
> Chrome OS Security
Kees Cook Nov. 7, 2013, 6:39 p.m. UTC | #2
On Thu, Nov 7, 2013 at 10:16 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Thu, Nov 7, 2013 at 9:47 AM, Kees Cook <keescook@chromium.org> wrote:
>> Make sure that seccomp filter won't be built when ARM OABI is in use,
>> since there is work needed to distinguish calling conventions. Until
>> that is done (which is likely never since OABI is deprecated), make
>> sure seccomp filter is unavailable in the OABI world.
>>
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>> ---
>> v2:
>>  - toggle availability via HAVE_ARCH_SECCOMP_FILTER; James Hogan.
>> ---
>>  arch/arm/Kconfig |    7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>> index 0a1dc697333c..a0a8590f3609 100644
>> --- a/arch/arm/Kconfig
>> +++ b/arch/arm/Kconfig
>> @@ -23,7 +23,7 @@ config ARM
>>         select HARDIRQS_SW_RESEND
>>         select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL
>>         select HAVE_ARCH_KGDB
>> -       select HAVE_ARCH_SECCOMP_FILTER
>> +       select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT)
>>         select HAVE_ARCH_TRACEHOOK
>>         select HAVE_BPF_JIT
>>         select HAVE_CONTEXT_TRACKING
>> @@ -1735,6 +1735,11 @@ config OABI_COMPAT
>>           in memory differs between the legacy ABI and the new ARM EABI
>>           (only for non "thumb" binaries). This option adds a tiny
>>           overhead to all syscalls and produces a slightly larger kernel.
>> +
>> +         The seccomp filter system will not be available when this is
>> +         selected, since there is no way yet to sensibly distinguish
>> +         between calling conventions during filtering.
>> +
>>           If you know you'll be using only pure EABI user space then you
>>           can say N here. If this option is not selected and you attempt
>>           to execute a legacy ABI binary then the result will be
>> --
>> 1.7.9.5
>>
>>
>
> FWIW, OABI-only (i.e. !AEABI, as opposed to OABI_COMPAT) is, in
> principle, supportable -- userspace would just have to know that, if
> build for OABI, the calling convention is different.

Right -- I opted for enforcing seccomp-on-ARM-means-EABI.

> I doubt this is worth supporting, though, and, if no one complains
> about your patch for a couple releases, then that would mean we could
> get away with adding AUDIT_ARCH_ARM_OABI or something (maybe for
> seccomp only) if needed.

Agreed.

Thanks again for looking at all this!

-Kees
Eric Paris Nov. 7, 2013, 6:56 p.m. UTC | #3
On Thu, 2013-11-07 at 10:39 -0800, Kees Cook wrote:
> On Thu, Nov 7, 2013 at 10:16 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> > On Thu, Nov 7, 2013 at 9:47 AM, Kees Cook <keescook@chromium.org> wrote:
> >> Make sure that seccomp filter won't be built when ARM OABI is in use,
> >> since there is work needed to distinguish calling conventions. Until
> >> that is done (which is likely never since OABI is deprecated), make
> >> sure seccomp filter is unavailable in the OABI world.
> >>
> >> Signed-off-by: Kees Cook <keescook@chromium.org>
> >> ---
> >> v2:
> >>  - toggle availability via HAVE_ARCH_SECCOMP_FILTER; James Hogan.
> >> ---
> >>  arch/arm/Kconfig |    7 ++++++-
> >>  1 file changed, 6 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> >> index 0a1dc697333c..a0a8590f3609 100644
> >> --- a/arch/arm/Kconfig
> >> +++ b/arch/arm/Kconfig
> >> @@ -23,7 +23,7 @@ config ARM
> >>         select HARDIRQS_SW_RESEND
> >>         select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL
> >>         select HAVE_ARCH_KGDB
> >> -       select HAVE_ARCH_SECCOMP_FILTER
> >> +       select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT)
> >>         select HAVE_ARCH_TRACEHOOK
> >>         select HAVE_BPF_JIT
> >>         select HAVE_CONTEXT_TRACKING
> >> @@ -1735,6 +1735,11 @@ config OABI_COMPAT
> >>           in memory differs between the legacy ABI and the new ARM EABI
> >>           (only for non "thumb" binaries). This option adds a tiny
> >>           overhead to all syscalls and produces a slightly larger kernel.
> >> +
> >> +         The seccomp filter system will not be available when this is
> >> +         selected, since there is no way yet to sensibly distinguish
> >> +         between calling conventions during filtering.
> >> +
> >>           If you know you'll be using only pure EABI user space then you
> >>           can say N here. If this option is not selected and you attempt
> >>           to execute a legacy ABI binary then the result will be
> >> --
> >> 1.7.9.5
> >>
> >>
> >
> > FWIW, OABI-only (i.e. !AEABI, as opposed to OABI_COMPAT) is, in
> > principle, supportable -- userspace would just have to know that, if
> > build for OABI, the calling convention is different.
> 
> Right -- I opted for enforcing seccomp-on-ARM-means-EABI.
> 
> > I doubt this is worth supporting, though, and, if no one complains
> > about your patch for a couple releases, then that would mean we could
> > get away with adding AUDIT_ARCH_ARM_OABI or something (maybe for
> > seccomp only) if needed.

Audit already has: (ARM && AEABI && !OABI_COMPAT)  adding
AUDIT_ARCH_ARM_OABI means we could support it and no worries about ABI
breakage.

Isn't x32 similarly screwy?  Does it work because the syscall numbers
are different?

-Eric
Andy Lutomirski Nov. 7, 2013, 7:05 p.m. UTC | #4
On Thu, Nov 7, 2013 at 10:56 AM, Eric Paris <eparis@redhat.com> wrote:
> On Thu, 2013-11-07 at 10:39 -0800, Kees Cook wrote:
>> On Thu, Nov 7, 2013 at 10:16 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>> > On Thu, Nov 7, 2013 at 9:47 AM, Kees Cook <keescook@chromium.org> wrote:
>> >> Make sure that seccomp filter won't be built when ARM OABI is in use,
>> >> since there is work needed to distinguish calling conventions. Until
>> >> that is done (which is likely never since OABI is deprecated), make
>> >> sure seccomp filter is unavailable in the OABI world.
>> >>
>> >> Signed-off-by: Kees Cook <keescook@chromium.org>
>> >> ---
>> >> v2:
>> >>  - toggle availability via HAVE_ARCH_SECCOMP_FILTER; James Hogan.
>> >> ---
>> >>  arch/arm/Kconfig |    7 ++++++-
>> >>  1 file changed, 6 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>> >> index 0a1dc697333c..a0a8590f3609 100644
>> >> --- a/arch/arm/Kconfig
>> >> +++ b/arch/arm/Kconfig
>> >> @@ -23,7 +23,7 @@ config ARM
>> >>         select HARDIRQS_SW_RESEND
>> >>         select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL
>> >>         select HAVE_ARCH_KGDB
>> >> -       select HAVE_ARCH_SECCOMP_FILTER
>> >> +       select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT)
>> >>         select HAVE_ARCH_TRACEHOOK
>> >>         select HAVE_BPF_JIT
>> >>         select HAVE_CONTEXT_TRACKING
>> >> @@ -1735,6 +1735,11 @@ config OABI_COMPAT
>> >>           in memory differs between the legacy ABI and the new ARM EABI
>> >>           (only for non "thumb" binaries). This option adds a tiny
>> >>           overhead to all syscalls and produces a slightly larger kernel.
>> >> +
>> >> +         The seccomp filter system will not be available when this is
>> >> +         selected, since there is no way yet to sensibly distinguish
>> >> +         between calling conventions during filtering.
>> >> +
>> >>           If you know you'll be using only pure EABI user space then you
>> >>           can say N here. If this option is not selected and you attempt
>> >>           to execute a legacy ABI binary then the result will be
>> >> --
>> >> 1.7.9.5
>> >>
>> >>
>> >
>> > FWIW, OABI-only (i.e. !AEABI, as opposed to OABI_COMPAT) is, in
>> > principle, supportable -- userspace would just have to know that, if
>> > build for OABI, the calling convention is different.
>>
>> Right -- I opted for enforcing seccomp-on-ARM-means-EABI.
>>
>> > I doubt this is worth supporting, though, and, if no one complains
>> > about your patch for a couple releases, then that would mean we could
>> > get away with adding AUDIT_ARCH_ARM_OABI or something (maybe for
>> > seccomp only) if needed.
>
> Audit already has: (ARM && AEABI && !OABI_COMPAT)  adding
> AUDIT_ARCH_ARM_OABI means we could support it and no worries about ABI
> breakage.

Maybe this isn't such a bad idea after all :)

>
> Isn't x32 similarly screwy?  Does it work because the syscall numbers
> are different?

Yes (from reading the code -- I haven't actually tried it).

I've always interpreted the AUDIT_ARCH stuff as meaning that
(audit_arch, nr) uniquely identifies a syscall and that (audit_arch,
nr, argument registers) identifies a syscall and its arguments.

On x32, the syscall invocation instruction is identical to x86_64 and
the mode of the process has nothing to do with which syscall is
invoked, so having a different audit_arch is unnecessary (as long as
the x32 bit in nr is preserved).

The weirdest case I know of is plain old x86, which has three
different syscall instructions that (IIRC) have three different
calling conventions.  They all result in the same syscalls happening,
and the kernel fudges it by changing the registers on entry and
fiddling with the return address.  This means that the physical
registers at the time of kernel entry are lost, but anyone can pretend
that the syscall was issued using int 80 and they'll correctly
interpret the syscall.

--Andy
Dave Martin Nov. 7, 2013, 8:33 p.m. UTC | #5
On Thu, Nov 07, 2013 at 10:16:13AM -0800, Andy Lutomirski wrote:
> On Thu, Nov 7, 2013 at 9:47 AM, Kees Cook <keescook@chromium.org> wrote:
> > Make sure that seccomp filter won't be built when ARM OABI is in use,
> > since there is work needed to distinguish calling conventions. Until
> > that is done (which is likely never since OABI is deprecated), make
> > sure seccomp filter is unavailable in the OABI world.
> >
> > Signed-off-by: Kees Cook <keescook@chromium.org>
> > ---
> > v2:
> >  - toggle availability via HAVE_ARCH_SECCOMP_FILTER; James Hogan.
> > ---
> >  arch/arm/Kconfig |    7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > index 0a1dc697333c..a0a8590f3609 100644
> > --- a/arch/arm/Kconfig
> > +++ b/arch/arm/Kconfig
> > @@ -23,7 +23,7 @@ config ARM
> >         select HARDIRQS_SW_RESEND
> >         select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL
> >         select HAVE_ARCH_KGDB
> > -       select HAVE_ARCH_SECCOMP_FILTER
> > +       select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT)
> >         select HAVE_ARCH_TRACEHOOK
> >         select HAVE_BPF_JIT
> >         select HAVE_CONTEXT_TRACKING
> > @@ -1735,6 +1735,11 @@ config OABI_COMPAT
> >           in memory differs between the legacy ABI and the new ARM EABI
> >           (only for non "thumb" binaries). This option adds a tiny
> >           overhead to all syscalls and produces a slightly larger kernel.
> > +
> > +         The seccomp filter system will not be available when this is
> > +         selected, since there is no way yet to sensibly distinguish
> > +         between calling conventions during filtering.
> > +
> >           If you know you'll be using only pure EABI user space then you
> >           can say N here. If this option is not selected and you attempt
> >           to execute a legacy ABI binary then the result will be
> > --
> > 1.7.9.5
> >
> >
> 
> FWIW, OABI-only (i.e. !AEABI, as opposed to OABI_COMPAT) is, in
> principle, supportable -- userspace would just have to know that, if
> build for OABI, the calling convention is different.
> 
> I doubt this is worth supporting, though, and, if no one complains
> about your patch for a couple releases, then that would mean we could
> get away with adding AUDIT_ARCH_ARM_OABI or something (maybe for
> seccomp only) if needed.

Won't SECCOMP (from arch/arm/Kconfig) still be visible and selectable
in the Kconfig system?  That may cause confusion.


One option might be to make SECCOMP depend on AEABI, and treat all OABI
syscalls as invalid after TIF_SECCOMP is set.  This would allow people
to build modern kernels that can still run ancient userspace, without
having to compile SECCOMP out.  I don't think we care about whether
SECCOMP works for OABI userspace -- it's too recent a feature on ARM.

That may be overkill nowadays, though.  The alternative would be to
make SECCOMP depend on AEABI && !OABI_COMPAT.

Cheer
---Dave
Paul Moore Nov. 8, 2013, 4:29 p.m. UTC | #6
On Thursday, November 07, 2013 11:05:26 AM Andy Lutomirski wrote:
> On Thu, Nov 7, 2013 at 10:56 AM, Eric Paris <eparis@redhat.com> wrote:
>
> > Isn't x32 similarly screwy?  Does it work because the syscall numbers
> > are different?
> 
> Yes (from reading the code -- I haven't actually tried it).

I've got a x32 VM that I boot occasionally to test seccomp/libseccomp.  For 
the purposes of seccomp it looks exactly like x86_64, including sharing the 
same AUDIT_ARCH_X86_64 value, the only difference being the syscall number 
offset ... Assuming you're using kernel 3.9 or later.  Previous kernels had a 
bug which stripped the x32 syscall offset so it was impossible to distinguish 
from x86_64 and x32 with seccomp.  See the following commit for the details:

 commit 8b4b9f27e57584f3d90e0bb84cf800ad81cfe3a1
 Author: Paul Moore <pmoore@redhat.com>
 Date:   Fri Feb 15 12:21:43 2013 -0500

    x86: remove the x32 syscall bitmask from syscall_get_nr()
    
    Commit fca460f95e928bae373daa8295877b6905bc62b8 simplified the x32
    implementation by creating a syscall bitmask, equal to 0x40000000, that
    could be applied to x32 syscalls such that the masked syscall number
    would be the same as a x86_64 syscall.  While that patch was a nice
    way to simplify the code, it went a bit too far by adding the mask to
    syscall_get_nr(); returning the masked syscall numbers can cause
    confusion with callers that expect syscall numbers matching the x32
    ABI, e.g. unmasked syscall numbers.
    
    This patch fixes this by simply removing the mask from syscall_get_nr()
    while preserving the other changes from the original commit.  While
    there are several syscall_get_nr() callers in the kernel, most simply
    check that the syscall number is greater than zero, in this case this
    patch will have no effect.  Of those remaining callers, they appear
    to be few, seccomp and ftrace, and from my testing of seccomp without
    this patch the original commit definitely breaks things; the seccomp
    filter does not correctly filter the syscalls due to the difference in
    syscall numbers in the BPF filter and the value from syscall_get_nr().
    Applying this patch restores the seccomp BPF filter functionality on
    x32.
    
    I've tested this patch with the seccomp BPF filters as well as ftrace
    and everything looks reasonable to me; needless to say general usage
    seemed fine as well.
    
    Signed-off-by: Paul Moore <pmoore@redhat.com>
    Link: http://lkml.kernel.org/r/20130215172143.12549.10292.stgit@localhost
    Cc: <stable@vger.kernel.org>
    Cc: Will Drewry <wad@chromium.org>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

> I've always interpreted the AUDIT_ARCH stuff as meaning that
> (audit_arch, nr) uniquely identifies a syscall and that (audit_arch,
> nr, argument registers) identifies a syscall and its arguments.

That matches my own experience working with seccomp.

> On x32, the syscall invocation instruction is identical to x86_64 and
> the mode of the process has nothing to do with which syscall is
> invoked, so having a different audit_arch is unnecessary (as long as
> the x32 bit in nr is preserved).
Andy Lutomirski Nov. 8, 2013, 4:39 p.m. UTC | #7
On Fri, Nov 8, 2013 at 8:29 AM, Paul Moore <pmoore@redhat.com> wrote:
> On Thursday, November 07, 2013 11:05:26 AM Andy Lutomirski wrote:
>> On Thu, Nov 7, 2013 at 10:56 AM, Eric Paris <eparis@redhat.com> wrote:
>>
>> > Isn't x32 similarly screwy?  Does it work because the syscall numbers
>> > are different?
>>
>> Yes (from reading the code -- I haven't actually tried it).
>
> I've got a x32 VM that I boot occasionally to test seccomp/libseccomp.  For
> the purposes of seccomp it looks exactly like x86_64, including sharing the
> same AUDIT_ARCH_X86_64 value, the only difference being the syscall number
> offset ... Assuming you're using kernel 3.9 or later.  Previous kernels had a
> bug which stripped the x32 syscall offset so it was impossible to distinguish
> from x86_64 and x32 with seccomp.  See the following commit for the details:

Ooh -- where did you get this?  (I imagine I could debootstrap such a
beast and then just chroot / nspawn / schroot in, but if there are
readily available images, that would be great.  Fedora doesn't seem to
have much x32 support.)

--Andy
Paul Moore Nov. 8, 2013, 6:23 p.m. UTC | #8
On Friday, November 08, 2013 08:39:29 AM Andy Lutomirski wrote:
> On Fri, Nov 8, 2013 at 8:29 AM, Paul Moore <pmoore@redhat.com> wrote:
> > On Thursday, November 07, 2013 11:05:26 AM Andy Lutomirski wrote:
> >> On Thu, Nov 7, 2013 at 10:56 AM, Eric Paris <eparis@redhat.com> wrote:
> >> > Isn't x32 similarly screwy?  Does it work because the syscall numbers
> >> > are different?
> >> 
> >> Yes (from reading the code -- I haven't actually tried it).
> > 
> > I've got a x32 VM that I boot occasionally to test seccomp/libseccomp. 
> > For the purposes of seccomp it looks exactly like x86_64, including
> > sharing the same AUDIT_ARCH_X86_64 value, the only difference being the
> > syscall number offset ... Assuming you're using kernel 3.9 or later. 
> > Previous kernels had a bug which stripped the x32 syscall offset so it was
> > impossible to distinguish from x86_64 and x32 with seccomp.  See the
> > following commit for the details:
>
> Ooh -- where did you get this?  (I imagine I could debootstrap such a
> beast and then just chroot / nspawn / schroot in, but if there are
> readily available images, that would be great.  Fedora doesn't seem to
> have much x32 support.)

I built up a small Gentoo image:

 * http://distfiles.gentoo.org/releases/amd64/current-stage3
diff mbox

Patch

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 0a1dc697333c..a0a8590f3609 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -23,7 +23,7 @@  config ARM
 	select HARDIRQS_SW_RESEND
 	select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL
 	select HAVE_ARCH_KGDB
-	select HAVE_ARCH_SECCOMP_FILTER
+	select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT)
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_BPF_JIT
 	select HAVE_CONTEXT_TRACKING
@@ -1735,6 +1735,11 @@  config OABI_COMPAT
 	  in memory differs between the legacy ABI and the new ARM EABI
 	  (only for non "thumb" binaries). This option adds a tiny
 	  overhead to all syscalls and produces a slightly larger kernel.
+
+	  The seccomp filter system will not be available when this is
+	  selected, since there is no way yet to sensibly distinguish
+	  between calling conventions during filtering.
+
 	  If you know you'll be using only pure EABI user space then you
 	  can say N here. If this option is not selected and you attempt
 	  to execute a legacy ABI binary then the result will be