[v7,0/7] Syscall User Dispatch

Message ID 20201118032840.3429268-1-krisman@collabora.com (mailing list archive)

Message

Gabriel Krisman Bertazi Nov. 18, 2020, 3:28 a.m. UTC
Hi,

This is the v7 of syscall user dispatch.  This version is a bit
different from v6 on the following points, after the modifications
requested on that submission.

* The interface no longer takes <start>,<end> parameters, but
  <start>,<length>, as suggested by PeterZ.

* Syscall User Dispatch is now done before ptrace, and this means there
  is some SYSCALL_WORK_EXIT work that needs to be done.  No challenges
  there, but I'd like to draw attention to that region of the code that
  is new in this submission.

* The previous TIF_SYSCALL_USER_DISPATCH is now handled through
  SYSCALL_WORK flags.

* Introduced a new test as patch 6, which benchmarks the fast submission
  path and tests the return in blocked selector state.

* Nothing is architecture dependent anymore. No config switches.  It
  only depends on CONFIG_GENERIC_ENTRY.
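
As an illustration of the <start>,<length> form of the interface, here is
a minimal userspace sketch. It is not taken from the patches: the helper
names sud_range_length() and sud_enable() are hypothetical, and the
fallback constant values are the ones found in the mainline
include/uapi/linux/prctl.h after the feature was merged, so they are an
assumption with respect to this revision of the series.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Fallbacks for older headers. Values assumed from mainline uapi. */
#ifndef PR_SET_SYSCALL_USER_DISPATCH
#define PR_SET_SYSCALL_USER_DISPATCH 59
#endif
#ifndef PR_SYS_DISPATCH_OFF
#define PR_SYS_DISPATCH_OFF 0
#endif
#ifndef PR_SYS_DISPATCH_ON
#define PR_SYS_DISPATCH_ON 1
#endif

/*
 * Hypothetical helper: convert a [start, end) pair, as used up to v6,
 * into the length expected by the newer interface. Returns 0 for an
 * empty or inverted range.
 */
unsigned long sud_range_length(unsigned long start, unsigned long end)
{
	return end > start ? end - start : 0;
}

/*
 * Hypothetical helper: enable dispatch for the calling thread. Syscalls
 * issued from [start, start + length) are meant to run natively; the
 * selector byte controls what happens to the others.
 * Returns 0 on success or a negative errno value, for example -EINVAL
 * on a kernel that does not know this prctl option.
 */
int sud_enable(unsigned long start, unsigned long length,
	       volatile char *selector)
{
	if (prctl(PR_SET_SYSCALL_USER_DISPATCH, PR_SYS_DISPATCH_ON,
		  start, length, selector) == -1)
		return -errno;
	return 0;
}
```

A caller would typically pass the bounds of the mapping that must keep
issuing native syscalls, such as the region holding libc, as start and
length.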

Other smaller changes are documented in each commit.

This was tested using the kselftests in patches 5 and 6, and
compile-tested with !CONFIG_GENERIC_ENTRY.

This patchset is based on the core/entry branch of the TIP tree.

A working tree with this patchset is available at:

  https://gitlab.collabora.com/krisman/linux -b syscall-user-dispatch-v7

Previous submissions are archived at:

RFC/v1: https://lkml.org/lkml/2020/7/8/96
v2: https://lkml.org/lkml/2020/7/9/17
v3: https://lkml.org/lkml/2020/7/12/4
v4: https://www.spinics.net/lists/linux-kselftest/msg16377.html
v5: https://lkml.org/lkml/2020/8/10/1320
v6: https://lkml.org/lkml/2020/9/4/1122

Gabriel Krisman Bertazi (7):
  x86: vdso: Expose sigreturn address on vdso to the kernel
  signal: Expose SYS_USER_DISPATCH si_code type
  kernel: Implement selective syscall userspace redirection
  entry: Support Syscall User Dispatch on common syscall entry
  selftests: Add kselftest for syscall user dispatch
  selftests: Add benchmark for syscall user dispatch
  docs: Document Syscall User Dispatch

 .../admin-guide/syscall-user-dispatch.rst     |  87 +++++
 arch/x86/entry/vdso/vdso2c.c                  |   2 +
 arch/x86/entry/vdso/vdso32/sigreturn.S        |   2 +
 arch/x86/entry/vdso/vma.c                     |  15 +
 arch/x86/include/asm/elf.h                    |   2 +
 arch/x86/include/asm/vdso.h                   |   2 +
 arch/x86/kernel/signal_compat.c               |   2 +-
 fs/exec.c                                     |   3 +
 include/linux/entry-common.h                  |   2 +
 include/linux/sched.h                         |   2 +
 include/linux/syscall_user_dispatch.h         |  40 +++
 include/linux/thread_info.h                   |   2 +
 include/uapi/asm-generic/siginfo.h            |   3 +-
 include/uapi/linux/prctl.h                    |   5 +
 kernel/entry/Makefile                         |   2 +-
 kernel/entry/common.c                         |  17 +
 kernel/entry/common.h                         |  16 +
 kernel/entry/syscall_user_dispatch.c          | 102 ++++++
 kernel/fork.c                                 |   1 +
 kernel/sys.c                                  |   5 +
 tools/testing/selftests/Makefile              |   1 +
 .../syscall_user_dispatch/.gitignore          |   3 +
 .../selftests/syscall_user_dispatch/Makefile  |   9 +
 .../selftests/syscall_user_dispatch/config    |   1 +
 .../syscall_user_dispatch/sud_benchmark.c     | 200 +++++++++++
 .../syscall_user_dispatch/sud_test.c          | 310 ++++++++++++++++++
 26 files changed, 833 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/admin-guide/syscall-user-dispatch.rst
 create mode 100644 include/linux/syscall_user_dispatch.h
 create mode 100644 kernel/entry/common.h
 create mode 100644 kernel/entry/syscall_user_dispatch.c
 create mode 100644 tools/testing/selftests/syscall_user_dispatch/.gitignore
 create mode 100644 tools/testing/selftests/syscall_user_dispatch/Makefile
 create mode 100644 tools/testing/selftests/syscall_user_dispatch/config
 create mode 100644 tools/testing/selftests/syscall_user_dispatch/sud_benchmark.c
 create mode 100644 tools/testing/selftests/syscall_user_dispatch/sud_test.c

Comments

Florian Weimer Nov. 18, 2020, 8:47 a.m. UTC | #1
* Gabriel Krisman Bertazi:

> This is the v7 of syscall user dispatch.  This version is a bit
> different from v6 on the following points, after the modifications
> requested on that submission.

Is this supposed to work with existing (Linux) libcs, or do you bring
your own low-level run-time libraries?

Gabriel Krisman Bertazi Nov. 18, 2020, 5:01 p.m. UTC | #2
Florian Weimer <fw@deneb.enyo.de> writes:

> * Gabriel Krisman Bertazi:
>
>> This is the v7 of syscall user dispatch.  This version is a bit
>> different from v6 on the following points, after the modifications
>> requested on that submission.
>
> Is this supposed to work with existing (Linux) libcs, or do you bring
> your own low-level run-time libraries?

Hi Florian,

The main use case is to intercept Windows system calls of an application
running over Wine. While Wine is using an unmodified glibc to execute
its own native Linux syscalls, the Windows libraries might be directly
issuing syscalls that we need to capture. So there is a mix. While this
mechanism is compatible with existing libc, we might have other
libraries executing a syscall instruction directly.
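
To make the mixed case concrete, the decision described above can be
modelled in plain userspace C. This is only a sketch under stated
assumptions, not the kernel implementation: struct sud_config and
sud_would_dispatch() are hypothetical names, the selector values are
assumed from the mainline uapi header, and the handling of an invalid
selector value is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Selector values, assumed from the mainline uapi header. */
#ifndef SYSCALL_DISPATCH_FILTER_ALLOW
#define SYSCALL_DISPATCH_FILTER_ALLOW 0
#endif
#ifndef SYSCALL_DISPATCH_FILTER_BLOCK
#define SYSCALL_DISPATCH_FILTER_BLOCK 1
#endif

/* Hypothetical model of the per-thread configuration. */
struct sud_config {
	uintptr_t start;		/* first address of the allowed region */
	uintptr_t length;		/* size of the allowed region in bytes */
	const volatile char *selector;	/* userspace toggle, may be NULL */
};

/*
 * Returns true if a syscall issued at instruction address ip would be
 * redirected to userspace (via SIGSYS) under this model.
 */
bool sud_would_dispatch(const struct sud_config *cfg, uintptr_t ip)
{
	assert(cfg != NULL);

	/*
	 * Unsigned subtraction: an ip below start wraps to a huge value,
	 * so one comparison covers both bounds of the region.
	 */
	if (ip - cfg->start < cfg->length)
		return false;	/* e.g. glibc's own syscalls */

	/* Without a selector, everything outside the region is caught. */
	if (cfg->selector == NULL)
		return true;

	return *cfg->selector == SYSCALL_DISPATCH_FILTER_BLOCK;
}
```

In the Wine scenario, the thread would set the selector to
SYSCALL_DISPATCH_FILTER_BLOCK before entering Windows code and back to
SYSCALL_DISPATCH_FILTER_ALLOW when returning to native code, so that a
single byte store is enough to switch modes.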

Florian Weimer Nov. 18, 2020, 5:22 p.m. UTC | #3
* Gabriel Krisman Bertazi:

> The main use case is to intercept Windows system calls of an application
> running over Wine. While Wine is using an unmodified glibc to execute
> its own native Linux syscalls, the Windows libraries might be directly
> issuing syscalls that we need to capture. So there is a mix. While this
> mechanism is compatible with existing libc, we might have other
> libraries executing a syscall instruction directly.

Please raise this on libc-alpha, it's an unexpected compatibility
constraint on glibc.  Thanks.

Peter Zijlstra Nov. 19, 2020, 12:38 p.m. UTC | #4
On Tue, Nov 17, 2020 at 10:28:33PM -0500, Gabriel Krisman Bertazi wrote:
> Gabriel Krisman Bertazi (7):
>   x86: vdso: Expose sigreturn address on vdso to the kernel
>   signal: Expose SYS_USER_DISPATCH si_code type
>   kernel: Implement selective syscall userspace redirection
>   entry: Support Syscall User Dispatch on common syscall entry
>   selftests: Add kselftest for syscall user dispatch
>   selftests: Add benchmark for syscall user dispatch
>   docs: Document Syscall User Dispatch

Aside from the one little nit this looks good to me.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Kees Cook Nov. 21, 2020, 12:24 a.m. UTC | #5
On Thu, Nov 19, 2020 at 01:38:27PM +0100, Peter Zijlstra wrote:
> On Tue, Nov 17, 2020 at 10:28:33PM -0500, Gabriel Krisman Bertazi wrote:
> > Gabriel Krisman Bertazi (7):
> >   x86: vdso: Expose sigreturn address on vdso to the kernel
> >   signal: Expose SYS_USER_DISPATCH si_code type
> >   kernel: Implement selective syscall userspace redirection
> >   entry: Support Syscall User Dispatch on common syscall entry
> >   selftests: Add kselftest for syscall user dispatch
> >   selftests: Add benchmark for syscall user dispatch
> >   docs: Document Syscall User Dispatch
> 
> Aside from the one little nit this looks good to me.
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Agreed, and thank you Gabriel for the SYSCALL_WORK series too. :) That's
so nice to have!