[bpf-next,v3,0/6] bpf: add __user tagging support in vmlinux BTF

Message ID 20220127154555.650886-1-yhs@fb.com (mailing list archive)

Yonghong Song Jan. 27, 2022, 3:45 p.m. UTC
The __user attribute is currently used mainly by sparse for type checking.
The attribute indicates whether a memory access is in the user memory
address space or not. Such information is important when tracing kernel
internal functions or data structures, as accessing user memory often
requires different mechanisms than accessing kernel memory. For example,
perf-probe needs an explicit command line specification to indicate that
a particular argument or string is in user-space memory ([1], [2], [3]).
Currently, vmlinux BTF is available in the kernels of many distributions.
If __user attribute information is available in vmlinux BTF, users will
no longer need to specify user memory accesses explicitly, as the kernel
can figure it out by itself from vmlinux BTF.

Besides the above possible use for perf/probe, another use case is
the bpf verifier. Currently, in bpf programs of type
BPF_PROG_TYPE_TRACING, users can write direct code like
  p->m1->m2
where "p" could be a function parameter. Without __user information in BTF,
the verifier will assume p->m1 accesses kernel memory and will generate
normal loads. Suppose "p" is actually tagged with __user in the source
code. In that case, p->m1 actually accesses user memory, so a direct
load is not right and may produce an incorrect result. For such cases,
bpf_probe_read_user() is the correct way to read p->m1.

To support encoding __user information in BTF, a new attribute
  __attribute__((btf_type_tag("<arbitrary_string>")))
is implemented in clang ([4]). For example, if we have
  #define __user __attribute__((btf_type_tag("user")))
during kernel compilation, the "user" attribute information will
be preserved in dwarf. After pahole converts dwarf to BTF, the __user
information will be available in vmlinux BTF, where it can be used by
the bpf verifier, perf/probe or other use cases.

Currently btf_type_tag is only supported in clang (>= clang14) and
pahole (>= 1.23). gcc support is also proposed and under development ([5]).
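
As a rough illustration of the attribute described above, the sketch
below defines __user as a btf_type_tag only when the compiler reports
support for it, and as nothing otherwise. It is a plain user-space
example, not the definition added by Patch 1; struct sample and
sample_value() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Define __user as a btf_type_tag when the compiler knows the attribute,
 * and as nothing otherwise, so the same source builds with any compiler.
 */
#ifdef __has_attribute
# if __has_attribute(btf_type_tag)
#  define __user __attribute__((btf_type_tag("user")))
# endif
#endif
#ifndef __user
# define __user
#endif

/* Hypothetical structure, used only for illustration. */
struct sample {
	int value;
};

/*
 * The tag annotates the pointee type of "p". It does not change the
 * generated code; it is only recorded in debug info when the compiler
 * supports it. In this user-space sketch the dereference is harmless.
 */
static int sample_value(const struct sample __user *p)
{
	assert(p != NULL);
	return p->value;
}
```

With a compiler that lacks btf_type_tag, __user expands to nothing and
the function is an ordinary pointer read, which is why the tag can be
introduced without affecting builds that do not generate BTF.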

In the rest of the patch set, Patch 1 adds support for the __user
btf_type_tag during compilation. Patch 2 adds bpf verifier support that
uses the __user tag information to reject bpf programs that do not use
the proper helper to access user memory. Patches 3-5 are bpf selftests
which demonstrate that the verifier can reject direct user memory
accesses. Patch 6 documents how btf_type_tag is encoded in the type chain.

  [1] http://lkml.kernel.org/r/155789874562.26965.10836126971405890891.stgit@devnote2
  [2] http://lkml.kernel.org/r/155789872187.26965.4468456816590888687.stgit@devnote2
  [3] http://lkml.kernel.org/r/155789871009.26965.14167558859557329331.stgit@devnote2
  [4] https://reviews.llvm.org/D111199
  [5] https://lore.kernel.org/bpf/0cbeb2fb-1a18-f690-e360-24b1c90c2a91@fb.com/
  
Changelog:
  v2 -> v3:
    - remove FLAG_DONTCARE enumerator and just use 0 as dontcare flag.
    - explain how btf type_tag is encoded in btf type chain.
  v1 -> v2:
    - use MEM_USER flag for PTR_TO_BTF_ID reg type instead of a separate
      field to encode __user tag.
    - add a test with kernel function __sys_getsockname which has __user tagged
      argument.
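
The flag-based approach mentioned in the changelog can be pictured with
the toy model below. It is only a sketch under stated assumptions, not
the verifier code: the toy_* names, the enum values and the bit position
of TOY_MEM_USER are all hypothetical, and only the names MEM_USER and
PTR_TO_BTF_ID come from this series. A "user" type tag maps to a flag on
the register type, 0 serves as the "don't care" value, and a direct load
through a flagged pointer is rejected.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for verifier register types; values are illustrative. */
enum toy_base_type {
	TOY_SCALAR_VALUE,
	TOY_PTR_TO_BTF_ID,
};

/* Hypothetical bit position; only the MEM_USER name is from the series. */
#define TOY_MEM_USER (1u << 8)

struct toy_reg {
	enum toy_base_type base;
	unsigned int flags;
};

/* Map a btf_type_tag string to a flag; 0 means "no flag / don't care". */
static unsigned int toy_flag_from_type_tag(const char *tag)
{
	if (tag && strcmp(tag, "user") == 0)
		return TOY_MEM_USER;
	return 0;
}

/* Return 0 if a direct load through "reg" is allowed, -EACCES otherwise. */
static int toy_check_direct_load(const struct toy_reg *reg)
{
	assert(reg != NULL);
	if (reg->base != TOY_PTR_TO_BTF_ID)
		return -EACCES;	/* not a BTF-typed pointer in this model */
	if (reg->flags & TOY_MEM_USER)
		return -EACCES;	/* user memory must be read via a helper */
	return 0;
}
```

Keeping the tag as a flag on the register type, rather than a separate
field, means the check is a single bit test on state the model already
carries.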

Yonghong Song (6):
  compiler_types: define __user as __attribute__((btf_type_tag("user")))
  bpf: reject program if a __user tagged memory accessed in kernel way
  selftests/bpf: rename btf_decl_tag.c to test_btf_decl_tag.c
  selftests/bpf: add a selftest with __user tag
  selftests/bpf: specify pahole version requirement for btf_tag test
  docs/bpf: clarify how btf_type_tag gets encoded in the type chain

 Documentation/bpf/btf.rst                     |  13 +++
 include/linux/bpf.h                           |   9 +-
 include/linux/btf.h                           |   5 +
 include/linux/compiler_types.h                |   3 +
 kernel/bpf/btf.c                              |  34 ++++--
 kernel/bpf/verifier.c                         |  35 ++++--
 lib/Kconfig.debug                             |   8 ++
 net/bpf/bpf_dummy_struct_ops.c                |   6 +-
 net/ipv4/bpf_tcp_ca.c                         |   6 +-
 tools/testing/selftests/bpf/README.rst        |   2 +
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   |  18 ++++
 .../selftests/bpf/prog_tests/btf_tag.c        | 101 +++++++++++++++++-
 .../selftests/bpf/progs/btf_type_tag_user.c   |  40 +++++++
 .../{btf_decl_tag.c => test_btf_decl_tag.c}   |   0
 14 files changed, 252 insertions(+), 28 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/btf_type_tag_user.c
 rename tools/testing/selftests/bpf/progs/{btf_decl_tag.c => test_btf_decl_tag.c} (100%)

Comments

Alexei Starovoitov Jan. 27, 2022, 8:17 p.m. UTC | #1
On Thu, Jan 27, 2022 at 7:46 AM Yonghong Song <yhs@fb.com> wrote:
> [...]

Applied. Thanks