Message ID | 20240308010812.89848-1-alexei.starovoitov@gmail.com (mailing list archive) |
---|---|
Series | bpf: Introduce BPF arena. |
On Thu, Mar 7, 2024 at 5:08 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> [...]

Besides a few comments on patch #1 (and maybe one or two potential
corner case issues I mentioned, which can be easily fixed), the series
looked good. So I've applied patches as is. I fixed typo ("varaibles")
in one of the commit subjects while applying.

Also, in one of the selftests you hard-coded PAGE_SIZE to 4096, which
isn't correct on some architectures, so please see how you can make it
not hard-coded (but still work for both bpf and user code). It seemed
minor enough to not delay patches (either way those architectures
don't support ARENA just yet).

Hello:

This series was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko <andrii@kernel.org>:

On Thu, 7 Mar 2024 17:07:58 -0800 you wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> v2->v3:
> - contains bpf bits only, but cc-ing past audience for continuity
> - since prerequisite patches landed, this series focuses on the main
>   functionality of bpf_arena.
> - adopted Andrii's approach to support arena in libbpf.
> - simplified LLVM support. Instead of two instructions it's now only one.
> - switched to cond_break (instead of open coded iters) in selftests
> - implemented several follow-ups that will be sent after this set
>   . remember first IP and bpf insn that faulted in arena.
>     report to user space via bpftool
>   . copy paste and tweak glob_match() aka mini-regex as a selftests/bpf
> - see patch 1 for detailed description of bpf_arena
>
> [...]

Here is the summary with links:
  - [v3,bpf-next,01/14] bpf: Introduce bpf_arena.
    https://git.kernel.org/bpf/bpf-next/c/317460317a02
  - [v3,bpf-next,02/14] bpf: Disasm support for addr_space_cast instruction.
    https://git.kernel.org/bpf/bpf-next/c/667a86ad9b71
  - [v3,bpf-next,03/14] bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions.
    https://git.kernel.org/bpf/bpf-next/c/2fe99eb0ccf2
  - [v3,bpf-next,04/14] bpf: Add x86-64 JIT support for bpf_addr_space_cast instruction.
    https://git.kernel.org/bpf/bpf-next/c/142fd4d2dcf5
  - [v3,bpf-next,05/14] bpf: Recognize addr_space_cast instruction in the verifier.
    https://git.kernel.org/bpf/bpf-next/c/6082b6c328b5
  - [v3,bpf-next,06/14] bpf: Recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA.
    https://git.kernel.org/bpf/bpf-next/c/2edc3de6fb65
  - [v3,bpf-next,07/14] libbpf: Add __arg_arena to bpf_helpers.h
    https://git.kernel.org/bpf/bpf-next/c/4d2b56081c32
  - [v3,bpf-next,08/14] libbpf: Add support for bpf_arena.
    https://git.kernel.org/bpf/bpf-next/c/79ff13e99169
  - [v3,bpf-next,09/14] bpftool: Recognize arena map type
    https://git.kernel.org/bpf/bpf-next/c/eed512e8ac64
  - [v3,bpf-next,10/14] libbpf: Recognize __arena global varaibles.
    https://git.kernel.org/bpf/bpf-next/c/2e7ba4f8fd1f
  - [v3,bpf-next,11/14] bpf: Add helper macro bpf_addr_space_cast()
    https://git.kernel.org/bpf/bpf-next/c/204c628730c6
  - [v3,bpf-next,12/14] selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages
    https://git.kernel.org/bpf/bpf-next/c/80a4129fcf20
  - [v3,bpf-next,13/14] selftests/bpf: Add bpf_arena_list test.
    https://git.kernel.org/bpf/bpf-next/c/9f2c156f90a4
  - [v3,bpf-next,14/14] selftests/bpf: Add bpf_arena_htab test.
    https://git.kernel.org/bpf/bpf-next/c/8df839ae23b8

You are awesome, thank you!

On Mon, Mar 11, 2024 at 3:45 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Mar 7, 2024 at 5:08 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > [...]
>
> Besides a few comments on patch #1 (and maybe one or two potential
> corner case issues I mentioned, which can be easily fixed), the series
> looked good. So I've applied patches as is. I fixed typo ("varaibles")
> in one of the commit subjects while applying.

Thanks! That subj typo survived two months of reviews in v1,v2,v3
and I swear I use ./scripts/checkpatch.pl --codespell all the time.
I guess it got drowned in all of the messages like:

WARNING: 'mmaped' may be misspelled - perhaps 'mapped'?
#356: FILE: tools/lib/bpf/libbpf.c:13666:
+       *mmaped = map->mmaped;
         ^^^^^^

WARNING: 'mmaped' may be misspelled - perhaps 'mapped'?
#356: FILE: tools/lib/bpf/libbpf.c:13666:
+       *mmaped = map->mmaped;
                       ^^^^^^

> Also, in one of the selftests you hard-coded PAGE_SIZE to 4096, which
> isn't correct on some architectures, so please see how you can make it
> not hard-coded (but still work for both bpf and user code). It seemed
> minor enough to not delay patches (either way those architectures
> don't support ARENA just yet).

yes. It's on todo list already.
I've added
#define PAGE_SIZE 4096
to user space side of bpf selftest,
because it's used in bpf_arena_*.h code which is dual compiled
as bpf prog (and then it's using PAGE_SIZE from vmlinux.h)
and compiled as native code.
So bpf side gets correct PAGE_SIZE automatically as a nice
constant at compile time, but for user space there is no good
PAGE_SIZE constant to use.
Just doing
#define PAGE_SIZE sysconf(_SC_PAGE_SIZE)
produces inefficient code.
Hence I left it as a todo to figure out later.

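The trade-off in the reply above (a hard-coded 4096 is wrong on some
architectures, while calling sysconf() on every use is slow) could be
softened by querying the page size once and caching it. The sketch below
is only an illustration and is not what the selftests do; the names
arena_page_size() and bytes_to_pages() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the selftests): ask the system for the
 * page size on first use and cache it, so later uses are a load and a
 * branch instead of a libc call.
 */
static inline size_t arena_page_size(void)
{
    static size_t cached;

    if (!cached) {
        long sz = sysconf(_SC_PAGE_SIZE);

        assert(sz > 0);
        cached = (size_t)sz;
    }
    return cached;
}

/* Only define PAGE_SIZE if no header has already provided one. */
#ifndef PAGE_SIZE
#define PAGE_SIZE arena_page_size()
#endif

/* Example consumer: round a byte count up to whole pages. */
static inline size_t bytes_to_pages(size_t bytes)
{
    return (bytes + PAGE_SIZE - 1) / PAGE_SIZE;
}
```

With this approach PAGE_SIZE is no longer a compile-time constant in
user space, so it cannot size a static array or appear in a static
initializer, and the unsynchronized cache assumes the first call does not
race with other threads. Whether that is acceptable for code that is
dual-compiled as a bpf prog and as native code is exactly the open
question left as a todo above.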
From: Alexei Starovoitov <ast@kernel.org>

v2->v3:
- contains bpf bits only, but cc-ing past audience for continuity
- since prerequisite patches landed, this series focuses on the main
  functionality of bpf_arena.
- adopted Andrii's approach to support arena in libbpf.
- simplified LLVM support. Instead of two instructions it's now only one.
- switched to cond_break (instead of open coded iters) in selftests
- implemented several follow-ups that will be sent after this set
  . remember first IP and bpf insn that faulted in arena.
    report to user space via bpftool
  . copy paste and tweak glob_match() aka mini-regex as a selftests/bpf
- see patch 1 for detailed description of bpf_arena

v1->v2:
- Improved commit log with reasons for using vmap_pages_range() in arena.
  Thanks to Johannes
- Added support for __arena global variables in bpf programs
- Fixed race conditions spotted by Barret
- Fixed wrap32 issue spotted by Barret
- Fixed bpf_map_mmap_sz() the way Andrii suggested

The work on bpf_arena was inspired by Barret's work:
https://github.com/google/ghost-userspace/blob/main/lib/queue.bpf.h
that implements queues, lists and AVL trees completely as bpf programs
using a giant bpf array map and integer indices instead of pointers.
bpf_arena is a sparse array that allows using normal C pointers to
build such data structures. The last few patches implement a page_frag
allocator, linked list and hash table as bpf programs.

v1:
bpf programs have multiple options to communicate with user space:
- Various ring buffers (perf, ftrace, bpf): The data is streamed
  unidirectionally from bpf to user space.
- Hash map: The bpf program populates elements, and user space consumes
  them via bpf syscall.
- mmap()-ed array map: Libbpf creates an array map that is directly
  accessed by the bpf program and mmap-ed to user space. It's the fastest
  way. Its disadvantage is that memory for the whole array is reserved at
  the start.

Alexei Starovoitov (13):
  bpf: Introduce bpf_arena.
  bpf: Disasm support for addr_space_cast instruction.
  bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions.
  bpf: Add x86-64 JIT support for bpf_addr_space_cast instruction.
  bpf: Recognize addr_space_cast instruction in the verifier.
  bpf: Recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA.
  libbpf: Add __arg_arena to bpf_helpers.h
  libbpf: Add support for bpf_arena.
  bpftool: Recognize arena map type
  bpf: Add helper macro bpf_addr_space_cast()
  selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages
  selftests/bpf: Add bpf_arena_list test.
  selftests/bpf: Add bpf_arena_htab test.

Andrii Nakryiko (1):
  libbpf: Recognize __arena global varaibles.

 arch/x86/net/bpf_jit_comp.c                   | 231 +++++++-
 include/linux/bpf.h                           |  10 +-
 include/linux/bpf_types.h                     |   1 +
 include/linux/bpf_verifier.h                  |   1 +
 include/linux/filter.h                        |   4 +
 include/uapi/linux/bpf.h                      |  14 +
 kernel/bpf/Makefile                           |   3 +
 kernel/bpf/arena.c                            | 558 ++++++++++++++++++
 kernel/bpf/btf.c                              |  19 +-
 kernel/bpf/core.c                             |  16 +
 kernel/bpf/disasm.c                           |  10 +
 kernel/bpf/log.c                              |   3 +
 kernel/bpf/syscall.c                          |  42 ++
 kernel/bpf/verifier.c                         | 123 +++-
 .../bpf/bpftool/Documentation/bpftool-map.rst |   2 +-
 tools/bpf/bpftool/gen.c                       |  13 +
 tools/bpf/bpftool/map.c                       |   2 +-
 tools/include/uapi/linux/bpf.h                |  14 +
 tools/lib/bpf/bpf_helpers.h                   |   1 +
 tools/lib/bpf/libbpf.c                        | 163 ++++-
 tools/lib/bpf/libbpf.h                        |   2 +-
 tools/lib/bpf/libbpf_probes.c                 |   7 +
 tools/testing/selftests/bpf/DENYLIST.aarch64  |   2 +
 tools/testing/selftests/bpf/DENYLIST.s390x    |   2 +
 tools/testing/selftests/bpf/bpf_arena_alloc.h |  67 +++
 .../testing/selftests/bpf/bpf_arena_common.h  |  70 +++
 tools/testing/selftests/bpf/bpf_arena_htab.h  | 100 ++++
 tools/testing/selftests/bpf/bpf_arena_list.h  |  92 +++
 .../testing/selftests/bpf/bpf_experimental.h  |  43 ++
 .../selftests/bpf/prog_tests/arena_htab.c     |  88 +++
 .../selftests/bpf/prog_tests/arena_list.c     |  68 +++
 .../selftests/bpf/prog_tests/verifier.c       |   2 +
 .../testing/selftests/bpf/progs/arena_htab.c  |  48 ++
 .../selftests/bpf/progs/arena_htab_asm.c      |   5 +
 .../testing/selftests/bpf/progs/arena_list.c  |  87 +++
 .../selftests/bpf/progs/verifier_arena.c      | 146 +++++
 tools/testing/selftests/bpf/test_loader.c     |   9 +-
 37 files changed, 2028 insertions(+), 40 deletions(-)
 create mode 100644 kernel/bpf/arena.c
 create mode 100644 tools/testing/selftests/bpf/bpf_arena_alloc.h
 create mode 100644 tools/testing/selftests/bpf/bpf_arena_common.h
 create mode 100644 tools/testing/selftests/bpf/bpf_arena_htab.h
 create mode 100644 tools/testing/selftests/bpf/bpf_arena_list.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/arena_htab.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/arena_list.c
 create mode 100644 tools/testing/selftests/bpf/progs/arena_htab.c
 create mode 100644 tools/testing/selftests/bpf/progs/arena_htab_asm.c
 create mode 100644 tools/testing/selftests/bpf/progs/arena_list.c
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_arena.c
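
The cover letter contrasts data structures built from integer indices
into one giant array map with structures built from normal C pointers.
The sketch below illustrates that contrast in plain user-space C. It is
an analogy only: it does not use bpf_arena, the bpf_arena_list.h
selftest code, or any bpf API, and every name in it (idx_list, idx_push,
ptr_push, POOL_SIZE, NIL and so on) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8
#define NIL (-1)                /* "null" for index-based links */

/* Style 1: nodes live in a preallocated array, links are integer indices. */
struct idx_node {
    int value;
    int next;                   /* index into pool[], or NIL */
};

struct idx_list {
    struct idx_node pool[POOL_SIZE];    /* whole pool reserved up front */
    int head;                           /* index of first node, or NIL */
    int used;                           /* pool slots handed out so far */
};

static void idx_list_init(struct idx_list *l)
{
    l->head = NIL;
    l->used = 0;
}

/* Push to the front; returns 0 on success, -1 when the pool is full. */
static int idx_push(struct idx_list *l, int value)
{
    if (l->used == POOL_SIZE)
        return -1;

    int i = l->used++;

    l->pool[i].value = value;
    l->pool[i].next = l->head;
    l->head = i;
    return 0;
}

static int idx_sum(const struct idx_list *l)
{
    int sum = 0;

    /* Every hop goes through pool[] and an index */
    for (int i = l->head; i != NIL; i = l->pool[i].next)
        sum += l->pool[i].value;
    return sum;
}

/* Style 2: the same list written with ordinary C pointers. */
struct ptr_node {
    int value;
    struct ptr_node *next;
};

/* Push caller-provided node n to the front; returns the new head. */
static struct ptr_node *ptr_push(struct ptr_node *head, struct ptr_node *n,
                                 int value)
{
    n->value = value;
    n->next = head;
    return n;
}

static int ptr_sum(const struct ptr_node *head)
{
    int sum = 0;

    for (const struct ptr_node *n = head; n; n = n->next)
        sum += n->value;
    return sum;
}
```

Both lists hold the same values and are walked the same way. The
index-based version has to carry the pool around and reserve it in full
at the start, which is the disadvantage the cover letter attributes to
the mmap()-ed array map, while the pointer-based version reads like
ordinary C.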