Message ID | cover.1644453291.git.delyank@fb.com (mailing list archive) |
---|---|
Headers | show |
Series | Avoid typedef size mismatches in skeletons | expand |
On 2/9/22 4:36 PM, Delyan Kratunov wrote: > As reported in [0], kernel and userspace can sometimes disagree > on the definition of a typedef (in particular, the size). > This leads to trouble when userspace maps the memory of a bpf program > and reads/writes to it assuming a different memory layout. > > This series resolves most int-like typedefs and rewrites them as > standard int16_t-like types. In particular, we don't touch > __u32-like types, char, and _Bool, as changing them changes cast > semantics and would break too many pre-existing programs. For example, > int8_t* is not convertible to char* because int8_t is explicitly signed. Build with clang (adding LLVM=1 to build kernel and selftests), several btf_dump subtests failed. Please take a look. btf_dump_data:PASS:find type id 0 nsec btf_dump_data:PASS:failed/unexpected type_sz 0 nsec btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(int32_t)1234' != expected '(int)1234' btf_dump_data:PASS:find type id 0 nsec btf_dump_data:PASS:failed/unexpected type_sz 0 nsec btf_dump_data:PASS:ensure expected/actual match 0 nsec btf_dump_data:PASS:find type id 0 nsec btf_dump_data:PASS:failed/unexpected type_sz 0 nsec btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(int32_t)1234' != expected '(int)1234' btf_dump_data:PASS:find type id 0 nsec btf_dump_data:PASS:failed/unexpected type_sz 0 nsec btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(int32_t)0' != expected '(int)0' ... btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(uint128_t)0xffffffffffffffff' != expected '(unsigned __int128)0xffffffffffffffff' btf_dump_data:PASS:find type id 0 nsec btf_dump_data:PASS:failed/unexpected type_sz 0 nsec btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(uint128_t)0xfffffffffffffffffffffffffffffffe' != expected '(unsigned __int128)0xfffffffffffffffffffffffffffff' test_btf_dump_int_data:FAIL:dump unsigned __int128 unexpected error: -14 (errno 7) #20/9 btf_dump/btf_dump: int_data:FAIL ... btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(uint64_t)1' != expected '(u64)1' btf_dump_data:PASS:find type id 0 nsec btf_dump_data:PASS:failed/unexpected type_sz 0 nsec btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(uint64_t)0' != expected '(u64)0' btf_dump_data:PASS:find type id 0 nsec ... btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(atomic_t){ .counter = (int32_t)0, }' != expected '(atomic_t){ .counter = (int)0, }' btf_dump_data:PASS:find type id 0 nsec btf_dump_data:PASS:failed to return -E2BIG 0 nsec btf_dump_data:PASS:ensure expected/actual match 0 nsec #20/12 btf_dump/btf_dump: typedef_data:FAIL ... test_btf_dump_struct_data:FAIL:file_operations unexpected file_operations: actual '(struct file_operations){ .owner = (struct module *)0xffffffffffffffff, .llseek = (int64_t (*)(struct file *, int64_t, int32_t))0xfffffffffff' != expected '(struct file_operations){ .owner = (struct module *)0xffffffffffffffff, .llseek = (loff_t (*)(struct file *, loff_t, int))0xffffffffffffffff,' ... ... > > [0]: https://github.com/iovisor/bcc/pull/3777 > > Delyan Kratunov (3): > libbpf: btf_dump can produce explicitly sized ints > bpftool: skeleton uses explicitly sized ints > selftests/bpf: add test case for userspace and bpf type size mismatch > > tools/bpf/bpftool/gen.c | 3 + > tools/lib/bpf/btf.h | 4 +- > tools/lib/bpf/btf_dump.c | 80 ++++++++++++++++++- > .../selftests/bpf/prog_tests/skeleton.c | 22 +++-- > .../selftests/bpf/progs/test_skeleton.c | 8 ++ > 5 files changed, 107 insertions(+), 10 deletions(-) > > -- > 2.34.1
On Thu, Feb 10, 2022 at 2:18 PM Yonghong Song <yhs@fb.com> wrote: > > > > On 2/9/22 4:36 PM, Delyan Kratunov wrote: > > As reported in [0], kernel and userspace can sometimes disagree > > on the definition of a typedef (in particular, the size). > > This leads to trouble when userspace maps the memory of a bpf program > > and reads/writes to it assuming a different memory layout. > > > > This series resolves most int-like typedefs and rewrites them as > > standard int16_t-like types. In particular, we don't touch > > __u32-like types, char, and _Bool, as changing them changes cast > > semantics and would break too many pre-existing programs. For example, > > int8_t* is not convertible to char* because int8_t is explicitly signed. > > Build with clang (adding LLVM=1 to build kernel and selftests), > several btf_dump subtests failed. Please take a look. > > btf_dump_data:PASS:find type id 0 nsec > btf_dump_data:PASS:failed/unexpected type_sz 0 nsec > btf_dump_data:FAIL:ensure expected/actual match unexpected ensure > expected/actual match: actual '(int32_t)1234' != expected '(int)1234' > btf_dump_data:PASS:find type id 0 nsec > btf_dump_data:PASS:failed/unexpected type_sz 0 nsec > btf_dump_data:PASS:ensure expected/actual match 0 nsec > btf_dump_data:PASS:find type id 0 nsec > btf_dump_data:PASS:failed/unexpected type_sz 0 nsec > btf_dump_data:FAIL:ensure expected/actual match unexpected ensure > expected/actual match: actual '(int32_t)1234' != expected '(int)1234' > btf_dump_data:PASS:find type id 0 nsec > btf_dump_data:PASS:failed/unexpected type_sz 0 nsec > btf_dump_data:FAIL:ensure expected/actual match unexpected ensure > expected/actual match: actual '(int32_t)0' != expected '(int)0' > ... > btf_dump_data:FAIL:ensure expected/actual match unexpected ensure > expected/actual match: actual '(uint128_t)0xffffffffffffffff' != > expected '(unsigned __int128)0xffffffffffffffff' > btf_dump_data:PASS:find type id 0 nsec > btf_dump_data:PASS:failed/unexpected type_sz 0 nsec > btf_dump_data:FAIL:ensure expected/actual match unexpected ensure > expected/actual match: actual > '(uint128_t)0xfffffffffffffffffffffffffffffffe' != expected '(unsigned > __int128)0xfffffffffffffffffffffffffffff' > test_btf_dump_int_data:FAIL:dump unsigned __int128 unexpected error: -14 > (errno 7) > #20/9 btf_dump/btf_dump: int_data:FAIL > ... > btf_dump_data:FAIL:ensure expected/actual match unexpected ensure > expected/actual match: actual '(uint64_t)1' != expected '(u64)1' > btf_dump_data:PASS:find type id 0 nsec > btf_dump_data:PASS:failed/unexpected type_sz 0 nsec > btf_dump_data:FAIL:ensure expected/actual match unexpected ensure > expected/actual match: actual '(uint64_t)0' != expected '(u64)0' > btf_dump_data:PASS:find type id 0 nsec > ... > btf_dump_data:FAIL:ensure expected/actual match unexpected ensure > expected/actual match: actual '(atomic_t){ > .counter = (int32_t)0, > }' != expected '(atomic_t){ > .counter = (int)0, > }' > btf_dump_data:PASS:find type id 0 nsec > btf_dump_data:PASS:failed to return -E2BIG 0 nsec > btf_dump_data:PASS:ensure expected/actual match 0 nsec > #20/12 btf_dump/btf_dump: typedef_data:FAIL > ... > test_btf_dump_struct_data:FAIL:file_operations unexpected > file_operations: actual '(struct file_operations){ > .owner = (struct module *)0xffffffffffffffff, > .llseek = (int64_t (*)(struct file *, int64_t, > int32_t))0xfffffffffff' != expected '(struct file_operations){ > .owner = (struct module *)0xffffffffffffffff, > .llseek = (loff_t (*)(struct file *, loff_t, > int))0xffffffffffffffff,' > ... > ... > This is due to unconditional normalization in btf_dump_emit_type_cast(), we shouldn't do it. So this will "autofix" itself when we fix that issue. > > > > [0]: https://github.com/iovisor/bcc/pull/3777 > > > > Delyan Kratunov (3): > > libbpf: btf_dump can produce explicitly sized ints > > bpftool: skeleton uses explicitly sized ints > > selftests/bpf: add test case for userspace and bpf type size mismatch > > > > tools/bpf/bpftool/gen.c | 3 + > > tools/lib/bpf/btf.h | 4 +- > > tools/lib/bpf/btf_dump.c | 80 ++++++++++++++++++- > > .../selftests/bpf/prog_tests/skeleton.c | 22 +++-- > > .../selftests/bpf/progs/test_skeleton.c | 8 ++ > > 5 files changed, 107 insertions(+), 10 deletions(-) > > > > -- > > 2.34.1
On 2/9/22 4:36 PM, Delyan Kratunov wrote: > As reported in [0], kernel and userspace can sometimes disagree > on the definition of a typedef (in particular, the size). > This leads to trouble when userspace maps the memory of a bpf program > and reads/writes to it assuming a different memory layout. I am thinking whether we can do better since this only resolved some basic types. But it is totally possible some types in vmlinux.h, who are kernel internal types, happen in some uapi or user header as well but with different sizes. Currently, the exposed bpf program types (in skeleton) are all from global variables. Since we intend to ensure their size be equal, and bpf program itself provides the size of the type. For example, in bpf program, we have following, TypeA variable; Since TypeA will appear in the skel.h file, user must define it somehow before skel.h. Let us say TypeA size is 20 from bpf program BTF type. So we could insert a BUILD_BUG_ON(sizeof(TypeA) != 20) in the skeleton file to ensure the size match and this applies to all types. In the skel.h file, we can have #ifndef BUILD_BUG_ON #define BUILD_BUG_ON ... #endif to have BUILD_BUG_ON to cause compilation error if the condition is true. User can define BUILD_BUG_ON before skel.h if they want to override. This should apply to all types put in bss/data/rodata sections by skeleton. If this indeed happens as in [0], user can detect the problem and they may look at vmlinux.h and use proper underlying types to resolve the issue. WDYT? > > This series resolves most int-like typedefs and rewrites them as > standard int16_t-like types. In particular, we don't touch > __u32-like types, char, and _Bool, as changing them changes cast > semantics and would break too many pre-existing programs. For example, > int8_t* is not convertible to char* because int8_t is explicitly signed. > > [0]: https://github.com/iovisor/bcc/pull/3777 > > Delyan Kratunov (3): > libbpf: btf_dump can produce explicitly sized ints > bpftool: skeleton uses explicitly sized ints > selftests/bpf: add test case for userspace and bpf type size mismatch > > tools/bpf/bpftool/gen.c | 3 + > tools/lib/bpf/btf.h | 4 +- > tools/lib/bpf/btf_dump.c | 80 ++++++++++++++++++- > .../selftests/bpf/prog_tests/skeleton.c | 22 +++-- > .../selftests/bpf/progs/test_skeleton.c | 8 ++ > 5 files changed, 107 insertions(+), 10 deletions(-) > > -- > 2.34.1
On Thu, Feb 10, 2022 at 2:51 PM Yonghong Song <yhs@fb.com> wrote: > > > > On 2/9/22 4:36 PM, Delyan Kratunov wrote: > > As reported in [0], kernel and userspace can sometimes disagree > > on the definition of a typedef (in particular, the size). > > This leads to trouble when userspace maps the memory of a bpf program > > and reads/writes to it assuming a different memory layout. > > I am thinking whether we can do better since this only resolved > some basic types. But it is totally possible some types in vmlinux.h, > who are kernel internal types, happen in some uapi or user header > as well but with different sizes. > > Currently, the exposed bpf program types (in skeleton) are all > from global variables. Since we intend to ensure their size > be equal, and bpf program itself provides the size of the type. > > For example, in bpf program, we have following, > TypeA variable; > > Since TypeA will appear in the skel.h file, user must define it > somehow before skel.h. Let us say TypeA size is 20 from bpf program > BTF type. > > So we could insert a > BUILD_BUG_ON(sizeof(TypeA) != 20) > in the skeleton file to ensure the size match and this applies > to all types. > > In the skel.h file, we can have > #ifndef BUILD_BUG_ON > #define BUILD_BUG_ON ... > #endif > to have BUILD_BUG_ON to cause compilation error if the condition is true. > > User can define BUILD_BUG_ON before skel.h if they want to > override. > > This should apply to all types put in bss/data/rodata sections > by skeleton. > > If this indeed happens as in [0], user can detect the problem > and they may look at vmlinux.h and use proper underlying types > to resolve the issue. > > WDYT? Going back to https://github.com/iovisor/bcc/pull/3777 issue.. The user space part that included skel.h might have used dev_t in other places in the code. When bpftool automagically replaces dev_t in skel.h with int32_t to match the kernel it only moves the problem further into the user space where dev_t would still mismatch. So adding _Static_assert(sizeof(type_in_skel) == const_size_from_kernel); to skel.h would force users to pick types that are the same both in bpf prog and in corresponding user space part. I think that's a better approach. Maybe we don't need to add static_assert for all types, but that's a minor tweak.
On Thu, 2022-02-10 at 15:14 -0800, Alexei Starovoitov wrote: > So adding > _Static_assert(sizeof(type_in_skel) == const_size_from_kernel); > to skel.h would force users to pick types that are the same > both in bpf prog and in corresponding user space part. I'm not sure users have this much control over the definition of the types in their program though. If a kernel and uapi typedef differ in size, this approach would force the user to use kernel types for the entire program. If, for example, pid_t is a different size in glibc and the kernel, it would force you out of using any glibc functions taking pid_t (and potentially all of glibc depending on how entangled the headers are). By normalizing to stdint types, we're saying that the contract represented by the skel does not operate with either uapi or kernel types and it's up to you to ensure you use the right one (or mix and match, if you can). It feels fundamentally more permissive to different types of situations than forcing the skel user to only use kernel types or refactor their program to isolate the kernel type leakage to only the compilation unit using the skel.
On Thu, Feb 10, 2022 at 3:59 PM Delyan Kratunov <delyank@fb.com> wrote: > > On Thu, 2022-02-10 at 15:14 -0800, Alexei Starovoitov wrote: > > So adding > > _Static_assert(sizeof(type_in_skel) == const_size_from_kernel); > > to skel.h would force users to pick types that are the same > > both in bpf prog and in corresponding user space part. > > I'm not sure users have this much control over the definition of the types in > their program though. If a kernel and uapi typedef differ in size, this approach > would force the user to use kernel types for the entire program. > > If, for example, pid_t is a different size in glibc and the kernel, it would > force you out of using any glibc functions taking pid_t (and potentially all of > glibc depending on how entangled the headers are). pid_t is a great example, since its meaning is different in kernel and in user space. One is thread and another is process. static_assert will catch such issues and will force users to pick u64 and think through the type conversions. In kernel from pid_t -> u64 in bpf prog, and in user space pid_t -> u64 in skel. That would be an explicit action by users that hopefully make them think about the differences in size and semantics. > By normalizing to stdint types, we're saying that the contract represented by > the skel does not operate with either uapi or kernel types and it's up to you to > ensure you use the right one (or mix and match, if you can). It feels > fundamentally more permissive to different types of situations than forcing the > skel user to only use kernel types or refactor their program to isolate the > kernel type leakage to only the compilation unit using the skel. static_assert will force users to use compatible types, not kernel types.