Message ID | 20220211152059.1584701-1-yhs@fb.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | bpf: fix a bpf_timer initialization issue | expand |
On Fri, Feb 11, 2022 at 7:21 AM Yonghong Song <yhs@fb.com> wrote: > > struct bpf_spin_lock { > __u32 val; > }; > struct bpf_timer { > __u64 :64; > __u64 :64; > } __attribute__((aligned(8))); > > The initialization code: > *(struct bpf_spin_lock *)(dst + map->spin_lock_off) = > (struct bpf_spin_lock){}; > *(struct bpf_timer *)(dst + map->timer_off) = > (struct bpf_timer){}; > It appears the compiler has no obligation to initialize anonymous fields. > For example, let us use clang with bpf target as below: > $ cat t.c > struct bpf_timer { > unsigned long long :64; > }; > struct bpf_timer2 { > unsigned long long a; > }; > > void test(struct bpf_timer *t) { > *t = (struct bpf_timer){}; > } > void test2(struct bpf_timer2 *t) { > *t = (struct bpf_timer2){}; > } > $ clang -target bpf -O2 -c -g t.c > $ llvm-objdump -d t.o > ... > 0000000000000000 <test>: > 0: 95 00 00 00 00 00 00 00 exit > 0000000000000008 <test2>: > 1: b7 02 00 00 00 00 00 00 r2 = 0 > 2: 7b 21 00 00 00 00 00 00 *(u64 *)(r1 + 0) = r2 > 3: 95 00 00 00 00 00 00 00 exit wow! Is this a clang only behavior or gcc does the same "smart" optimization? We've seen this issue with padding, but I could have never guessed that compiler will do so for explicit anon fields. I wonder what standard says and what other kernel code is broken by this "optimization".
On 2/11/22 8:47 AM, Alexei Starovoitov wrote: > On Fri, Feb 11, 2022 at 7:21 AM Yonghong Song <yhs@fb.com> wrote: >> >> struct bpf_spin_lock { >> __u32 val; >> }; >> struct bpf_timer { >> __u64 :64; >> __u64 :64; >> } __attribute__((aligned(8))); >> >> The initialization code: >> *(struct bpf_spin_lock *)(dst + map->spin_lock_off) = >> (struct bpf_spin_lock){}; >> *(struct bpf_timer *)(dst + map->timer_off) = >> (struct bpf_timer){}; >> It appears the compiler has no obligation to initialize anonymous fields. >> For example, let us use clang with bpf target as below: >> $ cat t.c >> struct bpf_timer { >> unsigned long long :64; >> }; >> struct bpf_timer2 { >> unsigned long long a; >> }; >> >> void test(struct bpf_timer *t) { >> *t = (struct bpf_timer){}; >> } >> void test2(struct bpf_timer2 *t) { >> *t = (struct bpf_timer2){}; >> } >> $ clang -target bpf -O2 -c -g t.c >> $ llvm-objdump -d t.o >> ... >> 0000000000000000 <test>: >> 0: 95 00 00 00 00 00 00 00 exit >> 0000000000000008 <test2>: >> 1: b7 02 00 00 00 00 00 00 r2 = 0 >> 2: 7b 21 00 00 00 00 00 00 *(u64 *)(r1 + 0) = r2 >> 3: 95 00 00 00 00 00 00 00 exit > > wow! > Is this a clang only behavior or gcc does the same "smart" optimization? gcc seems okay for my above test. $ /home/yhs/work/gcc2/gcc11-2/install/bin/gcc --version gcc (GCC) 11.2.0 Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ /home/yhs/work/gcc2/gcc11-2/install/bin/gcc -O2 -c t.c $ llvm-objdump -d t.o t.o: file format elf64-x86-64 Disassembly of section .text: 0000000000000000 <test>: 0: 48 c7 07 00 00 00 00 movq $0, (%rdi) 7: c3 retq 8: 0f 1f 84 00 00 00 00 00 nopl (%rax,%rax) 0000000000000010 <test2>: 10: 48 c7 07 00 00 00 00 movq $0, (%rdi) 17: c3 retq [yhs@devbig309.ftw3 ~/tmp2] > > We've seen this issue with padding, but I could have never guessed > that compiler will do so for explicit anon fields. > I wonder what standard says and what other kernel code is broken > by this "optimization". Not familiar with the standard. Need to dig out.
On 2/11/22 8:47 AM, Alexei Starovoitov wrote: > On Fri, Feb 11, 2022 at 7:21 AM Yonghong Song <yhs@fb.com> wrote: >> >> struct bpf_spin_lock { >> __u32 val; >> }; >> struct bpf_timer { >> __u64 :64; >> __u64 :64; >> } __attribute__((aligned(8))); >> >> The initialization code: >> *(struct bpf_spin_lock *)(dst + map->spin_lock_off) = >> (struct bpf_spin_lock){}; >> *(struct bpf_timer *)(dst + map->timer_off) = >> (struct bpf_timer){}; >> It appears the compiler has no obligation to initialize anonymous fields. >> For example, let us use clang with bpf target as below: >> $ cat t.c >> struct bpf_timer { >> unsigned long long :64; >> }; >> struct bpf_timer2 { >> unsigned long long a; >> }; >> >> void test(struct bpf_timer *t) { >> *t = (struct bpf_timer){}; >> } >> void test2(struct bpf_timer2 *t) { >> *t = (struct bpf_timer2){}; >> } >> $ clang -target bpf -O2 -c -g t.c >> $ llvm-objdump -d t.o >> ... >> 0000000000000000 <test>: >> 0: 95 00 00 00 00 00 00 00 exit >> 0000000000000008 <test2>: >> 1: b7 02 00 00 00 00 00 00 r2 = 0 >> 2: 7b 21 00 00 00 00 00 00 *(u64 *)(r1 + 0) = r2 >> 3: 95 00 00 00 00 00 00 00 exit > > wow! > Is this a clang only behavior or gcc does the same "smart" optimization? > > We've seen this issue with padding, but I could have never guessed > that compiler will do so for explicit anon fields. > I wonder what standard says and what other kernel code is broken > by this "optimization". Searched the internet, not sure whether this helps or not. INTERNATIONAL STANDARD ©ISO/IEC ISO/IEC 9899:201x Programming languages — C http://www.open-std.org/Jtc1/sc22/wg14/www/docs/n1547.pdf page 157: Except where explicitly stated otherwise, for the purposes of this subclause unnamed members of objects of structure and union type do not participate in initialization. Unnamed members of structure objects have indeterminate value even after initialization.
On Fri, Feb 11, 2022 at 9:32 AM Yonghong Song <yhs@fb.com> wrote: > > > > On 2/11/22 8:47 AM, Alexei Starovoitov wrote: > > On Fri, Feb 11, 2022 at 7:21 AM Yonghong Song <yhs@fb.com> wrote: > >> > >> struct bpf_spin_lock { > >> __u32 val; > >> }; > >> struct bpf_timer { > >> __u64 :64; > >> __u64 :64; > >> } __attribute__((aligned(8))); > >> > >> The initialization code: > >> *(struct bpf_spin_lock *)(dst + map->spin_lock_off) = > >> (struct bpf_spin_lock){}; > >> *(struct bpf_timer *)(dst + map->timer_off) = > >> (struct bpf_timer){}; > >> It appears the compiler has no obligation to initialize anonymous fields. > >> For example, let us use clang with bpf target as below: > >> $ cat t.c > >> struct bpf_timer { > >> unsigned long long :64; > >> }; > >> struct bpf_timer2 { > >> unsigned long long a; > >> }; > >> > >> void test(struct bpf_timer *t) { > >> *t = (struct bpf_timer){}; > >> } > >> void test2(struct bpf_timer2 *t) { > >> *t = (struct bpf_timer2){}; > >> } > >> $ clang -target bpf -O2 -c -g t.c > >> $ llvm-objdump -d t.o > >> ... > >> 0000000000000000 <test>: > >> 0: 95 00 00 00 00 00 00 00 exit > >> 0000000000000008 <test2>: > >> 1: b7 02 00 00 00 00 00 00 r2 = 0 > >> 2: 7b 21 00 00 00 00 00 00 *(u64 *)(r1 + 0) = r2 > >> 3: 95 00 00 00 00 00 00 00 exit > > > > wow! > > Is this a clang only behavior or gcc does the same "smart" optimization? > > > > We've seen this issue with padding, but I could have never guessed > > that compiler will do so for explicit anon fields. > > I wonder what standard says and what other kernel code is broken > > by this "optimization". > > Searched the internet, not sure whether this helps or not. > > INTERNATIONAL STANDARD ©ISO/IEC ISO/IEC 9899:201x > Programming languages — C > > http://www.open-std.org/Jtc1/sc22/wg14/www/docs/n1547.pdf > page 157: > > Except where explicitly stated otherwise, for the purposes of this > subclause unnamed > members of objects of structure and union type do not participate in > initialization. > Unnamed members of structure objects have indeterminate value even after > initialization. Thanks for clarifying! It means that gcc might do so eventually as well.
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index fa517ae604ad..1a4c73742a1f 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -209,11 +209,9 @@ static inline bool map_value_has_timer(const struct bpf_map *map) static inline void check_and_init_map_value(struct bpf_map *map, void *dst) { if (unlikely(map_value_has_spin_lock(map))) - *(struct bpf_spin_lock *)(dst + map->spin_lock_off) = - (struct bpf_spin_lock){}; + memset(dst + map->spin_lock_off, 0, sizeof(struct bpf_spin_lock)); if (unlikely(map_value_has_timer(map))) - *(struct bpf_timer *)(dst + map->timer_off) = - (struct bpf_timer){}; + memset(dst + map->timer_off, 0, sizeof(struct bpf_timer)); } /* copy everything but bpf_spin_lock and bpf_timer. There could be one of each. */
The patch in [1] intends to fix a bpf_timer related issue, but the fix caused existing 'timer' selftest to fail with hang or some random errors. After some debug, I found an issue with check_and_init_map_value() in the hashtab.c. More specifically, in hashtab.c, we have code l_new = bpf_map_kmalloc_node(&htab->map, ...) check_and_init_map_value(&htab->map, l_new...) Note that bpf_map_kmalloc_node() does not do initialization so l_new contains random value. The function check_and_init_map_value() intends to zero the bpf_spin_lock and bpf_timer if they exist in the map. But I found bpf_spin_lock is zero'ed but bpf_timer is not zero'ed. With [1], later copy_map_value() skips copying of bpf_spin_lock and bpf_timer. The non-zero bpf_timer caused random failures for 'timer' selftest. Without [1], for both bpf_spin_lock and bpf_timer case, bpf_timer will be zero'ed, so 'timer' self test is okay. For check_and_init_map_value(), why bpf_spin_lock is zero'ed properly while bpf_timer not. In bpf uapi header, we have struct bpf_spin_lock { __u32 val; }; struct bpf_timer { __u64 :64; __u64 :64; } __attribute__((aligned(8))); The initialization code: *(struct bpf_spin_lock *)(dst + map->spin_lock_off) = (struct bpf_spin_lock){}; *(struct bpf_timer *)(dst + map->timer_off) = (struct bpf_timer){}; It appears the compiler has no obligation to initialize anonymous fields. For example, let us use clang with bpf target as below: $ cat t.c struct bpf_timer { unsigned long long :64; }; struct bpf_timer2 { unsigned long long a; }; void test(struct bpf_timer *t) { *t = (struct bpf_timer){}; } void test2(struct bpf_timer2 *t) { *t = (struct bpf_timer2){}; } $ clang -target bpf -O2 -c -g t.c $ llvm-objdump -d t.o ... 0000000000000000 <test>: 0: 95 00 00 00 00 00 00 00 exit 0000000000000008 <test2>: 1: b7 02 00 00 00 00 00 00 r2 = 0 2: 7b 21 00 00 00 00 00 00 *(u64 *)(r1 + 0) = r2 3: 95 00 00 00 00 00 00 00 exit To fix the problem, let use memset for bpf_timer case in check_and_init_map_value(). For consistency, memset is also used for bpf_spin_lock case. [1] https://lore.kernel.org/bpf/20220209070324.1093182-2-memxor@gmail.com/ Fixes: 68134668c17f3 ("bpf: Add map side support for bpf timers.") Signed-off-by: Yonghong Song <yhs@fb.com> --- include/linux/bpf.h | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)