Message ID | 20240210003308.3374075-1-andrii@kernel.org (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | [bpf-next] bpf: emit source code file name and line number in verifier log | expand |
On 2/9/24 4:33 PM, Andrii Nakryiko wrote: > As BPF applications grow in size and complexity and are separated into > multiple .bpf.c files that are statically linked together, it becomes > harder and harder to match verifier's BPF assembly level output to > original C code. While often annotated C source code is unique enough to > be able to identify the file it belongs to, quite often this is actually > problematic as parts of source code can be quite generic. > > Long story short, it is very useful to see source code file name and > line number information along with the original C code. Verifier already > knows this information, we just need to output it. > > This patch set is an initial proposal on how this can be done. No new > flags are added and file:line information is appended at the end of > C code: > > ; <original C code> (<filename>.bpf.c:<line>) > > If file name has directory names in it, they are stripped away. This > should be fine in practice as file names tend to be pretty unique with > C code anyways, and keeping log size smaller is always good. > > In practice this might look something like below, where some code is > coming from application files, while others are from libbpf's usdt.bpf.h > header file: > > ; if (STROBEMETA_READ( (strobemeta_probe.bpf.c:534) > 5592: (79) r1 = *(u64 *)(r10 -56) ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680) > 5593: (7b) *(u64 *)(r10 -56) = r1 ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680) > 5594: (79) r3 = *(u64 *)(r10 -8) ; R3_w=scalar() R10=fp0 fp-8=mmmmmmmm > > ... > > 170: (71) r1 = *(u8 *)(r8 +15) ; frame1: R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) R8_w=map_value(map=__bpf_usdt_spec,ks=4,vs=208) > 171: (67) r1 <<= 56 ; frame1: R1_w=scalar(smax=0x7f00000000000000,umax=0xff00000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xff00000000000000)) > 172: (c7) r1 s>>= 56 ; frame1: R1_w=scalar(smin=smin32=-128,smax=smax32=127) > ; val <<= arg_spec->arg_bitshift; (usdt.bpf.h:183) > 173: (67) r1 <<= 32 ; frame1: R1_w=scalar(smax=0x7f00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000)) > 174: (77) r1 >>= 32 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) > 175: (79) r2 = *(u64 *)(r10 -8) ; frame1: R2_w=scalar() R10=fp0 fp-8=mmmmmmmm > 176: (6f) r2 <<= r1 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R2_w=scalar() > 177: (7b) *(u64 *)(r10 -8) = r2 ; frame1: R2_w=scalar(id=61) R10=fp0 fp-8_w=scalar(id=61) > ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > 178: (bf) r3 = r2 ; frame1: R2_w=scalar(id=61) R3_w=scalar(id=61) > 179: (7f) r3 >>= r1 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R3_w=scalar() > ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > 180: (71) r4 = *(u8 *)(r8 +14) > 181: safe > > I've played with few different formats and none stood out as > particularly better than other. Suggestions and votes are appreciated: > > a) ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > b) ; if (arg_spec->arg_signed) [usdt.bpf.h:184] > c) ; [usdt.bpf.h:184] if (arg_spec->arg_signed) > d) ; (usdt.bpf.h:184) if (arg_spec->arg_signed) > > Above output shows variant a), which is quite non-distracting in > practice. > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev>
On Fri, Feb 9, 2024 at 4:33 PM Andrii Nakryiko <andrii@kernel.org> wrote: > > As BPF applications grow in size and complexity and are separated into > multiple .bpf.c files that are statically linked together, it becomes > harder and harder to match verifier's BPF assembly level output to > original C code. While often annotated C source code is unique enough to > be able to identify the file it belongs to, quite often this is actually > problematic as parts of source code can be quite generic. > > Long story short, it is very useful to see source code file name and > line number information along with the original C code. Verifier already > knows this information, we just need to output it. > > This patch set is an initial proposal on how this can be done. No new > flags are added and file:line information is appended at the end of > C code: > > ; <original C code> (<filename>.bpf.c:<line>) > > If file name has directory names in it, they are stripped away. This > should be fine in practice as file names tend to be pretty unique with > C code anyways, and keeping log size smaller is always good. > > In practice this might look something like below, where some code is > coming from application files, while others are from libbpf's usdt.bpf.h > header file: > > ; if (STROBEMETA_READ( (strobemeta_probe.bpf.c:534) > 5592: (79) r1 = *(u64 *)(r10 -56) ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680) > 5593: (7b) *(u64 *)(r10 -56) = r1 ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680) > 5594: (79) r3 = *(u64 *)(r10 -8) ; R3_w=scalar() R10=fp0 fp-8=mmmmmmmm > > ... > > 170: (71) r1 = *(u8 *)(r8 +15) ; frame1: R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) R8_w=map_value(map=__bpf_usdt_spec,ks=4,vs=208) > 171: (67) r1 <<= 56 ; frame1: R1_w=scalar(smax=0x7f00000000000000,umax=0xff00000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xff00000000000000)) > 172: (c7) r1 s>>= 56 ; frame1: R1_w=scalar(smin=smin32=-128,smax=smax32=127) > ; val <<= arg_spec->arg_bitshift; (usdt.bpf.h:183) > 173: (67) r1 <<= 32 ; frame1: R1_w=scalar(smax=0x7f00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000)) > 174: (77) r1 >>= 32 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) > 175: (79) r2 = *(u64 *)(r10 -8) ; frame1: R2_w=scalar() R10=fp0 fp-8=mmmmmmmm > 176: (6f) r2 <<= r1 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R2_w=scalar() > 177: (7b) *(u64 *)(r10 -8) = r2 ; frame1: R2_w=scalar(id=61) R10=fp0 fp-8_w=scalar(id=61) > ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > 178: (bf) r3 = r2 ; frame1: R2_w=scalar(id=61) R3_w=scalar(id=61) > 179: (7f) r3 >>= r1 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R3_w=scalar() > ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > 180: (71) r4 = *(u8 *)(r8 +14) > 181: safe > > I've played with few different formats and none stood out as > particularly better than other. Suggestions and votes are appreciated: > > a) ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > b) ; if (arg_spec->arg_signed) [usdt.bpf.h:184] > c) ; [usdt.bpf.h:184] if (arg_spec->arg_signed) > d) ; (usdt.bpf.h:184) if (arg_spec->arg_signed) Great idea. I would drop parenthesis. Don't see a reason to wrap a text. Since we already use ';' as a comment I would continue: e) ; if (arg_spec->arg_signed) ; usdt.bpf.h:184 Note that that fragile test needs to be adjusted again: Error: #137/3 log_fixup/bad_core_relo_trunc_full pw-bot: cr
On Sun, Feb 11, 2024 at 11:43 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Fri, Feb 9, 2024 at 4:33 PM Andrii Nakryiko <andrii@kernel.org> wrote: > > > > As BPF applications grow in size and complexity and are separated into > > multiple .bpf.c files that are statically linked together, it becomes > > harder and harder to match verifier's BPF assembly level output to > > original C code. While often annotated C source code is unique enough to > > be able to identify the file it belongs to, quite often this is actually > > problematic as parts of source code can be quite generic. > > > > Long story short, it is very useful to see source code file name and > > line number information along with the original C code. Verifier already > > knows this information, we just need to output it. > > > > This patch set is an initial proposal on how this can be done. No new > > flags are added and file:line information is appended at the end of > > C code: > > > > ; <original C code> (<filename>.bpf.c:<line>) > > > > If file name has directory names in it, they are stripped away. This > > should be fine in practice as file names tend to be pretty unique with > > C code anyways, and keeping log size smaller is always good. > > > > In practice this might look something like below, where some code is > > coming from application files, while others are from libbpf's usdt.bpf.h > > header file: > > > > ; if (STROBEMETA_READ( (strobemeta_probe.bpf.c:534) > > 5592: (79) r1 = *(u64 *)(r10 -56) ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680) > > 5593: (7b) *(u64 *)(r10 -56) = r1 ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680) > > 5594: (79) r3 = *(u64 *)(r10 -8) ; R3_w=scalar() R10=fp0 fp-8=mmmmmmmm > > > > ... > > > > 170: (71) r1 = *(u8 *)(r8 +15) ; frame1: R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) R8_w=map_value(map=__bpf_usdt_spec,ks=4,vs=208) > > 171: (67) r1 <<= 56 ; frame1: R1_w=scalar(smax=0x7f00000000000000,umax=0xff00000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xff00000000000000)) > > 172: (c7) r1 s>>= 56 ; frame1: R1_w=scalar(smin=smin32=-128,smax=smax32=127) > > ; val <<= arg_spec->arg_bitshift; (usdt.bpf.h:183) > > 173: (67) r1 <<= 32 ; frame1: R1_w=scalar(smax=0x7f00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000)) > > 174: (77) r1 >>= 32 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) > > 175: (79) r2 = *(u64 *)(r10 -8) ; frame1: R2_w=scalar() R10=fp0 fp-8=mmmmmmmm > > 176: (6f) r2 <<= r1 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R2_w=scalar() > > 177: (7b) *(u64 *)(r10 -8) = r2 ; frame1: R2_w=scalar(id=61) R10=fp0 fp-8_w=scalar(id=61) > > ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > > 178: (bf) r3 = r2 ; frame1: R2_w=scalar(id=61) R3_w=scalar(id=61) > > 179: (7f) r3 >>= r1 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R3_w=scalar() > > ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > > 180: (71) r4 = *(u8 *)(r8 +14) > > 181: safe > > > > I've played with few different formats and none stood out as > > particularly better than other. Suggestions and votes are appreciated: > > > > a) ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > > b) ; if (arg_spec->arg_signed) [usdt.bpf.h:184] > > c) ; [usdt.bpf.h:184] if (arg_spec->arg_signed) > > d) ; (usdt.bpf.h:184) if (arg_spec->arg_signed) > > Great idea. > I would drop parenthesis. Don't see a reason to wrap a text. > Since we already use ';' as a comment I would continue: A bit worried that it will quite weird for any source line which ends in ';', like: ; x = 123; ; my_file.bpf.c:123 r1 = 123; E.g., here's a real excerpt (I cut some register states for cleanliness): ; if (i >= map->cnt) ; strobemeta_probe.bpf.c:396 5376: (79) r1 = *(u64 *)(r10 -40) ; R1_w=map_value(map=raw_map_heap,ks=4,vs=264) R10=fp0 5377: (79) r1 = *(u64 *)(r1 +8) ; R1_w=scalar(...) R9_w=10 ; if (i >= map->cnt) ; strobemeta_probe.bpf.c:396 5378: (dd) if r1 s<= r9 goto pc-5 ; R1_w=scalar(smin=umin=11,...) ; descr->key_lens[i] = 0; ; strobemeta_probe.bpf.c:398 5379: (b4) w1 = 0 ; R1_w=0 5380: (6b) *(u16 *)(r8 -30) = r1 ; R1_w=0 ; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); ; strobemeta_probe.bpf.c:400 5381: (79) r3 = *(u64 *)(r7 -8) ; R3_w=scalar() R7_w=map_value(map=raw_map_heap,ks=4,vs=264,off=192) 5382: (7b) *(u64 *)(r10 -24) = r6 ; ... ; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); ; strobemeta_probe.bpf.c:400 5383: (bc) w6 = w6 ; R6_w=scalar(...) ; barrier_var(payload_off); ; strobemeta_probe.bpf.c:280 5384: (bf) r2 = r6 ; R2_w=scalar(...) 5385: (bf) r1 = r4 VS ; if (i >= map->cnt) (strobemeta_probe.bpf.c:396) 5376: (79) r1 = *(u64 *)(r10 -40) ; R1_w=map_value(map=raw_map_heap,ks=4,vs=264) R10=fp0 5377: (79) r1 = *(u64 *)(r1 +8) ; R1_w=scalar() ; if (i >= map->cnt) (strobemeta_probe.bpf.c:396) 5378: (dd) if r1 s<= r9 goto pc-5 ; R1_w=scalar(...) R9_w=10 ; descr->key_lens[i] = 0; (strobemeta_probe.bpf.c:398) 5379: (b4) w1 = 0 ; R1_w=0 5380: (6b) *(u16 *)(r8 -30) = r1 ; R1_w=0 ; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); (strobemeta_probe.bpf.c:400) 5381: (79) r3 = *(u64 *)(r7 -8) ; R3_w=scalar() R7_w=map_value(map=raw_map_heap,ks=4,vs=264,off=192) 5382: (7b) *(u64 *)(r10 -24) = r6 ; ... ; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); (strobemeta_probe.bpf.c:400) 5383: (bc) w6 = w6 ; R6_w=scalar(...) ; barrier_var(payload_off); (strobemeta_probe.bpf.c:280) 5384: (bf) r2 = r6 ; R2_w=scalar(...) 5385: (bf) r1 = r4 Can't say that either is super nice and clean. But when I tried e) proposal, I realized that semicolon separators are used also for register state (next to instruction dump) and they sort of overlap visually more and make it a bit harder to read log (subjective IMO, of course). But let me know if you still prefer e) and I'll send v2 with it. > > e) ; if (arg_spec->arg_signed) ; usdt.bpf.h:184 > > Note that that fragile test needs to be adjusted again: > Error: #137/3 log_fixup/bad_core_relo_trunc_full yep, my bad, I forgot to run all test_progs tests this time, already fixed locally > > pw-bot: cr
On Mon, Feb 12, 2024 at 10:59 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Sun, Feb 11, 2024 at 11:43 AM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > > > On Fri, Feb 9, 2024 at 4:33 PM Andrii Nakryiko <andrii@kernel.org> wrote: > > > > > > As BPF applications grow in size and complexity and are separated into > > > multiple .bpf.c files that are statically linked together, it becomes > > > harder and harder to match verifier's BPF assembly level output to > > > original C code. While often annotated C source code is unique enough to > > > be able to identify the file it belongs to, quite often this is actually > > > problematic as parts of source code can be quite generic. > > > > > > Long story short, it is very useful to see source code file name and > > > line number information along with the original C code. Verifier already > > > knows this information, we just need to output it. > > > > > > This patch set is an initial proposal on how this can be done. No new > > > flags are added and file:line information is appended at the end of > > > C code: > > > > > > ; <original C code> (<filename>.bpf.c:<line>) > > > > > > If file name has directory names in it, they are stripped away. This > > > should be fine in practice as file names tend to be pretty unique with > > > C code anyways, and keeping log size smaller is always good. > > > > > > In practice this might look something like below, where some code is > > > coming from application files, while others are from libbpf's usdt.bpf.h > > > header file: > > > > > > ; if (STROBEMETA_READ( (strobemeta_probe.bpf.c:534) > > > 5592: (79) r1 = *(u64 *)(r10 -56) ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680) > > > 5593: (7b) *(u64 *)(r10 -56) = r1 ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680) > > > 5594: (79) r3 = *(u64 *)(r10 -8) ; R3_w=scalar() R10=fp0 fp-8=mmmmmmmm > > > > > > ... > > > > > > 170: (71) r1 = *(u8 *)(r8 +15) ; frame1: R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) R8_w=map_value(map=__bpf_usdt_spec,ks=4,vs=208) > > > 171: (67) r1 <<= 56 ; frame1: R1_w=scalar(smax=0x7f00000000000000,umax=0xff00000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xff00000000000000)) > > > 172: (c7) r1 s>>= 56 ; frame1: R1_w=scalar(smin=smin32=-128,smax=smax32=127) > > > ; val <<= arg_spec->arg_bitshift; (usdt.bpf.h:183) > > > 173: (67) r1 <<= 32 ; frame1: R1_w=scalar(smax=0x7f00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000)) > > > 174: (77) r1 >>= 32 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) > > > 175: (79) r2 = *(u64 *)(r10 -8) ; frame1: R2_w=scalar() R10=fp0 fp-8=mmmmmmmm > > > 176: (6f) r2 <<= r1 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R2_w=scalar() > > > 177: (7b) *(u64 *)(r10 -8) = r2 ; frame1: R2_w=scalar(id=61) R10=fp0 fp-8_w=scalar(id=61) > > > ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > > > 178: (bf) r3 = r2 ; frame1: R2_w=scalar(id=61) R3_w=scalar(id=61) > > > 179: (7f) r3 >>= r1 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R3_w=scalar() > > > ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > > > 180: (71) r4 = *(u8 *)(r8 +14) > > > 181: safe > > > > > > I've played with few different formats and none stood out as > > > particularly better than other. Suggestions and votes are appreciated: > > > > > > a) ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > > > b) ; if (arg_spec->arg_signed) [usdt.bpf.h:184] > > > c) ; [usdt.bpf.h:184] if (arg_spec->arg_signed) > > > d) ; (usdt.bpf.h:184) if (arg_spec->arg_signed) > > > > Great idea. > > I would drop parenthesis. Don't see a reason to wrap a text. > > Since we already use ';' as a comment I would continue: > > A bit worried that it will quite weird for any source line which ends > in ';', like: > > ; x = 123; ; my_file.bpf.c:123 > r1 = 123; > > > E.g., here's a real excerpt (I cut some register states for cleanliness): > > ; if (i >= map->cnt) ; strobemeta_probe.bpf.c:396 > 5376: (79) r1 = *(u64 *)(r10 -40) ; > R1_w=map_value(map=raw_map_heap,ks=4,vs=264) R10=fp0 > 5377: (79) r1 = *(u64 *)(r1 +8) ; R1_w=scalar(...) R9_w=10 > ; if (i >= map->cnt) ; strobemeta_probe.bpf.c:396 > 5378: (dd) if r1 s<= r9 goto pc-5 ; R1_w=scalar(smin=umin=11,...) > ; descr->key_lens[i] = 0; ; strobemeta_probe.bpf.c:398 > 5379: (b4) w1 = 0 ; R1_w=0 > 5380: (6b) *(u16 *)(r8 -30) = r1 ; R1_w=0 > ; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); ; > strobemeta_probe.bpf.c:400 > 5381: (79) r3 = *(u64 *)(r7 -8) ; R3_w=scalar() > R7_w=map_value(map=raw_map_heap,ks=4,vs=264,off=192) > 5382: (7b) *(u64 *)(r10 -24) = r6 ; ... > ; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); ; > strobemeta_probe.bpf.c:400 > 5383: (bc) w6 = w6 ; R6_w=scalar(...) > ; barrier_var(payload_off); ; strobemeta_probe.bpf.c:280 > 5384: (bf) r2 = r6 ; R2_w=scalar(...) > 5385: (bf) r1 = r4 > > VS > > ; if (i >= map->cnt) (strobemeta_probe.bpf.c:396) > 5376: (79) r1 = *(u64 *)(r10 -40) ; > R1_w=map_value(map=raw_map_heap,ks=4,vs=264) R10=fp0 > 5377: (79) r1 = *(u64 *)(r1 +8) ; R1_w=scalar() > ; if (i >= map->cnt) (strobemeta_probe.bpf.c:396) > 5378: (dd) if r1 s<= r9 goto pc-5 ; R1_w=scalar(...) R9_w=10 > ; descr->key_lens[i] = 0; (strobemeta_probe.bpf.c:398) > 5379: (b4) w1 = 0 ; R1_w=0 > 5380: (6b) *(u16 *)(r8 -30) = r1 ; R1_w=0 > ; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); > (strobemeta_probe.bpf.c:400) > 5381: (79) r3 = *(u64 *)(r7 -8) ; R3_w=scalar() > R7_w=map_value(map=raw_map_heap,ks=4,vs=264,off=192) > 5382: (7b) *(u64 *)(r10 -24) = r6 ; ... > ; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); > (strobemeta_probe.bpf.c:400) > 5383: (bc) w6 = w6 ; R6_w=scalar(...) > ; barrier_var(payload_off); (strobemeta_probe.bpf.c:280) > 5384: (bf) r2 = r6 ; R2_w=scalar(...) > 5385: (bf) r1 = r4 > > > Can't say that either is super nice and clean. But when I tried e) > proposal, I realized that semicolon separators are used also for > register state (next to instruction dump) and they sort of overlap > visually more and make it a bit harder to read log (subjective IMO, of > course). > > But let me know if you still prefer e) and I'll send v2 with it. > Goodness, gmail made everything even worse. See [0] for visual comparison [0] https://gist.github.com/anakryiko/f5e9217f277b0f8cd156ceb6cb641268 > > > > e) ; if (arg_spec->arg_signed) ; usdt.bpf.h:184 > > > > Note that that fragile test needs to be adjusted again: > > Error: #137/3 log_fixup/bad_core_relo_trunc_full > > yep, my bad, I forgot to run all test_progs tests this time, already > fixed locally > > > > > pw-bot: cr
On Mon, Feb 12, 2024 at 11:02 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > > > > Can't say that either is super nice and clean. But when I tried e) > > proposal, I realized that semicolon separators are used also for > > register state (next to instruction dump) and they sort of overlap > > visually more and make it a bit harder to read log (subjective IMO, of > > course). > > > > But let me know if you still prefer e) and I'll send v2 with it. > > > > Goodness, gmail made everything even worse. See [0] for visual comparison > > [0] https://gist.github.com/anakryiko/f5e9217f277b0f8cd156ceb6cb641268 Two ; ; are indeed not pretty. Maybe let's use a single character that is not used in C ? Like @ ? Then it will be: ; if (i >= map->cnt) @ strobemeta_probe.bpf.c:396 ; descr->key_lens[i] = 0; @ strobemeta_probe.bpf.c:398 Some asm languages use ! as a comment. It's ok-ish. a bit worse imo: ; if (i >= map->cnt) ! strobemeta_probe.bpf.c:396 ; descr->key_lens[i] = 0; ! strobemeta_probe.bpf.c:398 or single underscore ? ; if (i >= map->cnt) _ strobemeta_probe.bpf.c:396 ; descr->key_lens[i] = 0; _ strobemeta_probe.bpf.c:398 I think all of the above are better than () or []
On Mon, Feb 12, 2024 at 12:05 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Mon, Feb 12, 2024 at 11:02 AM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > > > > > > > Can't say that either is super nice and clean. But when I tried e) > > > proposal, I realized that semicolon separators are used also for > > > register state (next to instruction dump) and they sort of overlap > > > visually more and make it a bit harder to read log (subjective IMO, of > > > course). > > > > > > But let me know if you still prefer e) and I'll send v2 with it. > > > > > > > Goodness, gmail made everything even worse. See [0] for visual comparison > > > > [0] https://gist.github.com/anakryiko/f5e9217f277b0f8cd156ceb6cb641268 > > > Two ; ; are indeed not pretty. > Maybe let's use a single character that is not used in C ? > Like @ ? I like @, it's both distinctive and meaningful. Will send v2 with @ <file>:<line>. > > Then it will be: > ; if (i >= map->cnt) @ strobemeta_probe.bpf.c:396 > ; descr->key_lens[i] = 0; @ strobemeta_probe.bpf.c:398 > > Some asm languages use ! as a comment. It's ok-ish. a bit worse imo: > ; if (i >= map->cnt) ! strobemeta_probe.bpf.c:396 > ; descr->key_lens[i] = 0; ! strobemeta_probe.bpf.c:398 > > or single underscore ? > ; if (i >= map->cnt) _ strobemeta_probe.bpf.c:396 > ; descr->key_lens[i] = 0; _ strobemeta_probe.bpf.c:398 > > I think all of the above are better than () or []
diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c index 594a234f122b..4b49d0eb5cd1 100644 --- a/kernel/bpf/log.c +++ b/kernel/bpf/log.c @@ -9,6 +9,7 @@ #include <linux/bpf.h> #include <linux/bpf_verifier.h> #include <linux/math64.h> +#include <linux/string.h> #define verbose(env, fmt, args...) bpf_verifier_log_write(env, fmt, ##args) @@ -362,6 +363,8 @@ __printf(3, 4) void verbose_linfo(struct bpf_verifier_env *env, const char *prefix_fmt, ...) { const struct bpf_line_info *linfo; + const struct btf *btf; + const char *s, *fname; if (!bpf_verifier_log_needed(&env->log)) return; @@ -378,9 +381,15 @@ __printf(3, 4) void verbose_linfo(struct bpf_verifier_env *env, va_end(args); } - verbose(env, "%s\n", - ltrim(btf_name_by_offset(env->prog->aux->btf, - linfo->line_off))); + btf = env->prog->aux->btf; + s = ltrim(btf_name_by_offset(btf, linfo->line_off)); + verbose(env, "%s", s); /* source code line */ + + s = btf_name_by_offset(btf, linfo->file_name_off); + /* leave only file name */ + fname = strrchr(s, '/'); + fname = fname ? fname + 1 : s; + verbose(env, " (%s:%u)\n", fname, BPF_LINE_INFO_LINE_NUM(linfo->line_col)); env->prev_linfo = linfo; }
As BPF applications grow in size and complexity and are separated into multiple .bpf.c files that are statically linked together, it becomes harder and harder to match verifier's BPF assembly level output to original C code. While often annotated C source code is unique enough to be able to identify the file it belongs to, quite often this is actually problematic as parts of source code can be quite generic. Long story short, it is very useful to see source code file name and line number information along with the original C code. Verifier already knows this information, we just need to output it. This patch set is an initial proposal on how this can be done. No new flags are added and file:line information is appended at the end of C code: ; <original C code> (<filename>.bpf.c:<line>) If file name has directory names in it, they are stripped away. This should be fine in practice as file names tend to be pretty unique with C code anyways, and keeping log size smaller is always good. In practice this might look something like below, where some code is coming from application files, while others are from libbpf's usdt.bpf.h header file: ; if (STROBEMETA_READ( (strobemeta_probe.bpf.c:534) 5592: (79) r1 = *(u64 *)(r10 -56) ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680) 5593: (7b) *(u64 *)(r10 -56) = r1 ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680) 5594: (79) r3 = *(u64 *)(r10 -8) ; R3_w=scalar() R10=fp0 fp-8=mmmmmmmm ... 170: (71) r1 = *(u8 *)(r8 +15) ; frame1: R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) R8_w=map_value(map=__bpf_usdt_spec,ks=4,vs=208) 171: (67) r1 <<= 56 ; frame1: R1_w=scalar(smax=0x7f00000000000000,umax=0xff00000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xff00000000000000)) 172: (c7) r1 s>>= 56 ; frame1: R1_w=scalar(smin=smin32=-128,smax=smax32=127) ; val <<= arg_spec->arg_bitshift; (usdt.bpf.h:183) 173: (67) r1 <<= 32 ; frame1: R1_w=scalar(smax=0x7f00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000)) 174: (77) r1 >>= 32 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) 175: (79) r2 = *(u64 *)(r10 -8) ; frame1: R2_w=scalar() R10=fp0 fp-8=mmmmmmmm 176: (6f) r2 <<= r1 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R2_w=scalar() 177: (7b) *(u64 *)(r10 -8) = r2 ; frame1: R2_w=scalar(id=61) R10=fp0 fp-8_w=scalar(id=61) ; if (arg_spec->arg_signed) (usdt.bpf.h:184) 178: (bf) r3 = r2 ; frame1: R2_w=scalar(id=61) R3_w=scalar(id=61) 179: (7f) r3 >>= r1 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R3_w=scalar() ; if (arg_spec->arg_signed) (usdt.bpf.h:184) 180: (71) r4 = *(u8 *)(r8 +14) 181: safe I've played with few different formats and none stood out as particularly better than other. Suggestions and votes are appreciated: a) ; if (arg_spec->arg_signed) (usdt.bpf.h:184) b) ; if (arg_spec->arg_signed) [usdt.bpf.h:184] c) ; [usdt.bpf.h:184] if (arg_spec->arg_signed) d) ; (usdt.bpf.h:184) if (arg_spec->arg_signed) Above output shows variant a), which is quite non-distracting in practice. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> --- kernel/bpf/log.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)