diff mbox series

[bpf-next] bpf: emit source code file name and line number in verifier log

Message ID 20240210003308.3374075-1-andrii@kernel.org (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series [bpf-next] bpf: emit source code file name and line number in verifier log | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 fail Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 fail Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-17 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-22 fail Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 fail Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-32 fail Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-31 fail Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-39 fail Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-40 fail Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-38 fail Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-42 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1050 this patch: 1050
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 9 maintainers not CCed: jolsa@kernel.org john.fastabend@gmail.com yonghong.song@linux.dev martin.lau@linux.dev song@kernel.org sdf@google.com eddyz87@gmail.com kpsingh@kernel.org haoluo@google.com
netdev/build_clang success Errors and warnings before: 1066 this patch: 1066
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1067 this patch: 1067
netdev/checkpatch warning WARNING: line length of 83 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Andrii Nakryiko Feb. 10, 2024, 12:33 a.m. UTC
As BPF applications grow in size and complexity and are separated into
multiple .bpf.c files that are statically linked together, it becomes
harder and harder to match verifier's BPF assembly level output to
original C code. While often annotated C source code is unique enough to
be able to identify the file it belongs to, quite often this is actually
problematic as parts of source code can be quite generic.

Long story short, it is very useful to see source code file name and
line number information along with the original C code. Verifier already
knows this information, we just need to output it.

This patch set is an initial proposal on how this can be done. No new
flags are added and file:line information is appended at the end of
C code:

  ; <original C code> (<filename>.bpf.c:<line>)

If file name has directory names in it, they are stripped away. This
should be fine in practice as file names tend to be pretty unique with
C code anyways, and keeping log size smaller is always good.

In practice this might look something like below, where some code is
coming from application files, while others are from libbpf's usdt.bpf.h
header file:

  ; if (STROBEMETA_READ( (strobemeta_probe.bpf.c:534)
  5592: (79) r1 = *(u64 *)(r10 -56)     ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680)
  5593: (7b) *(u64 *)(r10 -56) = r1     ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680)
  5594: (79) r3 = *(u64 *)(r10 -8)      ; R3_w=scalar() R10=fp0 fp-8=mmmmmmmm

  ...

  170: (71) r1 = *(u8 *)(r8 +15)        ; frame1: R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) R8_w=map_value(map=__bpf_usdt_spec,ks=4,vs=208)
  171: (67) r1 <<= 56                   ; frame1: R1_w=scalar(smax=0x7f00000000000000,umax=0xff00000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xff00000000000000))
  172: (c7) r1 s>>= 56                  ; frame1: R1_w=scalar(smin=smin32=-128,smax=smax32=127)
  ; val <<= arg_spec->arg_bitshift; (usdt.bpf.h:183)
  173: (67) r1 <<= 32                   ; frame1: R1_w=scalar(smax=0x7f00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000))
  174: (77) r1 >>= 32                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
  175: (79) r2 = *(u64 *)(r10 -8)       ; frame1: R2_w=scalar() R10=fp0 fp-8=mmmmmmmm
  176: (6f) r2 <<= r1                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R2_w=scalar()
  177: (7b) *(u64 *)(r10 -8) = r2       ; frame1: R2_w=scalar(id=61) R10=fp0 fp-8_w=scalar(id=61)
  ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
  178: (bf) r3 = r2                     ; frame1: R2_w=scalar(id=61) R3_w=scalar(id=61)
  179: (7f) r3 >>= r1                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R3_w=scalar()
  ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
  180: (71) r4 = *(u8 *)(r8 +14)
  181: safe

I've played with few different formats and none stood out as
particularly better than other. Suggestions and votes are appreciated:

  a) ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
  b) ; if (arg_spec->arg_signed) [usdt.bpf.h:184]
  c) ; [usdt.bpf.h:184] if (arg_spec->arg_signed)
  d) ; (usdt.bpf.h:184) if (arg_spec->arg_signed)

Above output shows variant a), which is quite non-distracting in
practice.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/log.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

Comments

Yonghong Song Feb. 11, 2024, 4:26 p.m. UTC | #1
On 2/9/24 4:33 PM, Andrii Nakryiko wrote:
> As BPF applications grow in size and complexity and are separated into
> multiple .bpf.c files that are statically linked together, it becomes
> harder and harder to match verifier's BPF assembly level output to
> original C code. While often annotated C source code is unique enough to
> be able to identify the file it belongs to, quite often this is actually
> problematic as parts of source code can be quite generic.
>
> Long story short, it is very useful to see source code file name and
> line number information along with the original C code. Verifier already
> knows this information, we just need to output it.
>
> This patch set is an initial proposal on how this can be done. No new
> flags are added and file:line information is appended at the end of
> C code:
>
>    ; <original C code> (<filename>.bpf.c:<line>)
>
> If file name has directory names in it, they are stripped away. This
> should be fine in practice as file names tend to be pretty unique with
> C code anyways, and keeping log size smaller is always good.
>
> In practice this might look something like below, where some code is
> coming from application files, while others are from libbpf's usdt.bpf.h
> header file:
>
>    ; if (STROBEMETA_READ( (strobemeta_probe.bpf.c:534)
>    5592: (79) r1 = *(u64 *)(r10 -56)     ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680)
>    5593: (7b) *(u64 *)(r10 -56) = r1     ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680)
>    5594: (79) r3 = *(u64 *)(r10 -8)      ; R3_w=scalar() R10=fp0 fp-8=mmmmmmmm
>
>    ...
>
>    170: (71) r1 = *(u8 *)(r8 +15)        ; frame1: R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) R8_w=map_value(map=__bpf_usdt_spec,ks=4,vs=208)
>    171: (67) r1 <<= 56                   ; frame1: R1_w=scalar(smax=0x7f00000000000000,umax=0xff00000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xff00000000000000))
>    172: (c7) r1 s>>= 56                  ; frame1: R1_w=scalar(smin=smin32=-128,smax=smax32=127)
>    ; val <<= arg_spec->arg_bitshift; (usdt.bpf.h:183)
>    173: (67) r1 <<= 32                   ; frame1: R1_w=scalar(smax=0x7f00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000))
>    174: (77) r1 >>= 32                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
>    175: (79) r2 = *(u64 *)(r10 -8)       ; frame1: R2_w=scalar() R10=fp0 fp-8=mmmmmmmm
>    176: (6f) r2 <<= r1                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R2_w=scalar()
>    177: (7b) *(u64 *)(r10 -8) = r2       ; frame1: R2_w=scalar(id=61) R10=fp0 fp-8_w=scalar(id=61)
>    ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
>    178: (bf) r3 = r2                     ; frame1: R2_w=scalar(id=61) R3_w=scalar(id=61)
>    179: (7f) r3 >>= r1                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R3_w=scalar()
>    ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
>    180: (71) r4 = *(u8 *)(r8 +14)
>    181: safe
>
> I've played with few different formats and none stood out as
> particularly better than other. Suggestions and votes are appreciated:
>
>    a) ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
>    b) ; if (arg_spec->arg_signed) [usdt.bpf.h:184]
>    c) ; [usdt.bpf.h:184] if (arg_spec->arg_signed)
>    d) ; (usdt.bpf.h:184) if (arg_spec->arg_signed)
>
> Above output shows variant a), which is quite non-distracting in
> practice.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Yonghong Song <yonghong.song@linux.dev>
Alexei Starovoitov Feb. 11, 2024, 7:42 p.m. UTC | #2
On Fri, Feb 9, 2024 at 4:33 PM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> As BPF applications grow in size and complexity and are separated into
> multiple .bpf.c files that are statically linked together, it becomes
> harder and harder to match verifier's BPF assembly level output to
> original C code. While often annotated C source code is unique enough to
> be able to identify the file it belongs to, quite often this is actually
> problematic as parts of source code can be quite generic.
>
> Long story short, it is very useful to see source code file name and
> line number information along with the original C code. Verifier already
> knows this information, we just need to output it.
>
> This patch set is an initial proposal on how this can be done. No new
> flags are added and file:line information is appended at the end of
> C code:
>
>   ; <original C code> (<filename>.bpf.c:<line>)
>
> If file name has directory names in it, they are stripped away. This
> should be fine in practice as file names tend to be pretty unique with
> C code anyways, and keeping log size smaller is always good.
>
> In practice this might look something like below, where some code is
> coming from application files, while others are from libbpf's usdt.bpf.h
> header file:
>
>   ; if (STROBEMETA_READ( (strobemeta_probe.bpf.c:534)
>   5592: (79) r1 = *(u64 *)(r10 -56)     ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680)
>   5593: (7b) *(u64 *)(r10 -56) = r1     ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680)
>   5594: (79) r3 = *(u64 *)(r10 -8)      ; R3_w=scalar() R10=fp0 fp-8=mmmmmmmm
>
>   ...
>
>   170: (71) r1 = *(u8 *)(r8 +15)        ; frame1: R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) R8_w=map_value(map=__bpf_usdt_spec,ks=4,vs=208)
>   171: (67) r1 <<= 56                   ; frame1: R1_w=scalar(smax=0x7f00000000000000,umax=0xff00000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xff00000000000000))
>   172: (c7) r1 s>>= 56                  ; frame1: R1_w=scalar(smin=smin32=-128,smax=smax32=127)
>   ; val <<= arg_spec->arg_bitshift; (usdt.bpf.h:183)
>   173: (67) r1 <<= 32                   ; frame1: R1_w=scalar(smax=0x7f00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000))
>   174: (77) r1 >>= 32                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
>   175: (79) r2 = *(u64 *)(r10 -8)       ; frame1: R2_w=scalar() R10=fp0 fp-8=mmmmmmmm
>   176: (6f) r2 <<= r1                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R2_w=scalar()
>   177: (7b) *(u64 *)(r10 -8) = r2       ; frame1: R2_w=scalar(id=61) R10=fp0 fp-8_w=scalar(id=61)
>   ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
>   178: (bf) r3 = r2                     ; frame1: R2_w=scalar(id=61) R3_w=scalar(id=61)
>   179: (7f) r3 >>= r1                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R3_w=scalar()
>   ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
>   180: (71) r4 = *(u8 *)(r8 +14)
>   181: safe
>
> I've played with few different formats and none stood out as
> particularly better than other. Suggestions and votes are appreciated:
>
>   a) ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
>   b) ; if (arg_spec->arg_signed) [usdt.bpf.h:184]
>   c) ; [usdt.bpf.h:184] if (arg_spec->arg_signed)
>   d) ; (usdt.bpf.h:184) if (arg_spec->arg_signed)

Great idea.
I would drop parenthesis. Don't see a reason to wrap a text.
Since we already use ';' as a comment I would continue:

e) ; if (arg_spec->arg_signed) ; usdt.bpf.h:184

Note that that fragile test needs to be adjusted again:
Error: #137/3 log_fixup/bad_core_relo_trunc_full

pw-bot: cr
Andrii Nakryiko Feb. 12, 2024, 6:59 p.m. UTC | #3
On Sun, Feb 11, 2024 at 11:43 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Feb 9, 2024 at 4:33 PM Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > As BPF applications grow in size and complexity and are separated into
> > multiple .bpf.c files that are statically linked together, it becomes
> > harder and harder to match verifier's BPF assembly level output to
> > original C code. While often annotated C source code is unique enough to
> > be able to identify the file it belongs to, quite often this is actually
> > problematic as parts of source code can be quite generic.
> >
> > Long story short, it is very useful to see source code file name and
> > line number information along with the original C code. Verifier already
> > knows this information, we just need to output it.
> >
> > This patch set is an initial proposal on how this can be done. No new
> > flags are added and file:line information is appended at the end of
> > C code:
> >
> >   ; <original C code> (<filename>.bpf.c:<line>)
> >
> > If file name has directory names in it, they are stripped away. This
> > should be fine in practice as file names tend to be pretty unique with
> > C code anyways, and keeping log size smaller is always good.
> >
> > In practice this might look something like below, where some code is
> > coming from application files, while others are from libbpf's usdt.bpf.h
> > header file:
> >
> >   ; if (STROBEMETA_READ( (strobemeta_probe.bpf.c:534)
> >   5592: (79) r1 = *(u64 *)(r10 -56)     ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680)
> >   5593: (7b) *(u64 *)(r10 -56) = r1     ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680)
> >   5594: (79) r3 = *(u64 *)(r10 -8)      ; R3_w=scalar() R10=fp0 fp-8=mmmmmmmm
> >
> >   ...
> >
> >   170: (71) r1 = *(u8 *)(r8 +15)        ; frame1: R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) R8_w=map_value(map=__bpf_usdt_spec,ks=4,vs=208)
> >   171: (67) r1 <<= 56                   ; frame1: R1_w=scalar(smax=0x7f00000000000000,umax=0xff00000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xff00000000000000))
> >   172: (c7) r1 s>>= 56                  ; frame1: R1_w=scalar(smin=smin32=-128,smax=smax32=127)
> >   ; val <<= arg_spec->arg_bitshift; (usdt.bpf.h:183)
> >   173: (67) r1 <<= 32                   ; frame1: R1_w=scalar(smax=0x7f00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000))
> >   174: (77) r1 >>= 32                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
> >   175: (79) r2 = *(u64 *)(r10 -8)       ; frame1: R2_w=scalar() R10=fp0 fp-8=mmmmmmmm
> >   176: (6f) r2 <<= r1                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R2_w=scalar()
> >   177: (7b) *(u64 *)(r10 -8) = r2       ; frame1: R2_w=scalar(id=61) R10=fp0 fp-8_w=scalar(id=61)
> >   ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
> >   178: (bf) r3 = r2                     ; frame1: R2_w=scalar(id=61) R3_w=scalar(id=61)
> >   179: (7f) r3 >>= r1                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R3_w=scalar()
> >   ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
> >   180: (71) r4 = *(u8 *)(r8 +14)
> >   181: safe
> >
> > I've played with few different formats and none stood out as
> > particularly better than other. Suggestions and votes are appreciated:
> >
> >   a) ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
> >   b) ; if (arg_spec->arg_signed) [usdt.bpf.h:184]
> >   c) ; [usdt.bpf.h:184] if (arg_spec->arg_signed)
> >   d) ; (usdt.bpf.h:184) if (arg_spec->arg_signed)
>
> Great idea.
> I would drop parenthesis. Don't see a reason to wrap a text.
> Since we already use ';' as a comment I would continue:

A bit worried that it will quite weird for any source line which ends
in ';', like:

; x = 123; ; my_file.bpf.c:123
r1 = 123;


E.g., here's a real excerpt (I cut some register states for cleanliness):

; if (i >= map->cnt) ; strobemeta_probe.bpf.c:396
5376: (79) r1 = *(u64 *)(r10 -40)     ;
R1_w=map_value(map=raw_map_heap,ks=4,vs=264) R10=fp0
5377: (79) r1 = *(u64 *)(r1 +8)       ; R1_w=scalar(...) R9_w=10
; if (i >= map->cnt) ; strobemeta_probe.bpf.c:396
5378: (dd) if r1 s<= r9 goto pc-5     ; R1_w=scalar(smin=umin=11,...)
; descr->key_lens[i] = 0; ; strobemeta_probe.bpf.c:398
5379: (b4) w1 = 0                     ; R1_w=0
5380: (6b) *(u16 *)(r8 -30) = r1      ; R1_w=0
; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); ;
strobemeta_probe.bpf.c:400
5381: (79) r3 = *(u64 *)(r7 -8)       ; R3_w=scalar()
R7_w=map_value(map=raw_map_heap,ks=4,vs=264,off=192)
5382: (7b) *(u64 *)(r10 -24) = r6     ; ...
; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); ;
strobemeta_probe.bpf.c:400
5383: (bc) w6 = w6                    ; R6_w=scalar(...)
; barrier_var(payload_off); ; strobemeta_probe.bpf.c:280
5384: (bf) r2 = r6                    ; R2_w=scalar(...)
5385: (bf) r1 = r4

VS

; if (i >= map->cnt) (strobemeta_probe.bpf.c:396)
5376: (79) r1 = *(u64 *)(r10 -40)     ;
R1_w=map_value(map=raw_map_heap,ks=4,vs=264) R10=fp0
5377: (79) r1 = *(u64 *)(r1 +8)       ; R1_w=scalar()
; if (i >= map->cnt) (strobemeta_probe.bpf.c:396)
5378: (dd) if r1 s<= r9 goto pc-5     ; R1_w=scalar(...) R9_w=10
; descr->key_lens[i] = 0; (strobemeta_probe.bpf.c:398)
5379: (b4) w1 = 0                     ; R1_w=0
5380: (6b) *(u16 *)(r8 -30) = r1      ; R1_w=0
; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key);
(strobemeta_probe.bpf.c:400)
5381: (79) r3 = *(u64 *)(r7 -8)       ; R3_w=scalar()
R7_w=map_value(map=raw_map_heap,ks=4,vs=264,off=192)
5382: (7b) *(u64 *)(r10 -24) = r6     ; ...
; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key);
(strobemeta_probe.bpf.c:400)
5383: (bc) w6 = w6                    ; R6_w=scalar(...)
; barrier_var(payload_off); (strobemeta_probe.bpf.c:280)
5384: (bf) r2 = r6                    ; R2_w=scalar(...)
5385: (bf) r1 = r4


Can't say that either is super nice and clean. But when I tried e)
proposal, I realized that semicolon separators are used also for
register state (next to instruction dump) and they sort of overlap
visually more and make it a bit harder to read log (subjective IMO, of
course).

But let me know if you still prefer e) and I'll send v2 with it.

>
> e) ; if (arg_spec->arg_signed) ; usdt.bpf.h:184
>
> Note that that fragile test needs to be adjusted again:
> Error: #137/3 log_fixup/bad_core_relo_trunc_full

yep, my bad, I forgot to run all test_progs tests this time, already
fixed locally

>
> pw-bot: cr
Andrii Nakryiko Feb. 12, 2024, 7:01 p.m. UTC | #4
On Mon, Feb 12, 2024 at 10:59 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Sun, Feb 11, 2024 at 11:43 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Fri, Feb 9, 2024 at 4:33 PM Andrii Nakryiko <andrii@kernel.org> wrote:
> > >
> > > As BPF applications grow in size and complexity and are separated into
> > > multiple .bpf.c files that are statically linked together, it becomes
> > > harder and harder to match verifier's BPF assembly level output to
> > > original C code. While often annotated C source code is unique enough to
> > > be able to identify the file it belongs to, quite often this is actually
> > > problematic as parts of source code can be quite generic.
> > >
> > > Long story short, it is very useful to see source code file name and
> > > line number information along with the original C code. Verifier already
> > > knows this information, we just need to output it.
> > >
> > > This patch set is an initial proposal on how this can be done. No new
> > > flags are added and file:line information is appended at the end of
> > > C code:
> > >
> > >   ; <original C code> (<filename>.bpf.c:<line>)
> > >
> > > If file name has directory names in it, they are stripped away. This
> > > should be fine in practice as file names tend to be pretty unique with
> > > C code anyways, and keeping log size smaller is always good.
> > >
> > > In practice this might look something like below, where some code is
> > > coming from application files, while others are from libbpf's usdt.bpf.h
> > > header file:
> > >
> > >   ; if (STROBEMETA_READ( (strobemeta_probe.bpf.c:534)
> > >   5592: (79) r1 = *(u64 *)(r10 -56)     ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680)
> > >   5593: (7b) *(u64 *)(r10 -56) = r1     ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680)
> > >   5594: (79) r3 = *(u64 *)(r10 -8)      ; R3_w=scalar() R10=fp0 fp-8=mmmmmmmm
> > >
> > >   ...
> > >
> > >   170: (71) r1 = *(u8 *)(r8 +15)        ; frame1: R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) R8_w=map_value(map=__bpf_usdt_spec,ks=4,vs=208)
> > >   171: (67) r1 <<= 56                   ; frame1: R1_w=scalar(smax=0x7f00000000000000,umax=0xff00000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xff00000000000000))
> > >   172: (c7) r1 s>>= 56                  ; frame1: R1_w=scalar(smin=smin32=-128,smax=smax32=127)
> > >   ; val <<= arg_spec->arg_bitshift; (usdt.bpf.h:183)
> > >   173: (67) r1 <<= 32                   ; frame1: R1_w=scalar(smax=0x7f00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000))
> > >   174: (77) r1 >>= 32                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
> > >   175: (79) r2 = *(u64 *)(r10 -8)       ; frame1: R2_w=scalar() R10=fp0 fp-8=mmmmmmmm
> > >   176: (6f) r2 <<= r1                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R2_w=scalar()
> > >   177: (7b) *(u64 *)(r10 -8) = r2       ; frame1: R2_w=scalar(id=61) R10=fp0 fp-8_w=scalar(id=61)
> > >   ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
> > >   178: (bf) r3 = r2                     ; frame1: R2_w=scalar(id=61) R3_w=scalar(id=61)
> > >   179: (7f) r3 >>= r1                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R3_w=scalar()
> > >   ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
> > >   180: (71) r4 = *(u8 *)(r8 +14)
> > >   181: safe
> > >
> > > I've played with few different formats and none stood out as
> > > particularly better than other. Suggestions and votes are appreciated:
> > >
> > >   a) ; if (arg_spec->arg_signed) (usdt.bpf.h:184)
> > >   b) ; if (arg_spec->arg_signed) [usdt.bpf.h:184]
> > >   c) ; [usdt.bpf.h:184] if (arg_spec->arg_signed)
> > >   d) ; (usdt.bpf.h:184) if (arg_spec->arg_signed)
> >
> > Great idea.
> > I would drop parenthesis. Don't see a reason to wrap a text.
> > Since we already use ';' as a comment I would continue:
>
> A bit worried that it will quite weird for any source line which ends
> in ';', like:
>
> ; x = 123; ; my_file.bpf.c:123
> r1 = 123;
>
>
> E.g., here's a real excerpt (I cut some register states for cleanliness):
>
> ; if (i >= map->cnt) ; strobemeta_probe.bpf.c:396
> 5376: (79) r1 = *(u64 *)(r10 -40)     ;
> R1_w=map_value(map=raw_map_heap,ks=4,vs=264) R10=fp0
> 5377: (79) r1 = *(u64 *)(r1 +8)       ; R1_w=scalar(...) R9_w=10
> ; if (i >= map->cnt) ; strobemeta_probe.bpf.c:396
> 5378: (dd) if r1 s<= r9 goto pc-5     ; R1_w=scalar(smin=umin=11,...)
> ; descr->key_lens[i] = 0; ; strobemeta_probe.bpf.c:398
> 5379: (b4) w1 = 0                     ; R1_w=0
> 5380: (6b) *(u16 *)(r8 -30) = r1      ; R1_w=0
> ; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); ;
> strobemeta_probe.bpf.c:400
> 5381: (79) r3 = *(u64 *)(r7 -8)       ; R3_w=scalar()
> R7_w=map_value(map=raw_map_heap,ks=4,vs=264,off=192)
> 5382: (7b) *(u64 *)(r10 -24) = r6     ; ...
> ; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key); ;
> strobemeta_probe.bpf.c:400
> 5383: (bc) w6 = w6                    ; R6_w=scalar(...)
> ; barrier_var(payload_off); ; strobemeta_probe.bpf.c:280
> 5384: (bf) r2 = r6                    ; R2_w=scalar(...)
> 5385: (bf) r1 = r4
>
> VS
>
> ; if (i >= map->cnt) (strobemeta_probe.bpf.c:396)
> 5376: (79) r1 = *(u64 *)(r10 -40)     ;
> R1_w=map_value(map=raw_map_heap,ks=4,vs=264) R10=fp0
> 5377: (79) r1 = *(u64 *)(r1 +8)       ; R1_w=scalar()
> ; if (i >= map->cnt) (strobemeta_probe.bpf.c:396)
> 5378: (dd) if r1 s<= r9 goto pc-5     ; R1_w=scalar(...) R9_w=10
> ; descr->key_lens[i] = 0; (strobemeta_probe.bpf.c:398)
> 5379: (b4) w1 = 0                     ; R1_w=0
> 5380: (6b) *(u16 *)(r8 -30) = r1      ; R1_w=0
> ; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key);
> (strobemeta_probe.bpf.c:400)
> 5381: (79) r3 = *(u64 *)(r7 -8)       ; R3_w=scalar()
> R7_w=map_value(map=raw_map_heap,ks=4,vs=264,off=192)
> 5382: (7b) *(u64 *)(r10 -24) = r6     ; ...
> ; task, data, off, STROBE_MAX_STR_LEN, map->entries[i].key);
> (strobemeta_probe.bpf.c:400)
> 5383: (bc) w6 = w6                    ; R6_w=scalar(...)
> ; barrier_var(payload_off); (strobemeta_probe.bpf.c:280)
> 5384: (bf) r2 = r6                    ; R2_w=scalar(...)
> 5385: (bf) r1 = r4
>
>
> Can't say that either is super nice and clean. But when I tried e)
> proposal, I realized that semicolon separators are used also for
> register state (next to instruction dump) and they sort of overlap
> visually more and make it a bit harder to read log (subjective IMO, of
> course).
>
> But let me know if you still prefer e) and I'll send v2 with it.
>

Goodness, gmail made everything even worse. See [0] for visual comparison

  [0] https://gist.github.com/anakryiko/f5e9217f277b0f8cd156ceb6cb641268


> >
> > e) ; if (arg_spec->arg_signed) ; usdt.bpf.h:184
> >
> > Note that that fragile test needs to be adjusted again:
> > Error: #137/3 log_fixup/bad_core_relo_trunc_full
>
> yep, my bad, I forgot to run all test_progs tests this time, already
> fixed locally
>
> >
> > pw-bot: cr
Alexei Starovoitov Feb. 12, 2024, 8:05 p.m. UTC | #5
On Mon, Feb 12, 2024 at 11:02 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> >
> > Can't say that either is super nice and clean. But when I tried e)
> > proposal, I realized that semicolon separators are used also for
> > register state (next to instruction dump) and they sort of overlap
> > visually more and make it a bit harder to read log (subjective IMO, of
> > course).
> >
> > But let me know if you still prefer e) and I'll send v2 with it.
> >
>
> Goodness, gmail made everything even worse. See [0] for visual comparison
>
>   [0] https://gist.github.com/anakryiko/f5e9217f277b0f8cd156ceb6cb641268


Two ; ; are indeed not pretty.
Maybe let's use a single character that is not used in C ?
Like @ ?

Then it will be:
; if (i >= map->cnt) @ strobemeta_probe.bpf.c:396
; descr->key_lens[i] = 0; @ strobemeta_probe.bpf.c:398

Some asm languages use ! as a comment. It's ok-ish. a bit worse imo:
; if (i >= map->cnt) ! strobemeta_probe.bpf.c:396
; descr->key_lens[i] = 0; ! strobemeta_probe.bpf.c:398

or single underscore ?
; if (i >= map->cnt) _ strobemeta_probe.bpf.c:396
; descr->key_lens[i] = 0; _ strobemeta_probe.bpf.c:398

I think all of the above are better than () or []
Andrii Nakryiko Feb. 12, 2024, 11:36 p.m. UTC | #6
On Mon, Feb 12, 2024 at 12:05 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Feb 12, 2024 at 11:02 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > >
> > > Can't say that either is super nice and clean. But when I tried e)
> > > proposal, I realized that semicolon separators are used also for
> > > register state (next to instruction dump) and they sort of overlap
> > > visually more and make it a bit harder to read log (subjective IMO, of
> > > course).
> > >
> > > But let me know if you still prefer e) and I'll send v2 with it.
> > >
> >
> > Goodness, gmail made everything even worse. See [0] for visual comparison
> >
> >   [0] https://gist.github.com/anakryiko/f5e9217f277b0f8cd156ceb6cb641268
>
>
> Two ; ; are indeed not pretty.
> Maybe let's use a single character that is not used in C ?
> Like @ ?

I like @, it's both distinctive and meaningful. Will send v2 with @
<file>:<line>.

>
> Then it will be:
> ; if (i >= map->cnt) @ strobemeta_probe.bpf.c:396
> ; descr->key_lens[i] = 0; @ strobemeta_probe.bpf.c:398
>
> Some asm languages use ! as a comment. It's ok-ish. a bit worse imo:
> ; if (i >= map->cnt) ! strobemeta_probe.bpf.c:396
> ; descr->key_lens[i] = 0; ! strobemeta_probe.bpf.c:398
>
> or single underscore ?
> ; if (i >= map->cnt) _ strobemeta_probe.bpf.c:396
> ; descr->key_lens[i] = 0; _ strobemeta_probe.bpf.c:398
>
> I think all of the above are better than () or []
diff mbox series

Patch

diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
index 594a234f122b..4b49d0eb5cd1 100644
--- a/kernel/bpf/log.c
+++ b/kernel/bpf/log.c
@@ -9,6 +9,7 @@ 
 #include <linux/bpf.h>
 #include <linux/bpf_verifier.h>
 #include <linux/math64.h>
+#include <linux/string.h>
 
 #define verbose(env, fmt, args...) bpf_verifier_log_write(env, fmt, ##args)
 
@@ -362,6 +363,8 @@  __printf(3, 4) void verbose_linfo(struct bpf_verifier_env *env,
 				  const char *prefix_fmt, ...)
 {
 	const struct bpf_line_info *linfo;
+	const struct btf *btf;
+	const char *s, *fname;
 
 	if (!bpf_verifier_log_needed(&env->log))
 		return;
@@ -378,9 +381,15 @@  __printf(3, 4) void verbose_linfo(struct bpf_verifier_env *env,
 		va_end(args);
 	}
 
-	verbose(env, "%s\n",
-		ltrim(btf_name_by_offset(env->prog->aux->btf,
-					 linfo->line_off)));
+	btf = env->prog->aux->btf;
+	s = ltrim(btf_name_by_offset(btf, linfo->line_off));
+	verbose(env, "%s", s); /* source code line */
+
+	s = btf_name_by_offset(btf, linfo->file_name_off);
+	/* leave only file name */
+	fname = strrchr(s, '/');
+	fname = fname ? fname + 1 : s;
+	verbose(env, " (%s:%u)\n", fname, BPF_LINE_INFO_LINE_NUM(linfo->line_col));
 
 	env->prev_linfo = linfo;
 }