Message ID | 20210207071726.3969978-1-yhs@fb.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | BPF |
Headers | show |
Series | [dwarves,v2] btf_encoder: sanitize non-regular int base type | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On Sun, Feb 7, 2021 at 8:17 AM Yonghong Song <yhs@fb.com> wrote: > > clang with dwarf5 may generate non-regular int base type, > i.e., not a signed/unsigned char/short/int/longlong/__int128. > Such base types are often used to describe > how an actual parameter or variable is generated. For example, > > 0x000015cf: DW_TAG_base_type > DW_AT_name ("DW_ATE_unsigned_1") > DW_AT_encoding (DW_ATE_unsigned) > DW_AT_byte_size (0x00) > > 0x00010ed9: DW_TAG_formal_parameter > DW_AT_location (DW_OP_lit0, > DW_OP_not, > DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1", > DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8", > DW_OP_stack_value) > DW_AT_abstract_origin (0x00013984 "branch") > > What it does is with a literal "0", did a "not" operation, and the converted to > one-bit unsigned int and then 8-bit unsigned int. > > Another example, > > 0x000e97e4: DW_TAG_base_type > DW_AT_name ("DW_ATE_unsigned_24") > DW_AT_encoding (DW_ATE_unsigned) > DW_AT_byte_size (0x03) > > 0x000f88f8: DW_TAG_variable > DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > [0xffffffff82808812, 0xffffffff82808817): > DW_OP_breg0 RAX+0, > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > DW_OP_stack_value, > DW_OP_piece 0x1, > DW_OP_breg0 RAX+0, > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > DW_OP_lit8, > DW_OP_shr, > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > DW_OP_stack_value, > DW_OP_piece 0x3 > ...... > > At one point, a right shift by 8 happens and the result is converted to > 32-bit unsigned int and then to 24-bit unsigned int. > > BTF does not need any of these DW_OP_* information and such non-regular int > types will cause libbpf to emit errors. > Let us sanitize them to generate BTF acceptable to libbpf and kernel. > > Cc: Sedat Dilek <sedat.dilek@gmail.com> Thanks for v2. For both v1 and v2: Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reported-by: Sedat Dilek <sedat.dilek@gmail.com> My development and testing environment: 1. Debian/testing AMD64 2. Linux v5.11-rc6+ with custom mostly Clang fixes 3. Debug-Info: BTF + DWARF-v5 enabled 4. Compiler and tools (incl. Clang's Integrated ASsembler IAS): LLVM/Clang v12.0.0-rc1 (make LLVM=1 LLVM_IAS=1) Build and boot on bare metal. - Sedat - > Acked-by: Andrii Nakryiko <andrii@kernel.org> > Signed-off-by: Yonghong Song <yhs@fb.com> > --- > libbtf.c | 39 ++++++++++++++++++++++++++++++++++++++- > 1 file changed, 38 insertions(+), 1 deletion(-) > > diff --git a/libbtf.c b/libbtf.c > index 9f76283..5843200 100644 > --- a/libbtf.c > +++ b/libbtf.c > @@ -373,6 +373,7 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, > struct btf *btf = btfe->btf; > const struct btf_type *t; > uint8_t encoding = 0; > + uint16_t byte_sz; > int32_t id; > > if (bt->is_signed) { > @@ -384,7 +385,43 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, > return -1; > } > > - id = btf__add_int(btf, name, BITS_ROUNDUP_BYTES(bt->bit_size), encoding); > + /* dwarf5 may emit DW_ATE_[un]signed_{num} base types where > + * {num} is not power of 2 and may exceed 128. Such attributes > + * are mostly used to record operation for an actual parameter > + * or variable. > + * For example, > + * DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > + * [0xffffffff82808812, 0xffffffff82808817): > + * DW_OP_breg0 RAX+0, > + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > + * DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > + * DW_OP_stack_value, > + * DW_OP_piece 0x1, > + * DW_OP_breg0 RAX+0, > + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > + * DW_OP_lit8, > + * DW_OP_shr, > + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > + * DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > + * DW_OP_stack_value, DW_OP_piece 0x3 > + * DW_AT_name ("ebx") > + * DW_AT_decl_file ("/linux/arch/x86/events/intel/core.c") > + * > + * In the above example, at some point, one unsigned_32 value > + * is right shifted by 8 and the result is converted to unsigned_32 > + * and then unsigned_24. > + * > + * BTF does not need such DW_OP_* information so let us sanitize > + * these non-regular int types to avoid libbpf/kernel complaints. > + */ > + byte_sz = BITS_ROUNDUP_BYTES(bt->bit_size); > + if (!byte_sz || (byte_sz & (byte_sz - 1))) { > + name = "__SANITIZED_FAKE_INT__"; > + byte_sz = 4; > + } > + > + id = btf__add_int(btf, name, byte_sz, encoding); > if (id < 0) { > btf_elf__log_err(btfe, BTF_KIND_INT, name, true, "Error emitting BTF type"); > } else { > -- > 2.24.1 >
Hi, On Sat, 2021-02-06 at 23:17 -0800, Yonghong Song wrote: > clang with dwarf5 may generate non-regular int base type, > i.e., not a signed/unsigned char/short/int/longlong/__int128. > Such base types are often used to describe > how an actual parameter or variable is generated. For example, > > 0x000015cf: DW_TAG_base_type > DW_AT_name ("DW_ATE_unsigned_1") > DW_AT_encoding (DW_ATE_unsigned) > DW_AT_byte_size (0x00) > > 0x00010ed9: DW_TAG_formal_parameter > DW_AT_location (DW_OP_lit0, > DW_OP_not, > DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1", > DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8", > DW_OP_stack_value) > DW_AT_abstract_origin (0x00013984 "branch") > > What it does is with a literal "0", did a "not" operation, and the converted to > one-bit unsigned int and then 8-bit unsigned int. Thanks for tracking this down. Do you have any idea why the clang compiler emits this? You might be right that it is intended to do what you describe it does (but then it would simply encode an unsigned constant 1 char in a very inefficient way). But as implemented it doesn't seem to make any sense. What would DW_OP_convert of an zero sized base type even mean (if it is intended as a 1 bit sized typed, then why is there no DW_AT_bit_size)? So I do think your patch makes sense. clang clearly is emitting something bogus. And so some fixup is needed. But maybe we should at least give a warning about it, otherwise it might never get fixed. BTW. If these bogus base types are only emitted as part of a location expression and not as part of an actual function or variable type description, then why are we even trying to encode it as a BTF type? It might be cheaper to just skip/drop it. But maybe the code setup makes it hard to know whether or not such a (bogus) type is actually referenced from a function or variable description? Cheers, Mark
On Sun, Feb 7, 2021 at 3:18 PM Mark Wielaard <mark@klomp.org> wrote: > > Hi, > > On Sat, 2021-02-06 at 23:17 -0800, Yonghong Song wrote: > > clang with dwarf5 may generate non-regular int base type, > > i.e., not a signed/unsigned char/short/int/longlong/__int128. > > Such base types are often used to describe > > how an actual parameter or variable is generated. For example, > > > > 0x000015cf: DW_TAG_base_type > > DW_AT_name ("DW_ATE_unsigned_1") > > DW_AT_encoding (DW_ATE_unsigned) > > DW_AT_byte_size (0x00) > > > > 0x00010ed9: DW_TAG_formal_parameter > > DW_AT_location (DW_OP_lit0, > > DW_OP_not, > > DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1", > > DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8", > > DW_OP_stack_value) > > DW_AT_abstract_origin (0x00013984 "branch") > > > > What it does is with a literal "0", did a "not" operation, and the converted to > > one-bit unsigned int and then 8-bit unsigned int. > > Thanks for tracking this down. Do you have any idea why the clang > compiler emits this? You might be right that it is intended to do what > you describe it does (but then it would simply encode an unsigned > constant 1 char in a very inefficient way). But as implemented it > doesn't seem to make any sense. What would DW_OP_convert of an zero > sized base type even mean (if it is intended as a 1 bit sized typed, > then why is there no DW_AT_bit_size)? > > So I do think your patch makes sense. clang clearly is emitting > something bogus. And so some fixup is needed. But maybe we should at > least give a warning about it, otherwise it might never get fixed. > > BTW. If these bogus base types are only emitted as part of a location > expression and not as part of an actual function or variable type > description, then why are we even trying to encode it as a BTF type? It > might be cheaper to just skip/drop it. But maybe the code setup makes > it hard to know whether or not such a (bogus) type is actually > referenced from a function or variable description? > As said this is with LLVM/Clang v12.0.0-rc1 and `make LLVM=1 LLVM_IAS=1` means use all LLVM tools (do not use GNU/binutils like GNU/ld BFD) and Clang's Integrated ASsembler (means do not use GNU AS). When doing an indexed search for "DW_ATE_unsigned"... ...this points to changes recently done in places like llvm/lib/CodeGen/AsmPrinter/. I see 16 changes there touching DWARF-x area since llvmorg-12.0.0-rc1 release: $ git log --oneline llvmorg-12.0.0-rc1.. llvm/lib/CodeGen/AsmPrinter/ e44a10094283 .gcc_except_table: Set SHF_LINK_ORDER if binutils>=2.36, and drop unneeded unique ID for -fno-unique-section-names 853a2649160c [AsmPrinter] __patchable_function_entries: Set SHF_LINK_ORDER for binutils 2.36 and above e3c0b0fe0958 [WebAssembly] locals can now be indirect in DWARF 34f3249abdff [DebugInfo] Fix error from D95893, where I accidentally used an unsigned int in a loop and it wraps around. a740af4de970 [CodeView][DebugInfo] Update the code for removing template arguments from the display name of a codeview function id. 56fa34ae3570 DebugInfo: Temporarily work around -gsplit-dwarf + LTO .debug_gnu_pubnames regression after D94976 8998f5843503 Re-land D94976 after revert in e29552c5aff6 d32deaab4d53 Revert "[DWARF] Location-less inlined variables should not have DW_TAG_variable" ddc2f1e3fb4f [DWARF] Location-less inlined variables should not have DW_TAG_variable 511c9a76fb98 [AsmPrinter] Use ListSeparator (NFC) 85b7b5625a00 Fix memory leak in 4318028cd2d7633a0cdeb0b5d4d2ed81fab87864 4318028cd2d7 DebugInfo: Add a DWARF FORM extension for addrx+offset references to reduce relocations e29552c5aff6 Revert "[DWARF] Create subprogram's DIE in DISubprogram's unit" dd7297e1bffe DebugInfo: Fix bug in addr+offset exprloc to use DWARFv5 addrx op instead of DWARFv4 GNU extension 7e6c87ee0454 DebugInfo: Deduplicate addresses in debug_addr ef0dcb506300 [DWARF] Create subprogram's DIE in DISubprogram's unit What I try to say is with LLVM-13 (git) this might look different? Here, I have a LLVM-12 ThinLTO+PGO optimized toolchain which saves 1/3 of Linux-kernel build-time. So, I do not want to switch to Debian's or packages from <apt.llvm.org>. These binaries take much much longer and I do not know if I get some new issues with Linux v5.11-rc6+. Again, I am not a LLVM toolchain expert. Best is to ask on llvm-dev mailing-list? Thanks. - Sedat - [1] https://github.com/llvm/llvm-project/search?o=desc&q=DW_ATE_unsigned&s=indexed
On 2/7/21 6:18 AM, Mark Wielaard wrote: > Hi, > > On Sat, 2021-02-06 at 23:17 -0800, Yonghong Song wrote: >> clang with dwarf5 may generate non-regular int base type, >> i.e., not a signed/unsigned char/short/int/longlong/__int128. >> Such base types are often used to describe >> how an actual parameter or variable is generated. For example, >> >> 0x000015cf: DW_TAG_base_type >> DW_AT_name ("DW_ATE_unsigned_1") >> DW_AT_encoding (DW_ATE_unsigned) >> DW_AT_byte_size (0x00) >> >> 0x00010ed9: DW_TAG_formal_parameter >> DW_AT_location (DW_OP_lit0, >> DW_OP_not, >> DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1", >> DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8", >> DW_OP_stack_value) >> DW_AT_abstract_origin (0x00013984 "branch") >> >> What it does is with a literal "0", did a "not" operation, and the converted to >> one-bit unsigned int and then 8-bit unsigned int. > > Thanks for tracking this down. Do you have any idea why the clang > compiler emits this? You might be right that it is intended to do what No, I don't know. > you describe it does (but then it would simply encode an unsigned > constant 1 char in a very inefficient way). But as implemented it > doesn't seem to make any sense. What would DW_OP_convert of an zero > sized base type even mean (if it is intended as a 1 bit sized typed, > then why is there no DW_AT_bit_size)? We need to report back to llvm community about this instance. DW_AT_byte_size = 0 only for DW_ATE_unsigned_1 does not sound right. DW_AT_bit_size might be needed as you suggested. > > So I do think your patch makes sense. clang clearly is emitting > something bogus. And so some fixup is needed. But maybe we should at > least give a warning about it, otherwise it might never get fixed. In pahole, to deal with dwarf weirdness, we have several other places for workaround without warning. I think it is okay not to have warning here as users cannot do anything about it. Also we have this special int type with name __SANITIZED_FAKE_INT__ which will signal the issue still exists if looking at BTF or generated vmlinux.h. > > BTW. If these bogus base types are only emitted as part of a location > expression and not as part of an actual function or variable type > description, then why are we even trying to encode it as a BTF type? It Let us still encode it as a BTF type as an evidence that we do changed some types. After deduplication, we will only have ONE __SANITIZED_FAKE_INT__ type. I think leave this one type in BTF is okay. > might be cheaper to just skip/drop it. But maybe the code setup makes > it hard to know whether or not such a (bogus) type is actually > referenced from a function or variable description? I guess it is possible that we do a preprocessing to check whether any types BTF intends to emit (var/func definition, etc.) referencing such types or not. This will involves overhead for every CU most of which does not have this issue. And this should not really happen given a sane dwarf. With __SANITIZED_FAKE_INT__ as the type name, if something e.g., a variable definition use this type, you will have __SANITIZED_FAKE_INT__ a; in skeleton, or have struct { __SANITIZED_FAKE_INT__ member; ... }; in vmlinux.h, we should be able to catch such an error easily. > > Cheers, > > Mark >
On Sun, Feb 7, 2021 at 6:18 AM Mark Wielaard <mark@klomp.org> wrote: > > Hi, > > On Sat, 2021-02-06 at 23:17 -0800, Yonghong Song wrote: > > clang with dwarf5 may generate non-regular int base type, > > i.e., not a signed/unsigned char/short/int/longlong/__int128. > > Such base types are often used to describe > > how an actual parameter or variable is generated. For example, > > > > 0x000015cf: DW_TAG_base_type > > DW_AT_name ("DW_ATE_unsigned_1") > > DW_AT_encoding (DW_ATE_unsigned) > > DW_AT_byte_size (0x00) > > > > 0x00010ed9: DW_TAG_formal_parameter > > DW_AT_location (DW_OP_lit0, > > DW_OP_not, > > DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1", > > DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8", > > DW_OP_stack_value) > > DW_AT_abstract_origin (0x00013984 "branch") > > > > What it does is with a literal "0", did a "not" operation, and the converted to > > one-bit unsigned int and then 8-bit unsigned int. > > Thanks for tracking this down. Do you have any idea why the clang > compiler emits this? You might be right that it is intended to do what > you describe it does (but then it would simply encode an unsigned > constant 1 char in a very inefficient way). But as implemented it > doesn't seem to make any sense. What would DW_OP_convert of an zero > sized base type even mean (if it is intended as a 1 bit sized typed, > then why is there no DW_AT_bit_size)? David, Any thoughts on the above sequence of DW_OP_ entries? This is a part of DWARF I'm unfamiliar with. > > So I do think your patch makes sense. clang clearly is emitting > something bogus. And so some fixup is needed. But maybe we should at > least give a warning about it, otherwise it might never get fixed. > > BTW. If these bogus base types are only emitted as part of a location > expression and not as part of an actual function or variable type > description, then why are we even trying to encode it as a BTF type? It > might be cheaper to just skip/drop it. But maybe the code setup makes > it hard to know whether or not such a (bogus) type is actually > referenced from a function or variable description? > > Cheers, > > Mark
On Sat, Feb 6, 2021 at 11:17 PM Yonghong Song <yhs@fb.com> wrote: > > clang with dwarf5 may generate non-regular int base type, > i.e., not a signed/unsigned char/short/int/longlong/__int128. > Such base types are often used to describe > how an actual parameter or variable is generated. For example, > > 0x000015cf: DW_TAG_base_type > DW_AT_name ("DW_ATE_unsigned_1") > DW_AT_encoding (DW_ATE_unsigned) > DW_AT_byte_size (0x00) > > 0x00010ed9: DW_TAG_formal_parameter > DW_AT_location (DW_OP_lit0, > DW_OP_not, > DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1", > DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8", > DW_OP_stack_value) > DW_AT_abstract_origin (0x00013984 "branch") > > What it does is with a literal "0", did a "not" operation, and the converted to > one-bit unsigned int and then 8-bit unsigned int. > > Another example, > > 0x000e97e4: DW_TAG_base_type > DW_AT_name ("DW_ATE_unsigned_24") > DW_AT_encoding (DW_ATE_unsigned) > DW_AT_byte_size (0x03) > > 0x000f88f8: DW_TAG_variable > DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > [0xffffffff82808812, 0xffffffff82808817): > DW_OP_breg0 RAX+0, > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > DW_OP_stack_value, > DW_OP_piece 0x1, > DW_OP_breg0 RAX+0, > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > DW_OP_lit8, > DW_OP_shr, > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > DW_OP_stack_value, > DW_OP_piece 0x3 > ...... > > At one point, a right shift by 8 happens and the result is converted to > 32-bit unsigned int and then to 24-bit unsigned int. > > BTF does not need any of these DW_OP_* information and such non-regular int > types will cause libbpf to emit errors. > Let us sanitize them to generate BTF acceptable to libbpf and kernel. > > Cc: Sedat Dilek <sedat.dilek@gmail.com> > Acked-by: Andrii Nakryiko <andrii@kernel.org> > Signed-off-by: Yonghong Song <yhs@fb.com> Thanks for the patch! Tested-by: Nick Desaulniers <ndesaulniers@google.com> > --- > libbtf.c | 39 ++++++++++++++++++++++++++++++++++++++- > 1 file changed, 38 insertions(+), 1 deletion(-) > > diff --git a/libbtf.c b/libbtf.c > index 9f76283..5843200 100644 > --- a/libbtf.c > +++ b/libbtf.c > @@ -373,6 +373,7 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, > struct btf *btf = btfe->btf; > const struct btf_type *t; > uint8_t encoding = 0; > + uint16_t byte_sz; > int32_t id; > > if (bt->is_signed) { > @@ -384,7 +385,43 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, > return -1; > } > > - id = btf__add_int(btf, name, BITS_ROUNDUP_BYTES(bt->bit_size), encoding); > + /* dwarf5 may emit DW_ATE_[un]signed_{num} base types where > + * {num} is not power of 2 and may exceed 128. Such attributes > + * are mostly used to record operation for an actual parameter > + * or variable. > + * For example, > + * DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > + * [0xffffffff82808812, 0xffffffff82808817): > + * DW_OP_breg0 RAX+0, > + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > + * DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > + * DW_OP_stack_value, > + * DW_OP_piece 0x1, > + * DW_OP_breg0 RAX+0, > + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > + * DW_OP_lit8, > + * DW_OP_shr, > + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > + * DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > + * DW_OP_stack_value, DW_OP_piece 0x3 > + * DW_AT_name ("ebx") > + * DW_AT_decl_file ("/linux/arch/x86/events/intel/core.c") > + * > + * In the above example, at some point, one unsigned_32 value > + * is right shifted by 8 and the result is converted to unsigned_32 > + * and then unsigned_24. > + * > + * BTF does not need such DW_OP_* information so let us sanitize > + * these non-regular int types to avoid libbpf/kernel complaints. > + */ > + byte_sz = BITS_ROUNDUP_BYTES(bt->bit_size); > + if (!byte_sz || (byte_sz & (byte_sz - 1))) { > + name = "__SANITIZED_FAKE_INT__"; > + byte_sz = 4; > + } > + > + id = btf__add_int(btf, name, byte_sz, encoding); > if (id < 0) { > btf_elf__log_err(btfe, BTF_KIND_INT, name, true, "Error emitting BTF type"); > } else { > -- > 2.24.1 >
On Mon, Feb 8, 2021 at 8:23 PM Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Sat, Feb 6, 2021 at 11:17 PM Yonghong Song <yhs@fb.com> wrote: > > > > clang with dwarf5 may generate non-regular int base type, > > i.e., not a signed/unsigned char/short/int/longlong/__int128. > > Such base types are often used to describe > > how an actual parameter or variable is generated. For example, > > > > 0x000015cf: DW_TAG_base_type > > DW_AT_name ("DW_ATE_unsigned_1") > > DW_AT_encoding (DW_ATE_unsigned) > > DW_AT_byte_size (0x00) > > > > 0x00010ed9: DW_TAG_formal_parameter > > DW_AT_location (DW_OP_lit0, > > DW_OP_not, > > DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1", > > DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8", > > DW_OP_stack_value) > > DW_AT_abstract_origin (0x00013984 "branch") > > > > What it does is with a literal "0", did a "not" operation, and the converted to > > one-bit unsigned int and then 8-bit unsigned int. > > > > Another example, > > > > 0x000e97e4: DW_TAG_base_type > > DW_AT_name ("DW_ATE_unsigned_24") > > DW_AT_encoding (DW_ATE_unsigned) > > DW_AT_byte_size (0x03) > > > > 0x000f88f8: DW_TAG_variable > > DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > > [0xffffffff82808812, 0xffffffff82808817): > > DW_OP_breg0 RAX+0, > > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > > DW_OP_stack_value, > > DW_OP_piece 0x1, > > DW_OP_breg0 RAX+0, > > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > DW_OP_lit8, > > DW_OP_shr, > > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > > DW_OP_stack_value, > > DW_OP_piece 0x3 > > ...... > > > > At one point, a right shift by 8 happens and the result is converted to > > 32-bit unsigned int and then to 24-bit unsigned int. > > > > BTF does not need any of these DW_OP_* information and such non-regular int > > types will cause libbpf to emit errors. > > Let us sanitize them to generate BTF acceptable to libbpf and kernel. > > > > Cc: Sedat Dilek <sedat.dilek@gmail.com> > > Acked-by: Andrii Nakryiko <andrii@kernel.org> > > Signed-off-by: Yonghong Song <yhs@fb.com> > > Thanks for the patch! > > Tested-by: Nick Desaulniers <ndesaulniers@google.com> > Cool, thanks for testing Nick. - Sedat - > > --- > > libbtf.c | 39 ++++++++++++++++++++++++++++++++++++++- > > 1 file changed, 38 insertions(+), 1 deletion(-) > > > > diff --git a/libbtf.c b/libbtf.c > > index 9f76283..5843200 100644 > > --- a/libbtf.c > > +++ b/libbtf.c > > @@ -373,6 +373,7 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, > > struct btf *btf = btfe->btf; > > const struct btf_type *t; > > uint8_t encoding = 0; > > + uint16_t byte_sz; > > int32_t id; > > > > if (bt->is_signed) { > > @@ -384,7 +385,43 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, > > return -1; > > } > > > > - id = btf__add_int(btf, name, BITS_ROUNDUP_BYTES(bt->bit_size), encoding); > > + /* dwarf5 may emit DW_ATE_[un]signed_{num} base types where > > + * {num} is not power of 2 and may exceed 128. Such attributes > > + * are mostly used to record operation for an actual parameter > > + * or variable. > > + * For example, > > + * DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > > + * [0xffffffff82808812, 0xffffffff82808817): > > + * DW_OP_breg0 RAX+0, > > + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > + * DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > > + * DW_OP_stack_value, > > + * DW_OP_piece 0x1, > > + * DW_OP_breg0 RAX+0, > > + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > + * DW_OP_lit8, > > + * DW_OP_shr, > > + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > + * DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > > + * DW_OP_stack_value, DW_OP_piece 0x3 > > + * DW_AT_name ("ebx") > > + * DW_AT_decl_file ("/linux/arch/x86/events/intel/core.c") > > + * > > + * In the above example, at some point, one unsigned_32 value > > + * is right shifted by 8 and the result is converted to unsigned_32 > > + * and then unsigned_24. > > + * > > + * BTF does not need such DW_OP_* information so let us sanitize > > + * these non-regular int types to avoid libbpf/kernel complaints. > > + */ > > + byte_sz = BITS_ROUNDUP_BYTES(bt->bit_size); > > + if (!byte_sz || (byte_sz & (byte_sz - 1))) { > > + name = "__SANITIZED_FAKE_INT__"; > > + byte_sz = 4; > > + } > > + > > + id = btf__add_int(btf, name, byte_sz, encoding); > > if (id < 0) { > > btf_elf__log_err(btfe, BTF_KIND_INT, name, true, "Error emitting BTF type"); > > } else { > > -- > > 2.24.1 > > > > > -- > Thanks, > ~Nick Desaulniers
Em Sun, Feb 07, 2021 at 11:32:00AM +0100, Sedat Dilek escreveu: > On Sun, Feb 7, 2021 at 8:17 AM Yonghong Song <yhs@fb.com> wrote: > > > > clang with dwarf5 may generate non-regular int base type, > > i.e., not a signed/unsigned char/short/int/longlong/__int128. > > Such base types are often used to describe > > how an actual parameter or variable is generated. For example, > > > > 0x000015cf: DW_TAG_base_type > > DW_AT_name ("DW_ATE_unsigned_1") > > DW_AT_encoding (DW_ATE_unsigned) > > DW_AT_byte_size (0x00) > > > > 0x00010ed9: DW_TAG_formal_parameter > > DW_AT_location (DW_OP_lit0, > > DW_OP_not, > > DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1", > > DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8", > > DW_OP_stack_value) > > DW_AT_abstract_origin (0x00013984 "branch") > > > > What it does is with a literal "0", did a "not" operation, and the converted to > > one-bit unsigned int and then 8-bit unsigned int. > > > > Another example, > > > > 0x000e97e4: DW_TAG_base_type > > DW_AT_name ("DW_ATE_unsigned_24") > > DW_AT_encoding (DW_ATE_unsigned) > > DW_AT_byte_size (0x03) > > > > 0x000f88f8: DW_TAG_variable > > DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > > [0xffffffff82808812, 0xffffffff82808817): > > DW_OP_breg0 RAX+0, > > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > > DW_OP_stack_value, > > DW_OP_piece 0x1, > > DW_OP_breg0 RAX+0, > > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > DW_OP_lit8, > > DW_OP_shr, > > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > > DW_OP_stack_value, > > DW_OP_piece 0x3 > > ...... > > > > At one point, a right shift by 8 happens and the result is converted to > > 32-bit unsigned int and then to 24-bit unsigned int. > > > > BTF does not need any of these DW_OP_* information and such non-regular int > > types will cause libbpf to emit errors. > > Let us sanitize them to generate BTF acceptable to libbpf and kernel. > > > > Cc: Sedat Dilek <sedat.dilek@gmail.com> > > Thanks for v2. > > For both v1 and v2: > > Tested-by: Sedat Dilek <sedat.dilek@gmail.com> > Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Thanks, applied. - Arnaldo > My development and testing environment: > > 1. Debian/testing AMD64 > 2. Linux v5.11-rc6+ with custom mostly Clang fixes > 3. Debug-Info: BTF + DWARF-v5 enabled > 4. Compiler and tools (incl. Clang's Integrated ASsembler IAS): > LLVM/Clang v12.0.0-rc1 (make LLVM=1 LLVM_IAS=1) > > Build and boot on bare metal. > > - Sedat - > > > Acked-by: Andrii Nakryiko <andrii@kernel.org> > > Signed-off-by: Yonghong Song <yhs@fb.com> > > --- > > libbtf.c | 39 ++++++++++++++++++++++++++++++++++++++- > > 1 file changed, 38 insertions(+), 1 deletion(-) > > > > diff --git a/libbtf.c b/libbtf.c > > index 9f76283..5843200 100644 > > --- a/libbtf.c > > +++ b/libbtf.c > > @@ -373,6 +373,7 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, > > struct btf *btf = btfe->btf; > > const struct btf_type *t; > > uint8_t encoding = 0; > > + uint16_t byte_sz; > > int32_t id; > > > > if (bt->is_signed) { > > @@ -384,7 +385,43 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, > > return -1; > > } > > > > - id = btf__add_int(btf, name, BITS_ROUNDUP_BYTES(bt->bit_size), encoding); > > + /* dwarf5 may emit DW_ATE_[un]signed_{num} base types where > > + * {num} is not power of 2 and may exceed 128. Such attributes > > + * are mostly used to record operation for an actual parameter > > + * or variable. > > + * For example, > > + * DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > > + * [0xffffffff82808812, 0xffffffff82808817): > > + * DW_OP_breg0 RAX+0, > > + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > + * DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > > + * DW_OP_stack_value, > > + * DW_OP_piece 0x1, > > + * DW_OP_breg0 RAX+0, > > + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > + * DW_OP_lit8, > > + * DW_OP_shr, > > + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > + * DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > > + * DW_OP_stack_value, DW_OP_piece 0x3 > > + * DW_AT_name ("ebx") > > + * DW_AT_decl_file ("/linux/arch/x86/events/intel/core.c") > > + * > > + * In the above example, at some point, one unsigned_32 value > > + * is right shifted by 8 and the result is converted to unsigned_32 > > + * and then unsigned_24. > > + * > > + * BTF does not need such DW_OP_* information so let us sanitize > > + * these non-regular int types to avoid libbpf/kernel complaints. > > + */ > > + byte_sz = BITS_ROUNDUP_BYTES(bt->bit_size); > > + if (!byte_sz || (byte_sz & (byte_sz - 1))) { > > + name = "__SANITIZED_FAKE_INT__"; > > + byte_sz = 4; > > + } > > + > > + id = btf__add_int(btf, name, byte_sz, encoding); > > if (id < 0) { > > btf_elf__log_err(btfe, BTF_KIND_INT, name, true, "Error emitting BTF type"); > > } else { > > -- > > 2.24.1 > >
Em Mon, Feb 08, 2021 at 11:22:48AM -0800, Nick Desaulniers escreveu: > On Sat, Feb 6, 2021 at 11:17 PM Yonghong Song <yhs@fb.com> wrote: > > > > clang with dwarf5 may generate non-regular int base type, > > i.e., not a signed/unsigned char/short/int/longlong/__int128. > > Such base types are often used to describe > > how an actual parameter or variable is generated. For example, > > > > 0x000015cf: DW_TAG_base_type > > DW_AT_name ("DW_ATE_unsigned_1") > > DW_AT_encoding (DW_ATE_unsigned) > > DW_AT_byte_size (0x00) > > > > 0x00010ed9: DW_TAG_formal_parameter > > DW_AT_location (DW_OP_lit0, > > DW_OP_not, > > DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1", > > DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8", > > DW_OP_stack_value) > > DW_AT_abstract_origin (0x00013984 "branch") > > > > What it does is with a literal "0", did a "not" operation, and the converted to > > one-bit unsigned int and then 8-bit unsigned int. > > > > Another example, > > > > 0x000e97e4: DW_TAG_base_type > > DW_AT_name ("DW_ATE_unsigned_24") > > DW_AT_encoding (DW_ATE_unsigned) > > DW_AT_byte_size (0x03) > > > > 0x000f88f8: DW_TAG_variable > > DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > > [0xffffffff82808812, 0xffffffff82808817): > > DW_OP_breg0 RAX+0, > > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > > DW_OP_stack_value, > > DW_OP_piece 0x1, > > DW_OP_breg0 RAX+0, > > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > DW_OP_lit8, > > DW_OP_shr, > > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > > DW_OP_stack_value, > > DW_OP_piece 0x3 > > ...... > > > > At one point, a right shift by 8 happens and the result is converted to > > 32-bit unsigned int and then to 24-bit unsigned int. > > > > BTF does not need any of these DW_OP_* information and such non-regular int > > types will cause libbpf to emit errors. > > Let us sanitize them to generate BTF acceptable to libbpf and kernel. > > > > Cc: Sedat Dilek <sedat.dilek@gmail.com> > > Acked-by: Andrii Nakryiko <andrii@kernel.org> > > Signed-off-by: Yonghong Song <yhs@fb.com> > > Thanks for the patch! > > Tested-by: Nick Desaulniers <ndesaulniers@google.com> Thanks for testing and documenting that you tested, added the tag to the commit, - Arnaldo
On Mon, Feb 8, 2021 at 8:55 PM Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> wrote: > > Em Sun, Feb 07, 2021 at 11:32:00AM +0100, Sedat Dilek escreveu: > > On Sun, Feb 7, 2021 at 8:17 AM Yonghong Song <yhs@fb.com> wrote: > > > > > > clang with dwarf5 may generate non-regular int base type, > > > i.e., not a signed/unsigned char/short/int/longlong/__int128. > > > Such base types are often used to describe > > > how an actual parameter or variable is generated. For example, > > > > > > 0x000015cf: DW_TAG_base_type > > > DW_AT_name ("DW_ATE_unsigned_1") > > > DW_AT_encoding (DW_ATE_unsigned) > > > DW_AT_byte_size (0x00) > > > > > > 0x00010ed9: DW_TAG_formal_parameter > > > DW_AT_location (DW_OP_lit0, > > > DW_OP_not, > > > DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1", > > > DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8", > > > DW_OP_stack_value) > > > DW_AT_abstract_origin (0x00013984 "branch") > > > > > > What it does is with a literal "0", did a "not" operation, and the converted to > > > one-bit unsigned int and then 8-bit unsigned int. > > > > > > Another example, > > > > > > 0x000e97e4: DW_TAG_base_type > > > DW_AT_name ("DW_ATE_unsigned_24") > > > DW_AT_encoding (DW_ATE_unsigned) > > > DW_AT_byte_size (0x03) > > > > > > 0x000f88f8: DW_TAG_variable > > > DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > > > [0xffffffff82808812, 0xffffffff82808817): > > > DW_OP_breg0 RAX+0, > > > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > > DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > > > DW_OP_stack_value, > > > DW_OP_piece 0x1, > > > DW_OP_breg0 RAX+0, > > > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > > DW_OP_lit8, > > > DW_OP_shr, > > > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > > DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > > > DW_OP_stack_value, > > > DW_OP_piece 0x3 > > > ...... > > > > > > At one point, a right shift by 8 happens and the result is converted to > > > 32-bit unsigned int and then to 24-bit unsigned int. > > > > > > BTF does not need any of these DW_OP_* information and such non-regular int > > > types will cause libbpf to emit errors. > > > Let us sanitize them to generate BTF acceptable to libbpf and kernel. > > > > > > Cc: Sedat Dilek <sedat.dilek@gmail.com> > > > > Thanks for v2. > > > > For both v1 and v2: > > > > Tested-by: Sedat Dilek <sedat.dilek@gmail.com> > > Reported-by: Sedat Dilek <sedat.dilek@gmail.com> > > Thanks, applied. > Great. I cannot see it yet in [1] or [2]. More important to me is is this worth a pahole v1.20.1 release? This patch is required to successfully build with BTF and DWARF-5 and Clang-12. I have Nathan's "bpf: Hoise pahole version checks into Kconfig" patch in my custom patchset (together with Nick's DWARF-v5 patchset) which makes it easier to do (depends on PAHOLE_VERSION >= 1201) via Kconfig for example. - Sedat - [1] https://git.kernel.org/pub/scm/devel/pahole/pahole.git/ [2] https://github.com/acmel/dwarves/ > - Arnaldo > > > My development and testing environment: > > > > 1. Debian/testing AMD64 > > 2. Linux v5.11-rc6+ with custom mostly Clang fixes > > 3. Debug-Info: BTF + DWARF-v5 enabled > > 4. Compiler and tools (incl. Clang's Integrated ASsembler IAS): > > LLVM/Clang v12.0.0-rc1 (make LLVM=1 LLVM_IAS=1) > > > > Build and boot on bare metal. > > > > - Sedat - > > > > > Acked-by: Andrii Nakryiko <andrii@kernel.org> > > > Signed-off-by: Yonghong Song <yhs@fb.com> > > > --- > > > libbtf.c | 39 ++++++++++++++++++++++++++++++++++++++- > > > 1 file changed, 38 insertions(+), 1 deletion(-) > > > > > > diff --git a/libbtf.c b/libbtf.c > > > index 9f76283..5843200 100644 > > > --- a/libbtf.c > > > +++ b/libbtf.c > > > @@ -373,6 +373,7 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, > > > struct btf *btf = btfe->btf; > > > const struct btf_type *t; > > > uint8_t encoding = 0; > > > + uint16_t byte_sz; > > > int32_t id; > > > > > > if (bt->is_signed) { > > > @@ -384,7 +385,43 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, > > > return -1; > > > } > > > > > > - id = btf__add_int(btf, name, BITS_ROUNDUP_BYTES(bt->bit_size), encoding); > > > + /* dwarf5 may emit DW_ATE_[un]signed_{num} base types where > > > + * {num} is not power of 2 and may exceed 128. Such attributes > > > + * are mostly used to record operation for an actual parameter > > > + * or variable. > > > + * For example, > > > + * DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > > > + * [0xffffffff82808812, 0xffffffff82808817): > > > + * DW_OP_breg0 RAX+0, > > > + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > > + * DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > > > + * DW_OP_stack_value, > > > + * DW_OP_piece 0x1, > > > + * DW_OP_breg0 RAX+0, > > > + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > > + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > > + * DW_OP_lit8, > > > + * DW_OP_shr, > > > + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > > + * DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > > > + * DW_OP_stack_value, DW_OP_piece 0x3 > > > + * DW_AT_name ("ebx") > > > + * DW_AT_decl_file ("/linux/arch/x86/events/intel/core.c") > > > + * > > > + * In the above example, at some point, one unsigned_32 value > > > + * is right shifted by 8 and the result is converted to unsigned_32 > > > + * and then unsigned_24. > > > + * > > > + * BTF does not need such DW_OP_* information so let us sanitize > > > + * these non-regular int types to avoid libbpf/kernel complaints. > > > + */ > > > + byte_sz = BITS_ROUNDUP_BYTES(bt->bit_size); > > > + if (!byte_sz || (byte_sz & (byte_sz - 1))) { > > > + name = "__SANITIZED_FAKE_INT__"; > > > + byte_sz = 4; > > > + } > > > + > > > + id = btf__add_int(btf, name, byte_sz, encoding); > > > if (id < 0) { > > > btf_elf__log_err(btfe, BTF_KIND_INT, name, true, "Error emitting BTF type"); > > > } else { > > > -- > > > 2.24.1 > > > > > -- > > - Arnaldo
On Mon, Feb 8, 2021 at 12:46 PM Sedat Dilek <sedat.dilek@gmail.com> wrote: > > On Mon, Feb 8, 2021 at 8:55 PM Arnaldo Carvalho de Melo > <arnaldo.melo@gmail.com> wrote: > > > > Em Sun, Feb 07, 2021 at 11:32:00AM +0100, Sedat Dilek escreveu: > > > On Sun, Feb 7, 2021 at 8:17 AM Yonghong Song <yhs@fb.com> wrote: > > > > > > > > clang with dwarf5 may generate non-regular int base type, > > > > i.e., not a signed/unsigned char/short/int/longlong/__int128. > > > > Such base types are often used to describe > > > > how an actual parameter or variable is generated. For example, > > > > > > > > 0x000015cf: DW_TAG_base_type > > > > DW_AT_name ("DW_ATE_unsigned_1") > > > > DW_AT_encoding (DW_ATE_unsigned) > > > > DW_AT_byte_size (0x00) > > > > > > > > 0x00010ed9: DW_TAG_formal_parameter > > > > DW_AT_location (DW_OP_lit0, > > > > DW_OP_not, > > > > DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1", > > > > DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8", > > > > DW_OP_stack_value) > > > > DW_AT_abstract_origin (0x00013984 "branch") > > > > > > > > What it does is with a literal "0", did a "not" operation, and the converted to > > > > one-bit unsigned int and then 8-bit unsigned int. > > > > > > > > Another example, > > > > > > > > 0x000e97e4: DW_TAG_base_type > > > > DW_AT_name ("DW_ATE_unsigned_24") > > > > DW_AT_encoding (DW_ATE_unsigned) > > > > DW_AT_byte_size (0x03) > > > > > > > > 0x000f88f8: DW_TAG_variable > > > > DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > > > > [0xffffffff82808812, 0xffffffff82808817): > > > > DW_OP_breg0 RAX+0, > > > > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > > > DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > > > > DW_OP_stack_value, > > > > DW_OP_piece 0x1, > > > > DW_OP_breg0 RAX+0, > > > > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > > > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > > > DW_OP_lit8, > > > > DW_OP_shr, > > > > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > > > DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > > > > DW_OP_stack_value, > > > > DW_OP_piece 0x3 > > > > ...... > > > > > > > > At one point, a right shift by 8 happens and the result is converted to > > > > 32-bit unsigned int and then to 24-bit unsigned int. > > > > > > > > BTF does not need any of these DW_OP_* information and such non-regular int > > > > types will cause libbpf to emit errors. > > > > Let us sanitize them to generate BTF acceptable to libbpf and kernel. > > > > > > > > Cc: Sedat Dilek <sedat.dilek@gmail.com> > > > > > > Thanks for v2. > > > > > > For both v1 and v2: > > > > > > Tested-by: Sedat Dilek <sedat.dilek@gmail.com> > > > Reported-by: Sedat Dilek <sedat.dilek@gmail.com> > > > > Thanks, applied. > > > > Great. > I cannot see it yet in [1] or [2]. > > More important to me is is this worth a pahole v1.20.1 release? > This patch is required to successfully build with BTF and DWARF-5 and Clang-12. > > I have Nathan's "bpf: Hoise pahole version checks into Kconfig" patch > in my custom patchset (together with Nick's DWARF-v5 patchset) which > makes it easier to do (depends on PAHOLE_VERSION >= 1201) via Kconfig I don't think the math works out with 1201 vs 121 (in the future), so I'd rather prefer 1.21, if necessary. > for example. > > - Sedat - > > [1] https://git.kernel.org/pub/scm/devel/pahole/pahole.git/ > [2] https://github.com/acmel/dwarves/ > > > - Arnaldo > > > > > My development and testing environment: > > > > > > 1. Debian/testing AMD64 > > > 2. Linux v5.11-rc6+ with custom mostly Clang fixes > > > 3. Debug-Info: BTF + DWARF-v5 enabled > > > 4. Compiler and tools (incl. Clang's Integrated ASsembler IAS): > > > LLVM/Clang v12.0.0-rc1 (make LLVM=1 LLVM_IAS=1) > > > > > > Build and boot on bare metal. > > > > > > - Sedat - > > > > > > > Acked-by: Andrii Nakryiko <andrii@kernel.org> > > > > Signed-off-by: Yonghong Song <yhs@fb.com> > > > > --- > > > > libbtf.c | 39 ++++++++++++++++++++++++++++++++++++++- > > > > 1 file changed, 38 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/libbtf.c b/libbtf.c > > > > index 9f76283..5843200 100644 > > > > --- a/libbtf.c > > > > +++ b/libbtf.c > > > > @@ -373,6 +373,7 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, > > > > struct btf *btf = btfe->btf; > > > > const struct btf_type *t; > > > > uint8_t encoding = 0; > > > > + uint16_t byte_sz; > > > > int32_t id; > > > > > > > > if (bt->is_signed) { > > > > @@ -384,7 +385,43 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, > > > > return -1; > > > > } > > > > > > > > - id = btf__add_int(btf, name, BITS_ROUNDUP_BYTES(bt->bit_size), encoding); > > > > + /* dwarf5 may emit DW_ATE_[un]signed_{num} base types where > > > > + * {num} is not power of 2 and may exceed 128. Such attributes > > > > + * are mostly used to record operation for an actual parameter > > > > + * or variable. > > > > + * For example, > > > > + * DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > > > > + * [0xffffffff82808812, 0xffffffff82808817): > > > > + * DW_OP_breg0 RAX+0, > > > > + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > > > + * DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > > > > + * DW_OP_stack_value, > > > > + * DW_OP_piece 0x1, > > > > + * DW_OP_breg0 RAX+0, > > > > + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > > > + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > > > + * DW_OP_lit8, > > > > + * DW_OP_shr, > > > > + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > > > + * DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > > > > + * DW_OP_stack_value, DW_OP_piece 0x3 > > > > + * DW_AT_name ("ebx") > > > > + * DW_AT_decl_file ("/linux/arch/x86/events/intel/core.c") > > > > + * > > > > + * In the above example, at some point, one unsigned_32 value > > > > + * is right shifted by 8 and the result is converted to unsigned_32 > > > > + * and then unsigned_24. > > > > + * > > > > + * BTF does not need such DW_OP_* information so let us sanitize > > > > + * these non-regular int types to avoid libbpf/kernel complaints. > > > > + */ > > > > + byte_sz = BITS_ROUNDUP_BYTES(bt->bit_size); > > > > + if (!byte_sz || (byte_sz & (byte_sz - 1))) { > > > > + name = "__SANITIZED_FAKE_INT__"; > > > > + byte_sz = 4; > > > > + } > > > > + > > > > + id = btf__add_int(btf, name, byte_sz, encoding); > > > > if (id < 0) { > > > > btf_elf__log_err(btfe, BTF_KIND_INT, name, true, "Error emitting BTF type"); > > > > } else { > > > > -- > > > > 2.24.1 > > > > > > > > -- > > > > - Arnaldo
On Mon, Feb 8, 2021 at 10:08 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Mon, Feb 8, 2021 at 12:46 PM Sedat Dilek <sedat.dilek@gmail.com> wrote: > > > > On Mon, Feb 8, 2021 at 8:55 PM Arnaldo Carvalho de Melo > > <arnaldo.melo@gmail.com> wrote: > > > > > > Em Sun, Feb 07, 2021 at 11:32:00AM +0100, Sedat Dilek escreveu: > > > > On Sun, Feb 7, 2021 at 8:17 AM Yonghong Song <yhs@fb.com> wrote: > > > > > > > > > > clang with dwarf5 may generate non-regular int base type, > > > > > i.e., not a signed/unsigned char/short/int/longlong/__int128. > > > > > Such base types are often used to describe > > > > > how an actual parameter or variable is generated. For example, > > > > > > > > > > 0x000015cf: DW_TAG_base_type > > > > > DW_AT_name ("DW_ATE_unsigned_1") > > > > > DW_AT_encoding (DW_ATE_unsigned) > > > > > DW_AT_byte_size (0x00) > > > > > > > > > > 0x00010ed9: DW_TAG_formal_parameter > > > > > DW_AT_location (DW_OP_lit0, > > > > > DW_OP_not, > > > > > DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1", > > > > > DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8", > > > > > DW_OP_stack_value) > > > > > DW_AT_abstract_origin (0x00013984 "branch") > > > > > > > > > > What it does is with a literal "0", did a "not" operation, and the converted to > > > > > one-bit unsigned int and then 8-bit unsigned int. > > > > > > > > > > Another example, > > > > > > > > > > 0x000e97e4: DW_TAG_base_type > > > > > DW_AT_name ("DW_ATE_unsigned_24") > > > > > DW_AT_encoding (DW_ATE_unsigned) > > > > > DW_AT_byte_size (0x03) > > > > > > > > > > 0x000f88f8: DW_TAG_variable > > > > > DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > > > > > [0xffffffff82808812, 0xffffffff82808817): > > > > > DW_OP_breg0 RAX+0, > > > > > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > > > > DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > > > > > DW_OP_stack_value, > > > > > DW_OP_piece 0x1, > > > > > DW_OP_breg0 RAX+0, > > > > > DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > > > > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > > > > DW_OP_lit8, > > > > > DW_OP_shr, > > > > > DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > > > > DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > > > > > DW_OP_stack_value, > > > > > DW_OP_piece 0x3 > > > > > ...... > > > > > > > > > > At one point, a right shift by 8 happens and the result is converted to > > > > > 32-bit unsigned int and then to 24-bit unsigned int. > > > > > > > > > > BTF does not need any of these DW_OP_* information and such non-regular int > > > > > types will cause libbpf to emit errors. > > > > > Let us sanitize them to generate BTF acceptable to libbpf and kernel. > > > > > > > > > > Cc: Sedat Dilek <sedat.dilek@gmail.com> > > > > > > > > Thanks for v2. > > > > > > > > For both v1 and v2: > > > > > > > > Tested-by: Sedat Dilek <sedat.dilek@gmail.com> > > > > Reported-by: Sedat Dilek <sedat.dilek@gmail.com> > > > > > > Thanks, applied. > > > > > > > Great. > > I cannot see it yet in [1] or [2]. > > > > More important to me is is this worth a pahole v1.20.1 release? > > This patch is required to successfully build with BTF and DWARF-5 and Clang-12. > > > > I have Nathan's "bpf: Hoise pahole version checks into Kconfig" patch > > in my custom patchset (together with Nick's DWARF-v5 patchset) which > > makes it easier to do (depends on PAHOLE_VERSION >= 1201) via Kconfig > > I don't think the math works out with 1201 vs 121 (in the future), so > I'd rather prefer 1.21, if necessary. > Aaargh, you are right. # /opt/pahole/bin/pahole --numeric_version 120 The wish is to have a pahole v1.21. - Sedat - > > for example. > > > > - Sedat - > > > > [1] https://git.kernel.org/pub/scm/devel/pahole/pahole.git/ > > [2] https://github.com/acmel/dwarves/ > > > > > - Arnaldo > > > > > > > My development and testing environment: > > > > > > > > 1. Debian/testing AMD64 > > > > 2. Linux v5.11-rc6+ with custom mostly Clang fixes > > > > 3. Debug-Info: BTF + DWARF-v5 enabled > > > > 4. Compiler and tools (incl. Clang's Integrated ASsembler IAS): > > > > LLVM/Clang v12.0.0-rc1 (make LLVM=1 LLVM_IAS=1) > > > > > > > > Build and boot on bare metal. > > > > > > > > - Sedat - > > > > > > > > > Acked-by: Andrii Nakryiko <andrii@kernel.org> > > > > > Signed-off-by: Yonghong Song <yhs@fb.com> > > > > > --- > > > > > libbtf.c | 39 ++++++++++++++++++++++++++++++++++++++- > > > > > 1 file changed, 38 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/libbtf.c b/libbtf.c > > > > > index 9f76283..5843200 100644 > > > > > --- a/libbtf.c > > > > > +++ b/libbtf.c > > > > > @@ -373,6 +373,7 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, > > > > > struct btf *btf = btfe->btf; > > > > > const struct btf_type *t; > > > > > uint8_t encoding = 0; > > > > > + uint16_t byte_sz; > > > > > int32_t id; > > > > > > > > > > if (bt->is_signed) { > > > > > @@ -384,7 +385,43 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, > > > > > return -1; > > > > > } > > > > > > > > > > - id = btf__add_int(btf, name, BITS_ROUNDUP_BYTES(bt->bit_size), encoding); > > > > > + /* dwarf5 may emit DW_ATE_[un]signed_{num} base types where > > > > > + * {num} is not power of 2 and may exceed 128. Such attributes > > > > > + * are mostly used to record operation for an actual parameter > > > > > + * or variable. > > > > > + * For example, > > > > > + * DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: > > > > > + * [0xffffffff82808812, 0xffffffff82808817): > > > > > + * DW_OP_breg0 RAX+0, > > > > > + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > > > > + * DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", > > > > > + * DW_OP_stack_value, > > > > > + * DW_OP_piece 0x1, > > > > > + * DW_OP_breg0 RAX+0, > > > > > + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", > > > > > + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > > > > + * DW_OP_lit8, > > > > > + * DW_OP_shr, > > > > > + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", > > > > > + * DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", > > > > > + * DW_OP_stack_value, DW_OP_piece 0x3 > > > > > + * DW_AT_name ("ebx") > > > > > + * DW_AT_decl_file ("/linux/arch/x86/events/intel/core.c") > > > > > + * > > > > > + * In the above example, at some point, one unsigned_32 value > > > > > + * is right shifted by 8 and the result is converted to unsigned_32 > > > > > + * and then unsigned_24. > > > > > + * > > > > > + * BTF does not need such DW_OP_* information so let us sanitize > > > > > + * these non-regular int types to avoid libbpf/kernel complaints. > > > > > + */ > > > > > + byte_sz = BITS_ROUNDUP_BYTES(bt->bit_size); > > > > > + if (!byte_sz || (byte_sz & (byte_sz - 1))) { > > > > > + name = "__SANITIZED_FAKE_INT__"; > > > > > + byte_sz = 4; > > > > > + } > > > > > + > > > > > + id = btf__add_int(btf, name, byte_sz, encoding); > > > > > if (id < 0) { > > > > > btf_elf__log_err(btfe, BTF_KIND_INT, name, true, "Error emitting BTF type"); > > > > > } else { > > > > > -- > > > > > 2.24.1 > > > > > > > > > > > -- > > > > > > - Arnaldo
Hi David, (Nice to have seen you just recently!) On Mon, 2021-02-08 at 12:43 -0800, David Blaikie wrote: > DW_OP_convert was added in DWARFv5 and it can be used for type conversions, > these are created in the LLVM middle/backend during optimizations, not by > the frontend. So the middle/backend doesn't have a way to create canonical > DW_TAG_base_types for frontend language integer types - so it creates > synthesized ones. So this is intentional/not a particularly quirky thing. > > LLVM creates locations incrementally as optimizations are applied - without > doing any particular canonicalization/reduction later on (maybe it has some > canonicalization/reduction, but not a very wholistic/aggressive approach > there) - so it can end up with something that is not the most compact > representation. But it seems to end up with a bogus representation (see below in the original quoted message for an example). The issue isn't really that it isn't an optimized constant representation (although it is kind of an inefficient one). It is this DW_OP_convert applied to a DW_TAG_base_type DIE with a DW_AT_byte_size of zero. Which doesn't really make any sense to me. It feels like it is asking the DWARF consumer to do a divide by zero here ("now express this value using zero bits!"). Could you explain these syntactic "DW_ATE_unsigned_1" base_type DIEs? What do they represent? Why do they have a zero size? Should they really have a DW_AT_bit_size and a DW_AT_byte_size of 1? Thanks, Mark > On Mon, Feb 8, 2021 at 11:17 AM Nick Desaulniers <ndesaulniers@google.com> > wrote: > > > On Sun, Feb 7, 2021 at 6:18 AM Mark Wielaard <mark@klomp.org> wrote: > > > > > > Hi, > > > > > > On Sat, 2021-02-06 at 23:17 -0800, Yonghong Song wrote: > > > > clang with dwarf5 may generate non-regular int base type, > > > > i.e., not a signed/unsigned char/short/int/longlong/__int128. > > > > Such base types are often used to describe > > > > how an actual parameter or variable is generated. For example, > > > > > > > > 0x000015cf: DW_TAG_base_type > > > > DW_AT_name ("DW_ATE_unsigned_1") > > > > DW_AT_encoding (DW_ATE_unsigned) > > > > DW_AT_byte_size (0x00) > > > > > > > > 0x00010ed9: DW_TAG_formal_parameter > > > > DW_AT_location (DW_OP_lit0, > > > > DW_OP_not, > > > > DW_OP_convert (0x000015cf) > > > > "DW_ATE_unsigned_1", > > > > DW_OP_convert (0x000015d4) > > > > "DW_ATE_unsigned_8", > > > > DW_OP_stack_value) > > > > DW_AT_abstract_origin (0x00013984 "branch") > > > > > > > > What it does is with a literal "0", did a "not" operation, and the > > > > converted to > > > > one-bit unsigned int and then 8-bit unsigned int. > > > > > > Thanks for tracking this down. Do you have any idea why the clang > > > compiler emits this? You might be right that it is intended to do what > > > you describe it does (but then it would simply encode an unsigned > > > constant 1 char in a very inefficient way). But as implemented it > > > doesn't seem to make any sense. What would DW_OP_convert of an zero > > > sized base type even mean (if it is intended as a 1 bit sized typed, > > > then why is there no DW_AT_bit_size)? > > > > David, > > Any thoughts on the above sequence of DW_OP_ entries? This is a part > > of DWARF I'm unfamiliar with. > > > > > > > > So I do think your patch makes sense. clang clearly is emitting > > > something bogus. And so some fixup is needed. But maybe we should at > > > least give a warning about it, otherwise it might never get fixed. > > > > > > BTW. If these bogus base types are only emitted as part of a location > > > expression and not as part of an actual function or variable type > > > description, then why are we even trying to encode it as a BTF type? It > > > might be cheaper to just skip/drop it. But maybe the code setup makes > > > it hard to know whether or not such a (bogus) type is actually > > > referenced from a function or variable description? > > > > > > Cheers, > > > > > > Mark > > > > > > > > -- > > Thanks, > > ~Nick Desaulniers > >
diff --git a/libbtf.c b/libbtf.c index 9f76283..5843200 100644 --- a/libbtf.c +++ b/libbtf.c @@ -373,6 +373,7 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, struct btf *btf = btfe->btf; const struct btf_type *t; uint8_t encoding = 0; + uint16_t byte_sz; int32_t id; if (bt->is_signed) { @@ -384,7 +385,43 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt, return -1; } - id = btf__add_int(btf, name, BITS_ROUNDUP_BYTES(bt->bit_size), encoding); + /* dwarf5 may emit DW_ATE_[un]signed_{num} base types where + * {num} is not power of 2 and may exceed 128. Such attributes + * are mostly used to record operation for an actual parameter + * or variable. + * For example, + * DW_AT_location (indexed (0x3c) loclist = 0x00008fb0: + * [0xffffffff82808812, 0xffffffff82808817): + * DW_OP_breg0 RAX+0, + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", + * DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8", + * DW_OP_stack_value, + * DW_OP_piece 0x1, + * DW_OP_breg0 RAX+0, + * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64", + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", + * DW_OP_lit8, + * DW_OP_shr, + * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32", + * DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24", + * DW_OP_stack_value, DW_OP_piece 0x3 + * DW_AT_name ("ebx") + * DW_AT_decl_file ("/linux/arch/x86/events/intel/core.c") + * + * In the above example, at some point, one unsigned_32 value + * is right shifted by 8 and the result is converted to unsigned_32 + * and then unsigned_24. + * + * BTF does not need such DW_OP_* information so let us sanitize + * these non-regular int types to avoid libbpf/kernel complaints. + */ + byte_sz = BITS_ROUNDUP_BYTES(bt->bit_size); + if (!byte_sz || (byte_sz & (byte_sz - 1))) { + name = "__SANITIZED_FAKE_INT__"; + byte_sz = 4; + } + + id = btf__add_int(btf, name, byte_sz, encoding); if (id < 0) { btf_elf__log_err(btfe, BTF_KIND_INT, name, true, "Error emitting BTF type"); } else {