[bpf-next,v2,00/11] bpf: add support for new btf kind BTF_KIND_TAG

Message ID 20210913155122.3722704-1-yhs@fb.com (mailing list archive)

Message

Yonghong Song Sept. 13, 2021, 3:51 p.m. UTC
LLVM14 added support for a new C attribute ([1])
  __attribute__((btf_tag("arbitrary_str")))
This attribute will be emitted to DWARF ([2]), and pahole
will convert it to BTF. For the bpf target, the
attribute will be emitted to BTF directly ([3], [4]).
The attribute is intended to provide additional
information for
  - struct/union type or struct/union member
  - static/global variables
  - static/global function or function parameter.
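
A minimal sketch of where the attribute can be placed, following the
three categories listed above. The __tag macro, struct conn and
conn_account() are illustrative names, not taken from the series. The
guard is an assumption made for portability: on compilers that do not
recognize btf_tag the macro expands to nothing and the tags are dropped.

```c
#include <assert.h>

/* Use the attribute only when the compiler knows it; otherwise the
 * macro expands to nothing so the file still builds. */
#ifdef __has_attribute
# if __has_attribute(btf_tag)
#  define __tag(s) __attribute__((btf_tag(s)))
# endif
#endif
#ifndef __tag
# define __tag(s)
#endif

/* Tag on a struct type and on one of its members. */
struct conn {
	int fd;
	long bytes __tag("stats");
} __tag("tracked");

/* Tag on a static variable. */
static long total_bytes __tag("counter");

/* Tag on a global function and on one of its parameters. */
__tag("accounting") long conn_account(struct conn *c __tag("conn_arg"), long n)
{
	c->bytes += n;
	total_bytes += n;
	return total_bytes;
}
```

The tags are meant as metadata for debug info consumers; the sketch
assumes they do not change what the function computes.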

This new attribute can be used to annotate
kernel code with, e.g., pre- or post-conditions,
allow/deny info, or any other info in which only
the kernel is interested. Such attributes will
be processed by the clang frontend, emitted to
DWARF, and converted to BTF by pahole. Ultimately
the verifier can use this information for
verification purposes.

The new attribute can also be used in bpf
programs, e.g., tagging function parameters
with __user attributes, specifying global
function preconditions, etc. Such information
may help the verifier detect user program
bugs.

After this series, the pahole dwarf->btf converter
will be enhanced to support the new llvm tag
for the btf_tag attribute. With pahole support,
we will then try to add a few real use cases,
e.g., __user/__rcu tagging, allow/deny lists,
some kernel function preconditions, etc.,
in the kernel.

In the rest of the series, Patches 1-2 add
kernel support. Patches 3-4 add
libbpf support. Patch 5 adds bpftool
support. Patches 6-10 add various selftests.
Patch 11 adds documentation for the new kind.

  [1] https://reviews.llvm.org/D106614
  [2] https://reviews.llvm.org/D106621
  [3] https://reviews.llvm.org/D106622
  [4] https://reviews.llvm.org/D109560

Changelog:
  v1 -> v2:
    - BTF ELF format changed in llvm ([4] above),
      so made an across-the-board change to use the new format.
    - Clarified in commit message that BTF_KIND_TAG
      is not emitted by bpftool btf dump format c.
    - Addressed various comments from Andrii.

Yonghong Song (11):
  btf: change BTF_KIND_* macros to enums
  bpf: support for new btf kind BTF_KIND_TAG
  libbpf: rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag
  libbpf: add support for BTF_KIND_TAG
  bpftool: add support for BTF_KIND_TAG
  selftests/bpf: test libbpf API function btf__add_tag()
  selftests/bpf: change NAME_NTH/IS_NAME_NTH for BTF_KIND_TAG format
  selftests/bpf: add BTF_KIND_TAG unit tests
  selftests/bpf: test BTF_KIND_TAG for deduplication
  selftests/bpf: add a test with a bpf program with btf_tag attributes
  docs/bpf: add documentation for BTF_KIND_TAG

 Documentation/bpf/btf.rst                     |  27 +-
 include/uapi/linux/btf.h                      |  52 +--
 kernel/bpf/btf.c                              | 120 +++++++
 tools/bpf/bpftool/btf.c                       |  12 +
 tools/include/uapi/linux/btf.h                |  52 +--
 tools/lib/bpf/btf.c                           |  85 ++++-
 tools/lib/bpf/btf.h                           |  15 +
 tools/lib/bpf/btf_dump.c                      |   3 +
 tools/lib/bpf/libbpf.c                        |  31 +-
 tools/lib/bpf/libbpf.map                      |   5 +
 tools/lib/bpf/libbpf_internal.h               |   2 +
 tools/testing/selftests/bpf/btf_helpers.c     |   7 +-
 tools/testing/selftests/bpf/prog_tests/btf.c  | 318 ++++++++++++++++--
 .../selftests/bpf/prog_tests/btf_tag.c        |  14 +
 .../selftests/bpf/prog_tests/btf_write.c      |  21 ++
 tools/testing/selftests/bpf/progs/tag.c       |  39 +++
 tools/testing/selftests/bpf/test_btf.h        |   3 +
 17 files changed, 736 insertions(+), 70 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_tag.c
 create mode 100644 tools/testing/selftests/bpf/progs/tag.c

Comments

Yonghong Song Sept. 13, 2021, 4:08 p.m. UTC | #1
cc Jose E. Marchesi

Hi, Jose, just letting you know that the BTF format for BTF_KIND_TAG
has changed since v1, as the new format simplifies the kernel/libbpf
implementation. Thanks!

Jose E. Marchesi Sept. 13, 2021, 4:40 p.m. UTC | #2
> cc Jose E. Marchesi
>
> Hi, Jose, just let you know that the BTF format for BTF_KIND_TAG is
> changed since v1 as the new format can simplify kernel/libbpf
> implementation. Thanks!

Noted.  Thanks for the update.

Yonghong Song Dec. 16, 2021, 9:52 p.m. UTC | #3
On 9/13/21 9:40 AM, Jose E. Marchesi wrote:
> 
>> cc Jose E. Marchesi
>>
>> Hi, Jose, just let you know that the BTF format for BTF_KIND_TAG is
>> changed since v1 as the new format can simplify kernel/libbpf
>> implementation. Thanks!
> 
> Noted.  Thanks for the update.

Hi, Jose,

This is just another update on btf_tag development.
btf_tag is now split into btf_decl_tag and btf_type_tag,
for tagging declarations and types respectively, as clang
prefers not to mix the two. All the compiler work in llvm
is done, and you can check the upstream llvm-project "main"
branch for the implementation.

The patch set below (under review)
    https://lore.kernel.org/bpf/20211209173537.1525283-1-yhs@fb.com/
uses btf_type_tag for the linux kernel __user
annotation so that the bpf verifier can use it.
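
A minimal sketch of how the two attributes differ, assuming a
compiler that may or may not know them. The macro names, struct io_req
and io_req_len() are hypothetical and are not taken from the patch
set. btf_type_tag sits inside the pointer type, the way __user does in
kernel code, while btf_decl_tag annotates a declaration as a whole.

```c
#include <assert.h>
#include <stddef.h>

/* Fall back to empty macros on compilers without the attributes. */
#ifdef __has_attribute
# if __has_attribute(btf_type_tag)
#  define __user_tag __attribute__((btf_type_tag("user")))
# endif
# if __has_attribute(btf_decl_tag)
#  define __decl_tag(s) __attribute__((btf_decl_tag(s)))
# endif
#endif
#ifndef __user_tag
# define __user_tag
#endif
#ifndef __decl_tag
# define __decl_tag(s)
#endif

struct io_req {
	/* Type tag: reads as "pointer to user-tagged const char". */
	const char __user_tag *buf;
	size_t len;
} __decl_tag("request");	/* Declaration tag on the struct itself. */

/* Declaration tag on a function. */
__decl_tag("checked") size_t io_req_len(const struct io_req *req)
{
	return req ? req->len : 0;
}
```

In this sketch the tagged pointer is expected to have the same size as
an untagged one, since the tag is only meant to travel in debug info.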

Separately, Omar (maintainer of the open source drgn project)

https://developers.facebook.com/blog/post/2021/12/09/drgn-how-linux-kernel-team-meta-debugs-kernel-scale/
mentioned that btf_tag information will also help drgn, since it
can then distinguish __percpu pointers from other pointers.
Currently drgn uses dwarf, and a clang-compiled kernel puts
btf_tag information in dwarf. Based on our earlier discussion,
gcc intends to generate btf tags for BTF only. Maybe we could
discuss generating them for dwarf as well? Do we need a flag?

Please let me know if you have any questions.
Happy to help in whatever way to get gcc also implementing btf tag
support.

Thanks!

Yonghong

Jose E. Marchesi Dec. 17, 2021, 10:40 a.m. UTC | #4
Hi Yonghong.

> On 9/13/21 9:40 AM, Jose E. Marchesi wrote:
>> 
>>> cc Jose E. Marchesi
>>>
>>> Hi, Jose, just let you know that the BTF format for BTF_KIND_TAG is
>>> changed since v1 as the new format can simplify kernel/libbpf
>>> implementation. Thanks!
>> Noted.  Thanks for the update.
>
> Hi, Jose,
>
> This is just another update on btf_tag development.
> Now, btf_tag is divided into btf_decl_tag and btf_type_tag
> for tagging declarations and types as clang compiler prefers
> not to mix them with each other. All compiler works in llvm
> has done and you can check upstream llvm-project "main" branch
> for implementation.
>
> The patch set below (under review)
>    https://lore.kernel.org/bpf/20211209173537.1525283-1-yhs@fb.com/
> actually tried to use btf_type_tag for linux kernel __user
> annotation so bpf verifier can use it.

Noted, thanks for the heads up :)
We have not yet started to implement this.

> Another question from Omar (open source drgn maintainer)
>
> https://developers.facebook.com/blog/post/2021/12/09/drgn-how-linux-kernel-team-meta-debugs-kernel-scale/
> mentioned that btf_tag information will also help drgn since it
> can then especially distinguish between __percpu pointer from
> other pointers. Currently drgn is using dwarf, clang compiled
> kernel puts btf_tag information in dwarf. Based on our earlier
> discussion, gcc intends to generate btf tags for BTF only. Maybe
> we could discuss to also generate for dwarf? Do we need a flag?

It seems to me that there are three different orthogonal issues/topics
here, even if somehow related.  Each would require a separate
discussion, probably in different contexts:

[Please let me know if I am wrong on any detail in the summary below.
 In part I am writing it down as a recap to help myself :)]

1) The need for BTF to convey free-text tags on certain elements, such
   as members of struct types.

   IMO there is not much to discuss about this one.  The specification
   is straightforward as is the implementation.  We will be adding it to
   GCC soon.

   Note that:
   - This is obviously BTF specific.
   - This is not strictly BPF specific, as GCC can generate BTF for any
     supported target and not just BPF.  I think you have patches for
     LLVM to the same effect.

2) The need for DWARF to convey free-text tags on certain elements, such
   as members of struct types.

   The motivation for this was originally the way the Linux kernel
   generates its BTF information, using pahole, using DWARF as a source.
   As we discussed in our last exchange on this topic, this is
   accidental, i.e. if the kernel switched to generate BTF directly from
   the compiler and the linker could merge/deduplicate BTF, there would
   be no need for using DWARF to act as the "unwilling conveyer" of this
   information.  There are additional benefits of this second approach.
   That's why we didn't plan to add these extended DWARF DIEs to GCC.

   However, it now seems that a DWARF consumer, the drgn project, would
   also benefit from having such support in DWARF to distinguish
   between different kinds of pointers.

   So it seems to me that now we have two use-cases for adding support
   for these free-text tags to DWARF, as a proper extension to the
   format, strictly unrelated to BTF, BPF or even the kernel, since:
   - This is not kernel specific.
   - This is not directly related to BTF.
   - This is not directly related to BPF.

   Therefore I would avoid any reference to BTF in the proposal to keep
   it general, like in the name of DIE types and so on.  This would 

   Whom to involve to discuss this?  Well, I would say that at a minimum
   we need to involve GCC people, LLVM people and DWARF people, and the
   consensus of all three parties.  Once agreed, we can all
   implement the same DIEs, counting on the next version of the DWARF
   spec to catch up with them.  I would really avoid rushed solutions
   based on compiler-specific extensions.

   Where to discuss this?  I don't know.  In some DWARF forum?  Or
   cross-posting gcc-patches and whatever LLVM list uses for
   development?  I am CCing Mark Wielaard who is a DWARF wizard... any
   suggestion?

3) Addition of C-family language-level constructions to specify
   free-text tags on certain language elements, such as struct fields.

   These are the attributes, or built-ins or whatever syntax.

   Note that, strictly speaking:
   - This is orthogonal to both DWARF and BTF, and any other supported
     debugging format, which may or may not be expressive enough to
     convey the free-form text tag.
   - This is not specific to BPF.

   Therefore I would avoid any reference to BTF or BPF in the attribute
   names.  Something like `__attribute__((btf_tag("arbitrary_str")))'
   makes very little sense to me; the attribute name ought to be more
   generic.

   Whom to involve to discuss this?  Definitely, the front-end chaps of
   both GCC and LLVM will have something to say about this, particularly
   the ones in charge of the C-like language front-ends like C and C++.
   A consensus among them would be ideal and would avoid
   compiler-specific hacks.

   Where to discuss this? Again, it seems that we need some neutral
   ground to discuss inter-operability issues between the different free
   software compilers...

In any case we are more than willing to help and participate in the
discussions :)

> Please let me know if you have any questions.  Happy to help in
> whatever way to get gcc also implementing btf tag support.
>
> Thanks!
>
> Yonghong
Alexei Starovoitov Dec. 18, 2021, 1:44 a.m. UTC | #5
On Fri, Dec 17, 2021 at 11:40:10AM +0100, Jose E. Marchesi wrote:
> 
> 2) The need for DWARF to convey free-text tags on certain elements, such
>    as members of struct types.
> 
>    The motivation for this was originally the way the Linux kernel
>    generates its BTF information, using pahole, using DWARF as a source.
>    As we discussed in our last exchange on this topic, this is
>    accidental, i.e. if the kernel switched to generate BTF directly from
>    the compiler and the linker could merge/deduplicate BTF, there would
>    be no need for using DWARF to act as the "unwilling conveyer" of this
>    information.  There are additional benefits of this second approach.
>    Thats why we didn't plan to add these extended DWARF DIEs to GCC.
> 
>    However, it now seems that a DWARF consumer, the drgn project, would
>    also benefit from having such a support in DWARF to distinguish
>    between different kind of pointers.

drgn can use .percpu section in vmlinux for global percpu vars.
For pointers the annotation is indeed necessary.

>    So it seems to me that now we have two use-cases for adding support
>    for these free-text tags to DWARF, as a proper extension to the
>    format, strictly unrelated to BTF, BPF or even the kernel, since:
>    - This is not kernel specific.
>    - This is not directly related to BTF.
>    - This is not directly related to BPF.

__percpu annotation is kernel specific.
__user and __rcu are kernel specific too.
Only BPF and BTF can meaningfully consume all three.
drgn can consume __percpu.

In that sense, if GCC follows LLVM and emits a compiler-specific DWARF tag,
pahole can convert it to the same BTF regardless of whether the kernel
was compiled with clang or gcc.
drgn can consume dwarf generated by clang or gcc as well, even when BTF
is not there. That is the fastest way forward.

In that sense it would be nice to have a common DWARF tag for pointer
annotations, but it's not mandatory. Time is the most valuable asset.
Implementing a GCC-specific DWARF tag doesn't require committee voting
or mailing list bikeshedding.

> 3) Addition of C-family language-level constructions to specify
>    free-text tags on certain language elements, such as struct fields.
> 
>    These are the attributes, or built-ins or whatever syntax.
> 
>    Note that, strictly speaking:
>    - This is orthogonal to both DWARF and BTF, and any other supported
>      debugging format, which may or may not be expressive enough to
>      convey the free-form text tag.
>    - This is not specific to BPF.
> 
>    Therefore I would avoid any reference to BTF or BPF in the attribute
>    names.  Something like `__attribute__((btf_tag("arbitrary_str")))'
>    makes very little sense to me; the attribute name ought to be more
>    generic.

Let's agree to disagree.
When the BPF ISA was designed we didn't go to Intel, Arm, Mips, etc. in order to
come up with the best ISA that would JIT to those architectures in the best
possible way. Same thing with btf_tag. Today it is specific to BTF and BPF
only. Hence it's called this way. When actual users appear who need
free-text tags on a struct field, then and only then will it be time to discuss
a generic tag name. Just because "free-text tag on a struct field" sounds generic
doesn't mean that it has any use case beyond what we're using it for in BPF
land. It goes back to the point of coding now instead of talking about coding.
If gcc wants to call it __attribute__((my_precious_gcc_tag("arbitrary_str")))
go ahead and code it this way. The include/linux/compiler.h can accommodate it.
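
A rough sketch of what "accommodate" could look like: one wrapper
macro that hides the per-compiler spelling, so annotated code never
names the attribute directly. PTR_TAG, PTR_TAG_SPELLING,
struct cpu_stats and ptr_tag_spelling() are made-up names, and
my_precious_gcc_tag is only the placeholder name from the paragraph
above, not a real GCC attribute. This is not the actual contents of
include/linux/compiler.h.

```c
#include <assert.h>
#include <string.h>

/* Pick whichever spelling the compiler advertises, else nothing. */
#ifdef __has_attribute
# if __has_attribute(btf_type_tag)
#  define PTR_TAG(s) __attribute__((btf_type_tag(s)))
#  define PTR_TAG_SPELLING "btf_type_tag"
# elif __has_attribute(my_precious_gcc_tag)	/* hypothetical name */
#  define PTR_TAG(s) __attribute__((my_precious_gcc_tag(s)))
#  define PTR_TAG_SPELLING "my_precious_gcc_tag"
# endif
#endif
#ifndef PTR_TAG
# define PTR_TAG(s)
# define PTR_TAG_SPELLING ""
#endif

/* Annotated code only ever sees the wrapper. */
#define __percpu_tag PTR_TAG("percpu")

struct cpu_stats {
	long __percpu_tag *hits;
};

/* Reports which spelling was selected at compile time ("" if none). */
const char *ptr_tag_spelling(void)
{
	return PTR_TAG_SPELLING;
}
```

Switching compilers then only changes the macro definition, not the
annotated declarations.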
Yonghong Song Dec. 18, 2021, 8:15 p.m. UTC | #6
On 12/17/21 5:44 PM, Alexei Starovoitov wrote:
> On Fri, Dec 17, 2021 at 11:40:10AM +0100, Jose E. Marchesi wrote:
>>
>> 2) The need for DWARF to convey free-text tags on certain elements, such
>>     as members of struct types.
>>
>>     The motivation for this was originally the way the Linux kernel
>>     generates its BTF information, using pahole, using DWARF as a source.
>>     As we discussed in our last exchange on this topic, this is
>>     accidental, i.e. if the kernel switched to generate BTF directly from
>>     the compiler and the linker could merge/deduplicate BTF, there would
>>     be no need for using DWARF to act as the "unwilling conveyer" of this
>>     information.  There are additional benefits of this second approach.
>>     Thats why we didn't plan to add these extended DWARF DIEs to GCC.
>>
>>     However, it now seems that a DWARF consumer, the drgn project, would
>>     also benefit from having such a support in DWARF to distinguish
>>     between different kind of pointers.
> 
> drgn can use .percpu section in vmlinux for global percpu vars.
> For pointers the annotation is indeed necessary.
> 
>>     So it seems to me that now we have two use-cases for adding support
>>     for these free-text tags to DWARF, as a proper extension to the
>>     format, strictly unrelated to BTF, BPF or even the kernel, since:
>>     - This is not kernel specific.
>>     - This is not directly related to BTF.
>>     - This is not directly related to BPF.
> 
> __percpu annotation is kernel specific.
> __user and __rcu are kernel specific too.
> Only BPF and BTF can meaningfully consume all three.
> drgn can consume __percpu.
> 
> In that sense if GCC follows LLVM and emits compiler specific DWARF tag
> pahole can convert it to the same BTF regardless whether kernel
> was compiled with clang or gcc.
> drgn can consume dwarf generated by clang or gcc as well even when BTF
> is not there. That is the fastest way forward.
> 
> In that sense it would be nice to have common DWARF tag for pointer
> annotations, but it's not mandatory. The time is the most valuable asset.
> Implementing GCC specific DWARF tag doesn't require committee voting
> and the mailing list bikeshedding.
> 
>> 3) Addition of C-family language-level constructions to specify
>>     free-text tags on certain language elements, such as struct fields.
>>
>>     These are the attributes, or built-ins or whatever syntax.
>>
>>     Note that, strictly speaking:
>>     - This is orthogonal to both DWARF and BTF, and any other supported
>>       debugging format, which may or may not be expressive enough to
>>       convey the free-form text tag.
>>     - This is not specific to BPF.
>>
>>     Therefore I would avoid any reference to BTF or BPF in the attribute
>>     names.  Something like `__attribute__((btf_tag("arbitrary_str")))'
>>     makes very little sense to me; the attribute name ought to be more
>>     generic.
> 
> Let's agree to disagree.
> When BPF ISA was designed we didn't go to Intel, Arm, Mips, etc in order to
> come up with the best ISA that would JIT to those architectures the best
> possible way. Same thing with btf_tag. Today it is specific to BTF and BPF
> only. Hence it's called this way. Whenever actual users will appear that need
> free-text tags on a struct field then and only then will be the time to discuss
> generic tag name. Just because "free-text tag on a struct field" sounds generic
> it doesn't mean that it has any use case beyond what we're using it for in BPF
> land. It goes back to the point of coding now instead of talking about coding.
> If gcc wants to call it __attribute__((my_precious_gcc_tag("arbitrary_str")))
> go ahead and code it this way. The include/linux/compiler.h can accommodate it.

Just want to add a little bit of context for this. In the beginning, when we
proposed adding the attribute, we gave it a generic name like 'tag' (or
something like that). But eventually upstream suggested 'btf_tag' since
the use case we proposed was for bpf. At that point, we didn't know about
the drgn use cases yet. Even with those, the use cases are still just for
the linux kernel.

At that time, some *similar* use cases did come up, e.g., for
swift<->C++ conversion, encoding ("tag name", "attribute info") for
attributes in the source code would help a lot. But they would use a
different "tag name" than btf_tag to differentiate.
Jose E. Marchesi Dec. 20, 2021, 9:49 a.m. UTC | #7
> On 12/17/21 5:44 PM, Alexei Starovoitov wrote:
>> On Fri, Dec 17, 2021 at 11:40:10AM +0100, Jose E. Marchesi wrote:
>>>
>>> 2) The need for DWARF to convey free-text tags on certain elements, such
>>>     as members of struct types.
>>>
>>>     The motivation for this was originally the way the Linux kernel
>>>     generates its BTF information, using pahole, using DWARF as a source.
>>>     As we discussed in our last exchange on this topic, this is
>>>     accidental, i.e. if the kernel switched to generate BTF directly from
>>>     the compiler and the linker could merge/deduplicate BTF, there would
>>>     be no need for using DWARF to act as the "unwilling conveyer" of this
>>>     information.  There are additional benefits of this second approach.
>>>     Thats why we didn't plan to add these extended DWARF DIEs to GCC.
>>>
>>>     However, it now seems that a DWARF consumer, the drgn project, would
>>>     also benefit from having such a support in DWARF to distinguish
>>>     between different kind of pointers.
>> drgn can use .percpu section in vmlinux for global percpu vars.
>> For pointers the annotation is indeed necessary.
>> 
>>>     So it seems to me that now we have two use-cases for adding support
>>>     for these free-text tags to DWARF, as a proper extension to the
>>>     format, strictly unrelated to BTF, BPF or even the kernel, since:
>>>     - This is not kernel specific.
>>>     - This is not directly related to BTF.
>>>     - This is not directly related to BPF.
>> __percpu annotation is kernel specific.
>> __user and __rcu are kernel specific too.
>> Only BPF and BTF can meaningfully consume all three.
>> drgn can consume __percpu.
>> In that sense if GCC follows LLVM and emits compiler specific DWARF
>> tag
>> pahole can convert it to the same BTF regardless whether kernel
>> was compiled with clang or gcc.
>> drgn can consume dwarf generated by clang or gcc as well even when BTF
>> is not there. That is the fastest way forward.
>> In that sense it would be nice to have common DWARF tag for pointer
>> annotations, but it's not mandatory. The time is the most valuable asset.
>> Implementing GCC specific DWARF tag doesn't require committee voting
>> and the mailing list bikeshedding.
>> 
>>> 3) Addition of C-family language-level constructions to specify
>>>     free-text tags on certain language elements, such as struct fields.
>>>
>>>     These are the attributes, or built-ins or whatever syntax.
>>>
>>>     Note that, strictly speaking:
>>>     - This is orthogonal to both DWARF and BTF, and any other supported
>>>       debugging format, which may or may not be expressive enough to
>>>       convey the free-form text tag.
>>>     - This is not specific to BPF.
>>>
>>>     Therefore I would avoid any reference to BTF or BPF in the attribute
>>>     names.  Something like `__attribute__((btf_tag("arbitrary_str")))'
>>>     makes very little sense to me; the attribute name ought to be more
>>>     generic.
>> Let's agree to disagree.
>> When BPF ISA was designed we didn't go to Intel, Arm, Mips, etc in order to
>> come up with the best ISA that would JIT to those architectures the best
>> possible way. Same thing with btf_tag. Today it is specific to BTF and BPF
>> only. Hence it's called this way. Whenever actual users will appear that need
>> free-text tags on a struct field then and only then will be the time to discuss
>> generic tag name. Just because "free-text tag on a struct field" sounds generic
>> it doesn't mean that it has any use case beyond what we're using it for in BPF
>> land. It goes back to the point of coding now instead of talking about coding.
>> If gcc wants to call it __attribute__((my_precious_gcc_tag("arbitrary_str")))
>> go ahead and code it this way. The include/linux/compiler.h can accommodate it.
>
> Just want to add a little bit context for this. In the beginning when
> we proposed to add the attribute, we named as a generic name like
> 'tag' (or something like that). But eventually upstream suggested
> 'btf_tag' since the use case we proposed is for bpf. At that point, we
> don't know drgn use cases yet. Even with that, the use cases are still
> just for linux kernel.
>
> At that time, some *similar* use cases did came up, e.g., for
> swift<->C++ conversion encoding ("tag name", "attribute info") for
> attributes in the source code, will help a lot. But they will use a
> different "tag name" than btf_tag to differentiate.

Thanks for the info.

I find it very interesting that the LLVM people prefer to have several
"use case specific" tag names instead of something more generic, which
is the exact opposite of what I would have done :) They may have
compelling reasons for doing so.  Do you have a pointer to the discussion
you had upstream at hand?

Anyway, I will test the waters with the other GCC hackers about both
DIEs and attribute and see what we can come up with.  Thanks again for
reaching out, Yonghong.
Yonghong Song Dec. 20, 2021, 3:52 p.m. UTC | #8
On 12/20/21 1:49 AM, Jose E. Marchesi wrote:
> 
>> On 12/17/21 5:44 PM, Alexei Starovoitov wrote:
>>> On Fri, Dec 17, 2021 at 11:40:10AM +0100, Jose E. Marchesi wrote:
>>>>
>>>> 2) The need for DWARF to convey free-text tags on certain elements, such
>>>>      as members of struct types.
>>>>
>>>>      The motivation for this was originally the way the Linux kernel
>>>>      generates its BTF information, using pahole, using DWARF as a source.
>>>>      As we discussed in our last exchange on this topic, this is
>>>>      accidental, i.e. if the kernel switched to generate BTF directly from
>>>>      the compiler and the linker could merge/deduplicate BTF, there would
>>>>      be no need for using DWARF to act as the "unwilling conveyer" of this
>>>>      information.  There are additional benefits of this second approach.
>>>>      Thats why we didn't plan to add these extended DWARF DIEs to GCC.
>>>>
>>>>      However, it now seems that a DWARF consumer, the drgn project, would
>>>>      also benefit from having such a support in DWARF to distinguish
>>>>      between different kind of pointers.
>>> drgn can use .percpu section in vmlinux for global percpu vars.
>>> For pointers the annotation is indeed necessary.
>>>
>>>>      So it seems to me that now we have two use-cases for adding support
>>>>      for these free-text tags to DWARF, as a proper extension to the
>>>>      format, strictly unrelated to BTF, BPF or even the kernel, since:
>>>>      - This is not kernel specific.
>>>>      - This is not directly related to BTF.
>>>>      - This is not directly related to BPF.
>>> __percpu annotation is kernel specific.
>>> __user and __rcu are kernel specific too.
>>> Only BPF and BTF can meaningfully consume all three.
>>> drgn can consume __percpu.
>>> In that sense if GCC follows LLVM and emits compiler specific DWARF
>>> tag
>>> pahole can convert it to the same BTF regardless whether kernel
>>> was compiled with clang or gcc.
>>> drgn can consume dwarf generated by clang or gcc as well even when BTF
>>> is not there. That is the fastest way forward.
>>> In that sense it would be nice to have common DWARF tag for pointer
>>> annotations, but it's not mandatory. The time is the most valuable asset.
>>> Implementing GCC specific DWARF tag doesn't require committee voting
>>> and the mailing list bikeshedding.
>>>
>>>> 3) Addition of C-family language-level constructions to specify
>>>>      free-text tags on certain language elements, such as struct fields.
>>>>
>>>>      These are the attributes, or built-ins or whatever syntax.
>>>>
>>>>      Note that, strictly speaking:
>>>>      - This is orthogonal to both DWARF and BTF, and any other supported
>>>>        debugging format, which may or may not be expressive enough to
>>>>        convey the free-form text tag.
>>>>      - This is not specific to BPF.
>>>>
>>>>      Therefore I would avoid any reference to BTF or BPF in the attribute
>>>>      names.  Something like `__attribute__((btf_tag("arbitrary_str")))'
>>>>      makes very little sense to me; the attribute name ought to be more
>>>>      generic.
>>> Let's agree to disagree.
>>> When BPF ISA was designed we didn't go to Intel, Arm, Mips, etc in order to
>>> come up with the best ISA that would JIT to those architectures the best
>>> possible way. Same thing with btf_tag. Today it is specific to BTF and BPF
>>> only. Hence it's called this way. Whenever actual users will appear that need
>>> free-text tags on a struct field then and only then will be the time to discuss
>>> generic tag name. Just because "free-text tag on a struct field" sounds generic
>>> it doesn't mean that it has any use case beyond what we're using it for in BPF
>>> land. It goes back to the point of coding now instead of talking about coding.
>>> If gcc wants to call it __attribute__((my_precious_gcc_tag("arbitrary_str")))
>>> go ahead and code it this way. The include/linux/compiler.h can accommodate it.
>>
>> Just want to add a little bit context for this. In the beginning when
>> we proposed to add the attribute, we named as a generic name like
>> 'tag' (or something like that). But eventually upstream suggested
>> 'btf_tag' since the use case we proposed is for bpf. At that point, we
>> don't know drgn use cases yet. Even with that, the use cases are still
>> just for linux kernel.
>>
>> At that time, some *similar* use cases did came up, e.g., for
>> swift<->C++ conversion encoding ("tag name", "attribute info") for
>> attributes in the source code, will help a lot. But they will use a
>> different "tag name" than btf_tag to differentiate.
> 
> Thanks for the info.
> 
> I find it very interesting that the LLVM people prefers to have several
> "use case specific" tag names instead of something more generic, which
> is the exact opposite of what I would have done :) They may have
> appealing reasons for doing so.  Do you have a pointer to the dicussion
> you had upstream at hand?

Jose, the llvm-dev discussion link is below:
   https://lists.llvm.org/pipermail/llvm-dev/2021-June/151009.html

> 
> Anyway, I will taste the waters with the other GCC hackers about both
> DIEs and attribute and see what we can come out with.  Thanks again for
> reaching out Yonghong.

Thanks!
Yonghong Song Jan. 25, 2022, 3:58 a.m. UTC | #9
On 12/20/21 1:49 AM, Jose E. Marchesi wrote:
> 
>> On 12/17/21 5:44 PM, Alexei Starovoitov wrote:
>>> On Fri, Dec 17, 2021 at 11:40:10AM +0100, Jose E. Marchesi wrote:
>>>>
>>>> 2) The need for DWARF to convey free-text tags on certain elements, such
>>>>      as members of struct types.
>>>>
>>>>      The motivation for this was originally the way the Linux kernel
>>>>      generates its BTF information, using pahole, using DWARF as a source.
>>>>      As we discussed in our last exchange on this topic, this is
>>>>      accidental, i.e. if the kernel switched to generate BTF directly from
>>>>      the compiler and the linker could merge/deduplicate BTF, there would
>>>>      be no need for using DWARF to act as the "unwilling conveyer" of this
>>>>      information.  There are additional benefits of this second approach.
>>>>      Thats why we didn't plan to add these extended DWARF DIEs to GCC.
>>>>
>>>>      However, it now seems that a DWARF consumer, the drgn project, would
>>>>      also benefit from having such a support in DWARF to distinguish
>>>>      between different kind of pointers.
>>> drgn can use .percpu section in vmlinux for global percpu vars.
>>> For pointers the annotation is indeed necessary.
>>>
>>>>      So it seems to me that now we have two use-cases for adding support
>>>>      for these free-text tags to DWARF, as a proper extension to the
>>>>      format, strictly unrelated to BTF, BPF or even the kernel, since:
>>>>      - This is not kernel specific.
>>>>      - This is not directly related to BTF.
>>>>      - This is not directly related to BPF.
>>> __percpu annotation is kernel specific.
>>> __user and __rcu are kernel specific too.
>>> Only BPF and BTF can meaningfully consume all three.
>>> drgn can consume __percpu.
>>> In that sense if GCC follows LLVM and emits compiler specific DWARF
>>> tag
>>> pahole can convert it to the same BTF regardless whether kernel
>>> was compiled with clang or gcc.
>>> drgn can consume dwarf generated by clang or gcc as well even when BTF
>>> is not there. That is the fastest way forward.
>>> In that sense it would be nice to have common DWARF tag for pointer
>>> annotations, but it's not mandatory. The time is the most valuable asset.
>>> Implementing GCC specific DWARF tag doesn't require committee voting
>>> and the mailing list bikeshedding.
>>>
>>>> 3) Addition of C-family language-level constructions to specify
>>>>      free-text tags on certain language elements, such as struct fields.
>>>>
>>>>      These are the attributes, or built-ins or whatever syntax.
>>>>
>>>>      Note that, strictly speaking:
>>>>      - This is orthogonal to both DWARF and BTF, and any other supported
>>>>        debugging format, which may or may not be expressive enough to
>>>>        convey the free-form text tag.
>>>>      - This is not specific to BPF.
>>>>
>>>>      Therefore I would avoid any reference to BTF or BPF in the attribute
>>>>      names.  Something like `__attribute__((btf_tag("arbitrary_str")))'
>>>>      makes very little sense to me; the attribute name ought to be more
>>>>      generic.
>>> Let's agree to disagree.
>>> When BPF ISA was designed we didn't go to Intel, Arm, Mips, etc in order to
>>> come up with the best ISA that would JIT to those architectures the best
>>> possible way. Same thing with btf_tag. Today it is specific to BTF and BPF
>>> only. Hence it's called this way. Whenever actual users will appear that need
>>> free-text tags on a struct field then and only then will be the time to discuss
>>> generic tag name. Just because "free-text tag on a struct field" sounds generic
>>> it doesn't mean that it has any use case beyond what we're using it for in BPF
>>> land. It goes back to the point of coding now instead of talking about coding.
>>> If gcc wants to call it __attribute__((my_precious_gcc_tag("arbitrary_str")))
>>> go ahead and code it this way. The include/linux/compiler.h can accommodate it.
>>
>> Just want to add a little bit context for this. In the beginning when
>> we proposed to add the attribute, we named as a generic name like
>> 'tag' (or something like that). But eventually upstream suggested
>> 'btf_tag' since the use case we proposed is for bpf. At that point, we
>> don't know drgn use cases yet. Even with that, the use cases are still
>> just for linux kernel.
>>
>> At that time, some *similar* use cases did came up, e.g., for
>> swift<->C++ conversion encoding ("tag name", "attribute info") for
>> attributes in the source code, will help a lot. But they will use a
>> different "tag name" than btf_tag to differentiate.
> 
> Thanks for the info.
> 
> I find it very interesting that the LLVM people prefers to have several
> "use case specific" tag names instead of something more generic, which
> is the exact opposite of what I would have done :) They may have
> appealing reasons for doing so.  Do you have a pointer to the dicussion
> you had upstream at hand?
> 
> Anyway, I will taste the waters with the other GCC hackers about both
> DIEs and attribute and see what we can come out with.  Thanks again for
> reaching out Yonghong.

Hi, Jose,

Any progress on the gcc btf_tag support discussion? If possible, could
you add me to the discussion mailing list so I can help move
the project forward? Thanks a lot!

Yonghong
Jose E. Marchesi Jan. 27, 2022, 3:38 p.m. UTC | #10
> On 12/20/21 1:49 AM, Jose E. Marchesi wrote:
>> 
>>> On 12/17/21 5:44 PM, Alexei Starovoitov wrote:
>>>> On Fri, Dec 17, 2021 at 11:40:10AM +0100, Jose E. Marchesi wrote:
>>>>>
>>>>> 2) The need for DWARF to convey free-text tags on certain elements, such
>>>>>      as members of struct types.
>>>>>
>>>>>      The motivation for this was originally the way the Linux kernel
>>>>>      generates its BTF information, using pahole, using DWARF as a source.
>>>>>      As we discussed in our last exchange on this topic, this is
>>>>>      accidental, i.e. if the kernel switched to generate BTF directly from
>>>>>      the compiler and the linker could merge/deduplicate BTF, there would
>>>>>      be no need for using DWARF to act as the "unwilling conveyer" of this
>>>>>      information.  There are additional benefits of this second approach.
>>>>>      Thats why we didn't plan to add these extended DWARF DIEs to GCC.
>>>>>
>>>>>      However, it now seems that a DWARF consumer, the drgn project, would
>>>>>      also benefit from having such a support in DWARF to distinguish
>>>>>      between different kind of pointers.
>>>> drgn can use .percpu section in vmlinux for global percpu vars.
>>>> For pointers the annotation is indeed necessary.
>>>>
>>>>>      So it seems to me that now we have two use-cases for adding support
>>>>>      for these free-text tags to DWARF, as a proper extension to the
>>>>>      format, strictly unrelated to BTF, BPF or even the kernel, since:
>>>>>      - This is not kernel specific.
>>>>>      - This is not directly related to BTF.
>>>>>      - This is not directly related to BPF.
>>>> __percpu annotation is kernel specific.
>>>> __user and __rcu are kernel specific too.
>>>> Only BPF and BTF can meaningfully consume all three.
>>>> drgn can consume __percpu.
>>>> In that sense if GCC follows LLVM and emits compiler specific DWARF
>>>> tag
>>>> pahole can convert it to the same BTF regardless whether kernel
>>>> was compiled with clang or gcc.
>>>> drgn can consume dwarf generated by clang or gcc as well even when BTF
>>>> is not there. That is the fastest way forward.
>>>> In that sense it would be nice to have common DWARF tag for pointer
>>>> annotations, but it's not mandatory. The time is the most valuable asset.
>>>> Implementing GCC specific DWARF tag doesn't require committee voting
>>>> and the mailing list bikeshedding.
>>>>
>>>>> 3) Addition of C-family language-level constructions to specify
>>>>>      free-text tags on certain language elements, such as struct fields.
>>>>>
>>>>>      These are the attributes, or built-ins or whatever syntax.
>>>>>
>>>>>      Note that, strictly speaking:
>>>>>      - This is orthogonal to both DWARF and BTF, and any other supported
>>>>>        debugging format, which may or may not be expressive enough to
>>>>>        convey the free-form text tag.
>>>>>      - This is not specific to BPF.
>>>>>
>>>>>      Therefore I would avoid any reference to BTF or BPF in the attribute
>>>>>      names.  Something like `__attribute__((btf_tag("arbitrary_str")))'
>>>>>      makes very little sense to me; the attribute name ought to be more
>>>>>      generic.
>>>> Let's agree to disagree.
>>>> When BPF ISA was designed we didn't go to Intel, Arm, Mips, etc in order to
>>>> come up with the best ISA that would JIT to those architectures the best
>>>> possible way. Same thing with btf_tag. Today it is specific to BTF and BPF
>>>> only. Hence it's called this way. Whenever actual users will appear that need
>>>> free-text tags on a struct field then and only then will be the time to discuss
>>>> generic tag name. Just because "free-text tag on a struct field" sounds generic
>>>> it doesn't mean that it has any use case beyond what we're using it for in BPF
>>>> land. It goes back to the point of coding now instead of talking about coding.
>>>> If gcc wants to call it __attribute__((my_precious_gcc_tag("arbitrary_str")))
>>>> go ahead and code it this way. The include/linux/compiler.h can accommodate it.
>>>
>>> Just want to add a little bit context for this. In the beginning when
>>> we proposed to add the attribute, we named as a generic name like
>>> 'tag' (or something like that). But eventually upstream suggested
>>> 'btf_tag' since the use case we proposed is for bpf. At that point, we
>>> don't know drgn use cases yet. Even with that, the use cases are still
>>> just for linux kernel.
>>>
>>> At that time, some *similar* use cases did came up, e.g., for
>>> swift<->C++ conversion encoding ("tag name", "attribute info") for
>>> attributes in the source code, will help a lot. But they will use a
>>> different "tag name" than btf_tag to differentiate.
>> Thanks for the info.
>> I find it very interesting that the LLVM people prefers to have
>> several
>> "use case specific" tag names instead of something more generic, which
>> is the exact opposite of what I would have done :) They may have
>> appealing reasons for doing so.  Do you have a pointer to the dicussion
>> you had upstream at hand?
>> Anyway, I will taste the waters with the other GCC hackers about
>> both
>> DIEs and attribute and see what we can come out with.  Thanks again for
>> reaching out Yonghong.
>
> Hi, Jose,
>
> Any progress on gcc btf_tag support discussion? If possible, could
> you add me to the discussion mailing list so I may help to move
> the project forward? Thanks a lot!

We are in the process of implementing support for the BTF extensions
(which is done) and the C language attributes (which is WIP).

I haven't started the discussion about DWARF yet.  Will do shortly.  You
will be in CC :)
Yonghong Song Jan. 27, 2022, 4:42 p.m. UTC | #11
On 1/27/22 7:38 AM, Jose E. Marchesi wrote:
> 
>> On 12/20/21 1:49 AM, Jose E. Marchesi wrote:
>>>
>>>> On 12/17/21 5:44 PM, Alexei Starovoitov wrote:
>>>>> On Fri, Dec 17, 2021 at 11:40:10AM +0100, Jose E. Marchesi wrote:
>>>>>>
>>>>>> 2) The need for DWARF to convey free-text tags on certain elements, such
>>>>>>       as members of struct types.
>>>>>>
>>>>>>       The motivation for this was originally the way the Linux kernel
>>>>>>       generates its BTF information, using pahole, using DWARF as a source.
>>>>>>       As we discussed in our last exchange on this topic, this is
>>>>>>       accidental, i.e. if the kernel switched to generate BTF directly from
>>>>>>       the compiler and the linker could merge/deduplicate BTF, there would
>>>>>>       be no need for using DWARF to act as the "unwilling conveyer" of this
>>>>>>       information.  There are additional benefits of this second approach.
>>>>>>       Thats why we didn't plan to add these extended DWARF DIEs to GCC.
>>>>>>
>>>>>>       However, it now seems that a DWARF consumer, the drgn project, would
>>>>>>       also benefit from having such a support in DWARF to distinguish
>>>>>>       between different kind of pointers.
>>>>> drgn can use .percpu section in vmlinux for global percpu vars.
>>>>> For pointers the annotation is indeed necessary.
>>>>>
>>>>>>       So it seems to me that now we have two use-cases for adding support
>>>>>>       for these free-text tags to DWARF, as a proper extension to the
>>>>>>       format, strictly unrelated to BTF, BPF or even the kernel, since:
>>>>>>       - This is not kernel specific.
>>>>>>       - This is not directly related to BTF.
>>>>>>       - This is not directly related to BPF.
>>>>> __percpu annotation is kernel specific.
>>>>> __user and __rcu are kernel specific too.
>>>>> Only BPF and BTF can meaningfully consume all three.
>>>>> drgn can consume __percpu.
>>>>> In that sense if GCC follows LLVM and emits compiler specific DWARF
>>>>> tag
>>>>> pahole can convert it to the same BTF regardless whether kernel
>>>>> was compiled with clang or gcc.
>>>>> drgn can consume dwarf generated by clang or gcc as well even when BTF
>>>>> is not there. That is the fastest way forward.
>>>>> In that sense it would be nice to have common DWARF tag for pointer
>>>>> annotations, but it's not mandatory. The time is the most valuable asset.
>>>>> Implementing GCC specific DWARF tag doesn't require committee voting
>>>>> and the mailing list bikeshedding.
>>>>>
>>>>>> 3) Addition of C-family language-level constructions to specify
>>>>>>       free-text tags on certain language elements, such as struct fields.
>>>>>>
>>>>>>       These are the attributes, or built-ins or whatever syntax.
>>>>>>
>>>>>>       Note that, strictly speaking:
>>>>>>       - This is orthogonal to both DWARF and BTF, and any other supported
>>>>>>         debugging format, which may or may not be expressive enough to
>>>>>>         convey the free-form text tag.
>>>>>>       - This is not specific to BPF.
>>>>>>
>>>>>>       Therefore I would avoid any reference to BTF or BPF in the attribute
>>>>>>       names.  Something like `__attribute__((btf_tag("arbitrary_str")))'
>>>>>>       makes very little sense to me; the attribute name ought to be more
>>>>>>       generic.
>>>>> Let's agree to disagree.
>>>>> When BPF ISA was designed we didn't go to Intel, Arm, Mips, etc in order to
>>>>> come up with the best ISA that would JIT to those architectures the best
>>>>> possible way. Same thing with btf_tag. Today it is specific to BTF and BPF
>>>>> only. Hence it's called this way. Whenever actual users will appear that need
>>>>> free-text tags on a struct field then and only then will be the time to discuss
>>>>> generic tag name. Just because "free-text tag on a struct field" sounds generic
>>>>> it doesn't mean that it has any use case beyond what we're using it for in BPF
>>>>> land. It goes back to the point of coding now instead of talking about coding.
>>>>> If gcc wants to call it __attribute__((my_precious_gcc_tag("arbitrary_str")))
>>>>> go ahead and code it this way. The include/linux/compiler.h can accommodate it.
>>>>
>>>> Just want to add a little bit context for this. In the beginning when
>>>> we proposed to add the attribute, we named as a generic name like
>>>> 'tag' (or something like that). But eventually upstream suggested
>>>> 'btf_tag' since the use case we proposed is for bpf. At that point, we
>>>> don't know drgn use cases yet. Even with that, the use cases are still
>>>> just for linux kernel.
>>>>
>>>> At that time, some *similar* use cases did came up, e.g., for
>>>> swift<->C++ conversion encoding ("tag name", "attribute info") for
>>>> attributes in the source code, will help a lot. But they will use a
>>>> different "tag name" than btf_tag to differentiate.
>>> Thanks for the info.
>>> I find it very interesting that the LLVM people prefers to have
>>> several
>>> "use case specific" tag names instead of something more generic, which
>>> is the exact opposite of what I would have done :) They may have
>>> appealing reasons for doing so.  Do you have a pointer to the dicussion
>>> you had upstream at hand?
>>> Anyway, I will taste the waters with the other GCC hackers about
>>> both
>>> DIEs and attribute and see what we can come out with.  Thanks again for
>>> reaching out Yonghong.
>>
>> Hi, Jose,
>>
>> Any progress on gcc btf_tag support discussion? If possible, could
>> you add me to the discussion mailing list so I may help to move
>> the project forward? Thanks a lot!
> 
> We are in the process of implementing the support of the BTF extensions
> (which is done) and the C language attributes (which is WIP.)

Sounds good. I am happy to answer questions if you have any.

> 
> I haven't started the discussion about DWARF yet.  Will do shortly.  You
> will be in CC :)

Thanks a lot, Jose! I am looking forward to the discussion.
Jose E. Marchesi Feb. 17, 2022, 1:20 p.m. UTC | #12
> On 1/27/22 7:38 AM, Jose E. Marchesi wrote:
>> 
>>> On 12/20/21 1:49 AM, Jose E. Marchesi wrote:
>>>>
>>>>> On 12/17/21 5:44 PM, Alexei Starovoitov wrote:
>>>>>> On Fri, Dec 17, 2021 at 11:40:10AM +0100, Jose E. Marchesi wrote:
>>>>>>>
>>>>>>> 2) The need for DWARF to convey free-text tags on certain elements, such
>>>>>>>       as members of struct types.
>>>>>>>
>>>>>>>       The motivation for this was originally the way the Linux kernel
>>>>>>>       generates its BTF information, using pahole, using DWARF as a source.
>>>>>>>       As we discussed in our last exchange on this topic, this is
>>>>>>>       accidental, i.e. if the kernel switched to generate BTF directly from
>>>>>>>       the compiler and the linker could merge/deduplicate BTF, there would
>>>>>>>       be no need for using DWARF to act as the "unwilling conveyer" of this
>>>>>>>       information.  There are additional benefits of this second approach.
>>>>>>>       That's why we didn't plan to add these extended DWARF DIEs to GCC.
>>>>>>>
>>>>>>>       However, it now seems that a DWARF consumer, the drgn project, would
>>>>>>>       also benefit from having such support in DWARF to distinguish
>>>>>>>       between different kinds of pointers.
>>>>>> drgn can use .percpu section in vmlinux for global percpu vars.
>>>>>> For pointers the annotation is indeed necessary.
>>>>>>
>>>>>>>       So it seems to me that now we have two use-cases for adding support
>>>>>>>       for these free-text tags to DWARF, as a proper extension to the
>>>>>>>       format, strictly unrelated to BTF, BPF or even the kernel, since:
>>>>>>>       - This is not kernel specific.
>>>>>>>       - This is not directly related to BTF.
>>>>>>>       - This is not directly related to BPF.
>>>>>> __percpu annotation is kernel specific.
>>>>>> __user and __rcu are kernel specific too.
>>>>>> Only BPF and BTF can meaningfully consume all three.
>>>>>> drgn can consume __percpu.
>>>>>> In that sense, if GCC follows LLVM and emits a compiler-specific DWARF
>>>>>> tag, pahole can convert it to the same BTF regardless of whether the
>>>>>> kernel was compiled with clang or gcc.
>>>>>> drgn can consume dwarf generated by clang or gcc as well even when BTF
>>>>>> is not there. That is the fastest way forward.
>>>>>> In that sense it would be nice to have common DWARF tag for pointer
>>>>>> annotations, but it's not mandatory. The time is the most valuable asset.
>>>>>> Implementing GCC specific DWARF tag doesn't require committee voting
>>>>>> and the mailing list bikeshedding.
>>>>>>
>>>>>>> 3) Addition of C-family language-level constructions to specify
>>>>>>>       free-text tags on certain language elements, such as struct fields.
>>>>>>>
>>>>>>>       These are the attributes, or built-ins or whatever syntax.
>>>>>>>
>>>>>>>       Note that, strictly speaking:
>>>>>>>       - This is orthogonal to both DWARF and BTF, and any other supported
>>>>>>>         debugging format, which may or may not be expressive enough to
>>>>>>>         convey the free-form text tag.
>>>>>>>       - This is not specific to BPF.
>>>>>>>
>>>>>>>       Therefore I would avoid any reference to BTF or BPF in the attribute
>>>>>>>       names.  Something like `__attribute__((btf_tag("arbitrary_str")))'
>>>>>>>       makes very little sense to me; the attribute name ought to be more
>>>>>>>       generic.
>>>>>> Let's agree to disagree.
>>>>>> When BPF ISA was designed we didn't go to Intel, Arm, Mips, etc in order to
>>>>>> come up with the best ISA that would JIT to those architectures the best
>>>>>> possible way. Same thing with btf_tag. Today it is specific to BTF and BPF
>>>>>> only. Hence it's called this way. Whenever actual users will appear that need
>>>>>> free-text tags on a struct field then and only then will be the time to discuss
>>>>>> generic tag name. Just because "free-text tag on a struct field" sounds generic
>>>>>> it doesn't mean that it has any use case beyond what we're using it for in BPF
>>>>>> land. It goes back to the point of coding now instead of talking about coding.
>>>>>> If gcc wants to call it __attribute__((my_precious_gcc_tag("arbitrary_str")))
>>>>>> go ahead and code it this way. The include/linux/compiler.h can accommodate it.
>>>>>
>>>>> Just want to add a little bit of context for this. In the beginning when
>>>>> we proposed to add the attribute, we gave it a generic name like
>>>>> 'tag' (or something like that). But eventually upstream suggested
>>>>> 'btf_tag' since the use case we proposed is for bpf. At that point, we
>>>>> didn't know about the drgn use cases yet. Even with that, the use cases
>>>>> are still just for the linux kernel.
>>>>>
>>>>> At that time, some *similar* use cases did come up, e.g., for
>>>>> swift<->C++ conversion, where encoding ("tag name", "attribute info")
>>>>> for attributes in the source code would help a lot. But they will use
>>>>> a different "tag name" than btf_tag to differentiate.
>>>> Thanks for the info.
>>>> I find it very interesting that the LLVM people prefer to have
>>>> several "use case specific" tag names instead of something more
>>>> generic, which is the exact opposite of what I would have done :)
>>>> They may have appealing reasons for doing so.  Do you have a pointer
>>>> to the discussion you had upstream at hand?
>>>> Anyway, I will test the waters with the other GCC hackers about both
>>>> DIEs and attribute and see what we can come up with.  Thanks again for
>>>> reaching out, Yonghong.
>>>
>>> Hi, Jose,
>>>
>>> Any progress on the gcc btf_tag support discussion? If possible, could
>>> you add me to the discussion mailing list so I may help to move
>>> the project forward? Thanks a lot!
>> We are in the process of implementing the support of the BTF
>> extensions (which is done) and the C language attributes (which is
>> WIP.)
>
> Sounds good. I am happy to answer questions if you have any.
>
>> I haven't started the discussion about DWARF yet.  Will do shortly.
>> You will be in CC :)
>
> Thanks a lot, Jose! I am looking forward to the discussion.

Just a heads-up.

We are still working on the GCC implementation of the tags.  Having some
difficulties with the ordering of the C type attributes.

Regarding the DWARF part, GCC uses DWARF as the internal "canonical"
debug info, and the BTF is generated from it.  This means we had to add
a DWARF DIE for the pointer tag qualifier anyway in order to convey the
info to BTF.  So now it is just a matter of emitting it along with the
rest of the DWARF.
Alexei Starovoitov Feb. 17, 2022, 3:28 p.m. UTC | #13
On Thu, Feb 17, 2022 at 5:20 AM Jose E. Marchesi
<jose.marchesi@oracle.com> wrote:
>
> Just a heads-up.
>
> We are still working on the GCC implementation of the tags.  Having some
> difficulties with the ordering of the C type attributes.
>
> Regarding the DWARF part, GCC uses DWARF as the internal "canonical"
> debug info, and the BTF is generated from it.  This means we had to add
> a DWARF DIE for the pointer tag qualifier anyway in order to convey the
> info to BTF.  So now it is just a matter of emitting it along with the
> rest of the DWARF.

Thanks for the update!
Do you have an early git branch we can use to test building
the kernel with it?
Or is it not at this level yet?
Jose E. Marchesi Feb. 17, 2022, 4:41 p.m. UTC | #14
> On Thu, Feb 17, 2022 at 5:20 AM Jose E. Marchesi
> <jose.marchesi@oracle.com> wrote:
>>
>> Just a heads-up.
>>
>> We are still working on the GCC implementation of the tags.  Having some
>> difficulties with the ordering of the C type attributes.
>>
>> Regarding the DWARF part, GCC uses DWARF as the internal "canonical"
>> debug info, and the BTF is generated from it.  This means we had to add
>> a DWARF DIE for the pointer tag qualifier anyway in order to convey the
>> info to BTF.  So now it is just a matter of emitting it along with the
>> rest of the DWARF.
>
> Thanks for the update!
> Do you have an early git branch we can use to test building
> the kernel with it?
> Or is it not at this level yet?

Not yet.

Once we have something working internally we will submit the patches to
gcc-patches for discussion.  At that point we can put them in a branch
for early testing.
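
For readers following the attribute-naming part of the thread: the remark that include/linux/compiler.h can accommodate whatever spelling a compiler chooses could look roughly like the sketch below. This is only an illustration, not the kernel's actual definition. The names example_tag, EXAMPLE_TAG_ACTIVE, struct example_req, example_tag_active and example_self_check are hypothetical, and btf_tag is simply the spelling quoted in this thread. On a compiler that does not report the attribute, the macro expands to nothing and the code still builds.

```c
#include <assert.h>
#include <stddef.h>

/* Older compilers may lack __has_attribute; treat every attribute as absent. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/*
 * Hypothetical wrapper macro (assumption, not taken from the kernel tree).
 * A compiler that picks a different attribute name would only need one
 * more branch here; users of example_tag() stay unchanged.
 */
#if __has_attribute(btf_tag)
#define example_tag(s) __attribute__((btf_tag(s)))
#define EXAMPLE_TAG_ACTIVE 1
#else
#define example_tag(s)
#define EXAMPLE_TAG_ACTIVE 0
#endif

/* A struct member carrying a free-text tag, as discussed in the thread. */
struct example_req {
	int len;
	void *buf example_tag("user");
};

/* Reports whether the tag was emitted as an attribute or compiled out. */
int example_tag_active(void)
{
	return EXAMPLE_TAG_ACTIVE;
}

/* The tag is metadata only; ordinary use of the struct is unaffected. */
void example_self_check(void)
{
	struct example_req r = { .len = 4, .buf = NULL };

	assert(r.len == 4);
	assert(r.buf == NULL);
	assert(offsetof(struct example_req, buf) >= sizeof(int));
}
```

Keeping the compiler-specific spelling behind one macro means the choice of attribute name stays a single-file change, whichever name each compiler ends up with.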