mbox series

[RFC,dwarves,0/6] Encoding function addresses using DECL_TAGs

Message ID 20230517161648.17582-1-alan.maguire@oracle.com (mailing list archive)
Headers show
Series Encoding function addresses using DECL_TAGs | expand

Message

Alan Maguire May 17, 2023, 4:16 p.m. UTC
As a means to continue the discussion in [1], which is
concerned with finding the best long-term solution to
having a BPF Type Format (BTF) representation of
functions that is usable for tracing of edge cases, this
proof-of-concept series is intended to explore one approach
to adding information to help make tracing more accurate.

A key problem today is that there is no matching from function
description to the actual instances of a function.

When that function only has one description, that is
not an issue, but if we have multiple inconsistent
static functions in different CUs such as

From kernel/irq/irqdesc.c
    
    static ssize_t wakeup_show(struct kobject *kobj,
                               struct kobj_attribute *attr, char *buf)
    
...and from drivers/base/power/sysfs.c
    
    static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
                               char *buf);

...this becomes a problem.  If I am attaching,
which do I want?  And even if I know which one
I want, which instance in kallsyms is which?

This series is a proof-of-concept that supports encoding
function addresses and associating them with BTF FUNC
descriptions using BTF declaration tags.

More work would need to be done on the kernel side
to _use_ this representation, but hopefully having a
rough approach outlined will help make that more feasible.

[1] https://lore.kernel.org/bpf/ZF61j8WJls25BYTl@krava/

Alan Maguire (6):
  btf_encoder: record function address and if it is local
  dwarf_loader: store address in function low_pc if available
  dwarf_loader: transfer low_pc info from subtroutine to its abstract
    origin
  btf_encoder: add "addr=0x<addr>" function declaration tag if
    --btf_gen_func_addr specified
  btf_encoder: store ELF function representations sorted by name _and_
    address
  pahole: document --btf_gen_func_addr

 btf_encoder.c      | 64 +++++++++++++++++++++++++++++++++++-----------
 btf_encoder.h      |  4 +--
 dwarf_loader.c     | 16 +++++++++---
 dwarves.h          |  3 +++
 man-pages/pahole.1 |  8 ++++++
 pahole.c           | 12 +++++++--
 6 files changed, 85 insertions(+), 22 deletions(-)

Comments

Yafang Shao May 19, 2023, 9:44 a.m. UTC | #1
On Thu, May 18, 2023 at 12:18 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> As a means to continue the discussion in [1], which is
> concerned with finding the best long-term solution to
> having a BPF Type Format (BTF) representation of
> functions that is usable for tracing of edge cases, this
> proof-of-concept series is intended to explore one approach
> to adding information to help make tracing more accurate.
>
> A key problem today is that there is no matching from function
> description to the actual instances of a function.
>
> When that function only has one description, that is
> not an issue, but if we have multiple inconsistent
> static functions in different CUs such as
>
> From kernel/irq/irqdesc.c
>
>     static ssize_t wakeup_show(struct kobject *kobj,
>                                struct kobj_attribute *attr, char *buf)
>
> ...and from drivers/base/power/sysfs.c
>
>     static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
>                                char *buf);
>
> ...this becomes a problem.  If I am attaching,
> which do I want?  And even if I know which one
> I want, which instance in kallsyms is which?
>

As you described in the above example,  it is natural to attach a
*function* defined in a specific *file_path*.  So why not encoding the
file path instead ? What's the problem in it?

If we expose the addr and let the user select which address to attach,
it will be a trouble to deploy a bpf program across multiple kernels.
While the file path will have a lower chance to be conflict between
different kernel versions. So I think it would be better if we use the
file path instead and let the kernel find the address automatically.
In the old days, when we wanted to deploy a kprobe kernel module or a
systemtap script across multiple kernels, we had to use if-else in the
code, which was really troublesome as it is not scalable. I don't
think we want to do it the same way in the bpf program.

> This series is a proof-of-concept that supports encoding
> function addresses and associating them with BTF FUNC
> descriptions using BTF declaration tags.
>
> More work would need to be done on the kernel side
> to _use_ this representation, but hopefully having a
> rough approach outlined will help make that more feasible.
>
> [1] https://lore.kernel.org/bpf/ZF61j8WJls25BYTl@krava/
>
> Alan Maguire (6):
>   btf_encoder: record function address and if it is local
>   dwarf_loader: store address in function low_pc if available
>   dwarf_loader: transfer low_pc info from subtroutine to its abstract
>     origin
>   btf_encoder: add "addr=0x<addr>" function declaration tag if
>     --btf_gen_func_addr specified
>   btf_encoder: store ELF function representations sorted by name _and_
>     address
>   pahole: document --btf_gen_func_addr
>
>  btf_encoder.c      | 64 +++++++++++++++++++++++++++++++++++-----------
>  btf_encoder.h      |  4 +--
>  dwarf_loader.c     | 16 +++++++++---
>  dwarves.h          |  3 +++
>  man-pages/pahole.1 |  8 ++++++
>  pahole.c           | 12 +++++++--
>  6 files changed, 85 insertions(+), 22 deletions(-)
>
> --
> 2.31.1
>
Yonghong Song May 19, 2023, 2:46 p.m. UTC | #2
On 5/19/23 2:44 AM, Yafang Shao wrote:
> On Thu, May 18, 2023 at 12:18 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> As a means to continue the discussion in [1], which is
>> concerned with finding the best long-term solution to
>> having a BPF Type Format (BTF) representation of
>> functions that is usable for tracing of edge cases, this
>> proof-of-concept series is intended to explore one approach
>> to adding information to help make tracing more accurate.
>>
>> A key problem today is that there is no matching from function
>> description to the actual instances of a function.
>>
>> When that function only has one description, that is
>> not an issue, but if we have multiple inconsistent
>> static functions in different CUs such as
>>
>>  From kernel/irq/irqdesc.c
>>
>>      static ssize_t wakeup_show(struct kobject *kobj,
>>                                 struct kobj_attribute *attr, char *buf)
>>
>> ...and from drivers/base/power/sysfs.c
>>
>>      static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
>>                                 char *buf);
>>
>> ...this becomes a problem.  If I am attaching,
>> which do I want?  And even if I know which one
>> I want, which instance in kallsyms is which?
>>
> 
> As you described in the above example,  it is natural to attach a
> *function* defined in a specific *file_path*.  So why not encoding the
> file path instead ? What's the problem in it?

Say we have the following kernel code:

common.h:
    static inline int foo(...) { ...}

bar1.c:
    #include "common.h."
    ... foo() ...
bar2.c:
    #include "common.h"
    ... foo() ...

Even if the function 'foo' is marked as inline, but the compiler
may not actually inline it. So now we got two static functions
'foo' in bar1.o and bar2.o with identical code path (common.h),
and we do want to differentiate it as user might want to
only trace one of them.

> 
> If we expose the addr and let the user select which address to attach,
> it will be a trouble to deploy a bpf program across multiple kernels.

user space may need a little bit work to decide which address to take
by looking at the vmlinux BTF intending to run, e.g., checking
BTF signatures in most time or in the worst case checking dwarf.

The kernel can then handle addr by doing some relocation if needed.

> While the file path will have a lower chance to be conflict between
> different kernel versions. So I think it would be better if we use the
> file path instead and let the kernel find the address automatically.
> In the old days, when we wanted to deploy a kprobe kernel module or a
> systemtap script across multiple kernels, we had to use if-else in the
> code, which was really troublesome as it is not scalable. I don't
> think we want to do it the same way in the bpf program.
> 
>> This series is a proof-of-concept that supports encoding
>> function addresses and associating them with BTF FUNC
>> descriptions using BTF declaration tags.
>>
>> More work would need to be done on the kernel side
>> to _use_ this representation, but hopefully having a
>> rough approach outlined will help make that more feasible.
>>
>> [1] https://lore.kernel.org/bpf/ZF61j8WJls25BYTl@krava/
>>
>> Alan Maguire (6):
>>    btf_encoder: record function address and if it is local
>>    dwarf_loader: store address in function low_pc if available
>>    dwarf_loader: transfer low_pc info from subtroutine to its abstract
>>      origin
>>    btf_encoder: add "addr=0x<addr>" function declaration tag if
>>      --btf_gen_func_addr specified
>>    btf_encoder: store ELF function representations sorted by name _and_
>>      address
>>    pahole: document --btf_gen_func_addr
>>
>>   btf_encoder.c      | 64 +++++++++++++++++++++++++++++++++++-----------
>>   btf_encoder.h      |  4 +--
>>   dwarf_loader.c     | 16 +++++++++---
>>   dwarves.h          |  3 +++
>>   man-pages/pahole.1 |  8 ++++++
>>   pahole.c           | 12 +++++++--
>>   6 files changed, 85 insertions(+), 22 deletions(-)
>>
>> --
>> 2.31.1
>>
> 
>
Alan Maguire May 19, 2023, 3:08 p.m. UTC | #3
On 19/05/2023 10:44, Yafang Shao wrote:
> On Thu, May 18, 2023 at 12:18 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> As a means to continue the discussion in [1], which is
>> concerned with finding the best long-term solution to
>> having a BPF Type Format (BTF) representation of
>> functions that is usable for tracing of edge cases, this
>> proof-of-concept series is intended to explore one approach
>> to adding information to help make tracing more accurate.
>>
>> A key problem today is that there is no matching from function
>> description to the actual instances of a function.
>>
>> When that function only has one description, that is
>> not an issue, but if we have multiple inconsistent
>> static functions in different CUs such as
>>
>> From kernel/irq/irqdesc.c
>>
>>     static ssize_t wakeup_show(struct kobject *kobj,
>>                                struct kobj_attribute *attr, char *buf)
>>
>> ...and from drivers/base/power/sysfs.c
>>
>>     static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
>>                                char *buf);
>>
>> ...this becomes a problem.  If I am attaching,
>> which do I want?  And even if I know which one
>> I want, which instance in kallsyms is which?
>>
> 
> As you described in the above example,  it is natural to attach a
> *function* defined in a specific *file_path*.  So why not encoding the
> file path instead ? What's the problem in it?
> 
> If we expose the addr and let the user select which address to attach,
> it will be a trouble to deploy a bpf program across multiple kernels.
> While the file path will have a lower chance to be conflict between
> different kernel versions. So I think it would be better if we use the
> file path instead and let the kernel find the address automatically.
> In the old days, when we wanted to deploy a kprobe kernel module or a
> systemtap script across multiple kernels, we had to use if-else in the
> code, which was really troublesome as it is not scalable. I don't
> think we want to do it the same way in the bpf program.
> 

I think it's important to distinguish between what we do
in libbpf/kernel to ensure safe attach and what sorts
of tracing capabilities a tracer supports. For example,
a tracer (or indeed libbpf) can take the set of different
instances of cpumask_weight() and because they all have the
same BTF description, can attach to all instances. What we
lose though by not having address-level accuracy is the
ability to do more precise tracing.

I also don't think matching addresses -> BTF ids stops us
having file information in the mix; a "struct btf_func"
could potentially contain a name offset to a string
specifying the file too. My concern though is that
relying on _just_ the file/line will not be enough
in some cases. There are some weird edge cases in the
kernel. I'll give an example where file/line isn't
enough to figure out the mapping.

Consider 'struct elf_note_info'; there are two copies
of this in the vmlinux BTF:

[16819] STRUCT 'elf_note_info' size=248 vlen=8
        'thread' type_id=16817 bits_offset=0
        'psinfo' type_id=16815 bits_offset=64
        'signote' type_id=16815 bits_offset=256
        'auxv' type_id=16815 bits_offset=448
        'files' type_id=16815 bits_offset=640
        'csigdata' type_id=6083 bits_offset=832
        'size' type_id=74 bits_offset=1856
        'thread_notes' type_id=21 bits_offset=1920


[16851] STRUCT 'elf_note_info' size=248 vlen=8
        'thread' type_id=16850 bits_offset=0
        'psinfo' type_id=16815 bits_offset=64
        'signote' type_id=16815 bits_offset=256
        'auxv' type_id=16815 bits_offset=448
        'files' type_id=16815 bits_offset=640
        'csigdata' type_id=6507 bits_offset=832
        'size' type_id=74 bits_offset=1856
        'thread_notes' type_id=21 bits_offset=1920


...and here's the structure itself:

struct elf_note_info {
 	struct elf_thread_core_info *thread;
 	struct memelfnote psinfo;
 	struct memelfnote signote;
 	struct memelfnote auxv;
 	struct memelfnote files;
 	user_siginfo_t csigdata;
 	size_t size;
 	int thread_notes;
};

The reason there are two copies is not a dedup failure;
in the first case user_siginfo_t is defined as

[6083] TYPEDEF 'siginfo_t' type_id=6082

while in the second case it gets defined as

[6507] TYPEDEF 'compat_siginfo_t' type_id=6506

The reason is include/linux/compat.h was included
in one place when fs/binfmt_elf.c was built and not
in another when fs/binfmt_elf.c was built the
second time.

Because the object was built into the kernel twice,
as well as having two different instances of BTF descriptions
for the same structure, we also have multiple different BTF
function descriptions for functions that use a
"struct elf_note_info *", and we need to figure out which
maps to which kallsyms instance of the functions like
this:

ffffffff81489380 t fill_thread_core_info
ffffffff8148ca90 t fill_thread_core_info

Now, which BTF description goes with which instance?
The file/line number is identical for each BTF description,
but the BTF is (correctly) different because the object has
been built into the kernel twice and is slightly different
in each case. The difference isn't significant in practice here,
but that doesn't mean it couldn't be in other cases. Imagine a
case where one instance of the structure contained a field and
the other didn't; we'd be interpreting the tracing data
incorrectly.

I realize it's a weird edge case, but that's just one
I found from digging a bit. My worry is if we go with
the file/line as a way of identifying the function,
these sorts of edge cases will pile up and we'll need
to go more fine-grained and use addresses in the end
anyway. Again, that doesn't stop us applying higher-level
semantics at a tracer level, but we need to build those on
solid foundations.

Alan

>> This series is a proof-of-concept that supports encoding
>> function addresses and associating them with BTF FUNC
>> descriptions using BTF declaration tags.
>>
>> More work would need to be done on the kernel side
>> to _use_ this representation, but hopefully having a
>> rough approach outlined will help make that more feasible.
>>
>> [1] https://lore.kernel.org/bpf/ZF61j8WJls25BYTl@krava/
>>
>> Alan Maguire (6):
>>   btf_encoder: record function address and if it is local
>>   dwarf_loader: store address in function low_pc if available
>>   dwarf_loader: transfer low_pc info from subtroutine to its abstract
>>     origin
>>   btf_encoder: add "addr=0x<addr>" function declaration tag if
>>     --btf_gen_func_addr specified
>>   btf_encoder: store ELF function representations sorted by name _and_
>>     address
>>   pahole: document --btf_gen_func_addr
>>
>>  btf_encoder.c      | 64 +++++++++++++++++++++++++++++++++++-----------
>>  btf_encoder.h      |  4 +--
>>  dwarf_loader.c     | 16 +++++++++---
>>  dwarves.h          |  3 +++
>>  man-pages/pahole.1 |  8 ++++++
>>  pahole.c           | 12 +++++++--
>>  6 files changed, 85 insertions(+), 22 deletions(-)
>>
>> --
>> 2.31.1
>>
> 
>