
RFC: Mark "inlined by some callers" functions in BTF

Message ID B653950A-A58F-44C0-AD9D-95370710810F@fb.com (mailing list archive)
State RFC
Headers show
Series RFC: Mark "inlined by some callers" functions in BTF

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Song Liu Jan. 11, 2024, 9:51 p.m. UTC
The problem

Inlining can cause surprises to tracing users, especially when the tool
appears to be working. For example, with

    [root@ ~]# bpftrace -e 'kprobe:switch_mm {}'
    Attaching 1 probe...

The user may not realize switch_mm() is inlined by leave_mm(), and we are
not tracing the code path leave_mm => switch_mm. (This is x86_64, and both
functions are in arch/x86/mm/tlb.c.)

We have folks working on ideas to create offline tools to detect such
issues for critical use cases at compile time. However, I think it is
necessary to handle it at program load/attach time.


Detect "inlined by some callers" functions

This appears to be straightforward in pahole. Something like the following
should do the work:
diff --git i/btf_encoder.c w/btf_encoder.c
index fd040086827e..e546a059eb4b 100644
--- i/btf_encoder.c
+++ w/btf_encoder.c
@@ -885,6 +885,15 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct functio
        struct llvm_annotation *annot;
        const char *name;

+       if (function__inlined(fn)) {
+               /* This function is inlined by some callers. */
+       }
+
        btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
        name = function__name(fn);
        btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);


Mark "inlined by some callers" functions

We have a few options to mark these functions.


1. We can set struct btf_type.info.kind_flag for inlined function. Or we
   can use a bit from info.vlen.

2. We can simply not generate btf for these functions. This is similar to
   --skip_encoding_btf_inconsistent_proto.


Handle tracing inlined functions

If we go with option 1 above, we have a few options to handle program
load/attach to "inlined by some callers" functions:

a) We can reject the load/attach;
b) We can reject the load/attach, unless the user sets a new flag;
c) We can simply print a warning, and let the load/attach work.


Please share your comments on this. Is this something we want to handle?
If so, which of these options makes more sense?

Thanks,
Song

Comments

Daniel Xu Jan. 11, 2024, 10:48 p.m. UTC | #1
Hi Song,

On Thu, Jan 11, 2024 at 09:51:05PM +0000, Song Liu wrote:
> The problem
> 
> Inlining can cause surprises to tracing users, especially when the tool
> appears to be working. For example, with
> 
>     [root@ ~]# bpftrace -e 'kprobe:switch_mm {}'
>     Attaching 1 probe...
> 
> The user may not realize switch_mm() is inlined by leave_mm(), and we are
> not tracing the code path leave_mm => switch_mm. (This is x86_64, and both
> functions are in arch/x86/mm/tlb.c.)
> 
> We have folks working on ideas to create offline tools to detect such
> issues for critical use cases at compile time. However, I think it is
> necessary to handle it at program load/attach time.

Could you clarify what offline means?

I wonder if libbpf should just give a way for applications to query
inline status. Seems most flexible.  And maybe also a special SEC() def?

Although I suspect end users might want a flag to "attach anyway; I'm
aware", so a more generic API (e.g. `btf__is_inlined(..)`) would be
smaller and maximally flexible.

> Detect "inlined by some callers" functions
> 
> This appears to be straightforward in pahole. Something like the following
> should do the work:
> 
> diff --git i/btf_encoder.c w/btf_encoder.c
> index fd040086827e..e546a059eb4b 100644
> --- i/btf_encoder.c
> +++ w/btf_encoder.c
> @@ -885,6 +885,15 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct functio
>         struct llvm_annotation *annot;
>         const char *name;
> 
> +       if (function__inlined(fn)) {
> +               /* This function is inlined by some callers. */
> +       }
> +
>         btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
>         name = function__name(fn);
>         btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
> 
> 
> Mark "inlined by some callers" functions
> 
> We have a few options to mark these functions.
> 
> 1. We can set struct btf_type.info.kind_flag for inlined function. Or we
>    can use a bit from info.vlen.

Seems reasonable. Decl tag is another option but probably too heavy.

> 2. We can simply not generate btf for these functions. This is similar to
>    --skip_encoding_btf_inconsistent_proto.

This option seems a bit confusing. Basically you're suggesting that:

        func_in_available_filter_functions && !func_in_btf

implies partially inlined function, right? Seems a bit non-obvious.

[...]

Thanks,
Daniel
Song Liu Jan. 11, 2024, 11:06 p.m. UTC | #2
> On Jan 11, 2024, at 2:48 PM, Daniel Xu <dxu@dxuuu.xyz> wrote:
> 
> Hi Song,
> 
> On Thu, Jan 11, 2024 at 09:51:05PM +0000, Song Liu wrote:
>> The problem
>> 
>> Inlining can cause surprises to tracing users, especially when the tool
>> appears to be working. For example, with
>> 
>>    [root@ ~]# bpftrace -e 'kprobe:switch_mm {}'
>>    Attaching 1 probe...
>> 
>> The user may not realize switch_mm() is inlined by leave_mm(), and we are
>> not tracing the code path leave_mm => switch_mm. (This is x86_64, and both
>> functions are in arch/x86/mm/tlb.c.)
>> 
>> We have folks working on ideas to create offline tools to detect such
>> issues for critical use cases at compile time. However, I think it is
>> necessary to handle it at program load/attach time.
> 
> Could you clarify what offline means?

The idea is to keep a list of kernel functions used by key services. At 
kernel build time, we check whether any of these functions are inlined. 
If so, the tool will catch it, and we need to add noinline to the function. 

> 
> I wonder if libbpf should just give a way for applications to query
> inline status. Seems most flexible.  And maybe also a special SEC() def?

API to query inline status makes sense. What do we do with SEC()?

> 
> Although I suspect end users might want a flag to "attach anyway; I'm
> aware", so a more generic API (e.g. `btf__is_inlined(..)`) would be
> smaller and maximally flexible.
> 
>> Detect "inlined by some callers" functions
>> 
>> This appears to be straightforward in pahole. Something like the following
>> should do the work:
>> 
>> diff --git i/btf_encoder.c w/btf_encoder.c
>> index fd040086827e..e546a059eb4b 100644
>> --- i/btf_encoder.c
>> +++ w/btf_encoder.c
>> @@ -885,6 +885,15 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct functio
>>        struct llvm_annotation *annot;
>>        const char *name;
>> 
>> +       if (function__inlined(fn)) {
>> +               /* This function is inlined by some callers. */
>> +       }
>> +
>>        btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
>>        name = function__name(fn);
>>        btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
>> 
>> 
>> Mark "inlined by some callers" functions
>> 
>> We have a few options to mark these functions.
>> 
>> 1. We can set struct btf_type.info.kind_flag for inlined function. Or we
>>   can use a bit from info.vlen.
> 
> Seems reasonable. Decl tag is another option but probably too heavy.
> 
>> 2. We can simply not generate btf for these functions. This is similar to
>>   --skip_encoding_btf_inconsistent_proto.
> 
> This option seems a bit confusing. Basically you're suggesting that:
> 
>        func_in_available_filter_functions && !func_in_btf
> 
> implies partially inlined function, right? Seems a bit non-obvious.

I didn't think about available_filter_functions. My original idea was to
not generate BTF for these functions, and thus disallow attaching (some)
BPF programs to them. I guess that will not stop regular kprobe attach. 

Thanks,
Song
Daniel Xu Jan. 11, 2024, 11:49 p.m. UTC | #3
On Thu, Jan 11, 2024 at 11:06:23PM +0000, Song Liu wrote:
> 
> 
> > On Jan 11, 2024, at 2:48 PM, Daniel Xu <dxu@dxuuu.xyz> wrote:
> > 
> > Hi Song,
> > 
> > On Thu, Jan 11, 2024 at 09:51:05PM +0000, Song Liu wrote:
> >> The problem
> >> 
> >> Inlining can cause surprises to tracing users, especially when the tool
> >> appears to be working. For example, with
> >> 
> >>    [root@ ~]# bpftrace -e 'kprobe:switch_mm {}'
> >>    Attaching 1 probe...
> >> 
> >> The user may not realize switch_mm() is inlined by leave_mm(), and we are
> >> not tracing the code path leave_mm => switch_mm. (This is x86_64, and both
> >> functions are in arch/x86/mm/tlb.c.)
> >> 
> >> We have folks working on ideas to create offline tools to detect such
> >> issues for critical use cases at compile time. However, I think it is
> >> necessary to handle it at program load/attach time.
> > 
> > Could you clarify what offline means?
> 
> The idea is to keep a list of kernel functions used by key services. At 
> kernel build time, we check whether any of these functions are inlined. 
> If so, the tool will catch it, and we need to add noinline to the function. 

Neat idea!

> > I wonder if libbpf should just give a way for applications to query
> > inline status. Seems most flexible.  And maybe also a special SEC() def?
> 
> API to query inline status makes sense. What do we do with SEC()?

Oh, I meant that you could make the SEC() parser take for example:

      SEC("kprobe.noinline/switch_mm")

or something like that. Rather than add flags to bpf_program which may
not be applicable to every program type.

[...]

Thanks,
Daniel
Song Liu Jan. 12, 2024, 9:59 p.m. UTC | #4
> On Jan 11, 2024, at 3:49 PM, Daniel Xu <dxu@dxuuu.xyz> wrote:
> 
> On Thu, Jan 11, 2024 at 11:06:23PM +0000, Song Liu wrote:
>> 
>> 
>>> On Jan 11, 2024, at 2:48 PM, Daniel Xu <dxu@dxuuu.xyz> wrote:
>>> 
>>> Hi Song,
>>> 
>>> On Thu, Jan 11, 2024 at 09:51:05PM +0000, Song Liu wrote:
>>>> The problem
>>>> 
>>>> Inlining can cause surprises to tracing users, especially when the tool
>>>> appears to be working. For example, with
>>>> 
>>>>   [root@ ~]# bpftrace -e 'kprobe:switch_mm {}'
>>>>   Attaching 1 probe...
>>>> 
>>>> The user may not realize switch_mm() is inlined by leave_mm(), and we are
>>>> not tracing the code path leave_mm => switch_mm. (This is x86_64, and both
>>>> functions are in arch/x86/mm/tlb.c.)
>>>> 
>>>> We have folks working on ideas to create offline tools to detect such
>>>> issues for critical use cases at compile time. However, I think it is
>>>> necessary to handle it at program load/attach time.
>>> 
>>> Could you clarify what offline means?
>> 
>> The idea is to keep a list of kernel functions used by key services. At 
>> kernel build time, we check whether any of these functions are inlined. 
>> If so, the tool will catch it, and we need to add noinline to the function.
> 
> Neat idea!
> 
>>> I wonder if libbpf should just give a way for applications to query
>>> inline status. Seems most flexible.  And maybe also a special SEC() def?
>> 
>> API to query inline status makes sense. What do we do with SEC()?
> 
> Oh, I meant that you could make the SEC() parser take for example:
> 
>      SEC("kprobe.noinline/switch_mm")
> 
> or something like that. Rather than add flags to bpf_program which may
> not be applicable to every program type.

This is an interesting idea. Thanks for the suggestion!

Song
Andrii Nakryiko Jan. 23, 2024, 12:30 a.m. UTC | #5
On Thu, Jan 11, 2024 at 1:51 PM Song Liu <songliubraving@meta.com> wrote:
>
> The problem
>
> Inlining can cause surprises to tracing users, especially when the tool
> appears to be working. For example, with
>
>     [root@ ~]# bpftrace -e 'kprobe:switch_mm {}'
>     Attaching 1 probe...
>
> The user may not realize switch_mm() is inlined by leave_mm(), and we are
> not tracing the code path leave_mm => switch_mm. (This is x86_64, and both
> functions are in arch/x86/mm/tlb.c.)
>
> We have folks working on ideas to create offline tools to detect such
> issues for critical use cases at compile time. However, I think it is
> necessary to handle it at program load/attach time.
>
>
> Detect "inlined by some callers" functions
>
> This appears to be straightforward in pahole. Something like the following
> should do the work:
>
> diff --git i/btf_encoder.c w/btf_encoder.c
> index fd040086827e..e546a059eb4b 100644
> --- i/btf_encoder.c
> +++ w/btf_encoder.c
> @@ -885,6 +885,15 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct functio
>         struct llvm_annotation *annot;
>         const char *name;
>
> +       if (function__inlined(fn)) {
> +               /* This function is inlined by some callers. */
> +       }
> +
>         btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
>         name = function__name(fn);
>         btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
>
>
> Mark "inlined by some callers" functions
>
> We have a few options to mark these functions.
>
> 1. We can set struct btf_type.info.kind_flag for inlined function. Or we
>    can use a bit from info.vlen.

It doesn't feel right to use kflag or vlen for this particular
property. I think decl_tag is the generic way to have extra
information/annotation.

>
> 2. We can simply not generate btf for these functions. This is similar to
>    --skip_encoding_btf_inconsistent_proto.

This is too drastic, IMO. Even if some function was inlined somewhere,
it still might be important to trace non-inlined calls. So just
removing BTF is a regression in behavior.

>
>
> Handle tracing inlined functions
>
> If we go with option 1 above, we have a few options to handle program
> load/attach to "inlined by some callers" functions:
>
> a) We can reject the load/attach;
> b) We can reject the load/attach, unless the user sets a new flag;
> c) We can simply print a warning, and let the load/attach work.
>

I'd start with c), probably. Everything else is a regression in
behavior compared to what we have today.


>
> Please share your comments on this. Is this something we want to handle?
> If so, which of these options makes more sense?
>
> Thanks,
> Song
>
>
Alan Maguire Jan. 23, 2024, 10:32 a.m. UTC | #6
On 23/01/2024 00:30, Andrii Nakryiko wrote:
> On Thu, Jan 11, 2024 at 1:51 PM Song Liu <songliubraving@meta.com> wrote:
>>
>> The problem
>>
>> Inlining can cause surprises to tracing users, especially when the tool
>> appears to be working. For example, with
>>
>>     [root@ ~]# bpftrace -e 'kprobe:switch_mm {}'
>>     Attaching 1 probe...
>>
>> The user may not realize switch_mm() is inlined by leave_mm(), and we are
>> not tracing the code path leave_mm => switch_mm. (This is x86_64, and both
>> functions are in arch/x86/mm/tlb.c.)
>>
>> We have folks working on ideas to create offline tools to detect such
>> issues for critical use cases at compile time. However, I think it is
>> necessary to handle it at program load/attach time.
>>
>>
>> Detect "inlined by some callers" functions
>>
>> This appears to be straightforward in pahole. Something like the following
>> should do the work:
>>
>> diff --git i/btf_encoder.c w/btf_encoder.c
>> index fd040086827e..e546a059eb4b 100644
>> --- i/btf_encoder.c
>> +++ w/btf_encoder.c
>> @@ -885,6 +885,15 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct functio
>>         struct llvm_annotation *annot;
>>         const char *name;
>>
>> +       if (function__inlined(fn)) {
>> +               /* This function is inlined by some callers. */
>> +       }
>> +
>>         btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
>>         name = function__name(fn);
>>         btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
>>
>>
>> Mark "inlined by some callers" functions
>>
>> We have a few options to mark these functions.
>>
>> 1. We can set struct btf_type.info.kind_flag for inlined function. Or we
>>    can use a bit from info.vlen.
> 
> It doesn't feel right to use kflag or vlen for this particular
> property. I think decl_tag is the generic way to have extra
> information/annotation.
> 
>>
>> 2. We can simply not generate btf for these functions. This is similar to
>>    --skip_encoding_btf_inconsistent_proto.
> 
> This is too drastic, IMO. Even if some function was inlined somewhere,
> it still might be important to trace non-inlined calls. So just
> removing BTF is a regression in behavior.
>

Agreed; I haven't done the experiment but I suspect a lot of functions
would disappear from vmlinux BTF if we excluded functions on the basis
they were inlined at some point. Running "pfunct -H" (showing functions
that were inlined by the compiler but not declared inline) on a recent
bpf-next I see 22000 different functions, so if even a small percentage
of those were inlined at some sites but not others we'd lose quite a bit
of tracing coverage.

Knowing that we are potentially missing some tracing information is
definitely valuable, so having declaration tags providing
sometimes-inlined info would be great! A custom SEC() that would fail to
attach for sometimes-inlined functions on the basis of declaration tag
info is a great idea too; the original use case could then make use of
that and fail in a meaningful way rather than appearing to succeed.

Alan

>>
>>
>> Handle tracing inlined functions
>>
>> If we go with option 1 above, we have a few options to handle program
>> load/attach to "inlined by some callers" functions:
>>
>> a) We can reject the load/attach;
>> b) We can reject the load/attach, unless the user sets a new flag;
>> c) We can simply print a warning, and let the load/attach work.
>>
> 
> I'd start with c), probably. Everything else is a regression in
> behavior compared to what we have today.
> 
> 
>>
>> Please share your comments on this. Is this something we want to handle?
>> If so, which of these options makes more sense?
>>
>> Thanks,
>> Song
>>
>>

Patch

diff --git i/btf_encoder.c w/btf_encoder.c
index fd040086827e..e546a059eb4b 100644
--- i/btf_encoder.c
+++ w/btf_encoder.c
@@ -885,6 +885,15 @@  static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct functio
        struct llvm_annotation *annot;
        const char *name;

+       if (function__inlined(fn)) {
+               /* This function is inlined by some callers. */
+       }
+
        btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
        name = function__name(fn);
        btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);

