Message ID | B653950A-A58F-44C0-AD9D-95370710810F@fb.com (mailing list archive)
---|---
State | RFC |
Series | RFC: Mark "inlined by some callers" functions in BTF
Context | Check | Description
---|---|---
netdev/tree_selection | success | Not a local patch |
Hi Song,

On Thu, Jan 11, 2024 at 09:51:05PM +0000, Song Liu wrote:
> The problem
>
> Inlining can cause surprises to tracing users, especially when the tool
> appears to be working. For example, with
>
> [root@ ~]# bpftrace -e 'kprobe:switch_mm {}'
> Attaching 1 probe...
>
> The user may not realize switch_mm() is inlined by leave_mm(), and we are
> not tracing the code path leave_mm => switch_mm. (This is x86_64, and both
> functions are in arch/x86/mm/tlb.c.)
>
> We have folks working on ideas to create offline tools to detect such
> issues for critical use cases at compile time. However, I think it is
> necessary to handle it at program load/attach time.

Could you clarify what offline means?

I wonder if libbpf should just give a way for applications to query
inline status. Seems most flexible. And maybe also a special SEC() def?

Although I suspect end users might want a flag to "attach anyways; I'm
aware", so a more generic API (e.g. `btf__is_inlined(..)`) results in a
smaller and maximally flexible result.

> Detect "inlined by some callers" functions
>
> This appears to be straightforward in pahole. Something like the following
> should do the work:
>
> diff --git i/btf_encoder.c w/btf_encoder.c
> index fd040086827e..e546a059eb4b 100644
> --- i/btf_encoder.c
> +++ w/btf_encoder.c
> @@ -885,6 +885,15 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct functio
>  	struct llvm_annotation *annot;
>  	const char *name;
>
> +	if (function__inlined(fn)) {
> +		/* This function is inlined by some callers. */
> +	}
> +
>  	btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
>  	name = function__name(fn);
>  	btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
>
> Mark "inlined by some callers" functions
>
> We have a few options to mark these functions.
>
> 1. We can set struct btf_type.info.kind_flag for inlined function. Or we
> can use a bit from info.vlen.

Seems reasonable. Decl tag is another option but probably too heavy.

> 2. We can simply not generate btf for these functions. This is similar to
> --skip_encoding_btf_inconsistent_proto.

This option seems a bit confusing. Basically you're suggesting that:

	func_in_available_filter_functions && !func_in_btf

implies partially inlined function, right? Seems a bit non-obvious.

[...]

Thanks,
Daniel
> On Jan 11, 2024, at 2:48 PM, Daniel Xu <dxu@dxuuu.xyz> wrote:
>
> Hi Song,
>
> On Thu, Jan 11, 2024 at 09:51:05PM +0000, Song Liu wrote:
>> The problem
>>
>> Inlining can cause surprises to tracing users, especially when the tool
>> appears to be working. For example, with
>>
>> [root@ ~]# bpftrace -e 'kprobe:switch_mm {}'
>> Attaching 1 probe...
>>
>> The user may not realize switch_mm() is inlined by leave_mm(), and we are
>> not tracing the code path leave_mm => switch_mm. (This is x86_64, and both
>> functions are in arch/x86/mm/tlb.c.)
>>
>> We have folks working on ideas to create offline tools to detect such
>> issues for critical use cases at compile time. However, I think it is
>> necessary to handle it at program load/attach time.
>
> Could you clarify what offline means?

The idea is to keep a list of kernel functions used by key services. At
kernel build time, we check whether any of these functions are inlined.
If so, the tool will catch it, and we need to add noinline to the function.

> I wonder if libbpf should just give a way for applications to query
> inline status. Seems most flexible. And maybe also a special SEC() def?

API to query inline status makes sense. What do we do with SEC()?

> Although I suspect end users might want a flag to "attach anyways; I'm
> aware", so a more generic API (e.g. `btf__is_inlined(..)`) results in a
> smaller and maximally flexible result.
>
>> Detect "inlined by some callers" functions
>>
>> This appears to be straightforward in pahole. Something like the following
>> should do the work:
>>
>> diff --git i/btf_encoder.c w/btf_encoder.c
>> index fd040086827e..e546a059eb4b 100644
>> --- i/btf_encoder.c
>> +++ w/btf_encoder.c
>> @@ -885,6 +885,15 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct functio
>>  	struct llvm_annotation *annot;
>>  	const char *name;
>>
>> +	if (function__inlined(fn)) {
>> +		/* This function is inlined by some callers. */
>> +	}
>> +
>>  	btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
>>  	name = function__name(fn);
>>  	btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
>>
>> Mark "inlined by some callers" functions
>>
>> We have a few options to mark these functions.
>>
>> 1. We can set struct btf_type.info.kind_flag for inlined function. Or we
>> can use a bit from info.vlen.
>
> Seems reasonable. Decl tag is another option but probably too heavy.
>
>> 2. We can simply not generate btf for these functions. This is similar to
>> --skip_encoding_btf_inconsistent_proto.
>
> This option seems a bit confusing. Basically you're suggesting that:
>
> 	func_in_available_filter_functions && !func_in_btf
>
> implies partially inlined function, right? Seems a bit non-obvious.

I didn't think about available_filter_functions. My original idea was to
not generate BTF for these functions, and thus disallow attaching (some)
BPF programs to them. I guess that will not stop regular kprobe attach.

Thanks,
Song
On Thu, Jan 11, 2024 at 11:06:23PM +0000, Song Liu wrote:
>
>> On Jan 11, 2024, at 2:48 PM, Daniel Xu <dxu@dxuuu.xyz> wrote:
>>
>> Hi Song,
>>
>> On Thu, Jan 11, 2024 at 09:51:05PM +0000, Song Liu wrote:
>>> The problem
>>>
>>> Inlining can cause surprises to tracing users, especially when the tool
>>> appears to be working. For example, with
>>>
>>> [root@ ~]# bpftrace -e 'kprobe:switch_mm {}'
>>> Attaching 1 probe...
>>>
>>> The user may not realize switch_mm() is inlined by leave_mm(), and we are
>>> not tracing the code path leave_mm => switch_mm. (This is x86_64, and both
>>> functions are in arch/x86/mm/tlb.c.)
>>>
>>> We have folks working on ideas to create offline tools to detect such
>>> issues for critical use cases at compile time. However, I think it is
>>> necessary to handle it at program load/attach time.
>>
>> Could you clarify what offline means?
>
> The idea is to keep a list of kernel functions used by key services. At
> kernel build time, we check whether any of these functions are inlined.
> If so, the tool will catch it, and we need to add noinline to the function.

Neat idea!

>> I wonder if libbpf should just give a way for applications to query
>> inline status. Seems most flexible. And maybe also a special SEC() def?
>
> API to query inline status makes sense. What do we do with SEC()?

Oh, I meant that you could make the SEC() parser take for example:

	SEC("kprobe.noinline/switch_mm")

or something like that. Rather than add flags to bpf_program which may
not be applicable to every program type.

[...]

Thanks,
Daniel
> On Jan 11, 2024, at 3:49 PM, Daniel Xu <dxu@dxuuu.xyz> wrote:
>
> On Thu, Jan 11, 2024 at 11:06:23PM +0000, Song Liu wrote:
>>
>>> On Jan 11, 2024, at 2:48 PM, Daniel Xu <dxu@dxuuu.xyz> wrote:
>>>
>>> Hi Song,
>>>
>>> On Thu, Jan 11, 2024 at 09:51:05PM +0000, Song Liu wrote:
>>>> The problem
>>>>
>>>> Inlining can cause surprises to tracing users, especially when the tool
>>>> appears to be working. For example, with
>>>>
>>>> [root@ ~]# bpftrace -e 'kprobe:switch_mm {}'
>>>> Attaching 1 probe...
>>>>
>>>> The user may not realize switch_mm() is inlined by leave_mm(), and we are
>>>> not tracing the code path leave_mm => switch_mm. (This is x86_64, and both
>>>> functions are in arch/x86/mm/tlb.c.)
>>>>
>>>> We have folks working on ideas to create offline tools to detect such
>>>> issues for critical use cases at compile time. However, I think it is
>>>> necessary to handle it at program load/attach time.
>>>
>>> Could you clarify what offline means?
>>
>> The idea is to keep a list of kernel functions used by key services. At
>> kernel build time, we check whether any of these functions are inlined.
>> If so, the tool will catch it, and we need to add noinline to the function.
>
> Neat idea!
>
>>> I wonder if libbpf should just give a way for applications to query
>>> inline status. Seems most flexible. And maybe also a special SEC() def?
>>
>> API to query inline status makes sense. What do we do with SEC()?
>
> Oh, I meant that you could make the SEC() parser take for example:
>
> 	SEC("kprobe.noinline/switch_mm")
>
> or something like that. Rather than add flags to bpf_program which may
> not be applicable to every program type.

This is an interesting idea. Thanks for the suggestion!

Song
On Thu, Jan 11, 2024 at 1:51 PM Song Liu <songliubraving@meta.com> wrote:
>
> The problem
>
> Inlining can cause surprises to tracing users, especially when the tool
> appears to be working. For example, with
>
> [root@ ~]# bpftrace -e 'kprobe:switch_mm {}'
> Attaching 1 probe...
>
> The user may not realize switch_mm() is inlined by leave_mm(), and we are
> not tracing the code path leave_mm => switch_mm. (This is x86_64, and both
> functions are in arch/x86/mm/tlb.c.)
>
> We have folks working on ideas to create offline tools to detect such
> issues for critical use cases at compile time. However, I think it is
> necessary to handle it at program load/attach time.
>
>
> Detect "inlined by some callers" functions
>
> This appears to be straightforward in pahole. Something like the following
> should do the work:
>
> diff --git i/btf_encoder.c w/btf_encoder.c
> index fd040086827e..e546a059eb4b 100644
> --- i/btf_encoder.c
> +++ w/btf_encoder.c
> @@ -885,6 +885,15 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct functio
>  	struct llvm_annotation *annot;
>  	const char *name;
>
> +	if (function__inlined(fn)) {
> +		/* This function is inlined by some callers. */
> +	}
> +
>  	btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
>  	name = function__name(fn);
>  	btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
>
>
> Mark "inlined by some callers" functions
>
> We have a few options to mark these functions.
>
> 1. We can set struct btf_type.info.kind_flag for inlined function. Or we
> can use a bit from info.vlen.

It doesn't feel right to use kflag or vlen for this particular
property. I think decl_tag is the generic way to have extra
information/annotation.

> 2. We can simply not generate btf for these functions. This is similar to
> --skip_encoding_btf_inconsistent_proto.

This is too drastic, IMO. Even if some function was inlined somewhere,
it still might be important to trace non-inlined calls. So just
removing BTF is a regression in behavior.

>
> Handle tracing inlined functions
>
> If we go with option 1 above, we have a few options to handle program
> load/attach to "inlined by some callers" functions:
>
> a) We can reject the load/attach;
> b) We can reject the load/attach, unless the user set a new flag;
> c) We can simply print a warning, and let the load/attach work.

I'd start with c), probably. Everything else is a regression in
behavior compared to what we have today.

> Please share your comments on this. Is this something we want to handle?
> If so, which of these options makes more sense?
>
> Thanks,
> Song
On 23/01/2024 00:30, Andrii Nakryiko wrote:
> On Thu, Jan 11, 2024 at 1:51 PM Song Liu <songliubraving@meta.com> wrote:
>>
>> The problem
>>
>> Inlining can cause surprises to tracing users, especially when the tool
>> appears to be working. For example, with
>>
>> [root@ ~]# bpftrace -e 'kprobe:switch_mm {}'
>> Attaching 1 probe...
>>
>> The user may not realize switch_mm() is inlined by leave_mm(), and we are
>> not tracing the code path leave_mm => switch_mm. (This is x86_64, and both
>> functions are in arch/x86/mm/tlb.c.)
>>
>> We have folks working on ideas to create offline tools to detect such
>> issues for critical use cases at compile time. However, I think it is
>> necessary to handle it at program load/attach time.
>>
>>
>> Detect "inlined by some callers" functions
>>
>> This appears to be straightforward in pahole. Something like the following
>> should do the work:
>>
>> diff --git i/btf_encoder.c w/btf_encoder.c
>> index fd040086827e..e546a059eb4b 100644
>> --- i/btf_encoder.c
>> +++ w/btf_encoder.c
>> @@ -885,6 +885,15 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder, struct functio
>>  	struct llvm_annotation *annot;
>>  	const char *name;
>>
>> +	if (function__inlined(fn)) {
>> +		/* This function is inlined by some callers. */
>> +	}
>> +
>>  	btf_fnproto_id = btf_encoder__add_func_proto(encoder, &fn->proto);
>>  	name = function__name(fn);
>>  	btf_fn_id = btf_encoder__add_ref_type(encoder, BTF_KIND_FUNC, btf_fnproto_id, name, false);
>>
>>
>> Mark "inlined by some callers" functions
>>
>> We have a few options to mark these functions.
>>
>> 1. We can set struct btf_type.info.kind_flag for inlined function. Or we
>> can use a bit from info.vlen.
>
> It doesn't feel right to use kflag or vlen for this particular
> property. I think decl_tag is the generic way to have extra
> information/annotation.
>
>> 2. We can simply not generate btf for these functions. This is similar to
>> --skip_encoding_btf_inconsistent_proto.
>
> This is too drastic, IMO. Even if some function was inlined somewhere,
> it still might be important to trace non-inlined calls. So just
> removing BTF is a regression in behavior.
>

Agreed; I haven't done the experiment but I suspect a lot of functions
would disappear from vmlinux BTF if we excluded functions on the basis
they were inlined at some point. Running "pfunct -H" (showing functions
that were inlined by the compiler but not declared inline) on a recent
bpf-next I see 22000 different functions, so if even a small percentage
of those were inlined at some sites but not others we'd lose quite a bit
of tracing coverage.

Knowing that we are potentially missing some tracing information is
definitely valuable, so having declaration tags providing
sometimes-inlined info would be great! A custom SEC() that would fail to
attach for sometimes-inlined functions on the basis of declaration tag
info is a great idea too; the original use case could then make use of
that and fail in a meaningful way rather than appearing to succeed.

Alan

>>
>>
>> Handle tracing inlined functions
>>
>> If we go with option 1 above, we have a few options to handle program
>> load/attach to "inlined by some callers" functions:
>>
>> a) We can reject the load/attach;
>> b) We can reject the load/attach, unless the user set a new flag;
>> c) We can simply print a warning, and let the load/attach work.
>
> I'd start with c), probably. Everything else is a regression in
> behavior compared to what we have today.
>
>> Please share your comments on this. Is this something we want to handle?
>> If so, which of these options makes more sense?
>>
>> Thanks,
>> Song