Message ID | 20230913142646.190047-1-alan.maguire@oracle.com (mailing list archive) |
---|---|
Headers | show |
Series | dwarves: detect BTF kinds supported by kernel | expand |
On Wed, 2023-09-13 at 15:26 +0100, Alan Maguire wrote: > > > > When a newer pahole is run on an older kernel, it often knows about BTF > > > > kinds that the kernel does not support, and adds them to the BTF > > > > representation. This is a problem because the BTF generated is then > > > > embedded in the kernel image. When it is later read - possibly by > > > > a different older toolchain or by the kernel directly - it is not usable. > > > > > > > > The scripts/pahole-flags.sh script enumerates the various pahole options > > > > available associated with various versions of pahole, but in the case > > > > of an older kernel is the set of BTF kinds the _kernel_ can handle that > > > > is of more importance. > > > > > > > > Because recent features such as BTF_KIND_ENUM64 are added by default > > > > (and only skipped if --skip_encoding_btf_* is set), BTF will be > > > > created with these newer kinds that the older kernel cannot read. > > > > This can be (and has been) fixed by stable-backporting --skip options, > > > > but this is cumbersome and would have to be done every time a new BTF kind > > > > is introduced. > > > > > > > > So this series attempts to detect the BTF kinds supported by the > > > > kernel/modules so that this can inform BTF encoding for older > > > > kernels. We look for BTF_KIND_MAX - either as an enumerated value > > > > in vmlinux DWARF (patch 1) or as an enumerated value in base vmlinux > > > > BTF (patch 3). Knowing this prior to encoding BTF allows us to specify > > > > skip_encoding options to avoid having BTF with kinds the kernel itself > > > > will not understand. > > > > > > > > The aim is to minimize pain for older stable kernels when new BTF > > > > kinds are introduced. Kind encoding [1] can solve the parsing problem > > > > with BTF, but this approach is intended to ensure generated BTF is > > > > usable when newer pahole runs on older kernels. > > > > > > > > This approach requires BTF kinds to be defined via an enumerated type, > > > > which happened for 5.16 and later. Older kernels than this used #defines > > > > so the approach will only work for 5.16 stable kernels and later currently. > > > > > > > > With this change in hand, adding new BTF kinds becomes a bit simpler, > > > > at least for the user of pahole. All that needs to be done is to add > > > > internal "skip_new_kind" booleans to struct conf_load and set them > > > > in dwarves__set_btf_kind_max() if the detected maximum kind is less > > > > than the kind in question - in other words, if the kernel does not know > > > > about that kind. In that case, we will not use it in encoding. > > > > > > > > The approach was tested on Linux 5.16 as released, i.e. prior to the > > > > backports adding --skip_encoding logic, and the BTF generated did not > > > > contain kinds > BTF_KIND_MAX for the kernel (corresponding to > > > > BTF_KIND_DECL_TAG in that case). Hi Alan, I tested this patch (by tweaking BTF_KIND_MAX value and comparing generated BTF) and it seems to work as intended. Left a few nitpicks. Thanks, Eduard > > > > > > > > Changes since RFC [2]: > > > > - added --skip_autodetect_btf_kind_max to disable kind autodetection > > > > (Jiri, patch 2) > > > > > > > > [1] https://lore.kernel.org/bpf/20230616171728.530116-1-alan.maguire@oracle.com/ > > > > [2] https://lore.kernel.org/bpf/20230720201443.224040-1-alan.maguire@oracle.com/ > > > > > > > > Alan Maguire (3): > > > > dwarves: auto-detect maximum kind supported by vmlinux > > > > pahole: add --skip_autodetect_btf_kind_max to disable kind autodetect > > > > btf_encoder: learn BTF_KIND_MAX value from base BTF when generating > > > > split BTF > > > > > > > > btf_encoder.c | 37 +++++++++++++++++++++++++++++++++ > > > > btf_encoder.h | 2 ++ > > > > dwarf_loader.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++ > > > > dwarves.h | 3 +++ > > > > man-pages/pahole.1 | 4 ++++ > > > > pahole.c | 10 +++++++++ > > > > 6 files changed, 108 insertions(+) > > > >
On Wed, Sep 13, 2023 at 7:26 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > When a newer pahole is run on an older kernel, it often knows about BTF > kinds that the kernel does not support, and adds them to the BTF > representation. This is a problem because the BTF generated is then > embedded in the kernel image. When it is later read - possibly by > a different older toolchain or by the kernel directly - it is not usable. > > The scripts/pahole-flags.sh script enumerates the various pahole options > available associated with various versions of pahole, but in the case > of an older kernel is the set of BTF kinds the _kernel_ can handle that > is of more importance. > > Because recent features such as BTF_KIND_ENUM64 are added by default > (and only skipped if --skip_encoding_btf_* is set), BTF will be > created with these newer kinds that the older kernel cannot read. > This can be (and has been) fixed by stable-backporting --skip options, > but this is cumbersome and would have to be done every time a new BTF kind > is introduced. > Yes, this is indeed the problem, but I don't think any sort of auto detection by pahole itself of what is the BTF_KIND_MAX is the best solution. Sometimes new features are added to existing kinds (like kflag usage, etc), and that will still break even with "auto detection". I think the solution is to design pahole behavior in such a way that it allows full control for old kernels to specify which BTF features are expected to be generated, while also allowing to default to all the latest and greatest BTF features by default for any other application. So, how about something like the following. By default, pahole generates all the BTF features it knows about. But we add a new flag that says to stay conservative and only generate a specified subset of BTF features. E.g.: 1) `pahole -J <eLF.o>` will generate everything 2) `pahole -J <elf.o> --btf_feature=basic` will generate only the very basic stuff (we can decide what constitutes basic, we can go all the way to before we added variables, or can pick any random state after that) 3) `pahole -J <elf.o> --btf_feature=basic --btf_feature=enum64 --btf_feature=fancy_funcs` will generate only requested bits. We can have --btf_feature=all as a convenience as well, but kernel scripts won't use it. From the very beginning, pahole should not fail with a feature name that it doesn't recognize, though (maybe emit a warning, don't know). So that kernel-side scripts can be simpler: when kernel starts to recognize some new BTF functionality, we just unconditionally add another `--btf_feature=<something>`. And that works starting from the first pahole version that supports this `--btf_feature` flag. All this cleverness in trying to guess what kernel supports and what not (without actually running the kernel and feature-testing) will just come biting us later on. This never works reliably. > So this series attempts to detect the BTF kinds supported by the > kernel/modules so that this can inform BTF encoding for older > kernels. We look for BTF_KIND_MAX - either as an enumerated value > in vmlinux DWARF (patch 1) or as an enumerated value in base vmlinux > BTF (patch 3). Knowing this prior to encoding BTF allows us to specify > skip_encoding options to avoid having BTF with kinds the kernel itself > will not understand. > > The aim is to minimize pain for older stable kernels when new BTF > kinds are introduced. Kind encoding [1] can solve the parsing problem > with BTF, but this approach is intended to ensure generated BTF is > usable when newer pahole runs on older kernels. > > This approach requires BTF kinds to be defined via an enumerated type, > which happened for 5.16 and later. Older kernels than this used #defines > so the approach will only work for 5.16 stable kernels and later currently. > > With this change in hand, adding new BTF kinds becomes a bit simpler, > at least for the user of pahole. All that needs to be done is to add > internal "skip_new_kind" booleans to struct conf_load and set them > in dwarves__set_btf_kind_max() if the detected maximum kind is less > than the kind in question - in other words, if the kernel does not know > about that kind. In that case, we will not use it in encoding. > > The approach was tested on Linux 5.16 as released, i.e. prior to the > backports adding --skip_encoding logic, and the BTF generated did not > contain kinds > BTF_KIND_MAX for the kernel (corresponding to > BTF_KIND_DECL_TAG in that case). > > Changes since RFC [2]: > - added --skip_autodetect_btf_kind_max to disable kind autodetection > (Jiri, patch 2) > > [1] https://lore.kernel.org/bpf/20230616171728.530116-1-alan.maguire@oracle.com/ > [2] https://lore.kernel.org/bpf/20230720201443.224040-1-alan.maguire@oracle.com/ > > Alan Maguire (3): > dwarves: auto-detect maximum kind supported by vmlinux > pahole: add --skip_autodetect_btf_kind_max to disable kind autodetect > btf_encoder: learn BTF_KIND_MAX value from base BTF when generating > split BTF > > btf_encoder.c | 37 +++++++++++++++++++++++++++++++++ > btf_encoder.h | 2 ++ > dwarf_loader.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++ > dwarves.h | 3 +++ > man-pages/pahole.1 | 4 ++++ > pahole.c | 10 +++++++++ > 6 files changed, 108 insertions(+) > > -- > 2.39.3 >
On 14/09/2023 18:58, Andrii Nakryiko wrote: > On Wed, Sep 13, 2023 at 7:26 AM Alan Maguire <alan.maguire@oracle.com> wrote: >> >> When a newer pahole is run on an older kernel, it often knows about BTF >> kinds that the kernel does not support, and adds them to the BTF >> representation. This is a problem because the BTF generated is then >> embedded in the kernel image. When it is later read - possibly by >> a different older toolchain or by the kernel directly - it is not usable. >> >> The scripts/pahole-flags.sh script enumerates the various pahole options >> available associated with various versions of pahole, but in the case >> of an older kernel is the set of BTF kinds the _kernel_ can handle that >> is of more importance. >> >> Because recent features such as BTF_KIND_ENUM64 are added by default >> (and only skipped if --skip_encoding_btf_* is set), BTF will be >> created with these newer kinds that the older kernel cannot read. >> This can be (and has been) fixed by stable-backporting --skip options, >> but this is cumbersome and would have to be done every time a new BTF kind >> is introduced. >> > > Yes, this is indeed the problem, but I don't think any sort of auto > detection by pahole itself of what is the BTF_KIND_MAX is the best > solution. Sometimes new features are added to existing kinds (like > kflag usage, etc), and that will still break even with "auto > detection". > > I think the solution is to design pahole behavior in such a way that > it allows full control for old kernels to specify which BTF features > are expected to be generated, while also allowing to default to all > the latest and greatest BTF features by default for any other > application. > > So, how about something like the following. By default, pahole > generates all the BTF features it knows about. But we add a new flag > that says to stay conservative and only generate a specified subset of > BTF features. E.g.: > > 1) `pahole -J <eLF.o>` will generate everything > > 2) `pahole -J <elf.o> --btf_feature=basic` will generate only the very > basic stuff (we can decide what constitutes basic, we can go all the > way to before we added variables, or can pick any random state after > that) > > 3) `pahole -J <elf.o> --btf_feature=basic --btf_feature=enum64 > --btf_feature=fancy_funcs` will generate only requested bits. > > We can have --btf_feature=all as a convenience as well, but kernel > scripts won't use it. > > From the very beginning, pahole should not fail with a feature name > that it doesn't recognize, though (maybe emit a warning, don't know). > So that kernel-side scripts can be simpler: when kernel starts to > recognize some new BTF functionality, we just unconditionally add > another `--btf_feature=<something>`. And that works starting from the > first pahole version that supports this `--btf_feature` flag. > The idea of a BTF feature flag set - not restricted to BTF kinds - is a good one. I think it should be in the UAPI also though as "enum btf_features". A set of bitmask values - probably closely mirroring the FEAT_BTF* . Something like this perhaps: enum btf_features { BTF_FEATURE_BASIC = 0x1, /* _FUNC, _FUNC_PROTO */ BTF_FEATURE_DATASEC = 0x2, /* _VAR, _DATASEC */ ..etc. A bitmask value would also be amenable to inclusion in an updated struct btf_header. So at BTF encoding time - if we support the newer header - we could add the feature set supported by the BTF encoding along with CRCs. That would be useful for tools - for example if bpftool encounters features it doesn't support in BTF it is trying to parse, it can complain upfront. Ditto for libbpf. With respect to the kind layout support, it probably isn't worth it. It would be a tax on every BTF encoding, and it only helps with parsing - as opposed to using - newer BTF features. As long as we can guarantee that a kernel doesn't wind up with BTF features it doesn't support in vmlinux/module BTF, I think that's enough. Alan > > All this cleverness in trying to guess what kernel supports and what > not (without actually running the kernel and feature-testing) will > just come biting us later on. This never works reliably. > > >> So this series attempts to detect the BTF kinds supported by the >> kernel/modules so that this can inform BTF encoding for older >> kernels. We look for BTF_KIND_MAX - either as an enumerated value >> in vmlinux DWARF (patch 1) or as an enumerated value in base vmlinux >> BTF (patch 3). Knowing this prior to encoding BTF allows us to specify >> skip_encoding options to avoid having BTF with kinds the kernel itself >> will not understand. >> >> The aim is to minimize pain for older stable kernels when new BTF >> kinds are introduced. Kind encoding [1] can solve the parsing problem >> with BTF, but this approach is intended to ensure generated BTF is >> usable when newer pahole runs on older kernels. >> >> This approach requires BTF kinds to be defined via an enumerated type, >> which happened for 5.16 and later. Older kernels than this used #defines >> so the approach will only work for 5.16 stable kernels and later currently. >> >> With this change in hand, adding new BTF kinds becomes a bit simpler, >> at least for the user of pahole. All that needs to be done is to add >> internal "skip_new_kind" booleans to struct conf_load and set them >> in dwarves__set_btf_kind_max() if the detected maximum kind is less >> than the kind in question - in other words, if the kernel does not know >> about that kind. In that case, we will not use it in encoding. >> >> The approach was tested on Linux 5.16 as released, i.e. prior to the >> backports adding --skip_encoding logic, and the BTF generated did not >> contain kinds > BTF_KIND_MAX for the kernel (corresponding to >> BTF_KIND_DECL_TAG in that case). >> >> Changes since RFC [2]: >> - added --skip_autodetect_btf_kind_max to disable kind autodetection >> (Jiri, patch 2) >> >> [1] https://lore.kernel.org/bpf/20230616171728.530116-1-alan.maguire@oracle.com/ >> [2] https://lore.kernel.org/bpf/20230720201443.224040-1-alan.maguire@oracle.com/ >> >> Alan Maguire (3): >> dwarves: auto-detect maximum kind supported by vmlinux >> pahole: add --skip_autodetect_btf_kind_max to disable kind autodetect >> btf_encoder: learn BTF_KIND_MAX value from base BTF when generating >> split BTF >> >> btf_encoder.c | 37 +++++++++++++++++++++++++++++++++ >> btf_encoder.h | 2 ++ >> dwarf_loader.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++ >> dwarves.h | 3 +++ >> man-pages/pahole.1 | 4 ++++ >> pahole.c | 10 +++++++++ >> 6 files changed, 108 insertions(+) >> >> -- >> 2.39.3 >> >
On Tue, Sep 19, 2023 at 9:30 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > On 14/09/2023 18:58, Andrii Nakryiko wrote: > > On Wed, Sep 13, 2023 at 7:26 AM Alan Maguire <alan.maguire@oracle.com> wrote: > >> > >> When a newer pahole is run on an older kernel, it often knows about BTF > >> kinds that the kernel does not support, and adds them to the BTF > >> representation. This is a problem because the BTF generated is then > >> embedded in the kernel image. When it is later read - possibly by > >> a different older toolchain or by the kernel directly - it is not usable. > >> > >> The scripts/pahole-flags.sh script enumerates the various pahole options > >> available associated with various versions of pahole, but in the case > >> of an older kernel is the set of BTF kinds the _kernel_ can handle that > >> is of more importance. > >> > >> Because recent features such as BTF_KIND_ENUM64 are added by default > >> (and only skipped if --skip_encoding_btf_* is set), BTF will be > >> created with these newer kinds that the older kernel cannot read. > >> This can be (and has been) fixed by stable-backporting --skip options, > >> but this is cumbersome and would have to be done every time a new BTF kind > >> is introduced. > >> > > > > Yes, this is indeed the problem, but I don't think any sort of auto > > detection by pahole itself of what is the BTF_KIND_MAX is the best > > solution. Sometimes new features are added to existing kinds (like > > kflag usage, etc), and that will still break even with "auto > > detection". > > > > I think the solution is to design pahole behavior in such a way that > > it allows full control for old kernels to specify which BTF features > > are expected to be generated, while also allowing to default to all > > the latest and greatest BTF features by default for any other > > application. > > > > So, how about something like the following. By default, pahole > > generates all the BTF features it knows about. But we add a new flag > > that says to stay conservative and only generate a specified subset of > > BTF features. E.g.: > > > > 1) `pahole -J <eLF.o>` will generate everything > > > > 2) `pahole -J <elf.o> --btf_feature=basic` will generate only the very > > basic stuff (we can decide what constitutes basic, we can go all the > > way to before we added variables, or can pick any random state after > > that) > > > > 3) `pahole -J <elf.o> --btf_feature=basic --btf_feature=enum64 > > --btf_feature=fancy_funcs` will generate only requested bits. > > > > We can have --btf_feature=all as a convenience as well, but kernel > > scripts won't use it. > > > > From the very beginning, pahole should not fail with a feature name > > that it doesn't recognize, though (maybe emit a warning, don't know). > > So that kernel-side scripts can be simpler: when kernel starts to > > recognize some new BTF functionality, we just unconditionally add > > another `--btf_feature=<something>`. And that works starting from the > > first pahole version that supports this `--btf_feature` flag. > > > > The idea of a BTF feature flag set - not restricted to BTF kinds - so what about not trying to auto-detect anything and let kernel strictly opt into BTF functionality it expects from pahole and recognizes? > is a good one. I think it should be in the UAPI also though > as "enum btf_features". A set of bitmask values - probably closely > mirroring the FEAT_BTF* . Something like this perhaps: > > enum btf_features { > BTF_FEATURE_BASIC = 0x1, /* _FUNC, _FUNC_PROTO */ > BTF_FEATURE_DATASEC = 0x2, /* _VAR, _DATASEC */ > > ..etc. A bitmask value would also be amenable to inclusion in > an updated struct btf_header. I don't know if I agree to add this to UAPI. It seems like an overkill, and it also raises a question of "what is a feature"? Any tiny addition, extension, etc could be considered a feature and we'll end up using all the bits very soon. With self-describing btf_type sizes, tools should be able to skip BTF types they don't recognize. And if there is some fancy kflag usage within an old BTF KIND, for example, then it will be up to the application to either complain, skip, or ignore. E.g., bpftool should try to emit all possible information during bpftool btf dump, even if it doesn't recognize a particular flag or enum. > > So at BTF encoding time - if we support the newer header - we could > add the feature set supported by the BTF encoding along with CRCs. > That would be useful for tools - for example if bpftool encounters > features it doesn't support in BTF it is trying to parse, it can > complain upfront. Ditto for libbpf. > > With respect to the kind layout support, it probably isn't worth it. > It would be a tax on every BTF encoding, and it only helps with > parsing - as opposed to using - newer BTF features. As long as > we can guarantee that a kernel doesn't wind up with BTF features > it doesn't support in vmlinux/module BTF, I think that's enough. > > Alan > > > > > All this cleverness in trying to guess what kernel supports and what > > not (without actually running the kernel and feature-testing) will > > just come biting us later on. This never works reliably. > > > > > >> So this series attempts to detect the BTF kinds supported by the > >> kernel/modules so that this can inform BTF encoding for older > >> kernels. We look for BTF_KIND_MAX - either as an enumerated value > >> in vmlinux DWARF (patch 1) or as an enumerated value in base vmlinux > >> BTF (patch 3). Knowing this prior to encoding BTF allows us to specify > >> skip_encoding options to avoid having BTF with kinds the kernel itself > >> will not understand. > >> > >> The aim is to minimize pain for older stable kernels when new BTF > >> kinds are introduced. Kind encoding [1] can solve the parsing problem > >> with BTF, but this approach is intended to ensure generated BTF is > >> usable when newer pahole runs on older kernels. > >> > >> This approach requires BTF kinds to be defined via an enumerated type, > >> which happened for 5.16 and later. Older kernels than this used #defines > >> so the approach will only work for 5.16 stable kernels and later currently. > >> > >> With this change in hand, adding new BTF kinds becomes a bit simpler, > >> at least for the user of pahole. All that needs to be done is to add > >> internal "skip_new_kind" booleans to struct conf_load and set them > >> in dwarves__set_btf_kind_max() if the detected maximum kind is less > >> than the kind in question - in other words, if the kernel does not know > >> about that kind. In that case, we will not use it in encoding. > >> > >> The approach was tested on Linux 5.16 as released, i.e. prior to the > >> backports adding --skip_encoding logic, and the BTF generated did not > >> contain kinds > BTF_KIND_MAX for the kernel (corresponding to > >> BTF_KIND_DECL_TAG in that case). > >> > >> Changes since RFC [2]: > >> - added --skip_autodetect_btf_kind_max to disable kind autodetection > >> (Jiri, patch 2) > >> > >> [1] https://lore.kernel.org/bpf/20230616171728.530116-1-alan.maguire@oracle.com/ > >> [2] https://lore.kernel.org/bpf/20230720201443.224040-1-alan.maguire@oracle.com/ > >> > >> Alan Maguire (3): > >> dwarves: auto-detect maximum kind supported by vmlinux > >> pahole: add --skip_autodetect_btf_kind_max to disable kind autodetect > >> btf_encoder: learn BTF_KIND_MAX value from base BTF when generating > >> split BTF > >> > >> btf_encoder.c | 37 +++++++++++++++++++++++++++++++++ > >> btf_encoder.h | 2 ++ > >> dwarf_loader.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++ > >> dwarves.h | 3 +++ > >> man-pages/pahole.1 | 4 ++++ > >> pahole.c | 10 +++++++++ > >> 6 files changed, 108 insertions(+) > >> > >> -- > >> 2.39.3 > >> > >
On 19/09/2023 19:58, Andrii Nakryiko wrote: > On Tue, Sep 19, 2023 at 9:30 AM Alan Maguire <alan.maguire@oracle.com> wrote: >> >> On 14/09/2023 18:58, Andrii Nakryiko wrote: >>> On Wed, Sep 13, 2023 at 7:26 AM Alan Maguire <alan.maguire@oracle.com> wrote: >>>> >>>> When a newer pahole is run on an older kernel, it often knows about BTF >>>> kinds that the kernel does not support, and adds them to the BTF >>>> representation. This is a problem because the BTF generated is then >>>> embedded in the kernel image. When it is later read - possibly by >>>> a different older toolchain or by the kernel directly - it is not usable. >>>> >>>> The scripts/pahole-flags.sh script enumerates the various pahole options >>>> available associated with various versions of pahole, but in the case >>>> of an older kernel is the set of BTF kinds the _kernel_ can handle that >>>> is of more importance. >>>> >>>> Because recent features such as BTF_KIND_ENUM64 are added by default >>>> (and only skipped if --skip_encoding_btf_* is set), BTF will be >>>> created with these newer kinds that the older kernel cannot read. >>>> This can be (and has been) fixed by stable-backporting --skip options, >>>> but this is cumbersome and would have to be done every time a new BTF kind >>>> is introduced. >>>> >>> >>> Yes, this is indeed the problem, but I don't think any sort of auto >>> detection by pahole itself of what is the BTF_KIND_MAX is the best >>> solution. Sometimes new features are added to existing kinds (like >>> kflag usage, etc), and that will still break even with "auto >>> detection". >>> >>> I think the solution is to design pahole behavior in such a way that >>> it allows full control for old kernels to specify which BTF features >>> are expected to be generated, while also allowing to default to all >>> the latest and greatest BTF features by default for any other >>> application. >>> >>> So, how about something like the following. By default, pahole >>> generates all the BTF features it knows about. But we add a new flag >>> that says to stay conservative and only generate a specified subset of >>> BTF features. E.g.: >>> >>> 1) `pahole -J <eLF.o>` will generate everything >>> >>> 2) `pahole -J <elf.o> --btf_feature=basic` will generate only the very >>> basic stuff (we can decide what constitutes basic, we can go all the >>> way to before we added variables, or can pick any random state after >>> that) >>> >>> 3) `pahole -J <elf.o> --btf_feature=basic --btf_feature=enum64 >>> --btf_feature=fancy_funcs` will generate only requested bits. >>> >>> We can have --btf_feature=all as a convenience as well, but kernel >>> scripts won't use it. >>> >>> From the very beginning, pahole should not fail with a feature name >>> that it doesn't recognize, though (maybe emit a warning, don't know). >>> So that kernel-side scripts can be simpler: when kernel starts to >>> recognize some new BTF functionality, we just unconditionally add >>> another `--btf_feature=<something>`. And that works starting from the >>> first pahole version that supports this `--btf_feature` flag. >>> >> >> The idea of a BTF feature flag set - not restricted to BTF kinds - > > so what about not trying to auto-detect anything and let kernel > strictly opt into BTF functionality it expects from pahole and > recognizes? > >> is a good one. I think it should be in the UAPI also though >> as "enum btf_features". A set of bitmask values - probably closely >> mirroring the FEAT_BTF* . Something like this perhaps: >> >> enum btf_features { >> BTF_FEATURE_BASIC = 0x1, /* _FUNC, _FUNC_PROTO */ >> BTF_FEATURE_DATASEC = 0x2, /* _VAR, _DATASEC */ >> >> ..etc. A bitmask value would also be amenable to inclusion in >> an updated struct btf_header. > > I don't know if I agree to add this to UAPI. It seems like an > overkill, and it also raises a question of "what is a feature"? Any > tiny addition, extension, etc could be considered a feature and we'll > end up using all the bits very soon. With self-describing btf_type > sizes, tools should be able to skip BTF types they don't recognize. > And if there is some fancy kflag usage within an old BTF KIND, for > example, then it will be up to the application to either complain, > skip, or ignore. E.g., bpftool should try to emit all possible > information during bpftool btf dump, even if it doesn't recognize a > particular flag or enum. > Based on the above, I've put together an RFC implementing a --btf_features=feature1[,feature2] ...parameter for pahole [1]. I _think_ it's roughly what you've described above, and I think it has the characteristics we need to simplify scripts/pahole-flags.sh (features are opt-in, no complaints on unrecognized features) such that we'll only need one more version-check clause, something like this: if [ "${pahole_ver}" -ge "126" ]; then extra_pahole_opt="-j --lang_exclude=rust --btf_features=encode_force,var,float,decl_tag,type_tag,enum64,optimized,consistent" fi New features would simply be added to the list above without a version check requirement and ignored for pahole versions that don't support them. I'll follow up with the kind layout/crc stuff once we converge on how we want to handle new BTF features. Thanks! Alan [1] https://lore.kernel.org/bpf/20231011091732.93254-1-alan.maguire@oracle.com/ >> >> So at BTF encoding time - if we support the newer header - we could >> add the feature set supported by the BTF encoding along with CRCs. >> That would be useful for tools - for example if bpftool encounters >> features it doesn't support in BTF it is trying to parse, it can >> complain upfront. Ditto for libbpf. >> >> With respect to the kind layout support, it probably isn't worth it. >> It would be a tax on every BTF encoding, and it only helps with >> parsing - as opposed to using - newer BTF features. As long as >> we can guarantee that a kernel doesn't wind up with BTF features >> it doesn't support in vmlinux/module BTF, I think that's enough. >> >> Alan >> >>> >>> All this cleverness in trying to guess what kernel supports and what >>> not (without actually running the kernel and feature-testing) will >>> just come biting us later on. This never works reliably. >>> >>> >>>> So this series attempts to detect the BTF kinds supported by the >>>> kernel/modules so that this can inform BTF encoding for older >>>> kernels. We look for BTF_KIND_MAX - either as an enumerated value >>>> in vmlinux DWARF (patch 1) or as an enumerated value in base vmlinux >>>> BTF (patch 3). Knowing this prior to encoding BTF allows us to specify >>>> skip_encoding options to avoid having BTF with kinds the kernel itself >>>> will not understand. >>>> >>>> The aim is to minimize pain for older stable kernels when new BTF >>>> kinds are introduced. Kind encoding [1] can solve the parsing problem >>>> with BTF, but this approach is intended to ensure generated BTF is >>>> usable when newer pahole runs on older kernels. >>>> >>>> This approach requires BTF kinds to be defined via an enumerated type, >>>> which happened for 5.16 and later. Older kernels than this used #defines >>>> so the approach will only work for 5.16 stable kernels and later currently. >>>> >>>> With this change in hand, adding new BTF kinds becomes a bit simpler, >>>> at least for the user of pahole. All that needs to be done is to add >>>> internal "skip_new_kind" booleans to struct conf_load and set them >>>> in dwarves__set_btf_kind_max() if the detected maximum kind is less >>>> than the kind in question - in other words, if the kernel does not know >>>> about that kind. In that case, we will not use it in encoding. >>>> >>>> The approach was tested on Linux 5.16 as released, i.e. prior to the >>>> backports adding --skip_encoding logic, and the BTF generated did not >>>> contain kinds > BTF_KIND_MAX for the kernel (corresponding to >>>> BTF_KIND_DECL_TAG in that case). >>>> >>>> Changes since RFC [2]: >>>> - added --skip_autodetect_btf_kind_max to disable kind autodetection >>>> (Jiri, patch 2) >>>> >>>> [1] https://lore.kernel.org/bpf/20230616171728.530116-1-alan.maguire@oracle.com/ >>>> [2] https://lore.kernel.org/bpf/20230720201443.224040-1-alan.maguire@oracle.com/ >>>> >>>> Alan Maguire (3): >>>> dwarves: auto-detect maximum kind supported by vmlinux >>>> pahole: add --skip_autodetect_btf_kind_max to disable kind autodetect >>>> btf_encoder: learn BTF_KIND_MAX value from base BTF when generating >>>> split BTF >>>> >>>> btf_encoder.c | 37 +++++++++++++++++++++++++++++++++ >>>> btf_encoder.h | 2 ++ >>>> dwarf_loader.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++ >>>> dwarves.h | 3 +++ >>>> man-pages/pahole.1 | 4 ++++ >>>> pahole.c | 10 +++++++++ >>>> 6 files changed, 108 insertions(+) >>>> >>>> -- >>>> 2.39.3 >>>> >>>
On Wed, Oct 11, 2023 at 2:27 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > On 19/09/2023 19:58, Andrii Nakryiko wrote: > > On Tue, Sep 19, 2023 at 9:30 AM Alan Maguire <alan.maguire@oracle.com> wrote: > >> > >> On 14/09/2023 18:58, Andrii Nakryiko wrote: > >>> On Wed, Sep 13, 2023 at 7:26 AM Alan Maguire <alan.maguire@oracle.com> wrote: > >>>> > >>>> When a newer pahole is run on an older kernel, it often knows about BTF > >>>> kinds that the kernel does not support, and adds them to the BTF > >>>> representation. This is a problem because the BTF generated is then > >>>> embedded in the kernel image. When it is later read - possibly by > >>>> a different older toolchain or by the kernel directly - it is not usable. > >>>> > >>>> The scripts/pahole-flags.sh script enumerates the various pahole options > >>>> available associated with various versions of pahole, but in the case > >>>> of an older kernel is the set of BTF kinds the _kernel_ can handle that > >>>> is of more importance. > >>>> > >>>> Because recent features such as BTF_KIND_ENUM64 are added by default > >>>> (and only skipped if --skip_encoding_btf_* is set), BTF will be > >>>> created with these newer kinds that the older kernel cannot read. > >>>> This can be (and has been) fixed by stable-backporting --skip options, > >>>> but this is cumbersome and would have to be done every time a new BTF kind > >>>> is introduced. > >>>> > >>> > >>> Yes, this is indeed the problem, but I don't think any sort of auto > >>> detection by pahole itself of what is the BTF_KIND_MAX is the best > >>> solution. Sometimes new features are added to existing kinds (like > >>> kflag usage, etc), and that will still break even with "auto > >>> detection". > >>> > >>> I think the solution is to design pahole behavior in such a way that > >>> it allows full control for old kernels to specify which BTF features > >>> are expected to be generated, while also allowing to default to all > >>> the latest and greatest BTF features by default for any other > >>> application. > >>> > >>> So, how about something like the following. By default, pahole > >>> generates all the BTF features it knows about. But we add a new flag > >>> that says to stay conservative and only generate a specified subset of > >>> BTF features. E.g.: > >>> > >>> 1) `pahole -J <eLF.o>` will generate everything > >>> > >>> 2) `pahole -J <elf.o> --btf_feature=basic` will generate only the very > >>> basic stuff (we can decide what constitutes basic, we can go all the > >>> way to before we added variables, or can pick any random state after > >>> that) > >>> > >>> 3) `pahole -J <elf.o> --btf_feature=basic --btf_feature=enum64 > >>> --btf_feature=fancy_funcs` will generate only requested bits. > >>> > >>> We can have --btf_feature=all as a convenience as well, but kernel > >>> scripts won't use it. > >>> > >>> From the very beginning, pahole should not fail with a feature name > >>> that it doesn't recognize, though (maybe emit a warning, don't know). > >>> So that kernel-side scripts can be simpler: when kernel starts to > >>> recognize some new BTF functionality, we just unconditionally add > >>> another `--btf_feature=<something>`. And that works starting from the > >>> first pahole version that supports this `--btf_feature` flag. > >>> > >> > >> The idea of a BTF feature flag set - not restricted to BTF kinds - > > > > so what about not trying to auto-detect anything and let kernel > > strictly opt into BTF functionality it expects from pahole and > > recognizes? > > > >> is a good one. I think it should be in the UAPI also though > >> as "enum btf_features". A set of bitmask values - probably closely > >> mirroring the FEAT_BTF* . Something like this perhaps: > >> > >> enum btf_features { > >> BTF_FEATURE_BASIC = 0x1, /* _FUNC, _FUNC_PROTO */ > >> BTF_FEATURE_DATASEC = 0x2, /* _VAR, _DATASEC */ > >> > >> ..etc. A bitmask value would also be amenable to inclusion in > >> an updated struct btf_header. > > > > I don't know if I agree to add this to UAPI. It seems like an > > overkill, and it also raises a question of "what is a feature"? Any > > tiny addition, extension, etc could be considered a feature and we'll > > end up using all the bits very soon. With self-describing btf_type > > sizes, tools should be able to skip BTF types they don't recognize. > > And if there is some fancy kflag usage within an old BTF KIND, for > > example, then it will be up to the application to either complain, > > skip, or ignore. E.g., bpftool should try to emit all possible > > information during bpftool btf dump, even if it doesn't recognize a > > particular flag or enum. > > > > Based on the above, I've put together an RFC implementing a > > --btf_features=feature1[,feature2] > > ...parameter for pahole [1]. I _think_ it's roughly what you've > described above, and I think it has the characteristics we need > to simplify scripts/pahole-flags.sh (features are opt-in, no > complaints on unrecognized features) such that we'll only > need one more version-check clause, something like this: > > if [ "${pahole_ver}" -ge "126" ]; then > extra_pahole_opt="-j --lang_exclude=rust > --btf_features=encode_force,var,float,decl_tag,type_tag,enum64,optimized,consistent" > fi > > New features would simply be added to the list above without a > version check requirement and ignored for pahole versions that > don't support them. Yes, that's the hope. I left a few comments, I think this looks great overall, thanks! > > I'll follow up with the kind layout/crc stuff once we converge on > how we want to handle new BTF features. Thanks! Makes sense, and I think we have converged :) > > Alan > > [1] > https://lore.kernel.org/bpf/20231011091732.93254-1-alan.maguire@oracle.com/ > > >> > >> So at BTF encoding time - if we support the newer header - we could > >> add the feature set supported by the BTF encoding along with CRCs. > >> That would be useful for tools - for example if bpftool encounters > >> features it doesn't support in BTF it is trying to parse, it can > >> complain upfront. Ditto for libbpf. > >> > >> With respect to the kind layout support, it probably isn't worth it. > >> It would be a tax on every BTF encoding, and it only helps with > >> parsing - as opposed to using - newer BTF features. As long as > >> we can guarantee that a kernel doesn't wind up with BTF features > >> it doesn't support in vmlinux/module BTF, I think that's enough. > >> > >> Alan > >> > >>> > >>> All this cleverness in trying to guess what kernel supports and what > >>> not (without actually running the kernel and feature-testing) will > >>> just come biting us later on. This never works reliably. > >>> > >>> > >>>> So this series attempts to detect the BTF kinds supported by the > >>>> kernel/modules so that this can inform BTF encoding for older > >>>> kernels. We look for BTF_KIND_MAX - either as an enumerated value > >>>> in vmlinux DWARF (patch 1) or as an enumerated value in base vmlinux > >>>> BTF (patch 3). Knowing this prior to encoding BTF allows us to specify > >>>> skip_encoding options to avoid having BTF with kinds the kernel itself > >>>> will not understand. > >>>> > >>>> The aim is to minimize pain for older stable kernels when new BTF > >>>> kinds are introduced. Kind encoding [1] can solve the parsing problem > >>>> with BTF, but this approach is intended to ensure generated BTF is > >>>> usable when newer pahole runs on older kernels. > >>>> > >>>> This approach requires BTF kinds to be defined via an enumerated type, > >>>> which happened for 5.16 and later. Older kernels than this used #defines > >>>> so the approach will only work for 5.16 stable kernels and later currently. > >>>> > >>>> With this change in hand, adding new BTF kinds becomes a bit simpler, > >>>> at least for the user of pahole. All that needs to be done is to add > >>>> internal "skip_new_kind" booleans to struct conf_load and set them > >>>> in dwarves__set_btf_kind_max() if the detected maximum kind is less > >>>> than the kind in question - in other words, if the kernel does not know > >>>> about that kind. In that case, we will not use it in encoding. > >>>> > >>>> The approach was tested on Linux 5.16 as released, i.e. prior to the > >>>> backports adding --skip_encoding logic, and the BTF generated did not > >>>> contain kinds > BTF_KIND_MAX for the kernel (corresponding to > >>>> BTF_KIND_DECL_TAG in that case). > >>>> > >>>> Changes since RFC [2]: > >>>> - added --skip_autodetect_btf_kind_max to disable kind autodetection > >>>> (Jiri, patch 2) > >>>> > >>>> [1] https://lore.kernel.org/bpf/20230616171728.530116-1-alan.maguire@oracle.com/ > >>>> [2] https://lore.kernel.org/bpf/20230720201443.224040-1-alan.maguire@oracle.com/ > >>>> > >>>> Alan Maguire (3): > >>>> dwarves: auto-detect maximum kind supported by vmlinux > >>>> pahole: add --skip_autodetect_btf_kind_max to disable kind autodetect > >>>> btf_encoder: learn BTF_KIND_MAX value from base BTF when generating > >>>> split BTF > >>>> > >>>> btf_encoder.c | 37 +++++++++++++++++++++++++++++++++ > >>>> btf_encoder.h | 2 ++ > >>>> dwarf_loader.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++ > >>>> dwarves.h | 3 +++ > >>>> man-pages/pahole.1 | 4 ++++ > >>>> pahole.c | 10 +++++++++ > >>>> 6 files changed, 108 insertions(+) > >>>> > >>>> -- > >>>> 2.39.3 > >>>> > >>>