Message ID | 20210328064121.2062927-1-yhs@fb.com (mailing list archive) |
---|---|
State | New, archived |
Series | [kbuild] kbuild: add -grecord-gcc-switches to clang build |
(replying to https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@fb.com/)

Thanks for the patch!

> +# gcc emits compilation flags in dwarf DW_AT_producer by default
> +# while clang needs explicit flag. Add this flag explicitly.
> +ifdef CONFIG_CC_IS_CLANG
> +DEBUG_CFLAGS += -grecord-gcc-switches
> +endif
> +

This adds ~5MB/1% to vmlinux of an x86_64 defconfig built with clang. Do we
want to add additional guards for CONFIG_DEBUG_INFO_BTF, so that we don't have
to pay that cost if that config is not set?
On 3/29/21 3:52 PM, Nick Desaulniers wrote:
> This adds ~5MB/1% to vmlinux of an x86_64 defconfig built with clang. Do we
> want to add additional guards for CONFIG_DEBUG_INFO_BTF, so that we don't have
> to pay that cost if that config is not set?

This patch is mostly motivated by the need to detect whether the kernel is
built with clang lto or not, so let me add the flag only if lto is
enabled. My measurement shows a 0.5% increase to thinlto-vmlinux.
The smaller percentage is due to the larger .debug_info section
(almost double) for thinlto vs. no lto.

  ifdef CONFIG_LTO_CLANG
  DEBUG_CFLAGS += -grecord-gcc-switches
  endif

This will make pahole work with any clang built kernels, lto or non-lto.

If the maintainer wants further restriction with CONFIG_DEBUG_INFO_BTF,
I can do that in another revision.
On 2021-03-30, 'Yonghong Song' via Clang Built Linux wrote:
>On 3/29/21 3:52 PM, Nick Desaulniers wrote:
>>>+# gcc emits compilation flags in dwarf DW_AT_producer by default
>>>+# while clang needs explicit flag. Add this flag explicitly.
>>>+ifdef CONFIG_CC_IS_CLANG
>>>+DEBUG_CFLAGS += -grecord-gcc-switches
>>>+endif
>>>+

Yes, gcc defaults to -grecord-gcc-switches. Clang doesn't.

>Since this patch is mostly motivated to detect whether the kernel is
>built with clang lto or not. Let me add the flag only if lto is
>enabled. My measurement shows 0.5% increase to thinlto-vmlinux.
>The smaller percentage is due to larger .debug_info section
>(almost double) for thinlto vs. no lto.
>
>  ifdef CONFIG_LTO_CLANG
>  DEBUG_CFLAGS += -grecord-gcc-switches
>  endif
>
>This will make pahole with any clang built kernels, lto or non-lto.

I share the same concern about sizes. Can't pahole know it is clang LTO
via other means? If pahole just needs to know the one-bit information
(clang LTO vs not), having every compile option seems unnecessary....
On 3/30/21 5:25 PM, Fangrui Song wrote:
> Yes, gcc defaults to -grecord-gcc-switches. Clang doesn't.

Do you know why? Is it a dwarf size concern?

> I share the same concern about sizes. Can't pahole know it is clang LTO
> via other means? If pahole just needs to know the one-bit information
> (clang LTO vs not), having every compile option seems unnecessary....

This is v2 of the patch:
  https://lore.kernel.org/bpf/20210331001623.2778934-1-yhs@fb.com/
The flag will be guarded with CONFIG_LTO_CLANG.

As mentioned in the commit message of v2, the alternative is
to go through every cu to find out whether DW_FORM_ref_addr is used
or not. In other words, check every possible cross-cu reference
to find out whether a cross-cu reference actually happens. This
is quite heavy for pahole...

What we really want to know is whether a cross-cu reference happens
or not. If there is an easy way to get that, it would be great.
On Tue, Mar 30, 2021 at 6:48 PM Yonghong Song <yhs@fb.com> wrote:
> As mentioned in commit message of v2, the alternative is
> to go through every cu to find out whether DW_FORM_ref_addr is used
> or not. In other words, check every possible cross-cu references
> to find whether cross-cu reference actually happens or not. This
> is quite heavy for pahole...
>
> What we really want to know is whether cross-cu reference happens
> or not? If there is an easy way to get it, that will be great.

+David Blaikie
On Tue, Mar 30, 2021 at 7:39 PM Fāng-ruì Sòng <maskray@google.com> wrote:
> On Tue, Mar 30, 2021 at 6:48 PM Yonghong Song <yhs@fb.com> wrote:
> > As mentioned in commit message of v2, the alternative is
> > to go through every cu to find out whether DW_FORM_ref_addr is used
> > or not. In other words, check every possible cross-cu references
> > to find whether cross-cu reference actually happens or not. This
> > is quite heavy for pahole...
> >
> > What we really want to know is whether cross-cu reference happens
> > or not? If there is an easy way to get it, that will be great.
>
> +David Blaikie

Yep, that shouldn't be too hard to test for more directly - scanning
.debug_abbrev for DW_FORM_ref_addr should be what you need. Would that
be workable rather than relying on detecting clang/lto from command
line parameters? (GCC can produce these cross-CU references too, when
using lto - so this approach would help make the solution generalize
over GCC's behavior too)
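The check suggested above can be sketched roughly as follows. This is only an illustration in Python that works on the text printed by "llvm-dwarfdump --debug-abbrev" (the command used later in this thread); pahole is a C program and would presumably read the section directly. The function names and the second sample are hypothetical, and the assumed dump layout is "[N] DW_TAG_... DW_CHILDREN_..." followed by "DW_AT_... DW_FORM_..." rows.

```python
import subprocess

CROSS_CU_FORM = "DW_FORM_ref_addr"


def uses_cross_cu_form(abbrev_dump: str) -> bool:
    """Return True if any attribute row in the dump uses DW_FORM_ref_addr."""
    for line in abbrev_dump.splitlines():
        fields = line.split()
        # Attribute rows look like: "DW_AT_type  DW_FORM_ref4"
        if (len(fields) >= 2
                and fields[0].startswith("DW_AT_")
                and fields[1] == CROSS_CU_FORM):
            return True
    return False


def dump_abbrev(path: str) -> str:
    """Run llvm-dwarfdump on a binary (assumes the tool is on PATH)."""
    result = subprocess.run(
        ["llvm-dwarfdump", "--debug-abbrev", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


# Shape of a non-lto abbreviation entry: only CU-local references.
SAMPLE_LOCAL_REFS = """
[10] DW_TAG_array_type  DW_CHILDREN_yes
        DW_AT_type      DW_FORM_ref4
        DW_AT_sibling   DW_FORM_ref4
"""

# Hypothetical entry, made up here to show a cross-cu reference form.
SAMPLE_CROSS_CU = """
[2] DW_TAG_pointer_type DW_CHILDREN_no
        DW_AT_type      DW_FORM_ref_addr
"""

if __name__ == "__main__":
    print(uses_cross_cu_form(SAMPLE_LOCAL_REFS))
    print(uses_cross_cu_form(SAMPLE_CROSS_CU))
```

For a real binary, something like uses_cross_cu_form(dump_abbrev("vmlinux")) would apply the same test. Note that this only says some abbreviation declares the form, which is the one-bit answer discussed here.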
On 3/30/21 7:51 PM, David Blaikie wrote:
> Yep, that shouldn't be too hard to test for more directly - scanning
> .debug_abbrev for DW_FORM_ref_addr should be what you need. Would that
> be workable rather than relying on detecting clang/lto from command
> line parameters? (GCC can produce these cross-CU references too, when
> using lto - so this approach would help make the solution generalize
> over GCC's behavior too)

Thanks, David. This should be better. I tried with a non-lto vmlinux.
I ran "llvm-dwarfdump --debug-abbrev vmlinux > log" and then
"grep "DW_CHILDREN_no" log | wc -l" and got 231676 records.

I will try this approach. If the time is a very small fraction of the
actual dwarf cu processing time, we should be fine. This is definitely
better than visiting all DIEs in each cu to detect a cross-cu reference.
On Tue, Mar 30, 2021 at 8:13 PM Yonghong Song <yhs@fb.com> wrote:
> Thanks, David. This should be better. I tried with a non-lto vmlinux.
> Did "llvm-dwarfdump --debug-abbrev vmlinux > log" and then
> "grep "DW_CHILDREN_no" log | wc -l" and get 231676 records.

What conclusions are you drawing from this number/data? (I'm not
following how DW_CHILDREN_no relates to the topic - perhaps I'm
missing something)

> I will try this approach. If the time is a very small fraction of
> actual dwarf cu processing time, we should be fine. This definitely
> better than visit all die's in cu trying to detect cross-cu reference.

*fingers crossed*
On 3/30/21 8:16 PM, David Blaikie wrote:
> On Tue, Mar 30, 2021 at 8:13 PM Yonghong Song <yhs@fb.com> wrote:
>> Thanks, David. This should be better. I tried with a non-lto vmlinux.
>> Did "llvm-dwarfdump --debug-abbrev vmlinux > log" and then
>> "grep "DW_CHILDREN_no" log | wc -l" and get 231676 records.
>
> What conclusions are you drawing from this number/data? (I'm not
> following how DW_CHILDREN_no relates to the topic - perhaps I'm
> missing something)

Approximation of the number of tags to visit:

...
[10] DW_TAG_array_type  DW_CHILDREN_yes
        DW_AT_type      DW_FORM_ref4
        DW_AT_sibling   DW_FORM_ref4

[11] DW_TAG_variable    DW_CHILDREN_no
        DW_AT_name      DW_FORM_strp
        DW_AT_decl_file DW_FORM_data1
        DW_AT_decl_line DW_FORM_data2
        DW_AT_decl_column       DW_FORM_data1
        DW_AT_type      DW_FORM_ref4
        DW_AT_external  DW_FORM_flag_present
        DW_AT_declaration       DW_FORM_flag_present

[12] DW_TAG_member      DW_CHILDREN_no
        DW_AT_name      DW_FORM_string
        DW_AT_decl_file DW_FORM_data1
        DW_AT_decl_line DW_FORM_data1
        DW_AT_decl_column       DW_FORM_data1
        DW_AT_type      DW_FORM_ref4
        DW_AT_data_member_location      DW_FORM_data1

[13] DW_TAG_subrange_type       DW_CHILDREN_no
        DW_AT_type      DW_FORM_ref4
        DW_AT_upper_bound       DW_FORM_data1
...

The bigger number means more tags to visit and will consume more time.
For a binary not compiled with lto, all these tags will be visited
before declaring that the dwarf does not have cross-cu reference.
So the number is just a relative guess on the cpu cost. But ya,
have to have real implementation first...

>> I will try this approach. If the time is a very small fraction of
>> actual dwarf cu processing time, we should be fine. This definitely
>> better than visit all die's in cu trying to detect cross-cu reference.
>
> *fingers crossed*
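As a rough companion to the grep-based count above, the sketch below tallies abbreviation declarations in the same kind of text dump, split by DW_CHILDREN_yes and DW_CHILDREN_no. It is an illustration in Python rather than anything pahole does; the function name and the regular expression are assumptions based on the "[N] DW_TAG_... DW_CHILDREN_..." header rows shown in this message.

```python
import re
from collections import Counter

# Header rows look like: "[11] DW_TAG_variable DW_CHILDREN_no"
ABBREV_HEADER = re.compile(
    r"^\s*\[(\d+)\]\s+(DW_TAG_\w+)\s+(DW_CHILDREN_(?:yes|no))\s*$"
)


def summarize_abbrevs(abbrev_dump: str) -> Counter:
    """Count abbreviation declarations, in total and per children flag."""
    counts = Counter()
    for line in abbrev_dump.splitlines():
        match = ABBREV_HEADER.match(line)
        if match:
            counts["declarations"] += 1
            counts[match.group(3)] += 1
    return counts


# The four entries quoted in this message, with attribute rows shortened.
SAMPLE = """
[10] DW_TAG_array_type  DW_CHILDREN_yes
        DW_AT_type      DW_FORM_ref4

[11] DW_TAG_variable    DW_CHILDREN_no
        DW_AT_name      DW_FORM_strp

[12] DW_TAG_member      DW_CHILDREN_no
        DW_AT_name      DW_FORM_string

[13] DW_TAG_subrange_type       DW_CHILDREN_no
        DW_AT_type      DW_FORM_ref4
"""

if __name__ == "__main__":
    print(summarize_abbrevs(SAMPLE))
```

Counting only DW_CHILDREN_no rows, as the grep does, leaves out declarations such as [10]; the "declarations" total covers both kinds. Either number is a proxy for the scanning cost, not a measurement of it.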
On Tue, Mar 30, 2021 at 8:27 PM Yonghong Song <yhs@fb.com> wrote:
> The bigger number means more tags to visit and will consume more time.
> For a binary not compiled with lto, all these tags will be visited
> before declaring that the dwarf does not have cross-cu reference.
> So the number is just a relative guess on the cpu cost. But ya,
> have to have real implementation first...

Ah, sounds good, yeah.
diff --git a/Makefile b/Makefile
index d4784d181123..ab0119beb42d 100644
--- a/Makefile
+++ b/Makefile
@@ -839,6 +839,12 @@ dwarf-version-$(CONFIG_DEBUG_INFO_DWARF5) := 5
 DEBUG_CFLAGS += -gdwarf-$(dwarf-version-y)
 endif
 
+# gcc emits compilation flags in dwarf DW_AT_producer by default
+# while clang needs explicit flag. Add this flag explicitly.
+ifdef CONFIG_CC_IS_CLANG
+DEBUG_CFLAGS += -grecord-gcc-switches
+endif
+
 ifdef CONFIG_DEBUG_INFO_REDUCED
 DEBUG_CFLAGS += $(call cc-option, -femit-struct-debug-baseonly) \
    $(call cc-option,-fno-var-tracking)
Putting compilation flags in dwarf is helpful in that it tells what
potential transformations may have happened to generate the final
binary. Furthermore, we have a particular use case in [1] where pahole
wants to detect whether vmlinux is compiled with clang lto or not,
and if vmlinux is compiled with clang lto, pahole will merge
all debuginfo cu's into one pahole cu.

Currently gcc seems to put compilation flags into the dwarf
DW_AT_producer attribute if -g is specified, while clang needs the
explicit flag -grecord-gcc-switches.

For example, build with gcc 8.4.1 (make -j60):
  ...
  DW_AT_producer    ("GNU C89 8.4.1 20200928 (Red Hat 8.4.1-1) -mno-sse -mno-mmx -mno-sse2 ...")
  DW_AT_language    (DW_LANG_C89)
  DW_AT_name        ("/home/yhs/work/bpf-next/arch/x86/kernel/ebda.c")

build with clang 13 trunk (make -j60 LLVM=1):
  ...
  DW_AT_producer    ("clang version 13.0.0 (https://github.com/llvm/llvm-project.git 11bf268864afbe35ad317e6354c51440d5184911)")
  DW_AT_language    (DW_LANG_C89)
  DW_AT_name        ("/home/yhs/work/bpf-next/arch/x86/kernel/ebda.c")

With this patch, build with clang 13 trunk:
  ...
  DW_AT_producer    ("clang version 13.0.0 (https://github.com/llvm/llvm-project.git 11bf268864afbe35ad317e6354c51440d5184911) /home/yhs/work/llvm-project/llvm/build.cur/install/bin/clang-13 -MMD -MF arch/x86/kernel/.ebda.o.d -nostdinc ...")
  DW_AT_language    (DW_LANG_C89)
  DW_AT_name        ("/home/yhs/work/bpf-next/arch/x86/kernel/ebda.c")

With detailed compilation flags information, in [1], pahole is able
to quickly decide whether merging cu's is the right choice or not.

 [1] https://lore.kernel.org/bpf/20210328061646.1955678-1-yhs@fb.com/T

I tested with the latest bpf-next, but the patch also applies cleanly
on top of the latest linus tree.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 Makefile | 6 ++++++
 1 file changed, 6 insertions(+)
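To illustrate how a consumer might use the recorded switches, here is a minimal sketch in Python. It assumes the decision is made by looking for a clang producer string that carries an -flto switch; that heuristic, the function name and the third sample string are assumptions made for illustration and are not taken from the pahole series in [1]. The first two samples are the producer strings shown above.

```python
def producer_suggests_clang_lto(producer: str) -> bool:
    """Guess from a DW_AT_producer string whether clang lto was used.

    Assumed heuristic: the producer starts with "clang version" and the
    recorded command line contains -flto or -flto=<mode>.
    """
    tokens = producer.split()
    is_clang = producer.startswith("clang version")
    has_lto_flag = any(
        tok == "-flto" or tok.startswith("-flto=") for tok in tokens
    )
    return is_clang and has_lto_flag


# Producer strings from the commit message (truncated there with "...").
GCC_PRODUCER = (
    "GNU C89 8.4.1 20200928 (Red Hat 8.4.1-1) -mno-sse -mno-mmx -mno-sse2 ..."
)
CLANG_WITHOUT_SWITCHES = (
    "clang version 13.0.0 (https://github.com/llvm/llvm-project.git "
    "11bf268864afbe35ad317e6354c51440d5184911)"
)
# Hypothetical string, made up here to show a recorded thin-lto switch.
HYPOTHETICAL_THINLTO = (
    "clang version 13.0.0 (hypothetical) /usr/bin/clang -O2 -flto=thin -nostdinc"
)

if __name__ == "__main__":
    for producer in (GCC_PRODUCER, CLANG_WITHOUT_SWITCHES, HYPOTHETICAL_THINLTO):
        print(producer_suggests_clang_lto(producer))
```

Without -grecord-gcc-switches the clang producer string has no switches at all, so a check like this cannot tell an lto build from a non-lto one, which is the gap the patch is meant to close.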