
[kbuild] kbuild: add -grecord-gcc-switches to clang build

Message ID 20210328064121.2062927-1-yhs@fb.com (mailing list archive)
State New
Series [kbuild] kbuild: add -grecord-gcc-switches to clang build

Commit Message

Yonghong Song March 28, 2021, 6:41 a.m. UTC
Putting compilation flags in dwarf is helpful in that
it tells what potential transformations may have
happened to generate the final binary. Furthermore,
we have a particular usecase in [1] where pahole wants
to detect whether vmlinux is compiled with clang lto
or not, and if vmlinux is compiled with clang lto,
pahole will merge all debuginfo cu's into one pahole cu.

Currently gcc seems to put compilation flags into the
dwarf DW_AT_producer attribute if -g is specified, while
clang needs the explicit flag -grecord-gcc-switches.
For example,
 build with gcc 8.4.1 (make -j60):
   ...
   DW_AT_producer    ("GNU C89 8.4.1 20200928 (Red Hat 8.4.1-1) -mno-sse -mno-mmx -mno-sse2 ...")
   DW_AT_language    (DW_LANG_C89)
   DW_AT_name        ("/home/yhs/work/bpf-next/arch/x86/kernel/ebda.c")

 build with clang 13 trunk (make -j60 LLVM=1):
   ...
   DW_AT_producer    ("clang version 13.0.0 (https://github.com/llvm/llvm-project.git
                       11bf268864afbe35ad317e6354c51440d5184911)")
   DW_AT_language    (DW_LANG_C89)
   DW_AT_name        ("/home/yhs/work/bpf-next/arch/x86/kernel/ebda.c")

 With this patch, build with clang 13 trunk:
   ...
   DW_AT_producer    ("clang version 13.0.0 (https://github.com/llvm/llvm-project.git
                       11bf268864afbe35ad317e6354c51440d5184911)
                       /home/yhs/work/llvm-project/llvm/build.cur/install/bin/clang-13 -MMD
                       -MF arch/x86/kernel/.ebda.o.d -nostdinc ...")
   DW_AT_language    (DW_LANG_C89)
   DW_AT_name        ("/home/yhs/work/bpf-next/arch/x86/kernel/ebda.c")

 With detailed compilation flag information, pahole in [1] is able to quickly
 decide whether merging cu's is the right choice or not.

 [1] https://lore.kernel.org/bpf/20210328061646.1955678-1-yhs@fb.com/T

I tested with the latest bpf-next, but the patch also applies cleanly on
top of the latest linus tree.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 Makefile | 6 ++++++
 1 file changed, 6 insertions(+)
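
A consumer of the recorded switches might look roughly like the sketch below.
It is written in Python for brevity and is not pahole's actual code: the
function name and the sample strings are hypothetical, and treating "-flto" or
"-flto=<mode>" as the marker for an lto build is an assumption about which
switch a tool would look for.

```python
def producer_indicates_clang_lto(producer: str) -> bool:
    """Return True if a DW_AT_producer string names clang and records an -flto switch.

    Hypothetical helper: it only inspects the string, it does not read dwarf.
    """
    if "clang version" not in producer:
        return False
    # With -grecord-gcc-switches the command line follows the version banner,
    # so the switches show up as whitespace-separated tokens.
    return any(tok == "-flto" or tok.startswith("-flto=")
               for tok in producer.split())


# Made-up producer strings for illustration only (not taken from a real build).
without_switches = "clang version 13.0.0"
with_switches = "clang version 13.0.0 clang-13 -MMD -flto=thin -nostdinc"
gcc_style = "GNU C89 8.4.1 -mno-sse -mno-mmx"

print(producer_indicates_clang_lto(without_switches))
print(producer_indicates_clang_lto(with_switches))
print(producer_indicates_clang_lto(gcc_style))
```

Without the recorded switches the first string carries no information about
lto at all, which is the gap the patch is trying to close.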

Comments

Nick Desaulniers March 29, 2021, 10:52 p.m. UTC | #1
(replying to https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@fb.com/)

Thanks for the patch!

> +# gcc emits compilation flags in dwarf DW_AT_producer by default
> +# while clang needs explicit flag. Add this flag explicitly.
> +ifdef CONFIG_CC_IS_CLANG
> +DEBUG_CFLAGS	+= -grecord-gcc-switches
> +endif
> +

This adds ~5MB/1% to vmlinux of an x86_64 defconfig built with clang. Do we
want to add additional guards for CONFIG_DEBUG_INFO_BTF, so that we don't have
to pay that cost if that config is not set?

Yonghong Song March 30, 2021, 11:54 p.m. UTC | #2
On 3/29/21 3:52 PM, Nick Desaulniers wrote:
> (replying to https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@fb.com/)
> 
> Thanks for the patch!
> 
>> +# gcc emits compilation flags in dwarf DW_AT_producer by default
>> +# while clang needs explicit flag. Add this flag explicitly.
>> +ifdef CONFIG_CC_IS_CLANG
>> +DEBUG_CFLAGS	+= -grecord-gcc-switches
>> +endif
>> +
> 
> This adds ~5MB/1% to vmlinux of an x86_64 defconfig built with clang. Do we
> want to add additional guards for CONFIG_DEBUG_INFO_BTF, so that we don't have
> to pay that cost if that config is not set?

Since this patch is mostly motivated by detecting whether the kernel is
built with clang lto or not, let me add the flag only if lto is
enabled. My measurement shows a 0.5% increase to thinlto-vmlinux.
The smaller percentage is due to the larger .debug_info section
(almost double) for thinlto vs. no lto.

  ifdef CONFIG_LTO_CLANG
  DEBUG_CFLAGS   += -grecord-gcc-switches
  endif

This will make pahole work with any clang built kernels, lto or non-lto.

If the maintainer wants further restriction with CONFIG_DEBUG_INFO_BTF,
I can do that in another revision.
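
The relation between the ~5MB/1% and 0.5% figures can be checked with a few
lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes the
absolute overhead stays at about 5MB and that the thinlto image as a whole is
roughly twice as large, which is a simplification of the ".debug_info almost
double" observation, not a measurement.

```python
# Figures quoted in the thread for a non-lto x86_64 defconfig build.
non_lto_overhead_mb = 5.0    # "~5MB"
non_lto_overhead_pct = 1.0   # "1%"

# Image size implied by 5 MB being 1% of it.
non_lto_image_mb = non_lto_overhead_mb / (non_lto_overhead_pct / 100.0)

# Assumption: the thinlto image is about twice as large and the
# recorded switches cost the same number of bytes.
thinlto_image_mb = 2.0 * non_lto_image_mb
thinlto_overhead_pct = 100.0 * non_lto_overhead_mb / thinlto_image_mb

print(f"implied non-lto image size: {non_lto_image_mb:.0f} MB")
print(f"estimated thinlto overhead: {thinlto_overhead_pct:.2f}%")
```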

Fāng-ruì Sòng March 31, 2021, 12:25 a.m. UTC | #3
On 2021-03-30, 'Yonghong Song' via Clang Built Linux wrote:
>
>
>On 3/29/21 3:52 PM, Nick Desaulniers wrote:
>>(replying to https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@fb.com/)
>>
>>Thanks for the patch!
>>
>>>+# gcc emits compilation flags in dwarf DW_AT_producer by default
>>>+# while clang needs explicit flag. Add this flag explicitly.
>>>+ifdef CONFIG_CC_IS_CLANG
>>>+DEBUG_CFLAGS	+= -grecord-gcc-switches
>>>+endif
>>>+

Yes, gcc defaults to -grecord-gcc-switches. Clang doesn't.

>>This adds ~5MB/1% to vmlinux of an x86_64 defconfig built with clang. Do we
>>want to add additional guards for CONFIG_DEBUG_INFO_BTF, so that we don't have
>>to pay that cost if that config is not set?
>
>Since this patch is mostly motivated to detect whether the kernel is
>built with clang lto or not. Let me add the flag only if lto is
>enabled. My measurement shows 0.5% increase to thinlto-vmlinux.
>The smaller percentage is due to larger .debug_info section
>(almost double) for thinlto vs. no lto.
>
> ifdef CONFIG_LTO_CLANG
> DEBUG_CFLAGS   += -grecord-gcc-switches
> endif
>
>This will make pahole with any clang built kernels, lto or non-lto.

I share the same concern about sizes. Can't pahole know it is clang LTO
via other means? If pahole just needs to know the one-bit information
(clang LTO vs not), having every compile option seems unnecessary....

>If the maintainer wants further restriction with CONFIG_DEBUG_INFO_BTF,
>I can do that in another revision.
>
>-- 
>You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
>To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@googlegroups.com.
>To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/0b8d17be-e015-83c3-88d8-7c218cd01536%40fb.com.

Yonghong Song March 31, 2021, 1:47 a.m. UTC | #4
On 3/30/21 5:25 PM, Fangrui Song wrote:
> On 2021-03-30, 'Yonghong Song' via Clang Built Linux wrote:
>>
>>
>> On 3/29/21 3:52 PM, Nick Desaulniers wrote:
>>> (replying to 
>>> https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@fb.com/)
>>>
>>> Thanks for the patch!
>>>
>>>> +# gcc emits compilation flags in dwarf DW_AT_producer by default
>>>> +# while clang needs explicit flag. Add this flag explicitly.
>>>> +ifdef CONFIG_CC_IS_CLANG
>>>> +DEBUG_CFLAGS    += -grecord-gcc-switches
>>>> +endif
>>>> +
> 
> Yes, gcc defaults to -grecord-gcc-switches. Clang doesn't.

Do you know why? Is it a dwarf size concern?

> 
>>> This adds ~5MB/1% to vmlinux of an x86_64 defconfig built with clang. 
>>> Do we
>>> want to add additional guards for CONFIG_DEBUG_INFO_BTF, so that we 
>>> don't have
>>> to pay that cost if that config is not set?
>>
>> Since this patch is mostly motivated to detect whether the kernel is
>> built with clang lto or not. Let me add the flag only if lto is
>> enabled. My measurement shows 0.5% increase to thinlto-vmlinux.
>> The smaller percentage is due to larger .debug_info section
>> (almost double) for thinlto vs. no lto.
>>
>> ifdef CONFIG_LTO_CLANG
>> DEBUG_CFLAGS   += -grecord-gcc-switches
>> endif
>>
>> This will make pahole with any clang built kernels, lto or non-lto.
> 
> I share the same concern about sizes. Can't pahole know it is clang LTO
> via other means? If pahole just needs to know the one-bit information
> (clang LTO vs not), having every compile option seems unnecessary....

This is v2 of the patch
   https://lore.kernel.org/bpf/20210331001623.2778934-1-yhs@fb.com/
The flag will be guarded with CONFIG_LTO_CLANG.

As mentioned in the commit message of v2, the alternative is
to go through every cu to find out whether DW_FORM_ref_addr is used
or not. In other words, check every possible cross-cu reference
to find whether a cross-cu reference actually happens or not. This
is quite heavy for pahole...

What we really want to know is whether a cross-cu reference happens
or not. If there is an easy way to get it, that will be great.

> 
>> If the maintainer wants further restriction with CONFIG_DEBUG_INFO_BTF,
>> I can do that in another revision.
>>

Fāng-ruì Sòng March 31, 2021, 2:39 a.m. UTC | #5
On Tue, Mar 30, 2021 at 6:48 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 3/30/21 5:25 PM, Fangrui Song wrote:
> > On 2021-03-30, 'Yonghong Song' via Clang Built Linux wrote:
> >>
> >>
> >> On 3/29/21 3:52 PM, Nick Desaulniers wrote:
> >>> (replying to
> >>> https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@fb.com/)
> >>>
> >>> Thanks for the patch!
> >>>
> >>>> +# gcc emits compilation flags in dwarf DW_AT_producer by default
> >>>> +# while clang needs explicit flag. Add this flag explicitly.
> >>>> +ifdef CONFIG_CC_IS_CLANG
> >>>> +DEBUG_CFLAGS    += -grecord-gcc-switches
> >>>> +endif
> >>>> +
> >
> > Yes, gcc defaults to -grecord-gcc-switches. Clang doesn't.
>
> Could you know why? dwarf size concern?
>
> >
> >>> This adds ~5MB/1% to vmlinux of an x86_64 defconfig built with clang.
> >>> Do we
> >>> want to add additional guards for CONFIG_DEBUG_INFO_BTF, so that we
> >>> don't have
> >>> to pay that cost if that config is not set?
> >>
> >> Since this patch is mostly motivated to detect whether the kernel is
> >> built with clang lto or not. Let me add the flag only if lto is
> >> enabled. My measurement shows 0.5% increase to thinlto-vmlinux.
> >> The smaller percentage is due to larger .debug_info section
> >> (almost double) for thinlto vs. no lto.
> >>
> >> ifdef CONFIG_LTO_CLANG
> >> DEBUG_CFLAGS   += -grecord-gcc-switches
> >> endif
> >>
> >> This will make pahole with any clang built kernels, lto or non-lto.
> >
> > I share the same concern about sizes. Can't pahole know it is clang LTO
> > via other means? If pahole just needs to know the one-bit information
> > (clang LTO vs not), having every compile option seems unnecessary....
>
> This is v2 of the patch
>    https://lore.kernel.org/bpf/20210331001623.2778934-1-yhs@fb.com/
> The flag will be guarded with CONFIG_LTO_CLANG.
>
> As mentioned in commit message of v2, the alternative is
> to go through every cu to find out whether DW_FORM_ref_addr is used
> or not. In other words, check every possible cross-cu references
> to find whether cross-cu reference actually happens or not. This
> is quite heavy for pahole...
>
> What we really want to know is whether cross-cu reference happens
> or not? If there is an easy way to get it, that will be great.

+David Blaikie

> >
> >> If the maintainer wants further restriction with CONFIG_DEBUG_INFO_BTF,
> >> I can do that in another revision.
> >>

David Blaikie March 31, 2021, 2:51 a.m. UTC | #6
On Tue, Mar 30, 2021 at 7:39 PM Fāng-ruì Sòng <maskray@google.com> wrote:
>
> On Tue, Mar 30, 2021 at 6:48 PM Yonghong Song <yhs@fb.com> wrote:
> >
> >
> >
> > On 3/30/21 5:25 PM, Fangrui Song wrote:
> > > On 2021-03-30, 'Yonghong Song' via Clang Built Linux wrote:
> > >>
> > >>
> > >> On 3/29/21 3:52 PM, Nick Desaulniers wrote:
> > >>> (replying to
> > >>> https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@fb.com/)
> > >>>
> > >>> Thanks for the patch!
> > >>>
> > >>>> +# gcc emits compilation flags in dwarf DW_AT_producer by default
> > >>>> +# while clang needs explicit flag. Add this flag explicitly.
> > >>>> +ifdef CONFIG_CC_IS_CLANG
> > >>>> +DEBUG_CFLAGS    += -grecord-gcc-switches
> > >>>> +endif
> > >>>> +
> > >
> > > Yes, gcc defaults to -grecord-gcc-switches. Clang doesn't.
> >
> > Could you know why? dwarf size concern?
> >
> > >
> > >>> This adds ~5MB/1% to vmlinux of an x86_64 defconfig built with clang.
> > >>> Do we
> > >>> want to add additional guards for CONFIG_DEBUG_INFO_BTF, so that we
> > >>> don't have
> > >>> to pay that cost if that config is not set?
> > >>
> > >> Since this patch is mostly motivated to detect whether the kernel is
> > >> built with clang lto or not. Let me add the flag only if lto is
> > >> enabled. My measurement shows 0.5% increase to thinlto-vmlinux.
> > >> The smaller percentage is due to larger .debug_info section
> > >> (almost double) for thinlto vs. no lto.
> > >>
> > >> ifdef CONFIG_LTO_CLANG
> > >> DEBUG_CFLAGS   += -grecord-gcc-switches
> > >> endif
> > >>
> > >> This will make pahole with any clang built kernels, lto or non-lto.
> > >
> > > I share the same concern about sizes. Can't pahole know it is clang LTO
> > > via other means? If pahole just needs to know the one-bit information
> > > (clang LTO vs not), having every compile option seems unnecessary....
> >
> > This is v2 of the patch
> >    https://lore.kernel.org/bpf/20210331001623.2778934-1-yhs@fb.com/
> > The flag will be guarded with CONFIG_LTO_CLANG.
> >
> > As mentioned in commit message of v2, the alternative is
> > to go through every cu to find out whether DW_FORM_ref_addr is used
> > or not. In other words, check every possible cross-cu references
> > to find whether cross-cu reference actually happens or not. This
> > is quite heavy for pahole...
> >
> > What we really want to know is whether cross-cu reference happens
> > or not? If there is an easy way to get it, that will be great.
>
> +David Blaikie

Yep, that shouldn't be too hard to test for more directly - scanning
.debug_abbrev for DW_FORM_ref_addr should be what you need. Would that
be workable rather than relying on detecting clang/lto from command
line parameters? (GCC can produce these cross-CU references too, when
using lto - so this approach would help make the solution generalize
over GCC's behavior too)
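
A scan of .debug_abbrev along these lines could be sketched as follows. This is
a Python illustration, not pahole code: the function names are hypothetical,
and it assumes the raw bytes of the section have already been extracted from
the ELF file by some other means. It walks the abbreviation declarations
(ULEB128 code, ULEB128 tag, one children byte, then attribute/form pairs ended
by a 0/0 pair) and stops at the first DW_FORM_ref_addr. Attributes encoded with
DW_FORM_indirect carry their real form in .debug_info, so they are not covered.

```python
DW_FORM_ref_addr = 0x10        # form code from the dwarf specification
DW_FORM_implicit_const = 0x21  # dwarf 5: value is stored in the abbrev itself


def read_uleb128(buf, pos):
    """Decode one ULEB128 value starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def abbrev_uses_ref_addr(buf):
    """Return True if any abbreviation declaration uses DW_FORM_ref_addr."""
    pos = 0
    while pos < len(buf):
        code, pos = read_uleb128(buf, pos)
        if code == 0:
            continue  # end of one abbreviation table; the next may follow
        _tag, pos = read_uleb128(buf, pos)
        pos += 1  # skip the DW_CHILDREN_yes/no byte
        while True:
            attr, pos = read_uleb128(buf, pos)
            form, pos = read_uleb128(buf, pos)
            if form == DW_FORM_implicit_const:
                # Skip the SLEB128 constant; it uses the same continuation
                # bit, so the ULEB128 reader advances by the right amount.
                _, pos = read_uleb128(buf, pos)
            if attr == 0 and form == 0:
                break  # end of this declaration
            if form == DW_FORM_ref_addr:
                return True
    return False


# Hand-built declaration: code 1, DW_TAG_variable (0x34), no children,
# DW_AT_name/DW_FORM_strp, DW_AT_type with the form under test.
local_ref = bytes([1, 0x34, 0, 0x03, 0x0E, 0x49, 0x13, 0, 0, 0])
cross_cu_ref = bytes([1, 0x34, 0, 0x03, 0x0E, 0x49, 0x10, 0, 0, 0])
print(abbrev_uses_ref_addr(local_ref), abbrev_uses_ref_addr(cross_cu_ref))
```

The cost is proportional to the size of .debug_abbrev rather than to the
number of die's, which is why it may be cheaper than visiting every die.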

Yonghong Song March 31, 2021, 3:13 a.m. UTC | #7
On 3/30/21 7:51 PM, David Blaikie wrote:
> On Tue, Mar 30, 2021 at 7:39 PM Fāng-ruì Sòng <maskray@google.com> wrote:
>>
>> On Tue, Mar 30, 2021 at 6:48 PM Yonghong Song <yhs@fb.com> wrote:
>>>
>>>
>>>
>>> On 3/30/21 5:25 PM, Fangrui Song wrote:
>>>> On 2021-03-30, 'Yonghong Song' via Clang Built Linux wrote:
>>>>>
>>>>>
>>>>> On 3/29/21 3:52 PM, Nick Desaulniers wrote:
>>>>>> (replying to
>>>>>> https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@fb.com/)
>>>>>>
>>>>>> Thanks for the patch!
>>>>>>
>>>>>>> +# gcc emits compilation flags in dwarf DW_AT_producer by default
>>>>>>> +# while clang needs explicit flag. Add this flag explicitly.
>>>>>>> +ifdef CONFIG_CC_IS_CLANG
>>>>>>> +DEBUG_CFLAGS    += -grecord-gcc-switches
>>>>>>> +endif
>>>>>>> +
>>>>
>>>> Yes, gcc defaults to -grecord-gcc-switches. Clang doesn't.
>>>
>>> Could you know why? dwarf size concern?
>>>
>>>>
>>>>>> This adds ~5MB/1% to vmlinux of an x86_64 defconfig built with clang.
>>>>>> Do we
>>>>>> want to add additional guards for CONFIG_DEBUG_INFO_BTF, so that we
>>>>>> don't have
>>>>>> to pay that cost if that config is not set?
>>>>>
>>>>> Since this patch is mostly motivated to detect whether the kernel is
>>>>> built with clang lto or not. Let me add the flag only if lto is
>>>>> enabled. My measurement shows 0.5% increase to thinlto-vmlinux.
>>>>> The smaller percentage is due to larger .debug_info section
>>>>> (almost double) for thinlto vs. no lto.
>>>>>
>>>>> ifdef CONFIG_LTO_CLANG
>>>>> DEBUG_CFLAGS   += -grecord-gcc-switches
>>>>> endif
>>>>>
>>>>> This will make pahole with any clang built kernels, lto or non-lto.
>>>>
>>>> I share the same concern about sizes. Can't pahole know it is clang LTO
>>>> via other means? If pahole just needs to know the one-bit information
>>>> (clang LTO vs not), having every compile option seems unnecessary....
>>>
>>> This is v2 of the patch
>>>     https://lore.kernel.org/bpf/20210331001623.2778934-1-yhs@fb.com/
>>> The flag will be guarded with CONFIG_LTO_CLANG.
>>>
>>> As mentioned in commit message of v2, the alternative is
>>> to go through every cu to find out whether DW_FORM_ref_addr is used
>>> or not. In other words, check every possible cross-cu references
>>> to find whether cross-cu reference actually happens or not. This
>>> is quite heavy for pahole...
>>>
>>> What we really want to know is whether cross-cu reference happens
>>> or not? If there is an easy way to get it, that will be great.
>>
>> +David Blaikie
> 
> Yep, that shouldn't be too hard to test for more directly - scanning
> .debug_abbrev for DW_FORM_ref_addr should be what you need. Would that
> be workable rather than relying on detecting clang/lto from command
> line parameters? (GCC can produce these cross-CU references too, when
> using lto - so this approach would help make the solution generalize
> over GCC's behavior too)

Thanks, David. This should be better. I tried with a non-lto vmlinux.
I ran "llvm-dwarfdump --debug-abbrev vmlinux > log" and then
"grep "DW_CHILDREN_no" log | wc -l" and got 231676 records.

I will try this approach. If the time is a very small fraction of the
actual dwarf cu processing time, we should be fine. This is definitely
better than visiting all die's in each cu trying to detect a cross-cu reference.

David Blaikie March 31, 2021, 3:16 a.m. UTC | #8
On Tue, Mar 30, 2021 at 8:13 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 3/30/21 7:51 PM, David Blaikie wrote:
> > On Tue, Mar 30, 2021 at 7:39 PM Fāng-ruì Sòng <maskray@google.com> wrote:
> >>
> >> On Tue, Mar 30, 2021 at 6:48 PM Yonghong Song <yhs@fb.com> wrote:
> >>>
> >>>
> >>>
> >>> On 3/30/21 5:25 PM, Fangrui Song wrote:
> >>>> On 2021-03-30, 'Yonghong Song' via Clang Built Linux wrote:
> >>>>>
> >>>>>
> >>>>> On 3/29/21 3:52 PM, Nick Desaulniers wrote:
> >>>>>> (replying to
> >>>>>> https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@fb.com/)
> >>>>>>
> >>>>>> Thanks for the patch!
> >>>>>>
> >>>>>>> +# gcc emits compilation flags in dwarf DW_AT_producer by default
> >>>>>>> +# while clang needs explicit flag. Add this flag explicitly.
> >>>>>>> +ifdef CONFIG_CC_IS_CLANG
> >>>>>>> +DEBUG_CFLAGS    += -grecord-gcc-switches
> >>>>>>> +endif
> >>>>>>> +
> >>>>
> >>>> Yes, gcc defaults to -grecord-gcc-switches. Clang doesn't.
> >>>
> >>> Could you know why? dwarf size concern?
> >>>
> >>>>
> >>>>>> This adds ~5MB/1% to vmlinux of an x86_64 defconfig built with clang.
> >>>>>> Do we
> >>>>>> want to add additional guards for CONFIG_DEBUG_INFO_BTF, so that we
> >>>>>> don't have
> >>>>>> to pay that cost if that config is not set?
> >>>>>
> >>>>> Since this patch is mostly motivated to detect whether the kernel is
> >>>>> built with clang lto or not. Let me add the flag only if lto is
> >>>>> enabled. My measurement shows 0.5% increase to thinlto-vmlinux.
> >>>>> The smaller percentage is due to larger .debug_info section
> >>>>> (almost double) for thinlto vs. no lto.
> >>>>>
> >>>>> ifdef CONFIG_LTO_CLANG
> >>>>> DEBUG_CFLAGS   += -grecord-gcc-switches
> >>>>> endif
> >>>>>
> >>>>> This will make pahole with any clang built kernels, lto or non-lto.
> >>>>
> >>>> I share the same concern about sizes. Can't pahole know it is clang LTO
> >>>> via other means? If pahole just needs to know the one-bit information
> >>>> (clang LTO vs not), having every compile option seems unnecessary....
> >>>
> >>> This is v2 of the patch
> >>>     https://lore.kernel.org/bpf/20210331001623.2778934-1-yhs@fb.com/
> >>> The flag will be guarded with CONFIG_LTO_CLANG.
> >>>
> >>> As mentioned in commit message of v2, the alternative is
> >>> to go through every cu to find out whether DW_FORM_ref_addr is used
> >>> or not. In other words, check every possible cross-cu references
> >>> to find whether cross-cu reference actually happens or not. This
> >>> is quite heavy for pahole...
> >>>
> >>> What we really want to know is whether cross-cu reference happens
> >>> or not? If there is an easy way to get it, that will be great.
> >>
> >> +David Blaikie
> >
> > Yep, that shouldn't be too hard to test for more directly - scanning
> > .debug_abbrev for DW_FORM_ref_addr should be what you need. Would that
> > be workable rather than relying on detecting clang/lto from command
> > line parameters? (GCC can produce these cross-CU references too, when
> > using lto - so this approach would help make the solution generalize
> > over GCC's behavior too)
>
> Thanks, David. This should be better. I tried with a non-lto vmlinux.
> Did "llvm-dwarfdump --debug-abbrev vmlinux > log" and then
> "grep "DW_CHILDREN_no" log | wc -l" and get 231676 records.

What conclusions are you drawing from this number/data? (I'm not
following how DW_CHILDREN_no relates to the topic - perhaps I'm
missing something)

> I will try this approach. If the time is a very small fraction of
> actual dwarf cu processing time, we should be fine. This definitely
> better than visit all die's in cu trying to detect cross-cu reference.

*fingers crossed*

Yonghong Song March 31, 2021, 3:26 a.m. UTC | #9
On 3/30/21 8:16 PM, David Blaikie wrote:
> On Tue, Mar 30, 2021 at 8:13 PM Yonghong Song <yhs@fb.com> wrote:
>>
>>
>>
>> On 3/30/21 7:51 PM, David Blaikie wrote:
>>> On Tue, Mar 30, 2021 at 7:39 PM Fāng-ruì Sòng <maskray@google.com> wrote:
>>>>
>>>> On Tue, Mar 30, 2021 at 6:48 PM Yonghong Song <yhs@fb.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 3/30/21 5:25 PM, Fangrui Song wrote:
>>>>>> On 2021-03-30, 'Yonghong Song' via Clang Built Linux wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 3/29/21 3:52 PM, Nick Desaulniers wrote:
>>>>>>>> (replying to
>>>>>>>> https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@fb.com/)
>>>>>>>>
>>>>>>>> Thanks for the patch!
>>>>>>>>
>>>>>>>>> +# gcc emits compilation flags in dwarf DW_AT_producer by default
>>>>>>>>> +# while clang needs explicit flag. Add this flag explicitly.
>>>>>>>>> +ifdef CONFIG_CC_IS_CLANG
>>>>>>>>> +DEBUG_CFLAGS    += -grecord-gcc-switches
>>>>>>>>> +endif
>>>>>>>>> +
>>>>>>
>>>>>> Yes, gcc defaults to -grecord-gcc-switches. Clang doesn't.
>>>>>
>>>>> Could you know why? dwarf size concern?
>>>>>
>>>>>>
>>>>>>>> This adds ~5MB/1% to vmlinux of an x86_64 defconfig built with clang.
>>>>>>>> Do we
>>>>>>>> want to add additional guards for CONFIG_DEBUG_INFO_BTF, so that we
>>>>>>>> don't have
>>>>>>>> to pay that cost if that config is not set?
>>>>>>>
>>>>>>> Since this patch is mostly motivated to detect whether the kernel is
>>>>>>> built with clang lto or not. Let me add the flag only if lto is
>>>>>>> enabled. My measurement shows 0.5% increase to thinlto-vmlinux.
>>>>>>> The smaller percentage is due to larger .debug_info section
>>>>>>> (almost double) for thinlto vs. no lto.
>>>>>>>
>>>>>>> ifdef CONFIG_LTO_CLANG
>>>>>>> DEBUG_CFLAGS   += -grecord-gcc-switches
>>>>>>> endif
>>>>>>>
>>>>>>> This will make pahole with any clang built kernels, lto or non-lto.
>>>>>>
>>>>>> I share the same concern about sizes. Can't pahole know it is clang LTO
>>>>>> via other means? If pahole just needs to know the one-bit information
>>>>>> (clang LTO vs not), having every compile option seems unnecessary....
>>>>>
>>>>> This is v2 of the patch
>>>>>      https://lore.kernel.org/bpf/20210331001623.2778934-1-yhs@fb.com/
>>>>> The flag will be guarded with CONFIG_LTO_CLANG.
>>>>>
>>>>> As mentioned in commit message of v2, the alternative is
>>>>> to go through every cu to find out whether DW_FORM_ref_addr is used
>>>>> or not. In other words, check every possible cross-cu references
>>>>> to find whether cross-cu reference actually happens or not. This
>>>>> is quite heavy for pahole...
>>>>>
>>>>> What we really want to know is whether cross-cu reference happens
>>>>> or not? If there is an easy way to get it, that will be great.
>>>>
>>>> +David Blaikie
>>>
>>> Yep, that shouldn't be too hard to test for more directly - scanning
>>> .debug_abbrev for DW_FORM_ref_addr should be what you need. Would that
>>> be workable rather than relying on detecting clang/lto from command
>>> line parameters? (GCC can produce these cross-CU references too, when
>>> using lto - so this approach would help make the solution generalize
>>> over GCC's behavior too)
>>
>> Thanks, David. This should be better. I tried with a non-lto vmlinux.
>> Did "llvm-dwarfdump --debug-abbrev vmlinux > log" and then
>> "grep "DW_CHILDREN_no" log | wc -l" and get 231676 records.
> 
> What conclusions are you drawing from this number/data? (I'm not
> following how DW_CHILDREN_no relates to the topic - perhaps I'm
> missing something)

Approximation of the number of tags to visit:

...
[10] DW_TAG_array_type  DW_CHILDREN_yes
         DW_AT_type      DW_FORM_ref4
         DW_AT_sibling   DW_FORM_ref4

[11] DW_TAG_variable    DW_CHILDREN_no
         DW_AT_name      DW_FORM_strp
         DW_AT_decl_file DW_FORM_data1
         DW_AT_decl_line DW_FORM_data2
         DW_AT_decl_column       DW_FORM_data1
         DW_AT_type      DW_FORM_ref4
         DW_AT_external  DW_FORM_flag_present
         DW_AT_declaration       DW_FORM_flag_present

[12] DW_TAG_member      DW_CHILDREN_no
         DW_AT_name      DW_FORM_string
         DW_AT_decl_file DW_FORM_data1
         DW_AT_decl_line DW_FORM_data1
         DW_AT_decl_column       DW_FORM_data1
         DW_AT_type      DW_FORM_ref4
         DW_AT_data_member_location      DW_FORM_data1

[13] DW_TAG_subrange_type       DW_CHILDREN_no
         DW_AT_type      DW_FORM_ref4
         DW_AT_upper_bound       DW_FORM_data1
...
A bigger number means more tags to visit, which will consume more time.
For a binary not compiled with lto, all these tags will be visited
before declaring that the dwarf does not have a cross-cu reference.
So the number is just a relative guess at the cpu cost. But yeah, a
real implementation is needed first...
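
The counting above can also be done on the textual dump with a short script.
The sketch below is Python and purely illustrative: the function name is
hypothetical, it assumes the dump has the line layout shown above, and the
DW_FORM_ref_addr line in the second sample is made up to show a hit.

```python
def summarize_abbrev_dump(text):
    """Count abbreviation declarations and DW_FORM_ref_addr uses in a text dump."""
    declarations = 0
    ref_addr_uses = 0
    for line in text.splitlines():
        fields = line.split()
        # Each declaration header carries DW_CHILDREN_yes or DW_CHILDREN_no.
        if any(f.startswith("DW_CHILDREN_") for f in fields):
            declarations += 1
        if "DW_FORM_ref_addr" in fields:
            ref_addr_uses += 1
    return declarations, ref_addr_uses


# Two declarations copied from the dump above.
NON_LTO_SAMPLE = """\
[10] DW_TAG_array_type  DW_CHILDREN_yes
         DW_AT_type      DW_FORM_ref4
         DW_AT_sibling   DW_FORM_ref4

[13] DW_TAG_subrange_type       DW_CHILDREN_no
         DW_AT_type      DW_FORM_ref4
         DW_AT_upper_bound       DW_FORM_data1
"""

# Made-up declaration with a cross-cu form, for illustration only.
CROSS_CU_SAMPLE = NON_LTO_SAMPLE + """\

[14] DW_TAG_variable    DW_CHILDREN_no
         DW_AT_abstract_origin   DW_FORM_ref_addr
"""

print(summarize_abbrev_dump(NON_LTO_SAMPLE))
print(summarize_abbrev_dump(CROSS_CU_SAMPLE))
```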

> 
>> I will try this approach. If the time is a very small fraction of
>> actual dwarf cu processing time, we should be fine. This definitely
>> better than visit all die's in cu trying to detect cross-cu reference.
> 
> *fingers crossed*
>

David Blaikie March 31, 2021, 3:28 a.m. UTC | #10
On Tue, Mar 30, 2021 at 8:27 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 3/30/21 8:16 PM, David Blaikie wrote:
> > On Tue, Mar 30, 2021 at 8:13 PM Yonghong Song <yhs@fb.com> wrote:
> >>
> >>
> >>
> >> On 3/30/21 7:51 PM, David Blaikie wrote:
> >>> On Tue, Mar 30, 2021 at 7:39 PM Fāng-ruì Sòng <maskray@google.com> wrote:
> >>>>
> >>>> On Tue, Mar 30, 2021 at 6:48 PM Yonghong Song <yhs@fb.com> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 3/30/21 5:25 PM, Fangrui Song wrote:
> >>>>>> On 2021-03-30, 'Yonghong Song' via Clang Built Linux wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> On 3/29/21 3:52 PM, Nick Desaulniers wrote:
> >>>>>>>> (replying to
> >>>>>>>> https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@fb.com/)
> >>>>>>>>
> >>>>>>>> Thanks for the patch!
> >>>>>>>>
> >>>>>>>>> +# gcc emits compilation flags in dwarf DW_AT_producer by default
> >>>>>>>>> +# while clang needs explicit flag. Add this flag explicitly.
> >>>>>>>>> +ifdef CONFIG_CC_IS_CLANG
> >>>>>>>>> +DEBUG_CFLAGS    += -grecord-gcc-switches
> >>>>>>>>> +endif
> >>>>>>>>> +
> >>>>>>
> >>>>>> Yes, gcc defaults to -grecord-gcc-switches. Clang doesn't.
> >>>>>
> >>>>> Could you know why? dwarf size concern?
> >>>>>
> >>>>>>
> >>>>>>>> This adds ~5MB/1% to vmlinux of an x86_64 defconfig built with clang.
> >>>>>>>> Do we
> >>>>>>>> want to add additional guards for CONFIG_DEBUG_INFO_BTF, so that we
> >>>>>>>> don't have
> >>>>>>>> to pay that cost if that config is not set?
> >>>>>>>
> >>>>>>> Since this patch is mostly motivated to detect whether the kernel is
> >>>>>>> built with clang lto or not. Let me add the flag only if lto is
> >>>>>>> enabled. My measurement shows 0.5% increase to thinlto-vmlinux.
> >>>>>>> The smaller percentage is due to larger .debug_info section
> >>>>>>> (almost double) for thinlto vs. no lto.
> >>>>>>>
> >>>>>>> ifdef CONFIG_LTO_CLANG
> >>>>>>> DEBUG_CFLAGS   += -grecord-gcc-switches
> >>>>>>> endif
> >>>>>>>
> >>>>>>> This will make pahole with any clang built kernels, lto or non-lto.
> >>>>>>
> >>>>>> I share the same concern about sizes. Can't pahole know it is clang LTO
> >>>>>> via other means? If pahole just needs to know the one-bit information
> >>>>>> (clang LTO vs not), having every compile option seems unnecessary....
> >>>>>
> >>>>> This is v2 of the patch
> >>>>>      https://lore.kernel.org/bpf/20210331001623.2778934-1-yhs@fb.com/
> >>>>> The flag will be guarded with CONFIG_LTO_CLANG.
> >>>>>
> >>>>> As mentioned in commit message of v2, the alternative is
> >>>>> to go through every cu to find out whether DW_FORM_ref_addr is used
> >>>>> or not. In other words, check every possible cross-cu references
> >>>>> to find whether cross-cu reference actually happens or not. This
> >>>>> is quite heavy for pahole...
> >>>>>
> >>>>> What we really want to know is whether cross-cu reference happens
> >>>>> or not? If there is an easy way to get it, that will be great.
> >>>>
> >>>> +David Blaikie
> >>>
> >>> Yep, that shouldn't be too hard to test for more directly - scanning
> >>> .debug_abbrev for DW_FORM_ref_addr should be what you need. Would that
> >>> be workable rather than relying on detecting clang/lto from command
> >>> line parameters? (GCC can produce these cross-CU references too, when
> >>> using lto - so this approach would help make the solution generalize
> >>> over GCC's behavior too)
> >>
> >> Thanks, David. This should be better. I tried with a non-lto vmlinux.
> >> Did "llvm-dwarfdump --debug-abbrev vmlinux > log" and then
> >> "grep "DW_CHILDREN_no" log | wc -l" and get 231676 records.
> >
> > What conclusions are you drawing from this number/data? (I'm not
> > following how DW_CHILDREN_no relates to the topic - perhaps I'm
> > missing something)
>
> Approximation of the number of tags to visit:
>
> ...
> [10] DW_TAG_array_type  DW_CHILDREN_yes
>          DW_AT_type      DW_FORM_ref4
>          DW_AT_sibling   DW_FORM_ref4
>
> [11] DW_TAG_variable    DW_CHILDREN_no
>          DW_AT_name      DW_FORM_strp
>          DW_AT_decl_file DW_FORM_data1
>          DW_AT_decl_line DW_FORM_data2
>          DW_AT_decl_column       DW_FORM_data1
>          DW_AT_type      DW_FORM_ref4
>          DW_AT_external  DW_FORM_flag_present
>          DW_AT_declaration       DW_FORM_flag_present
>
> [12] DW_TAG_member      DW_CHILDREN_no
>          DW_AT_name      DW_FORM_string
>          DW_AT_decl_file DW_FORM_data1
>          DW_AT_decl_line DW_FORM_data1
>          DW_AT_decl_column       DW_FORM_data1
>          DW_AT_type      DW_FORM_ref4
>          DW_AT_data_member_location      DW_FORM_data1
>
> [13] DW_TAG_subrange_type       DW_CHILDREN_no
>          DW_AT_type      DW_FORM_ref4
>          DW_AT_upper_bound       DW_FORM_data1
> ...
> The bigger number means more tags to visit and will consume more time.
> For a binary not compiled with lto, all these tags will be visited
> before declaring that the dwarf does not have cross-cu reference.
> So the number is just a relative guess on the cpu cost. But ya,
> have to have real implementation first...

Ah, sounds good, yeah.

Patch

diff --git a/Makefile b/Makefile
index d4784d181123..ab0119beb42d 100644
--- a/Makefile
+++ b/Makefile
@@ -839,6 +839,12 @@  dwarf-version-$(CONFIG_DEBUG_INFO_DWARF5) := 5
 DEBUG_CFLAGS	+= -gdwarf-$(dwarf-version-y)
 endif
 
+# gcc emits compilation flags in dwarf DW_AT_producer by default
+# while clang needs explicit flag. Add this flag explicitly.
+ifdef CONFIG_CC_IS_CLANG
+DEBUG_CFLAGS	+= -grecord-gcc-switches
+endif
+
 ifdef CONFIG_DEBUG_INFO_REDUCED
 DEBUG_CFLAGS	+= $(call cc-option, -femit-struct-debug-baseonly) \
 		   $(call cc-option,-fno-var-tracking)