diff mbox series

[v4,3/3] scripts/faddr2line: Skip over mapping symbols in output from readelf

Message ID 20230914131225.13415-4-will@kernel.org (mailing list archive)
State New, archived
Series Fix 'faddr2line' for LLVM arm64 builds

Commit Message

Will Deacon Sept. 14, 2023, 1:12 p.m. UTC
Mapping symbols emitted in the readelf output can confuse the
'faddr2line' symbol size calculation, resulting in the erroneous
rejection of valid offsets. This is especially prevalent when building
an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions are
prefixed with a 32-bit data value marked by a '$d.n' mapping symbol. For example:

447538: ffff800080014b80   548 FUNC    GLOBAL DEFAULT    2 do_one_initcall
   104: ffff800080014c74     0 NOTYPE  LOCAL  DEFAULT    2 $x.73
   106: ffff800080014d30     0 NOTYPE  LOCAL  DEFAULT    2 $x.75
   111: ffff800080014da4     0 NOTYPE  LOCAL  DEFAULT    2 $d.78
   112: ffff800080014da8     0 NOTYPE  LOCAL  DEFAULT    2 $x.79
    36: ffff800080014de0   200 FUNC    LOCAL  DEFAULT    2 run_init_process

Adding a warning to do_one_initcall() results in:

  | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260

Which 'faddr2line' refuses to accept:

$ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
no match for do_one_initcall+0xf4/0x260

Filter out these entries from readelf using a shell reimplementation of
is_mapping_symbol(), so that the size of a symbol is calculated as a
delta to the next symbol present in ksymtab.

Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: John Stultz <jstultz@google.com>
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 scripts/faddr2line | 5 +++++
 1 file changed, 5 insertions(+)
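
The size calculation described above can be illustrated with a minimal
sketch. It assumes the readelf lines from the commit message are already
sorted by address, and it uses a hypothetical helper, next_symbol_delta(),
which is not part of faddr2line. The filter is the regex from this patch.

```shell
#!/bin/bash
# Symbol table excerpt copied from the commit message, already sorted by address.
sample='447538: ffff800080014b80   548 FUNC    GLOBAL DEFAULT    2 do_one_initcall
   104: ffff800080014c74     0 NOTYPE  LOCAL  DEFAULT    2 $x.73
   106: ffff800080014d30     0 NOTYPE  LOCAL  DEFAULT    2 $x.75
   111: ffff800080014da4     0 NOTYPE  LOCAL  DEFAULT    2 $d.78
   112: ffff800080014da8     0 NOTYPE  LOCAL  DEFAULT    2 $x.79
    36: ffff800080014de0   200 FUNC    LOCAL  DEFAULT    2 run_init_process'

# Hypothetical helper (not part of faddr2line): print the distance from the
# symbol named $1 to the next symbol that is not a mapping symbol.
next_symbol_delta() {
	local target=$1 target_addr=
	local -a fields
	while read -r -a fields; do
		local addr=${fields[1]}
		local name=${fields[7]:-}
		# Skip mapping symbols, using the regex from this patch.
		if [[ $name =~ ^((\.L)|(L0)|(\$[adtx](\.|$))) ]]; then
			continue
		fi
		# First non-mapping symbol after the target: report the delta.
		if [[ -n $target_addr ]]; then
			printf '0x%x\n' $(( 0x$addr - 0x$target_addr ))
			return 0
		fi
		if [[ $name == "$target" ]]; then
			target_addr=$addr
		fi
	done <<< "$sample"
	return 1
}

next_symbol_delta do_one_initcall
```

With the mapping symbols skipped, the delta runs from do_one_initcall to
run_init_process and comes out as 0x260, the size reported in the WARNING
line. Without the filter, the next symbol would be $x.73 and the delta
would be 0xf4.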

Comments

Nick Desaulniers Sept. 18, 2023, 3:46 p.m. UTC | #1
On Thu, Sep 14, 2023 at 6:12 AM Will Deacon <will@kernel.org> wrote:
> diff --git a/scripts/faddr2line b/scripts/faddr2line
> index 6b8206802157..20d9b3d37843 100755
> --- a/scripts/faddr2line
> +++ b/scripts/faddr2line
> @@ -179,6 +179,11 @@ __faddr2line() {
>                         local cur_sym_elf_size=${fields[2]}
>                         local cur_sym_name=${fields[7]:-}
>
> +                       # is_mapping_symbol(cur_sym_name)
> +                       if [[ ${cur_sym_name} =~ ^((\.L)|(L0)|(\$[adtx](\.|$))) ]]; then

Thanks for the patch!

I'm curious about the `|$` in the final part of the regex.  IIUC that
will match something like
$a
Do we have any such symbols without `.<n>` suffixes?

With aarch64 defconfig + cfi:
$ llvm-readelf -s vmlinux | grep '\$' | rev | cut -d ' ' -f 1 | rev | sort -u
I only see $d.<n> and $x.<n> where the initial value of <n> is zero
(as opposed to no `.<n>` suffix).
Can we tighten up that last part of the regex to be `\$[adtx]\.[0-9]+$` ?
Or perhaps you've observed mapping symbols use another convention than
what clang is doing?

https://sourceware.org/binutils/docs/as/AArch64-Mapping-Symbols.html
also only mentions $d and $x. Ah,
https://developer.arm.com/documentation/dui0803/a/Accessing-and-managing-symbols-with-armlink/About-mapping-symbols
mentions $a for A32 and $t for T32.
Consider adding a link to the ARM documentation on mapping symbols in
the commit message?

(Curiously, `llvm-nm` does not print these symbols, but `llvm-readelf -s` does).
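
A small sketch of how the two variants of the final alternative behave on a
few names. The variable names v4_re and tight_re and the helper matches()
are hypothetical; the tightened regex is an assumption about how the
suggestion above would be combined with the rest of the posted pattern.

```shell
#!/bin/bash
# As posted in v4: '$' + one of [adtx], followed by '.' or the end of the name.
v4_re='^((\.L)|(L0)|(\$[adtx](\.|$)))'
# Tightened variant suggested in the review: require a numeric '.<n>' suffix.
tight_re='^((\.L)|(L0)|(\$[adtx]\.[0-9]+$))'

# Hypothetical helper: succeed if the name in $1 matches the regex in $2.
matches() {
	[[ $1 =~ $2 ]]
}

for name in '$a' '$x.73' '$d.0' '$xyz' '.Ltmp0' 'do_one_initcall'; do
	matches "$name" "$v4_re" && v4=yes || v4=no
	matches "$name" "$tight_re" && tight=yes || tight=no
	printf '%-16s v4=%-3s tightened=%s\n' "$name" "$v4" "$tight"
done
```

The only name in this list on which the two differ is the bare '$a': the v4
pattern accepts it through the '|$' branch, while the tightened one
rejects it.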

> +                               continue
> +                       fi
> +
>                         if [[ $cur_sym_addr = $sym_addr ]] &&
>                            [[ $cur_sym_elf_size = $sym_elf_size ]] &&
>                            [[ $cur_sym_name = $sym_name ]]; then
> --
> 2.42.0.283.g2d96d420d3-goog
>
Masahiro Yamada Sept. 25, 2023, 4:50 p.m. UTC | #2
On Thu, Sep 14, 2023 at 10:12 PM Will Deacon <will@kernel.org> wrote:
> diff --git a/scripts/faddr2line b/scripts/faddr2line
> index 6b8206802157..20d9b3d37843 100755
> --- a/scripts/faddr2line
> +++ b/scripts/faddr2line
> @@ -179,6 +179,11 @@ __faddr2line() {
>                         local cur_sym_elf_size=${fields[2]}
>                         local cur_sym_name=${fields[7]:-}
>
> +                       # is_mapping_symbol(cur_sym_name)
> +                       if [[ ${cur_sym_name} =~ ^((\.L)|(L0)|(\$[adtx](\.|$))) ]]; then
> +                               continue
> +                       fi
> +


Too many parentheses.


The latest include/linux/module_symbol.h looks like this.

static inline int is_mapping_symbol(const char *str)
{
        if (str[0] == '.' && str[1] == 'L')
                return true;
        if (str[0] == 'L' && str[1] == '0')
                return true;
        return str[0] == '$';
}

Does this work?

if [[ ${cur_sym_name} =~ ^(\.L|L0|\$) ]]; then
        continue
fi
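
One way to check the question above is to compare the regex against a
direct shell transliteration of the C helper. This is only a sketch: the
function names and the list of sample names are made up for illustration,
and agreement on a handful of names is not a proof of equivalence.

```shell
#!/bin/bash
# Shell transliteration of the C is_mapping_symbol() quoted above.
is_mapping_symbol() {
	case $1 in
		.L*|L0*|'$'*) return 0 ;;
	esac
	return 1
}

# The simplified regex proposed for faddr2line.
matches_regex() {
	[[ $1 =~ ^(\.L|L0|\$) ]]
}

# Sample names: mapping symbols from the thread plus a few ordinary names.
names=('$x.73' '$d.78' '$a' '.Ltmp0' 'L0' 'do_one_initcall' 'Lfoo' 'x$y' '')

for name in "${names[@]}"; do
	is_mapping_symbol "$name" && c=yes || c=no
	matches_regex "$name" && r=yes || r=no
	if [[ $c != "$r" ]]; then
		echo "mismatch for '$name': C-style=$c regex=$r"
	fi
done
```

The loop prints nothing when both checks agree on every sample name.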

>                         if [[ $cur_sym_addr = $sym_addr ]] &&
>                            [[ $cur_sym_elf_size = $sym_elf_size ]] &&
>                            [[ $cur_sym_name = $sym_name ]]; then
> --
> 2.42.0.283.g2d96d420d3-goog
>
Will Deacon Sept. 29, 2023, 2:15 p.m. UTC | #3
On Tue, Sep 26, 2023 at 01:50:20AM +0900, Masahiro Yamada wrote:
> On Thu, Sep 14, 2023 at 10:12 PM Will Deacon <will@kernel.org> wrote:
> > diff --git a/scripts/faddr2line b/scripts/faddr2line
> > index 6b8206802157..20d9b3d37843 100755
> > --- a/scripts/faddr2line
> > +++ b/scripts/faddr2line
> > @@ -179,6 +179,11 @@ __faddr2line() {
> >                         local cur_sym_elf_size=${fields[2]}
> >                         local cur_sym_name=${fields[7]:-}
> >
> > +                       # is_mapping_symbol(cur_sym_name)
> > +                       if [[ ${cur_sym_name} =~ ^((\.L)|(L0)|(\$[adtx](\.|$))) ]]; then
> > +                               continue
> > +                       fi
> > +
> 
> 
> Too many parentheses.

Ha, well _that_ is subjective! I really think they help when it comes to
regex syntax. However...

> The latest include/linux/module_symbol.h looks like this.
> 
> static inline int is_mapping_symbol(const char *str)
> {
>         if (str[0] == '.' && str[1] == 'L')
>                 return true;
>         if (str[0] == 'L' && str[1] == '0')
>                 return true;
>         return str[0] == '$';
> }

...oh, nice, that got simplified a whole lot by ff09f6fd2972 ("modpost,
kallsyms: Treat add '$'-prefixed symbols as mapping symbols") in the
recent merge window, so I can definitely simplify the regex.

> Does this work?
> 
> if [[ ${cur_sym_name} =~ ^(\.L|L0|\$) ]]; then
>         continue
> fi

Looks about right.

Will
Will Deacon Oct. 2, 2023, 3:26 p.m. UTC | #4
On Mon, Sep 18, 2023 at 08:46:22AM -0700, Nick Desaulniers wrote:
> On Thu, Sep 14, 2023 at 6:12 AM Will Deacon <will@kernel.org> wrote:
> > diff --git a/scripts/faddr2line b/scripts/faddr2line
> > index 6b8206802157..20d9b3d37843 100755
> > --- a/scripts/faddr2line
> > +++ b/scripts/faddr2line
> > @@ -179,6 +179,11 @@ __faddr2line() {
> >                         local cur_sym_elf_size=${fields[2]}
> >                         local cur_sym_name=${fields[7]:-}
> >
> > +                       # is_mapping_symbol(cur_sym_name)
> > +                       if [[ ${cur_sym_name} =~ ^((\.L)|(L0)|(\$[adtx](\.|$))) ]]; then
> 
> Thanks for the patch!
> 
> I'm curious about the `|$` in the final part of the regex.  IIUC that
> will match something like
> $a
> Do we have any such symbols without `.<n>` suffixes?

tbh, I just blindly followed the implementation of is_mapping_symbol()
at the time, but Masahiro has since pointed out that it's been
significantly simplified so this regex should get much more manageable
in the next version.

Will
Patch

diff --git a/scripts/faddr2line b/scripts/faddr2line
index 6b8206802157..20d9b3d37843 100755
--- a/scripts/faddr2line
+++ b/scripts/faddr2line
@@ -179,6 +179,11 @@  __faddr2line() {
 			local cur_sym_elf_size=${fields[2]}
 			local cur_sym_name=${fields[7]:-}
 
+			# is_mapping_symbol(cur_sym_name)
+			if [[ ${cur_sym_name} =~ ^((\.L)|(L0)|(\$[adtx](\.|$))) ]]; then
+				continue
+			fi
+
 			if [[ $cur_sym_addr = $sym_addr ]] &&
 			   [[ $cur_sym_elf_size = $sym_elf_size ]] &&
 			   [[ $cur_sym_name = $sym_name ]]; then