diff mbox series

recordmcount: avoid using ABS symbol as reference

Message ID 20210607023839.26387-1-mark-pk.tsai@mediatek.com (mailing list archive)
State New
Headers show
Series recordmcount: avoid using ABS symbol as reference | expand

Commit Message

Mark-PK Tsai June 7, 2021, 2:38 a.m. UTC
Avoid using ABS symbol, which won't be relocate, as reference.

On arm64 platform, if there's shndx equals SHN_ABS(0xfff1).

Section Headers:
[Nr]    Name                         Type      Address          Off      Size   ES  Flg Lk     Inf    Al
[65521] .text.n_tty_receive_buf      PROGBITS  0000000000000000 3cdab520 000054 00  AX  0      0      4
[65522] .rela.text.n_tty_receive_buf RELA      0000000000000000 3cdab578 000030 18  I   152076 65521  8

find_secsym_ndx, which use r_info in rela section to find the reference
symbol, may take ABS symbol as base.

Symbol table '.symtab' contains 453285 entries:
   Num:    Value          Size Type    Bind   Vis       Ndx Name
     6: 0000000000000002     0 NOTYPE  LOCAL  DEFAULT   ABS section_count

Which cause an invalid address in __mcount_loc.

Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
---
 scripts/recordmcount.h | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Ard Biesheuvel June 7, 2021, 6:37 a.m. UTC | #1
On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
>
> Avoid using ABS symbol, which won't be relocate, as reference.
>
> On arm64 platform, if there's shndx equals SHN_ABS(0xfff1).
>
> Section Headers:
> [Nr]    Name                         Type      Address          Off      Size   ES  Flg Lk     Inf    Al
> [65521] .text.n_tty_receive_buf      PROGBITS  0000000000000000 3cdab520 000054 00  AX  0      0      4
> [65522] .rela.text.n_tty_receive_buf RELA      0000000000000000 3cdab578 000030 18  I   152076 65521  8
>

A RELA section's r_info field points to the section to which it
applies. This is why in the example above section #65522 points to
section #65521. This has nothing to do with the numerical value of
SHN_ABS.

> find_secsym_ndx, which use r_info in rela section to find the reference
> symbol, may take ABS symbol as base.
>
> Symbol table '.symtab' contains 453285 entries:
>    Num:    Value          Size Type    Bind   Vis       Ndx Name
>      6: 0000000000000002     0 NOTYPE  LOCAL  DEFAULT   ABS section_count
>
> Which cause an invalid address in __mcount_loc.
>

Could you give a better account of the error you are trying to address?

Also, arm64 no longer defines a section_count symbol (since v5.11), so
please make sure that the diagnostics of the issue you are addressing
are accurate for mainline.


> Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
> ---
>  scripts/recordmcount.h | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
> index f9b19524da11..9b69167fb7ff 100644
> --- a/scripts/recordmcount.h
> +++ b/scripts/recordmcount.h
> @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx,
>         for (symp = sym0, t = nsym; t; --t, ++symp) {
>                 unsigned int const st_bind = ELF_ST_BIND(symp->st_info);
>
> +               /* avoid absolute symbols */
> +               if (symp->st_shndx == SHN_ABS)
> +                       continue;
> +
>                 if (txtndx == get_symindex(symp, symtab, symtab_shndx)
>                         /* avoid STB_WEAK */
>                     && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) {
> --
> 2.18.0
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Mark-PK Tsai June 7, 2021, 6:59 a.m. UTC | #2
> > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> >
> > Avoid using ABS symbol, which won't be relocate, as reference.
> >
> > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1).
> >
> > Section Headers:
> > [Nr]    Name                         Type      Address          Off      Size   ES  Flg Lk     Inf    Al
> > [65521] .text.n_tty_receive_buf      PROGBITS  0000000000000000 3cdab520 000054 00  AX  0      0      4
> > [65522] .rela.text.n_tty_receive_buf RELA      0000000000000000 3cdab578 000030 18  I   152076 65521  8
> >
>
> A RELA section's r_info field points to the section to which it
> applies. This is why in the example above section #65522 points to
> section #65521. This has nothing to do with the numerical value of
> SHN_ABS.

If the r_info of RELA section is 65521(0xfff1),
find_secsym_ndx() will use it to find the base symbol.

And in the symbol search loop in find_secsym_ndx(), get_symindex will
return 0xfff1 if the symbol is in ABS section.

In this case, find_secsym_ndx() will return a absolute symbol as
base, which won't be relocate, if an ABS symbol is found before the
real symbol in section 65521.

>
> > find_secsym_ndx, which use r_info in rela section to find the reference
> > symbol, may take ABS symbol as base.
> >
> > Symbol table '.symtab' contains 453285 entries:
> >    Num:    Value          Size Type    Bind   Vis       Ndx Name
> >      6: 0000000000000002     0 NOTYPE  LOCAL  DEFAULT   ABS section_count
> >
> > Which cause an invalid address in __mcount_loc.
> >
>
> Could you give a better account of the error you are trying to address?
>
> Also, arm64 no longer defines a section_count symbol (since v5.11), so
> please make sure that the diagnostics of the issue you are addressing
> are accurate for mainline.
>

My kernel version is 5.4.61.
But as I explained, I suppose mainline also have this issue.

>
> > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
> > ---
> >  scripts/recordmcount.h | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
> > index f9b19524da11..9b69167fb7ff 100644
> > --- a/scripts/recordmcount.h
> > +++ b/scripts/recordmcount.h
> > @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx,
> >         for (symp = sym0, t = nsym; t; --t, ++symp) {
> >                 unsigned int const st_bind = ELF_ST_BIND(symp->st_info);
> >
> > +               /* avoid absolute symbols */
> > +               if (symp->st_shndx == SHN_ABS)
> > +                       continue;
> > +
> >                 if (txtndx == get_symindex(symp, symtab, symtab_shndx)
> >                         /* avoid STB_WEAK */
> >                     && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) {
Ard Biesheuvel June 7, 2021, 7:07 a.m. UTC | #3
On Mon, 7 Jun 2021 at 08:59, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
>
> > > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> > >
> > > Avoid using ABS symbol, which won't be relocate, as reference.
> > >
> > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1).
> > >
> > > Section Headers:
> > > [Nr]    Name                         Type      Address          Off      Size   ES  Flg Lk     Inf    Al
> > > [65521] .text.n_tty_receive_buf      PROGBITS  0000000000000000 3cdab520 000054 00  AX  0      0      4
> > > [65522] .rela.text.n_tty_receive_buf RELA      0000000000000000 3cdab578 000030 18  I   152076 65521  8
> > >
> >
> > A RELA section's r_info field points to the section to which it
> > applies. This is why in the example above section #65522 points to
> > section #65521. This has nothing to do with the numerical value of
> > SHN_ABS.
>
> If the r_info of RELA section is 65521(0xfff1),
> find_secsym_ndx() will use it to find the base symbol.
>

But what does that have to do with the sh_info field of the RELA
section's Elf_Shdr struct? IOW, what is the relevance of section
#65521 here?

> And in the symbol search loop in find_secsym_ndx(), get_symindex will
> return 0xfff1 if the symbol is in ABS section.
>
> In this case, find_secsym_ndx() will return a absolute symbol as
> base, which won't be relocate, if an ABS symbol is found before the
> real symbol in section 65521.
>

I see your point here.

> >
> > > find_secsym_ndx, which use r_info in rela section to find the reference
> > > symbol, may take ABS symbol as base.
> > >
> > > Symbol table '.symtab' contains 453285 entries:
> > >    Num:    Value          Size Type    Bind   Vis       Ndx Name
> > >      6: 0000000000000002     0 NOTYPE  LOCAL  DEFAULT   ABS section_count
> > >
> > > Which cause an invalid address in __mcount_loc.
> > >
> >
> > Could you give a better account of the error you are trying to address?
> >
> > Also, arm64 no longer defines a section_count symbol (since v5.11), so
> > please make sure that the diagnostics of the issue you are addressing
> > are accurate for mainline.
> >
>
> My kernel version is 5.4.61.
> But as I explained, I suppose mainline also have this issue.
>

Mainline is what we work on. So please base your changes (and your
commit log) on mainline.


> >
> > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
> > > ---
> > >  scripts/recordmcount.h | 4 ++++
> > >  1 file changed, 4 insertions(+)
> > >
> > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
> > > index f9b19524da11..9b69167fb7ff 100644
> > > --- a/scripts/recordmcount.h
> > > +++ b/scripts/recordmcount.h
> > > @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx,
> > >         for (symp = sym0, t = nsym; t; --t, ++symp) {
> > >                 unsigned int const st_bind = ELF_ST_BIND(symp->st_info);
> > >
> > > +               /* avoid absolute symbols */
> > > +               if (symp->st_shndx == SHN_ABS)
> > > +                       continue;
> > > +
> > >                 if (txtndx == get_symindex(symp, symtab, symtab_shndx)
> > >                         /* avoid STB_WEAK */
> > >                     && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) {
Mark-PK Tsai June 7, 2021, 7:42 a.m. UTC | #4
> On Mon, 7 Jun 2021 at 08:59, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> >
> > > > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> > > >
> > > > Avoid using ABS symbol, which won't be relocate, as reference.
> > > >
> > > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1).
> > > >
> > > > Section Headers:
> > > > [Nr]    Name                         Type      Address          Off      Size   ES  Flg Lk     Inf    Al
> > > > [65521] .text.n_tty_receive_buf      PROGBITS  0000000000000000 3cdab520 000054 00  AX  0      0      4
> > > > [65522] .rela.text.n_tty_receive_buf RELA      0000000000000000 3cdab578 000030 18  I   152076 65521  8
> > > >
> > >
> > > A RELA section's r_info field points to the section to which it
> > > applies. This is why in the example above section #65522 points to
> > > section #65521. This has nothing to do with the numerical value of
> > > SHN_ABS.
> >
> > If the r_info of RELA section is 65521(0xfff1),

Oh sorry, I mean sh_info here.

> > find_secsym_ndx() will use it to find the base symbol.
> >
>
> But what does that have to do with the sh_info field of the RELA
> section's Elf_Shdr struct? IOW, what is the relevance of section
> #65521 here?
>

So what I mean is the problem occur if the sh_info of a RELA section
is #65521.

> > And in the symbol search loop in find_secsym_ndx(), get_symindex will
> > return 0xfff1 if the symbol is in ABS section.
> >
> > In this case, find_secsym_ndx() will return a absolute symbol as
> > base, which won't be relocate, if an ABS symbol is found before the
> > real symbol in section 65521.
> >
>
> I see your point here.
>
> > >
> > > > find_secsym_ndx, which use r_info in rela section to find the reference

sh_info.

> > > > symbol, may take ABS symbol as base.
> > > >
> > > > Symbol table '.symtab' contains 453285 entries:
> > > >    Num:    Value          Size Type    Bind   Vis       Ndx Name
> > > >      6: 0000000000000002     0 NOTYPE  LOCAL  DEFAULT   ABS section_count
> > > >
> > > > Which cause an invalid address in __mcount_loc.
> > > >
> > >
> > > Could you give a better account of the error you are trying to address?
> > >
> > > Also, arm64 no longer defines a section_count symbol (since v5.11), so
> > > please make sure that the diagnostics of the issue you are addressing
> > > are accurate for mainline.
> > >
> >
> > My kernel version is 5.4.61.
> > But as I explained, I suppose mainline also have this issue.
> >
>
> Mainline is what we work on. So please base your changes (and your
> commit log) on mainline.
>

I understand it.
But the platform I can reproduce the problem is only support to 5.4 LTS now.
And port it to the latest mainline kernel have much more work to do, can I just
keep this commit log? Or just remove the example I posted in the commit messsage?

>
> > >
> > > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
> > > > ---
> > > >  scripts/recordmcount.h | 4 ++++
> > > >  1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
> > > > index f9b19524da11..9b69167fb7ff 100644
> > > > --- a/scripts/recordmcount.h
> > > > +++ b/scripts/recordmcount.h
> > > > @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx,
> > > >         for (symp = sym0, t = nsym; t; --t, ++symp) {
> > > >                 unsigned int const st_bind = ELF_ST_BIND(symp->st_info);
> > > >
> > > > +               /* avoid absolute symbols */
> > > > +               if (symp->st_shndx == SHN_ABS)
> > > > +                       continue;
> > > > +
> > > >                 if (txtndx == get_symindex(symp, symtab, symtab_shndx)
> > > >                         /* avoid STB_WEAK */
> > > >                     && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) {
Mark-PK Tsai June 7, 2021, 8:06 a.m. UTC | #5
> > On Mon, 7 Jun 2021 at 08:59, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> > >
> > > > > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> > > > >
> > > > > Avoid using ABS symbol, which won't be relocate, as reference.
> > > > >
> > > > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1).
> > > > >
> > > > > Section Headers:
> > > > > [Nr]    Name                         Type      Address          Off      Size   ES  Flg Lk     Inf    Al
> > > > > [65521] .text.n_tty_receive_buf      PROGBITS  0000000000000000 3cdab520 000054 00  AX  0      0      4
> > > > > [65522] .rela.text.n_tty_receive_buf RELA      0000000000000000 3cdab578 000030 18  I   152076 65521  8
> > > > >
> > > >
> > > > A RELA section's r_info field points to the section to which it
> > > > applies. This is why in the example above section #65522 points to
> > > > section #65521. This has nothing to do with the numerical value of
> > > > SHN_ABS.
> > >
> > > If the r_info of RELA section is 65521(0xfff1),
>
> Oh sorry, I mean sh_info here.
>
> > > find_secsym_ndx() will use it to find the base symbol.
> > >
> >
> > But what does that have to do with the sh_info field of the RELA
> > section's Elf_Shdr struct? IOW, what is the relevance of section
> > #65521 here?
> >
>
> So what I mean is the problem occur if the sh_info of a RELA section
> is #65521.

Actually the problem occur if the sh_info of a RELA section is in
the special section index range(SHN_LORESERVE ~ SHN_HIRESERVE).
Maybe I should add a is_shndx_special() to check this like
scripts/mod/modpost.h did?

>
> > > And in the symbol search loop in find_secsym_ndx(), get_symindex will
> > > return 0xfff1 if the symbol is in ABS section.
> > >
> > > In this case, find_secsym_ndx() will return a absolute symbol as
> > > base, which won't be relocate, if an ABS symbol is found before the
> > > real symbol in section 65521.
> > >
> >
> > I see your point here.
> >
> > > >
> > > > > find_secsym_ndx, which use r_info in rela section to find the reference
>
> sh_info.
>
> > > > > symbol, may take ABS symbol as base.
> > > > >
> > > > > Symbol table '.symtab' contains 453285 entries:
> > > > >    Num:    Value          Size Type    Bind   Vis       Ndx Name
> > > > >      6: 0000000000000002     0 NOTYPE  LOCAL  DEFAULT   ABS section_count
> > > > >
> > > > > Which cause an invalid address in __mcount_loc.
> > > > >
> > > >
> > > > Could you give a better account of the error you are trying to address?
> > > >
> > > > Also, arm64 no longer defines a section_count symbol (since v5.11), so
> > > > please make sure that the diagnostics of the issue you are addressing
> > > > are accurate for mainline.
> > > >
> > >
> > > My kernel version is 5.4.61.
> > > But as I explained, I suppose mainline also have this issue.
> > >
> >
> > Mainline is what we work on. So please base your changes (and your
> > commit log) on mainline.
> >
>
> I understand it.
> But the platform I can reproduce the problem is only support to 5.4 LTS now.
> And port it to the latest mainline kernel have much more work to do, can I just
> keep this commit log? Or just remove the example I posted in the commit messsage?
>
> >
> > > >
> > > > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
> > > > > ---
> > > > >  scripts/recordmcount.h | 4 ++++
> > > > >  1 file changed, 4 insertions(+)
> > > > >
> > > > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
> > > > > index f9b19524da11..9b69167fb7ff 100644
> > > > > --- a/scripts/recordmcount.h
> > > > > +++ b/scripts/recordmcount.h
> > > > > @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx,
> > > > >         for (symp = sym0, t = nsym; t; --t, ++symp) {
> > > > >                 unsigned int const st_bind = ELF_ST_BIND(symp->st_info);
> > > > >
> > > > > +               /* avoid absolute symbols */
> > > > > +               if (symp->st_shndx == SHN_ABS)
> > > > > +                       continue;
> > > > > +
> > > > >                 if (txtndx == get_symindex(symp, symtab, symtab_shndx)
> > > > >                         /* avoid STB_WEAK */
> > > > >                     && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) {
>
Ard Biesheuvel June 7, 2021, 9:50 a.m. UTC | #6
On Mon, 7 Jun 2021 at 10:06, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
>
> > > On Mon, 7 Jun 2021 at 08:59, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> > > >
> > > > > > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> > > > > >
> > > > > > Avoid using ABS symbol, which won't be relocate, as reference.
> > > > > >
> > > > > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1).
> > > > > >
> > > > > > Section Headers:
> > > > > > [Nr]    Name                         Type      Address          Off      Size   ES  Flg Lk     Inf    Al
> > > > > > [65521] .text.n_tty_receive_buf      PROGBITS  0000000000000000 3cdab520 000054 00  AX  0      0      4
> > > > > > [65522] .rela.text.n_tty_receive_buf RELA      0000000000000000 3cdab578 000030 18  I   152076 65521  8
> > > > > >
> > > > >
> > > > > A RELA section's r_info field points to the section to which it
> > > > > applies. This is why in the example above section #65522 points to
> > > > > section #65521. This has nothing to do with the numerical value of
> > > > > SHN_ABS.
> > > >
> > > > If the r_info of RELA section is 65521(0xfff1),
> >
> > Oh sorry, I mean sh_info here.
> >
> > > > find_secsym_ndx() will use it to find the base symbol.
> > > >
> > >
> > > But what does that have to do with the sh_info field of the RELA
> > > section's Elf_Shdr struct? IOW, what is the relevance of section
> > > #65521 here?
> > >
> >
> > So what I mean is the problem occur if the sh_info of a RELA section
> > is #65521.
>
> Actually the problem occur if the sh_info of a RELA section is in
> the special section index range(SHN_LORESERVE ~ SHN_HIRESERVE).
> Maybe I should add a is_shndx_special() to check this like
> scripts/mod/modpost.h did?
>

So if I understand all of this correctly, we are running into a
fundamental issue here, where the linker emits more sections than the
sh_info field can describe, overflowing into the reserved range.

I don't think papering over it like this is going to be maintainable
going forward.



> >
> > > > And in the symbol search loop in find_secsym_ndx(), get_symindex will
> > > > return 0xfff1 if the symbol is in ABS section.
> > > >
> > > > In this case, find_secsym_ndx() will return a absolute symbol as
> > > > base, which won't be relocate, if an ABS symbol is found before the
> > > > real symbol in section 65521.
> > > >
> > >
> > > I see your point here.
> > >
> > > > >
> > > > > > find_secsym_ndx, which use r_info in rela section to find the reference
> >
> > sh_info.
> >
> > > > > > symbol, may take ABS symbol as base.
> > > > > >
> > > > > > Symbol table '.symtab' contains 453285 entries:
> > > > > >    Num:    Value          Size Type    Bind   Vis       Ndx Name
> > > > > >      6: 0000000000000002     0 NOTYPE  LOCAL  DEFAULT   ABS section_count
> > > > > >
> > > > > > Which cause an invalid address in __mcount_loc.
> > > > > >
> > > > >
> > > > > Could you give a better account of the error you are trying to address?
> > > > >
> > > > > Also, arm64 no longer defines a section_count symbol (since v5.11), so
> > > > > please make sure that the diagnostics of the issue you are addressing
> > > > > are accurate for mainline.
> > > > >
> > > >
> > > > My kernel version is 5.4.61.
> > > > But as I explained, I suppose mainline also have this issue.
> > > >
> > >
> > > Mainline is what we work on. So please base your changes (and your
> > > commit log) on mainline.
> > >
> >
> > I understand it.
> > But the platform I can reproduce the problem is only support to 5.4 LTS now.
> > And port it to the latest mainline kernel have much more work to do, can I just
> > keep this commit log? Or just remove the example I posted in the commit messsage?
> >
> > >
> > > > >
> > > > > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
> > > > > > ---
> > > > > >  scripts/recordmcount.h | 4 ++++
> > > > > >  1 file changed, 4 insertions(+)
> > > > > >
> > > > > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
> > > > > > index f9b19524da11..9b69167fb7ff 100644
> > > > > > --- a/scripts/recordmcount.h
> > > > > > +++ b/scripts/recordmcount.h
> > > > > > @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx,
> > > > > >         for (symp = sym0, t = nsym; t; --t, ++symp) {
> > > > > >                 unsigned int const st_bind = ELF_ST_BIND(symp->st_info);
> > > > > >
> > > > > > +               /* avoid absolute symbols */
> > > > > > +               if (symp->st_shndx == SHN_ABS)
> > > > > > +                       continue;
> > > > > > +
> > > > > >                 if (txtndx == get_symindex(symp, symtab, symtab_shndx)
> > > > > >                         /* avoid STB_WEAK */
> > > > > >                     && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) {
> >
Mark-PK Tsai June 7, 2021, 10:32 a.m. UTC | #7
> On Mon, 7 Jun 2021 at 10:06, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> >
> > > > On Mon, 7 Jun 2021 at 08:59, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> > > > >
> > > > > > > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> > > > > > >
> > > > > > > Avoid using ABS symbol, which won't be relocate, as reference.
> > > > > > >
> > > > > > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1).
> > > > > > >
> > > > > > > Section Headers:
> > > > > > > [Nr]    Name                         Type      Address          Off      Size   ES  Flg Lk     Inf    Al
> > > > > > > [65521] .text.n_tty_receive_buf      PROGBITS  0000000000000000 3cdab520 000054 00  AX  0      0      4
> > > > > > > [65522] .rela.text.n_tty_receive_buf RELA      0000000000000000 3cdab578 000030 18  I   152076 65521  8
> > > > > > >
> > > > > >
> > > > > > A RELA section's r_info field points to the section to which it
> > > > > > applies. This is why in the example above section #65522 points to
> > > > > > section #65521. This has nothing to do with the numerical value of
> > > > > > SHN_ABS.
> > > > >
> > > > > If the r_info of RELA section is 65521(0xfff1),
> > >
> > > Oh sorry, I mean sh_info here.
> > >
> > > > > find_secsym_ndx() will use it to find the base symbol.
> > > > >
> > > >
> > > > But what does that have to do with the sh_info field of the RELA
> > > > section's Elf_Shdr struct? IOW, what is the relevance of section
> > > > #65521 here?
> > > >
> > >
> > > So what I mean is the problem occur if the sh_info of a RELA section
> > > is #65521.
> >
> > Actually the problem occur if the sh_info of a RELA section is in
> > the special section index range(SHN_LORESERVE ~ SHN_HIRESERVE).
> > Maybe I should add a is_shndx_special() to check this like
> > scripts/mod/modpost.h did?
> >
>
> So if I understand all of this correctly, we are running into a
> fundamental issue here, where the linker emits more sections than the
> sh_info field can describe, overflowing into the reserved range.

Actually the problem is about comparing st_shndx and sh_info.

When the st_shndx is larger than SHN_LORESERVE, the linker mark
it as SHN_XINDEX, and get_symindex() will handle it.

But sh_info for a RELA section take the actual shndx which may
be in the reserved range.
So I suppose before compare the sh_info and st_shndx,
we need to do more check in find_secsym_ndx().

>
> I don't think papering over it like this is going to be maintainable
> going forward.
>

Do you have suggestion for how to deal with this problem?

>
>
> > >
> > > > > And in the symbol search loop in find_secsym_ndx(), get_symindex will
> > > > > return 0xfff1 if the symbol is in ABS section.
> > > > >
> > > > > In this case, find_secsym_ndx() will return a absolute symbol as
> > > > > base, which won't be relocate, if an ABS symbol is found before the
> > > > > real symbol in section 65521.
> > > > >
> > > >
> > > > I see your point here.
> > > >
> > > > > >
> > > > > > > find_secsym_ndx, which use r_info in rela section to find the reference
> > >
> > > sh_info.
> > >
> > > > > > > symbol, may take ABS symbol as base.
> > > > > > >
> > > > > > > Symbol table '.symtab' contains 453285 entries:
> > > > > > >    Num:    Value          Size Type    Bind   Vis       Ndx Name
> > > > > > >      6: 0000000000000002     0 NOTYPE  LOCAL  DEFAULT   ABS section_count
> > > > > > >
> > > > > > > Which cause an invalid address in __mcount_loc.
> > > > > > >
> > > > > >
> > > > > > Could you give a better account of the error you are trying to address?
> > > > > >
> > > > > > Also, arm64 no longer defines a section_count symbol (since v5.11), so
> > > > > > please make sure that the diagnostics of the issue you are addressing
> > > > > > are accurate for mainline.
> > > > > >
> > > > >
> > > > > My kernel version is 5.4.61.
> > > > > But as I explained, I suppose mainline also have this issue.
> > > > >
> > > >
> > > > Mainline is what we work on. So please base your changes (and your
> > > > commit log) on mainline.
> > > >
> > >
> > > I understand it.
> > > But the platform I can reproduce the problem is only support to 5.4 LTS now.
> > > And port it to the latest mainline kernel have much more work to do, can I just
> > > keep this commit log? Or just remove the example I posted in the commit messsage?
> > >
> > > >
> > > > > >
> > > > > > > Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
> > > > > > > ---
> > > > > > >  scripts/recordmcount.h | 4 ++++
> > > > > > >  1 file changed, 4 insertions(+)
> > > > > > >
> > > > > > > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
> > > > > > > index f9b19524da11..9b69167fb7ff 100644
> > > > > > > --- a/scripts/recordmcount.h
> > > > > > > +++ b/scripts/recordmcount.h
> > > > > > > @@ -526,6 +526,10 @@ static int find_secsym_ndx(unsigned const txtndx,
> > > > > > >         for (symp = sym0, t = nsym; t; --t, ++symp) {
> > > > > > >                 unsigned int const st_bind = ELF_ST_BIND(symp->st_info);
> > > > > > >
> > > > > > > +               /* avoid absolute symbols */
> > > > > > > +               if (symp->st_shndx == SHN_ABS)
> > > > > > > +                       continue;
> > > > > > > +
> > > > > > >                 if (txtndx == get_symindex(symp, symtab, symtab_shndx)
> > > > > > >                         /* avoid STB_WEAK */
> > > > > > >                     && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) {
> > >
Peter Zijlstra June 7, 2021, 11:44 a.m. UTC | #8
On Mon, Jun 07, 2021 at 11:50:40AM +0200, Ard Biesheuvel wrote:
> On Mon, 7 Jun 2021 at 10:06, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> >
> > > > On Mon, 7 Jun 2021 at 08:59, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> > > > >
> > > > > > > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> > > > > > >
> > > > > > > Avoid using ABS symbol, which won't be relocate, as reference.
> > > > > > >
> > > > > > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1).
> > > > > > >
> > > > > > > Section Headers:
> > > > > > > [Nr]    Name                         Type      Address          Off      Size   ES  Flg Lk     Inf    Al
> > > > > > > [65521] .text.n_tty_receive_buf      PROGBITS  0000000000000000 3cdab520 000054 00  AX  0      0      4
> > > > > > > [65522] .rela.text.n_tty_receive_buf RELA      0000000000000000 3cdab578 000030 18  I   152076 65521  8
> > > > > > >
> > > > > >
> > > > > > A RELA section's r_info field points to the section to which it
> > > > > > applies. This is why in the example above section #65522 points to
> > > > > > section #65521. This has nothing to do with the numerical value of
> > > > > > SHN_ABS.
> > > > >
> > > > > If the r_info of RELA section is 65521(0xfff1),
> > >
> > > Oh sorry, I mean sh_info here.
> > >
> > > > > find_secsym_ndx() will use it to find the base symbol.
> > > > >
> > > >
> > > > But what does that have to do with the sh_info field of the RELA
> > > > section's Elf_Shdr struct? IOW, what is the relevance of section
> > > > #65521 here?
> > > >
> > >
> > > So what I mean is the problem occur if the sh_info of a RELA section
> > > is #65521.
> >
> > Actually the problem occur if the sh_info of a RELA section is in
> > the special section index range(SHN_LORESERVE ~ SHN_HIRESERVE).
> > Maybe I should add a is_shndx_special() to check this like
> > scripts/mod/modpost.h did?
> >
> 
> So if I understand all of this correctly, we are running into a
> fundamental issue here, where the linker emits more sections than the
> sh_info field can describe, overflowing into the reserved range.
> 
> I don't think papering over it like this is going to be maintainable
> going forward.

There's an extended section header index section for just that. And
recordmcount actually seems to use that as well.

I can't seem to find enough of the thread to figure out what the actual
problem is though. The lore archive doesn't have anything prior to this
message.

One should only use st_shndx when >SHN_UDEF and <SHN_LORESERVE. When
SHN_XINDEX, then use .symtab_shndx.

Apparently you've found a case where neither is true? In that case
objtool seems to use shndx 0. A matching recordmcount patch would be
something like this.


diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
index f9b19524da11..d99cc0aed6fe 100644
--- a/scripts/recordmcount.h
+++ b/scripts/recordmcount.h
@@ -194,13 +194,18 @@ static unsigned int get_symindex(Elf_Sym const *sym, Elf32_Word const *symtab,
 	unsigned long offset;
 	int index;
 
-	if (sym->st_shndx != SHN_XINDEX)
+	if (sym->st_shndx > SHN_UDEF &&
+	    sym->st_shndx < SHN_LORESERVE)
 		return w2(sym->st_shndx);
 
-	offset = (unsigned long)sym - (unsigned long)symtab;
-	index = offset / sizeof(*sym);
+	if (sym->st_shndx == SHN_XINDEX) {
+		offset = (unsigned long)sym - (unsigned long)symtab;
+		index = offset / sizeof(*sym);
 
-	return w(symtab_shndx[index]);
+		return w(symtab_shndx[index]);
+	}
+
+	return 0;
 }
 
 static unsigned int get_shnum(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0)
Mark-PK Tsai June 7, 2021, 1:18 p.m. UTC | #9
> On Mon, Jun 07, 2021 at 11:50:40AM +0200, Ard Biesheuvel wrote:
> > On Mon, 7 Jun 2021 at 10:06, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> > >
> > > > > On Mon, 7 Jun 2021 at 08:59, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> > > > > >
> > > > > > > > On Mon, 7 Jun 2021 at 04:42, Mark-PK Tsai <mark-pk.tsai@mediatek.com> wrote:
> > > > > > > >
> > > > > > > > Avoid using ABS symbol, which won't be relocate, as reference.
> > > > > > > >
> > > > > > > > On arm64 platform, if there's shndx equals SHN_ABS(0xfff1).
> > > > > > > >
> > > > > > > > Section Headers:
> > > > > > > > [Nr]    Name                         Type      Address          Off      Size   ES  Flg Lk     Inf    Al
> > > > > > > > [65521] .text.n_tty_receive_buf      PROGBITS  0000000000000000 3cdab520 000054 00  AX  0      0      4
> > > > > > > > [65522] .rela.text.n_tty_receive_buf RELA      0000000000000000 3cdab578 000030 18  I   152076 65521  8
> > > > > > > >
> > > > > > >
> > > > > > > A RELA section's r_info field points to the section to which it
> > > > > > > applies. This is why in the example above section #65522 points to
> > > > > > > section #65521. This has nothing to do with the numerical value of
> > > > > > > SHN_ABS.
> > > > > >
> > > > > > If the r_info of RELA section is 65521(0xfff1),
> > > >
> > > > Oh sorry, I mean sh_info here.
> > > >
> > > > > > find_secsym_ndx() will use it to find the base symbol.
> > > > > >
> > > > >
> > > > > But what does that have to do with the sh_info field of the RELA
> > > > > section's Elf_Shdr struct? IOW, what is the relevance of section
> > > > > #65521 here?
> > > > >
> > > >
> > > > So what I mean is the problem occur if the sh_info of a RELA section
> > > > is #65521.
> > >
> > > Actually the problem occur if the sh_info of a RELA section is in
> > > the special section index range(SHN_LORESERVE ~ SHN_HIRESERVE).
> > > Maybe I should add a is_shndx_special() to check this like
> > > scripts/mod/modpost.h did?
> > >
> >
> > So if I understand all of this correctly, we are running into a
> > fundamental issue here, where the linker emits more sections than the
> > sh_info field can describe, overflowing into the reserved range.
> >
> > I don't think papering over it like this is going to be maintainable
> > going forward.
>
> There's an extended section header index section for just that. And
> recordmcount actually seems to use that as well.
>
> I can't seem to find enough of the thread to figure out what the actual
> problem is though. The lore archive doesn't have anything prior to this
> message.
>
> One should only use st_shndx when >SHN_UDEF and <SHN_LORESERVE. When
> SHN_XINDEX, then use .symtab_shndx.
>
> Apparently you've found a case where neither is true? In that case

Yes, that's what my mean.
get_symindex returns st_shndx directly even if st_shndx is in the reserve range.
So either do not use get_symindex for those symbols or do extra handling
for it like the patch you provide will solve the problem.

> objtool seems to use shndx 0. A matching recordmcount patch would be
> something like this.
>
>
> diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
> index f9b19524da11..d99cc0aed6fe 100644
> --- a/scripts/recordmcount.h
> +++ b/scripts/recordmcount.h
> @@ -194,13 +194,18 @@ static unsigned int get_symindex(Elf_Sym const *sym, Elf32_Word const *symtab,
>  	unsigned long offset;
>  	int index;
>
> -	if (sym->st_shndx != SHN_XINDEX)
> +	if (sym->st_shndx > SHN_UDEF &&
> +	    sym->st_shndx < SHN_LORESERVE)
>  		return w2(sym->st_shndx);
>
> -	offset = (unsigned long)sym - (unsigned long)symtab;
> -	index = offset / sizeof(*sym);
> +	if (sym->st_shndx == SHN_XINDEX) {
> +		offset = (unsigned long)sym - (unsigned long)symtab;
> +		index = offset / sizeof(*sym);
>
> -	return w(symtab_shndx[index]);
> +		return w(symtab_shndx[index]);
> +	}
> +
> +	return 0;
>  }
>
>  static unsigned int get_shnum(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0)

Thanks for the suggestion.
Skip all the symbols in the special sections seems fine because
those sections should not be processed here.
Steven Rostedt June 7, 2021, 1:44 p.m. UTC | #10
On Mon, 7 Jun 2021 13:44:21 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> There's an extended section header index section for just that. And
> recordmcount actually seems to use that as well.
> 
> I can't seem to find enough of the thread to figure out what the actual
> problem is though. The lore archive doesn't have anything prior to this
> message.
> 
> One should only use st_shndx when >SHN_UDEF and <SHN_LORESERVE. When
> SHN_XINDEX, then use .symtab_shndx.
> 
> Apparently you've found a case where neither is true? In that case
> objtool seems to use shndx 0. A matching recordmcount patch would be
> something like this.

Mark-PK,

Does the below patch fix it for you too (if you backport it to your
kernel). I much rather have recordmcount match objtool, as one day the
two will hopefully merge to one executable.

-- Steve


> 
> 
> diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
> index f9b19524da11..d99cc0aed6fe 100644
> --- a/scripts/recordmcount.h
> +++ b/scripts/recordmcount.h
> @@ -194,13 +194,18 @@ static unsigned int get_symindex(Elf_Sym const *sym, Elf32_Word const *symtab,
>  	unsigned long offset;
>  	int index;
>  
> -	if (sym->st_shndx != SHN_XINDEX)
> +	if (sym->st_shndx > SHN_UDEF &&
> +	    sym->st_shndx < SHN_LORESERVE)
>  		return w2(sym->st_shndx);
>  
> -	offset = (unsigned long)sym - (unsigned long)symtab;
> -	index = offset / sizeof(*sym);
> +	if (sym->st_shndx == SHN_XINDEX) {
> +		offset = (unsigned long)sym - (unsigned long)symtab;
> +		index = offset / sizeof(*sym);
>  
> -	return w(symtab_shndx[index]);
> +		return w(symtab_shndx[index]);
> +	}
> +
> +	return 0;
>  }
>  
>  static unsigned int get_shnum(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0)
Peter Zijlstra June 7, 2021, 3:40 p.m. UTC | #11
On Mon, Jun 07, 2021 at 01:44:21PM +0200, Peter Zijlstra wrote:
> One should only use st_shndx when >SHN_UDEF and <SHN_LORESERVE. When
> SHN_XINDEX, then use .symtab_shndx.
> 
> Apparently you've found a case where neither is true? In that case
> objtool seems to use shndx 0. A matching recordmcount patch would be
> something like this.
> 

Apparently I'm consistently bad at spelling SHM_UNDEF today..

> diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
> index f9b19524da11..d99cc0aed6fe 100644
> --- a/scripts/recordmcount.h
> +++ b/scripts/recordmcount.h
> @@ -194,13 +194,18 @@ static unsigned int get_symindex(Elf_Sym const *sym, Elf32_Word const *symtab,
>  	unsigned long offset;
>  	int index;
>  
> -	if (sym->st_shndx != SHN_XINDEX)
> +	if (sym->st_shndx > SHN_UDEF &&
> +	    sym->st_shndx < SHN_LORESERVE)
>  		return w2(sym->st_shndx);
>  
> -	offset = (unsigned long)sym - (unsigned long)symtab;
> -	index = offset / sizeof(*sym);
> +	if (sym->st_shndx == SHN_XINDEX) {
> +		offset = (unsigned long)sym - (unsigned long)symtab;
> +		index = offset / sizeof(*sym);
>  
> -	return w(symtab_shndx[index]);
> +		return w(symtab_shndx[index]);
> +	}
> +
> +	return 0;
>  }
>  
>  static unsigned int get_shnum(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0)
Mark-PK Tsai June 7, 2021, 5:31 p.m. UTC | #12
> On Mon, 7 Jun 2021 13:44:21 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > There's an extended section header index section for just that. And
> > recordmcount actually seems to use that as well.
> > 
> > I can't seem to find enough of the thread to figure out what the actual
> > problem is though. The lore archive doesn't have anything prior to this
> > message.
> > 
> > One should only use st_shndx when >SHN_UDEF and <SHN_LORESERVE. When
> > SHN_XINDEX, then use .symtab_shndx.
> > 
> > Apparently you've found a case where neither is true? In that case
> > objtool seems to use shndx 0. A matching recordmcount patch would be
> > something like this.

Hi Peter,

Should I resend the below patch as v2?

> 
> Mark-PK,
> 
> Does the below patch fix it for you too (if you backport it to your
> kernel). I much rather have recordmcount match objtool, as one day the
> two will hopefully merge to one executable.
> 
> -- Steve
> 

Yes, this patch fix it.

> 
> > 
> > 
> > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
> > index f9b19524da11..d99cc0aed6fe 100644
> > --- a/scripts/recordmcount.h
> > +++ b/scripts/recordmcount.h
> > @@ -194,13 +194,18 @@ static unsigned int get_symindex(Elf_Sym const *sym, Elf32_Word const *symtab,
> >  	unsigned long offset;
> >  	int index;
> >  
> > -	if (sym->st_shndx != SHN_XINDEX)
> > +	if (sym->st_shndx > SHN_UDEF &&
> > +	    sym->st_shndx < SHN_LORESERVE)
> >  		return w2(sym->st_shndx);
> >  
> > -	offset = (unsigned long)sym - (unsigned long)symtab;
> > -	index = offset / sizeof(*sym);
> > +	if (sym->st_shndx == SHN_XINDEX) {
> > +		offset = (unsigned long)sym - (unsigned long)symtab;
> > +		index = offset / sizeof(*sym);
> >  
> > -	return w(symtab_shndx[index]);
> > +		return w(symtab_shndx[index]);
> > +	}
> > +
> > +	return 0;
> >  }
> >  
> >  static unsigned int get_shnum(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0)
Mark-PK Tsai June 8, 2021, 1:15 a.m. UTC | #13
> On Mon, Jun 07, 2021 at 01:44:21PM +0200, Peter Zijlstra wrote:
> > One should only use st_shndx when >SHN_UDEF and <SHN_LORESERVE. When
> > SHN_XINDEX, then use .symtab_shndx.
> > 
> > Apparently you've found a case where neither is true? In that case
> > objtool seems to use shndx 0. A matching recordmcount patch would be
> > something like this.
> > 
> 
> Apparently I'm consistently bad at spelling SHM_UNDEF today..

I test the below patch and it work for me.
I only correct the UNDEF typo without any other modification.

Could I push this patch or you will push it?
I guess I have to add your signed-off-by.

> 
> > diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
> > index f9b19524da11..d99cc0aed6fe 100644
> > --- a/scripts/recordmcount.h
> > +++ b/scripts/recordmcount.h
> > @@ -194,13 +194,18 @@ static unsigned int get_symindex(Elf_Sym const *sym, Elf32_Word const *symtab,
> >  	unsigned long offset;
> >  	int index;
> >  
> > -	if (sym->st_shndx != SHN_XINDEX)
> > +	if (sym->st_shndx > SHN_UDEF &&
> > +	    sym->st_shndx < SHN_LORESERVE)
> >  		return w2(sym->st_shndx);
> >  
> > -	offset = (unsigned long)sym - (unsigned long)symtab;
> > -	index = offset / sizeof(*sym);
> > +	if (sym->st_shndx == SHN_XINDEX) {
> > +		offset = (unsigned long)sym - (unsigned long)symtab;
> > +		index = offset / sizeof(*sym);
> >  
> > -	return w(symtab_shndx[index]);
> > +		return w(symtab_shndx[index]);
> > +	}
> > +
> > +	return 0;
> >  }
> >  
> >  static unsigned int get_shnum(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0)
diff mbox series

Patch

diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
index f9b19524da11..9b69167fb7ff 100644
--- a/scripts/recordmcount.h
+++ b/scripts/recordmcount.h
@@ -526,6 +526,10 @@  static int find_secsym_ndx(unsigned const txtndx,
 	for (symp = sym0, t = nsym; t; --t, ++symp) {
 		unsigned int const st_bind = ELF_ST_BIND(symp->st_info);
 
+		/* avoid absolute symbols */
+		if (symp->st_shndx == SHN_ABS)
+			continue;
+
 		if (txtndx == get_symindex(symp, symtab, symtab_shndx)
 			/* avoid STB_WEAK */
 		    && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) {