modpost: Ignore relaxation and alignment marker relocs on LoongArch

Message ID 20231227070317.1936234-1-kernel@xen0n.name (mailing list archive)
State New

Commit Message

WANG Xuerui Dec. 27, 2023, 7:03 a.m. UTC
From: WANG Xuerui <git@xen0n.name>

With recent trunk versions of binutils and gcc, alignment directives are
represented with R_LARCH_ALIGN relocs on LoongArch, which is necessary
for the linker to maintain alignment requirements during its relaxation
passes. And even though the kernel is built with relaxation disabled, so
far a small number of R_LARCH_RELAX marker relocs are still emitted as
part of la.* pseudo instructions in assembly. These two kinds of relocs
do not refer to symbols, which can trip up modpost's section mismatch
checks, because the r_offset of said relocs can be zero or any other
meaningless value, eventually leading to a `from == NULL` condition in
default_mismatch_handler and SIGSEGV.

As the two kinds of relocs are not concerned with symbols, just ignore
them for section mismatch check purposes.

Fixes: 3d36f4298ba9 ("LoongArch: Switch to relative exception tables")
Signed-off-by: WANG Xuerui <git@xen0n.name>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Youling Tang <tangyouling@loongson.cn>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: loongarch@lists.linux.dev
---
 scripts/mod/modpost.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

Comments

Xi Ruoyao Dec. 27, 2023, 11:06 a.m. UTC | #1
On Wed, 2023-12-27 at 15:03 +0800, WANG Xuerui wrote:
> And even though the kernel is built with relaxation disabled, so
> far a small number of R_LARCH_RELAX marker relocs are still emitted as
> part of la.* pseudo instructions in assembly.

I'd consider it a toolchain bug...  Is there a reproducer?
Huacai Chen Jan. 4, 2024, 8:57 a.m. UTC | #2
On Wed, Dec 27, 2023 at 7:06 PM Xi Ruoyao <xry111@xry111.site> wrote:
>
> On Wed, 2023-12-27 at 15:03 +0800, WANG Xuerui wrote:
> > And even though the kernel is built with relaxation disabled, so
> > far a small number of R_LARCH_RELAX marker relocs are still emitted as
> > part of la.* pseudo instructions in assembly.
>
> I'd consider it a toolchain bug...  Is there a reproducer?
Any updates? Should I apply this patch for loongarch-next?

Huacai

>
> --
> Xi Ruoyao <xry111@xry111.site>
> School of Aerospace Science and Technology, Xidian University
>
Masahiro Yamada Jan. 4, 2024, 10:59 a.m. UTC | #3
On Thu, Jan 4, 2024 at 5:58 PM Huacai Chen <chenhuacai@kernel.org> wrote:
>
> On Wed, Dec 27, 2023 at 7:06 PM Xi Ruoyao <xry111@xry111.site> wrote:
> >
> > On Wed, 2023-12-27 at 15:03 +0800, WANG Xuerui wrote:
> > > And even though the kernel is built with relaxation disabled, so
> > > far a small number of R_LARCH_RELAX marker relocs are still emitted as
> > > part of la.* pseudo instructions in assembly.
> >
> > I'd consider it a toolchain bug...  Is there a reproducer?
> Any updates? Should I apply this patch for loongarch-next?


This is odd.

At least, Fixes: 3d36f4298ba9
is unrelated.


The instruction to reproduce it was requested.

I did not see any error for defconfig
with loongarch64-linux-gcc 13.2 provided by 0day bot.




--
Best Regards
Masahiro Yamada
Xi Ruoyao Jan. 4, 2024, 4:26 p.m. UTC | #4
On Thu, 2024-01-04 at 16:57 +0800, Huacai Chen wrote:
> On Wed, Dec 27, 2023 at 7:06 PM Xi Ruoyao <xry111@xry111.site> wrote:
> > 
> > On Wed, 2023-12-27 at 15:03 +0800, WANG Xuerui wrote:
> > > And even though the kernel is built with relaxation disabled, so
> > > far a small number of R_LARCH_RELAX marker relocs are still emitted as
> > > part of la.* pseudo instructions in assembly.
> > 
> > I'd consider it a toolchain bug...  Is there a reproducer?
> Any updates? Should I apply this patch for loongarch-next?

Tiezhu told me this should be reproducible with GCC 14 and Binutils-2.42
development snapshots and defconfig.  I'm trying...
Xi Ruoyao Jan. 4, 2024, 4:50 p.m. UTC | #5
On Fri, 2024-01-05 at 00:26 +0800, Xi Ruoyao wrote:
> On Thu, 2024-01-04 at 16:57 +0800, Huacai Chen wrote:
> > On Wed, Dec 27, 2023 at 7:06 PM Xi Ruoyao <xry111@xry111.site> wrote:
> > > 
> > > On Wed, 2023-12-27 at 15:03 +0800, WANG Xuerui wrote:
> > > > And even though the kernel is built with relaxation disabled, so
> > > > far a small number of R_LARCH_RELAX marker relocs are still emitted as
> > > > part of la.* pseudo instructions in assembly.
> > > 
> > > I'd consider it a toolchain bug...  Is there a reproducer?
> > Any updates? Should I apply this patch for loongarch-next?
> 
> Tiezhu told me this should be reproducible with GCC 14 and Binutils-2.42
> development snapshots and defconfig.  I'm trying...

Unfortunately I still cannot reproduce the issue (with GCC 14.0.0
20231208, binutils 2.41.50.20240104, and defconfig).  One possibility is
that this was already fixed somehow; another is that the issue is only
triggered by a GCC snapshot newer than 20231208, but I think that is very
unlikely.

Xuerui: can you find the guilty .o file containing R_LARCH_RELAX and/or
R_LARCH_ALIGN?  Then you can just run "make path/to/something.s"
(replace ".o" with ".s") to produce the assembly file and send it to me
for further investigation.
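
One way to look for the guilty object, besides inspecting `readelf -r` output by hand, is to walk the SHT_RELA sections of each .o and count the two marker types. The following is only a minimal sketch under stated assumptions: `count_marker_relocs` is a hypothetical helper (it is not part of modpost), the object is a 64-bit ELF image already read into memory in the host's byte order, and extended section numbering is ignored. The two reloc numbers are the ones used in the patch.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Fallbacks for older <elf.h> headers; values taken from the patch. */
#ifndef R_LARCH_RELAX
#define R_LARCH_RELAX 100
#endif
#ifndef R_LARCH_ALIGN
#define R_LARCH_ALIGN 102
#endif

/*
 * Hypothetical helper: count R_LARCH_RELAX / R_LARCH_ALIGN entries in all
 * SHT_RELA sections of a 64-bit ELF image held in memory.
 * Assumes host byte order. Returns -1 if the image looks malformed.
 */
static long count_marker_relocs(const unsigned char *img, size_t len)
{
	Elf64_Ehdr eh;
	long count = 0;

	if (len < sizeof(eh))
		return -1;
	memcpy(&eh, img, sizeof(eh));
	if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
	    eh.e_ident[EI_CLASS] != ELFCLASS64 ||
	    eh.e_shentsize != sizeof(Elf64_Shdr))
		return -1;
	/* The whole section header table must lie inside the image. */
	if (eh.e_shoff > len ||
	    (size_t)eh.e_shnum * sizeof(Elf64_Shdr) > len - eh.e_shoff)
		return -1;

	for (size_t i = 0; i < eh.e_shnum; i++) {
		Elf64_Shdr sh;

		memcpy(&sh, img + eh.e_shoff + i * sizeof(sh), sizeof(sh));
		if (sh.sh_type != SHT_RELA)
			continue;
		if (sh.sh_offset > len || sh.sh_size > len - sh.sh_offset)
			return -1;

		/* Visit every complete Elf64_Rela entry in this section. */
		for (size_t off = 0; off + sizeof(Elf64_Rela) <= sh.sh_size;
		     off += sizeof(Elf64_Rela)) {
			Elf64_Rela r;
			unsigned int type;

			memcpy(&r, img + sh.sh_offset + off, sizeof(r));
			type = ELF64_R_TYPE(r.r_info);
			if (type == R_LARCH_RELAX || type == R_LARCH_ALIGN)
				count++;
		}
	}
	return count;
}
```

A non-zero result for a given object file would mark it as a candidate for the "make path/to/something.s" step above.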
Tiezhu Yang Jan. 5, 2024, 8:09 a.m. UTC | #6
On 01/05/2024 12:26 AM, Xi Ruoyao wrote:
> On Thu, 2024-01-04 at 16:57 +0800, Huacai Chen wrote:
>> On Wed, Dec 27, 2023 at 7:06 PM Xi Ruoyao <xry111@xry111.site> wrote:
>>>
>>> On Wed, 2023-12-27 at 15:03 +0800, WANG Xuerui wrote:
>>>> And even though the kernel is built with relaxation disabled, so
>>>> far a small number of R_LARCH_RELAX marker relocs are still emitted as
>>>> part of la.* pseudo instructions in assembly.
>>>
>>> I'd consider it a toolchain bug...  Is there a reproducer?
>> Any updates? Should I apply this patch for loongarch-next?
>
> Tiezhu told me this should be reproducible with GCC 14 and Binutils-2.42
> development snapshots and defconfig.  I'm trying...

1. How to reproduce

I updated to the latest upstream toolchains (20240105):

[fedora@linux 6.7.test]$ gcc --version
gcc (GCC) 14.0.0 20240105 (experimental)
[fedora@linux 6.7.test]$ as --version
GNU assembler (GNU Binutils) 2.41.50.20240105
[fedora@linux 6.7.test]$ ld --version
GNU ld (GNU Binutils) 2.41.50.20240105

and then test it again, here is the failure info:

[fedora@linux 6.7.test]$ git log --oneline | head -1
610a9b8f49fb Linux 6.7-rc8
[fedora@linux 6.7.test]$ make loongson3_defconfig
[fedora@linux 6.7.test]$ make
...
   AR      built-in.a
   AR      vmlinux.a
   LD      vmlinux.o
   OBJCOPY modules.builtin.modinfo
   GEN     modules.builtin
   GEN     .vmlinux.objs
   MODPOST Module.symvers
make[2]: *** [scripts/Makefile.modpost:145: Module.symvers] Error 139
make[1]: *** [/home/fedora/6.7.test/Makefile:1863: modpost] Error 2
make: *** [Makefile:234: __sub-make] Error 2

2. Additional info

I can confirm that a slightly older toolchain snapshot (20231127) does
not show the above failure, so I guess this is related to the toolchain.

3. How to fix

(1) One way is to modify the kernel code: with this kernel patch,
     there is no build failure with the latest upstream toolchains.
(2) The other way is to analyze and fix the binutils code,
     which needs more work.

Thanks,
Tiezhu
Jinyang He Jan. 5, 2024, 10:10 a.m. UTC | #7
On 2024-01-05 16:09, Tiezhu Yang wrote:
>
>
> On 01/05/2024 12:26 AM, Xi Ruoyao wrote:
>> On Thu, 2024-01-04 at 16:57 +0800, Huacai Chen wrote:
>>> On Wed, Dec 27, 2023 at 7:06 PM Xi Ruoyao <xry111@xry111.site> wrote:
>>>>
>>>> On Wed, 2023-12-27 at 15:03 +0800, WANG Xuerui wrote:
>>>>> And even though the kernel is built with relaxation disabled, so
>>>>> far a small number of R_LARCH_RELAX marker relocs are still 
>>>>> emitted as
>>>>> part of la.* pseudo instructions in assembly.
>>>>
>>>> I'd consider it a toolchain bug...  Is there a reproducer?
>>> Any updates? Should I apply this patch for loongarch-next?
>>
>> Tiezhu told me this should be reproducible with GCC 14 and Binutils-2.42
>> development snapshots and defconfig.  I'm trying...
>
> 1. How to reproduce
>
> I update the latest upstream toolchains (20240105):
>
> [fedora@linux 6.7.test]$ gcc --version
> gcc (GCC) 14.0.0 20240105 (experimental)
> [fedora@linux 6.7.test]$ as --version
> GNU assembler (GNU Binutils) 2.41.50.20240105
> [fedora@linux 6.7.test]$ ld --version
> GNU ld (GNU Binutils) 2.41.50.20240105
>
> and then test it again, here is the failure info:
>
> [fedora@linux 6.7.test]$ git log --oneline | head -1
> 610a9b8f49fb Linux 6.7-rc8
> [fedora@linux 6.7.test]$ make loongson3_defconfig
> [fedora@linux 6.7.test]$ make
> ...
>   AR      built-in.a
>   AR      vmlinux.a
>   LD      vmlinux.o
>   OBJCOPY modules.builtin.modinfo
>   GEN     modules.builtin
>   GEN     .vmlinux.objs
>   MODPOST Module.symvers
> make[2]: *** [scripts/Makefile.modpost:145: Module.symvers] Error 139
> make[1]: *** [/home/fedora/6.7.test/Makefile:1863: modpost] Error 2
> make: *** [Makefile:234: __sub-make] Error 2
>
> 2. Additional info
>
> I can confirm that the slightly older version of toolchains (20231127)
> have no the above failure, so I guess this is related with toolchains.
>
> 3. How to fix
>
> (1) One way is to modify the kernel code, with this kernel patch,
>     there is no building failure with the latest upstream toolchains.
> (2) The other way is to analysis and fix the binutils code,
>     it need more work to do.

Hi,

I have an idea about that, but I haven't really looked into it. The improved
R_LARCH_ALIGN (psABI v2.30) requires a symbol index. The symbol is only
created the first time an alignment directive is handled, which means that
all other sections may use this symbol. If the section containing this
symbol is discarded, there may be problems.

Thanks,

Jinyang


>
> Thanks,
> Tiezhu
>
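
The symbol index mentioned above is carried in the reloc's r_info field. As a minimal sketch, assuming the standard 64-bit ELF layout (symbol index in the upper 32 bits, reloc type in the lower 32 bits, as implemented by the `ELF64_R_SYM` and `ELF64_R_TYPE` macros in `<elf.h>`), the hypothetical helper below tells whether a reloc names a symbol at all. It does not establish what the toolchain snapshots in this thread actually emit.

```c
#include <elf.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if a 64-bit r_info value carries a symbol
 * index other than STN_UNDEF (0), i.e. the reloc names a symbol.
 */
static bool reloc_names_symbol(uint64_t r_info)
{
	return ELF64_R_SYM(r_info) != STN_UNDEF;
}
```

For example, `reloc_names_symbol(ELF64_R_INFO(5, 102))` is true (type 102 with symbol index 5), while `reloc_names_symbol(ELF64_R_INFO(0, 102))` is false.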
Huacai Chen Jan. 7, 2024, 2:30 a.m. UTC | #8
Hi, Xuerui,

Could you please send a V2 that just updates the commit message?

Huacai

Patch

diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index cb6406f485a9..a4df47372b95 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -1346,6 +1346,14 @@  static Elf_Addr addend_mips_rel(uint32_t *location, unsigned int r_type)
 #define R_LARCH_SUB32		55
 #endif
 
+#ifndef R_LARCH_RELAX
+#define R_LARCH_RELAX		100
+#endif
+
+#ifndef R_LARCH_ALIGN
+#define R_LARCH_ALIGN		102
+#endif
+
 static void get_rel_type_and_sym(struct elf_info *elf, uint64_t r_info,
 				 unsigned int *r_type, unsigned int *r_sym)
 {
@@ -1400,9 +1408,16 @@  static void section_rela(struct module *mod, struct elf_info *elf,
 				continue;
 			break;
 		case EM_LOONGARCH:
-			if (!strcmp("__ex_table", fromsec) &&
-			    r_type == R_LARCH_SUB32)
+			switch (r_type) {
+			case R_LARCH_SUB32:
+				if (!strcmp("__ex_table", fromsec))
+					continue;
+				break;
+			case R_LARCH_RELAX:
+			case R_LARCH_ALIGN:
+				/* these relocs do not refer to symbols */
 				continue;
+			}
 			break;
 		}
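
The decision made by the restructured EM_LOONGARCH case can be read as a standalone predicate. This is only a sketch for illustration: `skip_loongarch_reloc` is a hypothetical name that does not exist in modpost.c, where the logic stays inline in section_rela() and uses `continue` rather than a return value. The reloc numbers are the ones defined in the patch and its context.

```c
#include <elf.h>
#include <stdbool.h>
#include <string.h>

/* Fallbacks for older <elf.h> headers; values taken from the patch. */
#ifndef R_LARCH_SUB32
#define R_LARCH_SUB32 55
#endif
#ifndef R_LARCH_RELAX
#define R_LARCH_RELAX 100
#endif
#ifndef R_LARCH_ALIGN
#define R_LARCH_ALIGN 102
#endif

/*
 * Hypothetical helper mirroring the patched switch: true if the section
 * mismatch check should skip this LoongArch reloc.
 */
static bool skip_loongarch_reloc(unsigned int r_type, const char *fromsec)
{
	switch (r_type) {
	case R_LARCH_SUB32:
		/* Only skipped when it comes from the exception table. */
		return strcmp("__ex_table", fromsec) == 0;
	case R_LARCH_RELAX:
	case R_LARCH_ALIGN:
		/* Marker relocs: skipped regardless of the source section. */
		return true;
	default:
		return false;
	}
}
```

Compared with the old single `if`, the switch keeps the existing `__ex_table` special case for R_LARCH_SUB32 and adds the two marker types without lengthening the condition.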