diff mbox series

LoongArch: Disable module from accessing external data directly

Message ID 20231108040447.288870-1-wangrui@loongson.cn (mailing list archive)
State New
Series LoongArch: Disable module from accessing external data directly

Commit Message

WANG Rui Nov. 8, 2023, 4:04 a.m. UTC
Modules are loaded too far from vmlinux for PC-relative addressing to
reach its symbols, so external data can only be accessed through the GOT.

When compiling modules with GCC, the option `-mdirect-extern-access` is
disabled by default. The Clang option `-fdirect-access-external-data`
is enabled by default, so it needs to be explicitly disabled.

Signed-off-by: WANG Rui <wangrui@loongson.cn>
---
 arch/loongarch/Makefile | 2 ++
 1 file changed, 2 insertions(+)

Comments

Xi Ruoyao Nov. 8, 2023, 8:30 a.m. UTC | #1
On Wed, 2023-11-08 at 12:04 +0800, WANG Rui wrote:
> When compiling module with GCC, the option `-mdirect-extern-access` is
> disabled by default. The Clang option `-fdirect-access-external-data`
> is enabled by default, so it needs to be explicitly disabled.

I'm wondering why it's enabled by default.

For this simple test case:

extern char **environ;

int main()
{
  __builtin_printf("%10s\n", environ[0]);
}

With Clang 17.0.4 and "clang t1.c -S -O2", it compiles to:

main:
	addi.d	$sp, $sp, -16
	st.d	$ra, $sp, 8
	pcalau12i	$a0, %got_pc_hi20(environ)
	ld.d	$a0, $a0, %got_pc_lo12(environ)
	ld.d	$a0, $a0, 0
	ld.d	$a1, $a0, 0
	pcalau12i	$a0, %pc_hi20(.L.str)
	addi.d	$a0, $a0, %pc_lo12(.L.str)
	bl	%plt(printf)
	move	$a0, $zero
	ld.d	$ra, $sp, 8
	addi.d	$sp, $sp, 16
	ret

So GOT is used for accessing the external variable environ.  With "clang
t1.c -S -O2 -fdirect-access-external-data", we get:

main:
	addi.d	$sp, $sp, -16
	st.d	$ra, $sp, 8
	pcalau12i	$a0, %pc_hi20(environ)
	addi.d	$a0, $a0, %pc_lo12(environ)
	ld.d	$a0, $a0, 0
	ld.d	$a1, $a0, 0
	pcalau12i	$a0, %pc_hi20(.L.str)
	addi.d	$a0, $a0, %pc_lo12(.L.str)
	bl	%plt(printf)
	move	$a0, $zero
	ld.d	$ra, $sp, 8
	addi.d	$sp, $sp, 16
	ret

then the linked binary triggers a SIGBUS.  Ideally this should be
detected by the linker at link time, but currently the BFD linker fails
to detect this error (FWIW this flaw is caused by a really nasty method
for the medium code model implementation).  So from what I see,
-fno-direct-access-external-data is the default.  I also grepped for
-fdirect-access-external-data in the kernel build system but found no
match.

Are you using a different version of Clang, or maybe Clang has some
configuration-time option to make -fdirect-access-external-data the
default?

Note that to translate a TU for a normal (dynamically-linked user-space)
executable on LoongArch Linux, -fdirect-access-external-data should not
be used (because copy relocation is now considered a bad idea and we'll
not support it for a new architecture).  Fangrui?

-fdirect-access-external-data can be used in KBUILD_CFLAGS_KERNEL for
avoiding GOT in the main kernel image, OTOH.

WANG Rui Nov. 8, 2023, 9:20 a.m. UTC | #2
On Wed, Nov 8, 2023 at 4:37 PM Xi Ruoyao <xry111@xry111.site> wrote:
>
> On Wed, 2023-11-08 at 12:04 +0800, WANG Rui wrote:
> > When compiling module with GCC, the option `-mdirect-extern-access` is
> > disabled by default. The Clang option `-fdirect-access-external-data`
> > is enabled by default, so it needs to be explicitly disabled.
>
> I'm wondering why it's enabled by default.
>
> For this simple test case:
>
> extern char **environ;
>
> int main()
> {
>   __builtin_printf("%10s\n", environ[0]);
> }
>
> With Clang 17.0.4 and "clang t1.c -S -O2", it compiles to:
>
> main:
>         addi.d  $sp, $sp, -16
>         st.d    $ra, $sp, 8
>         pcalau12i       $a0, %got_pc_hi20(environ)
>         ld.d    $a0, $a0, %got_pc_lo12(environ)
>         ld.d    $a0, $a0, 0
>         ld.d    $a1, $a0, 0
>         pcalau12i       $a0, %pc_hi20(.L.str)
>         addi.d  $a0, $a0, %pc_lo12(.L.str)
>         bl      %plt(printf)
>         move    $a0, $zero
>         ld.d    $ra, $sp, 8
>         addi.d  $sp, $sp, 16
>         ret
>
> So GOT is used for accessing the external variable environ.  With "clang
> t1.c -S -O2 -fdirect-access-external-data", we get:
>
> main:
>         addi.d  $sp, $sp, -16
>         st.d    $ra, $sp, 8
>         pcalau12i       $a0, %pc_hi20(environ)
>         addi.d  $a0, $a0, %pc_lo12(environ)
>         ld.d    $a0, $a0, 0
>         ld.d    $a1, $a0, 0
>         pcalau12i       $a0, %pc_hi20(.L.str)
>         addi.d  $a0, $a0, %pc_lo12(.L.str)
>         bl      %plt(printf)
>         move    $a0, $zero
>         ld.d    $ra, $sp, 8
>         addi.d  $sp, $sp, 16
>         ret
>
> then the linked binary triggers a SIGBUS.  Ideally this should be
> detected by the linker at link time, but currently the BFD linker fails
> to detect this error (FWIW this flaw is caused by a really nasty method
> for the medium code model implementation).  So to me -fno-direct-access-
> external-data is the default.  I also grepped for -fdirect-access-
> external-data in the kernel building system but I've not found any
> match.
>
> Are you using a different version of Clang, or maybe Clang has some
> configuration-time option to make -fdirect-access-external-data the
> default?

Clang enables `direct-access-external-data` by default in no-PIC and
disables it by default in PIC. This also applies to PIE. [1]

I found that Clang's PIE default differs between environments. For
instance, it is off when cross-compiling and on when compiling
natively.
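
To make the default concrete, here is a minimal C sketch of the rule as I read it from [1]. The struct and function names are hypothetical, made up for illustration, and are not Clang's own; the sketch also assumes at most one of the two explicit flags reaches this point.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant command-line state.
 * These names are illustrative only, not Clang data structures. */
struct dae_flags {
    bool explicit_on;   /* -fdirect-access-external-data was given */
    bool explicit_off;  /* -fno-direct-access-external-data was given */
    int  pic_level;     /* 0 for no-PIC; non-zero for PIC and PIE */
};

/* Returns true when external data would be addressed directly
 * (PC-relative) instead of through the GOT, under the assumed rule:
 * an explicit flag wins, otherwise direct access is on only for no-PIC. */
static bool direct_access_external_data(struct dae_flags f)
{
    /* Assumption: the two explicit flags are mutually exclusive here. */
    assert(!(f.explicit_on && f.explicit_off));
    return f.explicit_on || (!f.explicit_off && f.pic_level == 0);
}
```

With no explicit flag, a no-PIC build (pic_level 0) gets direct access and a PIC or PIE build does not, which would match a `-fno-pie` program reading `environ` without going through the GOT.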

>
> Note that to translate a TU for a normal (dynamically-linked user-space)
> executable on LoongArch Linux, -fdirect-access-external-data should not
> be used (because copy relocation is now considered a bad idea and we'll
> not support it for a new architecture).  Fangrui?
>
> -fdirect-access-external-data can be used in KBUILD_CFLAGS_KERNEL for
> avoiding GOT in the main kernel image, OTOH.

I also saw that vmlinux is already compiled with the `-fno-PIE`
option, which for Clang means `direct-access-external-data` is enabled.

>
> --
> Xi Ruoyao <xry111@xry111.site>
> School of Aerospace Science and Technology, Xidian University
>

[1] https://github.com/llvm/llvm-project/blob/llvmorg-17.0.4/clang/lib/Frontend/CompilerInvocation.cpp#L1654-L1659

Xi Ruoyao Nov. 8, 2023, 9:25 a.m. UTC | #3
On Wed, 2023-11-08 at 17:20 +0800, WANG Rui wrote:
> > then the linked binary triggers a SIGBUS.  Ideally this should be
> > detected by the linker at link time, but currently the BFD linker
> > fails
> > to detect this error (FWIW this flaw is caused by a really nasty
> > method
> > for the medium code model implementation).  So to me -fno-direct-
> > access-
> > external-data is the default.  I also grepped for -fdirect-access-
> > external-data in the kernel building system but I've not found any
> > match.
> > 
> > Are you using a different version of Clang, or maybe Clang has some
> > configuration-time option to make -fdirect-access-external-data the
> > default?
> 
> Clang enables `direct-access-external-data` by default in no-PIC and
> disables it by default in PIC. This also applies to PIE. [1]

Oh sh*t:

xry111@nanmen2 ~ $ clang t1.c -O2 -fno-pie -no-pie
xry111@nanmen2 ~ $ ./a.out
Bus error (core dumped)

I'll consider it a Clang bug then.

WANG Rui Nov. 8, 2023, 9:36 a.m. UTC | #4
On Wed, Nov 8, 2023 at 5:26 PM Xi Ruoyao <xry111@xry111.site> wrote:
>
> On Wed, 2023-11-08 at 17:20 +0800, WANG Rui wrote:
> > > then the linked binary triggers a SIGBUS.  Ideally this should be
> > > detected by the linker at link time, but currently the BFD linker
> > > fails
> > > to detect this error (FWIW this flaw is caused by a really nasty
> > > method
> > > for the medium code model implementation).  So to me -fno-direct-
> > > access-
> > > external-data is the default.  I also grepped for -fdirect-access-
> > > external-data in the kernel building system but I've not found any
> > > match.
> > >
> > > Are you using a different version of Clang, or maybe Clang has some
> > > configuration-time option to make -fdirect-access-external-data the
> > > default?
> >
> > Clang enables `direct-access-external-data` by default in no-PIC and
> > disables it by default in PIC. This also applies to PIE. [1]
>
> Oh sh*t:
>
> xry111@nanmen2 ~ $ clang t1.c -O2 -fno-pie -no-pie
> xry111@nanmen2 ~ $ ./a.out
> Bus error (core dumped)
>
> I'll consider it a Clang bug then.

That's it, no copy relocations. As far as I know, copy relocations
have some issues and Fangrui does not recommend them.

For modules, if distance is not a problem, `no-pic` and
`direct-access-external-data` can be used together because the code is
writable. Does it seem reasonable for that combination to exist?

>
> --
> Xi Ruoyao <xry111@xry111.site>
> School of Aerospace Science and Technology, Xidian University
>

WANG Rui Nov. 8, 2023, 9:39 a.m. UTC | #5
On Wed, Nov 8, 2023 at 5:36 PM WANG Rui <wangrui@loongson.cn> wrote:
>
> On Wed, Nov 8, 2023 at 5:26 PM Xi Ruoyao <xry111@xry111.site> wrote:
> >
> > On Wed, 2023-11-08 at 17:20 +0800, WANG Rui wrote:
> > > > then the linked binary triggers a SIGBUS.  Ideally this should be
> > > > detected by the linker at link time, but currently the BFD linker
> > > > fails
> > > > to detect this error (FWIW this flaw is caused by a really nasty
> > > > method
> > > > for the medium code model implementation).  So to me -fno-direct-
> > > > access-
> > > > external-data is the default.  I also grepped for -fdirect-access-
> > > > external-data in the kernel building system but I've not found any
> > > > match.
> > > >
> > > > Are you using a different version of Clang, or maybe Clang has some
> > > > configuration-time option to make -fdirect-access-external-data the
> > > > default?
> > >
> > > Clang enables `direct-access-external-data` by default in no-PIC and
> > > disables it by default in PIC. This also applies to PIE. [1]
> >
> > Oh sh*t:
> >
> > xry111@nanmen2 ~ $ clang t1.c -O2 -fno-pie -no-pie
> > xry111@nanmen2 ~ $ ./a.out
> > Bus error (core dumped)
> >
> > I'll consider it a Clang bug then.
>
> That's it, no copy relocations. As far as I know, copying relocations
> has some issues and is not recommended by Fangrui.
>
> For modules, if distance is not a problem, `no-pic` and
> `direct-access-external-data` can be together because the code is
> writable. Does it seem reasonable to exist?

Of course, for LoongArch, it is better for `no-pic` to disable
`direct-access-external-data` by default. I will send a patch.

Xi Ruoyao Nov. 8, 2023, 9:46 a.m. UTC | #6
On Wed, 2023-11-08 at 17:36 +0800, WANG Rui wrote:
> > xry111@nanmen2 ~ $ clang t1.c -O2 -fno-pie -no-pie
> > xry111@nanmen2 ~ $ ./a.out
> > Bus error (core dumped)
> > 
> > I'll consider it a Clang bug then.

https://github.com/llvm/llvm-project/issues/71645

> That's it, no copy relocations. As far as I know, copying relocations
> has some issues and is not recommended by Fangrui.
> 
> For modules, if distance is not a problem, `no-pic` and
> `direct-access-external-data` can be together because the code is
> writable. Does it seem reasonable to exist?

It may be usable, but the result is generally worse than relying on GOT.

For example, consider a module referring to two data symbols in vmlinux,
foo and bar.  The symbol foo is referenced 10 times and bar 8 times.

With the current GOT-based approach, the total space usage is (2 GOT
entries * (8 bytes / GOT entry)) + ((10 + 8) * 2 instructions * 4 (bytes /
instruction)) = 160 bytes.

With -fdirect-access-external-data, we must add -mcmodel=extreme too
because the modules are too far away from vmlinux in the kernel address
space, so the total space usage will be (10 + 8) * 5 instructions * 4
(bytes / instruction) = 360 bytes.

One possible approach to resolve the issue is relocating vmlinux from
XKPRANGE to XKVRANGE and fitting vmlinux + all modules into a 2GiB range.
Then the total space usage will be (10 + 8) * 2 instructions * 4 (bytes /
instruction) = 144 bytes.  But I don't know how to implement this, and
running vmlinux in XKVRANGE may have a performance penalty.
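
The three estimates above can be reproduced with a small C sketch. The helper names are hypothetical and the per-access instruction counts (2 for GOT or near PC-relative access, 5 for -mcmodel=extreme) are simply the figures used in this message, not measurements.

```c
#include <assert.h>
#include <stddef.h>

enum {
    GOT_ENTRY_BYTES     = 8, /* one 64-bit GOT slot per distinct symbol */
    INSN_BYTES          = 4, /* fixed-width LoongArch instruction */
    INSNS_GOT_ACCESS    = 2, /* pcalau12i + ld.d */
    INSNS_PCREL_NEAR    = 2, /* pcalau12i + addi.d, target within reach */
    INSNS_PCREL_EXTREME = 5  /* figure quoted above for -mcmodel=extreme */
};

/* Bytes used when every reference goes through the GOT: one GOT entry
 * per symbol plus a two-instruction sequence per reference. */
static size_t got_bytes(size_t nsyms, size_t nrefs)
{
    return nsyms * GOT_ENTRY_BYTES + nrefs * INSNS_GOT_ACCESS * INSN_BYTES;
}

/* Bytes used when every reference is a direct PC-relative sequence of
 * insns_per_ref instructions; no GOT entries are needed. */
static size_t direct_bytes(size_t nrefs, size_t insns_per_ref)
{
    assert(insns_per_ref > 0);
    return nrefs * insns_per_ref * INSN_BYTES;
}
```

For the foo/bar example (2 symbols, 10 + 8 = 18 references), got_bytes(2, 18) gives 160, direct_bytes(18, INSNS_PCREL_EXTREME) gives 360, and direct_bytes(18, INSNS_PCREL_NEAR) gives 144. The GOT cost grows with the number of distinct symbols, while the direct-access cost depends only on the number of references.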

Patch

diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
index b86f2ff31659..9eeb0c05f3f4 100644
--- a/arch/loongarch/Makefile
+++ b/arch/loongarch/Makefile
@@ -68,6 +68,8 @@  LDFLAGS_vmlinux			+= -static -n -nostdlib
 ifdef CONFIG_AS_HAS_EXPLICIT_RELOCS
 cflags-y			+= $(call cc-option,-mexplicit-relocs)
 KBUILD_CFLAGS_KERNEL		+= $(call cc-option,-mdirect-extern-access)
+KBUILD_AFLAGS_MODULE		+= $(call cc-option,-fno-direct-access-external-data)
+KBUILD_CFLAGS_MODULE		+= $(call cc-option,-fno-direct-access-external-data)
 KBUILD_AFLAGS_MODULE		+= $(call cc-option,-mno-relax) $(call cc-option,-Wa$(comma)-mno-relax)
 KBUILD_CFLAGS_MODULE		+= $(call cc-option,-mno-relax) $(call cc-option,-Wa$(comma)-mno-relax)
 else