diff mbox series

[v9,09/27] rust: add `compiler_builtins` crate

Message ID 20220805154231.31257-10-ojeda@kernel.org (mailing list archive)
State New, archived
Headers show
Series Rust support | expand

Commit Message

Miguel Ojeda Aug. 5, 2022, 3:41 p.m. UTC
Rust provides `compiler_builtins` as a port of LLVM's `compiler-rt`.
Since we do not need the vast majority of them, we avoid the
dependency by providing our own crate.

Co-developed-by: Alex Gaynor <alex.gaynor@gmail.com>
Signed-off-by: Alex Gaynor <alex.gaynor@gmail.com>
Co-developed-by: Wedson Almeida Filho <wedsonaf@google.com>
Signed-off-by: Wedson Almeida Filho <wedsonaf@google.com>
Co-developed-by: Sven Van Asbroeck <thesven73@gmail.com>
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Co-developed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
---
 rust/compiler_builtins.rs | 63 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)
 create mode 100644 rust/compiler_builtins.rs

Comments

Kees Cook Aug. 17, 2022, 8:08 p.m. UTC | #1
On Fri, Aug 05, 2022 at 05:41:54PM +0200, Miguel Ojeda wrote:
> Rust provides `compiler_builtins` as a port of LLVM's `compiler-rt`.
> Since we do not need the vast majority of them, we avoid the
> dependency by providing our own crate.
> 
> Co-developed-by: Alex Gaynor <alex.gaynor@gmail.com>
> Signed-off-by: Alex Gaynor <alex.gaynor@gmail.com>

Reviewed-by: Kees Cook <keescook@chromium.org>
Nick Desaulniers Aug. 22, 2022, 11:55 p.m. UTC | #2
On Fri, Aug 5, 2022 at 8:44 AM Miguel Ojeda <ojeda@kernel.org> wrote:
>
> Rust provides `compiler_builtins` as a port of LLVM's `compiler-rt`.
> Since we do not need the vast majority of them, we avoid the
> dependency by providing our own crate.
>
> Co-developed-by: Alex Gaynor <alex.gaynor@gmail.com>
> Signed-off-by: Alex Gaynor <alex.gaynor@gmail.com>
> Co-developed-by: Wedson Almeida Filho <wedsonaf@google.com>
> Signed-off-by: Wedson Almeida Filho <wedsonaf@google.com>
> Co-developed-by: Sven Van Asbroeck <thesven73@gmail.com>
> Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
> Co-developed-by: Gary Guo <gary@garyguo.net>
> Signed-off-by: Gary Guo <gary@garyguo.net>
> Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

I still really really do not like this patch. Thoughts below.

> ---
>  rust/compiler_builtins.rs | 63 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 63 insertions(+)
>  create mode 100644 rust/compiler_builtins.rs
>
> diff --git a/rust/compiler_builtins.rs b/rust/compiler_builtins.rs
> new file mode 100644
> index 000000000000..f8f39a3e6855
> --- /dev/null
> +++ b/rust/compiler_builtins.rs
> @@ -0,0 +1,63 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Our own `compiler_builtins`.
> +//!
> +//! Rust provides [`compiler_builtins`] as a port of LLVM's [`compiler-rt`].
> +//! Since we do not need the vast majority of them, we avoid the dependency
> +//! by providing this file.
> +//!
> +//! At the moment, some builtins are required that should not be. For instance,
> +//! [`core`] has 128-bit integers functionality which we should not be compiling

It's not just 128b, these are for double word sizes, so for 32b
targets (I recall an earlier version of series supporting armv6) these
symbols are for 64b types.

> +//! in. We will work with upstream [`core`] to provide feature flags to disable
> +//! the parts we do not need. For the moment, we define them to [`panic!`] at

Where is the bug filed upstream (or in your issue tracker)?

> +//! runtime for simplicity to catch mistakes, instead of performing surgery
> +//! on `core.o`.

Wait, so Kbuild performs surgery on alloc, but core is off limits? bah

Core is huge!
$ du -h rust/core.o
1.3M    rust/core.o

> +//!
> +//! In any case, all these symbols are weakened to ensure we do not override
> +//! those that may be provided by the rest of the kernel.
> +//!
> +//! [`compiler_builtins`]: https://github.com/rust-lang/compiler-builtins
> +//! [`compiler-rt`]: https://compiler-rt.llvm.org/
> +
> +#![feature(compiler_builtins)]
> +#![compiler_builtins]
> +#![no_builtins]
> +#![no_std]
> +
> +macro_rules! define_panicking_intrinsics(
> +    ($reason: tt, { $($ident: ident, )* }) => {
> +        $(
> +            #[doc(hidden)]
> +            #[no_mangle]
> +            pub extern "C" fn $ident() {
> +                panic!($reason);
> +            }
> +        )*
> +    }
> +);
> +
> +define_panicking_intrinsics!("`f32` should not be used", {
> +    __eqsf2,

I don't see ^ this symbol in core.

$ llvm-nm rust/core.o | grep "U __" | sort | tr -s " "
 U __gesf2
 U __lesf2
 U __udivti3
 U __umodti3
 U __unorddf2
 U __unordsf2
 U __x86_indirect_thunk_r11

so a few of these seem extraneous. Are you sure they're coming from
core? (or was it the different arch support that was removed from v8
to facilitate v9?)

__gesf2, __lesf2, and __unordsf2 all seem to be coming from
<<f32>::classify> and <<f32>::to_bits::ct_f32_to_u32>.  Surely we can
avoid more f32 support by relying on objcopy
-N/--strip-symbol=/--strip-symbols=filename to slice out/drop symbols
from core.o?

<core::fmt::num::exp_u128> uses __umodti3+__udivti3

__unorddf2 is referenced by <<f64>::classify> and
<<f64>::to_bits::ct_f64_to_u64>.

Can we slice those 5 symbols out from core, or are they further
referenced from within core?  If we can, we can drop this patch
outright, and avoid a false-negative when C code generates references
to these symbols from the compiler runtime, for architectures where
don't want to provide a full compiler runtime.  (--rtlib=none is a
pipe dream; first we need to get to the point where we're not
-ffreestanding for all targets...)

I suspect someone can iteratively whittle down core to avoid these symbols?

```
diff --git a/rust/Makefile b/rust/Makefile
index 7700d3853404..e64a82c21156 100644
--- a/rust/Makefile
+++ b/rust/Makefile
@@ -359,6 +359,7 @@ $(obj)/core.o: $(RUST_LIB_SRC)/core/src/lib.rs
$(obj)/target.json FORCE
 $(obj)/compiler_builtins.o: private rustc_objcopy = -w -W '__*'
 $(obj)/compiler_builtins.o: $(src)/compiler_builtins.rs $(obj)/core.o FORCE
        $(call if_changed_dep,rustc_library)
+       @$(OBJCOPY) --strip-symbols=$(obj)/strip-symbols.txt $(obj)/core.o

 $(obj)/alloc.o: private skip_clippy = 1
 $(obj)/alloc.o: private skip_flags = -Dunreachable_pub
```
then define symbols in rust/strip-symbols.txt?

I'm getting the sense that no, there are many functions like
<usize as core::fmt::UpperExp>::fmt
<i64 as core::fmt::UpperExp>::fmt
<u64 as core::fmt::UpperExp>::fmt

that call many of these functions...so if you call format! on an
i64/u64/usize you'll potentially panic the kernel? Seems kinda
dangerous.

I'm also seeing disassemblers having trouble completely disassembling
core.o ... the more I look to see who's calling what, the more
instances I find...

> +    __gesf2,
> +    __lesf2,
> +    __nesf2,
> +    __unordsf2,
> +});
> +
> +define_panicking_intrinsics!("`f64` should not be used", {
> +    __unorddf2,
> +});
> +
> +define_panicking_intrinsics!("`i128` should not be used", {
> +    __ashrti3,
> +    __muloti4,
> +    __multi3,
> +});
> +
> +define_panicking_intrinsics!("`u128` should not be used", {
> +    __ashlti3,
> +    __lshrti3,
> +    __udivmodti4,
> +    __udivti3,
> +    __umodti3,
> +});
> --
> 2.37.1
>
Nick Desaulniers Aug. 24, 2022, 6:38 p.m. UTC | #3
On Mon, Aug 22, 2022 at 4:55 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Fri, Aug 5, 2022 at 8:44 AM Miguel Ojeda <ojeda@kernel.org> wrote:
> >
> > Rust provides `compiler_builtins` as a port of LLVM's `compiler-rt`.
> > Since we do not need the vast majority of them, we avoid the
> > dependency by providing our own crate.
> >
> > Co-developed-by: Alex Gaynor <alex.gaynor@gmail.com>
> > Signed-off-by: Alex Gaynor <alex.gaynor@gmail.com>
> > Co-developed-by: Wedson Almeida Filho <wedsonaf@google.com>
> > Signed-off-by: Wedson Almeida Filho <wedsonaf@google.com>
> > Co-developed-by: Sven Van Asbroeck <thesven73@gmail.com>
> > Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
> > Co-developed-by: Gary Guo <gary@garyguo.net>
> > Signed-off-by: Gary Guo <gary@garyguo.net>
> > Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
>
> I still really really do not like this patch. Thoughts below.
>
> > ---
> >  rust/compiler_builtins.rs | 63 +++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 63 insertions(+)
> >  create mode 100644 rust/compiler_builtins.rs
> >
> > diff --git a/rust/compiler_builtins.rs b/rust/compiler_builtins.rs
> > new file mode 100644
> > index 000000000000..f8f39a3e6855
> > --- /dev/null
> > +++ b/rust/compiler_builtins.rs
> > @@ -0,0 +1,63 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +//! Our own `compiler_builtins`.
> > +//!
> > +//! Rust provides [`compiler_builtins`] as a port of LLVM's [`compiler-rt`].
> > +//! Since we do not need the vast majority of them, we avoid the dependency
> > +//! by providing this file.
> > +//!
> > +//! At the moment, some builtins are required that should not be. For instance,
> > +//! [`core`] has 128-bit integers functionality which we should not be compiling
>
> It's not just 128b, these are for double word sizes, so for 32b
> targets (I recall an earlier version of series supporting armv6) these
> symbols are for 64b types.
>
> > +//! in. We will work with upstream [`core`] to provide feature flags to disable
> > +//! the parts we do not need. For the moment, we define them to [`panic!`] at
> > +//! runtime for simplicity to catch mistakes, instead of performing surgery
> > +//! on `core.o`.
>
> Wait, so Kbuild performs surgery on alloc, but core is off limits? bah
>
> > +//!
> > +//! In any case, all these symbols are weakened to ensure we do not override
> > +//! those that may be provided by the rest of the kernel.
> > +//!
> > +//! [`compiler_builtins`]: https://github.com/rust-lang/compiler-builtins
> > +//! [`compiler-rt`]: https://compiler-rt.llvm.org/
> > +
> > +#![feature(compiler_builtins)]
> > +#![compiler_builtins]
> > +#![no_builtins]
> > +#![no_std]
> > +
> > +macro_rules! define_panicking_intrinsics(
> > +    ($reason: tt, { $($ident: ident, )* }) => {
> > +        $(
> > +            #[doc(hidden)]
> > +            #[no_mangle]
> > +            pub extern "C" fn $ident() {
> > +                panic!($reason);
> > +            }
> > +        )*
> > +    }
> > +);
> > +
> > +define_panicking_intrinsics!("`f32` should not be used", {
> > +    __eqsf2,
>
> I don't see ^ this symbol in core.
>
> $ llvm-nm rust/core.o | grep "U __" | sort | tr -s " "
>  U __gesf2
>  U __lesf2
>  U __udivti3
>  U __umodti3
>  U __unorddf2
>  U __unordsf2
>  U __x86_indirect_thunk_r11
>
> so a few of these seem extraneous. Are you sure they're coming from
> core? (or was it the different arch support that was removed from v8
> to facilitate v9?)
>
> __gesf2, __lesf2, and __unordsf2 all seem to be coming from
> <<f32>::classify> and <<f32>::to_bits::ct_f32_to_u32>.  Surely we can
> avoid more f32 support by relying on objcopy
> -N/--strip-symbol=/--strip-symbols=filename to slice out/drop symbols
> from core.o?
>
> <core::fmt::num::exp_u128> uses __umodti3+__udivti3
>
> __unorddf2 is referenced by <<f64>::classify> and
> <<f64>::to_bits::ct_f64_to_u64>.
>
> Can we slice those 5 symbols out from core, or are they further
> referenced from within core?  If we can, we can drop this patch
> outright, and avoid a false-negative when C code generates references
> to these symbols from the compiler runtime, for architectures where
> don't want to provide a full compiler runtime.  (--rtlib=none is a
> pipe dream; first we need to get to the point where we're not
> -ffreestanding for all targets...)
>
> I suspect someone can iteratively whittle down core to avoid these symbols?
>
> ```
> diff --git a/rust/Makefile b/rust/Makefile
> index 7700d3853404..e64a82c21156 100644
> --- a/rust/Makefile
> +++ b/rust/Makefile
> @@ -359,6 +359,7 @@ $(obj)/core.o: $(RUST_LIB_SRC)/core/src/lib.rs
> $(obj)/target.json FORCE
>  $(obj)/compiler_builtins.o: private rustc_objcopy = -w -W '__*'
>  $(obj)/compiler_builtins.o: $(src)/compiler_builtins.rs $(obj)/core.o FORCE
>         $(call if_changed_dep,rustc_library)
> +       @$(OBJCOPY) --strip-symbols=$(obj)/strip-symbols.txt $(obj)/core.o
>
>  $(obj)/alloc.o: private skip_clippy = 1
>  $(obj)/alloc.o: private skip_flags = -Dunreachable_pub
> ```
> then define symbols in rust/strip-symbols.txt?
>
> I'm getting the sense that no, there are many functions like
> <usize as core::fmt::UpperExp>::fmt
> <i64 as core::fmt::UpperExp>::fmt
> <u64 as core::fmt::UpperExp>::fmt

Another idea I had, and Ard mentioned to me this morning that efistub
does something similar:

I think objcopy can rename symbol references.  So instead of calling
foo, I think you can use objcopy to call bar instead.

To avoid the case of Rust core refering to symbols typically provided
by the --rtlib={libgcc|compiler-rt}, what if we:
1. build core.o (with reference to the unfavorable symbols)
2. use objcopy to rename references in core.o to the symbols listed in
this file; maybe prepend them with `rust_core_` or something.
3. provide this crate with panic'ing definitions, but for the renamed symbols.

That way, enabling CONFIG_RUST=y won't hide the case where C code is
triggering such libcalls from the C compiler?

To restate the concern, I don't want CONFIG_RUST=y to hide that fact
that someone wrote `float x, y; if (x != y) ...` in C code, where the
C compiler will likely lower that to a libcall to __nesf2 since with
the current approach in this patch, linkage would proceed. The current
behavior is to intentionally fail to link.  I think what I describe
above may work to keep this intended behavior of the kernel's build
system.

Ah, looks like there's objcopy flags:
       --redefine-sym old=new
           Change the name of a symbol old, to new.  This can be
useful when one is trying link
           two things together for which you have no source, and there
are name collisions.

       --redefine-syms=filename
           Apply --redefine-sym to each symbol pair "old new" listed
in the file filename.
           filename is simply a flat file, with one symbol pair per
line.  Line comments may be
           introduced by the hash character.  This option may be given
more than once.

So probably could start with my diff from above, but use
--redefine-syms=filename instead of --strip-symbols=.


>
> > +    __gesf2,
> > +    __lesf2,
> > +    __nesf2,
> > +    __unordsf2,
> > +});
> > +
> > +define_panicking_intrinsics!("`f64` should not be used", {
> > +    __unorddf2,
> > +});
> > +
> > +define_panicking_intrinsics!("`i128` should not be used", {
> > +    __ashrti3,
> > +    __muloti4,
> > +    __multi3,
> > +});
> > +
> > +define_panicking_intrinsics!("`u128` should not be used", {
> > +    __ashlti3,
> > +    __lshrti3,
> > +    __udivmodti4,
> > +    __udivti3,
> > +    __umodti3,
> > +});
> > --
> > 2.37.1
> >
>
>
> --
> Thanks,
> ~Nick Desaulniers



--
Thanks,
~Nick Desaulniers
Gary Guo Aug. 29, 2022, 5:11 p.m. UTC | #4
On Wed, 24 Aug 2022 11:38:46 -0700
Nick Desaulniers <ndesaulniers@google.com> wrote:

> Another idea I had, and Ard mentioned to me this morning that efistub
> does something similar:
> 
> I think objcopy can rename symbol references.  So instead of calling
> foo, I think you can use objcopy to call bar instead.
> 
> To avoid the case of Rust core refering to symbols typically provided
> by the --rtlib={libgcc|compiler-rt}, what if we:
> 1. build core.o (with reference to the unfavorable symbols)
> 2. use objcopy to rename references in core.o to the symbols listed in
> this file; maybe prepend them with `rust_core_` or something.
> 3. provide this crate with panic'ing definitions, but for the renamed
> symbols.
> 
> That way, enabling CONFIG_RUST=y won't hide the case where C code is
> triggering such libcalls from the C compiler?
> 
> To restate the concern, I don't want CONFIG_RUST=y to hide that fact
> that someone wrote `float x, y; if (x != y) ...` in C code, where the
> C compiler will likely lower that to a libcall to __nesf2 since with
> the current approach in this patch, linkage would proceed. The current
> behavior is to intentionally fail to link.  I think what I describe
> above may work to keep this intended behavior of the kernel's build
> system.
> 
> Ah, looks like there's objcopy flags:
>        --redefine-sym old=new
>            Change the name of a symbol old, to new.  This can be
> useful when one is trying link
>            two things together for which you have no source, and there
> are name collisions.
> 
>        --redefine-syms=filename
>            Apply --redefine-sym to each symbol pair "old new" listed
> in the file filename.
>            filename is simply a flat file, with one symbol pair per
> line.  Line comments may be
>            introduced by the hash character.  This option may be given
> more than once.
> 
> So probably could start with my diff from above, but use
> --redefine-syms=filename instead of --strip-symbols=.
> 
> --
> Thanks,
> ~Nick Desaulniers

This is the approach we are about to take, see
https://github.com/Rust-for-Linux/linux/pull/779.

It's easy for just one symbol that is known to be not defined, but it
can be more complex as some of these symbols can be defined on a
platform but not another, so we would have to generate a dynamic list
of what symbols to "hijack" depending on the config. Currently we
avoid this issue by having all symbols in compiler_builtins
crate weak, but we couldn't weakly redefine a symbol.

It's certainly doable, though, but it require some effort. I am
not quite available recently so hadn't further extend and polish my
patch.

Best,
Gary
diff mbox series

Patch

diff --git a/rust/compiler_builtins.rs b/rust/compiler_builtins.rs
new file mode 100644
index 000000000000..f8f39a3e6855
--- /dev/null
+++ b/rust/compiler_builtins.rs
@@ -0,0 +1,63 @@ 
+// SPDX-License-Identifier: GPL-2.0
+
+//! Our own `compiler_builtins`.
+//!
+//! Rust provides [`compiler_builtins`] as a port of LLVM's [`compiler-rt`].
+//! Since we do not need the vast majority of them, we avoid the dependency
+//! by providing this file.
+//!
+//! At the moment, some builtins are required that should not be. For instance,
+//! [`core`] has 128-bit integers functionality which we should not be compiling
+//! in. We will work with upstream [`core`] to provide feature flags to disable
+//! the parts we do not need. For the moment, we define them to [`panic!`] at
+//! runtime for simplicity to catch mistakes, instead of performing surgery
+//! on `core.o`.
+//!
+//! In any case, all these symbols are weakened to ensure we do not override
+//! those that may be provided by the rest of the kernel.
+//!
+//! [`compiler_builtins`]: https://github.com/rust-lang/compiler-builtins
+//! [`compiler-rt`]: https://compiler-rt.llvm.org/
+
+#![feature(compiler_builtins)]
+#![compiler_builtins]
+#![no_builtins]
+#![no_std]
+
+macro_rules! define_panicking_intrinsics(
+    ($reason: tt, { $($ident: ident, )* }) => {
+        $(
+            #[doc(hidden)]
+            #[no_mangle]
+            pub extern "C" fn $ident() {
+                panic!($reason);
+            }
+        )*
+    }
+);
+
+define_panicking_intrinsics!("`f32` should not be used", {
+    __eqsf2,
+    __gesf2,
+    __lesf2,
+    __nesf2,
+    __unordsf2,
+});
+
+define_panicking_intrinsics!("`f64` should not be used", {
+    __unorddf2,
+});
+
+define_panicking_intrinsics!("`i128` should not be used", {
+    __ashrti3,
+    __muloti4,
+    __multi3,
+});
+
+define_panicking_intrinsics!("`u128` should not be used", {
+    __ashlti3,
+    __lshrti3,
+    __udivmodti4,
+    __udivti3,
+    __umodti3,
+});