Message ID | 20220805154231.31257-10-ojeda@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Rust support | expand |
On Fri, Aug 05, 2022 at 05:41:54PM +0200, Miguel Ojeda wrote: > Rust provides `compiler_builtins` as a port of LLVM's `compiler-rt`. > Since we do not need the vast majority of them, we avoid the > dependency by providing our own crate. > > Co-developed-by: Alex Gaynor <alex.gaynor@gmail.com> > Signed-off-by: Alex Gaynor <alex.gaynor@gmail.com> Reviewed-by: Kees Cook <keescook@chromium.org>
On Fri, Aug 5, 2022 at 8:44 AM Miguel Ojeda <ojeda@kernel.org> wrote: > > Rust provides `compiler_builtins` as a port of LLVM's `compiler-rt`. > Since we do not need the vast majority of them, we avoid the > dependency by providing our own crate. > > Co-developed-by: Alex Gaynor <alex.gaynor@gmail.com> > Signed-off-by: Alex Gaynor <alex.gaynor@gmail.com> > Co-developed-by: Wedson Almeida Filho <wedsonaf@google.com> > Signed-off-by: Wedson Almeida Filho <wedsonaf@google.com> > Co-developed-by: Sven Van Asbroeck <thesven73@gmail.com> > Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com> > Co-developed-by: Gary Guo <gary@garyguo.net> > Signed-off-by: Gary Guo <gary@garyguo.net> > Signed-off-by: Miguel Ojeda <ojeda@kernel.org> I still really really do not like this patch. Thoughts below. > --- > rust/compiler_builtins.rs | 63 +++++++++++++++++++++++++++++++++++++++ > 1 file changed, 63 insertions(+) > create mode 100644 rust/compiler_builtins.rs > > diff --git a/rust/compiler_builtins.rs b/rust/compiler_builtins.rs > new file mode 100644 > index 000000000000..f8f39a3e6855 > --- /dev/null > +++ b/rust/compiler_builtins.rs > @@ -0,0 +1,63 @@ > +// SPDX-License-Identifier: GPL-2.0 > + > +//! Our own `compiler_builtins`. > +//! > +//! Rust provides [`compiler_builtins`] as a port of LLVM's [`compiler-rt`]. > +//! Since we do not need the vast majority of them, we avoid the dependency > +//! by providing this file. > +//! > +//! At the moment, some builtins are required that should not be. For instance, > +//! [`core`] has 128-bit integers functionality which we should not be compiling It's not just 128b, these are for double word sizes, so for 32b targets (I recall an earlier version of series supporting armv6) these symbols are for 64b types. > +//! in. We will work with upstream [`core`] to provide feature flags to disable > +//! the parts we do not need. For the moment, we define them to [`panic!`] at Where is the bug filed upstream (or in your issue tracker)? > +//! runtime for simplicity to catch mistakes, instead of performing surgery > +//! on `core.o`. Wait, so Kbuild performs surgery on alloc, but core is off limits? bah Core is huge! $ du -h rust/core.o 1.3M rust/core.o > +//! > +//! In any case, all these symbols are weakened to ensure we do not override > +//! those that may be provided by the rest of the kernel. > +//! > +//! [`compiler_builtins`]: https://github.com/rust-lang/compiler-builtins > +//! [`compiler-rt`]: https://compiler-rt.llvm.org/ > + > +#![feature(compiler_builtins)] > +#![compiler_builtins] > +#![no_builtins] > +#![no_std] > + > +macro_rules! define_panicking_intrinsics( > + ($reason: tt, { $($ident: ident, )* }) => { > + $( > + #[doc(hidden)] > + #[no_mangle] > + pub extern "C" fn $ident() { > + panic!($reason); > + } > + )* > + } > +); > + > +define_panicking_intrinsics!("`f32` should not be used", { > + __eqsf2, I don't see ^ this symbol in core. $ llvm-nm rust/core.o | grep "U __" | sort | tr -s " " U __gesf2 U __lesf2 U __udivti3 U __umodti3 U __unorddf2 U __unordsf2 U __x86_indirect_thunk_r11 so a few of these seem extraneous. Are you sure they're coming from core? (or was it the different arch support that was removed from v8 to facilitate v9?) __gesf2, __lesf2, and __unordsf2 all seem to be coming from <<f32>::classify> and <<f32>::to_bits::ct_f32_to_u32>. Surely we can avoid more f32 support by relying on objcopy -N/--strip-symbol=/--strip-symbols=filename to slice out/drop symbols from core.o? <core::fmt::num::exp_u128> uses __umodti3+__udivti3 __unorddf2 is referenced by <<f64>::classify> and <<f64>::to_bits::ct_f64_to_u64>. Can we slice those 5 symbols out from core, or are they further referenced from within core? If we can, we can drop this patch outright, and avoid a false-negative when C code generates references to these symbols from the compiler runtime, for architectures where don't want to provide a full compiler runtime. (--rtlib=none is a pipe dream; first we need to get to the point where we're not -ffreestanding for all targets...) I suspect someone can iteratively whittle down core to avoid these symbols? ``` diff --git a/rust/Makefile b/rust/Makefile index 7700d3853404..e64a82c21156 100644 --- a/rust/Makefile +++ b/rust/Makefile @@ -359,6 +359,7 @@ $(obj)/core.o: $(RUST_LIB_SRC)/core/src/lib.rs $(obj)/target.json FORCE $(obj)/compiler_builtins.o: private rustc_objcopy = -w -W '__*' $(obj)/compiler_builtins.o: $(src)/compiler_builtins.rs $(obj)/core.o FORCE $(call if_changed_dep,rustc_library) + @$(OBJCOPY) --strip-symbols=$(obj)/strip-symbols.txt $(obj)/core.o $(obj)/alloc.o: private skip_clippy = 1 $(obj)/alloc.o: private skip_flags = -Dunreachable_pub ``` then define symbols in rust/strip-symbols.txt? I'm getting the sense that no, there are many functions like <usize as core::fmt::UpperExp>::fmt <i64 as core::fmt::UpperExp>::fmt <u64 as core::fmt::UpperExp>::fmt that call many of these functions...so if you call format! on an i64/u64/usize you'll potentially panic the kernel? Seems kinda dangerous. I'm also seeing disassemblers having trouble completely disassembling core.o ... the more I look to see who's calling what, the more instances I find... > + __gesf2, > + __lesf2, > + __nesf2, > + __unordsf2, > +}); > + > +define_panicking_intrinsics!("`f64` should not be used", { > + __unorddf2, > +}); > + > +define_panicking_intrinsics!("`i128` should not be used", { > + __ashrti3, > + __muloti4, > + __multi3, > +}); > + > +define_panicking_intrinsics!("`u128` should not be used", { > + __ashlti3, > + __lshrti3, > + __udivmodti4, > + __udivti3, > + __umodti3, > +}); > -- > 2.37.1 >
On Mon, Aug 22, 2022 at 4:55 PM Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Fri, Aug 5, 2022 at 8:44 AM Miguel Ojeda <ojeda@kernel.org> wrote: > > > > Rust provides `compiler_builtins` as a port of LLVM's `compiler-rt`. > > Since we do not need the vast majority of them, we avoid the > > dependency by providing our own crate. > > > > Co-developed-by: Alex Gaynor <alex.gaynor@gmail.com> > > Signed-off-by: Alex Gaynor <alex.gaynor@gmail.com> > > Co-developed-by: Wedson Almeida Filho <wedsonaf@google.com> > > Signed-off-by: Wedson Almeida Filho <wedsonaf@google.com> > > Co-developed-by: Sven Van Asbroeck <thesven73@gmail.com> > > Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com> > > Co-developed-by: Gary Guo <gary@garyguo.net> > > Signed-off-by: Gary Guo <gary@garyguo.net> > > Signed-off-by: Miguel Ojeda <ojeda@kernel.org> > > I still really really do not like this patch. Thoughts below. > > > --- > > rust/compiler_builtins.rs | 63 +++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 63 insertions(+) > > create mode 100644 rust/compiler_builtins.rs > > > > diff --git a/rust/compiler_builtins.rs b/rust/compiler_builtins.rs > > new file mode 100644 > > index 000000000000..f8f39a3e6855 > > --- /dev/null > > +++ b/rust/compiler_builtins.rs > > @@ -0,0 +1,63 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > + > > +//! Our own `compiler_builtins`. > > +//! > > +//! Rust provides [`compiler_builtins`] as a port of LLVM's [`compiler-rt`]. > > +//! Since we do not need the vast majority of them, we avoid the dependency > > +//! by providing this file. > > +//! > > +//! At the moment, some builtins are required that should not be. For instance, > > +//! [`core`] has 128-bit integers functionality which we should not be compiling > > It's not just 128b, these are for double word sizes, so for 32b > targets (I recall an earlier version of series supporting armv6) these > symbols are for 64b types. > > > +//! in. We will work with upstream [`core`] to provide feature flags to disable > > +//! the parts we do not need. For the moment, we define them to [`panic!`] at > > +//! runtime for simplicity to catch mistakes, instead of performing surgery > > +//! on `core.o`. > > Wait, so Kbuild performs surgery on alloc, but core is off limits? bah > > > +//! > > +//! In any case, all these symbols are weakened to ensure we do not override > > +//! those that may be provided by the rest of the kernel. > > +//! > > +//! [`compiler_builtins`]: https://github.com/rust-lang/compiler-builtins > > +//! [`compiler-rt`]: https://compiler-rt.llvm.org/ > > + > > +#![feature(compiler_builtins)] > > +#![compiler_builtins] > > +#![no_builtins] > > +#![no_std] > > + > > +macro_rules! define_panicking_intrinsics( > > + ($reason: tt, { $($ident: ident, )* }) => { > > + $( > > + #[doc(hidden)] > > + #[no_mangle] > > + pub extern "C" fn $ident() { > > + panic!($reason); > > + } > > + )* > > + } > > +); > > + > > +define_panicking_intrinsics!("`f32` should not be used", { > > + __eqsf2, > > I don't see ^ this symbol in core. > > $ llvm-nm rust/core.o | grep "U __" | sort | tr -s " " > U __gesf2 > U __lesf2 > U __udivti3 > U __umodti3 > U __unorddf2 > U __unordsf2 > U __x86_indirect_thunk_r11 > > so a few of these seem extraneous. Are you sure they're coming from > core? (or was it the different arch support that was removed from v8 > to facilitate v9?) > > __gesf2, __lesf2, and __unordsf2 all seem to be coming from > <<f32>::classify> and <<f32>::to_bits::ct_f32_to_u32>. Surely we can > avoid more f32 support by relying on objcopy > -N/--strip-symbol=/--strip-symbols=filename to slice out/drop symbols > from core.o? > > <core::fmt::num::exp_u128> uses __umodti3+__udivti3 > > __unorddf2 is referenced by <<f64>::classify> and > <<f64>::to_bits::ct_f64_to_u64>. > > Can we slice those 5 symbols out from core, or are they further > referenced from within core? If we can, we can drop this patch > outright, and avoid a false-negative when C code generates references > to these symbols from the compiler runtime, for architectures where > don't want to provide a full compiler runtime. (--rtlib=none is a > pipe dream; first we need to get to the point where we're not > -ffreestanding for all targets...) > > I suspect someone can iteratively whittle down core to avoid these symbols? > > ``` > diff --git a/rust/Makefile b/rust/Makefile > index 7700d3853404..e64a82c21156 100644 > --- a/rust/Makefile > +++ b/rust/Makefile > @@ -359,6 +359,7 @@ $(obj)/core.o: $(RUST_LIB_SRC)/core/src/lib.rs > $(obj)/target.json FORCE > $(obj)/compiler_builtins.o: private rustc_objcopy = -w -W '__*' > $(obj)/compiler_builtins.o: $(src)/compiler_builtins.rs $(obj)/core.o FORCE > $(call if_changed_dep,rustc_library) > + @$(OBJCOPY) --strip-symbols=$(obj)/strip-symbols.txt $(obj)/core.o > > $(obj)/alloc.o: private skip_clippy = 1 > $(obj)/alloc.o: private skip_flags = -Dunreachable_pub > ``` > then define symbols in rust/strip-symbols.txt? > > I'm getting the sense that no, there are many functions like > <usize as core::fmt::UpperExp>::fmt > <i64 as core::fmt::UpperExp>::fmt > <u64 as core::fmt::UpperExp>::fmt Another idea I had, and Ard mentioned to me this morning that efistub does something similar: I think objcopy can rename symbol references. So instead of calling foo, I think you can use objcopy to call bar instead. To avoid the case of Rust core refering to symbols typically provided by the --rtlib={libgcc|compiler-rt}, what if we: 1. build core.o (with reference to the unfavorable symbols) 2. use objcopy to rename references in core.o to the symbols listed in this file; maybe prepend them with `rust_core_` or something. 3. provide this crate with panic'ing definitions, but for the renamed symbols. That way, enabling CONFIG_RUST=y won't hide the case where C code is triggering such libcalls from the C compiler? To restate the concern, I don't want CONFIG_RUST=y to hide that fact that someone wrote `float x, y; if (x != y) ...` in C code, where the C compiler will likely lower that to a libcall to __nesf2 since with the current approach in this patch, linkage would proceed. The current behavior is to intentionally fail to link. I think what I describe above may work to keep this intended behavior of the kernel's build system. Ah, looks like there's objcopy flags: --redefine-sym old=new Change the name of a symbol old, to new. This can be useful when one is trying link two things together for which you have no source, and there are name collisions. --redefine-syms=filename Apply --redefine-sym to each symbol pair "old new" listed in the file filename. filename is simply a flat file, with one symbol pair per line. Line comments may be introduced by the hash character. This option may be given more than once. So probably could start with my diff from above, but use --redefine-syms=filename instead of --strip-symbols=. > > > + __gesf2, > > + __lesf2, > > + __nesf2, > > + __unordsf2, > > +}); > > + > > +define_panicking_intrinsics!("`f64` should not be used", { > > + __unorddf2, > > +}); > > + > > +define_panicking_intrinsics!("`i128` should not be used", { > > + __ashrti3, > > + __muloti4, > > + __multi3, > > +}); > > + > > +define_panicking_intrinsics!("`u128` should not be used", { > > + __ashlti3, > > + __lshrti3, > > + __udivmodti4, > > + __udivti3, > > + __umodti3, > > +}); > > -- > > 2.37.1 > > > > > -- > Thanks, > ~Nick Desaulniers -- Thanks, ~Nick Desaulniers
On Wed, 24 Aug 2022 11:38:46 -0700 Nick Desaulniers <ndesaulniers@google.com> wrote: > Another idea I had, and Ard mentioned to me this morning that efistub > does something similar: > > I think objcopy can rename symbol references. So instead of calling > foo, I think you can use objcopy to call bar instead. > > To avoid the case of Rust core refering to symbols typically provided > by the --rtlib={libgcc|compiler-rt}, what if we: > 1. build core.o (with reference to the unfavorable symbols) > 2. use objcopy to rename references in core.o to the symbols listed in > this file; maybe prepend them with `rust_core_` or something. > 3. provide this crate with panic'ing definitions, but for the renamed > symbols. > > That way, enabling CONFIG_RUST=y won't hide the case where C code is > triggering such libcalls from the C compiler? > > To restate the concern, I don't want CONFIG_RUST=y to hide that fact > that someone wrote `float x, y; if (x != y) ...` in C code, where the > C compiler will likely lower that to a libcall to __nesf2 since with > the current approach in this patch, linkage would proceed. The current > behavior is to intentionally fail to link. I think what I describe > above may work to keep this intended behavior of the kernel's build > system. > > Ah, looks like there's objcopy flags: > --redefine-sym old=new > Change the name of a symbol old, to new. This can be > useful when one is trying link > two things together for which you have no source, and there > are name collisions. > > --redefine-syms=filename > Apply --redefine-sym to each symbol pair "old new" listed > in the file filename. > filename is simply a flat file, with one symbol pair per > line. Line comments may be > introduced by the hash character. This option may be given > more than once. > > So probably could start with my diff from above, but use > --redefine-syms=filename instead of --strip-symbols=. > > -- > Thanks, > ~Nick Desaulniers This is the approach we are about to take, see https://github.com/Rust-for-Linux/linux/pull/779. It's easy for just one symbol that is known to be not defined, but it can be more complex as some of these symbols can be defined on a platform but not another, so we would have to generate a dynamic list of what symbols to "hijack" depending on the config. Currently we avoid this issue by having all symbols in compiler_builtins crate weak, but we couldn't weakly redefine a symbol. It's certainly doable, though, but it require some effort. I am not quite available recently so hadn't further extend and polish my patch. Best, Gary
diff --git a/rust/compiler_builtins.rs b/rust/compiler_builtins.rs new file mode 100644 index 000000000000..f8f39a3e6855 --- /dev/null +++ b/rust/compiler_builtins.rs @@ -0,0 +1,63 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Our own `compiler_builtins`. +//! +//! Rust provides [`compiler_builtins`] as a port of LLVM's [`compiler-rt`]. +//! Since we do not need the vast majority of them, we avoid the dependency +//! by providing this file. +//! +//! At the moment, some builtins are required that should not be. For instance, +//! [`core`] has 128-bit integers functionality which we should not be compiling +//! in. We will work with upstream [`core`] to provide feature flags to disable +//! the parts we do not need. For the moment, we define them to [`panic!`] at +//! runtime for simplicity to catch mistakes, instead of performing surgery +//! on `core.o`. +//! +//! In any case, all these symbols are weakened to ensure we do not override +//! those that may be provided by the rest of the kernel. +//! +//! [`compiler_builtins`]: https://github.com/rust-lang/compiler-builtins +//! [`compiler-rt`]: https://compiler-rt.llvm.org/ + +#![feature(compiler_builtins)] +#![compiler_builtins] +#![no_builtins] +#![no_std] + +macro_rules! define_panicking_intrinsics( + ($reason: tt, { $($ident: ident, )* }) => { + $( + #[doc(hidden)] + #[no_mangle] + pub extern "C" fn $ident() { + panic!($reason); + } + )* + } +); + +define_panicking_intrinsics!("`f32` should not be used", { + __eqsf2, + __gesf2, + __lesf2, + __nesf2, + __unordsf2, +}); + +define_panicking_intrinsics!("`f64` should not be used", { + __unorddf2, +}); + +define_panicking_intrinsics!("`i128` should not be used", { + __ashrti3, + __muloti4, + __multi3, +}); + +define_panicking_intrinsics!("`u128` should not be used", { + __ashlti3, + __lshrti3, + __udivmodti4, + __udivti3, + __umodti3, +});