[00/13,RFC] Rust support

Message ID 20210414184604.23473-1-ojeda@kernel.org (mailing list archive)


Miguel Ojeda April 14, 2021, 6:45 p.m. UTC
From: Miguel Ojeda <ojeda@kernel.org>

Some of you have noticed over the past few weeks and months that
a serious attempt to bring a second language to the kernel was
being forged. We are finally here, with an RFC that adds support
for Rust to the Linux kernel.

This cover letter is fairly long, since there are quite a few topics
to describe, but I hope it answers as many questions as possible
before the discussion starts.

If you are interested in following this effort, please join us
in the mailing list at:

    rust-for-linux@vger.kernel.org

and take a look at the project itself at:

    https://github.com/Rust-for-Linux

Cheers,
Miguel


# A second language in the kernel

We know there are huge costs and risks in introducing a new main
language in the kernel. We risk dividing efforts and we increase
the knowledge required to contribute to some parts of the kernel.

Most importantly, any new language introduced means any module
written in that language will be way harder to replace later on
if the support for the new language gets dropped.

Nevertheless, we believe that, even today, the advantages of using
Rust outweigh the costs. We will explain why in the following
sections.

Please note that the Rust support is intended to enable writing
drivers and similar "leaf" modules in Rust, at least for the
foreseeable future. In particular, we do not intend to rewrite
the kernel core nor the major kernel subsystems (e.g. `kernel/`,
`mm/`, `sched/`...). Instead, the Rust support is built on top
of those.


## Goals

By using Rust in the Linux kernel, our hope is that:

  - New code written in Rust has a reduced risk of memory safety bugs,
    data races and logic bugs overall, thanks to the language
    properties mentioned below.

  - Maintainers are more confident in refactoring and accepting
    patches for modules thanks to the safe subset of Rust.

  - New drivers and modules become easier to write, thanks to
    abstractions that are easier to reason about, based on modern
    language features, as well as backed by detailed documentation.

  - More people get involved overall in developing the kernel
    thanks to the usage of a modern language.

  - By taking advantage of Rust tooling, we keep enforcing the
    documentation guidelines we have established so far in the
    project. For instance, we require having all public APIs, safety
    preconditions, `unsafe` blocks and type invariants documented.
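The kind of documentation we enforce can be illustrated with a small
piece of plain (userspace) Rust; the function below is made up for
illustration, not a kernel API:

```rust
/// Returns the byte at `index` without bounds checking.
///
/// # Safety
///
/// The caller must ensure that `index` is less than `bytes.len()`.
unsafe fn byte_at_unchecked(bytes: &[u8], index: usize) -> u8 {
    // SAFETY: the caller guarantees that `index` is in bounds.
    *bytes.get_unchecked(index)
}

fn main() {
    let data = [10u8, 20, 30];
    // SAFETY: `1 < data.len()`, satisfying the precondition above.
    let b = unsafe { byte_at_unchecked(&data, 1) };
    println!("{b}");
}
```

Every public item gets a doc comment, every `unsafe fn` documents its
preconditions under `# Safety`, and every `unsafe` block justifies why
those preconditions hold in a `SAFETY:` comment.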


## Why Rust?

Rust is a systems programming language that brings several key
advantages over C in the context of the Linux kernel:

  - No undefined behavior in the safe subset (when unsafe code is
    sound), including memory safety and the absence of data races.

  - Stricter type system for further reduction of logic errors.

  - A clear distinction between safe and `unsafe` code.

  - Featureful language: sum types, pattern matching, generics,
    RAII, lifetimes, shared & exclusive references, modules &
    visibility, powerful hygienic and procedural macros...

  - Extensive freestanding standard library: vocabulary types such
    as `Result` and `Option`, iterators, formatting, pinning,
    checked/saturating/wrapping integer arithmetic, etc.

  - Integrated out of the box tooling: documentation generator,
    formatter and linter all based on the compiler itself.
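Several of these features can be shown in a few lines of plain
(userspace) Rust; this is a generic illustration, not kernel code:

```rust
// A sum type: each variant can carry different data.
enum Command {
    Read { offset: u64, len: usize },
    Flush,
}

// Pattern matching must cover every variant, so adding a new
// variant later is a compile-time error until it is handled.
fn describe(cmd: &Command) -> String {
    match cmd {
        Command::Read { offset, len } => format!("read {len} bytes at {offset}"),
        Command::Flush => String::from("flush"),
    }
}

fn main() {
    let cmd = Command::Read { offset: 4096, len: 512 };
    println!("{}", describe(&cmd));

    // Checked/saturating/wrapping integer arithmetic from the
    // standard library: overflow is explicit, never silent.
    assert_eq!(u8::MAX.checked_add(1), None);
    assert_eq!(u8::MAX.saturating_add(1), 255);
    assert_eq!(u8::MAX.wrapping_add(1), 0);
}
```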

Overall, Rust is a language that has successfully leveraged decades
of experience from systems programming languages as well as functional
ones, and added lifetimes and borrow checking on top.


## Why not?

Rust also has disadvantages compared to C in the context of
the Linux kernel:

  - The many years of effort in tooling for C around the kernel,
    including compiler plugins, sanitizers, Coccinelle, lockdep,
    sparse... However, this will likely improve if Rust usage in
    the kernel grows over time.

  - Single implementation based on LLVM. There are third-party
    efforts underway to fix this, such as a GCC frontend,
    a `rustc` backend based on Cranelift, and `mrustc`,
    a compiler intended to reduce the bootstrapping chain.
    Any help for those projects would be very welcome!

  - Not standardized. While it is not clear whether standardization
    would be beneficial for the kernel, several points minimize
    this issue in any case: the Rust stability promise, the extensive
    documentation, the WIP reference, the detailed RFCs...

  - Slower compilation in general, due to more complex language
    features and limitations in the current compiler.

  - At the present time, we require certain nightly features.
    That is, features that are not available in the stable compiler.
    Nevertheless, we aim to remove this restriction within a year
    by either `rustc` landing the features in stable or removing
    our usage of them otherwise. We maintain a report here:

        https://github.com/Rust-for-Linux/linux/issues/2

  - Larger text size than strictly needed at present, due to unused
    parts of the `core` and `alloc` Rust standard libraries. We plan
    to address this over time.

Most of these disadvantages arise from the fact that Rust is a much
younger and less used language. However, we believe Rust is likely
to become an important part of systems programming, just as C has been
during the last decades, and so most of these issues will be reduced
as different industries put resources behind Rust.


## Design

There are a few key design choices to have in mind.

First of all, Rust kernel modules require some shared code that is
enabled via a configuration option (`CONFIG_RUST`). This makes
individual modules way smaller. This support consists of:

  - The Rust standard library. Currently `core` and `alloc`, but
    likely only a subset of `core` in the future. These pieces
    are basically the equivalent of the freestanding subset of
    the C standard library.

  - The abstractions wrapping the kernel APIs. These live inside
    `rust/kernel/`. The intention is to make these as safe as
    possible so that modules written in Rust require the smallest
    amount of `unsafe` code possible.

  - Other bits such as the `module!` procedural macro, the compiler
    builtins, the generated bindings and helpers, etc.
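The shape of those safe abstractions can be sketched in plain
(userspace) Rust with a mock binding; all names here are invented for
illustration, and the real bindings and wrappers live in `rust/`:

```rust
// Stand-in for a raw, bindgen-generated C binding (invented for
// illustration). Calling it is `unsafe`, and it reports errors as
// C-style negative return codes.
mod bindings {
    pub unsafe fn raw_register(id: i32) -> i32 {
        if id >= 0 { 0 } else { -22 } // -EINVAL
    }
}

// Safe abstraction: validates arguments and turns C-style error
// codes into a `Result`, so module code never writes `unsafe`.
pub struct Registration {
    #[allow(dead_code)]
    id: i32,
}

impl Registration {
    pub fn new(id: i32) -> Result<Self, i32> {
        // SAFETY: `raw_register` has no preconditions beyond being
        // passed an id, and its error return is checked below.
        let ret = unsafe { bindings::raw_register(id) };
        if ret == 0 { Ok(Registration { id }) } else { Err(ret) }
    }
}

fn main() {
    assert!(Registration::new(1).is_ok());
    assert_eq!(Registration::new(-1).err(), Some(-22));
    println!("ok");
}
```

The `unsafe` code is concentrated in one reviewed place, and the
wrapper type can release the resource on `Drop` in the real kernel
abstractions.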

This support takes a fair amount of space, although it will be
reduced since there are some features from the Rust standard library
that we do not use.

Here are some examples from a small x86_64 config we use in the CI:

       text    data     bss      dec

    7464833 1492128 2301996 11258957 vmlinux (without Rust support)
    7682527 1709252 2301996 11693775 vmlinux (with    Rust support)
    7682527 1721540 2301996 11706063 vmlinux (plus overflow checks)

       2224       0      16     2240 samples/rust/rust_semaphore_c.o
       3694       0      10     3704 samples/rust/rust_semaphore.o
       2367     768      16     3151 samples/rust/rust_semaphore_c.ko
       3829     768      10     4607 samples/rust/rust_semaphore.ko

      80554    5904   20249   106707 drivers/android/binder.o
      12365    1240       9    13614 drivers/android/binder_alloc.o
      92818       8      16    92842 drivers/android/rust_binder.o

That is a 3% increase in text and a 4% increase in the total for
`vmlinux` with overflow checking enabled. The modules themselves are
relatively close to their C alternatives.

In the table above we can also see a comparison between Binder and
its Rust port prototype. Please note that while the Rust version is
not equivalent to the original C module yet, it is close enough to
provide a rough estimate. Here, the combined text of the two C Binder
objects is about the same as that of the Rust driver, while their
combined total is bigger, mostly due to `bss`.

Secondly, modules written in Rust should never use the C kernel APIs
directly. The whole point of using Rust in the kernel is that
we develop safe abstractions so that modules are easier to reason
about and, therefore, to review, refactor, etc.

Furthermore, the bindings to the C side of the kernel are generated
on-the-fly via `bindgen` (an official Rust tool). Using it allows us
to avoid the need to update the bindings on the Rust side.

Macros still need to be handled manually, and some functions are
inlined, which requires us to create helpers to call them from Rust.

Thirdly, in Rust code bases, most documentation is written alongside
the source code, in Markdown. We follow this convention, so while
we have a few general documents in `Documentation/rust/`, most of
the actual documentation is in the source code itself.

In order to read this documentation easily, Rust provides a tool
to generate HTML documentation, just like Sphinx/kernel-doc, but
suited to Rust code bases and the language concepts.

Moreover, as explained above, we are taking the chance to enforce
some documentation guidelines. We are also enforcing automatic code
formatting, a set of Clippy lints, etc. We decided to go with Rust's
idiomatic style, i.e. keeping `rustfmt` defaults. For instance, this
means 4 spaces are used for indentation, rather than a tab. We are
happy to change that if needed -- we think what is important is
keeping the formatting automated.

Finally, to avoid exposing GPL symbols as non-GPL (even indirectly),
we export all our Rust support symbols in the kernel as GPL.


## Status

The Rust support presented here is experimental and many kernel APIs
and abstractions are, of course, missing. Covering the entire API
surface of the kernel will take a long time to develop and mature.
Other implementation details are also a work in progress.

However, the support is good enough that prototyping modules can
start today. This RFC includes a working port of an existing module:
Binder, the Android IPC mechanism. While it is not meant to be used
in production just yet, it showcases what can already be done and
what actual Rust modules could look like in the future.

Regarding compilers, we support Clang-built kernels as well as
`LLVM=1` builds where possible (i.e. as long as supported by
the ClangBuiltLinux project). We also maintain some configurations
of GCC-built kernels working, but they are not intended to be used
at the present time. Having a `bindgen` backend for GCC would be
ideal to improve support for those builds.

Concerning architectures, we already support `x86_64`, `arm64` and
`ppc64le`. Adding support for variants of those as well as `riscv`,
`s390` and `mips` should be possible with some work.

We also joined `linux-next` (with a special waiver). Currently,
the support is gated behind `!COMPILE_TEST` since we did not want
to break any production CIs by mistake, but if feedback for this RFC
is positive, then we will remove that restriction.


## Upstreaming plan

As usual, getting into mainline early is the best way forward to
sort out any missing details, so we are happy to send these changes
as early as in the upcoming merge window.

However, at which point we submit them will depend on the feedback
we receive on this RFC and what the overall sentiment from
high-level maintainers is.


## Reviewing this RFC

We would like to get comments from the perspective of module writers.
In particular on the samples in patch 9 and on Binder in patch 13.
That is, as a module writer, how do you feel about the Rust code
shown there? Do you see yourself writing similar Rust code in
the future, taking into account the safety/no-UB benefits?

Comments on the Rust abstractions themselves and other details of
this RFC are, of course, welcome, but please note that they are
a work in progress.

Another important topic we would like feedback on is the Rust
"native" documentation that is written alongside the code, as
explained above. We have uploaded it here:

    https://rust-for-linux.github.io/docs/kernel/

We like how this kind of generated documentation looks. Please take
a look and let us know what you think!


## Testing this RFC

If you want to test things out, please follow the Quick Start guide
in `Documentation/rust/quick-start.rst`. It will help you set up Rust
and the rest of the tools needed to build and test this RFC.

At the time of writing, the RFC series matches our main repository,
but if you want to follow along, check out the `rust` branch from
our main tree:

    https://github.com/Rust-for-Linux/linux.git


## Acknowledgements

The signatures in the main commits correspond to the people that
wrote code that has ended up in them at the present time. However,
we would like to give credit to everyone that has contributed in
one way or another to the Rust for Linux project:

  - Alex Gaynor and Geoffrey Thomas wrote the first safe Rust
    abstractions for kernel features (such as `chrdev`, `printk`,
    `random`, `sysctl`...) and used them in a framework to build
    out-of-tree modules in Rust leveraging `bindgen` for bindings.
    They presented their work at the Linux Security Summit 2019.

  - Nick Desaulniers bravely raised the Rust topic in the LKML and
    organized a talk at the Linux Plumbers Conference 2020. He also
    pulled some strings to move things forward!

  - Miguel Ojeda created the Rust for Linux project to group
    the different efforts/people in one place and kickstarted it by
    adding Kbuild support for Rust into the kernel, integrating Alex's
    and Geoffrey's abstractions into what is now `rust/kernel/`
    and adding support for built-in modules and sharing the common
    Rust code. He kept working on writing the infrastructure
    foundations: the `module!` proc macro and new printing macros,
    the exports and compiler builtins magic, the kernel config symbols
    for conditional compilation, the different Rust tooling
    integrations, the documentation, the CI... He fixed a couple bits
    in `rustc` and `rustdoc` that were needed for the kernel.
    He is coordinating the project.

  - Alex Gaynor has spent a lot of time reviewing the majority of
    PRs after the integration took place, cleaned up a few of the
    abstractions further, added support for `THIS_MODULE`...
    He is a maintainer of the project.

  - Wedson Almeida Filho wrote most of the rest of the abstractions,
    including all the synchronization ones in `rust/kernel/sync/`,
    a better abstraction for file operations, support for ioctls,
    miscellaneous devices, failing allocations, `container_of!` and
    `offset_of!`... These are all needed for his Binder (Android IPC)
    Rust module, which is the first Rust kernel module intended for
    (eventual) production. He is a maintainer of the project.

  - Adam Bratschi-Kaye added support for `charp`, array, string and
    integer module parameter types, the `fsync` file operation,
    the stack probing test... He has also attended most meetings and
    reviewed some PRs.

  - Finn Behrens worked on `O=` builds and NixOS support, the Rust
    confdata printer, testing the Kbuild support as well as sending
    proposals for a couple new abstractions. He has attended a few
    meetings and reviewed some PRs even while busy with his studies.

  - Manish Goregaokar implemented the fallible `Box`, `Arc`, and `Rc`
    allocator APIs in Rust's `alloc` standard library for us.

  - Boqun Feng is working hard on the different options for
    threading abstractions and has reviewed most of the `sync` PRs.

  - Michael Ellerman added initial support for ppc64le and actively
    reviews further changes and issues related to it.

  - Dan Robertson is working on adding softdeps to the `module!`
    macro.

  - Sumera Priyadarsini worked on improving the error messages for
    the `module!` macro.

  - Ngo Iok Ui (Wu Yu Wei) worked on generating `core` and `alloc`
    docs locally too, although in the end we could not merge it.

  - Geoffrey Thomas kept giving us a lot of valuable input from his
    experience implementing some of the abstractions and never
    missed a meeting.

  - bjorn3 for his knowledgeable input on `rustc` internals and
    reviewing related code.

  - Josh Triplett helped us move forward the project early on in
    the Plumbers conference and acts as liaison to the core Rust team.

  - John Ericson worked on advancing `cargo`'s `-Zbuild-std` support,
    the Rust compiler targets and joined a few of the meetings.

  - Joshua Abraham reviewed a few PRs and joined some of
    the meetings.

  - Konstantin Ryabitsev for his patience with all the requests
    regarding Rust for Linux within the kernel.org infrastructure.

  - Stephen Rothwell for his flexibility and help on including
    the project into linux-next.

  - John 'Warthog9' Hawley and David S. Miller for setting up the
    rust-for-linux@vger.kernel.org mailing list.

  - Jonathan Corbet for his feedback on the Rust documentation,
    Markdown and the different choices we will need to discuss.

  - Guillaume Gomez and Joshua Nelson for early feedback on
    a proposal on an external references map file for `rustdoc`
    that would allow us to easily link to Sphinx/C entries.

  - Many folks that have reported issues, tested the project,
    helped spread the word, joined discussions and contributed in
    other ways! In no particular order: Pavel Machek, Geert Stappers,
    Kees Cook, Milan, Daniel Kolsoi, Arnd Bergmann, ahomescu,
    Josh Stone, Manas, Christian Brauner, Boris-Chengbiao Zhou,
    Luis Gerhorst...

Miguel Ojeda (12):
  kallsyms: Support "big" kernel symbols (2-byte lengths)
  kallsyms: Increase maximum kernel symbol length to 512
  Makefile: Generate CLANG_FLAGS even in GCC builds
  Kbuild: Rust support
  Rust: Compiler builtins crate
  Rust: Module crate
  Rust: Kernel crate
  Rust: Export generated symbols
  Samples: Rust examples
  Documentation: Rust general information
  MAINTAINERS: Rust
  Rust: add abstractions for Binder (WIP)

Wedson Almeida Filho (1):
  Android: Binder IPC in Rust (WIP)

 .gitignore                             |   2 +
 .rustfmt.toml                          |  12 +
 Documentation/doc-guide/kernel-doc.rst |   3 +
 Documentation/index.rst                |   1 +
 Documentation/kbuild/kbuild.rst        |   4 +
 Documentation/process/changes.rst      |   9 +
 Documentation/rust/arch-support.rst    |  29 +
 Documentation/rust/coding.rst          |  92 +++
 Documentation/rust/docs.rst            | 109 +++
 Documentation/rust/index.rst           |  20 +
 Documentation/rust/quick-start.rst     | 203 ++++++
 MAINTAINERS                            |  14 +
 Makefile                               | 147 +++-
 arch/arm64/rust/target.json            |  40 ++
 arch/powerpc/rust/target.json          |  30 +
 arch/x86/rust/target.json              |  42 ++
 drivers/android/Kconfig                |   7 +
 drivers/android/Makefile               |   2 +
 drivers/android/allocation.rs          | 252 +++++++
 drivers/android/context.rs             |  80 +++
 drivers/android/defs.rs                |  92 +++
 drivers/android/node.rs                | 479 +++++++++++++
 drivers/android/process.rs             | 950 +++++++++++++++++++++++++
 drivers/android/range_alloc.rs         | 191 +++++
 drivers/android/rust_binder.rs         | 128 ++++
 drivers/android/thread.rs              | 821 +++++++++++++++++++++
 drivers/android/transaction.rs         | 206 ++++++
 include/linux/kallsyms.h               |   2 +-
 include/linux/spinlock.h               |  17 +-
 include/uapi/linux/android/binder.h    |  22 +-
 init/Kconfig                           |  27 +
 kernel/kallsyms.c                      |   7 +
 kernel/livepatch/core.c                |   4 +-
 kernel/printk/printk.c                 |   2 +
 lib/Kconfig.debug                      | 100 +++
 rust/.gitignore                        |   5 +
 rust/Makefile                          | 152 ++++
 rust/compiler_builtins.rs              | 146 ++++
 rust/exports.c                         |  16 +
 rust/helpers.c                         |  86 +++
 rust/kernel/allocator.rs               |  68 ++
 rust/kernel/bindings.rs                |  22 +
 rust/kernel/bindings_helper.h          |  18 +
 rust/kernel/buffer.rs                  |  39 +
 rust/kernel/c_types.rs                 | 133 ++++
 rust/kernel/chrdev.rs                  | 162 +++++
 rust/kernel/error.rs                   | 106 +++
 rust/kernel/file_operations.rs         | 668 +++++++++++++++++
 rust/kernel/lib.rs                     | 200 ++++++
 rust/kernel/linked_list.rs             | 245 +++++++
 rust/kernel/miscdev.rs                 | 109 +++
 rust/kernel/module_param.rs            | 497 +++++++++++++
 rust/kernel/pages.rs                   | 173 +++++
 rust/kernel/prelude.rs                 |  22 +
 rust/kernel/print.rs                   | 461 ++++++++++++
 rust/kernel/random.rs                  |  50 ++
 rust/kernel/raw_list.rs                | 361 ++++++++++
 rust/kernel/static_assert.rs           |  38 +
 rust/kernel/sync/arc.rs                | 184 +++++
 rust/kernel/sync/condvar.rs            | 138 ++++
 rust/kernel/sync/guard.rs              |  82 +++
 rust/kernel/sync/locked_by.rs          | 112 +++
 rust/kernel/sync/mod.rs                |  68 ++
 rust/kernel/sync/mutex.rs              | 101 +++
 rust/kernel/sync/spinlock.rs           | 108 +++
 rust/kernel/sysctl.rs                  | 185 +++++
 rust/kernel/types.rs                   |  73 ++
 rust/kernel/user_ptr.rs                | 282 ++++++++
 rust/module.rs                         | 685 ++++++++++++++++++
 samples/Kconfig                        |   2 +
 samples/Makefile                       |   1 +
 samples/rust/Kconfig                   | 103 +++
 samples/rust/Makefile                  |  11 +
 samples/rust/rust_chrdev.rs            |  66 ++
 samples/rust/rust_minimal.rs           |  40 ++
 samples/rust/rust_miscdev.rs           | 145 ++++
 samples/rust/rust_module_parameters.rs |  72 ++
 samples/rust/rust_print.rs             |  58 ++
 samples/rust/rust_semaphore.rs         | 178 +++++
 samples/rust/rust_semaphore_c.c        | 212 ++++++
 samples/rust/rust_stack_probing.rs     |  42 ++
 samples/rust/rust_sync.rs              |  84 +++
 scripts/Makefile.build                 |  19 +
 scripts/Makefile.lib                   |  12 +
 scripts/kallsyms.c                     |  33 +-
 scripts/kconfig/confdata.c             |  67 +-
 scripts/rust-version.sh                |  31 +
 tools/include/linux/kallsyms.h         |   2 +-
 tools/include/linux/lockdep.h          |   2 +-
 tools/lib/perf/include/perf/event.h    |   2 +-
 tools/lib/symbol/kallsyms.h            |   2 +-
 91 files changed, 11080 insertions(+), 45 deletions(-)
 create mode 100644 .rustfmt.toml
 create mode 100644 Documentation/rust/arch-support.rst
 create mode 100644 Documentation/rust/coding.rst
 create mode 100644 Documentation/rust/docs.rst
 create mode 100644 Documentation/rust/index.rst
 create mode 100644 Documentation/rust/quick-start.rst
 create mode 100644 arch/arm64/rust/target.json
 create mode 100644 arch/powerpc/rust/target.json
 create mode 100644 arch/x86/rust/target.json
 create mode 100644 drivers/android/allocation.rs
 create mode 100644 drivers/android/context.rs
 create mode 100644 drivers/android/defs.rs
 create mode 100644 drivers/android/node.rs
 create mode 100644 drivers/android/process.rs
 create mode 100644 drivers/android/range_alloc.rs
 create mode 100644 drivers/android/rust_binder.rs
 create mode 100644 drivers/android/thread.rs
 create mode 100644 drivers/android/transaction.rs
 create mode 100644 rust/.gitignore
 create mode 100644 rust/Makefile
 create mode 100644 rust/compiler_builtins.rs
 create mode 100644 rust/exports.c
 create mode 100644 rust/helpers.c
 create mode 100644 rust/kernel/allocator.rs
 create mode 100644 rust/kernel/bindings.rs
 create mode 100644 rust/kernel/bindings_helper.h
 create mode 100644 rust/kernel/buffer.rs
 create mode 100644 rust/kernel/c_types.rs
 create mode 100644 rust/kernel/chrdev.rs
 create mode 100644 rust/kernel/error.rs
 create mode 100644 rust/kernel/file_operations.rs
 create mode 100644 rust/kernel/lib.rs
 create mode 100644 rust/kernel/linked_list.rs
 create mode 100644 rust/kernel/miscdev.rs
 create mode 100644 rust/kernel/module_param.rs
 create mode 100644 rust/kernel/pages.rs
 create mode 100644 rust/kernel/prelude.rs
 create mode 100644 rust/kernel/print.rs
 create mode 100644 rust/kernel/random.rs
 create mode 100644 rust/kernel/raw_list.rs
 create mode 100644 rust/kernel/static_assert.rs
 create mode 100644 rust/kernel/sync/arc.rs
 create mode 100644 rust/kernel/sync/condvar.rs
 create mode 100644 rust/kernel/sync/guard.rs
 create mode 100644 rust/kernel/sync/locked_by.rs
 create mode 100644 rust/kernel/sync/mod.rs
 create mode 100644 rust/kernel/sync/mutex.rs
 create mode 100644 rust/kernel/sync/spinlock.rs
 create mode 100644 rust/kernel/sysctl.rs
 create mode 100644 rust/kernel/types.rs
 create mode 100644 rust/kernel/user_ptr.rs
 create mode 100644 rust/module.rs
 create mode 100644 samples/rust/Kconfig
 create mode 100644 samples/rust/Makefile
 create mode 100644 samples/rust/rust_chrdev.rs
 create mode 100644 samples/rust/rust_minimal.rs
 create mode 100644 samples/rust/rust_miscdev.rs
 create mode 100644 samples/rust/rust_module_parameters.rs
 create mode 100644 samples/rust/rust_print.rs
 create mode 100644 samples/rust/rust_semaphore.rs
 create mode 100644 samples/rust/rust_semaphore_c.c
 create mode 100644 samples/rust/rust_stack_probing.rs
 create mode 100644 samples/rust/rust_sync.rs
 create mode 100755 scripts/rust-version.sh

Comments

Linus Torvalds April 14, 2021, 7:31 p.m. UTC | #1
On Wed, Apr 14, 2021 at 11:47 AM <ojeda@kernel.org> wrote:
>
> +#[alloc_error_handler]
> +fn oom(_layout: Layout) -> ! {
> +    panic!("Out of memory!");
> +}
> +
> +#[no_mangle]
> +pub fn __rust_alloc_error_handler(_size: usize, _align: usize) -> ! {
> +    panic!("Out of memory!");
> +}

Again, excuse my lack of internal Rust knowledge, but when do these
end up being an issue?

If the Rust compiler ends up doing hidden allocations, and they then
cause panics, then one of the main *points* of Rustification is
entirely broken. That's 100% the opposite of being memory-safe at
build time.

An allocation failure in some random driver must never ever be
something that the compiler just turns into a panic. It must be
something that is caught and handled synchronously and results in an
ENOMEM error return.

So the fact that the core patches have these kinds of

    panic!("Out of memory!");

things in them as part of just the support infrastructure makes me go
"Yeah, that's fundamentally wrong".

And if this is some default that is called only when the Rust code
doesn't have error handling, then once again - I think it needs to be
a *build-time* failure, not a runtime one. Because having unsafe code
that will cause a panic only under very special situations that are
hard to trigger is about the worst possible case.

             Linus
Linus Torvalds April 14, 2021, 7:44 p.m. UTC | #2
On Wed, Apr 14, 2021 at 11:46 AM <ojeda@kernel.org> wrote:
>
> Some of you have noticed the past few weeks and months that
> a serious attempt to bring a second language to the kernel was
> being forged. We are finally here, with an RFC that adds support
> for Rust to the Linux kernel.

So I replied with my reactions to a couple of the individual patches,
but on the whole I don't hate it.

HOWEVER.

I do think that the "run-time failure panic" is a fundamental issue.

I may not understand the ramifications of when it can happen, so maybe
it's less of an issue than I think it is, but very fundamentally I
think that if some Rust allocation can cause a panic, this is simply
_fundamentally_ not acceptable.

Allocation failures in a driver or non-core code - and that is by
definition all of any new Rust code - can never EVER validly cause
panics. Same goes for "oh, some case I didn't test used 128-bit
integers or floating point".

So if the Rust compiler causes hidden allocations that cannot be
caught and returned as errors, then I seriously think that this whole
approach needs to be entirely NAK'ed, and the Rust infrastructure -
whether at the compiler level or in the kernel wrappers - needs more
work.

So if the panic was just some placeholder for things that _can_ be
caught, then I think that catching code absolutely needs to be
written, and not left as a to-do.

And if the panic situation is some fundamental "this is what the Rust
compiler does for internal allocation failures", then I think it needs
more than just kernel wrapper work - it needs the Rust compiler to be
*fixed*.

Because kernel code is different from random user-space system tools.
Running out of memory simply MUST NOT cause an abort.  It needs to
just result in an error return.

I don't know enough about how the out-of-memory situations would be
triggered and caught to actually know whether this is a fundamental
problem or not, so my reaction comes from ignorance, but basically the
rule has to be that there are absolutely zero run-time "panic()"
calls. Unsafe code has to either be caught at compile time, or it has
to be handled dynamically as just a regular error.

With the main point of Rust being safety, there is no way I will ever
accept "panic dynamically" (whether due to out-of-memory or due to
anything else - I also reacted to the "floating point use causes
dynamic panics") as a feature in the Rust model.

           Linus
Miguel Ojeda April 14, 2021, 7:50 p.m. UTC | #3
On Wed, Apr 14, 2021 at 9:31 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Again, excuse my lack of internal Rust knowledge, but when do these
> end up being an issue?
>
> If the Rust compiler ends up doing hidden allocations, and they then
> cause panics, then one of the main *points* of Rustification is
> entirely broken. That's 100% the opposite of being memory-safe at
> build time.

Of course! What happens here is that we use, for the moment, `alloc`,
which is part of the Rust standard library. However, we will be
customizing/rewriting `alloc` as needed to adapt its types (things
like `Box`, `Vec`, etc.) so that we can do things like pass allocation
flags, ensure we always have fallible allocations, perhaps reuse some
of the kernel data structures, etc.
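[Editorially, the fallible-allocation direction described here can be
illustrated with the now-stable `Vec::try_reserve` API in userspace
Rust; this is a sketch of the pattern, not the kernel API:]

```rust
use std::collections::TryReserveError;

// Builds a zeroed buffer, reporting allocation failure as an error
// instead of aborting -- the behavior kernel code needs (akin to
// returning ENOMEM to the caller).
fn make_buffer(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    buf.try_reserve(len)?; // fallible: returns Err instead of panicking
    buf.resize(len, 0);
    Ok(buf)
}

fn main() {
    // A reasonable size succeeds...
    assert_eq!(make_buffer(16).unwrap().len(), 16);
    // ...while an impossible one fails with an error, not a panic.
    assert!(make_buffer(usize::MAX).is_err());
}
```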

Cheers,
Miguel
Matthew Wilcox April 14, 2021, 8:09 p.m. UTC | #4
On Wed, Apr 14, 2021 at 08:45:51PM +0200, ojeda@kernel.org wrote:
>   - Manish Goregaokar implemented the fallible `Box`, `Arc`, and `Rc`
>     allocator APIs in Rust's `alloc` standard library for us.

There's a philosophical point to be discussed here which you're skating
right over!  Should rust-in-the-linux-kernel provide the same memory
allocation APIs as the rust-standard-library, or should it provide a Rusty
API to the standard-linux-memory-allocation APIs?  You seem to be doing
both ... there was a wrapper around alloc_pages() in the Binder patches,
and then you talk about Box, Arc and Rc here.

Maybe there's some details about when one can use one kind of API and
when to use another.  But I fear that we'll have Rust code at interrupt
level trying to use allocators which assume that they can sleep, and
things will go badly wrong.

By the way, I don't think that Rust necessarily has to conform to the
current way that Linux works.  If this prompted us to track the current
context (inside spinlock, handling interrupt, performing writeback, etc)
and do away with (some) GFP flags, that's not the end of the world.
We're already moving in that direction to a certain extent with the
scoped memory allocation APIs to replace GFP_NOFS / GFP_NOIO.
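[Editorially, the scoped-context idea above maps naturally onto Rust's
RAII; here is a toy userspace illustration with an invented guard
type, not a real kernel or Rust API:]

```rust
use std::cell::Cell;

thread_local! {
    // Toy stand-in for a per-task context flag (e.g. "no FS reclaim").
    static NO_FS: Cell<bool> = Cell::new(false);
}

// RAII guard: the flag is set for exactly the guard's lifetime,
// mirroring the C memalloc_nofs_save()/memalloc_nofs_restore() pair,
// and cannot be forgotten on any exit path.
struct NoFsScope {
    prev: bool,
}

impl NoFsScope {
    fn enter() -> Self {
        let prev = NO_FS.with(|f| f.replace(true));
        NoFsScope { prev }
    }
}

impl Drop for NoFsScope {
    fn drop(&mut self) {
        NO_FS.with(|f| f.set(self.prev));
    }
}

fn in_no_fs_context() -> bool {
    NO_FS.with(|f| f.get())
}

fn main() {
    assert!(!in_no_fs_context());
    {
        let _scope = NoFsScope::enter();
        // An allocator here could consult the flag instead of a GFP_NOFS
        // argument threaded through every call site.
        assert!(in_no_fs_context());
    }
    assert!(!in_no_fs_context());
}
```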
Miguel Ojeda April 14, 2021, 8:20 p.m. UTC | #5
On Wed, Apr 14, 2021 at 9:45 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> With the main point of Rust being safety, there is no way I will ever
> accept "panic dynamically" (whether due to out-of-memory or due to
> anything else - I also reacted to the "floating point use causes
> dynamic panics") as a feature in the Rust model.

Agreed on all your points. As I mentioned in the other message (we
crossed emails), we have a lot of work to do regarding `alloc` and
slicing `core` for things that are not needed for the kernel
(floating-point, etc.).

We haven't done it just yet because it is not a trivial amount of work
and we wanted to have some overall sentiment from you and the
community overall before tackling everything. But it is doable and
there isn't any fundamental reason that prevents it (in fact, the
language supports no-allocation code).

Worst case, we may need to request a few bits here and there from the
`rustc` and standard library teams, but that should be about it.

In summary, to be clear:

  - On allocation: this is just our usage of `alloc` in order to speed
development up -- it will be replaced (or customized, we have to
decide how we will approach it) with our own allocation and data
structures.

  - On floating-point, 128-bit, etc.: the main issue is that the
`core` library is a single big blob at the moment. I have already
mentioned this to some Rust team folks. We will need a way to "cut"
some things out, for instance with the "feature flags" they already
have for other crates (or they can split `core` into several, like
`alloc` is for similar reasons). Or we could do it on our side
somehow, but I prefer to avoid that (we cannot easily customize `core`
like we can with `alloc`, because it is tied to the compiler too
tightly).

Thanks a lot for having taken a look so quickly, by the way!

Cheers,
Miguel
Linus Torvalds April 14, 2021, 8:21 p.m. UTC | #6
On Wed, Apr 14, 2021 at 1:10 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> There's a philosophical point to be discussed here which you're skating
> right over!  Should rust-in-the-linux-kernel provide the same memory
> allocation APIs as the rust-standard-library, or should it provide a Rusty
> API to the standard-linux-memory-allocation APIs?

Yeah, I think that the standard Rust API may simply not be acceptable
inside the kernel, if it has similar behavior to the (completely
broken) C++ "new" operator.

So anything that does "panic!" in the normal Rust API model needs to
be (statically) caught, and never exposed as an actual call to
"panic()/BUG()" in the kernel.

So "Result<T, E>" is basically the way to go, and if the standard Rust
library alloc() model is based on "panic!" then that kind of model
must simply not be used in the kernel.

             Linus
Miguel Ojeda April 14, 2021, 8:29 p.m. UTC | #7
On Wed, Apr 14, 2021 at 10:10 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Apr 14, 2021 at 08:45:51PM +0200, ojeda@kernel.org wrote:
> >   - Manish Goregaokar implemented the fallible `Box`, `Arc`, and `Rc`
> >     allocator APIs in Rust's `alloc` standard library for us.
>
> There's a philosophical point to be discussed here which you're skating
> right over!  Should rust-in-the-linux-kernel provide the same memory
> allocation APIs as the rust-standard-library, or should it provide a Rusty
> API to the standard-linux-memory-allocation APIs?  You seem to be doing
> both ... there was a wrapper around alloc_pages() in the Binder patches,
> and then you talk about Box, Arc and Rc here.

Please see my reply to Linus. The Rust standard library team is doing
work on allocators, fallible allocations, etc., but that is very much
a WIP. We hope that our usage and needs will inform their design.

Manish Goregaokar implemented the `try_reserve` feature since he knew
we wanted to have fallible allocations etc. (I was not really involved
in that, perhaps somebody else can comment); but we will have to
replace `alloc` anyway in the near future, and we wanted to give
Manish credit for advancing the state of the art there nevertheless.

> Maybe there's some details about when one can use one kind of API and
> when to use another.  But I fear that we'll have Rust code at interrupt
> level trying to use allocators which assume that they can sleep, and
> things will go badly wrong.

Definitely. In fact, we want to have all public functions exposed by
Rust infrastructure tagged with the context they can work in, etc.
Ideally, we could propose a language feature like "colored `unsafe`"
so that one can actually inform the compiler that a function is only
safe in some contexts, e.g. `unsafe(interrupt)`. But language features
are a moonshot; for the moment, we want to go with the annotation in
the doc-comment, like we do with the `Safety` preconditions and type
invariants.
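[Editorial note: as an illustration of the doc-comment convention mentioned above. The `# Safety` section is standard Rust practice; the `# Context` section is a hypothetical annotation style, sketched for this discussion only.]

```rust
/// Returns the element at `index` without bounds checking.
///
/// # Safety
///
/// Callers must ensure that `index < slice.len()`.
///
/// # Context (hypothetical annotation)
///
/// May be called from any context; does not sleep or allocate.
unsafe fn get_at_unchecked(slice: &[u32], index: usize) -> u32 {
    *slice.get_unchecked(index)
}

fn main() {
    let data = [10u32, 20, 30];
    // SAFETY: 1 < data.len().
    let v = unsafe { get_at_unchecked(&data, 1) };
    assert_eq!(v, 20);
}
```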

Cheers,
Miguel
Josh Triplett April 14, 2021, 8:35 p.m. UTC | #8
On Wed, Apr 14, 2021 at 01:21:52PM -0700, Linus Torvalds wrote:
> On Wed, Apr 14, 2021 at 1:10 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > There's a philosophical point to be discussed here which you're skating
> > right over!  Should rust-in-the-linux-kernel provide the same memory
> > allocation APIs as the rust-standard-library, or should it provide a Rusty
> > API to the standard-linux-memory-allocation APIs?
> 
> Yeah, I think that the standard Rust API may simply not be acceptable
> inside the kernel, if it has similar behavior to the (completely
> broken) C++ "new" operator.
> 
> So anything that does "panic!" in the normal Rust API model needs to
> be (statically) caught, and never exposed as an actual call to
> "panic()/BUG()" in the kernel.

Rust has both kinds of allocation APIs: you can call a method like
`Box::new` that panics on allocation failure, or a method like
`Box::try_new` that returns an error on allocation failure.

With some additional infrastructure that's still in progress, we could
just not supply the former kind of methods at all, and *only* supply the
latter, so that you're forced to handle allocation failure. That just
requires introducing some further ability to customize the Rust standard
library.

(There are some cases of methods in the standard library that don't have
a `try_` equivalent, but we could fix that. Right now, for instance,
there isn't a `try_` equivalent of every Vec method, and you're instead
expected to call `try_reserve` to make sure you have enough memory
first; however, that could potentially be changed.)
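[Editorial note: a small sketch of the `try_reserve` pattern Josh describes. `try_reserve` was nightly-only at the time of this thread; it has since been stabilized. Reserving capacity up front surfaces allocation failure as a `Result`, after which the append itself cannot reallocate.]

```rust
use std::collections::TryReserveError;

/// Appends `data` to `v`, surfacing allocation failure instead of aborting.
fn append_fallible(v: &mut Vec<u8>, data: &[u8]) -> Result<(), TryReserveError> {
    v.try_reserve(data.len())?; // fallible: Err on allocation failure
    v.extend_from_slice(data); // capacity already reserved; cannot fail
    Ok(())
}

fn main() {
    let mut v = Vec::new();
    append_fallible(&mut v, b"abc").expect("allocation failed");
    assert_eq!(v, b"abc");
}
```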
David Laight April 14, 2021, 10:08 p.m. UTC | #9
From: Linus Torvalds
> Sent: 14 April 2021 21:22
> 
> On Wed, Apr 14, 2021 at 1:10 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > There's a philosophical point to be discussed here which you're skating
> > right over!  Should rust-in-the-linux-kernel provide the same memory
> > allocation APIs as the rust-standard-library, or should it provide a Rusty
> > API to the standard-linux-memory-allocation APIs?
> 
> Yeah, I think that the standard Rust API may simply not be acceptable
> inside the kernel, if it has similar behavior to the (completely
> broken) C++ "new" operator.

ISTM that having memory allocation failure cause a user process
to exit is a complete failure in anything designed to run as
any kind of service program.

There are all sorts of reasons why malloc() might fail.
You almost never want a 'real' program to abort on one.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Nick Desaulniers April 15, 2021, 12:22 a.m. UTC | #10
On Wed, Apr 14, 2021 at 11:47 AM <ojeda@kernel.org> wrote:
>
> From: Miguel Ojeda <ojeda@kernel.org>
>
> Some of you have noticed the past few weeks and months that
> a serious attempt to bring a second language to the kernel was
> being forged. We are finally here, with an RFC that adds support
> for Rust to the Linux kernel.
>
> This cover letter is fairly long, since there are quite a few topics
> to describe, but I hope it answers as many questions as possible
> before the discussion starts.
>
> If you are interested in following this effort, please join us
> in the mailing list at:
>
>     rust-for-linux@vger.kernel.org
>
> and take a look at the project itself at:
>
>     https://github.com/Rust-for-Linux

Looks like Wedson's writeup is now live. Nice job Wedson!
https://security.googleblog.com/2021/04/rust-in-linux-kernel.html
Kees Cook April 15, 2021, 1:38 a.m. UTC | #11
Before anything else: yay! I'm really glad to see this RFC officially
hit LKML. :)

On Wed, Apr 14, 2021 at 10:20:51PM +0200, Miguel Ojeda wrote:
>   - On floating-point, 128-bit, etc.: the main issue is that the
> `core` library is a single big blob at the moment. I have already
> mentioned this to some Rust team folks. We will need a way to "cut"
> some things out, for instance with the "feature flags" they already
> have for other crates (or they can split `core` into several, like
> `alloc` is for similar reasons). Or we could do it on our side
> somehow, but I prefer to avoid that (we cannot easily customize `core`
> like we can with `alloc`, because it is tied to the compiler too
> tightly).

Besides just FP, 128-bit, etc, I remain concerned about just basic
math operations. C has no way to describe the intent of integer
overflow, so the kernel was left with the only "predictable" result:
wrap around. Unfortunately, this is wrong in most cases, and we're left
with entire classes of vulnerability related to such overflows.

When originally learning Rust I was disappointed to see that (by default)
Rust similarly ignores the overflow problem, but I'm glad to see the
very intentional choices in the Rust-in-Linux design to deal with it
directly. I think the default behavior should be saturate-with-WARN
(this will match the ultimate goals of the UBSAN overflow support[1][2]
in the C portions of the kernel). Rust code wanting wrapping/checking
can expressly use those. The list of exploitable overflows is loooong,
and this will remain a weakness in Rust unless we get it right from
the start. What's not clear to me is whether it's better to say "math with
undeclared overflow expectation will saturate" or to say "all math must
declare its overflow expectation".

-Kees

[1] https://github.com/KSPP/linux/issues/26
[2] https://github.com/KSPP/linux/issues/27
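[Editorial note: Rust already lets each call site declare its overflow expectation explicitly, which is the vocabulary this discussion is about; which behavior should be the *default* is the open question. A short sketch:]

```rust
fn main() {
    let a: u32 = u32::MAX - 1;

    // Explicit overflow expectations, chosen per call site:
    assert_eq!(a.checked_add(5), None); // detect: overflow yields None
    assert_eq!(a.saturating_add(5), u32::MAX); // saturate at the type's max
    assert_eq!(a.wrapping_add(5), 3); // two's-complement wrap-around
    assert_eq!(a.overflowing_add(5), (3, true)); // wrapped value plus a flag

    // Plain `a + 5` panics in debug builds and wraps in release builds,
    // which is the default behavior Kees argues should be reconsidered.
}
```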
David Laight April 15, 2021, 8:26 a.m. UTC | #12
...
> Besides just FP, 128-bit, etc, I remain concerned about just basic
> math operations. C has no way to describe the intent of integer
> overflow, so the kernel was left with the only "predictable" result:
> wrap around. Unfortunately, this is wrong in most cases, and we're left
> with entire classes of vulnerability related to such overflows.

I'm not sure any of the alternatives (except perhaps panic)
are much better.
Many years ago I used a COBOL system that skipped the assignment
if ADD X to Y (y += x) would overflow.
That gave a very hard-to-spot error when the sum of a long list
was a little too large.
If it had wrapped the error would be obvious.

There are certainly places where saturate is good.
Mostly when dealing with analogue samples.

I guess the problematic code is stuff that checks:
	if (foo->size + constant > limit) goto error;
instead of:
	if (foo->size > limit - constant) goto error;
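[Editorial note: in Rust, the problematic first form can be written so the overflow is impossible to ignore. An illustrative sketch of the same limit check, using `checked_add`:]

```rust
/// Returns true when `size + constant` fits within `limit`,
/// treating arithmetic overflow as "does not fit".
fn fits_within_limit(size: usize, constant: usize, limit: usize) -> bool {
    match size.checked_add(constant) {
        Some(total) => total <= limit, // no overflow: compare normally
        None => false,                 // overflow: certainly over the limit
    }
}

fn main() {
    assert!(fits_within_limit(10, 5, 100));
    assert!(!fits_within_limit(usize::MAX, 2, 100)); // would wrap in C
}
```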

    David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Miguel Ojeda April 15, 2021, 10:05 a.m. UTC | #13
On Thu, Apr 15, 2021 at 2:23 AM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> Looks like Wedson's writeup is now live. Nice job Wedson!
> https://security.googleblog.com/2021/04/rust-in-linux-kernel.html

+1 It is very nicely written and explains the semaphore samples
(included in the RFC) he wrote, with nice tables comparing how
the different parts look in C and Rust!

Anyone interested in this RFC, C or Rust, please take a look!

Cheers,
Miguel
Miguel Ojeda April 15, 2021, 12:39 p.m. UTC | #14
On Thu, Apr 15, 2021 at 3:38 AM Kees Cook <keescook@chromium.org> wrote:
>
> Before anything else: yay! I'm really glad to see this RFC officially
> hit LKML. :)

Thanks! :)

> When originally learning Rust I was disappointed to see that (by default)
> Rust similarly ignores the overflow problem, but I'm glad to see the
> very intentional choices in the Rust-in-Linux design to deal with it
> directly. I think the default behavior should be saturate-with-WARN
> (this will match the ultimate goals of the UBSAN overflow support[1][2]
> in the C portions of the kernel). Rust code wanting wrapping/checking
> can expressly use those. The list of exploitable overflows is loooong,
> and this will remain a weakness in Rust unless we get it right from
> the start. What's not clear to me is whether it's better to say "math with
> undeclared overflow expectation will saturate" or to say "all math must
> declare its overflow expectation".

+1 Agreed, we need to get this right (and ideally make both the C and
Rust sides agree...).

Cheers,
Miguel
Kees Cook April 15, 2021, 6:08 p.m. UTC | #15
On Thu, Apr 15, 2021 at 08:26:21AM +0000, David Laight wrote:
> ...
> > Besides just FP, 128-bit, etc, I remain concerned about just basic
> > math operations. C has no way to describe the intent of integer
> > overflow, so the kernel was left with the only "predictable" result:
> > wrap around. Unfortunately, this is wrong in most cases, and we're left
> > with entire classes of vulnerability related to such overflows.
> 
> I'm not sure any of the alternatives (except perhaps panic)
> are much better.
> Many years ago I used a COBOL system that skipped the assignment
> if ADD X to Y (y += x) would overflow.
> That gave a very hard-to-spot error when the sum of a long list
> was a little too large.
> If it had wrapped the error would be obvious.
> 
> There are certainly places where saturate is good.
> Mostly when dealing with analogue samples.
> 
> I guess the problematic code is stuff that checks:
> 	if (foo->size + constant > limit) goto error;
> instead of:
> 	if (foo->size > limit - constant) goto error;

Right. This and alloc(size * count) are the primary offenders. :)
Peter Zijlstra April 15, 2021, 6:58 p.m. UTC | #16
On Wed, Apr 14, 2021 at 08:45:51PM +0200, ojeda@kernel.org wrote:

> Rust is a systems programming language that brings several key
> advantages over C in the context of the Linux kernel:
> 
>   - No undefined behavior in the safe subset (when unsafe code is
>     sound), including memory safety and the absence of data races.

And yet I see not a single mention of the Rust Memory Model and how it
aligns (or not) with the LKMM. The C11 memory model for example is a
really poor fit for LKMM.

> ## Why not?
> 
> Rust also has disadvantages compared to C in the context of
> the Linux kernel:
> 
>   - The many years of effort in tooling for C around the kernel,
>     including compiler plugins, sanitizers, Coccinelle, lockdep,
>     sparse... However, this will likely improve if Rust usage in
>     the kernel grows over time.

This; can we mercilessly break the .rs bits when refactoring? What
happens the moment we cannot boot x86_64 without Rust crap on?

We can ignore this as a future problem, but I think it's only fair to
discuss now. I really don't care for that future, and IMO adding this
Rust or any other second language is a fail.

> Thirdly, in Rust code bases, most documentation is written alongside
> the source code, in Markdown. We follow this convention, thus while
> we have a few general documents in `Documentation/rust/`, most of
> the actual documentation is in the source code itself.
> 
> In order to read this documentation easily, Rust provides a tool
> to generate HTML documentation, just like Sphinx/kernel-doc, but
> suited to Rust code bases and the language concepts.

HTML is not a valid documentation format. Heck, markdown itself is
barely readable.

> Moreover, as explained above, we are taking the chance to enforce
> some documentation guidelines. We are also enforcing automatic code
> formatting, a set of Clippy lints, etc. We decided to go with Rust's
> idiomatic style, i.e. keeping `rustfmt` defaults. For instance, this
> means 4 spaces are used for indentation, rather than a tab. We are
> happy to change that if needed -- we think what is important is
> keeping the formatting automated.

It is really *really* hard to read. It has all sorts of weird things,
like operators at the beginning after a line break:

	if (foo
	    || bar)

which is just wrong. And it suffers from CamelCase, which is just about
the worst thing ever. Not even the C++ std libs have that (or had, back
when I still did know C++).

I also see:

	if (foo) {
		...
	}

and

	if foo {
	}

the latter, of course, being complete rubbish.

> Another important topic we would like feedback on is the Rust
> "native" documentation that is written alongside the code, as
> explained above. We have uploaded it here:
> 
>     https://rust-for-linux.github.io/docs/kernel/
> 
> We like how this kind of generated documentation looks. Please take
> a look and let us know what you think!

I cannot view with less or vim. Therefore it looks not at all.

>   - Boqun Feng is working hard on the different options for
>     threading abstractions and has reviewed most of the `sync` PRs.

Boqun, I know you're familiar with LKMM, can you please talk about how
Rust does things and how it interacts?
Wedson Almeida Filho April 16, 2021, 2:22 a.m. UTC | #17
On Thu, Apr 15, 2021 at 08:58:16PM +0200, Peter Zijlstra wrote:
> On Wed, Apr 14, 2021 at 08:45:51PM +0200, ojeda@kernel.org wrote:
> 
> > Rust is a systems programming language that brings several key
> > advantages over C in the context of the Linux kernel:
> > 
> >   - No undefined behavior in the safe subset (when unsafe code is
> >     sound), including memory safety and the absence of data races.
> 
> And yet I see not a single mention of the Rust Memory Model and how it
> aligns (or not) with the LKMM. The C11 memory model for example is a
> really poor fit for LKMM.

We don't intend to directly expose C data structures to Rust code (outside the
kernel crate). Instead, we intend to provide wrappers that expose safe
interfaces even though the implementation may use unsafe blocks. So we expect
the vast majority of Rust code to just care about the Rust memory model.
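[Editorial note: a minimal sketch of the wrapper pattern Wedson describes. The `unsafe` function below stands in for an FFI binding to a C-side kernel routine; it is invented for illustration.]

```rust
/// Stand-in for an `extern "C"` kernel function that fills `len` bytes at `ptr`.
///
/// # Safety
///
/// `ptr` must be valid for writes of `len` bytes.
unsafe fn c_style_fill(ptr: *mut u8, len: usize) {
    for i in 0..len {
        *ptr.add(i) = 0xAA;
    }
}

/// Safe wrapper: the slice type encodes the pointer/length invariant,
/// so callers never write `unsafe` themselves.
fn fill(buf: &mut [u8]) {
    // SAFETY: `buf` is valid for writes of `buf.len()` bytes.
    unsafe { c_style_fill(buf.as_mut_ptr(), buf.len()) }
}

fn main() {
    let mut buf = [0u8; 4];
    fill(&mut buf);
    assert_eq!(buf, [0xAA; 4]);
}
```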

We admittedly don't have a huge number of wrappers yet, but we do have enough to
implement most of Binder and so far it's been ok. We do intend to eventually
cover other classes of drivers that may unveil unforeseen difficulties, we'll
see.

If you have concerns that we might have overlooked, we'd be happy to hear about
them from you (or anyone else).

> HTML is not a valid documentation format. Heck, markdown itself is
> barely readable.

Are you stating [what you perceive as] a fact or just venting? If the former,
would you mind enlightening us with some evidence?

> It is really *really* hard to read. It has all sorts of weird things,
> like operators at the beginning after a line break:
> 
> 	if (foo
> 	    || bar)
> 
> which is just wrong. And it suffers from CamelCase, which is just about
> the worst thing ever. Not even the C++ std libs have that (or had, back
> when I still did know C++).
> 
> I also see:
> 
> 	if (foo) {
> 		...
> 	}
> 
> and
> 
> 	if foo {
> 	}
> 
> the latter, of course, being complete rubbish.

There are advantages to adopting the preferred style of a language (when one
exists). We, of course, are not required to adopt it but I am of the opinion
that we should have good reasons to diverge if that's our choice in the end.

"Not having parentheses around the if-clause expression is complete rubbish"
doesn't sound like a good reason to me.
Al Viro April 16, 2021, 4:25 a.m. UTC | #18
On Fri, Apr 16, 2021 at 03:22:16AM +0100, Wedson Almeida Filho wrote:

> > HTML is not a valid documentation format. Heck, markdown itself is
> > barely readable.
> 
> Are you stating [what you perceive as] a fact or just venting? If the former,
> would you mind enlightening us with some evidence?

How about "not everyone uses a browser as a part of their workflow"?
I realize that it might sound ridiculous for folks who spent a while
around Mozilla, but it's really true and kernel community actually
has quite a few of such freaks.  And as one of those freaks I can tell
you where exactly I would like you to go and what I would like you to do
with implicit suggestions to start a browser when I need to read some
in-tree documentation.

Linus might have different reasons, obviously.
Boqun Feng April 16, 2021, 4:27 a.m. UTC | #19
[Copy LKMM people, Josh, Nick and Wedson]

On Thu, Apr 15, 2021 at 08:58:16PM +0200, Peter Zijlstra wrote:
> On Wed, Apr 14, 2021 at 08:45:51PM +0200, ojeda@kernel.org wrote:
> 
> > Rust is a systems programming language that brings several key
> > advantages over C in the context of the Linux kernel:
> > 
> >   - No undefined behavior in the safe subset (when unsafe code is
> >     sound), including memory safety and the absence of data races.
> 
> And yet I see not a single mention of the Rust Memory Model and how it
> aligns (or not) with the LKMM. The C11 memory model for example is a
> really poor fit for LKMM.
> 

I think Rust currently uses C11 memory model as per:

	https://doc.rust-lang.org/nomicon/atomics.html

, and I guess another reason they picked the C11 memory model is that
LLVM supports it by default.
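[Editorial note: concretely, Rust exposes the C11 orderings (Relaxed, Acquire, Release, AcqRel, SeqCst) through `std::sync::atomic::Ordering`; consume ordering is notably absent, which is part of why patterns like rcu_dereference() are hard to express. A classic release/acquire publish sketch:]

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};

static DATA: AtomicU32 = AtomicU32::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn publish(v: u32) {
    DATA.store(v, Ordering::Relaxed);
    // Release: prior writes become visible to an Acquire load that sees `true`.
    READY.store(true, Ordering::Release);
}

fn consume() -> Option<u32> {
    if READY.load(Ordering::Acquire) {
        Some(DATA.load(Ordering::Relaxed))
    } else {
        None
    }
}

fn main() {
    assert_eq!(consume(), None);
    publish(7);
    assert_eq!(consume(), Some(7));
}
```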

But I think the Rust Community still wants to have a good memory model,
and they are open to any kind of suggestion and input. I think we (LKMM
people) should really get involved, because the recent discussion on
RISC-V's atomics shows that if we didn't people might get a "broken"
design because they thought C11 memory model is good enough:

	https://lore.kernel.org/lkml/YGyZPCxJYGOvqYZQ@boqun-archlinux/

And the benefits are mutual: a) the Linux Kernel Memory Model (LKMM) is
defined by combining the requirements of developers and the behavior of
hardware; it's practical and can be a very good input for memory model
design in Rust; b) once Rust has a better memory model, whatever
compiler technologies Rust compilers use to support the memory model
can be adopted by C compilers, and we get that part for free.

At least I personally am very interested in helping Rust get a complete
and practical memory model ;-)

Josh, I think it would be good if we could connect with the people
working on the Rust memory model. I think the right person is Ralf Jung
and the right place is https://github.com/rust-lang/unsafe-code-guidelines,
but you certainly know better than me ;-) Or maybe we can use the
Rust-for-Linux or linux-toolchains list to discuss.

[...]
> >   - Boqun Feng is working hard on the different options for
> >     threading abstractions and has reviewed most of the `sync` PRs.
> 
> Boqun, I know you're familiar with LKMM, can you please talk about how
> Rust does things and how it interacts?

As Wedson said in the other email, currently there is no code requiring
synchronization between C side and Rust side, so we are currently fine.
But in the longer term, we need to teach the Rust memory model about
the "design patterns" used in the Linux kernel for parallel programming.
What I have been doing so far is reviewing patches that involve memory
orderings in the Rust-for-Linux project, trying to make sure we don't
introduce memory ordering bugs from the beginning.

Regards,
Boqun
Wedson Almeida Filho April 16, 2021, 5:02 a.m. UTC | #20
On Fri, Apr 16, 2021 at 04:25:34AM +0000, Al Viro wrote:

> > Are you stating [what you perceive as] a fact or just venting? If the former,
> > would you mind enlightening us with some evidence?
> 
> How about "not everyone uses a browser as a part of their workflow"?

The documentation is available in markdown alongside the code. You don't need a
browser to see it. I, for one, use neovim and a rust LSP, so I can see the
documentation by pressing shift+k.

> I realize that it might sound ridiculous for folks who spent a while
> around Mozilla, but it's really true and kernel community actually
> has quite a few of such freaks.

I haven't spent any time around Mozilla myself (not that there's anything wrong
with it), so I can't really comment on this.

> And as one of those freaks I can tell
> you where exactly I would like you to go and what I would like you to do
> with implicit suggestions to start a browser when I need to read some
> in-tree documentation.

I could be mistaken but you seem angry. Perhaps it wouldn't be a bad idea to
read your own code of conduct, I don't think you need a browser for that either.
Paul Zimmerman April 16, 2021, 5:39 a.m. UTC | #21
On Fri, Apr 16, 2021 at 06:02:33 +0100, Wedson Almeida Filho wrote:
> On Fri, Apr 16, 2021 at 04:25:34AM +0000, Al Viro wrote:
>
>>> Are you stating [what you perceive as] a fact or just venting? If the former,
>>> would you mind enlightening us with some evidence?
>> 
>> How about "not everyone uses a browser as a part of their workflow"?
>
> The documentation is available in markdown alongside the code. You don't need a
> browser to see it. I, for one, use neovim and a rust LSP, so I can see the
> documentation by pressing shift+k.
>
>> I realize that it might sound ridiculous for folks who spent a while
>> around Mozilla, but it's really true and kernel community actually
>> has quite a few of such freaks.
>
> I haven't spent any time around Mozilla myself (not that there's anything wrong
> with it), so I can't really comment on this.
>
>> And as one of those freaks I can tell
>> you where exactly I would like you to go and what I would like you to do
>> with implicit suggestions to start a browser when I need to read some
>> in-tree documentation.
>
> I could be mistaken but you seem angry. Perhaps it wouldn't be a bad idea to
> read your own code of conduct, I don't think you need a browser for that either.

Haven't you folks ever heard of lynx? It's a good old-fashioned command-line tool that
opens html files in a terminal window, supports following links within the file,
good stuff like that. I don't see how the dinosaurs^W traditional folks could
object to that!

-- Paul
Nick Desaulniers April 16, 2021, 6:04 a.m. UTC | #22
On Thu, Apr 15, 2021 at 9:27 PM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> [Copy LKMM people, Josh, Nick and Wedson]
>
> On Thu, Apr 15, 2021 at 08:58:16PM +0200, Peter Zijlstra wrote:
> > On Wed, Apr 14, 2021 at 08:45:51PM +0200, ojeda@kernel.org wrote:
> >
> > > Rust is a systems programming language that brings several key
> > > advantages over C in the context of the Linux kernel:
> > >
> > >   - No undefined behavior in the safe subset (when unsafe code is
> > >     sound), including memory safety and the absence of data races.
> >
> > And yet I see not a single mention of the Rust Memory Model and how it
> > aligns (or not) with the LKMM. The C11 memory model for example is a
> > really poor fit for LKMM.
> >
>
> I think Rust currently uses C11 memory model as per:
>
>         https://doc.rust-lang.org/nomicon/atomics.html
>
> , also I guess another reason that they pick C11 memory model is because
> LLVM has the support by default.
>
> But I think the Rust Community still wants to have a good memory model,
> and they are open to any kind of suggestion and input. I think we (LKMM
> people) should really get involved, because the recent discussion on
> RISC-V's atomics shows that if we didn't people might get a "broken"
> design because they thought C11 memory model is good enough:
>
>         https://lore.kernel.org/lkml/YGyZPCxJYGOvqYZQ@boqun-archlinux/
>
> And the benefits are mutual: a) the Linux Kernel Memory Model (LKMM) is
> defined by combining the requirements of developers and the behavior of
> hardware; it's practical and can be a very good input for memory model
> design in Rust; b) once Rust has a better memory model, whatever
> compiler technologies Rust compilers use to support the memory model
> can be adopted by C compilers, and we get that part for free.

Yes, I agree; I think that's a very good approach.  Avoiding the ISO
WG14 is interesting; at least the merits could be debated in
public and not behind closed doors.

>
> At least I personally am very interested in helping Rust get a complete
> and practical memory model ;-)
>
> Josh, I think it would be good if we could connect with the people
> working on the Rust memory model. I think the right person is Ralf Jung
> and the right place is https://github.com/rust-lang/unsafe-code-guidelines,
> but you certainly know better than me ;-) Or maybe we can use the
> Rust-for-Linux or linux-toolchains list to discuss.
>
> [...]
> > >   - Boqun Feng is working hard on the different options for
> > >     threading abstractions and has reviewed most of the `sync` PRs.
> >
> > Boqun, I know you're familiar with LKMM, can you please talk about how
> > Rust does things and how it interacts?
>
> As Wedson said in the other email, currently there is no code requiring
> synchronization between C side and Rust side, so we are currently fine.
> But in the longer term, we need to teach Rust memory model about the
> "design patterns" used in Linux kernel for parallel programming.
>
> What I have been doing so far is reviewing patches which have memory
> orderings in Rust-for-Linux project, try to make sure we don't include
> memory ordering bugs for the beginning.
>
> Regards,
> Boqun
Peter Zijlstra April 16, 2021, 7:09 a.m. UTC | #23
On Fri, Apr 16, 2021 at 03:22:16AM +0100, Wedson Almeida Filho wrote:
> On Thu, Apr 15, 2021 at 08:58:16PM +0200, Peter Zijlstra wrote:
> > On Wed, Apr 14, 2021 at 08:45:51PM +0200, ojeda@kernel.org wrote:
> > 
> > > Rust is a systems programming language that brings several key
> > > advantages over C in the context of the Linux kernel:
> > > 
> > >   - No undefined behavior in the safe subset (when unsafe code is
> > >     sound), including memory safety and the absence of data races.
> > 
> > And yet I see not a single mention of the Rust Memory Model and how it
> > aligns (or not) with the LKMM. The C11 memory model for example is a
> > really poor fit for LKMM.
> 
> We don't intend to directly expose C data structures to Rust code (outside the
> kernel crate). Instead, we intend to provide wrappers that expose safe
> interfaces even though the implementation may use unsafe blocks. So we expect
> the vast majority of Rust code to just care about the Rust memory model.
> 
> We admittedly don't have a huge number of wrappers yet, but we do have enough to
> implement most of Binder and so far it's been ok. We do intend to eventually
> cover other classes of drivers that may unveil unforeseen difficulties, we'll
> see.
> 
> If you have concerns that we might have overlooked, we'd be happy to hear about
> them from you (or anyone else).

Well, the obvious example would be seqlocks. C11 can't do them. Not
sharing data structures would avoid most of that, but it will also cost
you performance.

Similar thing for RCU; C11 can't do that optimally; it needs to make
rcu_dereference() a load-acquire [something ARM64 has already done in C
because the compiler might be too clever by half when doing LTO :-(].
But it's the compiler needing the acquire semantics, not the computer,
which is just bloody wrong.

And there's more sharp corners to be had. But yes, if you're not
actually sharing anything; and taking the performance hit that comes
with that, you might get away with it.
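[Editorial note: for reference, a user-space sketch of the seqlock pattern Peter mentions. This is illustrative only, and NOT the kernel's seqlock_t; the reader's unsynchronized data access is exactly the part the C11/Rust memory model formally calls a data race, which is the mismatch being discussed.]

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{fence, AtomicUsize, Ordering};

/// Illustrative seqlock sketch. The reader's plain data load is UB under
/// the C11/Rust model when it races with a writer -- precisely the point.
struct SeqLock<T: Copy> {
    seq: AtomicUsize,
    data: UnsafeCell<T>,
}

unsafe impl<T: Copy + Send> Sync for SeqLock<T> {}

impl<T: Copy> SeqLock<T> {
    const fn new(v: T) -> Self {
        SeqLock { seq: AtomicUsize::new(0), data: UnsafeCell::new(v) }
    }

    /// Writer; assumed externally serialized (e.g. by a spinlock).
    fn write(&self, v: T) {
        let s = self.seq.load(Ordering::Relaxed);
        self.seq.store(s.wrapping_add(1), Ordering::Relaxed); // odd: writer active
        fence(Ordering::Release);
        unsafe { *self.data.get() = v };
        self.seq.store(s.wrapping_add(2), Ordering::Release); // even: done
    }

    /// Reader; retries while a write is in flight.
    fn read(&self) -> T {
        loop {
            let s1 = self.seq.load(Ordering::Acquire);
            if s1 & 1 == 1 {
                continue; // writer active, retry
            }
            let v = unsafe { *self.data.get() }; // racy under C11 rules
            fence(Ordering::Acquire);
            if self.seq.load(Ordering::Relaxed) == s1 {
                return v;
            }
        }
    }
}

fn main() {
    let lock = SeqLock::new(0u64);
    lock.write(42);
    assert_eq!(lock.read(), 42);
}
```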

> > HTML is not a valid documentation format. Heck, markdown itself is
> > barely readable.
> 
> Are you stating [what you perceive as] a fact or just venting? If the former,
> would you mind enlightening us with some evidence?

I've yet to see a program that renders HTML (including all the cruft
often used in docs, which might include SVG graphics and whatnot) sanely
in ASCII. Lynx does not qualify; its output is atrocious crap.

Yes, lynx lets you read HTML in ASCII, but at the cost of bleeding
eyeballs and missing content.

Nothing beats a sane ASCII document with, possibly, where really needed,
some ASCII art.

Sadly the whole kernel documentation project is moving away from that as
well, which just means I'm back to working on an undocumented codebase.
This rst crap they adopted is unreadable garbage.

> > It is really *really* hard to read. It has all sorts of weird things,
> > like operators at the beginning after a line break:
> > 
> > 	if (foo
> > 	    || bar)
> > 
> > which is just wrong. And it suffers from CamelCase, which is just about
> > the worst thing ever. Not even the C++ std libs have that (or had, back
> > when I still knew C++).
> > 
> > I also see:
> > 
> > 	if (foo) {
> > 		...
> > 	}
> > 
> > and
> > 
> > 	if foo {
> > 	}
> > 
> > the latter, of course, being complete rubbish.
> 
> There are advantages to adopting the preferred style of a language (when one
> exists). We, of course, are not required to adopt it but I am of the opinion
> that we should have good reasons to diverge if that's our choice in the end.
> 
> "Not having parentheses around the if-clause expression is complete rubbish"
> doesn't sound like a good reason to me.

Of course it does; my internal lexer keeps screaming syntax error at me;
how am I going to understand code when I can't sanely read it?

The more you make it look like (Kernel) C, the easier it is for us C
people to actually read. My eyes have been reading C for almost 30 years
by now; they have a lexer built into the optical nerve. Reading something
that looks vaguely like C but is definitely not C is an utterly painful
experience.

You're asking to join us, not the other way around. I'm fine in a world
without Rust.
Peter Zijlstra April 16, 2021, 7:46 a.m. UTC | #24
On Fri, Apr 16, 2021 at 06:02:33AM +0100, Wedson Almeida Filho wrote:
> On Fri, Apr 16, 2021 at 04:25:34AM +0000, Al Viro wrote:

> > And as one of those freaks I can tell
> > you where exactly I would like you to go and what I would like you to do
> > with implicit suggestions to start a browser when I need to read some
> > in-tree documentation.
> 
> I could be mistaken but you seem angry. Perhaps it wouldn't be a bad idea to
> read your own code of conduct, I don't think you need a browser for that either.

Welcome to LKML. CoC does not forbid human emotions just yet. Deal with
it.
Michal Kubecek April 16, 2021, 8:16 a.m. UTC | #25
On Thu, Apr 15, 2021 at 08:58:16PM +0200, Peter Zijlstra wrote:
> 
> This; can we mercilessly break the .rs bits when refactoring? What
> happens the moment we cannot boot x86_64 without Rust crap on?
> 
> We can ignore this as a future problem, but I think it's only fair to
> discuss now. I really don't care for that future, and IMO adding this
> Rust or any other second language is a fail.

I believe this is the most important question and we really need
an honest answer in advance: where exactly is this heading? At the moment
and with this experimental RFC, rust stuff can be optional and isolated
but it's obvious that the plan is very different: to have rust all
around the standard kernel tree. (If not, why is the example driver in
drivers/char/ ?)

And I don't see how the two languages might coexist peacefully without
rust toolchain being necessary for building any kernel useful in
practice and anyone seriously involved in kernel development having to
be proficient in both languages. Neither of these looks appealing to
me.

The dependency on the rust toolchain was exactly what made me give up on
building Firefox from mercurial snapshots a few years ago. To be able to
build them, one needed bleeding edge snapshots of the rust toolchain which
my distribution couldn't possibly provide, and building them myself
required way too much effort. This very discussion has already revealed that
rust kernel code would provide a similar experience. I also have my doubts
about the "optional" part; once there are some interesting drivers
written in rust, even if only in the form of out of tree modules, there
will be an enormous pressure on distributions, both community and
enterprise, to enable rust support. Once the major distributions do,
most others will have to follow. And from what I have seen, you need
rust toolchain for build even if you want to only load modules written
in rust.

The other problem is even worse. Once we have non-trivial amount of rust
code around the tree, even if it's "just some drivers", you cannot
completely ignore it. One example would be internal API changes. Today,
if I want to touch e.g. ethtool_ops, I need to adjust all in-tree NIC
drivers providing the affected callback. Usually most of
the patch is generated by spatch but manual tweaks are often needed here
and there. In the world of bilingual kernel with nontrivial number of
NIC drivers written in rust, I don't see how I could do that without
also being proficient in rust.

Also, how about maintainers and reviewers? What if someone comes with
a new module under foo/ or foo/bar/ and relevant maintainer does not
know rust or just not well enough to be able to review the submission
properly? Can they simply say "Sorry, I don't speak rust so no rust in
foo/bar/"? Leaf drivers are one thing, how about netfilter matches and
targets, TCP congestion control algorithms, qdiscs, filesystems, ...?
Having kernel tree divided into "rusty" and "rustfree" zones does not
sound like a great idea. But if we don't want that, do we expect every
subsystem maintainer and reviewer to learn rust to a level sufficient
for reviewing rust (kernel) code? Rust enthusiasts tell us they want to
open kernel development to more people but the result could as well be
exactly the opposite: it could restrict kernel development to people
proficient in _both_ languages.

As Peter said, it's not an imminent problem but as it's obvious this is
just the first step, we should have a clear idea what the plan is and
what we can and should expect.

Michal
Willy Tarreau April 16, 2021, 9:29 a.m. UTC | #26
On Fri, Apr 16, 2021 at 10:16:05AM +0200, Michal Kubecek wrote:
> And I don't see how the two languages might coexist peacefully without
> rust toolchain being necessary for building any kernel useful in
> practice and anyone seriously involved in kernel development having to
> be proficient in both languages.

Two languages? No, one is specified and has multiple implementations; the
other one is defined as whatever its only compiler understands at the
moment, so it's not a language, it's a compiler's reference manual at best.
I'm waiting for the day you're forced to write things which look wrong with
a big comment around them saying "it's not a bug, it's a workaround for a
bug in the unique compiler, waiting to be retrofitted into the spec to
solve the problem for every user". Already seen for at least one other
"language" implemented by a single vendor 22 years ago.

> Neither of these looks appealing to me.
> 
> The dependency on rust toolchain was exactly what made me give up on
> building Firefox from mercurial snapshots few years ago. To be able to
> build them, one needed bleeding edge snapshots of rust toolchain which
> my distribution couldn't possibly provide and building them myself
> required way too much effort. This very discussion already revealed that
> rust kernel code would provide similar experience. I also have my doubts
> about the "optional" part; once there are some interesting drivers
> written in rust, even if only in the form of out of tree modules, there
> will be an enormous pressure on distributions, both community and
> enterprise, to enable rust support.

Yes this scarily looks like the usual "embrace and extend... and abandon
the corpse once it doesn't move anymore".

I've already faced situations where I couldn't compile a recent 5.x kernel
using my previous gcc-4.7 compiler and this really really really pissed me
off because I'd had it in a build farm for many architectures and I had to
give up. But I also know that updating to a newer version will take time,
will be durable and will be worth it for the long term (except for the fact
that gcc doubles the build time every two versions). But here having to use
*the* compiler of the day and being prepared to keep a collection of them
to work with different stable kernels, no!

Also, I'm a bit worried about the long-term survival of the new
language-of-the-day-that-makes-you-look-cool-at-beer-events. I was once
told perl would replace C everywhere. Does anyone use it outside of
checkpatch.pl anymore? Then I was told that C was dead because PHP was
appearing everywhere. I've even seen (slow) log processors written with
it. Now PHP seems to only be a WAF-selling argument. Then Ruby was "safe"
and would rule them all. Safe as its tab[-1] which crashed the interpreter.
Anyone heard of it recently? Then Python, whose 2.7 is still present on
a lot of systems because the forced transition to 3 broke tons of code.
Will there ever be a 4 after this sore experience? Then JS, Rust, Go,
Zig and I don't know what. What I'm noting is that such languages appear,
serve a purpose well, have their moment of fame, last a decade and
disappear except among a few enthusiasts. C has been there for 50 years
and served as the basis of many newer languages so it's still well
understood. I'm sure about one thing, the C bugs we have today will be
fixable in 20 years. I'm not even sure the Rust code we'll merge today
will still be compilable in 10 years nor will support the relevant
architectures available by then, and probably this code will have to
be rewritten in C to become maintained again.

> The other problem is even worse. Once we have non-trivial amount of rust
> code around the tree, even if it's "just some drivers", you cannot
> completely ignore it. One example would be internal API changes. Today,
> if I want to touch e.g. ethtool_ops, I need to adjust all in-tree NIC
> drivers providing the affected callback. Usually most of
> the patch is generated by spatch but manual tweaks are often needed here
> and there. In the world of bilingual kernel with nontrivial number of
> NIC drivers written in rust, I don't see how I could do that without
> also being proficient in rust.

You'll simply change the code you're able to change, and those in charge
of each driver will use your commit message as instructions to fix the
build on theirs. How do you want it to be otherwise?

> Rust enthusiasts tell us they want to
> open kernel development to more people but the result could as well be
> exactly the opposite: it could restrict kernel development to people
> proficient in _both_ languages.

This has been my understanding from the very beginning. Language prophets
always want to conquer very visible targets as a gauge of their baby's
popularity.

> As Peter said, it's not an imminent problem but as it's obvious this is
> just the first step, we should have a clear idea what the plan is and
> what we can and should expect.

I think the experience could be... interesting. However, I do note that
I've read quite a few claims about better security yada yada due to a
stricter memory model. Except that I seem to have understood that a lot
of code will have to run in unsafe mode (which partially voids some of
the benefits), that we'll be at the mercy of the unique compiler's bugs,
and that in addition code auditing will be very hard and reviews of the
boundaries between the two languages almost nonexistent. This is precisely
what will become the new playground of attackers, and I predict a
significant increase of vulnerabilities past this point. Time will tell;
hoping it's never too late to roll back if it gets crazy. As long as the
code remains readable, it could be rewritten in C to regain control...

> Michal

Willy
Peter Zijlstra April 16, 2021, 11:24 a.m. UTC | #27
On Wed, Apr 14, 2021 at 08:45:51PM +0200, ojeda@kernel.org wrote:
>   - Featureful language: sum types, pattern matching, generics,
>     RAII, lifetimes, shared & exclusive references, modules &
>     visibility, powerful hygienic and procedural macros...

IMO RAII is over-valued, but just in case you care, the below seems to
work just fine. No fancy new language needed, works today. Similarly you
can create refcount_t guards, or with a little more work full blown
smart_ptr crud.

---
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index e19323521f9c..f03a72dd8cea 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -197,4 +197,22 @@ extern void mutex_unlock(struct mutex *lock);
 
 extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
 
+struct mutex_guard {
+	struct mutex *mutex;
+};
+
+static inline struct mutex_guard mutex_guard_lock(struct mutex *mutex)
+{
+	mutex_lock(mutex);
+	return (struct mutex_guard){ .mutex = mutex, };
+}
+
+static inline void mutex_guard_unlock(struct mutex_guard *guard)
+{
+	mutex_unlock(guard->mutex);
+}
+
+#define DEFINE_MUTEX_GUARD(name, lock)			\
+	struct mutex_guard __attribute__((__cleanup__(mutex_guard_unlock))) name = mutex_guard_lock(lock)
+
 #endif /* __LINUX_MUTEX_H */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 8ee3249de2f0..603d197a83b8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5715,16 +5715,15 @@ static long perf_compat_ioctl(struct file *file, unsigned int cmd,
 
 int perf_event_task_enable(void)
 {
+	DEFINE_MUTEX_GUARD(event_mutex, &current->perf_event_mutex);
 	struct perf_event_context *ctx;
 	struct perf_event *event;
 
-	mutex_lock(&current->perf_event_mutex);
 	list_for_each_entry(event, &current->perf_event_list, owner_entry) {
 		ctx = perf_event_ctx_lock(event);
 		perf_event_for_each_child(event, _perf_event_enable);
 		perf_event_ctx_unlock(event, ctx);
 	}
-	mutex_unlock(&current->perf_event_mutex);
 
 	return 0;
 }
Wedson Almeida Filho April 16, 2021, 1:07 p.m. UTC | #28
On Fri, Apr 16, 2021 at 01:24:23PM +0200, Peter Zijlstra wrote:
> On Wed, Apr 14, 2021 at 08:45:51PM +0200, ojeda@kernel.org wrote:
> >   - Featureful language: sum types, pattern matching, generics,
> >     RAII, lifetimes, shared & exclusive references, modules &
> >     visibility, powerful hygienic and procedural macros...
> 
> IMO RAII is over-valued, but just in case you care, the below seems to
> work just fine. No fancy new language needed, works today. Similarly you
> can create refcount_t guards, or with a little more work full blown
> smart_ptr crud.

Peter, we do care, thank you for posting this. It's a great example for us to
discuss some of the minutiae of what we think Rust brings to the table in
addition to what's already possible in C.

> 
> ---
> diff --git a/include/linux/mutex.h b/include/linux/mutex.h
> index e19323521f9c..f03a72dd8cea 100644
> --- a/include/linux/mutex.h
> +++ b/include/linux/mutex.h
> @@ -197,4 +197,22 @@ extern void mutex_unlock(struct mutex *lock);
>  
>  extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
>  
> +struct mutex_guard {
> +	struct mutex *mutex;
> +};
> +
> +static inline struct mutex_guard mutex_guard_lock(struct mutex *mutex)
> +{
> +	mutex_lock(mutex);
> +	return (struct mutex_guard){ .mutex = mutex, };
> +}
> +
> +static inline void mutex_guard_unlock(struct mutex_guard *guard)
> +{
> +	mutex_unlock(guard->mutex);
> +}
> +
> +#define DEFINE_MUTEX_GUARD(name, lock)			\
> +	struct mutex_guard __attribute__((__cleanup__(mutex_guard_unlock))) name = mutex_guard_lock(lock)
> +
>  #endif /* __LINUX_MUTEX_H */
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 8ee3249de2f0..603d197a83b8 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5715,16 +5715,15 @@ static long perf_compat_ioctl(struct file *file, unsigned int cmd,
>  
>  int perf_event_task_enable(void)
>  {
> +	DEFINE_MUTEX_GUARD(event_mutex, &current->perf_event_mutex);

There is nothing in C forcing developers to actually use DEFINE_MUTEX_GUARD. So
someone may simply forget (or not know that they need) to lock
current->perf_event_mutex and directly access some field protected by it. This
is unlikely to happen when one first writes the code, but over time as different
people modify the code and invariants change, it is possible for this to happen.

In Rust, this isn't possible: the data protected by a lock is only accessible
when the lock is locked. So developers cannot accidentally make mistakes of this
kind. And since the enforcement happens at compile time, there is no runtime
cost.

This, we believe, is fundamental to the discussion: we agree that many of these
idioms can be implemented in C (albeit in this case with a compiler extension),
but their use is optional, people can (and do) still make mistakes that lead to
vulnerabilities; Rust disallows classes of mistakes by construction.

Another scenario: suppose within perf_event_task_enable you need to call a
function that requires the mutex to be locked and that will unlock it for you on
error (or unconditionally, doesn't matter). How would you do that in C? In Rust,
there is a clean idiomatic way of transferring ownership of a guard (or any
other object) such that the previous owner cannot continue to use it after
ownership is transferred. Again, this is enforced at compile time. I'm happy to
provide a small example if that would help.

Again, thanks for bringing this up. And please keep your concerns and feedback
coming, we very much want to have these discussions and try to improve what we
have based on feedback from the community.

>  	struct perf_event_context *ctx;
>  	struct perf_event *event;
>  
> -	mutex_lock(&current->perf_event_mutex);
>  	list_for_each_entry(event, &current->perf_event_list, owner_entry) {
>  		ctx = perf_event_ctx_lock(event);
>  		perf_event_for_each_child(event, _perf_event_enable);
>  		perf_event_ctx_unlock(event, ctx);
>  	}
> -	mutex_unlock(&current->perf_event_mutex);
>  
>  	return 0;
>  }
Peter Zijlstra April 16, 2021, 2:19 p.m. UTC | #29
On Fri, Apr 16, 2021 at 02:07:49PM +0100, Wedson Almeida Filho wrote:
> On Fri, Apr 16, 2021 at 01:24:23PM +0200, Peter Zijlstra wrote:

> >  int perf_event_task_enable(void)
> >  {
> > +	DEFINE_MUTEX_GUARD(event_mutex, &current->perf_event_mutex);
> 
> There is nothing in C forcing developers to actually use DEFINE_MUTEX_GUARD. So
> someone may simply forget (or not know that they need) to lock
> current->perf_event_mutex and directly access some field protected by it. This
> is unlikely to happen when one first writes the code, but over time as different
> people modify the code and invariants change, it is possible for this to happen.
> 
> In Rust, this isn't possible: the data protected by a lock is only accessible
> when the lock is locked. So developers cannot accidentally make mistakes of this
> kind. And since the enforcement happens at compile time, there is no runtime
> cost.
> 
> This, we believe, is fundamental to the discussion: we agree that many of these
> idioms can be implemented in C (albeit in this case with a compiler extension),
> but their use is optional, people can (and do) still make mistakes that lead to
> vulnerabilities; Rust disallows classes of  mistakes by construction.

Does this also not prohibit constructs where modification must be done
while holding two locks, but reading can be done while holding either
lock?

That's a semi common scheme in the kernel, but not something that's
expressible by, for example, the Java sync keyword.

It also very much doesn't work for RCU, where modification must be done
under a lock, but access is done essentially lockless.

I would much rather have a language extension where we can associate
custom assertions with variable access, sorta like a sanitizer:

static inline void assert_foo_bar(struct foo *f)
{
	lockdep_assert_held(&f->lock);
}

struct foo {
	spinlock_t lock;
	int bar __assert__(assert_foo_bar);
};

Such things can be optional and only enabled for debug builds on new
compilers.

> Another scenario: suppose within perf_event_task_enable you need to call a
> function that requires the mutex to be locked and that will unlock it for you on
> error (or unconditionally, doesn't matter). How would you do that in C? In Rust,
> there is a clean idiomatic way of transferring ownership of a guard (or any
> other object) such that the previous owner cannot continue to use it after
> ownership is transferred. Again, this is enforced at compile time. I'm happy to
> provide a small example if that would help.

C does indeed not have the concept of ownership, unlike modern C++ I
think. But I would much rather see a C language extension for that than
go Rust.

This would mean a far more aggressive push for newer C compilers than
we've ever done before, but at least it would all still be a single
language. Conversion to the new stuff can be done gradually and where
it makes sense, and new extensions can be evaluated on performance impact
etc.
Miguel Ojeda April 16, 2021, 2:21 p.m. UTC | #30
On Fri, Apr 16, 2021 at 1:24 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> IMO RAII is over-valued, but just in case you care, the below seems to
> work just fine. No fancy new language needed, works today. Similarly you
> can create refcount_t guards, or with a little more work full blown
> smart_ptr crud.

Please note that even smart pointers (as in C++'s `std::unique_ptr`
etc.) do not guarantee memory safety. Yes, they help a lot writing
sound code (in particular exception-safe C++ code), but they do not
bring the same guarantees.

That's why using C language extensions (the existing ones, that is) to
recreate RAII/guards, smart pointers, etc. would only bring you to a
point closer to C++, but not to Rust.

Cheers,
Miguel
Matthew Wilcox April 16, 2021, 3:03 p.m. UTC | #31
On Fri, Apr 16, 2021 at 02:07:49PM +0100, Wedson Almeida Filho wrote:
> There is nothing in C forcing developers to actually use DEFINE_MUTEX_GUARD. So
> someone may simply forget (or not know that they need) to lock
> current->perf_event_mutex and directly access some field protected by it. This
> is unlikely to happen when one first writes the code, but over time as different
> people modify the code and invariants change, it is possible for this to happen.
> 
> In Rust, this isn't possible: the data protected by a lock is only accessible
> when the lock is locked. So developers cannot accidentally make mistakes of this
> kind. And since the enforcement happens at compile time, there is no runtime
> cost.

Well, we could do that in C too.

struct unlocked_inode {
	spinlock_t i_lock;
};

struct locked_inode {
	spinlock_t i_lock;
	unsigned short i_bytes;
	blkcnt_t i_blocks;
};

struct locked_inode *lock_inode(struct unlocked_inode *inode)
{
	spin_lock(&inode->i_lock);
	return (struct locked_inode *)inode;
}

There's a combinatoric explosion when you have multiple locks in a data
structure that protect different things in it (and things in a data
structure that are protected by locks outside that data structure),
but I'm not sufficiently familiar with Rust to know if/how it solves
that problem.

Anyway, my point is that if we believe this is a sufficiently useful
feature to have, and we're willing to churn the kernel, it's less churn
to do this than it is to rewrite in Rust.

> Another scenario: suppose within perf_event_task_enable you need to call a
> function that requires the mutex to be locked and that will unlock it for you on
> error (or unconditionally, doesn't matter). How would you do that in C? In Rust,
> there is a clean idiomatic way of transferring ownership of a guard (or any
> other object) such that the previous owner cannot continue to use it after
> ownership is transferred. Again, this is enforced at compile time. I'm happy to
> provide a small example if that would help.

I think we could do that too with an __attribute__((free)).  It isn't,
of course, actually freeing the pointer to the locked_inode, but it will
make the compiler think the pointer is invalid after the function returns.

(hm, looks like gcc doesn't actually have __attribute__((free)) yet.
that's unfortunate.  there's a potential solution in gcc-11 that might
do what we need)
Miguel Ojeda April 16, 2021, 3:04 p.m. UTC | #32
On Fri, Apr 16, 2021 at 4:19 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> Does this also not prohibit constructs where modification must be done
> while holding two locks, but reading can be done while holding either
> lock?

Yeah, this came up in some discussions we had. There are some locking
patterns that we need to think about how to model best within Rust's
type system.

But even if some patterns cannot be made safe, that is fine and does
not diminish the advantages everywhere else.

> I would much rather have a language extension where we can associate
> custom assertions with variable access, sorta like a sanitizer:
>
> static inline void assert_foo_bar(struct foo *f)
> {
>         lockdep_assert_held(&f->lock);
> }
>
> struct foo {
>         spinlock_t lock;
>         int bar __assert__(assert_foo_bar);
> };
>
> Such things can be optional and only enabled for debug builds on new
> compilers.

More sanitizers and ways to check "unsafe" code is sound are always
welcome -- not just for C, also for Rust `unsafe` code (e.g. Miri).

However, the main advantage of Rust for us is its safe subset (which
handles quite a lot of patterns, thanks to the lifetime tracking /
borrow checker).

Of course, we could propose something similar for C -- in fact, there
was a recent discussion around this in the C committee triggered by my
n2659 "Safety attributes for C" paper. However, achieving that would
require a lot of work, time, new syntax, etc. It is not something that
is on the radar just yet.

Similarly, if some compiler ends up implementing an extension that
actually realizes the same guarantees as Rust, we would likely end up
wrapping everything with macros like in the guards example you
mentioned, and even then we would not have got the rest of the
advantages that Rust brings to the table.

> C does indeed not have the concept of ownership, unlike modern C++ I
> think. But I would much rather see a C language extension for that than
> go Rust.

Many "resource-like" C++ types model ownership, yes; e.g.
`std::unique_ptr` for memory, as well as a myriad of ones in different
projects for different kinds of resources, plus generic ones like the
proposed P0052. However, they do not enforce that their usage is correct.

Cheers,
Miguel
Wedson Almeida Filho April 16, 2021, 3:33 p.m. UTC | #33
On Fri, Apr 16, 2021 at 04:19:07PM +0200, Peter Zijlstra wrote:
> Does this also not prohibit constructs where modification must be done
> while holding two locks, but reading can be done while holding either
> lock?

I don't believe it does. Remember that we have full control of the abstractions,
so we can (and will when the need arises) build an abstraction that provides the
functionality you describe. For the read path, we can have functions that return
a read-only guard (which is the gateway to the data in Rust) when locking either
of the locks, or when showing evidence that either lock is already locked (i.e.,
by temporarily transferring ownership of another guard). Note that this is
another area where Rust offers advantages: read-only guards (in C, if you take a
read lock, nothing prevents you from making changes to fields you should only be
allowed to read); and the ability to take temporary ownership, giving it back
even within the same function.

Similarly, to access a mutable guard, you'd have to show evidence that both
locks are held.

> That's a semi common scheme in the kernel, but not something that's
> expressible by, for example, the Java sync keyword.
> 
> It also very much doesn't work for RCU, where modification must be done
> under a lock, but access is done essentially lockless.

Why not? RCU is a lock -- it may have zero cost in most (all?) architectures on
the read path, but it is a lock. We can model access to variables/fields
protected by it just like any other lock, with the implementation of lock/unlock
optimizing to no-ops on the read path where possible.

In fact, this is also an advantage of Rust. It would *force* developers to
lock/unlock the RCU lock before they can access the protected data.

> I would much rather have a language extension where we can associate
> custom assertions with variable access, sorta like a sanitizer:
> 
> static inline void assert_foo_bar(struct foo *f)
> {
> 	lockdep_assert_held(&f->lock);
> }
> 
> struct foo {
> 	spinlock_t lock;
> 	int bar __assert__(assert_foo_bar);
> };
> 
> Such things can be optional and only enabled for debug builds on new
> compilers.

These would be great, but would still fall short of the compile-time guaranteed
safety that Rust offers in these cases.

> C does indeed not have the concept of ownership, unlike modern C++ I
> think. But I would much rather see a C language extension for that than
> go Rust.
> 
> This would mean a far more aggressive push for newer C compilers than
> we've ever done before, but at least it would all still be a single
> language. Conversion to the new stuff can be done gradually and where
> it makes sense, and new extensions can be evaluated on performance impact
> etc.

I encourage you to pursue this. We'd all benefit from better C. I'd be happy to
review and provide feedback on proposed extensions that are deemed
equivalent/better than what Rust offers.

My background is also in C. I'm no Rust fanboy, I'm just taking what I think is
a pragmatic view of the available options.
Peter Zijlstra April 16, 2021, 3:43 p.m. UTC | #34
On Fri, Apr 16, 2021 at 05:04:41PM +0200, Miguel Ojeda wrote:
> Of course, we could propose something similar for C -- in fact, there
> was a recent discussion around this in the C committee triggered by my
> n2659 "Safety attributes for C" paper. 

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2659.htm

That's just not making any damn sense whatsoever. That seems to be
about sprinkling abort() all over the place, which is just total
rubbish.
Theodore Ts'o April 16, 2021, 3:58 p.m. UTC | #35
On Fri, Apr 16, 2021 at 02:07:49PM +0100, Wedson Almeida Filho wrote:
> On Fri, Apr 16, 2021 at 01:24:23PM +0200, Peter Zijlstra wrote:
> > On Wed, Apr 14, 2021 at 08:45:51PM +0200, ojeda@kernel.org wrote:
> > >   - Featureful language: sum types, pattern matching, generics,
> > >     RAII, lifetimes, shared & exclusive references, modules &
> > >     visibility, powerful hygienic and procedural macros...
> > 
> > IMO RAII is over-valued, but just in case you care, the below seems to
> > work just fine. No fancy new language needed, works today. Similarly you
> > can create refcount_t guards, or with a little more work full blown
> > smart_ptr crud.
> 
> Peter, we do care, thank you for posting this. It's a great example for us to
> discuss some of the minutiae of what we think Rust brings to the table in
> addition to what's already possible in C.

Another fairly common use case is a lockless, racy test of a
particular field, as an optimization before we take the lock and
test it for realsies.  In this particular case, we can't allocate
memory while holding a spinlock, so we check without taking the
spinlock to see whether we should allocate memory (which is expensive,
and unnecessary most of the time):

alloc_transaction:
	/*
	 * This check is racy but it is just an optimization of allocating new
	 * transaction early if there are high chances we'll need it. If we
	 * guess wrong, we'll retry or free the unused transaction.
	 */
	if (!data_race(journal->j_running_transaction)) {
		/*
		 * If __GFP_FS is not present, then we may be being called from
		 * inside the fs writeback layer, so we MUST NOT fail.
		 */
		if ((gfp_mask & __GFP_FS) == 0)
			gfp_mask |= __GFP_NOFAIL;
		new_transaction = kmem_cache_zalloc(transaction_cache,
						    gfp_mask);
		if (!new_transaction)
			return -ENOMEM;
	}
	...
repeat:
	read_lock(&journal->j_state_lock);
	...
	if (!journal->j_running_transaction) {
		read_unlock(&journal->j_state_lock);
		if (!new_transaction)
			goto alloc_transaction;
		write_lock(&journal->j_state_lock);
		if (!journal->j_running_transaction &&
		    (handle->h_reserved || !journal->j_barrier_count)) {
			jbd2_get_transaction(journal, new_transaction);
			new_transaction = NULL;
		}
		write_unlock(&journal->j_state_lock);
		goto repeat;
	}
	...


The other thing that I'll note is that different elements in the
journal structure are protected by different spinlocks; we don't have
a global lock protecting the entire structure, which is critical for
scalability on systems with a large number of CPUs with a lot of
threads all wanting to perform file system operations.

So having a guard structure which can't be bypassed on the entire
structure would result in a pretty massive performance penalty for the
ext4 file system.  I know that initially the use of Rust in the kernel
is targeted at less performance-critical modules, such as device
drivers, but I thought I would mention some of the advantages of more
advanced locking techniques.

Cheers,

					- Ted
Willy Tarreau April 16, 2021, 4:14 p.m. UTC | #36
On Fri, Apr 16, 2021 at 04:33:51PM +0100, Wedson Almeida Filho wrote:
> On Fri, Apr 16, 2021 at 04:19:07PM +0200, Peter Zijlstra wrote:
> > Does this also not prohibit constructs where modification must be done
> > while holding two locks, but reading can be done while holding either
> > lock?
> 
> I don't believe it does. Remember that we have full control of the abstractions,
> so we can (and will when the need arises) build an abstraction that provides the
> functionality you describe. For the read path, we can have functions that return
> a read-only guard (which is the gateway to the data in Rust) when locking either
> of the locks, or when showing evidence that either lock is already locked (i.e.,
> by temporarily transferring ownership of another guard).

But will this remain syntactically readable/writable by mere humans ?
I mean, I have extremely bad memories of having tried to write a loop
concatenating a string to another one at most N times, where N was a
number provided on the command line, with the compiler shouting at me
all the time until I blindly copy-pasted random pieces of unreadable
code from the net with a horribly complicated syntax that still left
me unable to check the memory allocation before failing. So I'm
wondering how complicated that can become after adding all sorts of
artificial protections on top of this :-/

> Note that this is
> another area where Rust offers advantages: read-only guards (in C, if you take a
> read lock, nothing prevents you from making changes to fields you should only be
> allowed to read);

But I'm happily doing that when I know what I'm doing. What you call a
read lock usually is in fact a shared lock as opposed to an exclusive
lock (generally used for writes). For me it's perfectly valid to perform
atomic writes under a read lock instead of forcing everyone to wait by
taking a write lock. You may for example take a read lock on a structure
to make sure that a field you're accessing in it points to stable memory
that is only modified under the write lock, but the pointer itself is
atomically accessed and swapped under the read lock.
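For what it's worth, this particular pattern is expressible in safe Rust through interior mutability: atomic types may be written through a shared reference, so atomic updates are allowed while only a read guard is held. A userspace sketch, with `std::sync::RwLock` and `AtomicU64` standing in for the kernel primitives (an assumption; the field names are invented):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;

struct State {
    // Atomically updated even while only the shared (read) lock is held.
    hits: AtomicU64,
    // Structurally modified only under the exclusive (write) lock.
    table: Vec<u32>,
}

fn main() {
    let state = RwLock::new(State {
        hits: AtomicU64::new(0),
        table: vec![1, 2, 3],
    });

    let guard = state.read().unwrap(); // shared lock
    guard.hits.fetch_add(1, Ordering::Relaxed); // atomic write under the read lock
    assert_eq!(guard.table[0], 1);
    drop(guard);

    state.write().unwrap().table.push(4); // structural change takes the write lock
    assert_eq!(state.read().unwrap().hits.load(Ordering::Relaxed), 1);
}
```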

> In fact, this is also an advantage of Rust. It would *force* developers to
> lock/unlock the RCU lock before they can access the protected data.

I'm really afraid by languages which force developers to do this or that.
Many bugs in C come from casts because developers know their use case
better than the compiler's developers, and result in lack of warnings
when the code evolves, leaving pending bugs behind. What is important
in my opinion is to let developers express what they want and report
suspicious constructs, not to force them to dirtily work around rules
that conflict with their use case :-/

Willy
Wedson Almeida Filho April 16, 2021, 4:21 p.m. UTC | #37
On Fri, Apr 16, 2021 at 11:58:05AM -0400, Theodore Ts'o wrote:
> Another fairly common use case is a lockless, racy test of a
> particular field, as an optimization before we take the lock and
> test it for realsies.  In this particular case, we can't allocate
> memory while holding a spinlock, so we check, without taking the
> spinlock, whether we should allocate memory (which is expensive,
> and unnecessary most of the time):

I'd have to think more about whether we can build generic safe abstraction for
this pattern. But even if we can't, we always have the unsafe escape hatch: we
can grant unsafe unlocked access to the data; in such cases, the onus is on the
caller to convince themselves that what they're doing is safe, i.e., the
compiler won't offer compile-time guarantees.

However, and I think this is also an advantage of Rust, such unsafe accesses
*must* be explicitly tagged as such (and this is enforced at compile-time), so
you'd do something like:

// SAFETY: The below is safe because...
if !unsafe { journal.access_unlocked().j_running_transaction } {
}

And the idea is that unsafe blocks like the one above will require additional
scrutiny from reviewers. So this also makes the lives of maintainers/reviewers
easier as they'd know that these sections need more attention.
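A self-contained userspace illustration of the same discipline (nothing kernel-specific; the pointer and values are invented): the raw-pointer read below will not compile at all if the `unsafe` block is removed, so the reviewer-visible tag cannot be silently omitted.

```rust
fn main() {
    let x: u32 = 5;
    let p: *const u32 = &x;

    // SAFETY: `p` was created from a reference to `x`, which is still
    // live in this scope, so the read cannot be a use-after-free.
    let v = unsafe { *p };
    assert_eq!(v, 5);
}
```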

> The other thing that I'll note is that different elements in the
> journal structure are protected by different spinlocks; we don't have
> a global lock protecting the entire structure, which is critical for
> scalability on systems with a large number of CPUs with a lot of
> threads all wanting to perform file system operations.

Yes, this is fine, the way to do it in Rust would be to break your struct up
into something like (we have something like this in Binder):

struct X {
    [...]
}

struct Y {
    [...]
}

struct Z {
    x: SpinLock<X>,
    y: SpinLock<Y>,
    a: u32,
    [...]
}
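Filling in the sketch with a runnable userspace version (`std::sync::Mutex` standing in for `SpinLock`, and the field contents invented for illustration):

```rust
use std::sync::Mutex; // stand-in for the kernel SpinLock (an assumption)

struct X { count: u64 }
struct Y { name: String }

struct Z {
    x: Mutex<X>,
    y: Mutex<Y>,
    a: u32, // immutable after construction, so no lock needed
}

fn main() {
    let z = Z {
        x: Mutex::new(X { count: 0 }),
        y: Mutex::new(Y { name: "journal".into() }),
        a: 42,
    };

    // Each field has its own lock; taking x does not serialize users of y.
    z.x.lock().unwrap().count += 1;
    assert_eq!(z.y.lock().unwrap().name, "journal");
    assert_eq!(z.a, 42);
    assert_eq!(z.x.lock().unwrap().count, 1);
}
```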
  
> So having a guard structure which can't be bypassed on the entire
> structure would result in a pretty massive performance penalty for the
> ext4 file system.  I know that initially the use of Rust in the kernel
> is targeted at less performance-critical modules, such as device
> drivers, but I thought I would mention some of the advantages of more
> advanced locking techniques.

Thanks for this. Yes, while the initial target is drivers, we do want to provide
a general framework that could potentially be used anywhere.

Please let us know if you find other patterns that seem problematic.

Cheers,
-Wedson
Miguel Ojeda April 16, 2021, 4:21 p.m. UTC | #38
On Fri, Apr 16, 2021 at 5:43 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2659.htm
>
> That's just not making any damn sense whatsoever. That seems to be
> about sprinkling abort() all over the place, which is just total
> rubbish.

No, it is not about that. It is semantically a no-op: N2659 is not
about a safe subset of C -- it just triggered discussions about it in
the reflector.

The point is that we think it is important to improve C for the kernel
too. We are not working on bringing Rust to the kernel "just because
we want Rust", but because we believe it has a sizable amount of
advantages that outweigh the costs.

Cheers,
Miguel
Miguel Ojeda April 16, 2021, 5:10 p.m. UTC | #39
On Fri, Apr 16, 2021 at 6:14 PM Willy Tarreau <w@1wt.eu> wrote:
>
> I'm really afraid by languages which force developers to do this or that.
> Many bugs in C come from casts because developers know their use case
> better than the compiler's developers, and result in lack of warnings
> when the code evolves, leaving pending bugs behind. What is important
> in my opinion is to let developers express what they want and report
> suspicious constructs, not to force them to dirtily work around rules
> that conflict with their use case :-/

I understand your concerns. The idea is that by restricting some
patterns (in the safe subset), you gain the ability to guarantee the
absence of UB (as long as the `unsafe` code is sound).

But please note that the `unsafe` side is still there, and you can
reach out for it when needed.

Thus, if you find yourself in a situation where the safe abstractions
are not enough for what you need to convey, you have two options:
ideally, you think about how to model that pattern in a way that can
be exposed as a safe API so that others can reuse it. And if that is
not possible, you reach out for `unsafe` yourself.

Even in those cases where there is no other way around `unsafe`, note
that you still have gained something very important: now you have made
it explicit in the code that this is needed, and you will have written
a `SAFETY` annotation that tells others why your usage is sound (i.e.
why it cannot trigger UB).

And by having the compiler enforce this safe-unsafe split, you can
review safe code without having to constantly worry about UB; and be
extra alert when dealing with `unsafe` blocks.

Of course, UB is only a subset of errors, but it is a major one, and
particularly critical for privileged code.

Cheers,
Miguel
Peter Zijlstra April 16, 2021, 5:18 p.m. UTC | #40
On Fri, Apr 16, 2021 at 07:10:17PM +0200, Miguel Ojeda wrote:

> Of course, UB is only a subset of errors, but it is a major one, and
> particularly critical for privileged code.

I've seen relatively few UBSAN warnings that weren't due to UBSAN being
broken.
Willy Tarreau April 16, 2021, 5:37 p.m. UTC | #41
Hi Miguel,

On Fri, Apr 16, 2021 at 07:10:17PM +0200, Miguel Ojeda wrote:
> And by having the compiler enforce this safe-unsafe split, you can
> review safe code without having to constantly worry about UB; and be
> extra alert when dealing with `unsafe` blocks.

I do appreciate this safe/unsafe split and a few other things I've seen
in the language. The equivalent I'm using in C is stronger typing and
"const" modifiers wherever possible. Of course it's much more limited,
it's just to explain that I do value this. I just feel like "unsafe"
is the universal response to any question "how would I do this" while
at the same time "safe" is the best selling argument for the language.
As such, I strongly doubt the real benefits once facing reality
with everything marked unsafe. Except that it will be easier to blame
the person having written the unsafe one-liner instead of writing 60
cryptic lines doing the functional equivalent using some lesser known
extensions :-/

> Of course, UB is only a subset of errors, but it is a major one, and
> particularly critical for privileged code.

Not in my experience. I do create bugs that very seldomly stem from UB,
like any of us probably. But the vast majority of my bugs are caused by
stupid logic errors. When you invert an error check somewhere because
the function name looks like a boolean but its result works the other
way around, you can pass 10 times over it without noticing, and the
compiler will not help. And these ones are due to the human brain not
being that powerful in front of a computer, and whatever language will
not change this. Or worse, if it's harder to express what I want, I
will write more bugs. It happened to me quite a few times already
trying to work around absurd gcc warnings.

Based on the comments in this thread and the responses often being
around "we'll try to get this done" or "we'll bring the issue to the
compiler team", combined with the difficulty to keep control over
resources usage, I'm really not convinced at all it's suited for
low-level development. I understand the interest of the experiment
to help the language evolve into that direction, but I fear that
the kernel will soon be as bloated and insecure as a browser, and
that's really not to please me.

Cheers,
Willy
Connor Kuehl April 16, 2021, 5:46 p.m. UTC | #42
On 4/16/21 12:37 PM, Willy Tarreau wrote:
> Hi Miguel,
> 
> On Fri, Apr 16, 2021 at 07:10:17PM +0200, Miguel Ojeda wrote:
>> And by having the compiler enforce this safe-unsafe split, you can
>> review safe code without having to constantly worry about UB; and be
>> extra alert when dealing with `unsafe` blocks.
> 
> I do appreciate this safe/unsafe split and a few other things I've seen
> in the language. The equivalent I'm using in C is stronger typing and
> "const" modifiers wherever possible. Of course it's much more limited,
> it's just to explain that I do value this. I just feel like "unsafe"
> is the universal response to any question "how would I do this" while
> at the same time "safe" is the best selling argument for the language.
> As such, I strongly doubt the real benefits once facing reality
> with everything marked unsafe. Except that it will be easier to blame
> the person having written the unsafe one-liner instead of writing 60
> cryptic lines doing the functional equivalent using some lesser known
> extensions :-/
> 

It's possible that many of the questions you've been specifically asking
about, by sheer coincidence, are targeted towards the problems that would
indeed require a lower-level abstraction built within an unsafe block; meaning
you've managed to evade the tons of other upper layers that could be written
in safe Rust.

Indeed, at a certain layer, unsafe is unavoidable for the kind of work that
is done in the kernel. The goal is to shrink the unsafe blocks as much as
possible and confirm the correctness of those pieces, then build safe
abstractions on top of it.

For what it's worth, if there was some post-human apocalyptic world where
literally everything had to go inside an unsafe block, the silver lining
in a hypothetical situation like this is that unsafe does not disable all
of Rust's static analysis like the borrow checker, etc. It allows you to
do things like directly dereference a pointer, etc. Unsafe also doesn't
automatically mean that the code is wrong or that it has memory issues;
it just means that the compiler can't guarantee that it doesn't based on
what you do in the unsafe block.
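This last point is easy to demonstrate in plain Rust; a small sketch (values invented):

```rust
fn main() {
    let mut v = vec![1, 2, 3];
    let p = v.as_ptr();

    // Dereferencing a raw pointer is one of the few operations that
    // requires `unsafe`; it does not switch off the rest of the language.
    let first = unsafe { *p };
    assert_eq!(first, 1);

    // The borrow checker still runs inside unsafe blocks: the lines below
    // fail to compile (two simultaneous mutable borrows) even when wrapped
    // in `unsafe`.
    // let r1 = &mut v;
    // let r2 = &mut v;
    // r1.push(r2.len() as i32);

    v.push(4);
    assert_eq!(v.len(), 4);
}
```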

Connor
Matthew Wilcox April 16, 2021, 6:08 p.m. UTC | #43
On Fri, Apr 16, 2021 at 07:18:48PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 16, 2021 at 07:10:17PM +0200, Miguel Ojeda wrote:
> 
> > Of course, UB is only a subset of errors, but it is a major one, and
> > particularly critical for privileged code.
> 
> I've seen relatively few UBSAN warnings that weren't due to UBSAN being
> broken.

Lucky you.

84c34df158cf215b0cd1475ab3b8e6f212f81f23

(i'd argue this is C being broken; promoting only as far as int, when
assigning to an unsigned long is Bad, but until/unless either GCC fixes
that or the language committee realises that being stuck in the 1970s
is Bad, people are going to keep making this kind of mistake)
Paul E. McKenney April 16, 2021, 6:47 p.m. UTC | #44
On Thu, Apr 15, 2021 at 11:04:37PM -0700, Nick Desaulniers wrote:
> On Thu, Apr 15, 2021 at 9:27 PM Boqun Feng <boqun.feng@gmail.com> wrote:
> >
> > [Copy LKMM people, Josh, Nick and Wedson]
> >
> > On Thu, Apr 15, 2021 at 08:58:16PM +0200, Peter Zijlstra wrote:
> > > On Wed, Apr 14, 2021 at 08:45:51PM +0200, ojeda@kernel.org wrote:
> > >
> > > > Rust is a systems programming language that brings several key
> > > > advantages over C in the context of the Linux kernel:
> > > >
> > > >   - No undefined behavior in the safe subset (when unsafe code is
> > > >     sound), including memory safety and the absence of data races.
> > >
> > > And yet I see not a single mention of the Rust Memory Model and how it
> > > aligns (or not) with the LKMM. The C11 memory model for example is a
> > > really poor fit for LKMM.
> > >
> >
> > I think Rust currently uses C11 memory model as per:
> >
> >         https://doc.rust-lang.org/nomicon/atomics.html
> >
> > , also I guess another reason that they pick C11 memory model is because
> > LLVM has the support by default.
> >
> > But I think the Rust Community still wants to have a good memory model,
> > and they are open to any kind of suggestion and input. I think we (LKMM
> > people) should really get involved, because the recent discussion on
> > RISC-V's atomics shows that if we didn't, people might get a "broken"
> > design because they thought the C11 memory model was good enough:
> >
> >         https://lore.kernel.org/lkml/YGyZPCxJYGOvqYZQ@boqun-archlinux/
> >
> > And the benefits are mutual: a) the Linux Kernel Memory Model (LKMM) is
> > defined by combining the requirements of developers and the behavior of
> > hardware; it's practical and can be a very good input for memory model
> > design in Rust; b) once Rust has a better memory model, whatever compiler
> > technology Rust compilers use to support the memory model can
> > be adopted by C compilers and we can get that part for free.
> 
> Yes, I agree; I think that's a very good approach.  Avoiding the ISO
> WG14 is interesting; at least the merits could be debated in the
> public and not behind closed doors.

WG14 (C) and WG21 (C++) are at least somewhat open.  Here are some of
the proposals a few of us have in flight:

P2055R0 A Relaxed Guide to memory_order_relaxed
	http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2055r0.pdf
P0124R7 Linux-Kernel Memory Model (vs. that of C/C++)
	http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0124r7.html
P1726R4 Pointer lifetime-end zap
	http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1726r4.pdf
	https://docs.google.com/document/d/1MfagxTa6H0rTxtq9Oxyh4X53NzKqOt7y3hZBVzO_LMk/edit?usp=sharing
P1121R2 Hazard Pointers: Proposed Interface and Wording for Concurrency TS 2
	http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1121r2.pdf
P1382R1 volatile_load<T> and volatile_store<T>
	http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1382r1.pdf
P1122R2 Proposed Wording for Concurrent Data Structures: Read-Copy-Update (RCU)
	http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1122r2.pdf
	https://docs.google.com/document/d/1MfagxTa6H0rTxtq9Oxyh4X53NzKqOt7y3hZBVzO_LMk/edit?usp=sharing
P0190R4 Proposal for New memory order consume Definition
	http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf
P0750R1 Consume
	http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0750r1.html

P1726R4 is of particular concern, along with consume.

> > At least I personally am very interested in helping Rust toward a complete
> > and practical memory model ;-)
> >
> > Josh, I think it's good if we can connect to the people working on Rust
> > memory model, I think the right person is Ralf Jung and the right place
> > is https://github.com/rust-lang/unsafe-code-guidelines, but you
> > certainly know better than me ;-) Or maybe we can use Rust-for-Linux or
> > linux-toolchains list to discuss.
> >
> > [...]
> > > >   - Boqun Feng is working hard on the different options for
> > > >     threading abstractions and has reviewed most of the `sync` PRs.
> > >
> > > Boqun, I know you're familiar with LKMM, can you please talk about how
> > > Rust does things and how it interacts?
> >
> > As Wedson said in the other email, currently there is no code requiring
> > synchronization between C side and Rust side, so we are currently fine.
> > But in the longer term, we need to teach the Rust memory model about the
> > "design patterns" used in Linux kernel for parallel programming.
> >
> > What I have been doing so far is reviewing patches which have memory
> > orderings in the Rust-for-Linux project, trying to make sure we don't
> > introduce memory ordering bugs from the beginning.

I believe that compatibility with both C/C++ and the Linux kernel are
important.

							Thanx, Paul
Josh Triplett April 16, 2021, 8:48 p.m. UTC | #45
On Fri, Apr 16, 2021 at 12:27:39PM +0800, Boqun Feng wrote:
> Josh, I think it's good if we can connect to the people working on Rust
> memory model, I think the right person is Ralf Jung and the right place
> is https://github.com/rust-lang/unsafe-code-guidelines, but you
> certainly know better than me ;-) Or maybe we can use Rust-for-Linux or
> linux-toolchains list to discuss.

Ralf is definitely the right person to talk to. I don't think the UCG
repository is the right place to start that discussion, though. For now,
I'd suggest starting an email thread with Ralf and some C-and-kernel
memory model folks (hi Paul!) to suss out the most important changes
that would be needed.

With my language team hat on, I'd *absolutely* like to see the Rust
memory model support RCU-style deferred reclamation in a sound way,
ideally with as little unsafe code as possible.
comex April 17, 2021, 5:23 a.m. UTC | #46
On Fri, Apr 16, 2021 at 4:24 AM Peter Zijlstra <peterz@infradead.org> wrote:
> Similar thing for RCU; C11 can't optimally do that; it needs to make
> rcu_dereference() a load-acquire [something ARM64 has already done in C
> because the compiler might be too clever by half when doing LTO :-(].
> But it's the compiler needing the acquire semantics, not the computer,
> which is just bloody wrong.

You may already know, but perhaps worth clarifying:

C11 does have atomic_signal_fence() which is a compiler fence.  But a
compiler fence only ensures the loads will be emitted in the right
order, not that the CPU will execute them in the right order.  CPU
architectures tend to guarantee that two loads will be executed in the
right order if the second one's address depends on the first one's
result, but a dependent load can stop being dependent after compiler
optimizations involving value speculation.  Using a load-acquire works
around this, not because it stops the compiler from performing any
optimization, but because it tells the computer to execute the loads
in the right order *even if* the compiler has broken the value
dependence.

So C11 atomics don't make the situation worse, compared to Linux's
atomics implementation based on volatile and inline assembly.  Both
are unsound in the presence of value speculation.  C11 atomics were
*supposed* to make the situation better, with memory_order_consume,
which would have specifically forbidden the compiler from performing
value speculation.  But all the compilers punted on getting this to
work and instead just implemented memory_order_consume as
memory_order_acquire.

As for Rust, it compiles to the same LLVM IR that Clang compiles C
into.  Volatile, inline assembly, and C11-based atomics: all of these
are available in Rust, and generate exactly the same code as their C
counterparts, for better or for worse.  Unfortunately, the Rust
project has relatively limited muscle when it comes to contributing to
LLVM.  So while it would definitely be nice if Rust could make RCU
sound, and from a specification perspective I think people would be
quite willing and probably easier to work with than the C committee...
I suspect that implementing this would require the kind of sweeping
change to LLVM that is probably not going to come from Rust.

There are other areas where I think that kind of discussion might be
more fruitful.  For example, the Rust documentation currently says
that a volatile read racing with a non-volatile write (i.e. seqlocks)
is undefined behavior. [1]  However, I am of the opinion that this is
essentially a spec bug, for reasons that are probably not worth
getting into here.

[1] https://doc.rust-lang.org/nightly/std/ptr/fn.read_volatile.html
Peter Zijlstra April 17, 2021, 11:17 a.m. UTC | #47
On Fri, Apr 16, 2021 at 07:08:29PM +0100, Matthew Wilcox wrote:
> On Fri, Apr 16, 2021 at 07:18:48PM +0200, Peter Zijlstra wrote:
> > On Fri, Apr 16, 2021 at 07:10:17PM +0200, Miguel Ojeda wrote:
> > 
> > > Of course, UB is only a subset of errors, but it is a major one, and
> > > particularly critical for privileged code.
> > 
> > I've seen relatively few UBSAN warnings that weren't due to UBSAN being
> > broken.
> 
> Lucky you.
> 
> 84c34df158cf215b0cd1475ab3b8e6f212f81f23
> 
> (i'd argue this is C being broken; promoting only as far as int, when
> assigning to an unsigned long is Bad, but until/unless either GCC fixes
> that or the language committee realises that being stuck in the 1970s
> is Bad, people are going to keep making this kind of mistake)

Well, I think the rules actually make sense, at the point in the syntax
tree where + happens, we have 'unsigned char' and 'int', so at that
point we promote to 'int'. Subsequently 'int' gets shifted and bad
things happen.

The 'unsigned long' doesn't happen until quite a bit later.

Anyway, the rules are imo fairly clear and logical, but yes they can be
annoying. The really silly thing here is that << and >> have UB at all,
and I would love a -fwrapv style flag that simply defines it. Yes it
will generate worse code in some cases, but having the UB there is just
stupid.

That of course doesn't help your case here, it would simply misbehave
and not be UB.

Another thing the C rules cannot really express is a 32x32->64
multiplication, some (older) versions of GCC can be tricked into it, but
mostly it just doesn't want to do that sanely and the C rules are
absolutely no help there.
Willy Tarreau April 17, 2021, 11:46 a.m. UTC | #48
On Sat, Apr 17, 2021 at 01:17:21PM +0200, Peter Zijlstra wrote:
> Well, I think the rules actually make sense, at the point in the syntax
> tree where + happens, we have 'unsigned char' and 'int', so at that
> point we promote to 'int'. Subsequently 'int' gets shifted and bad
> things happen.

That's always the problem caused by signedness being applied to the
type while modern machines do not care about that and use it during
(or even after) the operation instead :-/

We'd need to define some macros to zero-extend and sign-extend some
values to avoid such issues. I'm sure this would be more intuitive
than trying to guess how many casts (and in what order) to place to
make sure an operation works as desired.

> The 'unsigned long' doesn't happen until quite a bit later.
> 
> Anyway, the rules are imo fairly clear and logical, but yes they can be
> annoying. The really silly thing here is that << and >> have UB at all,
> and I would love a -fwrapv style flag that simply defines it. Yes it
> will generate worse code in some cases, but having the UB there is just
> stupid.

I'd also love to have a UB-less mode with well defined semantics for
plenty of operations that are known to work well on modern machines,
like integer wrapping, bit shifts ignoring higher bits etc. Lots of
stuff we often have to write useless code for, just to please the
compiler.

> That of course doesn't help your case here, it would simply misbehave
> and not be UB.
> 
> Another thing the C rules cannot really express is a 32x32->64
> multiplication, some (older) versions of GCC can be tricked into it, but
> mostly it just doesn't want to do that sanely and the C rules are
> absolutely no help there.

For me the old trick of casting one side as long long still works:

  unsigned long long mul3264(unsigned int a, unsigned int b)
  {
        return (unsigned long long)a * b;
  }

i386:
  00000000 <mul3264>:
     0: 8b 44 24 08           mov    0x8(%esp),%eax
     4: f7 64 24 04           mull   0x4(%esp)
     8: c3                    ret    

x86_64:
  0000000000000000 <mul3264>:
     0: 89 f8                 mov    %edi,%eax
     2: 89 f7                 mov    %esi,%edi
     4: 48 0f af c7           imul   %rdi,%rax
     8: c3                    retq   

Or maybe you had something else in mind ?
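Incidentally, the Rust spelling of the same widening multiply is fully explicit, since Rust has no implicit integer promotion; a sketch for comparison:

```rust
// 32x32->64 multiply: operands must be widened explicitly, so there is
// no promote-to-int step for the compiler (or the reader) to guess at.
fn mul3264(a: u32, b: u32) -> u64 {
    (a as u64) * (b as u64)
}

fn main() {
    assert_eq!(mul3264(2, 3), 6);
    // (2^32 - 1)^2 = 0xFFFFFFFE00000001, which does not fit in 32 bits.
    assert_eq!(mul3264(u32::MAX, u32::MAX), 0xffff_fffe_0000_0001);
}
```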

Willy
David Laight April 17, 2021, 12:41 p.m. UTC | #49
From: Peter Zijlstra
> Sent: 16 April 2021 15:19
> 
> On Fri, Apr 16, 2021 at 02:07:49PM +0100, Wedson Almeida Filho wrote:
> > On Fri, Apr 16, 2021 at 01:24:23PM +0200, Peter Zijlstra wrote:
> 
> > >  int perf_event_task_enable(void)
> > >  {
> > > +	DEFINE_MUTEX_GUARD(event_mutex, &current->perf_event_mutex);
> >
> > There is nothing in C forcing developers to actually use DEFINE_MUTEX_GUARD. So
> > someone may simply forget (or not know that they need) to lock
> > current->perf_event_mutex and directly access some field protected by it. This
> > is unlikely to happen when one first writes the code, but over time as different
> > people modify the code and invariants change, it is possible for this to happen.
> >
> > In Rust, this isn't possible: the data protected by a lock is only accessible
> > when the lock is locked. So developers cannot accidentally make mistakes of this
> > kind. And since the enforcement happens at compile time, there is no runtime
> > cost.
> >
> > This, we believe, is fundamental to the discussion: we agree that many of these
> > idioms can be implemented in C (albeit in this case with a compiler extension),
> > but their use is optional, people can (and do) still make mistakes that lead to
> > vulnerabilities; Rust disallows classes of  mistakes by construction.
> 
> Does this also not prohibit constructs where modification must be done
> while holding two locks, but reading can be done while holding either
> lock?
> 
> That's a semi common scheme in the kernel, but not something that's
> expressible by, for example, the Java sync keyword.
> 
> It also very much doesn't work for RCU, where modification must be done
> under a lock, but access is done essentially lockless.
...

Or the cases where the locks are released in the 'wrong' order.
Typically for:
	lock(table)
	item = lookup(table, key)
	lock(item)
	unlock(table)
	...
	unlock(item)

(In the kernel the table lock might be RCU.)

Or, with similar data:
	write_lock(table);
	foreach(item, table)
		lock(item)
		unlock(item)
	/* No items can be locked until we release the write_lock. */
	...
	unlock(table)

You can also easily end up with a 'fubar' we have at work where
someone wrote a C++ condvar class that inherits from mutex.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
David Laight April 17, 2021, 12:46 p.m. UTC | #50
..
> The more you make it look like (Kernel) C, the easier it is for us C
> people to actually read. My eyes have been reading C for almost 30 years
> by now, they have a lexer built in the optical nerve; reading something
> that looks vaguely like C but is definitely not C is an utterly painful
> experience.

I'll see your 30 years and raise to over 35.
(And writing code that accesses hardware for 6 or 7 years before that.)

Both Java and go can look more like the K&R style C than any of the
examples from microsoft - which seem to utilise as much vertical space
as humanly? possible.

Those rust examples seemed to be of the horrid microsoft style.
Nothing about that style makes reading code easy.

	David

Wedson Almeida Filho April 17, 2021, 1:01 p.m. UTC | #51
On Sat, Apr 17, 2021 at 12:41:23PM +0000, David Laight wrote:
> Or the cases where the locks are released in the 'wrong' order.
> Typically for:
> 	lock(table)
> 	item = lookup(table, key)
> 	lock(item)
> 	unlock(table)
> 	...
> 	unlock(item)

This is expressible in Rust with something like:

    table = table_mutex.lock()
    item = table.lookup(key).lock()
    drop(table)
    ...
    // item will be unlocked when it goes out of scope or on drop(item)

The added bonus here from Rust is that table is not accessible after
drop(table), so a developer cannot accidentally access fields after unlocking
it.
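A runnable userspace approximation of the sketch above, with `std::sync::Mutex` and `Arc` standing in for the kernel types (an assumption; the single-item "table" and its contents are invented, and a real abstraction could avoid the extra reference count):

```rust
use std::sync::{Arc, Mutex};

struct Item { value: i32 }
struct Table { items: Vec<Arc<Mutex<Item>>> }

fn main() {
    let table_mutex = Mutex::new(Table {
        items: vec![Arc::new(Mutex::new(Item { value: 7 }))],
    });

    let table = table_mutex.lock().unwrap();   // lock(table)
    let item = Arc::clone(&table.items[0]);    // lookup(table, key)
    let item_guard = item.lock().unwrap();     // lock(item)
    drop(table);                               // unlock(table), out of order

    // ... the item stays locked here ...
    assert_eq!(item_guard.value, 7);
    // unlock(item) happens when item_guard goes out of scope, and any
    // use of `table` after drop(table) is a compile-time error.
}
```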

> 
> (In the kernel the table lock might be RCU.)
> 
> Or, with similar data:
> 	write_lock(table);
> 	foreach(item, table)
> 		lock(item)
> 		unlock(item)
> 	/* No items can be locked until we release the write_lock. */
> 	...
> 	unlock(table)

I think I'm missing something here. Would you help me understand what part is
out of the ordinary in the code above? It would be expressible in Rust with
something like:

    table = table_mutex.write();
    for (item_mutex in table)
        item = item_mutex.lock()
        // item is unlocked at the end of the loop iteration (out of scope)
    // table gets unlocked when it goes out of scope
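The pseudocode can be filled in as a runnable userspace version (`std::sync` types standing in for the kernel's, an assumption; the integer items are invented):

```rust
use std::sync::{Mutex, RwLock};

fn main() {
    // A table of independently locked items.
    let table_mutex = RwLock::new(vec![Mutex::new(1), Mutex::new(2)]);

    let table = table_mutex.write().unwrap(); // write_lock(table)
    for item_mutex in table.iter() {
        let mut item = item_mutex.lock().unwrap(); // lock(item)
        *item += 10;
    } // each item guard is dropped (unlocked) at the end of its iteration
    drop(table); // unlock(table); also happens automatically at end of scope

    let check = table_mutex.read().unwrap();
    assert_eq!(*check[0].lock().unwrap(), 11);
    assert_eq!(*check[1].lock().unwrap(), 12);
}
```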

Cheers,
-Wedson
Wedson Almeida Filho April 17, 2021, 1:29 p.m. UTC | #52
On Fri, Apr 16, 2021 at 04:03:07PM +0100, Matthew Wilcox wrote:
> Well, we could do that in C too.
> 
> struct unlocked_inode {
> 	spinlock_t i_lock;
> };
> 
> struct locked_inode {
> 	spinlock_t i_lock;
> 	unsigned short i_bytes;
> 	blkcnt_t i_blocks;
> };
> 
> struct locked_inode *lock_inode(struct unlocked_inode *inode)
> {
> 	spin_lock(&inode->i_lock);
> 	return (struct locked_inode *)inode;
> }

Indeed you can do this kind of thing in C, but as I said before (apologies if
I'm too repetitive on this) Rust forces you to do it the right way, whereas the
lack of enforcement in C leaves room for mistakes.
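
For illustration, here is what that enforcement looks like with std's `Mutex` (a userspace sketch, not the kernel API): the protected fields live inside the mutex, so they are only reachable through a lock guard.

```rust
use std::sync::Mutex;

struct InodeData {
    i_bytes: u16,
    i_blocks: u64,
}

struct Inode {
    // The protected fields live *inside* the mutex.
    data: Mutex<InodeData>,
}

fn add_block(inode: &Inode) -> u64 {
    // inode.data.i_blocks += 1; // does not compile: no field access
    //                           // without going through the lock
    let mut data = inode.data.lock().unwrap(); // the "locked_inode" state
    data.i_bytes = 0; // reset the partial-block byte count
    data.i_blocks += 1;
    data.i_blocks
} // guard dropped here: unlocked, and the fields are unreachable again
```

The commented-out line is the mistake C leaves room for; in Rust it simply does not compile.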

If you do add extensions to C to add some of these restrictions (and I encourage
you to pursue such extensions as we all benefit from better C), it is likely not
sufficient to reach the level of compile-time guarantee that Rust offers because
you need a whole slew of restrictions/enforcements.

I also note that academics have a formalisation of [a subset of] Rust that show
the soundness of these guarantees and the requirements on unsafe to compose
safely. So we're not talking about guesswork, there are formal machine-checked
proofs published about this (see for example
https://people.mpi-sws.org/~dreyer/papers/safe-sysprog-rust/paper.pdf).
David Laight April 17, 2021, 1:46 p.m. UTC | #53
From: Peter Zijlstra
> Sent: 17 April 2021 12:17
...
> > (i'd argue this is C being broken; promoting only as far as int, when
> > assigning to an unsigned long is Bad, but until/unless either GCC fixes
> > that or the language committee realises that being stuck in the 1970s
> > is Bad, people are going to keep making this kind of mistake)
> 
> Well, I think the rules actually make sense, at the point in the syntax
> tree where + happens, we have 'unsigned char' and 'int', so at that
> point we promote to 'int'. Subsequently 'int' gets shifted and bad
> things happen.

The 1970s were fine.
K&R C was sign preserving - so 'unsigned char' promoted to 'unsigned int'.
The ANSI C committee broke things by changing it to value preserving
with the strong preference for signed types.

Even with ANSI C the type of ((unsigned char)x + 1) can be unsigned!
All it needs is an architecture where sizeof (int) == 1.
(sizeof (char) has to be 1 - even though that isn't directly explicit.)

Of course, having 32-bit 'char' and 'int' does give problems with
the value for EOF.

	David

Wedson Almeida Filho April 17, 2021, 1:53 p.m. UTC | #54
On Fri, Apr 16, 2021 at 06:14:44PM +0200, Willy Tarreau wrote:

> But will this remain syntactically readable/writable by mere humans ?

I would certainly hope so.

> > Note that this is
> > another area where Rust offers advantages: read-only guards (in C, if you take a
> > read lock, nothing prevents you from making changes to fields you should only be
> > allowed to read);
> 
> But I'm happily doing that when I know what I'm doing. What you call a
> read lock usually is in fact a shared lock as opposed to an exclusive
> lock (generally used for writes). For me it's perfectly valid to perform
> atomic writes under a read lock instead of forcing everyone to wait by
> taking a write lock. You may for example take a read lock on a structure
> to make sure that a field you're accessing in it points to stable memory
> that is only modified under the write lock, but the pointer itself is
> atomically accessed and swapped under the read lock.

Yes, this is a great example. Also easily expressible in Rust: they have this
concept of interior mutability where certain types allow their contents to be
modified even when shared immutably. Atomics offer such interior mutability, so
the scenario you describe is fine.

Rust in fact has an extra enforcement here that C doesn't: it requires interior
mutability for this scenario to be allowed, so you can't do it with a plain
naked type (say u64) -- you'd need to use something like an atomic64_t, where
you're required to specify memory ordering when accessing them.

In C we of course have atomics, but the compiler never alerts us when we
need them.
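
A small sketch of this with std's `RwLock` and `AtomicU64` (userspace stand-ins for rwlock_t and atomic64_t):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;

struct Shared {
    // Plain field: mutable only through a write ("exclusive") guard.
    config: u64,
    // Atomic field: interior mutability, so it can be updated even
    // through a read ("shared") guard, with an explicit ordering.
    counter: AtomicU64,
}

fn bump(shared: &RwLock<Shared>) -> u64 {
    let guard = shared.read().unwrap(); // shared ("read") lock
    // guard.config += 1; // does not compile: no plain writes through
    //                    // a read guard, interior mutability required
    guard.counter.fetch_add(1, Ordering::Relaxed) + 1
} // read lock released here
```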

> > In fact, this is also an advantage of Rust. It would *force* developers to
> > lock/unlock the RCU lock before they can access the protected data.
> 
> I'm really afraid by languages which force developers to do this or that.

When I say that Rust forces developers to do certain things, it's to provide the
compile-time safety guarantees. Some of these requirements are imposed by our
own abstractions -- we can always revisit and try to improve them. In cases when
the abstractions cannot be further refined, developers always have the escape
hatch of unsafety, where they're allowed to do pretty much everything as in C,
but then they also give up the compile-time guarantees for those parts.
Willy Tarreau April 17, 2021, 2:21 p.m. UTC | #55
On Sat, Apr 17, 2021 at 02:53:10PM +0100, Wedson Almeida Filho wrote:
> > > Note that this is
> > > another area where Rust offers advantages: read-only guards (in C, if you take a
> > > read lock, nothing prevents you from making changes to fields you should only be
> > > allowed to read);
> > 
> > But I'm happily doing that when I know what I'm doing. What you call a
> > read lock usually is in fact a shared lock as opposed to an exclusive
> > lock (generally used for writes). For me it's perfectly valid to perform
> > atomic writes under a read lock instead of forcing everyone to wait by
> > taking a write lock. You may for example take a read lock on a structure
> > to make sure that a field you're accessing in it points to stable memory
> > that is only modified under the write lock, but the pointer itself is
> > atomically accessed and swapped under the read lock.
> 
> Yes, this is a great example. Also easily expressible in Rust: they have this
> concept of interior mutability where certain types allow their contents to be
> modified even when shared immutably. Atomics offer such interior mutability, so
> the scenario you describe is fine.
> 
> Rust in fact has an extra enforcement here that C doesn't: it requires interior
> mutability for this scenario to be allowed, so you can't do it with a plain
> naked type (say u64) -- you'd need to use something like an atomic64_t, where
> you're required to specify memory ordering when accessing them.
> 
> In C we of course have atomics but the compiler never alerts us for when we need
> them.

OK thanks for explaining.

> > > In fact, this is also an advantage of Rust. It would *force* developers to
> > > lock/unlock the RCU lock before they can access the protected data.
> > 
> > I'm really afraid by languages which force developers to do this or that.
> 
> When I say that Rust forces developers to do certain things, it's to provide the
> compile-time safety guarantees. Some of these requirements are imposed by our
> own abstractions -- we can always revisit and try to improve them. In cases when
> the abstractions cannot be further refined, developers always have the escape
> hatch of unsafety, where they're allowed to do pretty much everything as in C,
> but then they also give up the compile-time guarantees for those parts.

Well, I can't express how much I hate abstractions because I constantly
need to know what it's doing under the hood, and I spend my time reading
the output asm code because I always want to confirm my assumptions about
the compiler not cheating on me (and not hitting one of its bugs),
especially after C compilers have become so smart that they completely
replace your code with what they think is better for you, (including
nothing), so I guess all of this is really not for someone like me.

However while I'm pretty sure that based on our respective experiences
we'd probably disagree forever on a wide number of approaches when it
comes to deciding whether the developer or the compiler should have the
last say, I sincerely appreciate that you take the time to calmly explain
your differing views and the rationale behind, so many thanks for this!

Willy
Peter Zijlstra April 17, 2021, 2:24 p.m. UTC | #56
On Sat, Apr 17, 2021 at 01:46:23PM +0200, Willy Tarreau wrote:
> For me the old trick of casting one side as long long still works:
> 
>   unsigned long long mul3264(unsigned int a, unsigned int b)
>   {
>         return (unsigned long long)a * b;
>   }
> 
> i386:
>   00000000 <mul3264>:
>      0: 8b 44 24 08           mov    0x8(%esp),%eax
>      4: f7 64 24 04           mull   0x4(%esp)
>      8: c3                    ret    
> 
> x86_64:
>   0000000000000000 <mul3264>:
>      0: 89 f8                 mov    %edi,%eax
>      2: 89 f7                 mov    %esi,%edi
>      4: 48 0f af c7           imul   %rdi,%rax
>      8: c3                    retq   
> 
> Or maybe you had something else in mind ?

Last time I tried it, the thing refused :/ which is how we ended up with
mul_u32_u32() in asm.
Willy Tarreau April 17, 2021, 2:36 p.m. UTC | #57
On Sat, Apr 17, 2021 at 04:24:43PM +0200, Peter Zijlstra wrote:
> On Sat, Apr 17, 2021 at 01:46:23PM +0200, Willy Tarreau wrote:
> > For me the old trick of casting one side as long long still works:
> > 
> >   unsigned long long mul3264(unsigned int a, unsigned int b)
> >   {
> >         return (unsigned long long)a * b;
> >   }
> > 
> > i386:
> >   00000000 <mul3264>:
> >      0: 8b 44 24 08           mov    0x8(%esp),%eax
> >      4: f7 64 24 04           mull   0x4(%esp)
> >      8: c3                    ret    
> > 
> > x86_64:
> >   0000000000000000 <mul3264>:
> >      0: 89 f8                 mov    %edi,%eax
> >      2: 89 f7                 mov    %esi,%edi
> >      4: 48 0f af c7           imul   %rdi,%rax
> >      8: c3                    retq   
> > 
> > Or maybe you had something else in mind ?
> 
> Last time I tried it, the thing refused :/ which is how we ended up with
> mul_u32_u32() in asm.

Oh I trust you, I do remember having noticed it on one gcc version as
well (maybe 4.5). But I've been successfully using this since 2.95, and
could quickly recheck that 4.7, 4.8, 5.4, 6.5, 7.4, 9.3 and 11-trunk do
produce the code above, which is reassuring, as we all prefer to limit
the amount of asm statements.

Willy
Paolo Bonzini April 17, 2021, 2:51 p.m. UTC | #58
On 16/04/21 09:09, Peter Zijlstra wrote:
> Well, the obvious example would be seqlocks. C11 can't do them

Sure it can.  C11 requires annotating with (the equivalent of) READ_ONCE 
all reads of seqlock-protected fields, but the memory model supports 
seqlocks just fine.

> Simlar thing for RCU; C11 can't optimally do that

Technically if you know what you're doing (i.e. that you're not on 
Alpha) you can do RCU using a relaxed load followed by an 
atomic_signal_fence(memory_order_consume).  Which I agree is horrible 
and not entirely within the standard, but it works in practice.  The 
Linux implementation of memory barriers, atomic RMW primitives, 
load-acquire/store-release etc. is also completely outside the standard, 
so it's not much different and more portable.

The only thing that I really, really miss when programming with C11 
atomics is smp_mb__{before,after}_atomic().

Paolo
Paolo Bonzini April 17, 2021, 3:11 p.m. UTC | #59
On 16/04/21 17:58, Theodore Ts'o wrote:
> Another fairly common use case is a lockless, racy test of a
> particular field, as an optimization before we take the lock before we
> test it for realsies.  In this particular case, we can't allocate
> memory while holding a spinlock, so we check to see without taking the
> spinlock to see whether we should allocate memory (which is expensive,
> and unnecessasry most of the time):
> 
> alloc_transaction:
> 	/*
> 	 * This check is racy but it is just an optimization of allocating new
> 	 * transaction early if there are high chances we'll need it. If we
> 	 * guess wrong, we'll retry or free the unused transaction.
> 	 */
> 	if (!data_race(journal->j_running_transaction)) {
> 		/*
> 		 * If __GFP_FS is not present, then we may be being called from
> 		 * inside the fs writeback layer, so we MUST NOT fail.
> 		 */
> 		if ((gfp_mask & __GFP_FS) == 0)
> 			gfp_mask |= __GFP_NOFAIL;
> 		new_transaction = kmem_cache_zalloc(transaction_cache,
> 						    gfp_mask);
> 		if (!new_transaction)
> 			return -ENOMEM;
> 	}

From my limited experience with Rust, things like these are a bit 
annoying indeed; sooner or later Mutex<> just doesn't cut it and you 
have to deal with its limitations.

In this particular case you would use an AtomicBool field, place it 
outside the Mutex-protected struct, and make sure that it is only 
accessed under the lock just like in C.

One easy way out is to make the Mutex protect (officially) nothing, i.e. 
Mutex<()>, and handle the mutable fields yourself using RefCell (which 
gives you run-time checking but has some space cost) or UnsafeCell 
(which is unsafe as the name says).  Rust makes it pretty easy to write 
smart pointers (Mutex<>'s lock guard itself is a smart pointer) so you 
also have the possibility of writing a safe wrapper for the combination 
of Mutex<()> and UnsafeCell.
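
A sketch of such a safe wrapper over `Mutex<()>` plus `UnsafeCell`, with std types (the name `LockedCell` is made up for illustration):

```rust
use std::cell::UnsafeCell;
use std::sync::Mutex;

// The pattern described above: a Mutex<()> that "officially" protects
// nothing, plus an UnsafeCell for the data it really guards, wrapped
// so that callers can only reach the data while the lock is held.
struct LockedCell<T> {
    lock: Mutex<()>,
    data: UnsafeCell<T>,
}

// SAFETY: all access to `data` goes through `with`, which holds `lock`.
unsafe impl<T: Send> Sync for LockedCell<T> {}

impl<T> LockedCell<T> {
    fn new(value: T) -> Self {
        LockedCell { lock: Mutex::new(()), data: UnsafeCell::new(value) }
    }

    fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        let _guard = self.lock.lock().unwrap();
        // SAFETY: `lock` is held, so no other `with` call aliases `data`.
        f(unsafe { &mut *self.data.get() })
    }
}
```

The unsafety is confined to the wrapper; users of `with` stay in safe code.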

Another example is when you have a list of XYZ objects and use the same 
mutex for both the list of XYZ and a field in struct XYZ.  You could 
place that field in an UnsafeCell and write a function that receives a 
guard for the list lock and returns the field, or something like that. 
It *is* quite ugly though.

As an aside, from a teaching standpoint associating a Mutex with a 
specific data structure is bad IMNSHO, because it encourages too 
fine-grained locking.  Sometimes the easiest path to scalability is to 
use a more coarse lock and ensure that contention is extremely rare. 
But it does work for most simple use cases (and device drivers would 
qualify as simple more often than not).

Paolo
Miguel Ojeda April 17, 2021, 3:23 p.m. UTC | #60
On Sat, Apr 17, 2021 at 4:21 PM Willy Tarreau <w@1wt.eu> wrote:
>
> Well, I can't express how much I hate abstractions because I constantly
> need to know what it's doing under the hood, and I spend my time reading
> the output asm code because I always want to confirm my assumptions about
> the compiler not cheating on me (and not hitting one of its bugs),
> especially after C compilers have become so smart that they completely
> replace your code with what they think is better for you, (including
> nothing), so I guess all of this is really not for someone like me.

Concerning compiler bugs etc.: as you mention, nowadays that applies
to both C and Rust.

Manually inspecting the output asm does not really scale anymore
because we need to worry about all compiler versions out there,
including future ones, for both GCC and Clang.

So we would need to disable optimizations, or reduce the set of
supported compiler versions, or automatically check the generated asm
(like in the compiler's own test suites), or keep binary blobs after
checking (like in some safety-critical domains), etc.

Since none of those are really doable for us (except perhaps for small
subsets of code or unit tests), we need other ways to help with this.
Rust provides several here. For instance, the UB-less subset means
fewer surprises and less double-checking of whether some particular
construct is UB and may give problems later on.

Similarly, Rust is actually more explicit in many cases than C, to
reduce surprises further. For instance, regarding implicit type
conversions, none of these compile:

    fn f1(n: i32) -> i64 { n }
    fn f2(n: i32, m: i64) { n + m; }
    fn f3(b: bool) -> i32 { b }
    fn f4(n: i32) -> bool { n }
    fn f5(n: i32) -> i32 { if n { 42 } else { 53 } }

Building abstractions also helps to ensure you get the semantics you
want in the face of smart optimizers -- rather than the opposite. For
instance, continuing with the integer examples, you may use a
`NonZeroU32`. Or a `Wrapping<u32>` for intentional wrapping integer
arithmetic, etc.
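
For instance, a small sketch with the std types just mentioned:

```rust
use std::num::{NonZeroU32, Wrapping};

// Intentional wrap-around is spelled out in the type rather than
// being an accident of the operation:
fn wrapping_inc(n: u32) -> u32 {
    (Wrapping(n) + Wrapping(1)).0
}

// And "this value is never zero" can be stated in the type too:
fn per_item(total: u32, count: NonZeroU32) -> u32 {
    total / count.get() // the divisor cannot be zero by construction
}
```

With a plain `u32`, `n + 1` on `u32::MAX` panics in debug builds and silently wraps in release; `Wrapping` makes the intent explicit in either mode.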

Cheers,
Miguel
Richard Weinberger April 17, 2021, 8:42 p.m. UTC | #61
On Thu, Apr 15, 2021 at 2:41 AM <ojeda@kernel.org> wrote:
> Regarding compilers, we support Clang-built kernels as well as
> `LLVM=1` builds where possible (i.e. as long as supported by
> the ClangBuiltLinux project). We also maintain some configurations
> of GCC-built kernels working, but they are not intended to be used
> at the present time. Having a `bindgen` backend for GCC would be
> ideal to improve support for those builds.

So this effectively means gcc is a second-class citizen, and even if gcc
is supported at some point one needs a super recent gcc *and* rust
toolchain to build a rust-enabled kernel?
I understand that this is right now not a big deal, but as soon as a
non-trivial subsystem is rust-only, people are forced to upgrade.

Don't get me wrong, I'm all for having rust support in Linux.
But I'm a bit worried about new dependencies on compiler toolchains.
As someone who works a lot with long-supported embedded systems I learned that
as soon as an application gains a hard dependency on clang or rust I'm in trouble.
Wedson Almeida Filho April 18, 2021, 3:31 p.m. UTC | #62
On Wed, Apr 14, 2021 at 09:09:53PM +0100, Matthew Wilcox wrote:
> By the way, I don't think that Rust necessarily has to conform to the
> current way that Linux works.  If this prompted us to track the current
> context (inside spinlock, handling interrupt, performing writeback, etc)
> and do away with (some) GFP flags, that's not the end of the world.
> We're already moving in that direction to a certain extent with the
> scoped memory allocation APIs to replace GFP_NOFS / GFP_NOIO.

I hadn't myself considered this option but it looks enticing to me. Do you have
a sense of which GFP flags we wouldn't be able to obviate even if we did track
state?
Wedson Almeida Filho April 18, 2021, 3:51 p.m. UTC | #63
On Sat, Apr 17, 2021 at 04:21:27PM +0200, Willy Tarreau wrote:
> Well, I can't express how much I hate abstractions because I constantly
> need to know what it's doing under the hood, and I spend my time reading
> the output asm code because I always want to confirm my assumptions about
> the compiler not cheating on me (and not hitting one of its bugs),
> especially after C compilers have become so smart that they completely
> replace your code with what they think is better for you, (including
> nothing),

I understand the feeling. One thing I can say about the abstractions we've been
talking about is that they're zero-cost. So you'd still have the ability to
inspect generated code and relate that to source, although it would still be
subject to optimisations like C (or perhaps more optimisations as the compiler
knows more about the code).

> so I guess all of this is really not for someone like me.

This may indeed be the case. But I'd invite you to try it out for yourself
anyway before discounting it. I used to hate destructors in C++ because they
were called implicitly: C was king because I had full control. Now I find myself
publicly backing Rust. I feel the advantages outweigh the cost.

> However while I'm pretty sure that based on our respective experiences
> we'd probably disagree forever on a wide number of approaches when it
> comes to deciding whether the developer or the compiler should have the
> last say, I sincerely appreciate that you take the time to calmly explain
> your differing views and the rationale behind, so many thanks for this!

Thank you. I also appreciate your willingness to engage with us.

Cheers,
-Wedson
Peter Zijlstra April 19, 2021, 7:32 a.m. UTC | #64
On Sat, Apr 17, 2021 at 04:51:58PM +0200, Paolo Bonzini wrote:
> On 16/04/21 09:09, Peter Zijlstra wrote:
> > Well, the obvious example would be seqlocks. C11 can't do them
> 
> Sure it can.  C11 requires annotating with (the equivalent of) READ_ONCE all
> reads of seqlock-protected fields, but the memory model supports seqlocks
> just fine.

How does that help?

IIRC there's two problems, one on each side of the lock. On the write side
we have:

	seq++;
	smp_wmb();
	X = r;
	Y = r;
	smp_wmb();
	seq++;

Which C11 simply cannot do right because it doesn't have wmb. You end up
having to use seq_cst for the first wmb or make both X and Y (on top of
the last seq) a store-release, both options are sub-optimal.

On the read side you get:

	do {
	  s = seq;
	  smp_rmb();
	  r1 = X;
	  r2 = Y;
	  smp_rmb();
	} while ((s&1) || seq != s);

And then you get into trouble with the last barrier, so the first seq load
can be load-acquire, after which the loads of X, Y come after, but you
need then to happen before the second seq load, for which you then need
seq_cst, or make X and Y load-acquire. Again, not optimal.

I have also seen *many* broken variants of it on the web. Some work on
x86 but are totally broken when you build them on LL/SC ARM64.
Paolo Bonzini April 19, 2021, 7:53 a.m. UTC | #65
On 19/04/21 09:32, Peter Zijlstra wrote:
> On Sat, Apr 17, 2021 at 04:51:58PM +0200, Paolo Bonzini wrote:
>> On 16/04/21 09:09, Peter Zijlstra wrote:
>>> Well, the obvious example would be seqlocks. C11 can't do them
>>
>> Sure it can.  C11 requires annotating with (the equivalent of) READ_ONCE all
>> reads of seqlock-protected fields, but the memory model supports seqlocks
>> just fine.
> 
> How does that help?
> 
> IIRC there's two problems, one on each side the lock. On the write side
> we have:
> 
> 	seq++;
> 	smp_wmb();
> 	X = r;
> 	Y = r;
> 	smp_wmb();
> 	seq++;
> 
> Which C11 simply cannot do right because it does't have wmb.

It has atomic_thread_fence(memory_order_release), and 
atomic_thread_fence(memory_order_acquire) on the read side.

> You end up
> having to use seq_cst for the first wmb or make both X and Y (on top of
> the last seq) a store-release, both options are sub-optimal.

seq_cst (except for the fence which is just smp_mb) is a pile of manure, 
no doubt about that. :)

Paolo
Peter Zijlstra April 19, 2021, 8:26 a.m. UTC | #66
On Mon, Apr 19, 2021 at 09:53:06AM +0200, Paolo Bonzini wrote:
> On 19/04/21 09:32, Peter Zijlstra wrote:
> > On Sat, Apr 17, 2021 at 04:51:58PM +0200, Paolo Bonzini wrote:
> > > On 16/04/21 09:09, Peter Zijlstra wrote:
> > > > Well, the obvious example would be seqlocks. C11 can't do them
> > > 
> > > Sure it can.  C11 requires annotating with (the equivalent of) READ_ONCE all
> > > reads of seqlock-protected fields, but the memory model supports seqlocks
> > > just fine.
> > 
> > How does that help?
> > 
> > IIRC there's two problems, one on each side the lock. On the write side
> > we have:
> > 
> > 	seq++;
> > 	smp_wmb();
> > 	X = r;
> > 	Y = r;
> > 	smp_wmb();
> > 	seq++;
> > 
> > Which C11 simply cannot do right because it does't have wmb.
> 
> It has atomic_thread_fence(memory_order_release), and
> atomic_thread_fence(memory_order_acquire) on the read side.

https://godbolt.org/z/85xoPxeE5

void writer(void)
{
    atomic_store_explicit(&seq, seq+1, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);

    X = 1;
    Y = 2;

    atomic_store_explicit(&seq, seq+1, memory_order_release);
}

gives:

writer:
        adrp    x1, .LANCHOR0
        add     x0, x1, :lo12:.LANCHOR0
        ldr     w2, [x1, #:lo12:.LANCHOR0]
        add     w2, w2, 1
        str     w2, [x0]
        dmb     ishld
        ldr     w1, [x1, #:lo12:.LANCHOR0]
        mov     w3, 1
        mov     w2, 2
        stp     w3, w2, [x0, 4]
        add     w1, w1, w3
        stlr    w1, [x0]
        ret

Which, afaict, is completely buggered. What it seems to be doing is
turning the seq load into a load-acquire, but what we really need is to
make sure the seq store (increment) is ordered before the other stores.
Peter Zijlstra April 19, 2021, 8:35 a.m. UTC | #67
On Mon, Apr 19, 2021 at 10:26:57AM +0200, Peter Zijlstra wrote:

> https://godbolt.org/z/85xoPxeE5

That wants _Atomic on the seq definition for clang.

> void writer(void)
> {
>     atomic_store_explicit(&seq, seq+1, memory_order_relaxed);
>     atomic_thread_fence(memory_order_acquire);
> 
>     X = 1;
>     Y = 2;
> 
>     atomic_store_explicit(&seq, seq+1, memory_order_release);
> }
> 
> gives:
> 
> writer:
>         adrp    x1, .LANCHOR0
>         add     x0, x1, :lo12:.LANCHOR0
>         ldr     w2, [x1, #:lo12:.LANCHOR0]
>         add     w2, w2, 1
>         str     w2, [x0]
>         dmb     ishld
>         ldr     w1, [x1, #:lo12:.LANCHOR0]
>         mov     w3, 1
>         mov     w2, 2
>         stp     w3, w2, [x0, 4]
>         add     w1, w1, w3
>         stlr    w1, [x0]
>         ret
> 
> Which, afaict, is completely buggered. What it seems to be doing is
> turning the seq load into a load-acquire, but what we really need is to
> make sure the seq store (increment) is ordered before the other stores.

Put differently, what you seem to want is store-acquire, but there ain't
no such thing.
Paolo Bonzini April 19, 2021, 9:02 a.m. UTC | #68
On 19/04/21 10:26, Peter Zijlstra wrote:
> On Mon, Apr 19, 2021 at 09:53:06AM +0200, Paolo Bonzini wrote:
>> On 19/04/21 09:32, Peter Zijlstra wrote:
>>> On Sat, Apr 17, 2021 at 04:51:58PM +0200, Paolo Bonzini wrote:
>>>> On 16/04/21 09:09, Peter Zijlstra wrote:
>>>>> Well, the obvious example would be seqlocks. C11 can't do them
>>>>
>>>> Sure it can.  C11 requires annotating with (the equivalent of) READ_ONCE all
>>>> reads of seqlock-protected fields, but the memory model supports seqlocks
>>>> just fine.
>>>
>>> How does that help?
>>>
>>> IIRC there's two problems, one on each side the lock. On the write side
>>> we have:
>>>
>>> 	seq++;
>>> 	smp_wmb();
>>> 	X = r;
>>> 	Y = r;
>>> 	smp_wmb();
>>> 	seq++;
>>>
>>> Which C11 simply cannot do right because it does't have wmb.
>>
>> It has atomic_thread_fence(memory_order_release), and
>> atomic_thread_fence(memory_order_acquire) on the read side.
> 
> https://godbolt.org/z/85xoPxeE5
> 
> void writer(void)
> {
>      atomic_store_explicit(&seq, seq+1, memory_order_relaxed);
>      atomic_thread_fence(memory_order_acquire);

This needs to be memory_order_release.  The only change in the resulting 
assembly is that "dmb ishld" becomes "dmb ish", which is not as good as 
the "dmb ishst" you get from smp_wmb() but not buggy either.

The read side can use "dmb ishld" so it gets the same code as Linux.
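
For completeness, the release-fence variant can also be written with Rust's std atomics, which expose the same C11 fences; this is a single-writer userspace sketch, not kernel code:

```rust
use std::sync::atomic::{fence, AtomicU64, AtomicUsize, Ordering};

static SEQ: AtomicUsize = AtomicUsize::new(0);
static X: AtomicU64 = AtomicU64::new(0);
static Y: AtomicU64 = AtomicU64::new(0);

fn write(x: u64, y: u64) {
    SEQ.fetch_add(1, Ordering::Relaxed); // odd: writer in progress
    fence(Ordering::Release); // orders the seq store before the data stores
    X.store(x, Ordering::Relaxed);
    Y.store(y, Ordering::Relaxed);
    SEQ.fetch_add(1, Ordering::Release); // even: writer done
}

fn read() -> (u64, u64) {
    loop {
        let s = SEQ.load(Ordering::Acquire);
        if s & 1 == 1 {
            continue; // writer active, retry
        }
        let x = X.load(Ordering::Relaxed);
        let y = Y.load(Ordering::Relaxed);
        fence(Ordering::Acquire); // orders the data loads before the re-check
        if SEQ.load(Ordering::Relaxed) == s {
            return (x, y);
        }
    }
}
```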

LWN needs a "C11 memory model for kernel folks" article.  In the 
meanwhile there is 
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0124r4.html 
which is the opposite (Linux kernel memory model for C11 folks).

Paolo

> 
>      X = 1;
>      Y = 2;
> 
>      atomic_store_explicit(&seq, seq+1, memory_order_release);
> }
> 
> gives:
> 
> writer:
>          adrp    x1, .LANCHOR0
>          add     x0, x1, :lo12:.LANCHOR0
>          ldr     w2, [x1, #:lo12:.LANCHOR0]
>          add     w2, w2, 1
>          str     w2, [x0]
>          dmb     ishld
>          ldr     w1, [x1, #:lo12:.LANCHOR0]
>          mov     w3, 1
>          mov     w2, 2
>          stp     w3, w2, [x0, 4]
>          add     w1, w1, w3
>          stlr    w1, [x0]
>          ret
> 
> Which, afaict, is completely buggered. What it seems to be doing is
> turning the seq load into a load-acquire, but what we really need is to
> make sure the seq store (increment) is ordered before the other stores.
> 
>
Peter Zijlstra April 19, 2021, 9:36 a.m. UTC | #69
On Mon, Apr 19, 2021 at 11:02:12AM +0200, Paolo Bonzini wrote:
> > void writer(void)
> > {
> >      atomic_store_explicit(&seq, seq+1, memory_order_relaxed);
> >      atomic_thread_fence(memory_order_acquire);
> 
> This needs to be memory_order_release.  The only change in the resulting
> assembly is that "dmb ishld" becomes "dmb ish", which is not as good as the
> "dmb ishst" you get from smp_wmb() but not buggy either.

Yuck! And that is what requires the insides to be
atomic_store_explicit(), otherwise this fence doesn't have to affect
them.

I also don't see how this is better than seq_cst.

But yes, not broken, but also very much not optimal.
Paolo Bonzini April 19, 2021, 9:40 a.m. UTC | #70
On 19/04/21 11:36, Peter Zijlstra wrote:
> On Mon, Apr 19, 2021 at 11:02:12AM +0200, Paolo Bonzini wrote:
>>> void writer(void)
>>> {
>>>       atomic_store_explicit(&seq, seq+1, memory_order_relaxed);
>>>       atomic_thread_fence(memory_order_acquire);
>>
>> This needs to be memory_order_release.  The only change in the resulting
>> assembly is that "dmb ishld" becomes "dmb ish", which is not as good as the
>> "dmb ishst" you get from smp_wmb() but not buggy either.
> 
> Yuck! And that is what requires the insides to be
> atomic_store_explicit(), otherwise this fence doesn't have to affect
> them.

Not just that, even the write needs to be atomic_store_explicit in order 
to avoid a data race.

> I also don't see how this is better than seq_cst.

It is better than seq_cst on TSO architectures.  Another possibility is 
to use release stores for everything (both increments and the stores 
between them).

> But yes, not broken, but also very much not optimal.

Agreed on that, just like RCU/memory_order_consume.

Paolo
Will Deacon April 19, 2021, 11:01 a.m. UTC | #71
On Mon, Apr 19, 2021 at 11:40:54AM +0200, Paolo Bonzini wrote:
> On 19/04/21 11:36, Peter Zijlstra wrote:
> > On Mon, Apr 19, 2021 at 11:02:12AM +0200, Paolo Bonzini wrote:
> > > > void writer(void)
> > > > {
> > > >       atomic_store_explicit(&seq, seq+1, memory_order_relaxed);
> > > >       atomic_thread_fence(memory_order_acquire);
> > > 
> > > This needs to be memory_order_release.  The only change in the resulting
> > > assembly is that "dmb ishld" becomes "dmb ish", which is not as good as the
> > > "dmb ishst" you get from smp_wmb() but not buggy either.
> > 
> > Yuck! And that is what requires the insides to be
> > atomic_store_explicit(), otherwise this fence doesn't have to affect
> > them.
> 
> Not just that, even the write needs to be atomic_store_explicit in order to
> avoid a data race.

https://wg21.link/P0690

was an attempt to address this, but I don't know if any of the ideas got
adopted in the end.

Will
Linus Torvalds April 19, 2021, 5:14 p.m. UTC | #72
On Mon, Apr 19, 2021 at 2:36 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> I also don't see how this is better than seq_cst.
>
> But yes, not broken, but also very much not optimal.

I continue to feel like kernel people should just entirely ignore the
C++ memory ordering standard.

It's inferior to what we already have, and simply not helpful. It
doesn't actually solve any problems as far as the kernel is concerned,
and it generates its own set of issues (ie assuming that the compiler
supports it, and assuming the compiler gets it right).

The really subtle cases that it could have been helpful for (eg RCU,
or the load-store control dependencies) were _too_ subtle for the
standard.

And I do not believe Rust changes _any_ of that.

Any kernel Rust code will simply have to follow the LKMM rules, and
use the kernel model for the interfaces. Things like the C++ memory
model is simply not _relevant_ to the kernel.

         Linus
Paolo Bonzini April 19, 2021, 6:38 p.m. UTC | #73
On 19/04/21 19:14, Linus Torvalds wrote:
> On Mon, Apr 19, 2021 at 2:36 AM Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> I also don't see how this is better than seq_cst.
>>
>> But yes, not broken, but also very much not optimal.
> 
> I continue to feel like kernel people should just entirely ignore the
> C++ memory ordering standard.
> 
> It's inferior to what we already have, and simply not helpful. It
> doesn't actually solve any problems as far as the kernel is concerned,
> and it generates its own set of issues (ie assuming that the compiler
> supports it, and assuming the compiler gets it right).
> 
> The really subtle cases that it could have been helpful for (eg RCU,
> or the load-store control dependencies) were _too_ subtle for the
> standard.
> 
> And I do not believe Rust changes _any_ of that.

It changes it for the worse, in that access to fields that are shared 
across threads *must* either use atomic types (which boil down to the 
same compiler intrinsics as the C/C++ memory model) or synchronization 
primitives.  LKMM operates in the grey area between the C standard and 
what gcc/clang actually implement, but there's no such grey area in Rust 
unless somebody wants to rewrite arch/*/asm atomic access primitives and 
memory barriers in Rust.
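
To make the constraint above concrete, here is a minimal user-space sketch
(a toy, not kernel code): a counter shared across threads must be an atomic
type in Rust, and those atomics compile down to the same intrinsics as the
C11/C++11 memory_order_* operations. The helper name is illustrative only.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Increment a shared counter from several threads. A plain `usize`
// field shared like this would be rejected by the compiler; safe Rust
// forces the atomic type, with C11-style ordering arguments.
fn count_with_threads(threads: usize, iters: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let c = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..iters {
                    c.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}

fn main() {
    // No data race, no torn updates: the total is always exact.
    assert_eq!(count_with_threads(4, 1000), 4000);
}
```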

Of course it's possible to say Rust code just uses the C/C++/Rust model 
and C code follows the LKMM, but that really only delays the inevitable 
until a driver is written partly in C and partly in Rust, and needs to 
perform accesses outside synchronization primitives.

Paolo

> Any kernel Rust code will simply have to follow the LKMM rules, and
> use the kernel model for the interfaces. Things like the C++ memory
> model is simply not _relevant_ to the kernel.
Linus Torvalds April 19, 2021, 6:50 p.m. UTC | #74
On Mon, Apr 19, 2021 at 11:38 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> It changes it for the worse, in that access to fields that are shared
> across threads *must* either use atomic types

Well, we won't be using those broken types in the core kernel, so that
would all be entirely on the Rust side.

And I don't expect the Rust side to do a lot of non-locked accesses,
which presumably shouldn't need any of this anyway.

If Rust code ends up accessing actual real kernel data structures with
memory ordering, then that will be to types that do *not* follow the
useless C++ atomics, and that in turn presumably means that it will be
done as "unsafe" helpers that do what the LKMM does (ie READ_ONCE()
and all the rest of it).
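
As a purely illustrative sketch of what such an "unsafe" helper might look
like (the name `read_once` and its shape are hypothetical, not part of this
patch set), a Rust wrapper could perform a volatile load the way the
kernel's READ_ONCE() does, bypassing the C11/Rust atomic types entirely:

```rust
use core::ptr;

/// Hypothetical READ_ONCE()-style helper: a single load that the
/// compiler may not fuse, split, or elide.
///
/// # Safety
/// `src` must be valid for reads and properly aligned.
unsafe fn read_once<T: Copy>(src: *const T) -> T {
    ptr::read_volatile(src)
}

fn main() {
    let x: u64 = 42;
    // Sound here: `x` is valid and aligned for the whole call.
    let v = unsafe { read_once(&x as *const u64) };
    assert_eq!(v, 42);
}
```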

                 Linus
Nick Desaulniers April 19, 2021, 8:35 p.m. UTC | #75
On Fri, Apr 16, 2021 at 11:47 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Apr 15, 2021 at 11:04:37PM -0700, Nick Desaulniers wrote:
> > On Thu, Apr 15, 2021 at 9:27 PM Boqun Feng <boqun.feng@gmail.com> wrote:
> > >
> > > But I think the Rust Community still wants to have a good memory model,
> > > and they are open to any kind of suggestion and input. I think we (LKMM
> > > people) should really get involved, because the recent discussion on
> > > RISC-V's atomics shows that if we didn't people might get a "broken"
> > > design because they thought C11 memory model is good enough:
> > >
> > >         https://lore.kernel.org/lkml/YGyZPCxJYGOvqYZQ@boqun-archlinux/
> > >
> > > And the benefits are mutual: a) Linux Kernel Memory Model (LKMM) is
> > > defined by combining the requirements of developers and the behavior of
> > > hardware; it's practical and can be a very good input for memory model
> > > design in Rust; b) once Rust has a better memory model, whatever
> > > compiler technologies Rust compilers use to support the memory model
> > > can be adopted by C compilers and we can get that part for free.
> >
> > Yes, I agree; I think that's a very good approach.  Avoiding the ISO
> > WG14 is interesting; at least the merits could be debated in the
> > public and not behind closed doors.
>
> WG14 (C) and WG21 (C++) are at least somewhat open.  Here are some of
> the proposals a few of us have in flight:

Wow, the working groups have been busy.  Thank you Paul and Boqun (and
anyone else on thread) for authoring many of the proposals listed
below.  Looks like I have a lot of reading to do to catch up.

Have any of those been accepted yet, or led to amendments to either
language's specs?  Where's the best place to track that?

My point has more to do with _participation_.  Rust's RFC process is
well documented (https://rust-lang.github.io/rfcs/introduction.html)
and is done via github pull requests[0].

This is a much different process than drafts thrown over the wall.
What hope do any kernel contributors have to participate in the ISO
WGs, other than hoping their contributions to a draft foresee/address
any concerns members of the committee might have?  How do members of
the ISO WG communicate with folks interested in the outcomes of their
decisions?

The two processes are quite different; one doesn't require "joining a
national body" (which I assume involves some monetary transaction, if
not for the individual participant, for their employer) for instance.
(http://www.open-std.org/jtc1/sc22/wg14/www/contributing which links
to https://www.iso.org/members.html; I wonder if we have kernel
contributors in those grayed out countries?)

It was always very ironic to me that the most important body of free
software was subject to decisions made by ISO, for better or for
worse.  I would think Rust's RFC process would be more accessible to
kernel developers, modulo the anti-GitHub crowd, but given the Rust
community's values of inclusion I'm sure they'd be happy to accommodate
folks in the RFC process without requiring GitHub.  I'm not sure ISO
can be as flexible for non-members.

Either way, I think Rust's RFC process is something worth adding to
the list of benefits under the heading "Why Rust?" in this proposed
RFC.

>
> P2055R0 A Relaxed Guide to memory_order_relaxed
>         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2055r0.pdf
> P0124R7 Linux-Kernel Memory Model (vs. that of C/C++)
>         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0124r7.html
> P1726R4 Pointer lifetime-end zap
>         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1726r4.pdf
>         https://docs.google.com/document/d/1MfagxTa6H0rTxtq9Oxyh4X53NzKqOt7y3hZBVzO_LMk/edit?usp=sharing
> P1121R2 Hazard Pointers: Proposed Interface and Wording for Concurrency TS 2
>         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1121r2.pdf
> P1382R1 volatile_load<T> and volatile_store<T>
>         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1382r1.pdf
> P1122R2 Proposed Wording for Concurrent Data Structures: Read-Copy-Update (RCU)
>         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1122r2.pdf
>         https://docs.google.com/document/d/1MfagxTa6H0rTxtq9Oxyh4X53NzKqOt7y3hZBVzO_LMk/edit?usp=sharing
> P0190R4 Proposal for New memory order consume Definition
>         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf
> P0750R1 Consume
>         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0750r1.html

Does WG14 not participate in these discussions? (Or, is there a lot of
overlap between participants in both?)
http://93.90.116.65/JTC1/SC22/WG14/www/docs/ seems like a list of
proposals and meeting minutes, but all of the above links look like
WG21.  The model of decisions being made for C++ then trickling down
to C is definitely curious.  Though perhaps for the topics of memory
orderings there's enough overlap between the two languages for it not
to matter.

>
> P1726R4 is of particular concern, along with consume.


[0] You can see all of the existing ones here:
https://github.com/rust-lang/rfcs/tree/master/text, with links to
discussions for each on github.  (Here's one that has not been
accepted yet: https://github.com/rust-lang/rfcs/blob/master/text/1937-ques-in-main.md,
see the link to the issue in the rust issue tracker has lots of
comments _from the community_:
https://github.com/rust-lang/rust/issues/43301).  I guess the
equivalents for the ISO WGs would be the meeting minutes?
--
Thanks,
~Nick Desaulniers
Paul E. McKenney April 19, 2021, 9:37 p.m. UTC | #76
On Mon, Apr 19, 2021 at 01:35:56PM -0700, Nick Desaulniers wrote:
> On Fri, Apr 16, 2021 at 11:47 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Apr 15, 2021 at 11:04:37PM -0700, Nick Desaulniers wrote:
> > > On Thu, Apr 15, 2021 at 9:27 PM Boqun Feng <boqun.feng@gmail.com> wrote:
> > > >
> > > > But I think the Rust Community still wants to have a good memory model,
> > > > and they are open to any kind of suggestion and input. I think we (LKMM
> > > > people) should really get involved, because the recent discussion on
> > > > RISC-V's atomics shows that if we didn't people might get a "broken"
> > > > design because they thought C11 memory model is good enough:
> > > >
> > > >         https://lore.kernel.org/lkml/YGyZPCxJYGOvqYZQ@boqun-archlinux/
> > > >
> > > > And the benefits are mutual: a) Linux Kernel Memory Model (LKMM) is
> > > > defined by combining the requirements of developers and the behavior of
> > > > hardware; it's practical and can be a very good input for memory model
> > > > design in Rust; b) once Rust has a better memory model, whatever
> > > > compiler technologies Rust compilers use to support the memory model
> > > > can be adopted by C compilers and we can get that part for free.
> > >
> > > Yes, I agree; I think that's a very good approach.  Avoiding the ISO
> > > WG14 is interesting; at least the merits could be debated in the
> > > public and not behind closed doors.
> >
> > WG14 (C) and WG21 (C++) are at least somewhat open.  Here are some of
> > the proposals a few of us have in flight:
> 
> Wow, the working groups have been busy.  Thank you Paul and Boqun (and
> anyone else on thread) for authoring many of the proposals listed
> below.  Looks like I have a lot of reading to do to catch up.

And this is only the proposals relating to low-level concurrency.
There are way more proposals relating to vector processing, GPGPUs,
task-based concurrency, and so on.  Here is the list of papers submitted
thus far this year:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/

> Have any of those been accepted yet, or led to amendments to either
> language's specs?  Where's the best place to track that?

None of them have been accepted yet.  P1121R2 has been recommended for
Concurrency Technical Specification 2, which is a stepping stone to the
International Standard.  I hope that P1122R2 will follow soon.

> My point has more to do with _participation_.  Rust's RFC process is
> well documented (https://rust-lang.github.io/rfcs/introduction.html)
> and is done via github pull requests[0].
> 
> This is a much different process than drafts thrown over the wall.
> What hope do any kernel contributors have to participate in the ISO
> WGs, other than hoping their contributions to a draft foresee/address
> any concerns members of the committee might have?  How do members of
> the ISO WG communicate with folks interested in the outcomes of their
> decisions?

I participate in ISO SC22 WG21 (C++) and to a lesser extent WG14 (C).
Participation is key.  The various US National Laboratories send quite a
few people, which has a lot to do with these two standards accommodating
their wishes.

> The two processes are quite different; one doesn't require "joining a
> national body" (which I assume involves some monetary transaction, if
> not for the individual participant, for their employer) for instance.
> (http://www.open-std.org/jtc1/sc22/wg14/www/contributing which links
> to https://www.iso.org/members.html; I wonder if we have kernel
> contributors in those grayed out countries?)

Your employer is already a member of WG21 (C++), so there is no ISO
obstacle to participation by you or by your colleagues.  If you contact
me offlist, I would be happy to connect you to your employer's WG21
representative.

> It was always very ironic to me that the most important body of free
> software was subject to decisions made by ISO, for better or for
> worse.  I would think Rust's RFC process would be more accessible to
> kernel developers, modulo the anti-GitHub crowd, but given the Rust
> community's values of inclusion I'm sure they'd be happy to accommodate
> folks in the RFC process without requiring GitHub.  I'm not sure ISO
> can be as flexible for non-members.

Being a member is not all that necessary.  Yes, at the end of the day,
only national bodies can formally vote, but I have not come across a
case of anyone being barred from discussions, paper submissions, or straw
polls (informal votes used to set direction) due to not being a member.
Given that the national bodies are only permitted to comment on and vote
on finished proposals, you can argue that individuals have more influence.

Similarly, it is not necessary to be personally acquainted with me in
order to get patches into Linux-kernel RCU.

> Either way, I think Rust's RFC process is something worth adding to
> the list of benefits under the heading "Why Rust?" in this proposed
> RFC.

Comparing Rust's process to that of the C or C++ standard committees is
best done by someone who understands both, and that set does not appear
to include either you or me.  ;-)

I could easily believe that Rust's process is lighter weight, but on
the other hand, I doubt that Rust's definition has legal standing in
any jurisdiction just yet.

> > P2055R0 A Relaxed Guide to memory_order_relaxed
> >         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2055r0.pdf
> > P0124R7 Linux-Kernel Memory Model (vs. that of C/C++)
> >         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0124r7.html
> > P1726R4 Pointer lifetime-end zap
> >         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1726r4.pdf
> >         https://docs.google.com/document/d/1MfagxTa6H0rTxtq9Oxyh4X53NzKqOt7y3hZBVzO_LMk/edit?usp=sharing
> > P1121R2 Hazard Pointers: Proposed Interface and Wording for Concurrency TS 2
> >         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1121r2.pdf
> > P1382R1 volatile_load<T> and volatile_store<T>
> >         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1382r1.pdf
> > P1122R2 Proposed Wording for Concurrent Data Structures: Read-Copy-Update (RCU)
> >         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1122r2.pdf
> >         https://docs.google.com/document/d/1MfagxTa6H0rTxtq9Oxyh4X53NzKqOt7y3hZBVzO_LMk/edit?usp=sharing
> > P0190R4 Proposal for New memory order consume Definition
> >         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf
> > P0750R1 Consume
> >         http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0750r1.html
> 
> Does WG14 not participate in these discussions? (Or, is there a lot of
> overlap between participants in both?)
> http://93.90.116.65/JTC1/SC22/WG14/www/docs/ seems like a list of
> proposals and meeting minutes, but all of the above links look like
> WG21.  The model of decisions being made for C++ then trickling down
> to C is definitely curious.  Though perhaps for the topics of memory
> orderings there's enough overlap between the two languages for it not
> to matter.

Back in 2005, WG14 and WG21 agreed that low-level concurrency proposals
would be handled by WG21, and that WG14 members would participate.
But some things required working with both committees, for example,
there is a WG14 counterpart to P1726R4, namely N2443:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2443.pdf

> > P1726R4 is of particular concern, along with consume.
> 
> 
> [0] You can see all of the existing ones here:
> https://github.com/rust-lang/rfcs/tree/master/text, with links to
> discussions for each on github.  (Here's one that has not been
> accepted yet: https://github.com/rust-lang/rfcs/blob/master/text/1937-ques-in-main.md,
> see the link to the issue in the rust issue tracker has lots of
> comments _from the community_:
> https://github.com/rust-lang/rust/issues/43301).  I guess the
> equivalents for the ISO WGs would be the meeting minutes?

There are minutes on wikis, github issue trackers, follow-on papers,
and so on.

Once things are reasonably well set for a given proposal, the formal
national-body-level processes take over.  I have been happy to be able
to sit out that phase for the most part.  ;-)

							Thanx, Paul
Miguel Ojeda April 19, 2021, 10:03 p.m. UTC | #77
Hi Nick,

On Mon, Apr 19, 2021 at 10:36 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> This is a much different process than drafts thrown over the wall.
> What hope do any kernel contributors have to participate in the ISO
> WGs, other than hoping their contributions to a draft foresee/address
> any concerns members of the committee might have?  How do members of
> the ISO WG communicate with folks interested in the outcomes of their
> decisions?

For WG21, several folks write trip reports of each meeting, and you
can check the status of papers in GitHub at
https://github.com/cplusplus/papers/issues.

For WG14, there are far fewer papers in flight. It is more or less easy
to follow by reading the list of latest additions in the first pages
of the draft, as well as the Editor's Report.

> The two processes are quite different; one doesn't require "joining a
> national body" (which I assume involves some monetary transaction, if
> not for the individual participant, for their employer) for instance.
> (http://www.open-std.org/jtc1/sc22/wg14/www/contributing which links
> to https://www.iso.org/members.html; I wonder if we have kernel
> contributors in those grayed out countries?)

They are indeed very different processes. Being an ISO standard has
advantages and disadvantages.

In any case, I should note that not everyone who goes to the meetings
pays, e.g. some go as guests, some are funded by their country (or the
EU or other organizations), etc.

In fact, the bigger costs, in my experience, are the time commitment
(a week several times a year) and the costs of traveling (before the
pandemic, that is).

Furthermore, contributing proposals does not actually require
attending the meetings nor joining the committee -- some people
contribute to the standards via proxy, i.e. somebody else presents
their proposals in the committee.

> It was always very ironic to me that the most important body of free
> software was subject to decisions made by ISO, for better or for
> worse.  I would think Rust's RFC process would be more accessible to
> kernel developers, modulo the anti-github crowd, but with the Rust's
> community's values in inclusion I'm sure they'd be happy to accomodate
> folks for the RFC process without requiring github.  I'm not sure ISO
> can be as flexible for non-members.

Well, the kernel already ignores the C standard here and there. ;-) In
the end, it is "just" a standard -- the kernel and compilers can do
something else when they need.

> Either way, I think Rust's RFC process is something worth adding to
> the list of benefits under the heading "Why Rust?" in this proposed
> RFC.

The Rust RFC process indeed has upsides. It is very dynamic and easy
to participate in, and allows anybody to easily comment on proposals,
even anonymously. But, for better or worse, it does not lead to an ISO
standard (which some people & companies really value, e.g. as
requirements in contracts etc.).

In the end, writing an RFC is similar to writing a paper for ISO. The
bigger differences, as mentioned above, lie in the requirements if you
actually want to attend and present the paper yourself and/or if you
want to have voting rights etc.

Personally, I think some ISO policies could be improved for some types
of standards (or at least let WGs relax them to some degree), but...

Cheers,
Miguel
Nick Desaulniers April 20, 2021, 12:24 a.m. UTC | #78
On Fri, Apr 16, 2021 at 10:39 AM Willy Tarreau <w@1wt.eu> wrote:
>
> resources usage, I'm really not convinced at all it's suited for
> low-level development. I understand the interest of the experiment
> to help the language evolve into that direction, but I fear that
> the kernel will soon be as bloated and insecure as a browser, and
> that's really not to please me.

Dunno, I don't think the introduction of Rust made Firefox _more_ insecure.
https://wiki.mozilla.org/Oxidation#Within_Firefox

I pray no executives ever see Dmitry Vyukov's 2019 Linux Plumbers Conf
talk "Reflections on kernel quality, development process and testing."
https://www.youtube.com/watch?v=iAfrrNdl2f4
or his 2018 Linux Security Summit talk "Syzbot and the Tale of
Thousand Kernel Bugs" https://www.youtube.com/watch?v=qrBVXxZDVQY
(and they're just fuzzing the syscall interface and USB devices.
Imagine once folks can more easily craft malformed bluetooth and wifi
packets.)

I'd imagine the first term that comes to mind for them might be
"liability."  They are quite sensitive to these vulnerabilities with
silly names, logos, and websites.  There are many of us that believe
an incremental approach of introducing a memory safe language to our
existing infrastructure at the very least to attempt to improve the
quality of drivers for those that choose to use such tools is a better
approach.

I think a lot of the current discussion picking nits in syntax, the
format of docs, ease of installation, or theoretical memory models
(which no language, not even the one the kernel is implemented in,
fully provides) rightly should still be added to a revised RFC under
"Why not [Rust]?", but perhaps it severely overlooks the benefits.  A
tradeoff for sure, though.

Really, a key point is that a lot of common mistakes in C are
compile-time errors in Rust. I know no "true" kernel dev would make
such mistakes in C, but is there nothing we can do to help our peers
writing drivers?  The point is to shift cost from runtime to compile
time, avoiding runtime failures like all of the memory safety bugs
which are costing our industry.
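
As a hedged illustration (a user-space toy, not kernel code and not from
this patch set) of the kind of mistake that moves from runtime to compile
time: the commented-out lines below are a use-after-move that rustc
rejects outright, where the C equivalent would compile and crash or
corrupt memory at runtime.

```rust
// Sum a buffer and report its length. The commented-out lines show a
// use-after-free analogue: once `data` is dropped (freed), any further
// use is a hard compile error, not a latent runtime bug.
fn sum_then_len(data: Vec<u32>) -> (u32, usize) {
    let total: u32 = data.iter().sum();
    let n = data.len();
    // drop(data);
    // let oops = data.len(); // error[E0382]: borrow of moved value: `data`
    (total, n)
}

fn main() {
    let (total, n) = sum_then_len(vec![1, 2, 3]);
    assert_eq!((total, n), (6, 3));
}
```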

Curiously recurring statistics:
https://www.zdnet.com/article/microsoft-70-percent-of-all-security-bugs-are-memory-safety-issues/
"Microsoft security engineer Matt Miller said that over the last 12
years, around 70 percent of all Microsoft patches were fixes for
memory safety bugs."

https://www.chromium.org/Home/chromium-security/memory-safety
"The Chromium project finds that around 70% of our serious security
bugs are memory safety problems."

https://security.googleblog.com/2021/01/data-driven-security-hardening-in.html
(59% of Critical and High severity vulnerabilities fixed in Android
Security Bulletins in 2019 are classified as "Memory," FWIW)

https://hacks.mozilla.org/2019/02/rewriting-a-browser-component-in-rust/
"If we’d had a time machine and could have written this component in
Rust from the start, 51 (73.9%) of these bugs would not have been
possible."
--
Thanks,
~Nick Desaulniers
Willy Tarreau April 20, 2021, 3:47 a.m. UTC | #79
Hi Nick,

On Mon, Apr 19, 2021 at 05:24:33PM -0700, Nick Desaulniers wrote:
> I don't think the introduction of Rust made Firefox _more_ insecure.
> https://wiki.mozilla.org/Oxidation#Within_Firefox

Browsers are human interfaces and do not fundamentally require low
level access to memory/hardware/whatever. They can be written in
about any language, only the resource usage and performance will
make a difference. As such, some were even written in Java or JS
for example.

Operating systems, and particularly drivers *do* require low-level
accesses, and stuff that can hardly be abstracted or understood by
a compiler. You may have to perform two 16-bit reads/writes on a
32-bit MMIO address to perform an operation, and the compiler does
not need to understand why; it just has to obey.
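
For illustration only (a plain in-memory u32 stands in for a real
ioremap'ed register, and `write_u32_as_two_u16` is a made-up helper, not a
kernel API), the access pattern described above can be spelled out with
volatile 16-bit stores that the compiler must emit verbatim:

```rust
use core::{mem, ptr};

// Write a 32-bit value as two explicit 16-bit volatile stores. With
// write_volatile, the compiler must emit exactly these two accesses,
// in this order, and may not merge them into one 32-bit store.
fn write_u32_as_two_u16(reg: *mut u32, value: u32) {
    // Split into the two native-endian halves so reassembly is
    // endianness-independent.
    let halves: [u16; 2] = unsafe { mem::transmute(value) };
    let p = reg as *mut u16;
    unsafe {
        ptr::write_volatile(p, halves[0]);
        ptr::write_volatile(p.add(1), halves[1]);
    }
}

fn main() {
    let mut reg: u32 = 0;
    write_u32_as_two_u16(&mut reg, 0xdead_beef);
    assert_eq!(reg, 0xdead_beef);
}
```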

> Really, a key point is that a lot of common mistakes in C are compile
> time errors in Rust. I know no "true" kernel dev would make such
> mistakes in C,

Everyone makes mistakes; the level of attention varies over time, and
the focus often changes when dealing with build errors. How many times
have some of us, facing a bug, remembered having changed the code very
late after a build error, and being less careful from that point on
because the goal changed from "let's do it right" to "let's get this to build"?

> but is there nothing we can do to help our peers
> writing drivers?  The point is to transfer cost from runtime to
> compile time to avoid costs at runtime; like all of the memory safety
> bugs which are costing our industry.

And do we have stats on the number of logic bugs, some of which are
caused by developers trying to work around compilers' stubbornness?
For me, personally speaking, they have *increased* over time, usually
from trying to avoid some annoying modern gcc warnings, resulting in
integer casts being placed close to string formats, or returns being
placed in switch/case to avoid the fall-through warning, etc. Thus I'm
worried that a non-negligible part of the 70% of bugs caused by memory
safety issues could be replaced with logic bugs written to get to the
point where the Rust compiler finally agrees to compile the code. It makes me think about
researchers trying to reduce the causes of certain deaths and claiming
to "save lives" while in the end the people they "save" will simply die
from something else.

And I'm not particularly trying to blindly defend C here. I'm complaining
every single day about some of its shortcomings like the vast amount of
UB, stupid type promotion, counter-intuitive operators precedence when
combining bit-ops with arithmetic, limited size of enums, lack of rotate
operator, strict aliasing, or the recourse to asm() statements every 10
lines to do stuff that can hardly be expressed in a way understandable
by a compiler. I'm just seeing that a lot of the grief I have with
C comes from the compiler trying to be too smart or too stubborn,
so handing even more control to the compiler doesn't appeal to me at all.

In addition, we all know how painful it is to work around compiler bugs
by writing complex code that carefully avoids certain constructs. I'm
wondering if we'll still have that luxury with a stricter compiler, or
if the only response will have to be between "let's disable this driver
that does not compile" or "please force distros to upgrade their
compilers".

But we'll see :-/

Regards,
Willy
Greg KH April 20, 2021, 5:56 a.m. UTC | #80
On Mon, Apr 19, 2021 at 05:24:33PM -0700, Nick Desaulniers wrote:
> On Fri, Apr 16, 2021 at 10:39 AM Willy Tarreau <w@1wt.eu> wrote:
> >
> > resources usage, I'm really not convinced at all it's suited for
> > low-level development. I understand the interest of the experiment
> > to help the language evolve into that direction, but I fear that
> > the kernel will soon be as bloated and insecure as a browser, and
> > that's really not to please me.
> 
> Dunno, I don't think the introduction of Rust made Firefox _more_ insecure.
> https://wiki.mozilla.org/Oxidation#Within_Firefox
> 
> I pray no executives ever see Dmitry Vyukov's 2019 Linux Plumbers Conf
> talk "Reflections on kernel quality, development process and testing."
> https://www.youtube.com/watch?v=iAfrrNdl2f4
> or his 2018 Linux Security Summit talk "Syzbot and the Tale of
> Thousand Kernel Bugs" https://www.youtube.com/watch?v=qrBVXxZDVQY
> (and they're just fuzzing the syscall interface and USB devices.
> Imagine once folks can more easily craft malformed bluetooth and wifi
> packets.)
> 
> I'd imagine the first term that comes to mind for them might be
> "liability."  They are quite sensitive to these vulnerabilities with
> silly names, logos, and websites.  There are many of us that believe
> an incremental approach of introducing a memory safe language to our
> existing infrastructure at the very least to attempt to improve the
> quality of drivers for those that choose to use such tools is a better
> approach.

I would LOVE it if some "executives" would see the above presentations,
because then they would maybe actually fund developers to fix bugs and
maintain the kernel code, instead of only allowing them to add new
features.

Seriously, that's the real problem that Dmitry's work has exposed: the
lack of people allowed to do this type of bugfixing and maintenance on
company time, for something that the company relies on, is a huge issue.
"executives" feel that they are willing to fund the initial work and
then "throw it over the wall to the community" once it is merged, and
then they can forget about it as "the community" will maintain it for
them for free.  And that's a lie, as Dmitry's work shows.

The world creates new use cases and testing ability all the time, which
exposes bugs that have been around in old code.  Once the bugs are fixed
in that layer of code, the next layer down can finally be tested and
then look, more corner cases of issues.

Rewriting the kernel in another language is not going to fix the
majority of the issues that fuzzing finds here automagically, as that
work has exposed us to things like fault-injection and malicious USB
packets that no language would have saved us from "automatically".  All
of those code paths deal with "unsafe" data that doesn't magically
become "safe" because we switch languages.

And over time, what we have determined is "safe" has changed!  People
forget that only a few years ago have we decided that the kernel now has
to protect userspace programs from malicious hardware.  That's a major
shift in thinking, now data that we used to blindly trust can not be
trusted at all.  And "executives" want us to fix all of those issues for
free, when really it's a major design shift for loads of kernel
subsystems to handle this new threat model.

So yes, please spread that talk around.  Maybe then will we actually get
funding and support to FIX the bugs that those tools test.  Right now,
the primary fixer of those findings are _INTERNS_ as that's all
companies are willing to fund to fix this type of thing.

And don't get me started on the inability for "executives" to fund other
parts of Linux that they rely on, because they want "other companies" to
do it instead.  The tragedy-of-the-commons is a real threat to Linux,
and always has been...

thanks,

greg k-h
Willy Tarreau April 20, 2021, 6:16 a.m. UTC | #81
On Tue, Apr 20, 2021 at 07:56:18AM +0200, Greg Kroah-Hartman wrote:
> I would LOVE it if some "executives" would see the above presentations,
> because then they would maybe actually fund developers to fix bugs and
> maintain the kernel code, instead of only allowing them to add new
> features.
> 
> Seriously, that's the real problem, that Dmitry's work has exposed, the
> lack of people allowed to do this type of bugfixing and maintenance on
> company time, for something that the company relies on, is a huge issue.
> "executives" feel that they are willing to fund the initial work and
> then "throw it over the wall to the community" once it is merged, and
> then they can forget about it as "the community" will maintain it for
> them for free.  And that's a lie, as Dmitry's work shows.

That's sadly the eternal situation, and I'm suspecting that software
development and maintenance is not identified as a requirement for a
large number of hardware vendors, especially on the consumer side where
margins are lower. A contractor is paid to develop a driver, *sometimes*
to try to mainline it (and the later they engage with the community, the
longer it takes in round trips), and once the code finally gets merged,
all the initial budget is depleted and no more software work will be
done.

Worse, we could imagine kicking unmaintained drivers faster off the
tree, but that would actually help these unscrupulous vendors by
forcing their customers to switch to the new model :-/  And most of
them wouldn't care either if their contributions were refused based
on their track record of not maintaining their code, since they often
see this as a convenience to please their customers and not something
they need (after all, relying on a bogus and vulnerable BSP has never
prevented from selling a device, quite the opposite).

In short, there is a parallel universe where running highly bogus and
vulnerable out-of-tree code seems like the norm and where there is no
sort of care for what is mainlined as it's possibly just made to look
"cool".

We also need to recognize that it is to be expected that some vendors
are not willing to commit to supporting a driver for a decade if they
expect their device to last only 5 years, and maybe we should make
some rules clear about mainlining drivers and what to expect for
users (in which case the end of support would be clear and nobody
would be surprised if the driver is removed at the end of its
maintenance, barring a switch to a community maintainer).

Just my two cents,
Willy
Linus Walleij April 22, 2021, 10:03 a.m. UTC | #82
Hi folks,

"we will do less critical stuff, like device drivers, first".

OK I mostly do device drivers. Kind of like it. So I'd like to provide
feedback from that angle.

On Fri, Apr 16, 2021 at 4:22 AM Wedson Almeida Filho
<wedsonaf@google.com> wrote:

> We don't intend to directly expose C data structures to Rust code (outside the
> kernel crate). Instead, we intend to provide wrappers that expose safe
> interfaces even though the implementation may use unsafe blocks. So we expect
> the vast majority of Rust code to just care about the Rust memory model.

I'm a bit worried about this.

I am sure you are aware of this document:
Documentation/process/stable-api-nonsense.rst

We really like to change the internal APIs of the kernel, and it sounds to
me like Rust really likes a rust-side-vs-C-side approach to APIs, requiring
these wrappers to be written and maintained all over the place, and that
is going to affect the mobility of the kernel-internal APIs and make them
less mobile.

If it means I need to write and review fewer patches related to NULL
dereference and use-after-free etc., then it may very well be worth
it.

But as a subsystem maintainer I'd like a clear picture of this wrapper
overhead: what does it usually entail? A typical kernel API has a
vtable and a few variables, not much more than that.

I go to patch 12/13 and I see things like this:

+/// A descriptor of wrapped list elements.
+pub trait GetLinksWrapped: GetLinks {
+    /// Specifies which wrapper (e.g., `Box` and `Arc`) wraps the list entries.
+    type Wrapped: Wrapper<Self::EntryType>;
+}
+
+impl<T: ?Sized> GetLinksWrapped for Box<T>
+where
+    Box<T>: GetLinks,
+{
+    type Wrapped = Box<<Box<T> as GetLinks>::EntryType>;
+}
+
+impl<T: GetLinks + ?Sized> GetLinks for Box<T> {
+    type EntryType = T::EntryType;
+    fn get_links(data: &Self::EntryType) -> &Links<Self::EntryType> {
+        <T as GetLinks>::get_links(data)
+    }
+}

My God. Lose the horrible CamelCase to begin with. I hope the
language spec does not mandate that because our kernel C style
does not use it.

It becomes obvious that as subsystem maintainer for the Linux kernel
a casual drive-by experience with Rust is not going to suffice by far.

All subsystem maintainers are expected to understand and maintain
wrappers like these, right? That means all subsystem maintainers need
to be elevated to understand the above without effort if you wake them
up in their sleep at 4 in the morning.

This makes me a bit sceptical.

Get me right, we are of course good at doing really complicated stuff,
that's what engineers do. But we are not Iron Man. We need a clear
way into understanding and maintaining wrappers and we need support
with it when we don't understand it, so the kernel would need a Rust
wrapper maintainer that we can trust to stay around for the long term,
i.e. until their retirement, while actively teaching others for decades.
For an example see how RCU is maintained.

Developing trust in the people, Miguel and Wedson, is going to be more
important than trust in Google the company for this.

Yours,
Linus Walleij
David Laight April 22, 2021, 2:09 p.m. UTC | #83
From: Linus Walleij
> Sent: 22 April 2021 11:03
...
> I go to patch 12/13 and I see things like this:
> 
> +/// A descriptor of wrapped list elements.
> +pub trait GetLinksWrapped: GetLinks {
> +    /// Specifies which wrapper (e.g., `Box` and `Arc`) wraps the list entries.
> +    type Wrapped: Wrapper<Self::EntryType>;
> +}
> +
> +impl<T: ?Sized> GetLinksWrapped for Box<T>
> +where
> +    Box<T>: GetLinks,
> +{
> +    type Wrapped = Box<<Box<T> as GetLinks>::EntryType>;
> +}
> +
> +impl<T: GetLinks + ?Sized> GetLinks for Box<T> {
> +    type EntryType = T::EntryType;
> +    fn get_links(data: &Self::EntryType) -> &Links<Self::EntryType> {
> +        <T as GetLinks>::get_links(data)
> +    }
> +}
> 
> My God. Lose the horrible CamelCase to begin with. I hope the
> language spec does not mandate that because our kernel C style
> does not use it.

That:

1) Looks as though it could be generated by token pasting in a #define.
2) Seems to be full of what look like casts.

I really wouldn't want to bump into multiple copies of it.

The other issue is that (almost) all uses of a symbol
can be found by running:
   grep -r --include '*.[chsS]' '\<symbol\>' .
often used as:
   vi `grep -l -r '\<symbol\>' .`

But it looks like the rust wrappers are going to break that.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Wedson Almeida Filho April 22, 2021, 3:24 p.m. UTC | #84
First of all, thanks for the thoughtful feedback.

On Thu, Apr 22, 2021 at 12:03:13PM +0200, Linus Walleij wrote:
> I am sure you are aware of this document:
> Documentation/process/stable-api-nonsense.rst
> 
> We really like to change the internal APIs of the kernel, and it sounds to
> me like Rust really likes a rust-side-vs-C-side approach to APIs, requiring
> these wrappers to be written and maintained all over the place, and that
> is going to affect the mobility of the kernel-internal APIs and make them
> less mobile.

The Rust-side-vs-C-side is because we want to provide an environment where we
can write kernel code (e.g., a driver) and if we stick to safe code (i.e.,
we don't use the Rust keyword "unsafe"), then we can be confident that our
code is free of memory safety issues (assuming, of course, that the abstractions
are sound).

Calling C functions directly would not allow us to do this.

As for mobility, I have the impression that this could potentially increase
mobility in that, for Rust, maintainers would need to change one place (the
wrapper) as opposed to a number of drivers using an API (if it's mostly an
argument-change kind of thing). Of course, if it's something like removing an
API then we'd have to make changes everywhere.

I'd like to reassure you that it is not our intention to create a stable API,
restrict mobility, or anything of that sort. Though I do acknowledge Rust may
complicate things (more on this below).

> If it means I need to write and review less patches related to NULL
> dereference and use-after-free etc etc, then it may very well be worth
> it.

Indeed that's part of our goal. A class of vulnerabilities is removed by
construction; others are harder to create accidentally. Reviewers would also
know that unsafe blocks need extra attention.
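To make the safe/unsafe split concrete, here is a minimal userland sketch (not kernel code, and `Buffer` is a hypothetical name): an unsafe operation is confined to one audited block behind a safe API, so callers never write `unsafe` themselves and reviewers know exactly where to look.

```rust
/// A fixed-size buffer exposing a safe, bounds-checked read built on
/// top of raw-pointer access. Hypothetical illustration only.
struct Buffer {
    data: Vec<u8>,
}

impl Buffer {
    fn new(len: usize) -> Self {
        Buffer { data: vec![0u8; len] }
    }

    /// Safe API: bounds are checked here, so the unsafe block below
    /// can never read out of range.
    fn get(&self, index: usize) -> Option<u8> {
        if index >= self.data.len() {
            return None;
        }
        // SAFETY: `index` was checked against `self.data.len()` above,
        // so the pointer arithmetic stays in bounds.
        Some(unsafe { *self.data.as_ptr().add(index) })
    }
}

fn main() {
    let buf = Buffer::new(4);
    assert_eq!(buf.get(2), Some(0));
    assert_eq!(buf.get(10), None); // out of bounds: an error, not UB
    println!("ok");
}
```

Callers of `get()` stay entirely in safe code; only the one `// SAFETY:` block needs the extra review attention described above.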
 
> But as subsystem maintainer I'd like a clear picture of this wrapper
> overhead, what does it usually entail? A typical kernel API has
> vtable and a few variables, not much more than that.
> 
> I go to patch 12/13 and I see things like this:
> 
> +/// A descriptor of wrapped list elements.
> +pub trait GetLinksWrapped: GetLinks {
> +    /// Specifies which wrapper (e.g., `Box` and `Arc`) wraps the list entries.
> +    type Wrapped: Wrapper<Self::EntryType>;
> +}
> +
> +impl<T: ?Sized> GetLinksWrapped for Box<T>
> +where
> +    Box<T>: GetLinks,
> +{
> +    type Wrapped = Box<<Box<T> as GetLinks>::EntryType>;
> +}
> +
> +impl<T: GetLinks + ?Sized> GetLinks for Box<T> {
> +    type EntryType = T::EntryType;
> +    fn get_links(data: &Self::EntryType) -> &Links<Self::EntryType> {
> +        <T as GetLinks>::get_links(data)
> +    }
> +}

We want the runtime overhead to be zero. During development, as you rightly
point out, there is the overhead of creating and maintaining these abstractions
for use in Rust. The code above is not a good example of a wrapper because it's
not wrapping kernel C functionality.

A better example is Pages, which wraps a pointer to struct page:

    pub struct Pages<const ORDER: u32> {
        pages: *mut bindings::page,
    }

If you call Pages::new(), alloc_pages() is called and returns a
KernelResult<Pages>. If the allocation fails you get an error back, otherwise
you get the pages: there is no possibility of forgetting to check the return
value and accidentally dereferencing a NULL pointer.

We have ORDER as a compile-time argument to the type, so we know at compile-time
how many pages we have at no additional runtime cost. So, for example, when we
have to free the pages, the destructor knows what the right argument is when
calling free_pages.

The fact that you have a destructor also ensures that you don't accidentally
forget to free the pages, so no leaks. (We don't have it implemented because
we haven't needed it yet, but we can have get_page/put_page with proper
ownership, i.e., after the equivalent of put_page, you can no longer touch the
page, enforced at compile time).

We provide an `insert_page` associated function that maps the given page to a
vma by calling vm_insert_page. (Only works if ORDER is zero.)

We provide a `kmap` associated function that maps one of the pages and returns a
mapping, which itself has a wrapper type that ensures that kunmap is called when
it goes out of scope.

Anyway, what I'm trying to show here is that the wrappers are quite thin and are
intended to enforce safety (where possible) and correct usage. Does that make
sense? I'm glad to go into more detail if desired.
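As a userland analogy of the ownership pattern described for `Pages` (hypothetical names; this is not the actual kernel-crate API), construction is fallible and returns a `Result`, and the destructor releases the resource automatically:

```rust
// Sketch of the RAII pattern behind `Pages`: acquisition returns a
// Result, release happens in Drop, so leaks and unchecked-NULL
// dereferences are ruled out by construction. Names are hypothetical.
struct Pages {
    // Stand-in for the raw `*mut page`; here just a heap allocation.
    buf: Vec<u8>,
}

impl Pages {
    /// Fallible constructor: the caller *must* handle the error arm;
    /// there is no NULL pointer to forget to check.
    fn new(order: u32) -> Result<Pages, &'static str> {
        if order > 10 {
            return Err("allocation too large");
        }
        Ok(Pages { buf: vec![0u8; 4096usize << order] })
    }

    fn len(&self) -> usize {
        self.buf.len()
    }
}

impl Drop for Pages {
    fn drop(&mut self) {
        // In the kernel this would call free_pages(); here the Vec
        // frees itself. Either way, the release cannot be forgotten.
    }
}

fn main() {
    let pages = Pages::new(1).expect("small order should succeed");
    assert_eq!(pages.len(), 8192);
    assert!(Pages::new(11).is_err());
    println!("ok");
} // `pages` dropped here: release is automatic
```

The compile-time `ORDER` parameter mentioned above would make the size part of the type; this sketch keeps it as a runtime argument for brevity.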

> All subsystem maintainers are expected to understand and maintain
> wrappers like these, right? That means all subsystem maintainers need
> to be elevated to understand the above without effort if you wake them
> up in their sleep at 4 in the morning.

I suppose they'd need to understand the wrappers that I talked about above, not
the ones you copied (those are wrapping something else and maintainers of other
subsystems are not expected to write this sort of code).

There are other possible approaches too:
1. Explicitly exclude Rust support from certain subsystems, say, no Rust USB
drivers (just a random example).
2. Maintainers may choose to not care about Rust, breaking it on api changes.

Naturally I'd prefer Rust support to be a first-class citizen but I mention the
above for completeness.

> Get me right, we are of course good at doing really complicated stuff,
> that's what engineers do. But we are not Iron Man. We need a clear
> way into understanding and maintaining wrappers and we need support
> with it when we don't understand it, so the kernel would need a Rust
> wrapper maintainer that we can trust to stay around for the long term,
> i.e. until their retirement, while actively teaching others for decades.
> For an example see how RCU is maintained.

Agreed. The only part that I'm not sure about is whether we need to put all the
burden on a single person for the rest of their career. In the beginning, of
course, but over time I would expect (hope?) experts would emerge and some of
the load would be distributed.
 
Cheers,
-Wedson
Miguel Ojeda April 22, 2021, 9:28 p.m. UTC | #85
Hi Linus,

Thanks for all those very good questions (and thanks for the positive
tone!). I will try to complement Wedson's answer in a couple places.

On Thu, Apr 22, 2021 at 12:03 PM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> But as subsystem maintainer I'd like a clear picture of this wrapper
> overhead, what does it usually entail? A typical kernel API has
> vtable and a few variables, not much more than that.

If you mean runtime-overhead, i.e. performance, it should be very
small or even zero. It should be possible to perform LTO across
languages too.

If you mean source code overhead, or cognitive overhead, then it is
quite a bit, yes. Please see below.

> It becomes obvious that as subsystem maintainer for the Linux kernel
> a casual drive-by experience with Rust is not going to suffice by far.
>
> All subsystem maintainers are expected to understand and maintain
> wrappers like these, right? That means all subsystem maintainers need
> to be elevated to understand the above without effort if you wake them
> up in their sleep at 4 in the morning.

I would say so, at least if longer-term a substantial amount of new
drivers are written in Rust. That is why I mentioned this as the very
first thing in the RFC. Rust does require some learning to use, even
by C experts.

Having said that, someone that is already a kernel developer and/or a
C expert is in a very good position to learn how Rust approaches
things and the main "new" concepts it introduces.

In the end, Rust is addressing some of the familiar problems that we
face when programming in C and C++.

> Get me right, we are of course good at doing really complicated stuff,
> that's what engineers do. But we are not Iron Man. We need a clear
> way into understanding and maintaining wrappers and we need support
> with it when we don't understand it, so the kernel would need a Rust
> wrapper maintainer that we can trust to stay around for the long term,
> i.e. until their retirement, while actively teaching others for decades.
> For an example see how RCU is maintained.

I hear you! I do not think it will take decades for kernel developers
to get up to speed, but I agree that having some help/backup is a very
good idea in the beginning.

Our hope is that, if Rust's advantages prove themselves, then it will
be the subsystem maintainers themselves who will want to create and
maintain the wrappers so that drivers in their tree are easier to
maintain and less prone to mistakes ;-)

Cheers,
Miguel
Linus Walleij April 26, 2021, 12:18 a.m. UTC | #86
Hi Wedson,

I wanted to provide a good answer, so I sat down, looked a bit more
at Rust and went over your Binder example, to at least reach the
level of "a little knowledge of something is dangerous".

For the record I kind of like the language.

On Thu, Apr 22, 2021 at 5:24 PM Wedson Almeida Filho
<wedsonaf@google.com> wrote:

> > We really like to change the internal APIs of the kernel, and it sounds to
> > me like Rust really likes a rust-side-vs-C-side approach to APIs, requiring
> > these wrappers to be written and maintained all over the place, and that
> > is going to affect the mobility of the kernel-internal APIs and make them
> > less mobile.
>
> The Rust-side-vs-C-side is because we want to provide an environment where we
> can write kernel code (e.g., a driver) and if we stick to safe code (i.e.,
> we don't use the Rust keyword "unsafe"), then we can be confident that our
> code is free of memory safety issues (assuming, of course, that the abstractions
> are sound).
>
> Calling C functions directly would not allow us to do this.

I get it. I think.

> > If it means I need to write and review less patches related to NULL
> > dereference and use-after-free etc etc, then it may very well be worth
> > it.
>
> Indeed that's part of our goal. A class of vulnerabilities is removed by
> construction; others are harder to create accidentally. Reviewers would also
> know that unsafe blocks need extra attention.

Fair enough. What we need to know is what these unsafe blocks
are going to be.

> We want the runtime overhead to be zero. During development, as you rightly
> point out, there is the overhead of creating and maintaining these abstractions
> for use in Rust. The code above is not a good example of a wrapper because it's
> not wrapping kernel C functionality.

For device drivers you will certainly have to wrap assembly as well;
or, to be precise, C calls that only contain assembly.

A typical example is the way device drivers talk to actual hardware:
readl()/writel(), readw()/writew(), readb()/writeb() for memory-mapped
IO or inb()/outb() for port-mapped I/O.

So there is for example this (drivers/gpio/gpio-pl061.c):

        writeb(pl061->csave_regs.gpio_is, pl061->base + GPIOIS);
        writeb(pl061->csave_regs.gpio_ibe, pl061->base + GPIOIBE);
        writeb(pl061->csave_regs.gpio_iev, pl061->base + GPIOIEV);
        writeb(pl061->csave_regs.gpio_ie, pl061->base + GPIOIE);

We write a number of u32 into u32 sized registers, this
pl061->base is a void __iomem * so a pretty unsafe thing to
begin with and then we add an offset to get to the register
we want.

writel() on ARM for example turns into (arch/arm/include/asm/io.h):

static inline void __raw_writel(u32 val, volatile void __iomem *addr)
{
        asm volatile("str %1, %0"
                     : : "Qo" (*(volatile u32 __force *)addr), "r" (val));
}

This is usually sprinkled all over a device driver, called in loops etc.
Some of these will contain things like buffer drains and memory
barriers. Elaborately researched for years so they will need to
be there.

I have no clue how this thing would be expressed in Rust.
Even less how it would call the right code in the end.
That makes me feel unsafe and puzzled so this is a specific
area where "the Rust way" needs to be made very tangible
and easy to understand.

How would I write these 4 registers in Rust? From the actual
statements down to the CPU instructions, top to bottom,
that is what a driver writer wants to know.

If the result of the exercise is that a typical device driver
will contain more unsafe code than not, then device drivers
are not a good starting point for Rust in the Linux kernel.
In that case I would recommend that Rust start at a point
where there is a lot of abstract code that is prone to the
kind of problems that Rust is trying to solve. My intuition
would be such things as network protocols. But I may be
wrong.

I worry that it may become evident that introducing Rust
in device drivers is *only* suggested because the number
of affected platforms can be controlled (lacking some
compiler arch targets?) rather than that being a place
that needs memory safety. And then I think it is just a
playground for Rust experiments and needs to be proposed
as such. But the idea was a real deployment, I suppose.

> A better example is Pages, which wraps a pointer to struct page:
>
>     pub struct Pages<const ORDER: u32> {
>         pages: *mut bindings::page,
>     }
>
> If you call Pages::new(), alloc_pages() is called and returns a
> KernelResult<Pages>. If the allocation fails you get an error back, otherwise
> you get the pages: there is no possibility of forgetting to check the return
> value and accidentally dereferencing a NULL pointer.

This is really neat. I think it is a good example where Rust
really provides the right tool for the job.

And it is very far away from any device driver. Though some
drivers need pages.

(...)

> Anyway, what I'm trying to show here is that the wrappers are quite thin and are
> intended to enforce safety (where possible) and correct usage. Does it make
> sense? I'm glad to go into more details if desired.

It reminds me of Haskell monads for some reason.

This is true for any constrained language. I suppose we could write
kernel modules in Haskell as well, or Prolog, given the right wrappers,
and that would also attain the same thing: you get the desired
restrictions in the target language by way of this adapter.

I don't have a problem with that.

The syntax and semantic meaning of things with lots of
impl <T: ?Sized> Wrapper<T> for ... is just really intimidating
but I suppose one can learn it. No big deal.

What I need to know as device driver infrastructure maintainer is:

1. If the language is expressive enough to do what device driver
   authors need to do in an efficient and readable manner which
   is as good or better than what we have today.

2. Worry about double implementations of core library functions.

3. Kickback in practical problem solving.

This will be illustrated below.

Here is a device driver example that I wrote and merged
just the other week (drivers/iio/magnetometer/yamaha-yas530.c)
it's a nasty example, so I provide it to make a point.

static void yas53x_extract_calibration(u8 *data, struct yas5xx_calibration *c)
{
        u64 val = get_unaligned_be64(data);

        /*
         * Bitfield layout for the axis calibration data, for factor
         * a2 = 2 etc, k = k, c = clock divider
         *
         * n   7 6 5 4 3 2 1 0
         * 0 [ 2 2 2 2 2 2 3 3 ] bits 63 .. 56
         * 1 [ 3 3 4 4 4 4 4 4 ] bits 55 .. 48
         * 2 [ 5 5 5 5 5 5 6 6 ] bits 47 .. 40
         * 3 [ 6 6 6 6 7 7 7 7 ] bits 39 .. 32
         * 4 [ 7 7 7 8 8 8 8 8 ] bits 31 .. 24
         * 5 [ 8 9 9 9 9 9 9 9 ] bits 23 .. 16
         * 6 [ 9 k k k k k c c ] bits 15 .. 8
         * 7 [ c x x x x x x x ] bits  7 .. 0
         */
        c->a2 = FIELD_GET(GENMASK_ULL(63, 58), val) - 32;
        c->a3 = FIELD_GET(GENMASK_ULL(57, 54), val) - 8;
        c->a4 = FIELD_GET(GENMASK_ULL(53, 48), val) - 32;
        c->a5 = FIELD_GET(GENMASK_ULL(47, 42), val) + 38;
        c->a6 = FIELD_GET(GENMASK_ULL(41, 36), val) - 32;
        c->a7 = FIELD_GET(GENMASK_ULL(35, 29), val) - 64;
        c->a8 = FIELD_GET(GENMASK_ULL(28, 23), val) - 32;
        c->a9 = FIELD_GET(GENMASK_ULL(22, 15), val);
        c->k = FIELD_GET(GENMASK_ULL(14, 10), val) + 10;
        c->dck = FIELD_GET(GENMASK_ULL(9, 7), val);
}

This extracts calibration for the sensor from an opaque
chunk of bytes. The calibration is stuffed into sequences of
bits to save space at different offsets and lengths. So we turn
the whole shebang passed in the u8 *data into a 64bit
integer and start picking out the pieces we want.

We know a priori that u8 *data will be more than or equal
to 64 bits of data. (Which is another problem but do not
focus on that, let us look at this function.)

I have no idea how to perform this in
Rust despite reading quite a lot of examples. We have
created a lot of helpers like FIELD_GET() that
make this kind of operation simple.
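For what it's worth, one possible plain-Rust sketch of the same extraction (with a hypothetical `field_get` helper standing in for FIELD_GET()/GENMASK_ULL(); this is not an existing kernel Rust API) could look like:

```rust
// Hypothetical helper mirroring FIELD_GET(GENMASK_ULL(hi, lo), val):
// shift the field down and mask off its width.
fn field_get(val: u64, hi: u32, lo: u32) -> u64 {
    (val >> lo) & ((1u64 << (hi - lo + 1)) - 1)
}

#[derive(Debug, PartialEq)]
struct Calibration {
    a2: i32,
    a3: i32,
    k: i32,
    dck: i32,
    // remaining factors (a4..a9) elided for brevity
}

fn extract_calibration(data: &[u8; 8]) -> Calibration {
    // Equivalent of get_unaligned_be64(): safe, and the length is
    // known at compile time from the array type.
    let val = u64::from_be_bytes(*data);
    Calibration {
        a2: field_get(val, 63, 58) as i32 - 32,
        a3: field_get(val, 57, 54) as i32 - 8,
        k: field_get(val, 14, 10) as i32 + 10,
        dck: field_get(val, 9, 7) as i32,
    }
}

fn main() {
    // All-ones input: every field reads as its maximum raw value.
    let c = extract_calibration(&[0xff; 8]);
    assert_eq!(c.a2, 63 - 32); // 6-bit field at its maximum
    assert_eq!(c.k, 31 + 10);  // 5-bit field at its maximum
    println!("ok");
}
```

Note the `&[u8; 8]` parameter type also answers the "we know a priori there are at least 64 bits" concern: here the compiler enforces it.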

1. Expressiveness of language.

If you look in include/linux/bitfield.h you can see how
this is elaborately implemented to be "a bit" typesafe
and if you follow the stuff around you will find that in
some cases it will resolve into per-CPU assembly
bitwise operations for efficiency. It's neat, it has this
nice handicrafty feeling to it, we control the machine
all the way down.

But that took a few years to get here, and wherever
we want to write a device driver in
Rust this kind of stuff is (I suspect) something that is
going to have to be reinvented, in Rust.

So this is where Rust maintainers will be needed. I will
say something like "I need <linux/bitfield.h>
in Rust" which I guess will eventually become a
"use linux::bitfield" or something like that. Please
fill in the blanks. In the beginning pushing tasks like
that back on the driver writers will just encourage them
to go and write the driver in C. So the maintainers need
to pick it up.

2. Duplication of core libraries.

I worry that this could quite soon result in two
implementations of bitfield: one in C and one in Rust.
Because the language will have its preferred idiomatic
way of dealing with this, on the level above the
per-arch assembly optimized bitwise instructions
that need to be wrapped nicely for performance.
Which means wrappers all the way down. (Oh well.)

But double maintenance. Multiply with the number
of such kernel abstractions we have. So it better not
happen too much or pay off really well.

I would be worried if we had to tell the submitter of a device
driver written in Rust: "I know better ways to code this in
C so rewrite it in C". That's not gonna be popular, and
I would worry about angry Rust developers being required
to reinvent the world just because of some pesky review
comments about what we can do in C.

3. Kickback in practical problem solving.

Believe it or not device driver authors are not mainly
interested in memory safety, overruns, dangling pointers,
memory leaks etc. Maybe in a perfect world they
would/should. But they are interested in getting hardware
to work and want a toolbox that gives the shortest path
from A to B. Telling them (or subsystem maintainers) all
about how these things are solved elegantly by Rust is
not a selling point.

So if Rust makes it easier or at least equal to express
the logic with less lines of readable code, that is a
selling point. (Less lines of code that is unintelligible
and hard to read is not a good sell.) I am not referring
to matter of taste here: let's assume intermediary
experience with the language and some talent. There
will always be outliers and there is always a threshold
with any language.

One way of being better would be through me not having to
merge patches of NULL checks and misc things found by
static and dynamic inspection tools such as smatch,
coccinelle, KASan, ... etc etc. I think that is where Rust
would provide real kickback. Make
these never happen. Create less of this. But incidentally
that is not very common in my subsystems, only one patch
in 100 or 50 is about this kind of stuff.

So I do see some upsides here, maybe not super much.

> There are other possible approaches too:
> 1. Explicitly exclude Rust support from certain subsystems, say, no Rust USB
> drivers (just a random example).

With the wrappers being written and submitted to the subsystem
maintainer, their buy-in will be the only way in, as our little
ecosystem works that way.

> 2. Maintainers may choose to not care about Rust, breaking it on api changes.

It's not like people don't get annoyed at us for things breaking
already. And I already care about Rust since I am writing this reply.

The maintainers are not going to have a Rust sidekick for their
subsystem, they simply have to understand it and manage it
if it is a supported kernel implementation language.

> Agreed. The only part that I'm not sure about is whether we need to put all the
> burden on a single person for the rest of their career. In the beginning, of
> course, but over time I would expect (hope?) experts would emerge and some of
> the load would be distributed.

People will pick up on it if it delivers expected improvements.

We (we device driver maintainers) already
had to learn YAML (see Documentation/devicetree/writing-schema.rst)
and that is certainly much more cognitively demanding than Rust. But it
made things so much better so it was worth it. In short: it delivers
expected improvements (formal validation of device trees, rooting
out nasty, buggy and incoherent device trees).

Have you tried to just sift through the kernel git log and see what
parts of the kernel are experiencing the kind of problems that
Rust can solve? (I haven't.) But if a certain area stands out,
that is likely where you should start. But maybe it is just
everywhere.

Yours,
Linus Walleij
Linus Walleij April 26, 2021, 12:31 a.m. UTC | #87
On Thu, Apr 22, 2021 at 11:29 PM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:

> > But as subsystem maintainer I'd like a clear picture of this wrapper
> > overhead, what does it usually entail? A typical kernel API has
> > vtable and a few variables, not much more than that.
>
> If you mean runtime-overhead, i.e. performance, it should be very
> small or even zero. It should be possible to perform LTO across
> languages too.
>
> If you mean source code overhead, or cognitive overhead, then it is
> quite a bit, yes. Please see below.

Yeah that is what I mean :)

> I hear you! I do not think it will take decades for kernel developers
> to get up to speed, but I agree that having some help/backup is a very
> good idea in the beginning.
>
> Our hope is that, if Rust's advantages prove themselves, then it will
> be the subsystem maintainers themselves who will want to create and
> maintain the wrappers so that drivers in their tree are easier to
> maintain and less prone to mistakes ;-)

I am not really convinced that (leaf) drivers is where Rust will
help most.

As I mentioned in my mail to Wedson, I think things like network
protocols that deal with abstract entities will have more "pure code"
(not dealing with machine registers, just RAM).
File systems would be another example.

I think the Rust proponents should be open to the fact that their
work will eventually depend on themselves or someone else
providing a working compiler for the architectures maintained in
the Linux kernel, one way or the other, so that Rust can be used
anywhere in the kernel.

For example m68k is not going away. Avoiding this question
of compiler support, just waiting and hoping that these old
architectures will disappear is the wrong idea. The right idea
is to recognize that LLVM and/or GCC Rust needs to
support all these architectures so they can all use Rust.
Someone needs to put in the effort.

After all fixing that compiler support is an insignificant amount
of work compared to what Rust in the core kernel will be.

Yours,
Linus Walleij
Miguel Ojeda April 26, 2021, 2:26 p.m. UTC | #88
Hi Linus,

On Mon, Apr 26, 2021 at 2:18 AM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> I try to provide a good answer so I did sit down and look a bit more
> at rust and looked over your Binder example to at least reach the
> level of "a little knowledge of something is dangerous".

Thanks *a lot* for having spent some time to get to know the language a bit!

> For the record I kind of like the language.

That is great to hear :)

> A typical example is the way device drivers talk to actual hardware:
> readl()/writel(), readw()/writew(), readb()/writeb() for memory-mapped
> IO or inb()/outb() for port-mapped I/O.
>
> So there is for example this (drivers/gpio/gpio-pl061.c):
>
>         writeb(pl061->csave_regs.gpio_is, pl061->base + GPIOIS);
>         writeb(pl061->csave_regs.gpio_ibe, pl061->base + GPIOIBE);
>         writeb(pl061->csave_regs.gpio_iev, pl061->base + GPIOIEV);
>         writeb(pl061->csave_regs.gpio_ie, pl061->base + GPIOIE);
>
> We write a number of u32 into u32 sized registers, this
> pl061->base is a void __iomem * so a pretty unsafe thing to
> begin with and then we add an offset to get to the register
> we want.
>
> [...]
>
> How would I write these 4 registers in Rust? From the actual
> statements down to the CPU instructions, top to bottom,
> that is what a driver writer wants to know.

A function that writes to unconstrained addresses is indeed unsafe.
However, if one constrains the addresses, then such functions can
be made safe.

For instance, we could have a macro where you describe your hardware
registers and code is generated that only allows writes to those
addresses. Not only that, but it could also make accesses properly
typed and do any needed masking/bit twiddling/unit conversion, etc.

This would be very similar to other code generation tools out there
used to simplify talking to hardware and maintain HDL mappings.

So instead of:

    writeb(x, pl061->base + GPIOIS);

you could say something like:

    pl061.write_gpio_is(x)

and the generated code should be the same (in fact, the Rust code
could forward the actual call to C to avoid rewriting any assembly --
but that can be done too if needed, e.g. if cross-language LTO does
not manage to inline as much as we want).
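A minimal userland sketch of what such generated accessors might look like (hypothetical names throughout; a real version would wrap writeb()/readb() on an `__iomem` pointer, here modeled with `write_volatile` on plain memory):

```rust
// Sketch of typed register accessors like the ones a macro could
// generate for the pl061 example. Offsets are compile-time constants,
// so only known registers can be written. Hypothetical illustration;
// real kernel code would go through writeb() on the ioremapped base.
use std::ptr;

struct Pl061 {
    base: *mut u8, // stand-in for the ioremapped base address
}

impl Pl061 {
    const GPIO_IS: usize = 0x404;
    const GPIO_IBE: usize = 0x408;

    fn write_gpio_is(&mut self, val: u8) {
        // SAFETY: the offset is a compile-time constant inside the
        // device's register window, which `base` is assumed to map.
        unsafe { ptr::write_volatile(self.base.add(Self::GPIO_IS), val) }
    }

    fn write_gpio_ibe(&mut self, val: u8) {
        // SAFETY: as above.
        unsafe { ptr::write_volatile(self.base.add(Self::GPIO_IBE), val) }
    }
}

fn main() {
    // Model the register window with plain memory for demonstration.
    let mut window = vec![0u8; 0x1000];
    let mut dev = Pl061 { base: window.as_mut_ptr() };
    dev.write_gpio_is(0xaa);
    dev.write_gpio_ibe(0x55);
    assert_eq!(window[0x404], 0xaa);
    assert_eq!(window[0x408], 0x55);
    println!("ok");
}
```

The `unsafe` is confined to the generated accessors; driver code calling `dev.write_gpio_is(x)` stays safe, which is exactly the split discussed above.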

> If the result of the exercise is that a typical device driver
> will contain more unsafe code than not, then device drivers
> are not a good starting point for Rust in the Linux kernel.
> In that case I would recommend that Rust start at a point
> where there is a lot of abstract code that is prone to the
> kind of problems that Rust is trying to solve. My intuition
> would be such things as network protocols. But I may be
> wrong.

We may have some constructs that cannot be reasonably made safe, but
that is fine! As long as one needs to spell those out as `unsafe`, the
safe/unsafe split would be working as intended.

It is likely that some code will be written in "C style" nevertheless,
especially in the beginning. But, as explained above, we need to have a
mindset of writing safe abstractions wherever possible; and not just
try to mimic the kernel C side in everything.

It is also true that Rust brings some features that can be very useful
for non-HW-IO/"pure" code (state machines, ser/des, net protocols,
etc.) -- if someone wants to use Rust there, that is great, of course.

> I worry that it may become evident that introducing Rust
> in device drivers is *only* suggested because the number
> of affected platforms can be controlled (lacking some
> compiler arch targets?) rather than that being a place
> that needs memory safety. And then I think it is just a
> playground for Rust experiments and need to be proposed
> as such. But the idea was a real deployment I suppose.

We are proposing "leaf" modules not just because of the platforms
issue, but also because they introduce a smaller risk overall, i.e.
Rust support could be more easily dropped if the kernel community ends
up thinking it is not worth it.

If some platforms start seeing benefits from using Rust, it is our
hope that compiler vendors and companies behind those architectures
will start putting more resources into supporting Rust for their
platforms too.

> It reminds me of Haskell monads for some reason.

Indeed! Result is pretty much Either.
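
For instance (a standalone sketch, nothing kernel-specific), a
fallible function returns `Result`, and the `?` operator chains
computations much like monadic bind does with `Either`:

```rust
// Result<T, E> carries either a success value or an error,
// playing the same role as Haskell's Either.
fn parse_port(s: &str) -> Result<u16, String> {
    s.parse::<u16>().map_err(|e| format!("bad port '{}': {}", s, e))
}

fn bump(s: &str) -> Result<u16, String> {
    // `?` early-returns the error, like >>= short-circuiting on Left.
    let p = parse_port(s)?;
    Ok(p + 1)
}

fn main() {
    assert_eq!(bump("80"), Ok(81));
    assert!(bump("oops").is_err());
}
```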

> This is true for any constrained language. I suppose we could write
> kernel modules in Haskell as well, or Prolog, given the right wrappers,
> and that would also attain the same thing: you get the desired
> restrictions in the target language by way of this adapter.

You can indeed see Rust as a language that has brought some of the
"good ideas" to the systems programming domain.

However, while other languages can do all the fancy type things Rust
can do, the key is that it also introduces the necessary bits to
achieve manual (but safe) memory management for a lot of patterns;
while at the same time reading pretty much like C and C++ and without
removing some "down to the metal" features needed, such as raw
pointers, inline assembly, etc.

> The syntax and semantic meaning of things with lots of
> impl <T: ?Sized> Wrapper<T> for ... is just really intimidating
> but I suppose one can learn it. No big deal.

That syntax does take some time to get used to, indeed (as with any
other generics or parameterized-type system).

Since one cannot introduce UB by mistake in safe code, it is "safe"
to "play with the language", which makes it way easier than e.g. some
C++ features. Plus the compiler is quite helpful.

> I have no idea how to perform this in
> Rust despite reading quite a lot of examples. We have
> created a lot of these helpers like FIELD_GET() and
> that make this kind of operations simple.

Bit twiddling and working with raw data can be done, e.g. take a look
into `u64::from_be_bytes()` or `core::mem::transmute()`. Things like
`offset_of`, `container_of`, intrinsics, inline assembly, etc. are
also possible.

In general, everything low-level you can do in C or C++, you can do in
Rust (and get the same kind of codegen).
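
A small self-contained illustration of that kind of raw-data handling
(standard library only, mirroring what `get_unaligned_be64()` and
`FIELD_GET()` do in C):

```rust
fn main() {
    // Reinterpret 8 big-endian bytes as a u64, as get_unaligned_be64() would.
    let data = [0x12u8, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde, 0xf0];
    let val = u64::from_be_bytes(data);
    assert_eq!(val, 0x123456789abcdef0);

    // Extract bits 41..36 inclusive, the moral equivalent of
    // FIELD_GET(GENMASK_ULL(41, 36), val): shift down, mask 6 bits.
    let field = (val >> 36) & 0x3f;
    assert_eq!(field, 0x27);
}
```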

When needed to simplify things, macros can be introduced too (we have
a few of those already, e.g. to declare a kernel module, to declare
file ops, etc.).

> 1. Expressiveness of language.
>
> If you look in include/linux/bitfield.h you can see how
> this is elaborately implemented to be "a bit" typesafe
> and if you follow the stuff around you will find that in
> some cases it will resolve into per-CPU assembly
> bitwise operations for efficiency. It's neat, it has this
> nice handicrafty feeling to it, we control the machine
> all the way down.

All that is fine in Rust (see above).

> But that took a few years to get here, and wherever
> we want to write a device driver in
> Rust this kind of stuff is (I suspect) something that is
> going to have to be reinvented, in Rust.

If you mean it in the sense that we need to have "similar" code in
Rust, yes, of course. But we can also forward things to the C side, so
some things do not need to be rewritten. The standard library also
provides quite a few utilities (more than C's), which helps.

If you mean it in the sense that "Rust might be too high-level", not
really (as explained above etc.). Rust was designed with this usage in
mind; and is being used in embedded projects already.

> So this is where Rust maintainers will be needed. I will
> say something like "I need <linux/bitfield.h>
> in Rust" which I guess will eventually become a
> "use linux::bitfield" or something like that. Please
> fill in the blanks. In the beginning pushing tasks like
> that back on the driver writers will just encourage them
> to go and write the driver in C. So the maintainers need
> to pick it up.

We will try to help here as much as possible :)

This should also get fleshed out more when we have a couple drivers
that talk to hardware directly.

> 2. Duplication of core libraries.
>
> I worry about that this could quite soon result in two
> implementations of bitfield: one in C and one in Rust.
> Because the language will have its preferred idiomatic
> way of dealing with this, on the level above the
> per-arch assembly optimized bitwise instructions
> that need to be wrapped nicely for performance.
> Which means wrappers all the way down. (Oh well.)
>
> But double maintenance. Multiply with the number
> of such kernel abstractions we have. So it better not
> happen too much or pay off really well.

The Rust abstractions should reuse the C wherever possible. So it is
not a very big concern in that sense. But, yes, we need to have those
wrappers.

We expect that some modules will be easier to write than others,
especially at the beginning. So some subsystems may start to see
drivers if the abstractions are already there or are easy enough to
make, while others may take longer.

> 3. Kickback in practical problem solving.
>
> Believe it or not device driver authors are not mainly
> interested in memory safety, overruns, dangling pointers,
> memory leaks etc. Maybe in a perfect world they
> would/should. But they are interested in getting hardware
> to work and want a toolbox that gives the shortest path
> from A to B. Telling them (or subsystem maintainers) all
> about how these things are solved elegantly by Rust is
> not a selling point.

Some of those (overruns, leaks, etc.) can turn into functional bugs
too (e.g. crashes), so even if some companies only care about "making
it work", such bugs are still worth eliminating (from their
perspective).

Even outside the memory-safety topic, Rust provides extra features
that make writing reliable code easier (like the strict typing and the
error handling guarantees with `Result` etc. we discussed above), so
companies should be up for it -- assuming the infrastructure is there
already.

But, of course, in the beginning, it will be harder for everyone
involved because we are not accustomed to either the language, the
utility functions ("headers" like `bitfield.h`), the way of writing
drivers in Rust, etc.

Cheers,
Miguel
Wedson Almeida Filho April 26, 2021, 2:40 p.m. UTC | #89
Linus, again thanks for taking the time to look into this. I think it's great
for us to get into this level of detail.

On Mon, Apr 26, 2021 at 02:18:33AM +0200, Linus Walleij wrote:
> For device drivers you will certainly have to wrap assembly as well.
> Or C calls that only contain assembly to be precise.

Sure, I don't think this would be a problem.

> A typical example is the way device drivers talk to actual hardware:
> readl()/writel(), readw()/writew(), readb()/writeb() for memory-mapped
> IO or inb()/outb() for port-mapped I/O.
> 
> So there is for example this (drivers/gpio/gpio-pl061.c):
> 
>         writeb(pl061->csave_regs.gpio_is, pl061->base + GPIOIS);
>         writeb(pl061->csave_regs.gpio_ibe, pl061->base + GPIOIBE);
>         writeb(pl061->csave_regs.gpio_iev, pl061->base + GPIOIEV);
>         writeb(pl061->csave_regs.gpio_ie, pl061->base + GPIOIE);
> 
> We write a number of u32 into u32 sized registers, this
> pl061->base is a void __iomem * so a pretty unsafe thing to
> begin with and then we add an offset to get to the register
> we want.
> 
> writel() on ARM for example turns into (arch/arm/include/asm/io.h):
> 
> static inline void __raw_writel(u32 val, volatile void __iomem *addr)
> {
>         asm volatile("str %1, %0"
>                      : : "Qo" (*(volatile u32 __force *)addr), "r" (val));
> }
> 
> This is usually sprinkled all over a device driver, called in loops etc.
> Some of these will contain things like buffer drains and memory
> barriers. Elaborately researched for years so they will need to
> be there.
> 
> I have no clue how this thing would be expressed in Rust.
> Even less how it would call the right code in the end.
> That makes me feel unsafe and puzzled so this is a specific
> area where "the Rust way" needs to be made very tangible
> and easy to understand.
> 
> How would I write these 4 registers in Rust? From the actual
> statements down to the CPU instructions, top to bottom,
> that is what a driver writer wants to know.

Here's an example of how this could be implemented. Again, we're happy to
iterate on this (just like any other piece of software, independently of
language), but I think this will give you an idea. We'd begin with an
abstraction for a mapped IO region:

pub struct IoMemBlock<const SIZE: usize> {
    ptr: *mut u8
}

Note here that we encode the size of the block at compile time. We'll get our
safety guarantees from it.

For this abstraction, we provide the following implementation of the write
function:

impl<const SIZE: usize> IoMemBlock<SIZE> {
    pub fn write<T>(&self, value: T, offset: usize) {
        if let Some(end) = offset.checked_add(size_of::<T>()) {
            if end <= SIZE {
                // SAFETY: We just checked above that offset was within bounds.
                let ptr = unsafe { self.ptr.add(offset) } as *mut T;
                // SAFETY: We just checked that the offset+size was within bounds.
                unsafe { ptr.write_volatile(value) };
                return;
            }
        }
        // SAFETY: Unimplemented function to cause compilation error.
        unsafe { bad_write() };
    }
}

Now suppose we have some struct like:

pub struct MyDevice {
    base: IoMemBlock<100>,
    reg1: u32,
    reg2: u64,
}

Then a function similar to your example would be this:

pub fn do_something(pl061: &MyDevice) {
    pl061.base.write(pl061.reg1, GPIOIS);
    pl061.base.write(pl061.reg2, GPIOIBE);
    pl061.base.write(20u8, 99);
}

I have this example here: https://rust.godbolt.org/z/chE3vjacE

The x86 compiled output of the code above is as follows:

        mov     eax, dword ptr [rdi + 16]
        mov     rcx, qword ptr [rdi]
        mov     dword ptr [rcx + 16], eax
        mov     rax, qword ptr [rdi + 8]
        mov     qword ptr [rcx + 32], rax
        mov     byte ptr [rcx + 99], 20
        ret

Some observations:
1. do_something is completely safe: all accesses to memory are checked.
2. The only unsafe part that could involve the driver for this would be the
creation of IoMemBlock: my expectation is that this would be implemented by the
bus driver or some library that maps the appropriate region and caps the size.
That is, we can also build a safe abstraction for this.
3. All checks are optimised away because they use compile-time constants. The
code presented above is as efficient as C.
4. All code is Rust code and therefore type-checked during compilation, there is
no need for macros.
5. Note that the code supports all sizes, and selects which one to use based on
the type of the first argument (the example above has 8, 32, 64 bit examples).
6. If the developer writing a driver accidentally uses an offset beyond the
limit, they will get a compilation error (bad_write is left unimplemented).
Perhaps we could find a better way to indicate this, but a compilation error is
definitely better than corrupting state (potentially silently) at runtime.
7. We could potentially design a way to limit which offsets are available for a
given IoMemBlock, I just haven't thought through it yet, but it would also
reduce the number of mistakes a developer could make.


> If the result of the exercise is that a typical device driver
> will contain more unsafe code than not, then device drivers
> are not a good starting point for Rust in the Linux kernel.
> In that case I would recommend that Rust start at a point
> where there is a lot of abstract code that is prone to the
> kind of problems that Rust is trying to solve. My intuition
> would be such things as network protocols. But I may be
> wrong.

Agreed. But based on the example above, I don't expect a lot (if any) of unsafe
code in drivers due to accessing IO memory.

> This is really neat. I think it is a good example where Rust
> really provides the right tool for the job.
> 
> And it is very far away from any device driver. Though some
> drivers need pages.

Sure, I didn't mean to imply that this is useful in drivers, I just meant it as
an example.

> This is true for any constrained language. I suppose we could write
> kernel modules in Haskell as well, or Prolog, given the right wrappers,
> and that would also attain the same thing: you get the desired
> restrictions in the target language by way of this adapter.

Agreed. Rust is different in that it doesn't need a garbage collector, so it can
achieve performance comparable to C, which is something that we can't claim
about Haskell and Prolog atm -- I actually like Haskell better than Rust, but
it's not practical at the moment for kernel development.

> The syntax and semantic meaning of things with lots of
> impl <T: ?Sized> Wrapper<T> for ... is just really intimidating
> but I suppose one can learn it. No big deal.

I agree it's intimidating, but so are macros like ____MAKE_OP in bitfield.h --
the former has the advantage of being type-checked. Writing macros like
____MAKE_OP is a hit-and-miss exercise in my experience. However, I feel that
both cases benefit from being specialised implementations that are somewhat
rare.

> What I need to know as device driver infrastructure maintainer is:
> 
> 1. If the language is expressive enough to do what device driver
>    authors need to do in an efficient and readable manner which
>    is as good or better than what we have today.

What do you think of the example I provided above? I think that generics give
Rust an edge over C in terms of expressiveness, though abusing it may
significantly reduce readability.

> 2. Worry about double implementations of core library functions.

This indeed may be a problem, but I'm happy to have Rust wrappers call
C/assembly functions. With LTO this should not affect performance.

> 3. Kickback in practical problem solving.
> 
> This will be illustrated below.
> 
> Here is a device driver example that I wrote and merged
> just the other week (drivers/iio/magnetometer/yamaha-yas530.c)
> it's a nasty example, so I provide it to make a point.
> 
> static void yas53x_extract_calibration(u8 *data, struct yas5xx_calibration *c)
> {
>         u64 val = get_unaligned_be64(data);
> 
>         /*
>          * Bitfield layout for the axis calibration data, for factor
>          * a2 = 2 etc, k = k, c = clock divider
>          *
>          * n   7 6 5 4 3 2 1 0
>          * 0 [ 2 2 2 2 2 2 3 3 ] bits 63 .. 56
>          * 1 [ 3 3 4 4 4 4 4 4 ] bits 55 .. 48
>          * 2 [ 5 5 5 5 5 5 6 6 ] bits 47 .. 40
>          * 3 [ 6 6 6 6 7 7 7 7 ] bits 39 .. 32
>          * 4 [ 7 7 7 8 8 8 8 8 ] bits 31 .. 24
>          * 5 [ 8 9 9 9 9 9 9 9 ] bits 23 .. 16
>          * 6 [ 9 k k k k k c c ] bits 15 .. 8
>          * 7 [ c x x x x x x x ] bits  7 .. 0
>          */
>         c->a2 = FIELD_GET(GENMASK_ULL(63, 58), val) - 32;
>         c->a3 = FIELD_GET(GENMASK_ULL(57, 54), val) - 8;
>         c->a4 = FIELD_GET(GENMASK_ULL(53, 48), val) - 32;
>         c->a5 = FIELD_GET(GENMASK_ULL(47, 42), val) + 38;
>         c->a6 = FIELD_GET(GENMASK_ULL(41, 36), val) - 32;
>         c->a7 = FIELD_GET(GENMASK_ULL(35, 29), val) - 64;
>         c->a8 = FIELD_GET(GENMASK_ULL(28, 23), val) - 32;
>         c->a9 = FIELD_GET(GENMASK_ULL(22, 15), val);
>         c->k = FIELD_GET(GENMASK_ULL(14, 10), val) + 10;
>         c->dck = FIELD_GET(GENMASK_ULL(9, 7), val);
> }
> 
> This extracts calibration for the sensor from an opaque
> chunk of bytes. The calibration is stuffed into sequences of
> bits to save space at different offsets and lengths. So we turn
> the whole shebang passed in the u8 *data into a 64bit
> integer and start picking out the pieces we want.
> 
> We know a priori that u8 *data will be more than or equal
> to 64 bits of data. (Which is another problem but do not
> focus on that, let us look at this function.)
> 
> I have no idea how to perform this in
> Rust despite reading quite a lot of examples. We have
> created a lot of these helpers like FIELD_GET() and
> that make this kind of operations simple.

Would you mind sharing more about which aspect of this you feel is challenging?

I see now that Miguel has already responded to this thread so I'll stop here.
Happy to follow up on anything.

Thanks,
-Wedson
Miguel Ojeda April 26, 2021, 4:03 p.m. UTC | #90
On Mon, Apr 26, 2021 at 4:40 PM Wedson Almeida Filho
<wedsonaf@google.com> wrote:
>
> I see now that Miguel has already responded to this thread so I'll stop here.
> Happy to follow up on anything.

No, no, the message was directed to you, and you gave very nice examples! :)

I think having both replies is great, we gave different perspectives.

Cheers,
Miguel
Miguel Ojeda April 26, 2021, 6:01 p.m. UTC | #91
On Mon, Apr 26, 2021 at 2:18 AM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> static void yas53x_extract_calibration(u8 *data, struct yas5xx_calibration *c)
> {
>         u64 val = get_unaligned_be64(data);
>
>         c->a2 = FIELD_GET(GENMASK_ULL(63, 58), val) - 32;
>         c->a3 = FIELD_GET(GENMASK_ULL(57, 54), val) - 8;
>         c->a4 = FIELD_GET(GENMASK_ULL(53, 48), val) - 32;
>         c->a5 = FIELD_GET(GENMASK_ULL(47, 42), val) + 38;
>         c->a6 = FIELD_GET(GENMASK_ULL(41, 36), val) - 32;
>         c->a7 = FIELD_GET(GENMASK_ULL(35, 29), val) - 64;
>         c->a8 = FIELD_GET(GENMASK_ULL(28, 23), val) - 32;
>         c->a9 = FIELD_GET(GENMASK_ULL(22, 15), val);
>         c->k = FIELD_GET(GENMASK_ULL(14, 10), val) + 10;
>         c->dck = FIELD_GET(GENMASK_ULL(9, 7), val);
> }

By the way, to give a more concrete example, this function could look like this:

    fn yas53x_extract_calibration(data: [u8; 8], c: &mut yas5xx_calibration)
    {
        let val = u64::from_be_bytes(data);

        c.a2 = FIELD_GET(GENMASK_ULL(63, 58), val) - 32;
        c.a3 = FIELD_GET(GENMASK_ULL(57, 54), val) - 8;
        c.a4 = FIELD_GET(GENMASK_ULL(53, 48), val) - 32;
        c.a5 = FIELD_GET(GENMASK_ULL(47, 42), val) + 38;
        c.a6 = FIELD_GET(GENMASK_ULL(41, 36), val) - 32;
        c.a7 = FIELD_GET(GENMASK_ULL(35, 29), val) - 64;
        c.a8 = FIELD_GET(GENMASK_ULL(28, 23), val) - 32;
        c.a9 = FIELD_GET(GENMASK_ULL(22, 15), val);
        c.k = FIELD_GET(GENMASK_ULL(14, 10), val) + 10;
        c.dck = FIELD_GET(GENMASK_ULL(9, 7), val) as u8;
    }

assuming `FIELD_GET()` returns `i32`. In particular, `GENMASK_ULL` and
`FIELD_GET` can be written as normal functions, no need for macros
(and can be `const fn` too -- i.e. can be evaluated at compile-time if
needed).
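
A minimal sketch of those helpers as `const fn`s (hypothetical
signatures; this version returns `u64` rather than the `i32` assumed
above):

```rust
// Sketch of GENMASK_ULL / FIELD_GET written as plain (const) functions.
const fn genmask_ull(h: u32, l: u32) -> u64 {
    // Set bits h..l inclusive, mirroring the C macro.
    ((!0u64) >> (63 - h)) & ((!0u64) << l)
}

const fn field_get(mask: u64, val: u64) -> u64 {
    // Mask out the field, then shift it down to bit 0.
    (val & mask) >> mask.trailing_zeros()
}

fn main() {
    assert_eq!(genmask_ull(9, 7), 0x380);
    assert_eq!(field_get(genmask_ull(9, 7), 0x180), 3);

    // Evaluated at compile time when used in a const context:
    const K_MASK: u64 = genmask_ull(14, 10);
    assert_eq!(K_MASK, 0x7c00);
}
```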

As you see, it looks remarkably similar, and there is no `unsafe`
because we pass the array of bytes instead of a raw pointer.

The caller needs to get the array from somewhere, of course -- if you
only have a raw pointer to start with, then the caller will need an
`unsafe` line to dereference it, as usual.
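
For completeness, when starting from a byte slice rather than a raw
pointer, the fixed-size array can be obtained safely with a checked
`try_into()` (a sketch):

```rust
use std::convert::TryInto;

fn main() {
    let raw: Vec<u8> = (1..=9).collect();
    // Fallible, checked conversion: it errors out if the slice is too
    // short, instead of reading out of bounds.
    let arr: [u8; 8] = raw[..8].try_into().unwrap();
    assert_eq!(u64::from_be_bytes(arr), 0x0102030405060708);
}
```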

Cheers,
Miguel
Miguel Ojeda April 26, 2021, 6:18 p.m. UTC | #92
On Mon, Apr 26, 2021 at 2:31 AM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> I think the Rust proponents should be open to the fact that their
> work will eventually depend on themselves or someone else
> fixing a working compiler for the maintained architectures in
> the Linux kernel one way or the other, so they will be able to
> work with Rust project anywhere in the kernel.
>
> For example m68k is not going away. Avoiding this question
> of compiler support, just waiting and hoping that these old
> architectures will disappear is the wrong idea. The right idea
> is to recognize that LLVM and/or GCC Rust needs to
> support all these architectures so they can all use Rust.
> Someone needs to put in the effort.

The RFC does not avoid the question -- please note it explicitly
mentions the architecture/platform support issue and the current
dependency on LLVM, as well as the possible ways to solve it.

We would love to not have that issue, of course, because that would
enable Rust to be used in other parts of the kernel where it is likely
to be quite useful too.

But even if we did not have the issue today, it seems like starting
with drivers and other "leaf" modules is a better approach. There are
several reasons:

  - If for some reason we wanted to remove Rust from the kernel,
then it would be easier to do so if only "leaf" bits had been written.

  - We cannot compile the Rust support without nightly features yet,
so it does not seem wise to make it a hard requirement right away.

  - Kernel developers need time to learn a bit of Rust, thus writing
subsystems or core pieces of the kernel in Rust would mean less people
can understand them.

Given that drivers are a big part of the new code introduced every
release, that they are "leaf" modules and that in some cases they are
only intended to be used with a given architecture, they seem like a
good starting point.

Cheers,
Miguel
Linus Walleij April 27, 2021, 10:54 a.m. UTC | #93
Hi Wedson,

thanks for your replies, I have a bigger confidence in Rust for
drivers after your detailed answers.

On Mon, Apr 26, 2021 at 4:40 PM Wedson Almeida Filho
<wedsonaf@google.com> wrote:

> Note here that we encode the size of the block at compile time. We'll get our
> safety guarantees from it.
>
> For this abstraction, we provide the following implementation of the write
> function:
>
> impl<const SIZE: usize> IoMemBlock<SIZE> {
>     pub fn write<T>(&self, value: T, offset: usize) {
>         if let Some(end) = offset.checked_add(size_of::<T>()) {
>             if end <= SIZE {
>                 // SAFETY: We just checked above that offset was within bounds.
>                 let ptr = unsafe { self.ptr.add(offset) } as *mut T;
>                 // SAFETY: We just checked that the offset+size was within bounds.
>                 unsafe { ptr.write_volatile(value) };
>                 return;
>             }
>         }
>         // SAFETY: Unimplemented function to cause compilation error.
>         unsafe { bad_write() };
>     }
> }

I really like the look of this. I don't fully understand it, but what
is needed for driver developers to adopt rust is something like a
detailed walk-through of examples like this that explains the
syntax 100% all the way down.

We do not need to understand the basic concepts of the
language as much because these are evident, the devil is
in details like this.

> Now suppose we have some struct like:
>
> pub struct MyDevice {
>     base: IoMemBlock<100>,
>     reg1: u32,
>     reg2: u64,
> }
>
> Then a function similar to your example would be this:
>
> pub fn do_something(pl061: &MyDevice) {
>     pl061.base.write(pl061.reg1, GPIOIS);
>     pl061.base.write(pl061.reg2, GPIOIBE);
>     pl061.base.write(20u8, 99);
> }
>
> I have this example here: https://rust.godbolt.org/z/chE3vjacE
>
> The x86 compiled output of the code above is as follows:
>
>         mov     eax, dword ptr [rdi + 16]
>         mov     rcx, qword ptr [rdi]
>         mov     dword ptr [rcx + 16], eax
>         mov     rax, qword ptr [rdi + 8]
>         mov     qword ptr [rcx + 32], rax
>         mov     byte ptr [rcx + 99], 20
>         ret

This looks good, but cannot be done like this. The assembly versions
of writel() etc have to be used because the compiler simply will not
emit the right type of assembly for IO access, unless the compiler
(LLVM, GCC) gains knowledge of what an IO address is, and so far
they have not.

I mostly work on ARM so I have little understanding of x86
assembly other than superficial.

Port-mapped IO on ARM for ISA/PCI would be a stressful
example, I do not think Rust or any other sane language
(except Turbo Pascal) has taken the effort to create language
abstractions explicitly for port-mapped IO.

See this for ARM:

#define outb(v,p)       ({ __iowmb(); __raw_writeb(v,__io(p)); })

So to write a byte to a port we first need to issue an IO write memory
barrier, followed by the actual write to the IO memory where the
port resides. __iowmb() turns into the assembly instruction
wmb on CPUs that support it and a noop on those that do not,
at compile time.

One *could* think about putting awareness about crazy stuff like
that into the language but ... I think you may want to avoid it
and just wrap the assembly. So a bit of low-level control of the
behavior there.
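
A toy, user-space illustration (stand-in functions only; no real
barriers or IO) of wrapping the barrier/store pair so callers cannot
get the ordering wrong:

```rust
use std::cell::RefCell;

thread_local! {
    // Records the order of operations, standing in for the real effects.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

fn iowmb() {
    LOG.with(|l| l.borrow_mut().push("wmb"));
}

fn raw_writeb(_v: u8) {
    LOG.with(|l| l.borrow_mut().push("writeb"));
}

// The wrapper is the only way to write, so the barrier always comes
// first, mirroring:
//   #define outb(v,p) ({ __iowmb(); __raw_writeb(v,__io(p)); })
fn outb(v: u8) {
    iowmb();
    raw_writeb(v);
}

fn main() {
    outb(0x42);
    LOG.with(|l| assert_eq!(*l.borrow(), vec!["wmb", "writeb"]));
}
```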

> 2. The only unsafe part that could involve the driver for this would be the
> creation of IoMemBlock: my expectation is that this would be implemented by the
> bus driver or some library that maps the appropriate region and caps the size.
> That is, we can also build a safe abstraction for this.

I suppose this is part of the problem in a way: a language tends to be
imperialistic: the developers will start thinking "it would all be so much
easier if I just rewrote also this thing in Rust".

And that is where you will need compiler support for all targets.

> 7. We could potentially design a way to limit which offsets are available for a
> given IoMemBlock, I just haven't thought through it yet, but it would also
> reduce the number of mistakes a developer could make.

The kernel has an abstraction for memory and register accesses,
which is the regmap, for example MMIO regmap for simple
memory-mapped IO:
drivers/base/regmap/regmap-mmio.c

In a way this is memory safety implemented in C.

Sadly it is not very well documented. But regmap is parameterized
to restrict accesses to certain register areas, using explicit
code in C, so you can provide an algorithm for which addresses
are accessible for write for example, like every fourth address
on a sunday.

A typical usecase is clock drivers which have very fractured
and complex memory maps with random readable/writeable
bits all over the place.

If Rust wants to do this I would strongly recommend it to
try to look like regmap MMIO.
See for example drivers/clk/sprd/common.c:

static const struct regmap_config sprdclk_regmap_config = {
        .reg_bits       = 32,
        .reg_stride     = 4,
        .val_bits       = 32,
        .max_register   = 0xffff,
        .fast_io        = true,
};
(...)
regmap = devm_regmap_init_mmio(&pdev->dev, base,
                                               &sprdclk_regmap_config);

It is also possible to provide a callback function to determine
if addresses are readable/writeable.

This is really a devil-in-the-details place where Rust needs
to watch out not to reimplement regmap in a way inferior to
what is already available.

Also in many cases developers do not use regmap MMIO
because it is just too much trouble. They tend to use it
not because "safety is nice" but because a certain register
region is very fractured and it is easy to do mistakes and
write into a read-only register by mistake. So they want
this, optionally, when the situation demands it.

> > What I need to know as device driver infrastructure maintainer is:
> >
> > 1. If the language is expressive enough to do what device driver
> >    authors need to do in an efficient and readable manner which
> >    is as good or better than what we have today.
>
> What do you think of the example I provided above? I think that generics give
> Rust an edge over C in terms of expressiveness, though abusing it may
> significantly reduce readability.

It looks nice but it is sadly unrealistic because we need to wrap
the real assembly accessors in practice (write memory barriers
and such) and another problem is that it shows that Rust has an
ambition to do a parallel implementation of regmap.

> > 2. Worry about double implementations of core library functions.
>
> This indeed may be a problem, but I'm happy to have Rust wrappers call
> C/assembly functions. With LTO this should not affect performance.

Yeah see above about regmap too.

> > The syntax and semantic meaning of things with lots of
> > impl <T: ?Sized> Wrapper<T> for ... is just really intimidating
> > but I suppose one can learn it. No big deal.
>
> I agree it's intimidating, but so are macros like ____MAKE_OP in bitfield.h --
> the former has the advantage of being type-checked. Writing macros like
> ____MAKE_OP is a hit-and-miss exercise in my experience. However, I feel that
> both cases benefit from being specialised implementations that are somewhat
> rare.
(...)
> > I have no idea how to perform this in
> > Rust despite reading quite a lot of examples. We have
> > created a lot of these helpers like FIELD_GET() and
> > that make this kind of operations simple.
>
> Would you mind sharing more about which aspect of this you feel is challenging?

Good point.

This explanation is going to take some space.

I am not able to express it in Rust at all and that is what
is challenging about it; the examples provided for Rust
are all about nicely behaved computer programs like
cutesy Fibonacci series and such things and not really
complex stuff.

Your binder example is however very good, the problem
is that it is not a textbook example so the intricacies of
it are not explained, top down. (I'm not blaming you for
this, I just say we need that kind of text to get to know
Rust in the details.)

As device driver maintainers we especially need to
understand IO access and so I guess that is what
we are discussing above, so we are making progress
here.

What we need is a good resource to learn it, that
skips the trivial aspects of the language and goes immediately
for the interesting details.

It's not like I didn't try.
I consulted the Rust book on the website, of course.

The hard thing to understand in Rust is traits. I don't understand
traits. I have the level of "a little knowledge is dangerous" and
I clearly understand this: all kernel developers must have
a thorough and intuitive understanding of the inner transcendental
meaning of the concept of a TRAIT, how it was devised, how the
authors of the language conceptualized it, what effect it is supposed
to have on generated assembly.

The language book per se is a bit too terse.
For example if I read
https://doc.rust-lang.org/book/appendix-02-operators.html

T: ?Sized : Allow generic type parameter to be a dynamically sized type

This is just self-referential. The description is written in a
strongly context-dependent language to make a pun ...
I think every word in that sentence except "allow" and "to be a"
is dependent on other Rust concepts and thus completely
unreadable without context.

Instead it is described in a later chapter:
https://doc.rust-lang.org/book/ch19-04-advanced-types.html

This is more to the point.

"Rust has a particular trait called the Sized trait to
determine whether or not a type’s size is known at compile time."
(...) "A trait bound on ?Sized is the opposite of a trait bound on
Sized: we would read this as “T may or may not be Sized.” This
syntax is only available for Sized, not any other traits."

But Jesus Christ. This makes me understand less not
more.

So I need to understand what traits are. So back to
https://doc.rust-lang.org/book/ch10-02-traits.html

This chapter is just *really* hard to understand. I
can blame myself for being stupid, but since it is
more convenient to blame the author I'm just going
to complain that this chapter is not very good for
low-level programmers. I'm probably wrong, this is
obviously a personal development exercise.

OK I will give it several second tries. It just feels
very intimidating.

To me, the Rust book is nowhere near "The C
Programming Language" in quality (meaning readability
and ability to transfer complex detailed knowledge) and
that is a serious problem.

Sadly, it is hard to pin down and define what makes it
so hard, but I would take a guess and say that
"The C Programming Language" was written by low
level programmers implementing an operating system
and the Rust book was not -- i.e. the authors' concept
of the intended audience differed.

So this is where we need good inroads to understand the
language.

The quality and versatility of the K&R book about The
C Programming Language has been pointed out by
Kernighan in "UNIX: A History and a Memoir"
and I think the Rust community needs to learn something
from this (page 78, praising himself and Ritchie):

"We made many alternating passes over the main text (...)
It describes the language with what Bill Plauger
once called 'spine-tingling precision'. The reference
manual is like C itself: precise, elegant, and compact"

I think a main obstacle for getting Rust accepted by kernel
developers is not the language itself, but the lack of a textbook
with the same qualities as The C Programming Language.

This is a serious flaw, not with the language itself but with
the surrounding materials.

Kernighan writes about *forcing* Ritchie to write the book
about C ("I twisted his arm harder and eventually he agreed"),
after implementing it, and this made it reflect the
language from the intent of the author and OS usecase
very well.

The Rust book is written "by Steve Klabnik and Carol Nichols,
with contributions from the Rust Community" and I do not mean
to criticize them, because I think they had very clear ideas
of what kind of people were going to read it. And I bet they did
not intend it for OS developers.

What I find very disturbing is that the authors of the Rust
language did NOT write it. I think this may be the source
of a serious flaw. We need this information straight from
the horse's mouth.

I would strongly advise the Rust community to twist the
arms of the original Rust authors to go and review and
edit the Rust book. Possibly rewrite parts of it. This is what
the world needs to get a more adaptable Rust.

I understand this is a thick requirement, but hey, you are
competing with C.

Yours,
Linus Walleij
Linus Walleij April 27, 2021, 11:13 a.m. UTC | #94
On Mon, Apr 26, 2021 at 8:18 PM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
> On Mon, Apr 26, 2021 at 2:31 AM Linus Walleij <linus.walleij@linaro.org> wrote:
> >
> > I think the Rust proponents should be open to the fact that their
> > work will eventually depend on themselves or someone else
> > fixing a working compiler for the maintained architectures in
> > the Linux kernel one way or the other, so they will be able to
> > work with Rust project anywhere in the kernel.
> >
> > For example m68k is not going away. Avoiding this question
> > of compiler support, just waiting and hoping that these old
> > architectures will disappear is the wrong idea. The right idea
> > is to recognize that LLVM and/or GCC Rust needs to
> > support all these architectures so they can all use Rust.
> > Someone needs to put in the effort.
>
> The RFC does not avoid the question -- please note it explicitly
> mentions the architecture/platform support issue and the current
> dependency on LLVM, as well as the possible ways to solve it.

OK true. Sorry for being sloppy.

Actually my reply to Wedson brought up a new issue, which is the
quality of learning resources and the lack of an equivalent to
The C Programming Language book.

> But even if we did not have the issue today, it seems like starting
> with drivers and other "leaf" modules is a better approach. There are
> several reasons:
>
> >   - If for some reason we wanted to remove Rust from the kernel,
> then it would be easier to do so if only "leaf" bits had been written.
>
>   - We cannot compile the Rust support without nightly features yet,
> so it does not seem wise to make it a hard requirement right away.
>
>   - Kernel developers need time to learn a bit of Rust, thus writing
> subsystems or core pieces of the kernel in Rust would mean less people
> can understand them.
>
> Given that drivers are a big part of the new code introduced every
> release, that they are "leaf" modules and that in some cases they are
> only intended to be used with a given architecture, they seem like a
> good starting point.

I'm not sure I agree with this.

I think a good starting point would be to either fix Rust support in
GCC or implement some more important ISAs in LLVM,
whichever is easiest. I don't mind having just *one* compiler but
I mind having *a* compiler for every arch we support.

The situation for LLVM is very well described in the Wikipedia
entry for LLVM: "but most of this hardware is mostly obsolete,
and LLVM developers decided the support and maintenance
costs were no longer justified" - this is what I would call
deprecationism (deletionism). I think this is a detrimental force
for both compilers and kernels. It encourages developers of
compilers and kernels to do the wrong thing: instead of
rewriting their compiler and kernel infrastructure such that
maintenance of older ISAs and architectures becomes a breeze
they do what mathematicians do "let's assume a simpler
version of the problem". And this results in a less versatile
infrastructure and less adaptable code in the end. Which will
affect how agile and adaptive the software is. And when
something new comes along it hits you in the head.

Portability to old systems and ISAs is a virtue in itself
because of the effect it has on code quality, not necessarily
for the support itself.

Deprecationism is more the side effect of a certain business
strategy to toss new technology out every quarter without
having to care about aftermarket or postmarket too much.
This irritates people to the extent that there is now even
a project called "PostmarketOS" (Linux based). It is not
sustainable to use an emotional argument, but that is really
not my point, I care about code quality and diversity of
ISAs and target systems improves code quality in my book.

I might be an extremist, but I do need to state this point.

Yours,
Linus Walleij
Robin Randhawa April 27, 2021, 11:13 a.m. UTC | #95
Hi Linus.

Thanks for your detailed inputs. I will defer to Wedson to address your
points but I had one suggestion.

On 27.04.2021 12:54, Linus Walleij wrote:

[...]

>To me, the Rust book is nowhere near "The C
>Programming Language" in quality (meaning readability
>and ability to transfer complex detailed knowledge) and
>that is a serious problem.

Compared to the Rust Book - which aims to provide a relatively gentle
and comprehensive introduction to the language, I think the Rust
reference might be more suitable in order to understand the language
support for features like Traits:

https://doc.rust-lang.org/stable/reference/introduction.html

A lot of folks, myself included, combine the Book with the Reference to
get a stronger handle on concepts.

This is subjective of course but I felt it worth sharing.

Robin
Kyle Strand April 28, 2021, 2:51 a.m. UTC | #96
Hi Linus,

I wanted to shed some light on one specific point of your criticism of
The Rust Programming Language:

>  What I find very disturbing is that the authors of the Rust language did NOT write it.

Rust, unlike C when the K&R book was written, has already had a pretty
large number of people contribute to its development. Steve Klabnik
and Carol Nichols are 6th and 11th, respectively, on the list of
contributors with the most commits in the Rust compiler & standard
library (from the official "thanks" page, excluding the top committer
"bors", which is part of Rust's CI automation:
https://thanks.rust-lang.org/rust/all-time/).

I hope this clears up some confusion.

> What we need is a good resource to learn it, that skips the trivial aspects of the language and goes immediately for the interesting details.

I realize that what is "trivial" and what is "interesting" is
subjective, but if you find the explanation of traits in the Book
difficult to understand, I would recommend revisiting the earlier
sections in order to be certain you understand the foundations for the
explanation of traits.
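As a quick concrete anchor (a made-up example, not from the Book or the kernel): a trait is a named set of methods that a type opts into, and a trait bound restricts a generic function to such types, all checked at compile time.

```rust
// A trait declares shared behavior; types opt in by implementing it.
trait Summary {
    fn summarize(&self) -> String;
}

// A plain struct, standing in for any driver-side data.
struct Patch {
    title: String,
    lines_changed: u32,
}

// `Patch` opts into `Summary` by providing the method body.
impl Summary for Patch {
    fn summarize(&self) -> String {
        format!("{} ({} lines)", self.title, self.lines_changed)
    }
}

// `T: Summary` is a trait bound: `notify` accepts any type that
// implements `Summary`, resolved entirely at compile time.
fn notify<T: Summary>(item: &T) -> String {
    format!("New: {}", item.summarize())
}

fn main() {
    let p = Patch {
        title: String::from("rust: add support"),
        lines_changed: 13,
    };
    assert_eq!(notify(&p), "New: rust: add support (13 lines)");
}
```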

For what it's worth, since you would specifically like a lower-level
perspective, in addition to looking at the Reference (as previously
suggested), I recommend trying O'Reilly's Programming Rust by Jim
Blandy and Jason Orendorff.

Kyle Strand


On Tue, Apr 27, 2021 at 5:14 AM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Mon, Apr 26, 2021 at 8:18 PM Miguel Ojeda
> <miguel.ojeda.sandonis@gmail.com> wrote:
> > On Mon, Apr 26, 2021 at 2:31 AM Linus Walleij <linus.walleij@linaro.org> wrote:
> > >
> > > I think the Rust proponents should be open to the fact that their
> > > work will eventually depend on themselves or someone else
> > > fixing a working compiler for the maintained architectures in
> > > the Linux kernel one way or the other, so they will be able to
> > > work with Rust project anywhere in the kernel.
> > >
> > > For example m68k is not going away. Avoiding this question
> > > of compiler support, just waiting and hoping that these old
> > > architectures will disappear is the wrong idea. The right idea
> > > is to recognize that LLVM and/or GCC Rust needs to
> > > support all these architectures so they can all use Rust.
> > > Someone needs to put in the effort.
> >
> > The RFC does not avoid the question -- please note it explicitly
> > mentions the architecture/platform support issue and the current
> > dependency on LLVM, as well as the possible ways to solve it.
>
> OK true. Sorry for being sloppy.
>
> Actually my reply to Wedson brought up a new issue, which is the
> quality of learning resources and the lack of an equivalent to
> The C Programming Language book.
>
> > But even if we did not have the issue today, it seems like starting
> > with drivers and other "leaf" modules is a better approach. There are
> > several reasons:
> >
> >   - If for some reason we wanted to remove Rust from the kernel,
> > then it would be easier to do so if only "leaf" bits had been written.
> >
> >   - We cannot compile the Rust support without nightly features yet,
> > so it does not seem wise to make it a hard requirement right away.
> >
> >   - Kernel developers need time to learn a bit of Rust, thus writing
> > subsystems or core pieces of the kernel in Rust would mean less people
> > can understand them.
> >
> > Given that drivers are a big part of the new code introduced every
> > release, that they are "leaf" modules and that in some cases they are
> > only intended to be used with a given architecture, they seem like a
> > good starting point.
>
> I'm not sure I agree with this.
>
> I think a good starting point would be to either fix Rust support in
> GCC or implement some more important ISAs in LLVM,
> whichever is easiest. I don't mind having just *one* compiler but
> I mind having *a* compiler for every arch we support.
>
> The situation for LLVM is very well described in the Wikipedia
> entry for LLVM: "but most of this hardware is mostly obsolete,
> and LLVM developers decided the support and maintenance
> costs were no longer justified" - this is what I would call
> deprecationism (deletionism). I think this is a detrimental force
> for both compilers and kernels. It encourages developers of
> compilers and kernels to do the wrong thing: instead of
> rewriting their compiler and kernel infrastructure such that
> maintenance of older ISAs and architectures becomes a breeze
> they do what mathematicians do "let's assume a simpler
> version of the problem". And this results in a less versatile
> infrastructure and less adaptable code in the end. Which will
> affect how agile and adaptive the software is. And when
> something new comes along it hits you in the head.
>
> Portability to old systems and ISAs is a virtue in itself
> because of the effect it has on code quality, not necessarily
> for the support itself.
>
> Deprecationism is more the side effect of a certain business
> strategy to toss new technology out every quarter without
> having to care about aftermarket or postmarket too much.
> This irritates people to the extent that there is now even
> a project called "PostmarketOS" (Linux based). It is not
> sustainable to use an emotional argument, but that is really
> not my point, I care about code quality and diversity of
> ISAs and target systems improves code quality in my book.
>
> I might be an extremist, but I do need to state this point.
>
> Yours,
> Linus Walleij
Miguel Ojeda April 28, 2021, 3:10 a.m. UTC | #97
On Tue, Apr 27, 2021 at 1:13 PM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> Actually my reply to Wedson brought up a new issue, which is the
> quality of learning resources and the lack of an equivalent to
> The C Programming Language book.

I recall having a similar feeling when initially jumping into
individual chapters of The Rust Programming Language book. I think it
is intended to be read from cover to cover instead.

There are other resources, see [1]. For instance, there is The
Embedded Rust Book [2]. Some of those are a WIP, but perhaps others
can recommend better finished/published books.

In any case, Rust has more features than C, some of them quite unique,
and they are routinely used, so it does take some time to learn.

[1] https://www.rust-lang.org/learn
[2] https://docs.rust-embedded.org/book/

> I think a good starting point would be to either fix Rust support in
> GCC or implement some more important ISAs in LLVM,
> whichever is easiest. I don't mind having just *one* compiler but
> I mind having *a* compiler for every arch we support.
>
> [...]
>
> Portability to old systems and ISAs is a virtue in itself
> because of the effect it has on code quality, not necessarily
> for the support itself.

I agree that there are benefits of keeping compiler technology
flexible, but one cannot force or expect any project (including the
Linux kernel) to maintain all code forever.

In the end, we need to balance that adaptability against the benefits
of adding Rust. In particular because nowadays LLVM is able to cover
the majority of devices that want to run the very latest Linux
kernels. Thus those benefits apply to most users. If LLVM only
supported, say, x86_64, I would agree that it would not be enough.

By contrast, compiler flexibility only matters indirectly to users,
and at some point there are diminishing returns to keeping all
architectures around.

In any case, adding Rust (in particular for "leaf" modules) does not
imply that we will lose those architectures any time soon. That would
take at least several years, and would require quite a few things to
happen at the same time:

  - That Rust got so widely used in the kernel (because the benefits
turned out to be important) that maintainers went as far as wanting to
drop C drivers from mainline for Rust equivalents.

  - That GCC did not get any way to compile Rust (no Rust frontend for
GCC, no GCC backend for `rustc`, etc.) and, moreover, that the plans
for that had been dropped.

  - That LLVM did not add support for the missing architectures.

The first point is unlikely any time soon. The second point is
unlikely, too, given there is funding for that now (and I assume those
projects will receive more support if Rust lands in the kernel). The
third point is likely, though.

Cheers,
Miguel
Mariusz Ceier April 28, 2021, 6:34 p.m. UTC | #98
Hello,
  First of all IANAL, so I might be wrong regarding the issue below.

On 14/04/2021, ojeda@kernel.org <ojeda@kernel.org> wrote:
>
> ## Why not?
>
> Rust also has disadvantages compared to C in the context of
> the Linux kernel:
>
>
>   - Single implementation based on LLVM. There are third-party
>     efforts underway to fix this, such as a GCC frontend,
>     a `rustc` backend based on Cranelift and `mrustc`,
>     a compiler intended to reduce the bootstrapping chain.
>     Any help for those projects would be very welcome!
>
>   - Not standardized. While it is not clear whether standardization
>     would be beneficial for the kernel, several points minimize
>     this issue in any case: the Rust stability promise, the extensive
>     documentation, the WIP reference, the detailed RFCs...
>

After reading the interview referenced by https://lwn.net/Articles/854740/
I think there might be issue with licensing - few quotes from the interview:

> And on the other hand, I've seen a lot of BSD (or MIT or similar) licensed open source projects that just fragment when they become big enough to be commercially important, and the involved companies inevitably decide to turn their own parts proprietary.

> So I think the GPLv2 is pretty much the perfect balance of "everybody works under the same rules", and still requires that people give back to the community ("tit-for-tat")

> So forking isn't a problem, as long as you can then merge back the good parts. And that's where the GPLv2 comes in. The right to fork and do your own thing is important, but the other side of the coin is equally important - the right to then always join back together when a fork was shown to be successful.

The Rust compiler license doesn't require people to give back to the
community - a corporation can create its own version of the Rust
compiler with some proprietary extensions added, develop drivers with
it, and even if the drivers' code is GPL'd, they won't be buildable by
anyone but that corporation. The Rust compiler license doesn't require
sharing changes when you modify it. A similar problem exists with flex
and openssl, which are required to build the kernel, but so far no one
has thought about abusing them, afaik.

That "single implementation based on LLVM" uses a mix of MIT, Apache,
BSD-compatible and other licenses. It doesn't use strong copyleft
license in contrast to almost every tool required to build the kernel,
except for flex (BSD, no (L)GPL alternative afaik) and openssl (Apache
license, gnutls could be used instead).

I suggest waiting until a featureful GPL implementation of the Rust
language exists (assuming GNU Rust is on the way) before merging any
Rust code into the kernel, and, once that implementation is done,
making it a requirement that all Rust code must be buildable by at
least the GPL implementation.

Maybe it would also be worthwhile to make the requirement that the
kernel must be buildable with free software (not just open source
software) explicit ?

Best Regards,
Mariusz Ceier
Nick Desaulniers April 28, 2021, 8:25 p.m. UTC | #99
On Wed, Apr 28, 2021 at 11:34 AM Mariusz Ceier <mceier+kernel@gmail.com> wrote:
>
> Maybe it would also be worthwhile to make the requirement that the
> kernel must be buildable with free software (not just open source
> software) explicit ?

The kernel is already buildable by LLVM (and clang); in fact Android,
CrOS, and Google's production servers already do so.
https://clangbuiltlinux.github.io/
David Laight April 28, 2021, 9:21 p.m. UTC | #100
From: Mariusz Ceier
> Sent: 28 April 2021 19:34
....
> 
> I suggest to wait until featureful GPL implementation of rust language
> is made (assuming GNU Rust is on the way) before merging any rust code
> in the kernel and when that implementation is done make a requirement
> that all rust code must be buildable by at least GPL implementation.
> 
> Maybe it would also be worthwhile to make the requirement that the
> kernel must be buildable with free software (not just open source
> software) explicit ?

Or put the version of the compiler that works in the source tree
with the kernel and then build it as part of the full build.

It is enough of a PITA having to find libelf-devel in order to
build objtool, never mind having to find the correct version
of something else.

gcc tends to be available and the version doesn't matter too much.
But even that gives problems.

	David

Wedson Almeida Filho April 29, 2021, 1:52 a.m. UTC | #101
On Tue, Apr 27, 2021 at 12:54:00PM +0200, Linus Walleij wrote:
> I really like the look of this. I don't fully understand it, but what
> is needed for driver developers to adopt rust is something like a
> detailed walk-through of examples like this that explains the
> syntax 100% all the way down.

Do you have a suggestion for a good place to host such walk-throughs? Also, do
you have other examples in mind that might be useful?

Have you had a chance to read the example I posted in Google's security blog?
It's not particularly complex stuff but touches on some relevant concepts:
https://security.googleblog.com/2021/04/rust-in-linux-kernel.html

> This looks good, but cannot be done like this. The assembly versions
> of writel() etc have to be used because the compiler simply will not
> emit the right type of assembly for IO access, unless the compiler
> (LLVM GCC) gains knowledge of what an IO address is, and so far
> they have not.

That code does not preclude the use of C/assembly wrappers. One way to do it
would be to define a trait that allows types to specify their read/write
functions. For example:

pub trait IoMemType {
    unsafe fn write(ptr: *mut Self, value: Self);
    unsafe fn read(ptr: *const Self) -> Self;
}

Then we restrict T in my original example to only allow types that implement
IoMemType. And we implement it for u8/16/32/64 as wrappers to the C/assembly
implementations.
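For instance, the `u32` case might look roughly like this (a sketch only: `core::ptr` volatile accesses stand in for the real C/assembly `readl`/`writel` wrappers, and the demo exercises ordinary memory rather than MMIO):

```rust
pub trait IoMemType {
    unsafe fn write(ptr: *mut Self, value: Self);
    unsafe fn read(ptr: *const Self) -> Self;
}

// Sketch: in the kernel these bodies would call the arch-specific
// accessors (e.g. readl/writel) through C wrappers; plain volatile
// accesses stand in for them here.
impl IoMemType for u32 {
    unsafe fn write(ptr: *mut Self, value: Self) {
        core::ptr::write_volatile(ptr, value);
    }
    unsafe fn read(ptr: *const Self) -> Self {
        core::ptr::read_volatile(ptr)
    }
}

fn main() {
    // Exercise the trait against ordinary memory (no real MMIO here).
    let mut reg: u32 = 0;
    unsafe {
        <u32 as IoMemType>::write(&mut reg, 0xdead_beef);
        assert_eq!(<u32 as IoMemType>::read(&reg), 0xdead_beef);
    }
}
```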

> One *could* think about putting awareness about crazy stuff like
> that into the language but ... I think you may want to avoid it
> and just wrap the assembly. So a bit of low-level control of the
> behavior there.

Yes, I'm happy to have C/assembly be the source of truth, called from Rust
through wrappers.

> > 2. The only unsafe part that could involve the driver for this would be the
> > creation of IoMemBlock: my expectation is that this would be implemented by the
> > bus driver or some library that maps the appropriate region and caps the size.
> > That is, we can also build a safe abstraction for this.
> 
> I suppose this is part of the problem in a way: a language tends to be
> imperialistic: the developers will start thinking "it would all be so much
> easier if I just rewrote also this thing in Rust".

I'm not sure I agree with this. I actually just want to hook things up to the
existing C code and expose a Rust interface that allows developers to benefit
from the guarantees it offers. Unnecessarily rewriting things would slow me
down, so my incentive is to avoid rewrites.

> > 7. We could potentially design a way to limit which offsets are available for a
> > given IoMemBlock, I just haven't thought through it yet, but it would also
> > reduce the number of mistakes a developer could make.
> 
> The kernel has an abstraction for memory and register accesses,
> which is the regmap, for example MMIO regmap for simple
> memory-mapped IO:
> drivers/base/regmap/regmap-mmio.c
> 
> In a way this is memory safety implemented in C.

I wasn't aware of this. I like it. Thanks for sharing.

> If Rust wants to do this I would strongly recommend it to
> try to look like regmap MMIO.
> See for example drivers/clk/sprd/common.c:
> 
> static const struct regmap_config sprdclk_regmap_config = {
>         .reg_bits       = 32,
>         .reg_stride     = 4,
>         .val_bits       = 32,
>         .max_register   = 0xffff,
>         .fast_io        = true,
> };
> (...)
> regmap = devm_regmap_init_mmio(&pdev->dev, base,
>                                                &sprdclk_regmap_config);
> 
> It is also possible to provide a callback function to determine
> if addresses are readable/writeable.
> 
> This is really a devil-in-the-details place where Rust needs
> to watch out to not reimplement regmap in a substandard
> way from what is already available.

At the moment we're only providing wrappers for things we need, so it is mostly
restricted to what I needed for Binder.

If someone wants to write a driver that would benefit from this, we will look
into it and possibly wrap the C implementation.
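Purely as a hypothetical illustration (none of these names exist in any kernel crate today), a typed Rust mirror of the `regmap_config` shown above might look like this; a real wrapper would hand such a struct to `devm_regmap_init_mmio()` through bindings:

```rust
// Hypothetical sketch of a typed mirror of C's `struct regmap_config`.
#[derive(Clone, Copy)]
pub struct RegmapConfig {
    pub reg_bits: u32,
    pub reg_stride: u32,
    pub val_bits: u32,
    pub max_register: u32,
    pub fast_io: bool,
}

impl RegmapConfig {
    // Mirrors the sprdclk_regmap_config values quoted in the thread.
    pub const fn mmio32() -> Self {
        RegmapConfig {
            reg_bits: 32,
            reg_stride: 4,
            val_bits: 32,
            max_register: 0xffff,
            fast_io: true,
        }
    }
}

fn main() {
    let cfg = RegmapConfig::mmio32();
    assert_eq!(cfg.reg_stride, 4);
    assert!(cfg.fast_io);
}
```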

> Also in many cases developers do not use regmap MMIO
> because it is just too much trouble. They tend to use it
> not because "safety is nice" but because a certain register
> region is very fractured and it is easy to do mistakes and
> write into a read-only register by mistake. So they want
> this, optionally, when the situation demands it.

In Rust, we want all accesses to be safe (within reason), so we probably want to
offer something like IoMemBlock for cases when regmap-mmio is too much hassle.

> It looks nice but it is sadly unrealistic because we need to wrap
> the real assembly accessors in practice (write memory barriers
> and such) and another problem is that it shows that Rust has an
> ambition to do a parallel implementation of regmap.

There is no such ambition. The code in my previous email was written on the spot
as a demonstration per your request.

> > Would you mind sharing more about which aspect of this you feel is challenging?
> 
> Good point.
> 
> This explanation is going to take some space.

Thanks, I appreciate this.

> I am not able to express it in Rust at all and that is what
> is challenging about it, the examples provided for Rust
> are all about nice behaved computer programs like
> cutesy Fibonacci series and such things and not really
> complex stuff.

I'm sure you're able to express functions and arguments, for example. So going
into the details of the code would have been helpful to me.

> Your binder example is however very good, the problem
> is that it is not a textbook example so the intricacies of
> it are not explained, top down. (I'm not blaming you for
> this, I just say we need that kind of text to get to know
> Rust in the details.)

Do you think a write up about some of what's in there would be helpful? I was
planning to publish some information about the code, including performance
numbers and comparisons of past vulnerabilities once I completed the work.
Probably not to the level of detail that you're seeking but I may look into
having more details about the code if there is demand for it.

> What we need is a good resource to learn it, that
> skips the trivial aspects of the language and goes immediately
> for the interesting details.
> 
> It's not like I didn't try.
> I consulted the Rust book on the website, of course.

Did you run into 'Rust for Embedded C Programmers' by any chance
(https://docs.opentitan.org/doc/ug/rust_for_c/)? It's not all up to date but I
found it useful.

> The hard thing to understand in Rust is traits. I don't understand
> traits. I have the level of "a little knowledge is dangerous" and
> I clearly understand this: all kernel developers must have
> a thorough and intuitive understanding of the inner transcendental
> meaning of the concept of a TRAIT, how it was devised, how the
> authors of the language conceptualized it, what effect it is supposed
> to have on generated assembly.

Perhaps we need a 'Rust for Linux Kernel Programmers' in a similar vein to the
page I linked above.
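On the "effect on generated assembly" question specifically, a short sketch (made-up types, not kernel code) of the two dispatch forms: a generic bound monomorphizes into a direct (often inlined) call per concrete type, while `dyn Trait` compiles to an indirect call through a vtable, much like a C ops table of function pointers.

```rust
trait Reg {
    fn width(&self) -> u32;
}

struct Mmio32;

impl Reg for Mmio32 {
    fn width(&self) -> u32 { 32 }
}

// Static dispatch: monomorphized for each concrete `T`; the call is
// direct and typically inlined, like an ordinary C function call.
fn static_width<T: Reg>(r: &T) -> u32 {
    r.width()
}

// Dynamic dispatch: one compiled body; the call goes through a vtable
// pointer, comparable to a C struct of function pointers.
fn dyn_width(r: &dyn Reg) -> u32 {
    r.width()
}

fn main() {
    let m = Mmio32;
    assert_eq!(static_width(&m), 32);
    assert_eq!(dyn_width(&m), 32);
}
```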

> The language book per se is a bit too terse.
> For example if I read
> https://doc.rust-lang.org/book/appendix-02-operators.html
> 
> T: ?Sized : Allow generic type parameter to be a dynamically sized type
> 
> This is just self-referential. The description is written in a
> strongly context-dependent language to make a pun ...
> I think every word in that sentence except "allow"and "to be a"
> is dependent on other Rust concepts and thus completely
> unreadable without context.
> 
> Instead it is described in a later chapter:
> https://doc.rust-lang.org/book/ch19-04-advanced-types.html
> 
> This is more to the point.
> 
> "Rust has a particular trait called the Sized trait to
> determine whether or not a type’s size is known at compile time."
> (...) "A trait bound on ?Sized is the opposite of a trait bound on
> Sized: we would read this as “T may or may not be Sized.” This
> syntax is only available for Sized, not any other traits."
> 
> But Jesus Christ. This makes me understand less not
> more.

I had similar frustrations when I started on the language, which wasn't that
long ago. One thing that I found useful was to read through some of the RFCs
related to the topic I was interested in: it was time-consuming but helped me
understand not only what was going on but the rationale as well.
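Concretely, the appendix entry quoted above can be reduced to a few lines (a made-up example): every generic parameter implicitly carries a `Sized` bound, and `?Sized` lifts it, at the cost that the value must then live behind a reference:

```rust
// Without `+ ?Sized`, `T` is implicitly required to have a size known
// at compile time, so calling this with a bare `str` would not
// compile. With it, `T` may be a dynamically sized type like `str`,
// as long as it is passed behind a pointer such as `&T`.
fn byte_len<T: AsRef<str> + ?Sized>(s: &T) -> usize {
    s.as_ref().len()
}

fn main() {
    let owned = String::from("kernel");
    assert_eq!(byte_len(&owned), 6); // T = String (sized)
    assert_eq!(byte_len("rust"), 4); // T = str (dynamically sized)
}
```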
 
> What I find very disturbing is that the authors of the Rust
> language did NOT write it. I think this may be the source
> of a serious flaw. We need this information straight from
> the horse's mouth.

Perhaps you're right... I don't share this feeling though.

> I would strongly advise the Rust community to twist the
> arms of the original Rust authors to go and review and
> edit the Rust book. Possibly rewrite parts of it. This is what
> the world needs to get a more adaptable Rust.
> 
> I understand this is a thick requirement, but hey, you are
> competing with C.

I don't think of this as a competition. I'm not arguing for C to be replaced,
only for Rust to be an option for those inclined to use it.

Thanks again,
-Wedson
Kajetan Puchalski April 29, 2021, 11:14 a.m. UTC | #102
David Laight <David.Laight@ACULAB.COM> writes:

> From: Mariusz Ceier
>> Sent: 28 April 2021 19:34
> ....
>>
>> I suggest to wait until featureful GPL implementation of rust
>> language is made (assuming GNU Rust is on the way) before merging
>> any rust code in the kernel and when that implementation is done
>> make a requirement that all rust code must be buildable by at least
>> GPL implementation.
>>
>> Maybe it would also be worthwhile to make the requirement that the
>> kernel must be buildable with free software (not just open source
>> software) explicit ?
>
> Or put the version of the compiler that works in the source tree
> with the kernel and then build it as part of the full build.

Building compilers takes several hours, I'm pretty sure usually much
more than the kernel itself. Building the compiler as part of the full
build would be a gigantic pain for everyone involved. Rustc is even
worse than most compilers on that front due to the complexity of its
runtime checks.

--
Kind regards,
Kajetan
Kajetan Puchalski April 29, 2021, 11:25 a.m. UTC | #103
Mariusz Ceier <mceier+kernel@gmail.com> writes:

> Rust compiler license doesn't require for people to give back to the
> community - corporation can create their own version of rust compiler
> adding some proprietary extensions, develop drivers with it and even
> if the drivers code will be GPL'd they won't be buildable by anyone
> but that corporation. The rust compiler license doesn't require
> sharing changes when you modify it. The similar problem has flex and
> openssl required to build the kernel, but so far no one thought about
> abusing them afaik.

Could you explain exactly what the issue you see there is?
Surely if someone develops a proprietary compiler and then writes
kernel drivers that use that compiler, nobody else will be able to
build them. Because of that, none of the maintainers will be able to
run or test the code and it'll never actually get merged into the
kernel. Surely they'd effectively be sabotaging themselves.

--
Kind regards,
Kajetan
Mariusz Ceier April 29, 2021, 2:06 p.m. UTC | #104
On 29/04/2021, Kajetan Puchalski <mrkajetanp@gmail.com> wrote:
>
> Mariusz Ceier <mceier+kernel@gmail.com> writes:
>
>> Rust compiler license doesn't require for people to give back to
>> the
>> community - corporation can create their own version of rust
>> compiler
>> adding some proprietary extensions, develop drivers with it and
>> even
>> if the drivers code will be GPL'd they won't be buildable by
>> anyone
>> but that corporation. The rust compiler license doesn't require
>> sharing changes when you modify it. The similar problem has flex
>> and
>> openssl required to build the kernel, but so far no one thought
>> about
>> abusing them afaik.
>
> Could you explain exactly what the issue you see there is?
> Surely if someone develops a proprietary compiler and then writes
> kernel
> drivers that use that compiler, nobody else will be able to build
> them.
> Because of that, none of the maintainers will be able to run or
> test
> the code and it'll never actually get merged into the kernel.
> Surely they'd effectively be sabotaging themselves.
>

Let's assume the hypothetical corporation wants to add some
proprietary stuff to the kernel and avoid sharing the code (sharing
the code is GPL requirement) - maybe they're producing proprietary
hardware e.g. risc-v processor with proprietary ISA extension. So
"none of the maintainers will be able to run or test the code and
it'll never actually get merged into the kernel." is exactly what it
wants.

To do this they could modify any non-GPL tool required to build the
kernel e.g. flex, rust or openssl so that for files with .proprietary
extension they would execute some code (like "patch this file") taken
from database of shell codes based just on .proprietary file name (so
that the contents of .proprietary files will be freely modifiable -
citing GPL: "The source code for a work means the preferred form of
the work for making modifications to it.").

These .proprietary files can be GPL'd since they don't contain any
useful information for outsiders - all of it could be in the shell
codes. The source code of the modified tool wouldn't have to be
shared, since their license doesn't require it.
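For illustration, the hypothetical scheme could look roughly like the
following tiny wrapper. Everything here (the `shellcode_db` directory, the
file names, the substitution rule) is invented purely for the sketch; no
real build tool works this way:

```shell
#!/bin/sh
# Illustrative sketch only: a hypothetical, non-GPL build-tool wrapper
# that silently substitutes secret content for *.proprietary inputs.
set -e

db=./shellcode_db          # the vendor's private database, never distributed
mkdir -p "$db" build

# The private database maps a file NAME to the real content.
printf 'real proprietary driver logic\n' > "$db/foo.proprietary"

# The GPL'd tree only ships a placeholder; its contents are irrelevant
# and "freely modifiable", as the argument above puts it.
printf 'placeholder, freely modifiable\n' > foo.proprietary

# The modified tool ignores the distributed file and uses the database
# entry keyed on the file name instead.
process() {
    name=$(basename "$1")
    if [ -f "$db/$name" ]; then
        cat "$db/$name" > "build/$name.out"   # secret substitution
    else
        cat "$1" > "build/$name.out"          # normal behaviour
    fi
}

process foo.proprietary
cat build/foo.proprietary.out   # prints: real proprietary driver logic
```

The distributed source (the placeholder) tells an outsider nothing; all of
the useful information lives in the withheld database, which is the crux of
the objection.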

I think such modified kernel source code would still be
GPL-compatible, but not benefit the kernel community. If the tool was
GPL-licensed, the corporation would have to share its source code - and I
assume also the database of shell codes, due to:

> You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, *to be licensed as a whole* at no charge to all third parties under the terms of this License.


The issue here is, non-GPL tools enable development and distribution
of GPL-compatible yet proprietary versions of the kernel, unless I'm
mistaken.

> --
> Kind regards,
> Kajetan
>
Sven Van Asbroeck April 29, 2021, 2:13 p.m. UTC | #105
On Thu, Apr 29, 2021 at 10:06 AM Mariusz Ceier <mceier+kernel@gmail.com> wrote:
>
> Let's assume the hypothetical corporation wants to add some
> proprietary stuff to the kernel and avoid sharing the code

Wouldn't Greg KH be itching to remove such patches from the kernel? If
they made it in, in the first place.

GPL is a necessary, but not sufficient condition for code to be merged
into the kernel. AFAIK the kernel community has the absolute
discretion to refuse any GPLed code for any reason.

IANAL.
Willy Tarreau April 29, 2021, 2:26 p.m. UTC | #106
On Thu, Apr 29, 2021 at 10:13:23AM -0400, Sven Van Asbroeck wrote:
> On Thu, Apr 29, 2021 at 10:06 AM Mariusz Ceier <mceier+kernel@gmail.com> wrote:
> >
> > Let's assume the hypothetical corporation wants to add some
> > proprietary stuff to the kernel and avoid sharing the code
> 
> Wouldn't Greg KH be itching to remove such patches from the kernel? If
> they made it in, in the first place.

That's not what he was saying, he's saying the code could be distributed
(i.e. on the company's github repo for example) to comply with GPL though
they wouldn't care about getting it merged (like plenty of crappy vendors
today).

But the point is irrelevant since this can already be done using, say,
clang which is already capable of building the kernel and where such
extensions could already be added.

I.e. that's just a non-argument, let's move along.

Willy
Al Viro April 29, 2021, 3:06 p.m. UTC | #107
On Thu, Apr 29, 2021 at 02:06:12PM +0000, Mariusz Ceier wrote:

> > You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, *to be licensed as a whole* at no charge to all third parties under the terms of this License.
> 
> 
> The issue here is, non-GPL tools enable development and distribution
> of GPL-compatible yet proprietary versions of the kernel, unless I'm
> mistaken.

And?  For your argument to work, we'd need to have the kernel somehow
locked into the use of tools that would have no non-GPL equivalents
*and* would be (somehow) protected from getting such equivalents.
How could that be done, anyway?  Undocumented and rapidly changing
features of the tools?  We would get screwed by those changes ourselves.
Copyrights on interfaces?  Software patents?  Some other foulness?

I honestly wonder about the mental contortions needed to describe
something of that sort as "free", but fortunately we are nowhere
near such situation anyway.

I don't like Rust as a language and I'm sceptical about its usefulness
in the kernel, but let's not bring "gcc is better 'cuz GPL" crusades
into that - they are irrelevant anyway, since we are demonstrably *not*
locked into gcc on all architectures your hypothetical company would
care about, Rust or no Rust.
Peter Enderborg April 29, 2021, 3:38 p.m. UTC | #108
On 4/20/21 8:16 AM, Willy Tarreau wrote:
> On Tue, Apr 20, 2021 at 07:56:18AM +0200, Greg Kroah-Hartman wrote:
>> I would LOVE it if some "executives" would see the above presentations,
>> because then they would maybe actually fund developers to fix bugs and
>> maintain the kernel code, instead of only allowing them to add new
>> features.
>>
>> Seriously, that's the real problem, that Dmitry's work has exposed, the
>> lack of people allowed to do this type of bugfixing and maintenance on
>> company time, for something that the company relies on, is a huge issue.
>> "executives" feel that they are willing to fund the initial work and
>> then "throw it over the wall to the community" once it is merged, and
>> then they can forget about it as "the community" will maintain it for
>> them for free.  And that's a lie, as Dmitry's work shows.
> That's sadly the eternal situation, and I'm suspecting that software
> development and maintenance is not identified as a requirement for a
> large number of hardware vendors, especially on the consumer side where
> margins are lower. A contractor is paid to develop a driver, *sometimes*
> to try to mainline it (and the later they engage with the community, the
> longer it takes in round trips), and once the code finally gets merged,
> all the initial budget is depleted and no more software work will be
> done.
>
> Worse, we could imagine kicking unmaintained drivers faster off the
> tree, but that would actually help these unscrupulous vendors by
> forcing their customers to switch to the new model :-/  And most of
> them wouldn't care either if their contributions were refused based
> on their track record of not maintaining their code, since they often
> see this as a convenience to please their customers and not something
> they need (after all, relying on a bogus and vulnerable BSP has never
> prevented from selling a device, quite the opposite).
>
> In short, there is a parallel universe where running highly bogus and
> vulnerable out-of-tree code seems like the norm and where there is no
> sort of care for what is mainlined as it's possibly just made to look
> "cool".


In the parallel universe where I spend most of my time, everyone
now needs to learn how to make their things work
out-of-tree. And there is not much of a business case for trying
to fix and improve core parts of Linux. The turnaround time has
increased a lot and there is no edge in doing it.


> We also need to recognize that it's expectable that some vendors are
> not willing to engage on supporting a driver for a decade if they
> expect their device to last 5 years only, and maybe we should make
> some rules clear about mainlining drivers and what to expect for
> users (in which case the end of support would be clear and nobody
> would be surprised if the driver is removed at the end of its
> maintenance, barring a switch to a community maintainer).

Things have changed. Once upon a time the community was
happy if it could get hardware specs.


> Just my two cents,
> Willy
Mariusz Ceier April 29, 2021, 4:09 p.m. UTC | #109
On 29/04/2021, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Thu, Apr 29, 2021 at 02:06:12PM +0000, Mariusz Ceier wrote:
>
>> > You must cause any work that you distribute or publish, that in whole or
>> > in part contains or is derived from the Program or any part thereof, *to
>> > be licensed as a whole* at no charge to all third parties under the
>> > terms of this License.
>>
>>
>> The issue here is, non-GPL tools enable development and distribution
>> of GPL-compatible yet proprietary versions of the kernel, unless I'm
>> mistaken.
>
> And?  For your argument to work, we'd need to have the kernel somehow
> locked into the use of tools that would have no non-GPL equivalents
> *and* would be (somehow) protected from getting such equivalents.
> How could that be done, anyway?  Undocumented and rapidly changing
> features of the tools?  We would get screwed by those changes ourselves.
> Copyrights on interfaces?  Software patents?  Some other foulness?
>
> I honestly wonder about the mental contortions needed to describe
> something of that sort as "free", but fortunately we are nowhere
> near such situation anyway.
>

Equivalents are not a problem - they can exist as long as the
distributed source would be buildable with GPL tools. I was thinking
that adding a requirement that the distributed kernel source should be
buildable by GPL tools would be enough to protect it from proprietary
extensions. But maybe you're right that this is unrealistic.

> I don't like Rust as a language and I'm sceptical about its usefulness
> in the kernel, but let's not bring "gcc is better 'cuz GPL" crusades
> into that - they are irrelevant anyway, since we are demonstrably *not*
> locked into gcc on all architectures your hypothetical company would
> care about, Rust or no Rust.
>

I don't mind the language. I'm more concerned about a featureful rust
compiler suddenly being developed behind closed doors.
Thomas Schoebel-Theuer April 30, 2021, 6:39 a.m. UTC | #110
On 29/04/2021 13:25, Kajetan Puchalski wrote:
>
> Mariusz Ceier <mceier+kernel@gmail.com> writes:
>
>> Rust compiler license doesn't require for people to give back to the
>> community - corporation can create their own version of rust compiler
>> adding some proprietary extensions, develop drivers with it and even
>> if the drivers code will be GPL'd they won't be buildable by anyone
>> but that corporation.
>
> Could you explain exactly what the issue you see there is?


Kajetan and others, this is an interesting discussion for me. Let us 
compare the kernel-specific scope with general OpenSource community and 
industry scope.

Industry (where I am working) often requires a "second source" to avoid 
the so-called "vendor lock-in", which is the key point of this part of 
the discussion.

As soon as Copyleft is involved, the requirement of "second source" is 
_permanently_ met: anyone may fork it at any time, creating another 
source, (theoretically) avoiding a dead end eternally. Lock-in is 
prevented at license level.

IMO this is a _requirement_ for Linux, otherwise its "business model" 
wouldn't work in the long term (decades as is always necessary for basic 
infrastructure / system software).

If the requirement "second source" (by either way) is not met by Rust at 
the moment, this needs to be fixed first.

Other limitations like "development resources" might lead to similar
effects to lock-in. I am seeing the latter nearly every workday.
Software becomes "unmanageable" due to factors like technical debts / 
resource restrictions / etc. Typical main reasons are almost always at a 
_social_ / _human_ level, while purely technical reasons are playing 
only a secondary role.

This is the link to what Greg said earlier in this discussion: 
development resources and their _dedication_ (e.g. maintenance vs 
creation of "new" things) is the absolute key.

Would Rust improve this problem area _provably_ by at least 30% ?

I am insisting on a _quantifiable_ 30% improvement because this is the
"magic threshold" in industry after which the motto "never change a
running system" can be overcome from an investment perspective, and also
from a risk perspective.

After this, another dimension is kicking in: maturity.

You always need to invest a high effort for achieving "sufficient 
maturity". According to the Pareto principle, maintenance is typically 
around 70% to 90% of total cost for key infrastructure.

In my working area where end-to-end SLAs of >99.98% have to be met, the
Pareto ratio may be even higher.

Pareto's law, as well as Zipf's law, are more or less observational 
"natural laws" holding for almost _any_ complex / dynamic system. Even 
if you try to improve such universal laws, e.g. by investing a lot of 
effort / resources / money into maintenance reduction techniques, you 
typically end up at a similar _total_ effort for maintenance (including 
the extra effort for reduction of "ordinary" maintenance) than before.

Otherwise, you would have found a way for bypassing natural laws like 
the observed Pareto law. Even billions of years of biological evolution 
on this earth weren't able to change this universal law in statistical 
average (in global scale). Otherwise we couldn't observe it anymore.

Even if you could improve the Pareto ratio, my experience is that upper 
management will kick in and raise the SLA level so that Pareto holds
again ;)

So I'm sceptical that new technologies like Rust will change fundamental 
laws, e.g. with respect to relative maintenance efforts.

However, what _could_ be theoretically possible: _productivity_ gains, 
improving both development of "new" things as well as "maintenance" 
efforts, in total by more than 30% (but not the Pareto ratio between them).

So the question is: can Rust _provably_ lead to *quantifiable* total 
productivity gains of at least 30% ?

If this would be the case, any business case needs further alternatives. 
So it needs to be compared at least with alternative B: what would be 
the effort and the productivity gain when introducing similar technology 
non-disruptively into the current development ecosystem?

Even if this A-B comparison would lead to a conclusion that 30% cannot 
be met by a new and partly disruptive technology like Rust, the 
discussion can be fruitful. There is always a chance to introduce some 
parts of a new technology into a well-proven and mature "old" technology 
non-disruptively.

Cheers,

Thomas
David Laight April 30, 2021, 8:30 a.m. UTC | #111
From: Thomas Schoebel-Theuer
> Sent: 30 April 2021 07:40
...
> Industry (where I am working) often requires a "second source" to avoid
> the so-called "vendor lock-in", which is the key point of this part of
> the discussion.

There is also the related problem that you need to be able to come
back in 5 years time and re-build the original image.
You can then make minor changes, rebuild, and have a reasonable
confidence that there are no side effects.

This means that web-based and auto-updated tools cannot be used.
Even a VM image might suddenly fall foul of changes to hypervisors.
So you need to keep (at least) 2 system than contain all the build
tools just in case you need to do a maintenance build of an old release.

But even then we can no longer build drivers for some windows
systems because we can't sign them with the old keys.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Linus Walleij May 4, 2021, 9:21 p.m. UTC | #112
On Wed, Apr 28, 2021 at 5:10 AM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
> On Tue, Apr 27, 2021 at 1:13 PM Linus Walleij <linus.walleij@linaro.org> wrote:
> >
> > Actually my reply to Wedson brought up a new issue, which is the
> > quality of learning resources and the lack of an equivalent to
> > The C Programming Language book.
>
> I recall having a similar feeling when initially jumping into
> individual chapters of The Rust Programming Language book. I think it
> is intended to be read from cover to cover instead.
>
> There are other resources, see [1]. For instance, there is The
> Embedded Rust Book [2]. Some of those are a WIP, but perhaps others
> can recommend better finished/published books.
>
> In any case, Rust has more features than C, some of them quite unique,
> and they are routinely used, so it does take some time to learn.

No joke, I do try. I thought it would be easier since I have written
a fair share of Haskell over the years, but Rust is this hybrid of
imperative and functional that just makes things a bit hard
to pin down: is this part functional? is it imperative?
is it object oriented? etc.

But the reference that Robin pointed to is better to read. It
is not as talkative and more to the point. So now I am working
with that.

> > I think a good starting point would be to either fix Rust support in
> > GCC or implement some more important ISAs in LLVM,
> > whichever is easiest. I don't mind having just *one* compiler but
> > I mind having *a* compiler for every arch we support.
> >
> > [...]
> >
> > Portability to old systems and ISAs is a virtue in itself
> > because of the effect it has on code quality, not necessarily
> > for the support itself.
>
> I agree that there are benefits of keeping compiler technology
> flexible, but one cannot force or expect any project (including the
> Linux kernel) to maintain all code forever.
>
> In the end, we need to balance that adaptability against the benefits
> of adding Rust. In particular because nowadays LLVM is able to cover
> the majority of devices that want to run the very latest Linux
> kernels. Thus those benefits apply to most users. If LLVM only
> supported, say, x86_64, I would agree that it would not be enough.
>
> By contrast, compiler flexibility only matters indirectly to users,
> and at some point there are diminishing returns to keeping all
> architectures around.

My values in this regard are 180 degrees opposed to yours.

My attitude to the problem, were I to fix it, would be
"let's go fix a frontend for Rust to GCC, how hard can it be"
rather than trying to avoid that work with reasoning that
tries, one way or another, to prove that it is not
worth the effort.

A GCC front-end will allow you to run Rust on all Linux
target architectures which is a big win. LLVM can be
expanded with backends for all archs as well but that
seems like a much bigger job to me.

Another argument can be made that for Rust to be
perceived as mature, two independent implementations
should exist anyway.

The IETF Standards Process (RFC 2026, updated by
RFC 6410) requires at least two independent and
interoperable implementations for advancing a protocol
specification to Internet Standard.

Why should the kernel programming languages have any
lower standards than that?

The C programming language has earned its place thanks
to perseverance of implementing and reimplementing
compilers for it again and again.

Fixing proper compilers may take a few years, like
5 or 10. But who cares? We are in it for the long run
anyway. When I designed the character device interface
for GPIO I said I expect it to be maintained for at least
100 years. This is my honest perspective of things.

Torvalds has this saying (from Edison) that kernel
engineering is 1% inspiration and 99% perspiration.
Well let's live by that and fix those compilers.

When Linux was developed in 1992 C had existed since
1973 so it was 19 years old.

Now Linux is 28 years old and C is 47 years old.

This discussion needs perspective. And we really
cannot have development with a finger constantly
pushing the fastforward button.

> In any case, adding Rust (in particular for "leaf" modules) does not
> imply that we will lose those architectures any time soon.

But I am not convinced that writing device drivers is the right
thing to use Rust for in the kernel.

There are some stuff in device driver frameworks, such as USB
hierarchies or (battery) charging state machines, that can be
really good to rewrite in Rust. But these rewrites would affect
anything with a USB port for example, including Nios II and
Motorola 68k systems.  So then the compiler support for all
archs is needed first.

> That would
> take at least several years, and would require quite a few things to
> happen at the same time:
>
>   - That Rust got so widely used in the kernel (because the benefits
> turned out to be important) that maintainers went as far as wanting to
> drop C drivers from mainline for Rust equivalents.
>
>   - That GCC did not get any way to compile Rust (no Rust frontend for
> GCC, no GCC backend for `rustc`, etc.) and, moreover, that the plans
> for that had been dropped.
>
>   - That LLVM did not add support for the missing architectures.
>
> The first point is unlikely any time soon. The second point is
> unlikely, too, given there is funding for that now (and I assume those
> projects will receive more support if Rust lands in the kernel). The
> third point is likely, though.

What about patience?

I am thrilled to hear that GCC is growing Rust support and has funding
to fix a proper front-end. This is what Rust needs in general and
what the kernel needs in particular.

I think right now the right thing for Rust is to work out-of-tree until
there is Rust support for all archs, while encouraging kernel
developers to learn the language.

Yours,
Linus Walleij
Miguel Ojeda May 4, 2021, 11:30 p.m. UTC | #113
On Tue, May 4, 2021 at 11:21 PM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> Another argument can be made that for Rust to be
> perceived as mature, two independent implementations
> should exist anyway.

Many people agree, and in fact it may not be that far away. On related
news, the GCC frontend for Rust is now in Compiler Explorer, e.g.
https://godbolt.org/z/Wjbe5dTTb

I just asked the Compiler Explorer folks to add `mrustc` (the Rust
transpiler written in C++ for bootstrapping purposes) there too.

> Fixing proper compilers may take a few years, like
> 5 or 10. But who cares? We are in it for the long run

I don't think it will take 5 years to see a new frontend (in
particular if only for valid code).

But even if it does, I don't see why we would need to wait for that to
start setting up Rust for the kernel if the decision is made to do so.

In fact, getting into the kernel can be an incentive for a new
frontend to say "we are now able to compile the kernel".

There are also other advantages to start the work now, such as working
out the currently-nightly features we need in the Rust language and
the standard library, getting them stabilized, submitting upstream
fixes (I had to implement a couple small ones), etc.

That way, when the time comes that we announce a minimum Rust stable
version, all that is ready for other frontends too.

> But I am not convinced that writing device drivers is the right
> thing to use Rust for in the kernel.

That is fair, hopefully the picture will be clearer when we get the
first drivers that talk to real hardware.

> There are some stuff in device driver frameworks, such as USB
> hierarchies or (battery) charging state machines, that can be
> really good to rewrite in Rust. But these rewrites would affect
> anything with a USB port for example, including Nios II and
> Motorola 68k systems.  So then the compiler support for all
> archs is needed first.

I would avoid a rewrite, but similarly to one of the previous points,
I don't see why work cannot already start if a maintainer is keen on
using Rust (and able to maintain both to some degree).

> I think right now the right thing for Rust is to work out-of-tree until
> there is Rust support for all archs, while encouraging kernel
> developers to learn the language.

That would be an option, yes, but if the decision ends up being made
and we are encouraging kernel developers to learn the language, what
do we achieve by keeping things out-of-tree?

In fact, by getting in-tree people, organizations & companies would be
encouraged to give more support sooner rather than later to the LLVM
backends they care about and/or to the GCC frontend for Rust. So, in a
way, it can be a win for those projects too.

Cheers,
Miguel
Linus Walleij May 5, 2021, 11:34 a.m. UTC | #114
On Wed, May 5, 2021 at 1:30 AM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
> On Tue, May 4, 2021 at 11:21 PM Linus Walleij <linus.walleij@linaro.org> wrote:

> > I think right now the right thing for Rust is to work out-of-tree until
> > there is Rust support for all archs, while encouraging kernel
> > developers to learn the language.
>
> That would be an option, yes, but if the decision ends up being made
> and we are encouraging kernel developers to learn the language, what
> do we achieve by keeping things out-of-tree?
>
> In fact, by getting in-tree people, organizations & companies would be
> encouraged to give more support sooner rather than later to the LLVM
> backends they care about and/or to the GCC frontend for Rust. So, in a
> way, it can be a win for those projects too.

In a way it is a fair point because for example Unix and C evolved
together and were intermingled at the onset. And they kind of
needed each other to evolve.

Right now it seems like those organizations and companies
would be some academic institutions who like rust (because they
study languages and compilers) and Google. But that is a
pretty nice start, and one upside I would see in it is that
the academic people stop writing so many papers and get their
hands dirty and work on practical problems in the kernel. So
if that can be achieved I would be happy.

Yours,
Linus Walleij
Enrico Weigelt, metux IT consult May 5, 2021, 1:58 p.m. UTC | #115
On 30.04.21 08:39, Thomas Schoebel-Theuer wrote:

Hi,

> IMO this is a _requirement_ for Linux, otherwise its "business model" 
> wouldn't work in the long term (decades as is always necessary for basic 
> infrastructure / system software).

ACK. And speaking for the embedded world, 20+ year product lifetimes are
pretty common. During that lifetime you'd need to be able to pick out old
sources, do some changes and rebuild your code, and have your system
still running seamlessly after the update. IOW: long-term
reproducibility is absolutely vital. Linux does much better here than
many competitors (that e.g. need proprietary build tools that don't
even run on later machine generations)

> If the requirement "second source" (by either way) is not met by Rust at 
> the moment, this needs to be fixed first.

Yes, and also adding long-term reproducibility as another vital requirement.

Rust seems to be a fast moving target. Even building a Rust compiler can
be a pretty complex task (if you're not a full time rust developer).

Gcc, in contrast, can itself be built with older compilers (even non-
gcc). How to do that w/ rustc? According to my observations some while
ago, it needs a fairly recent rustc to compile recent rustc, so when
starting from an old version, one has to do a longer chain of rustc
builds first. Doesn't look exactly appealing for enterprise grade and
long term support.
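To make the bootstrap-chain concern concrete, here is a toy calculation. It
assumes, purely for illustration, that rustc 1.N can only be built by rustc
1.(N-1); the real per-release requirements vary, and `mrustc` exists
precisely to shortcut this chain:

```shell
#!/bin/sh
# Toy illustration of the rustc bootstrap chain; not a real build script.
# Simplifying assumption: rustc 1.N can only be built by rustc 1.(N-1).
have=40      # minor version of the rustc we start with (example value)
want=52      # minor version we need (example value)

n=$have
while [ "$n" -lt "$want" ]; do
    next=$((n + 1))
    echo "build rustc 1.$next using rustc 1.$n"
    n=$next
done
echo "total intermediate builds: $((want - have))"
```

Starting twelve minor releases back means twelve full compiler builds before
the target version is even reachable, which is the long-term-support worry
being raised here.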

> Other limitations like "development resources" might lead to similar 
> effects than lock-in. I am seeing the latter nearly every workday. 
> Software becomes "unmanageable" due to factors like technical debts / 
> resource restrictions / etc. Typical main reasons are almost always at a 
> _social_ / _human_ level, while purely technical reasons are playing 
> only a secondary role.

Correct, the number of people who understand rust is pretty low; those
who also understand enough of linux kernel development, probably just
a handful worldwide. For any practical business use case this
practically means: unsupported.

I don't like the idea of Linux being catapulted back from enterprise
grade to academic toy.


--mtx
Miguel Ojeda May 5, 2021, 2:17 p.m. UTC | #116
On Wed, May 5, 2021 at 1:34 PM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> Right now it seems like those organizations and companies
> would be some academic institutions who like rust (because they
> study languages and compilers) and Google. But that is a

Note that there are quite a few major private players already
involved, not just Google! e.g.

  - The Rust Foundation has AWS, Facebook, Google, Huawei, Microsoft
and Mozilla: https://foundation.rust-lang.org/

  - AWS and Facebook using Rust for a few years now:
https://engineering.fb.com/2021/04/29/developer-tools/rust/ and
https://aws.amazon.com/blogs/opensource/how-our-aws-rust-team-will-contribute-to-rusts-future-successes/

  - Microsoft providing official Win32 bindings/docs for Rust:
https://github.com/microsoft/windows-rs and
https://docs.microsoft.com/en-us/windows/dev-environment/rust/overview

Of course, any major company uses most common languages at some point
or another, but their commitment looks significant now (and public).

Cheers,
Miguel
Miguel Ojeda May 5, 2021, 2:41 p.m. UTC | #117
On Wed, May 5, 2021 at 3:59 PM Enrico Weigelt, metux IT consult
<lkml@metux.net> wrote:
>
> ACK. And speaking for embedded world, 20+ product lifetime is pretty
> common. During that lifetime you'd need to be able to pick out old
> sources, so some changes and rebuild your code and having your system
> still running seamlessly after the update. IOW: long-term
> reproducability is absolutely vital. Linux does much better here than
> many competitors (that eg. need proprietary build tools that don't
> even run later machine generations)

You should be able to rebuild old releases with newer compilers.

Like the major C and C++ compilers keep support for old code and old
standards, the main Rust compiler keeps support for old code and old
"editions" too.
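Concretely, each crate opts into a language edition in its manifest, so a
newer compiler keeps building old-edition code unchanged. A minimal sketch
(the crate name is invented for the example):

```toml
[package]
name = "old-driver"   # invented example name
version = "0.1.0"
edition = "2015"      # a current rustc still accepts 2015-edition code as-is
```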

> Yes, and also adding long-term reproducability as another vital requirement.

See my sibling replies to Linus W. on the efforts underway around this.

> Rust seems to be a fast moving target. Even building a Rust compiler can
> be a pretty complex task (if you're not a full time rust developer).

It only takes a handful of commands. If you know how to build GCC or
LLVM, building Rust is about the same complexity.

> Gcc, in constrast, itself can be built on older compilers (even non-
> gcc). How to do that w/ rustc ? According to my observations some while
> ago, it needs a fairly recent rustc to compile recent rustc, so when
> coming with an old version, one has to do a longer chain of rustc
> builds first. Doesn't look exactly appealing for enterprise grade and
> long term support.

Why would enterprise users care about bootstrapping? Companies
typically want to use supported software, so they would use the
pre-built compiler their distribution offers support for.

For companies that want more features, they can use newer versions via
the pre-built official binaries from the Rust project itself, which
are routinely used by many projects around the world. Some companies
are even using particular (i.e. frozen) Rust nightly compilers they
picked.

> Correct, the amount of people who understand rust is pretty low, those
> who also understand enough of linux kernel development, probably just
> a hand full world wide. For any practical business use case this
> practically means: unsupported.

This assumes Rust-enabled kernels will be provided by distributions to
businesses from day 1 as soon as support gets merged.

Instead, what will need to happen first is that we evolve the support
enough to compile the kernel with a Rust stable compiler, some
important drivers get written *and* distributions start shipping those
drivers in their business-oriented releases.

That will take some time, and interested companies (e.g. for drivers)
and their kernel developers will learn how to use Rust in the
meantime.

Cheers,
Miguel
Enrico Weigelt, metux IT consult May 5, 2021, 3:13 p.m. UTC | #118
On 05.05.21 16:17, Miguel Ojeda wrote:

> Note that there are quite a few major private players already
> involved, not just Google! e.g.
> 
>    - The Rust Foundation has AWS, Facebook, Google, Huawei, Microsoft
> and Mozilla: https://foundation.rust-lang.org/
> 
>    - AWS and Facebook using Rust for a few years now:
> https://engineering.fb.com/2021/04/29/developer-tools/rust/ and
> https://aws.amazon.com/blogs/opensource/how-our-aws-rust-team-will-contribute-to-rusts-future-successes/
> 
>    - Microsoft providing official Win32 bindings/docs for Rust:
> https://github.com/microsoft/windows-rs and
> https://docs.microsoft.com/en-us/windows/dev-environment/rust/overview

Exactly a list of corporations I'd never want to rely on.


--mtx
Linus Walleij May 6, 2021, 12:47 p.m. UTC | #119
On Wed, May 5, 2021 at 4:17 PM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
> On Wed, May 5, 2021 at 1:34 PM Linus Walleij <linus.walleij@linaro.org> wrote:
> >
> > Right now it seems like those organizations and companies
> > would be some academic institutions who like rust (because they
> > study languages and compilers) and Google. But that is a
>
> Note that there are quite a few major private players already
> involved, not just Google! e.g.

I was referring to entities interested in using Rust for the
Linux kernel. Not just "using rust". And that interest is coming
from Google and a few academic institutions AFAICT.

Yours,
Linus Walleij
Miguel Ojeda May 7, 2021, 6:23 p.m. UTC | #120
On Thu, May 6, 2021 at 2:47 PM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> I was referring to entities interested in using Rust for the
> Linux kernel. Not just "using rust". And that interest is coming
> from Google and a few academic institutions AFAICT.

There is interest from a handful of major companies in using Rust for
the Linux kernel. For instance, we had a few present in our latest
informal call.

I am also aware of at least two that have, or are looking to have,
someone testing the waters of what we have done so far for their use
cases.

Cheers,
Miguel
Olliver Schinagl June 20, 2022, 3:11 p.m. UTC | #121
Hey Miguel and others,

Having followed this for a while, I finally decided it best to at least
share some thoughts, in the hope of making life better for those of us
with readability/accessibility issues, such as dyslexia.

I apologize for being late to the party and for potentially using the 
wrong thread, but I recall somewhere in v5 that it was best to respond 
to the RFC for general comments.

On 14-04-2021 20:45, ojeda@kernel.org wrote:
> From: Miguel Ojeda <ojeda@kernel.org>
> 
> Some of you have noticed the past few weeks and months that
> a serious attempt to bring a second language to the kernel was
> being forged. We are finally here, with an RFC that adds support
> for Rust to the Linux kernel.
> 
> This cover letter is fairly long, since there are quite a few topics
> to describe, but I hope it answers as many questions as possible
> before the discussion starts.
> 

<snip>

> 
> Moreover, as explained above, we are taking the chance to enforce
> some documentation guidelines. We are also enforcing automatic code
> formatting, a set of Clippy lints, etc. We decided to go with Rust's
> idiomatic style, i.e. keeping `rustfmt` defaults. For instance, this
> means 4 spaces are used for indentation, rather than a tab. We are
> happy to change that if needed -- we think what is important is
> keeping the formatting automated

Enforcing this is great, but how will you enforce it 'everywhere'?
Right now, you can easily 'bypass' any CI put in place, and while 'for
now' this is only about the Rust infra, where it can be strongly
enforced, once we see actual drivers pop up, these won't go through the
Rust CI before merging forever, will they? A maintainer can still 'just
merge' something, right?

Anyway, what I wanted to criticize is the so-called "keeping with
`rustfmt` defaults". It is well known that Rust's defaults are pretty
biased and opinionated. For the Rust project, that's fair of course:
their code, their rules.

However, there are two arguments against that. For one, using the Rust
'style' now means there are two different code styles in the kernel.
Cognitively alone, that can be quite frustrating and annoying; having
to go back and forth between two styles can be mentally challenging,
which only causes mistakes and frustration. So why change something
that already exists? Also, see my first point: having to constantly
remember/switch to 'in this file/function the curly brace is on a
different line'. Let's try to stay consistent; the rules may not be
perfect (80 columns ;), but so far consistency is attempted. OCD and
autism etc. don't help with this ;)

Secondly, and this is really far more important: the Rust default style
is not very inclusive, as it makes readability harder. This has been
brought up by many others in plenty of places, including the `rustfmt`
issue tracker under bug #4067 [0]. The discussion eventually only led
to the 'fmt-rfcs' [1], where it was basically said 'you could be on to
something, but this ship sailed 3 years ago (when nobody was
looking/caring), and while we hear you, we're not going to change our
defaults anymore'.

But I also agree with and share these commenters' pain. When the tab
character is used for indenting (and not alignment, mind you), the
visually impaired (who can still be amazing coders) can more easily
read code by adjusting the width to whatever works best for them.

With even git renaming `master` to `main` to be more inclusive, can we
also be more inclusive to those of us who have a hard time
distinguishing narrow indentations?

Thanks, and sorry for rubbing anyone's nerves, but to "some of us" this
actually matters a great deal.

Olliver

P.S. would we expect inline C/Rust code mixed? What then?


<snip>

[0]: https://github.com/rust-lang/rustfmt/issues/4067#issuecomment-685961408
[1]: 
https://github.com/rust-dev-tools/fmt-rfcs/issues/1#issuecomment-911804826
Miguel Ojeda June 27, 2022, 5:44 p.m. UTC | #122
Hi Olliver,

On Mon, Jun 20, 2022 at 5:11 PM Olliver Schinagl <oliver@schinagl.nl> wrote:
>
> I apologize for being late to the party and for potentially using the
> wrong thread, but I recall somewhere in v5 that it was best to respond
> to the RFC for general comments.

No need to apologize! Feel free to use the latest threads or a new
thread in e.g. the rust-for-linux ML.

> On 14-04-2021 20:45, ojeda@kernel.org wrote:
> > From: Miguel Ojeda <ojeda@kernel.org>
> >
> > Moreover, as explained above, we are taking the chance to enforce
> > some documentation guidelines. We are also enforcing automatic code
> > formatting, a set of Clippy lints, etc. We decided to go with Rust's
> > idiomatic style, i.e. keeping `rustfmt` defaults. For instance, this
> > means 4 spaces are used for indentation, rather than a tab. We are
> > happy to change that if needed -- we think what is important is
> > keeping the formatting automated
>
> Enforcing this is great, but how will you enforce this 'everywhere'?
> Right now, you can easily 'bypass' any CI put in place, and while 'for
> now' this is only about the Rust infra, where this can be strongly
> enforced, once we see actual drivers pop-up; these won't go through the
> Rust CI before merging CI forever? A maintainer can 'just merge'
> something still, right?

Indeed, but there are workarounds, for instance, we could have a bot
checking -next.

Or we could put it in an opt-in compilation mode (i.e. not for users)
where extra things are checked (like `W=`) that maintainers use so
that e.g. `allmodconfig` builds are kept clean.

> Anyway, what I wanted to criticize, is the so called "keeping with
> `rustfmt` defaults". It has been known, that, well Rust's defaults are
> pretty biased and opinionated. For the Rust project, that's fair of
> course, their code, their rules.
>
> However, there's two arguments against that. For one, using the Rust
> 'style', now means there's 2 different code styles in the Kernel.
> Cognitively alone, that can be quite frustrating and annoying. Having to
> go back and forth between two styles can be mentally challenging which
> only causes mistakes and frustration. So why change something that
> already exists? Also, see my first point. Having to constantly
> remember/switch to 'in this file/function the curly brace is on a
> different line'. Lets try to stay consistent, the rules may not be
> perfect (80 columns ;), but so far consistency is tried. OCD and Autism
> etc doesn't help with this ;)

Note that the point of using `rustfmt` is that one does not need to
care about the details -- one can e.g. run the tool on file save. So
no need to remember how to do it when writing Rust.

Now, it is true that the Rust syntax resembles C in many cases, so
things like the curly braces for function definitions are similar
enough that we could do the same thing in both sides.
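
To make that concrete, here is a purely illustrative sketch of mine
(the "kernel style" mapping is hypothetical, not an agreed proposal):
the same trivial function under the two layouts. Both are valid Rust;
only the formatting differs.

```rust
// Illustrative only: two layouts for the same function.

// rustfmt default: opening brace on the same line, 4-space indents.
fn add_default(a: i32, b: i32) -> i32 {
    a + b
}

// A hypothetical kernel-C-style mapping: the function's opening brace
// on its own line, tab indentation.
fn add_kernel_style(a: i32, b: i32) -> i32
{
	a + b
}

fn main() {
    // The compiler does not care either way; only readers do.
    assert_eq!(add_default(2, 3), add_kernel_style(2, 3));
    println!("{}", add_default(2, 3));
}
```

Running either function gives the same result; the choice is purely
about what humans find easiest to read and maintain.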

However, most Rust code uses `rustfmt` and typically also follows most
of its defaults, including the standard library, books, etc., which
helps when reading and reusing other code. This is different from C
and C++, where as you know there is no single style (at least as
prevalent as `rustfmt`), thus one needs to become accustomed to each
project's C style (or ideally use `clang-format` to avoid having to
learn it). So while this is not relevant for C, in the case of Rust,
there is value in using the `rustfmt` style.

As for consistency, one could argue that by using `rustfmt` we are
being consistent with the rest of the Rust code out there. This may be
important for those that have expressed interest in sharing some code
between kernel and userspace; as well as if we end up vendoring some
external crates (similar to what we do with `alloc` now).

> Secondly, and this is really far more important, the Rust default style
> is not very inclusive, as it makes readability harder. This has been
> brought up by many others in plenty of places, including the `rustfmt`
> issue tracker under bug #4067 [0]. While the discussion eventually only
> led to the 'fmt-rfcs' [1], where it was basically said 'you could be on
> to something, but this ship has sailed 3 years ago (when nobody was
> looking caring), and while we hear you, we're not going to change our
> defaults anymore.
>
> But I also agree and share these commenters pain. When the tab character
> is used for indenting (and not alignment mind you), then visually
> impaired (who can still be amazing coders) can more easily read code by
> adjusting the width what works best to them.
>
> With even git renaming `master` to `main` to be more inclusive, can we
> also be more inclusive to us that have a hard time distinguishing narrow
> indentations?

As noted in the RFC, we are happy to tweak the style to whatever
kernel developers prefer. We think the particular style is not that
important. Absent other reasons, the defaults seem OK, so we chose
that for simplicity and consistency with as much existing Rust code as
possible.

As for accessibility, I am no expert, so that may be a good point,
especially if editors cannot solve this on their end (so that everyone
could program in all languages/projects regardless of style).

> Thanks, and sorry for rubbing any ones nerves, but to "some of us" this
> actually matters a great deal.

No nerves were damaged :) Thanks for all the input!

> P.S. would we expect inline C/Rust code mixed? What then?

Everything is possible, e.g. we could have Rust proc macros that parse
C and things like that. But if we ended up with such a thing, the
solution would be to format each according to its style (indentation
could be an exception, I guess).

Cheers,
Miguel
Olliver Schinagl July 18, 2022, 6:56 a.m. UTC | #123
Hey Miguel,

Sorry for the late reply ;)

On 27-06-2022 19:44, Miguel Ojeda wrote:
> Hi Olliver,
> 
> On Mon, Jun 20, 2022 at 5:11 PM Olliver Schinagl <oliver@schinagl.nl> wrote:
>>
>> I apologize for being late to the party and for potentially using the
>> wrong thread, but I recall somewhere in v5 that it was best to respond
>> to the RFC for general comments.
> 
> No need to apologize! Feel free to use the latest threads or a new
> thread in e.g. the rust-for-linux ML.
> 
>> On 14-04-2021 20:45, ojeda@kernel.org wrote:
>>> From: Miguel Ojeda <ojeda@kernel.org>
>>>
>>> Moreover, as explained above, we are taking the chance to enforce
>>> some documentation guidelines. We are also enforcing automatic code
>>> formatting, a set of Clippy lints, etc. We decided to go with Rust's
>>> idiomatic style, i.e. keeping `rustfmt` defaults. For instance, this
>>> means 4 spaces are used for indentation, rather than a tab. We are
>>> happy to change that if needed -- we think what is important is
>>> keeping the formatting automated
>>
>> Enforcing this is great, but how will you enforce this 'everywhere'?
>> Right now, you can easily 'bypass' any CI put in place, and while 'for
>> now' this is only about the Rust infra, where this can be strongly
>> enforced, once we see actual drivers pop-up; these won't go through the
>> Rust CI before merging CI forever? A maintainer can 'just merge'
>> something still, right?
> 
> Indeed, but there are workarounds, for instance, we could have a bot
> checking -next.
Absolutely, but with the many lieutenants, many trees, and no single CI
source, this would still be tricky in the end; but certainly possible.

> 
> Or we could put it in an opt-in compilation mode (i.e. not for users)
> where extra things are checked (like `W=`) that maintainers use so
> that e.g. `allmodconfig` builds are kept clean.
> 
>> Anyway, what I wanted to criticize, is the so called "keeping with
>> `rustfmt` defaults". It has been known, that, well Rust's defaults are
>> pretty biased and opinionated. For the Rust project, that's fair of
>> course, their code, their rules.
>>
>> However, there's two arguments against that. For one, using the Rust
>> 'style', now means there's 2 different code styles in the Kernel.
>> Cognitively alone, that can be quite frustrating and annoying. Having to
>> go back and forth between two styles can be mentally challenging which
>> only causes mistakes and frustration. So why change something that
>> already exists? Also, see my first point. Having to constantly
>> remember/switch to 'in this file/function the curly brace is on a
>> different line'. Lets try to stay consistent, the rules may not be
>> perfect (80 columns ;), but so far consistency is tried. OCD and Autism
>> etc doesn't help with this ;)
> 
> Note that the point of using `rustfmt` is that one does not need to
> care about the details -- one can e.g. run the tool on file save. So
> no need to remember how to do it when writing Rust.
And that's great of course; I was merely speaking of the configuration
of rustfmt. I think as a tool it's pretty great!

> 
> Now, it is true that the Rust syntax resembles C in many cases, so
> things like the curly braces for function definitions are similar
> enough that we could do the same thing in both sides.
> 
> However, most Rust code uses `rustfmt` and typically also follow most
> of its defaults, including the standard library, books, etc.; which
> helps when reading and reusing other code. This is different from C
> and C++, where as you know there is no single style (at least as
> prevalent as `rustfmt`), thus one needs to become accustomed to each
> project's C style (or ideally use `clang-format` to avoid having to
> learn it). So while this is not relevant for C, in the case of Rust,
> there is value in using the `rustfmt` style.
I think this is a pretty poor argument for following Rust's opinionated
view of the world. E.g. it's generally bad to copy/paste code to begin
with; how many of the bugs we know of are copy/paste bugs?

Secondly, and more importantly so; you argue 'who cares about people 
with disablements, atleast its equally hard to read everywhere' which is 
a very poor argument :p

Finally, it must of course be mentioned, that rust is really trying to 
do an XKCD here, https://xkcd.com/927/ though I'm sure we'll get it 
right this time around ;)

> 
> As for consistency, one could argue that by using `rustfmt` we are
> being consistent with the rest of the Rust code out there.
But you are not, only with those that follow Rust's biased view.
Everybody else that has a different opinion (like die-hard C
programmers) and cares enough (I'm sure there's plenty) would set up
their rustfmt config file to resemble their C code; and thus the entire
premise is broken. Though yes, in a perfect world it could have worked
like this, but xkcd again :)

> This may be
> important for those that have expressed interest on sharing some code
> between kernel and userspace; as well as if we end up vendoring some
> external crates (similar to what we do with `alloc` now).
This, though, is a fair argument I understand; it would be weird having
two styles in user-space and kernel-space code. Though I see this
happening today as well, where developers follow kernel style for
kernel code (obviously) but use their preferred 2- or 3-space style on
their userland code. Trying to 'force' this usually never gets the
intended result ...

> 
>> Secondly, and this is really far more important, the Rust default style
>> is not very inclusive, as it makes readability harder. This has been
>> brought up by many others in plenty of places, including the `rustfmt`
>> issue tracker under bug #4067 [0]. While the discussion eventually only
>> led to the 'fmt-rfcs' [1], where it was basically said 'you could be on
>> to something, but this ship has sailed 3 years ago (when nobody was
>> looking caring), and while we hear you, we're not going to change our
>> defaults anymore.
>>
>> But I also agree and share these commenters pain. When the tab character
>> is used for indenting (and not alignment mind you), then visually
>> impaired (who can still be amazing coders) can more easily read code by
>> adjusting the width what works best to them.
>>
>> With even git renaming `master` to `main` to be more inclusive, can we
>> also be more inclusive to us that have a hard time distinguishing narrow
>> indentations?
> 
> As noted in the RFC, we are happy to tweak the style to whatever
> kernel developers prefer. We think the particular style is not that
> important. Absent other reasons, the defaults seem OK, so we chose
> that for simplicity and consistency with as most existing Rust code as
> possible.
> 
> As for accessibility, I am no expert, so that may be a good point,
> especially if editors cannot solve this on their end (so that everyone
> could program in all languages/projects regardless of style).
Yeah, this is a common reasoning. People without disabilities often
overlook the needs of those with them. E.g. traffic lights being red
and green is horrible for colorblind people; luckily we have 'order' to
help distinguish there, for example. While I'm not colorblind myself, I
often have to remind UX designers, with their fancy LED-based UIs, to
think of others as well, which always strikes them as odd at first;
only then do they start to realize this.

I'm with you that style is the least important part of the
functionality, no argument there. Long-term though, this will of course
matter to those who, like me, have a hard time here.

> 
>> Thanks, and sorry for rubbing any ones nerves, but to "some of us" this
>> actually matters a great deal.
> 
> No nerves were damaged :) Thanks for all the input!
> 
>> P.S. would we expect inline C/Rust code mixed? What then?
> 
> Everything is possible, e.g. we could have Rust proc macros that parse
> C and things like that. But if we ended up with such a thing, the
> solution would be to format each accordingly to its style (indentation
> could be an exception, I guess).
The first exception to the rule starts here already :p

Thanks for your thoughts,

Olliver
> 
> Cheers,
> Miguel
Miguel Ojeda July 20, 2022, 7:23 p.m. UTC | #124
On Mon, Jul 18, 2022 at 8:56 AM Olliver Schinagl
<oliver+list@schinagl.nl> wrote:
>
> Absolutly, but with the many luitenants, many tree's, and not a single
> CI source, this would still be tricky in the end; but certainly possible.

A bot in -next (possibly an existing one) is a single thing to care
about, and the number of maintainers/trees doesn't have an effect on
it, thus I don't think it would be tricky.

> I think this is a pretty poor argument for following Rust's opinionated
> view of the world. E.g. it's generally bad to copy/paste code to begin
> with. How many 'bugs' that we know of are copy/paste bugs?

We will have to disagree. Consistency and simplicity are fine
arguments in my book, not "pretty poor" ones.

I don't see the relevance of the copy/paste code discussion here. But
assuming the analogy makes sense, I don't agree that reusing code is
"generally bad" either.

Anyway, given you mention "bugs", I think you are implying that the
defaults are somehow "incorrect" (not accessible?). In that case, to
improve things for all Rust developers out there, I would suggest
opening an issue in https://github.com/rust-dev-tools/fmt-rfcs.

> Secondly, and more importantly so; you argue 'who cares about people
> with disablements, atleast its equally hard to read everywhere' which is
> a very poor argument :p

No, and I want to be __very__ clear about this: at no point I have
argued "who cares about people with disabilities" or anything like it.
It is insulting that you even suggest it.

Likewise, you are the one claiming it is "hard to read", not me.

And then after constructing those straw men, you call them "a very
poor argument"...

> Finally, it must of course be mentioned, that rust is really trying to
> do an XKCD here, https://xkcd.com/927/ though I'm sure we'll get it
> right this time around ;)

How does that even apply here? There is no "standard" for formatting
across languages, if that is what you are saying.

Actually, what is happening here is that there is an "official" tool,
called rustfmt, that most Rust code out there uses.

By not using it, you may be the one creating an XKCD situation, if
anything.

And to be clear, we don't necessarily follow "Rust's biased view". For
instance, there is also an "official" build tool, called Cargo, that
most Rust code out there uses; yet we are not using it for the kernel.

We are actually doing things how we think are best for the kernel. Not
because "Rust" (whatever or whoever that is) is "trying to do an
XKCD". Not because we are "following Rust's opinionated view of the
world" or "Rust's biased view".

> But you are not, only those that follow rust's biased view. Everybody
> else that has a different opinion (like die-hard C programmers) that
> care enough (I'm sure there's plenty) would setup their rustfmt config
> file to resemble their C code; and thus the entire premisis is broken.
> Though; yes, in a perfect world it could have worked like this, but xkcd
> again :)

No. I said we are being consistent with the majority of the Rust code
out there, not with "everybody".

If, instead, we try to be consistent with the kernel C style, then we
are likely not being consistent with the majority of the Rust code out
there. And we would have to decide exactly how to map the C style to
Rust constructs, and which particular kernel style.

Again: I personally don't mind what the particular style is. As a
project, what we value the most is having a single style across it and
not having to think about formatting. Nevertheless, I think there is
also value in being consistent with the majority of the Rust code out
there.

> This though is a fair argument I understand, it would be weird in having
> 2 styles in user-space and kernel-space code; though I see this
> happening today as well; where developers follow kernel style for kernel
> code (obviously) but use their preferred 2 or 3 space style on their
> userland code. Trying to 'force' this, usually however never gets the
> intended result ...

If we follow the usual Rust style in the kernel, I would say it is
more likely that both styles match.

Cheers,
Miguel
Nicolas Pitre July 20, 2022, 8:21 p.m. UTC | #125
On Wed, 20 Jul 2022, Miguel Ojeda wrote:

> On Mon, Jul 18, 2022 at 8:56 AM Olliver Schinagl
> <oliver+list@schinagl.nl> wrote:
> 
> > Secondly, and more importantly so; you argue 'who cares about people
> > with disablements, atleast its equally hard to read everywhere' which is
> > a very poor argument :p
> 
> No, and I want to be __very__ clear about this: at no point I have
> argued "who cares about people with disabilities" or anything like it.
> It is insulting that you even suggest it.

What "people with disablements" have to do with this anyway?
I don't get it.


Nicolas
Olliver Schinagl July 27, 2022, 7:47 a.m. UTC | #126
Hey Nicolas,

On 20-07-2022 22:21, Nicolas Pitre wrote:
> On Wed, 20 Jul 2022, Miguel Ojeda wrote:
> 
>> On Mon, Jul 18, 2022 at 8:56 AM Olliver Schinagl
>> <oliver+list@schinagl.nl> wrote:
>>
>>> Secondly, and more importantly so; you argue 'who cares about people
>>> with disablements, atleast its equally hard to read everywhere' which is
>>> a very poor argument :p
>>
>> No, and I want to be __very__ clear about this: at no point I have
>> argued "who cares about people with disabilities" or anything like it.
>> It is insulting that you even suggest it.
> 
> What "people with disablements" have to do with this anyway?
> I don't get it.
If you are talking to me: simple. Reading disabilities (dyslexia being
the most common one) are real :) and code style heavily impacts those
of us who have them.

For example, I have a really, really hard time reading 2-space indent,
especially in larger code bases. Also, CamelCase is very hard for me to
read, as are statements without spaces, `if((x<y&0xf)||z>a)` for
example.
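
To show what I mean (just a sketch of mine; `dense` and `spaced` are
made-up names), here is that same kind of condition in Rust, written
densely and then spaced out; `rustfmt` would produce something like the
spaced form by default:

```rust
// The two functions compute the exact same condition; only the
// formatting differs. The dense form is what I struggle to parse.
fn dense(x: i32, y: i32, z: i32, a: i32) -> bool {
    (x<y&0xf)||z>a
}

// The spaced form, with explicit grouping, reads far more easily.
fn spaced(x: i32, y: i32, z: i32, a: i32) -> bool {
    (x < (y & 0xf)) || z > a
}

fn main() {
    // Spot-check that both really agree on a few inputs.
    for &(x, y, z, a) in &[(1, 7, 0, 5), (20, 7, 9, 5), (3, 16, 1, 0)] {
        assert_eq!(dense(x, y, z, a), spaced(x, y, z, a));
    }
    println!("ok");
}
```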

So code style affects those with reading disabilities, but this is not
well known to people without them, which is why I was raising
awareness.

Hope this helps to explain things?

> 
> 
> Nicolas
Olliver Schinagl July 27, 2022, 8:05 a.m. UTC | #127
Hey Miguel,

Just to make it very clear from the beginning: while this is a
sensitive subject and I am very passionate about it, in no way am I
being offensive, insulting or rude. This would be obvious in a f2f
discussion of course, but alas. I do try to format my text as friendly
as I can, but some things get lost in translation ;)

On 20-07-2022 21:23, Miguel Ojeda wrote:
> On Mon, Jul 18, 2022 at 8:56 AM Olliver Schinagl
> <oliver+list@schinagl.nl> wrote:
>>
>> Absolutly, but with the many luitenants, many tree's, and not a single
>> CI source, this would still be tricky in the end; but certainly possible.
> 
> A bot in -next (possibly an existing one) is a single thing to care
> about and the amount of maintainers/trees doesn't have an effect on
> it, thus I don't think it would be tricky.
> 
>> I think this is a pretty poor argument for following Rust's opinionated
>> view of the world. E.g. it's generally bad to copy/paste code to begin
>> with. How many 'bugs' that we know of are copy/paste bugs?
> 
> We will have to disagree. Consistency and simplicity are fine
> arguments in my book, not "pretty poor" ones.
Consistency is absolutely important! Zero argument there. My argument
is that the consistency should be within the kernel tree, not 'the rest
of the world is using style X/Y/Z, let's be consistent with that'. In a
utopia, maybe, but the real world doesn't work that way, sadly. So in
an attempt to standardize (rustfmt) they just "invented" a new
standard. Which, btw, is common; we see this happening every so often,
right?

> 
> I don't see the relevance of the copy/paste code discussion here. But
> assuming the analogy makes sense, I don't agree that reusing code is
> "generally bad" either.
Copy/pasting is known to cause bugs; there's actually research from
NASA on that. Code reuse (libraries/functions) is not bad. But (worst
kind of example) copy/pasting from Stack Overflow, or copy/pasting
stuff without actually looking at the content and forgetting to rename
something, causes bugs. Why is this relevant? The whole 'let's be
consistent with the Rust codebase of the world' argument. E.g. if
everybody uses the same style (which is idealistic and great) then
copy/pasting becomes consistent. Whereas I say: be careful when
copy/pasting code.

> 
> Anyway, given you mention "bugs", I think you are implying that the
> defaults are somehow "incorrect" (not accessible?). In that case, to
> improve things for all Rust developers out there, I would suggest
> opening an issue in https://github.com/rust-dev-tools/fmt-rfcs.
There have been; I've linked them in my first post. The devs basically
say 'you are right, we are sorry; but this discussion is over', which
reads as 'we love our style, we think it's great, we're not changing it
for people with reading disabilities, figure it out'. (Yes, I
paraphrase it much more harshly than what they state, but it is like
people in wheelchairs who run into a small staircase (1 or 2 treads)
and are then told: sorry, it's been like this for years, we can't fix
this, it is what it is.)

> 
>> Secondly, and more importantly so; you argue 'who cares about people
>> with disablements, atleast its equally hard to read everywhere' which is
>> a very poor argument :p
> 
> No, and I want to be __very__ clear about this: at no point I have
> argued "who cares about people with disabilities" or anything like it.
> It is insulting that you even suggest it.
I apologize for making you feel insulted; that was surely not my
intent! What I am stating, however, is that by saying 'but rustfmt is
great, their standard is consistent and simple and amazing', the
message implies (no offense intended!) that reading disabilities do not
matter, because this new standard is so great.

> 
> Likewise, you are the one claiming it is "hard to read", not me.
Yes, I do suffer from a reading disability, so I know how hard this is
:) but fear not, I'm not alone, just vocal.

> 
> And then after constructing those straw men, you call them "a very
> poor argument"...
Obviously I do not intend to construct straw men, as to me this is all
very real and painful :)

> 
>> Finally, it must of course be mentioned, that rust is really trying to
>> do an XKCD here, https://xkcd.com/927/ though I'm sure we'll get it
>> right this time around ;)
> 
> How does that even apply here? There is no "standard" for formatting
> across languages, if that is what you are saying.
I'm not. I'm saying 'every language has its own standard(s); let's make
one that is better than the others'. So instead of rust following, for
example, the Linux kernel standard (or the Go coding standard, or
X/Y/Z), they came up with their own. Not bad, but as mentioned earlier,
it requires careful thinking. But it is of course their choice!

> 
> Actually, what is happening here is that there is an "official" tool,
> called rustfmt, that most Rust code out there uses.
> 
> By not using it, it is you the one that may be creating a XKCD
> situation, if anything.
No, do use it! rustfmt is pretty amazing. And rustfmt knows there's not
a single answer to coding style, so the rustfmt tool is super
configurable, trying to match any code base's needs without forcing
anybody's style. It is actually, like you say below, the defaults that
come from the rust group itself.
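
As a sketch of what I mean (option names are rustfmt's own, though some
of them, like `brace_style`, have historically required a nightly
toolchain; the exact values here are only my guess at a kernel-like
mapping), a `rustfmt.toml` nudging the defaults toward kernel C style
might look like:

```toml
# Hypothetical rustfmt.toml approximating kernel C style.
hard_tabs = true    # indent with tabs, so readers can pick their own width
tab_spaces = 8      # a tab renders as 8 columns by default
max_width = 100     # a relaxed line-length limit, kernel-like
# Nightly-only at the time of writing; puts braces on their own line.
brace_style = "AlwaysNextLine"
```

Dropping such a file at the root of a project is all it takes; `cargo
fmt` or a plain `rustfmt` invocation picks it up automatically.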

> 
> And to be clear, we don't necessarily follow "Rust's biased view". For
> instance, there is also an "official" build tool, called Cargo, that
> most Rust code out there uses; yet we are not using it for the kernel.
I'd prefer if we keep it to style and readability (rustfmt) :p as cargo 
is more a technical direction, and not relevant; but point noted.

> 
> We are actually doing things how we think are best for the kernel. Not
> because "Rust" (whatever or whoever that is) is "trying to do an
> XKCD". Not because we are "following Rust's opinionated view of the
> world" or "Rust's biased view".
But if that is the case, why not try to follow the kernel's existing
code style as closely as possible with the rustfmt configuration? I
know code style has been discussed a few times over the decades, but
not many changes have been made; surely, if there were some code style
changes that are best for the kernel, they would have been 'advised'?
'4-space indents are better than 8-wide tabs; on new code, try to use
them', for example :p

> 
>> But you are not, only those that follow Rust's biased view. Everybody
>> else that has a different opinion (like die-hard C programmers) who
>> cares enough (I'm sure there's plenty) would set up their rustfmt
>> config file to resemble their C code; and thus the entire premise is
>> broken. Though, yes, in a perfect world it could have worked like
>> this, but xkcd again :)
> 
> No. I said we are being consistent with the majority of the Rust code
> out there, not with "everybody".
But why? Why should we not be consistent with the kernel's code base 
(which, yes, is not Rust but C; we can still follow the same style)?

> 
> If, instead, we try to be consistent with the kernel C style, then you
> are likely not being consistent with the majority of the Rust code out
> there. And you would have to decide exactly how to map the C style to
> Rust constructs and which particular kernel style.
But is this a bad thing? Being consistent within the kernel repo? Who 
cares what the rest of the Rust code does? I know it matters for 
user space; but I know that my user-space Rust code (be it Linux, or 
micro) actually follows the kernel style, not the Rust style :p Because 
of my disability, the Rust format is not easy to read/parse in my head 
due to small inconsistencies.

> 
> Again: I personally don't mind what the particular style is. As a
> project, what we value the most is having a single style across it and
> not having to think about formatting. Nevertheless, I think there is
> also value in being consistent with the majority of the Rust code out
> there.
So I fully agree with the first part, but not with the last part :p As 
the Rust code style is poor on readability for people with 
reading disabilities.


> 
>> This though is a fair argument I understand, it would be weird in having
>> 2 styles in user-space and kernel-space code; though I see this
>> happening today as well; where developers follow kernel style for kernel
>> code (obviously) but use their preferred 2 or 3 space style on their
>> userland code. Trying to 'force' this, usually however never gets the
>> intended result ...
> 
> If we follow the usual Rust style in the kernel, I would say it is
> more likely that both styles match.
That is of course the downside: if user space is writing their own 
code in their own style, it will either be Rust style or something 
completely different; we have no control over this anyway. But having 
it 'match' is 'nice' from a consistency POV.

Sadly, I've seen so much vendor code (yeah, I know) which doesn't even 
have consistency in its own files ...

> Cheers,
> Miguel
Nicolas Pitre July 27, 2022, 1:32 p.m. UTC | #128
On Wed, 27 Jul 2022, Olliver Schinagl wrote:

> Hey Nicolas,
> 
> On 20-07-2022 22:21, Nicolas Pitre wrote:
> > On Wed, 20 Jul 2022, Miguel Ojeda wrote:
> > 
> >> On Mon, Jul 18, 2022 at 8:56 AM Olliver Schinagl
> >> <oliver+list@schinagl.nl> wrote:
> >>
> >>> Secondly, and more importantly so; you argue 'who cares about people
> >>> with disablements, atleast its equally hard to read everywhere' which is
> >>> a very poor argument :p
> >>
> >> No, and I want to be __very__ clear about this: at no point I have
> >> argued "who cares about people with disabilities" or anything like it.
> >> It is insulting that you even suggest it.
> > 
> > What "people with disablements" have to do with this anyway?
> > I don't get it.
> If you are talking to me; simple, reading disabilities (dyslexia being the
> most common one) are real :) and code-style heavily impacts those.
> 
> For example, I have a really really hard time reading 2 space indent,
> especially in larger code bases. Also, CamelCase is very very hard for me to
> read also, as is statements without spaces `if((x<y&0xf)||z>a)` for example.

OK, that's a good point.

I asked because such arguments are often brought up in the context of 
sight impairment by people who are not affected, and in this regard I'm 
well positioned to call those arguments unfounded.

> So code style affects those with reading disabilities, but this is not
> well known to people without them, which is why I was raising awareness.


Nicolas
Gary Guo July 28, 2022, 10:21 a.m. UTC | #129
Hi Olliver,

On Wed, 27 Jul 2022 10:05:31 +0200
Olliver Schinagl <oliver+list@schinagl.nl> wrote:

> Consistency is absolutely important! Zero argument there. My argument
> is, the consistency should be within the kernel tree, not 'but the
> rest of the world is using style X/Y/Z, let's be consistent with
> that'. In a utopia, maybe, but the real world doesn't work that way,
> sadly. So in an attempt to standardize (rustfmt) they just "invented"
> a new standard. Which btw is common, we see this happening every so
> often, right?

Different languages have different characteristics, and I don't think
it's necessarily good (and often isn't) to blindly apply the coding
style of one language onto another. So I don't see rustfmt as "inventing
yet another standard", really, because there aren't many conflicting
coding style standards in the Rust world; almost everyone has settled on
using rustfmt, mostly with the defaults, maybe with a few small
project-specific configuration tweaks.

A small example for C and Rust differences:

Rust requires braces around branches of if expression, and C doesn't.
So in kernel coding style you often have:

	if (condition) do_something();

Or

	if (condition)
		do_something();

But in Rust it will be:

	if condition {
	    do_something();
	}

That's just an example of one control-flow construct. There are
differences between Rust match and C switch, etc. Rust's official
coding style takes properties of Rust into consideration, so in many
regards it's a more suitable coding style for Rust code in the kernel
than applying the kernel's C coding standard directly to the kernel's
Rust code.

Your earlier email in the thread also mentions indentation, and I
have a few things to point out as well.

First, Rust code typically requires more levels of indentation than C
code. For example, many functions might be methods and they are inside
an impl block, which creates one extra level of indentation.
Statements inside match arms' block are two levels more indented than
the match statement itself, as opposed to C's switch (as kernel coding
style doesn't indent the case labels). As a result, 8 spaces for 1 level
can be a bit excessive for Rust code, hence the 4-space indentation
used by rustfmt's default.
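To make the extra nesting concrete, here is a toy example (all names invented for illustration): the match arm bodies already sit three indentation levels deep, whereas the equivalent C switch under kernel style would have its statements at a single level.

```rust
// Toy illustration of Rust's deeper nesting: impl block + fn + match.
// With 8-column tabs the arm bodies would start at column 24; with
// rustfmt's default of 4 spaces per level they start at column 12.
struct Pin {
    level: u8,
}

impl Pin {
    fn describe(&self) -> &'static str {
        match self.level {
            0 => {
                "low" // already three indentation levels deep
            }
            _ => {
                "high"
            }
        }
    }
}

fn main() {
    assert_eq!(Pin { level: 0 }.describe(), "low");
    assert_eq!(Pin { level: 3 }.describe(), "high");
}
```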

Secondly, I don't think the argument about tabs being customisable
holds; in kernel coding style tabs are strictly 8 characters. For line
continuation it's not uncommon to use a series of tabs followed by a
few spaces, e.g.

	int function_name(int first_argument,
	< tab  >< tab  >..int second_argument)

rendering a tab as 4 spaces will break the layout. (And I'll not go
into the well-known reasons why non-8-column tabs mess up code in
terminals, etc.)

> Copy/pasting is known to cause bugs. There's actually research from
> NASA on that. Code-reuse (libraries/functions) are not bad. But
> (worst kind of example) copy paste from stack-overflow, or
> copy/pasting stuff without actually looking at the content and
> forgetting to rename something, causes bugs. Why is this relevant?
> The whole 'let's be consistent with the Rust code base of the world'
> argument. E.g. if everybody uses the same style (which is idealistic
> and great) then copy/pasting becomes consistent. Where I say, try to
> be careful when copy/pasting code.

When we vendor in code as a whole (e.g. as we currently do for the
alloc crate), it is proper code reuse. With a different coding style the
vendored code either diverges from upstream (which makes upstreaming
much more difficult) or diverges from the rest of the kernel's Rust
code base.

> But if that is the case, why not try to follow the kernel's existing
> code style as closely as possible with the rustfmt configuration? I
> know code style has been discussed a few times over the decades, but
> not many changes have been made; surely, if there were some code-style
> changes that are best for the kernel, they would have been 'advised'?
> '4-space indents are better than 8-wide tabs; on new code, try to use
> them', for example :p

You do realize that you are creating a new coding style by doing this,
right? It feels like creating problems rather than solving problems.

My personal feeling is that it's easier for me to adapt to different
coding style when switching between languages, but it's rather awkward
for me when trying to use different coding styles with the same
language. I have no problem switching between 2 spaces when
coding JavaScript to 4 spaces when coding Rust to 8 spaces(tab) when
coding C, but it's rather painful to switch between C projects with
different coding styles. I certainly don't want to switch between Rust
projects with vastly different coding styles.

> But why? Why should we not be consistent with the kernels' code-base 
> (while yes, that is not rust, but C, but we can follow the same
> style?)

Different languages have different characteristics, and one size
doesn't fit them all :)

> Sadly, I've seen so much vendor code (yeah, I know) which doesn't
> even have consistency in their own files ...

That's very true. So given that all other Rust code currently follows
(roughly) the same coding style and this situation doesn't currently
exist there, let's not make it worse...

Best,
Gary
Greg KH July 28, 2022, 12:09 p.m. UTC | #130
On Thu, Jul 28, 2022 at 11:21:14AM +0100, Gary Guo wrote:
> Rust requires braces around branches of if expression, and C doesn't.
> So in kernel coding style you often have:
> 
> 	if (condition) do_something();

That is not a valid kernel coding style, and our tools should catch this
and prevent it from being added to the kernel tree.

thanks,

greg k-h
Gary Guo July 28, 2022, 12:28 p.m. UTC | #131
On Thu, 28 Jul 2022 14:09:57 +0200
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> On Thu, Jul 28, 2022 at 11:21:14AM +0100, Gary Guo wrote:
> > Rust requires braces around branches of if expression, and C
> > doesn't. So in kernel coding style you often have:
> > 
> > 	if (condition) do_something();  
> 
> That is not a valid kernel coding style, and our tools should catch
> this and prevent it from being added to the kernel tree.

Thanks, I stand corrected. I do see those patterns occasionally, so
presumably that gave me the wrong impression.

	rg '\bif \(.*;' $(find . -name '*.c') | wc -l

gives me 3362 results, though. Most of them are from the fs and driver
directories, so presumably they are vendored or legacy code.

Best,
Gary
Olliver Schinagl July 28, 2022, 8:43 p.m. UTC | #132
Hey Gary,

On 28-07-2022 12:21, Gary Guo wrote:
> Hi Olliver,
> 
> On Wed, 27 Jul 2022 10:05:31 +0200
> Olliver Schinagl <oliver+list@schinagl.nl> wrote:
> 
>> Consistency is absolutely important! Zero argument there. My argument
>> is, the consistency should be within the kernel tree, not 'but the
>> rest of the world is using style X/Y/Z, let's be consistent with
>> that'. In a utopia, maybe, but the real world doesn't work that way,
>> sadly. So in an attempt to standardize (rustfmt) they just "invented"
>> a new standard. Which btw is common, we see this happening every so
>> often, right?
> 
> Different languages have different characteristics and I don't think
> it's necessarily good (and often isn't) to blindly apply coding style
> of one language onto another. So I don't see rustfmt as "inventing yet
> another standard" really, because there aren't many conflicting coding
> style standards in Rust world; almost everyone just settled on using
> rustfmt, mostly using the default, maybe with a few small
> project-specific configuration tweaks.
I was mostly arguing about a) let's look at this and b) having said 
configuration tweaks, rather than blindly (pun not really intended) 
going with Rust's defaults :)

> 
> A small example for C and Rust differences:
> 
> Rust requires braces around branches of if expression, and C doesn't.
> So in kernel coding style you often have:
> 
> 	if (condition) do_something();
> 
> Or
> 
> 	if (condition)
> 		do_something();
> 
> But in Rust it will be:
> 
> 	if condition {
> 	    do_something();
> 	}
So kernel style kind of says 'no braces around single statements'; but 
if your Rust compiler doesn't allow this, well, then there's nothing to 
do. You could even argue to update the kernel C style on this to make 
it consistent again. BUT, this inconsistency makes it cognitively 
'hard'. Is this a C or a Rust function, for example, during a review? 
During authoring, when writing both C and Rust code (out of necessity, 
not constant context switching) you constantly have to go back and 
forward cognitively. While I'm sure there are people here who can do 
this all day without problem, some of us find it harder than it needs 
to be. Hence the request to _try_ to keep consistency within the 
kernel tree.

> 
> That's just an example of one control-flow construct. There are
> differences between Rust match and C switch, etc. Rust's official
> coding style takes properties of Rust into consideration, so in many
> regards it's a more suitable coding style for Rust code in the kernel, than
> applying kernel's C coding standard directly on kernel's Rust code.
> 
> Your earlier email in the thread also mentions about indentation, and I
> have a few things to point out as well.
> 
> First, Rust code typically requires more levels of indentation than C
> code. For example, many functions might be methods and they are inside
> an impl block, which creates one extra level of indentation.
> Statements inside match arms' block are two levels more indented than
> the match statement itself, as opposed to C's switch (as kernel coding
> style doesn't indent the case labels). As a result, 8 spaces for 1 level
> can be a bit excessive for Rust code, and thus the 4 space indentation
> used in rustfmt default.
> 
> Secondly, I don't think the argument about tabs being customisable
> holds; in kernel coding style tabs are strictly 8 characters. For line
Sure, this rule implies that for alignment, tabs should be set to 8 so 
things align nicely. However, nobody forces people to set their editor 
to 8-character width. Not doing so doesn't break anything. At worst, 
you may commit something that is poorly aligned (but we _should_ be 
using tabs to indent and spaces to align anyway :p, tab == indent has 
meaning).

With non-tab indentation, this is no longer really possible, or at 
least, editors haven't solved that problem yet, as it tends to still 
break (due to the mixing of indentation and alignment using a single 
character). Maybe once we have AI and ML in our editors though :p

> continuation it's not uncommon to use a series of tabs followed by a
> few spaces, e.g.
> 
> 	int function_name(int first_argument,
> 	< tab  >< tab  >..int second_argument)
> 
> changing tab into 4 spaces will break the layout. (and I'll not go into
> well-known reasons about non-4-space-tab messing up code in terminal
> etc).
> 
>> Copy/pasting is known to cause bugs. There's actually research from
>> NASA on that. Code-reuse (libraries/functions) are not bad. But
>> (worst kind of example) copy paste from stack-overflow, or
>> copy/pasting stuff without actually looking at the content and
>> forgetting to rename something, causes bugs. Why is this relevant?
>> The whole 'let's be consistent with the Rust code base of the world'
>> argument. E.g. if everybody uses the same style (which is idealistic
>> and great) then copy/pasting becomes consistent. Where I say, try to
>> be careful when copy/pasting code.
> 
> When we vendor in code as a whole (e.g. like we currently do for
> alloc crate), it is proper code reuse. With different coding style the
> vendored code either diverges from upstream (which makes upstreaming
> much more difficult) or diverge from rest of kernel's Rust code base.
Very fair point of course. Though really, we should fix the upstream 
Rust preferred format; but there it was already stated, 'too bad, 
sorry', which from a developer point of view is fine: your project, 
your choice. From a disabilities point of view, it sucks of course.

> 
>> But if that is the case, why not try to follow the kernel's existing
>> code style as closely as possible with the rustfmt configuration? I
>> know code style has been discussed a few times over the decades, but
>> not many changes have been made; surely, if there were some code-style
>> changes that are best for the kernel, they would have been 'advised'?
>> '4-space indents are better than 8-wide tabs; on new code, try to use
>> them', for example :p
> 
> You do realize that you are creating a new coding style by doing this,
> right? It feels like creating problems rather than solving problems.
> 
> My personal feeling is that it's easier for me to adapt to different
> coding style when switching between languages, but it's rather awkward
> for me when trying to use different coding styles with the same
> language. I find myself no problem switching between 2 spaces when
> coding JavaScript to 4 spaces when coding Rust to 8 spaces(tab) when
> coding C, but it's rather painful to switch between C projects with
> different coding styles. I certainly don't want to switch between Rust
> projects with vastly different coding styles.
And I'm happy for you that you can easily take in 2 and 4 spaces. For 
me, it is extremely hard to read. So it's not a 'personal preference' 
thing. But I suggest reading the earlier posted links, where others 
explain at length as well what it is like to feel excluded because it's 
just hard to read.

> 
>> But why? Why should we not be consistent with the kernels' code-base
>> (while yes, that is not rust, but C, but we can follow the same
>> style?)
> 
>> Different languages have different characteristics, and one size
>> doesn't fit them all :)
I'm not even arguing this at all :)

I think the biggest issues I'm speaking of really are the braces and 
the spaces, where the braces can be argued for/against; it's 
cognitively harder, but can be dealt with (and we can expect 
inconsistencies); but the spaces-vs-tabs thing, personal configuration 
vs forced width, is the point I was trying to raise.

As said before, it's an 'every building is different, some offer 
wheelchair ramps, others don't' kind of point, not a 'this building is 
red, and that one is blue, and not every color fits all' :p

> 
>> Sadly, I've seen so much vendor code (yeah, I know) which doesn't
>> even have consistency in their own files ...
> 
> That's very true. So when all other Rust code currently follow
> (roughly) the same coding style and this situation doesn't currently
> exist, let's not make it worse...
> 
> Best,
> Gary
Olliver Schinagl July 28, 2022, 8:45 p.m. UTC | #133
Hey Greg,

On 28-07-2022 14:09, Greg Kroah-Hartman wrote:
> On Thu, Jul 28, 2022 at 11:21:14AM +0100, Gary Guo wrote:
>> Rust requires braces around branches of if expression, and C doesn't.
>> So in kernel coding style you often have:
>>
>> 	if (condition) do_something();
> 
> That is not a valid kernel coding style, and our tools should catch this
> and prevent it from being added to the kernel tree.
Are you sure? I'm not sure if this isn't true today, but I've certainly 
seen old code where this definitely was done. Was all of this cleaned up 
in the last 2+ years?

Olliver

> 
> thanks,
> 
> greg k-h
Greg KH July 29, 2022, 8:04 a.m. UTC | #134
On Thu, Jul 28, 2022 at 10:45:08PM +0200, Olliver Schinagl wrote:
> Hey Greg,
> 
> On 28-07-2022 14:09, Greg Kroah-Hartman wrote:
> > On Thu, Jul 28, 2022 at 11:21:14AM +0100, Gary Guo wrote:
> > > Rust requires braces around branches of if expression, and C doesn't.
> > > So in kernel coding style you often have:
> > > 
> > > 	if (condition) do_something();
> > 
> > That is not a valid kernel coding style, and our tools should catch this
> > and prevent it from being added to the kernel tree.
> Are you sure? I'm not sure if this isn't true today, but I've certainly seen
> old code where this definitely was done. Was all of this cleaned up in the
> last 2+ years?

Given that I wrote about this back in 2002, and it was true then:
	https://www.linuxjournal.com/article/5780
and:
	https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.2.8887&rep=rep1&type=pdf

that is not anything new at all.

Yes, old code still survives that might not be correct, and some
subsystems might have added code over time without the proper style, but
our tools check that the above is not correct, you can check it
yourself:

$ cat foo.c
// SPDX-License-Identifier: GPL-2.0
int foo(int baz)
{
	if (baz == 1) do_something();
}

$ ./scripts/checkpatch.pl --file --terse foo.c
foo.c:4: ERROR: trailing statements should be on next line
total: 1 errors, 0 warnings, 6 lines checked

thanks,

greg k-h
Olliver Schinagl Oct. 15, 2022, 2:16 p.m. UTC | #135
As this thread kind of went silent and as the 'big merge' for this 
feature is getting closer, here is a final plea, inspired by this 
slashdot post [0].

The post itself speaks of a new team forming to work on the Rust style 
guide, which in itself is still evolving. This makes sense: Rust is 
new, it's not very commonly in use yet, and as with all good things, it 
evolves.

One comment in that slashdot post [1] I want to bring forward and quote 
a piece of:
"i created a new repository, and thought i was being hip and modern, so
i started to evangelize spaces for the 'consistency across environments'

i get approached by not one, but TWO coworkers who unfortunately are 
highly visually impaired and each has a different visual impairment

at that moment, i instantaneously conceded — there's just no 
counter-argument that even comes close to outweighing the accessibility 
needs of valued coworkers"

Visual impairment is a thing; it does not make someone smarter or 
dumber. Helping those with visual impairments should be considered, and 
not shunted off by saying 'but the Rust guys came up with the perfect 
style, so we should use it'.

Find attached a diff to the .rustfmt.toml that should keep things more 
consistent with the current kernel style.

I'll leave it now to Linus and Greg to consider this, and will keep my 
peace (though I hope they actually read it :p).


Olliver


[0]: 
https://developers.slashdot.org/story/22/10/07/2351222/rust-programming-language-announces-new-team-to-evolve-official-coding-style
[1]: https://developers.slashdot.org/comments.pl?sid=22182701&cid=62949323

Bagas Sanjaya Oct. 16, 2022, 1:44 a.m. UTC | #136
On 10/15/22 21:16, Olliver Schinagl wrote:
> As this thread kind of went silent and as the 'big merge' for this feature is getting closer, here is a final plea, inspired by this slashdot post [0].
> 
> The post in itself speaks of a new team forming on working on the Rust styleguide, which in itself is still evolving. This makes sense, rust is new, it's not very commonly in use and as with all good things, they evolve.
> 
> One comment in that slashdot post [1] I want to bring forward and quote a piece of:
> "i created a new repository, and thought i was being hip and modern, so
> i started to evangelize spaces for the 'consistency across environments'
> 
> i get approached by not one, but TWO coworkers who unfortunately are highly visually impaired and each has a different visual impairment
> 
> at that moment, i instantaneously conceded — there's just no counter-argument that even comes close to outweighing the accessibility needs of valued coworkers"
> 
> Visual impairment is a thing; it does not make someone smarter or dumber. Helping those with visual impairments should be considered, and not shunted off by saying 'but the Rust guys came up with the perfect style, so we should use it'.
> 
> Find attached, a diff to the .rustfmt.toml, that should keep things more consistent with the current kernel style.
> 
> I'll leave it now to Linus and Greg to consider this, and will keep my peace (though I hope they actually read it :p).
> 

I have two pieces of advice:

First, don't top-post. I don't know what context you're replying to
(in fact I had to cut the reply context below your message).

Second, please post the patch inline, not attached. git format-patch +
git send-email should suffice.

Thanks.
Bagas Sanjaya Oct. 16, 2022, 1:50 a.m. UTC | #137
On 10/16/22 08:44, Bagas Sanjaya wrote:
> 
> I have two pieces of advice:
> 
> First, don't top-post. I don't know what context you're replying to
> (in fact I had to cut the reply context below your message).
> 
> Second, please post the patch inline, not attached. git format-patch +
> git send-email should suffice.
> 

Oh, I forgot to mention this. Is it an RFC? If so, you need to specify
--subject-prefix="RFC PATCH" to git format-patch.

Thanks.
Josh Triplett Oct. 16, 2022, 1:23 p.m. UTC | #138
On Sat, Oct 15, 2022 at 04:16:14PM +0200, Olliver Schinagl wrote:
> +indent_style = "Visual"

Without commenting on the rest of this: visual indent style produces a
*lot* of diff noise, and I'd strongly recommend against it. Because it
lines things up, a change to one line can change many adjacent lines,
and make it hard to see what actually changed.
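As a hypothetical illustration (names made up), renaming a single
function under visual indentation re-aligns every continuation line, so
a one-token change becomes a multi-line diff:

```diff
-    let sum = accumulate(first_value,
-                         second_value);
+    let sum = accumulate_checked(first_value,
+                                 second_value);
```

With block indentation, only the first of those lines would change.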