Message ID: 20240313110515.70088-1-nmi@metaspace.dk (mailing list archive)
Series: Rust block device driver API and null block driver
On 3/13/24 04:05, Andreas Hindborg wrote:
> This is the second version of the Rust block device driver API and the Rust null
> block driver. The context and motivation can be seen in cover letter of the RFC
> v1 [1]. If more context is required, a talk about this effort was recorded at
> LPC [2]. I hope to be able to discuss this series at LSF this year [3].

Memory safety may land in C++ in the near future (see also
https://herbsutter.com/2024/03/). If memory-safe C++ or memory-safe C
would be adopted in the kernel, it would allow writing memory-safe
drivers without having to add complicated bindings between existing C
code and new Rust code. Please do not take this as personal criticism -
I appreciate the effort that has been spent on coming up with great
Rust bindings for the Linux kernel block layer.

Thanks,

Bart.
On Wed, Mar 13, 2024 at 11:02:23AM -0700, Bart Van Assche wrote:
> On 3/13/24 04:05, Andreas Hindborg wrote:
> > This is the second version of the Rust block device driver API and the Rust null
> > block driver. The context and motivation can be seen in cover letter of the RFC
> > v1 [1]. If more context is required, a talk about this effort was recorded at
> > LPC [2]. I hope to be able to discuss this series at LSF this year [3].
>
> Memory safety may land in C++ in the near future (see also
> https://herbsutter.com/2024/03/). If memory-safe C++ or memory-safe C
> would be adopted in the kernel, it would allow writing memory-safe
> drivers without having to add complicated bindings between existing C

I honestly doubt it; memory safety is not free. You will still want an
unsafe part for performance reasons (or for interacting with hardware),
plus a safe API on top of it for driver development. I don't think that
part will be gone with a memory-safe C++, so the complication still
exists. But I'm happy to be proved wrong ;-)

Regards,
Boqun

> code and new Rust code. Please do not take this as personal criticism -
> I appreciate the effort that has been spent on coming up with great
> Rust bindings for the Linux kernel block layer.
>
> Thanks,
>
> Bart.
>
Boqun Feng <boqun.feng@gmail.com> writes:
> On Wed, Mar 13, 2024 at 11:02:23AM -0700, Bart Van Assche wrote:
>> On 3/13/24 04:05, Andreas Hindborg wrote:
>> > This is the second version of the Rust block device driver API and the Rust null
>> > block driver. The context and motivation can be seen in cover letter of the RFC
>> > v1 [1]. If more context is required, a talk about this effort was recorded at
>> > LPC [2]. I hope to be able to discuss this series at LSF this year [3].
>>
>> Memory safety may land in C++ in the near future (see also
>> https://herbsutter.com/2024/03/). If memory-safe C++ or memory-safe C
>> would be adopted in the kernel, it would allow writing memory-safe
>> drivers without having to add complicated bindings between existing C
>
> I honestly doubt it, memory-safe is not free, basically you will still
> want unsafe part for the performance reason (or interacting with
> hardware), and provide a safe API for driver development. I don't think
> that part will be gone with a memory-safe C++. So the complication still
> exists. But I'm happy to be proved wrong ;-)

I think it is great that people are starting to realize that bringing
memory safety to other systems languages is a good idea. But from one
person blogging about it to things being ready for production is a long
journey. Language designers have to design ways to express the new
semantics, standards committees have to agree, and compiler engineers
have to build and test their compilers. We will probably need a few
cycles of this to get things right. At any rate, it is going to take a
while.

Second, as Boqun says, interfacing the unsafe C parts of the kernel
with memory-safe C++ is going to require the same complicated
abstraction logic that the safe Rust APIs are bringing. The complexity
in bringing Rust to the kernel does not come from interfacing a foreign
language; it stems from the inherent difficulty of designing memory-safe
wrappers around unsafe C APIs.

Best regards,
Andreas
On 3/13/24 12:03, Andreas Hindborg wrote:
> I think it is great that people are starting to realize that bringing
> memory safety to other systems languages is a good idea.

Totally agree :-)

> But from one person blogging about it to things being ready for
> production is a long journey.

In case you are not aware of this, Herb Sutter is not a random blogger -
he has been the chair of the ISO C++ standards committee since 2002.

Thanks,

Bart.
On Wed, Mar 13, 2024 at 11:02:23AM -0700, Bart Van Assche wrote:
> On 3/13/24 04:05, Andreas Hindborg wrote:
> > This is the second version of the Rust block device driver API and the Rust null
> > block driver. The context and motivation can be seen in cover letter of the RFC
> > v1 [1]. If more context is required, a talk about this effort was recorded at
> > LPC [2]. I hope to be able to discuss this series at LSF this year [3].
>
> Memory safety may land in C++ in the near future (see also
> https://herbsutter.com/2024/03/). If memory-safe C++ or memory-safe C
> would be adopted in the kernel, it would allow writing memory-safe
> drivers without having to add complicated bindings between existing C
> code and new Rust code. Please do not take this as personal criticism -
> I appreciate the effort that has been spent on coming up with great
> Rust bindings for the Linux kernel block layer.

You know, this reminds me of when I was at Intel working on NVMe. We
had a product all ready to ship and a manager suggested that we delay
shipping it until the SCSI-over-PCI standard was finished and ratified
so that the same piece of silicon could support both. Fortunately, said
manager was overruled, but it's a great example of how this kind of FUD
can actually work.
On Wed, 2024-03-13 at 11:02 -0700, Bart Van Assche wrote:
> On 3/13/24 04:05, Andreas Hindborg wrote:
> > This is the second version of the Rust block device driver API and
> > the Rust null
> > block driver. The context and motivation can be seen in cover
> > letter of the RFC
> > v1 [1]. If more context is required, a talk about this effort was
> > recorded at
> > LPC [2]. I hope to be able to discuss this series at LSF this year
> > [3].
>
> Memory safety may land in C++ in the near future (see also
> https://herbsutter.com/2024/03/). If memory-safe C++ or memory-safe C
> would be adopted in the kernel, it would allow writing memory-safe
> drivers without having to add complicated bindings between existing C
> code and new Rust code.

Correct, but it would also most likely allow people to just arbitrarily
ignore the "modern, memory-safe C" (or C++) and write it the old way.
Below the surface, those discussions are frequently really about the
question: can you (and do you want to) _force_ people?

The kernel's C already has more memory safety than standardized C:
there's devres, and since last year there's the __cleanup attribute -
but the thing is, you can just ignore them and do it the old way. Once
you give people freedom (which is frequently necessary), you'll get
people who ignore "the right way to do it". You certainly get them once
thousands of people are participating in your project.

Actually, Rust in userspace has a similar problem: your co-programmers
can call unwrap(), just ignoring those neat Result types and blowing up
your thread. So you have to review and reject that.

One of the stronger arguments behind the push for Rust is that the
language by design forces you to obey, because otherwise the compiler
will just reject building.

P.

> Please do not take this as personal criticism -
> I appreciate the effort that has been spent on coming up with great
> Rust bindings for the Linux kernel block layer.
>
> Thanks,
>
> Bart.
>
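[For readers less familiar with Rust, the unwrap() point above can be
illustrated with a short, self-contained userspace sketch; the
`parse_port` function is a hypothetical example, not from the series:]

```rust
// A fallible operation that returns a Result instead of panicking.
fn parse_port(s: &str) -> Result<u16, std::num::ParseIntError> {
    s.trim().parse::<u16>()
}

fn main() {
    // The lazy way: unwrap() panics on bad input and blows up the thread.
    // let port = parse_port("not-a-number").unwrap(); // would panic

    // The disciplined way: the caller must handle the error path,
    // but nothing in the language *forces* reviewers to reject unwrap().
    match parse_port("8080") {
        Ok(port) => println!("listening on {port}"),
        Err(e) => eprintln!("bad port: {e}"),
    }
}
```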
On 3/14/24 05:14, Philipp Stanner wrote:
> On Wed, 2024-03-13 at 11:02 -0700, Bart Van Assche wrote:
>> On 3/13/24 04:05, Andreas Hindborg wrote:
>>> This is the second version of the Rust block device driver API and
>>> the Rust null
>>> block driver. The context and motivation can be seen in cover
>>> letter of the RFC
>>> v1 [1]. If more context is required, a talk about this effort was
>>> recorded at
>>> LPC [2]. I hope to be able to discuss this series at LSF this year
>>> [3].
>>
>> Memory safety may land in C++ in the near future (see also
>> https://herbsutter.com/2024/03/). If memory-safe C++ or memory-safe C
>> would be adopted in the kernel, it would allow writing memory-safe
>> drivers without having to add complicated bindings between existing C
>> code and new Rust code.
>
> Correct, but it would also most likely allow to just arbitrarily ignore
> the "modern, memory safe C" (or C++) and write it the old way.

Linux kernel developers today have the choice between memory-unsafe C
and memory-safe Rust, so they already have the freedom to ignore memory
safety by choosing C. In my opinion, once memory-safe C or C++ is
available, a compiler option should be introduced that can be set per
source file and that makes the build fail for source files that
unintentionally do not follow the memory-safety rules.

> The Kernel's C already has more memory safety than standardized C:
> There's devres, and since last year there's the __cleanup attribute.
> – but the thing is, you can just ignore it and do it the old way.

devres is controversial - see also Laurent Pinchart, "Why is
devm_kzalloc() harmful and what can we do about it", LPC, 2022
(https://lpc.events/event/16/contributions/1227/).

> One of the stronger arguments behind the push for Rust is that the
> language by design forces you to obey, because otherwise the compiler
> will just reject building.

Rust has a very significant disadvantage that memory-safe C/C++ won't
have: supporting Rust means adding Rust bindings for all C functions
called from Rust code. This forces everyone who wants to change an
interface to also change the Rust bindings, and hence will make it
harder to maintain the Linux kernel in its entirety.

Thanks,

Bart.
Just a passer-by here, but I noticed the link to Laurent's talk..

On Thu, Mar 14, 2024 at 10:03:28AM -0700, Bart Van Assche wrote:
> On 3/14/24 05:14, Philipp Stanner wrote:
>
> > The Kernel's C already has more memory safety than standardized C:
> > There's devres, and since last year there's the __cleanup attribute.
> > – but the thing is, you can just ignore it and do it the old way.
>
> devres is controversial - see also Laurent Pinchart, "Why is
> devm_kzalloc() harmful and what can we do about it", LPC, 2022
> (https://lpc.events/event/16/contributions/1227/).

I don't think that's a great thing to cite; that talk prompted a series
of others, with (AFAIK*) the most recent being from Bart at LPC this
year: https://lpc.events/event/17/contributions/16f

The TL;DR is that it's not actually a problem caused by devres.

* I think Wolfram also talked about it at an automotive conference, but
that seemed like a bit of a pitch for funding from the safety conscious
Hi Bart,

Bart Van Assche <bvanassche@acm.org> writes:
> On 3/14/24 05:14, Philipp Stanner wrote:
>> On Wed, 2024-03-13 at 11:02 -0700, Bart Van Assche wrote:

[...]

>> One of the stronger arguments behind the push for Rust is that the
>> language by design forces you to obey, because otherwise the compiler
>> will just reject building.
>
> Rust has a very significant disadvantage that memory-safe C/C++ won't
> have: supporting Rust means adding Rust bindings for all C functions
> called from Rust code. This forces everyone who wants to change an
> interface to also change the Rust bindings and hence will make it
> harder to maintain the Linux kernel in its entirety.

I think you might be missing a key point here. We actually generate
Rust bindings to the existing C kernel automatically; no hand editing
is required, except for some corner cases we currently have with static
methods and certain macros. If we just wanted to call the C APIs
directly, there would be no engineering required - but then the main
reason to deploy Rust would also go away, and we might as well stay in
C.

The actual engineering effort goes into building memory-safe versions
of the C APIs. This requirement will not magically go away, no matter
what memory-safe language (or language extensions) you use to interface
with the existing unsafe C APIs.

Best regards,
Andreas
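[To make the wrapper point concrete for readers outside the Rust-for-Linux
effort: below is a toy userspace sketch of the pattern. The `raw_buf_*`
functions are hypothetical stand-ins for bindgen-generated C bindings;
the real block-layer abstractions are considerably more involved:]

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};

// Stand-ins for unsafe, C-style allocation functions. In the kernel,
// these would be the automatically generated bindings to C symbols.
unsafe fn raw_buf_alloc(len: usize) -> *mut u8 {
    unsafe { alloc_zeroed(Layout::array::<u8>(len).unwrap()) }
}
unsafe fn raw_buf_free(ptr: *mut u8, len: usize) {
    unsafe { dealloc(ptr, Layout::array::<u8>(len).unwrap()) }
}

/// The engineering effort goes into this layer: a safe wrapper where
/// ownership and length are tracked by the type, so safe callers cannot
/// leak, double-free, or index out of bounds.
pub struct Buf {
    ptr: *mut u8,
    len: usize,
}

impl Buf {
    pub fn new(len: usize) -> Buf {
        assert!(len > 0);
        // SAFETY: the layout is valid for len > 0. For this sketch we
        // assume allocation succeeds; a real wrapper returns Result.
        let ptr = unsafe { raw_buf_alloc(len) };
        Buf { ptr, len }
    }
    pub fn write(&mut self, i: usize, v: u8) {
        assert!(i < self.len); // bounds check makes the raw write sound
        unsafe { *self.ptr.add(i) = v }
    }
    pub fn read(&self, i: usize) -> u8 {
        assert!(i < self.len);
        unsafe { *self.ptr.add(i) }
    }
}

impl Drop for Buf {
    fn drop(&mut self) {
        // SAFETY: ptr came from raw_buf_alloc with this same len.
        unsafe { raw_buf_free(self.ptr, self.len) }
    }
}

fn main() {
    let mut b = Buf::new(16);
    b.write(3, 42);
    println!("{}", b.read(3));
}
```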
On Wed, Mar 13, 2024 at 12:05:07PM +0100, Andreas Hindborg wrote:
> - Adopted the folio abstraction where applicable
I don't think this was the correct move. The memory you're managing
with folios is only used for storing the data being stored in the blkdev.
It's not mapped to userspace, it doesn't make use of the flags (locked,
uptodate, writeback, etc), it doesn't need an LRU list, a mapping,
an index, a refcount or memcg.
I think you should use pages instead of folios to keep a handle on
this memory. Apologies for the confusion; we're in the middle of a
years-long transition from the overly complex and overused struct page
to splitting it into different data types for different purposes.
More detail on this here, if you're interested:
https://kernelnewbies.org/MatthewWilcox/Memdescs
Matthew Wilcox <willy@infradead.org> writes:
> On Wed, Mar 13, 2024 at 12:05:07PM +0100, Andreas Hindborg wrote:
>> - Adopted the folio abstraction where applicable
>
> I don't think this was the correct move. The memory you're managing
> with folios is only used for storing the data being stored in the blkdev.
> It's not mapped to userspace, it doesn't make use of the flags (locked,
> uptodate, writeback, etc), it doesn't need an LRU list, a mapping,
> an index, a refcount or memcg.
>
> I think you should use pages instead of folios to keep a handle on
> this memory. Apologies for the confusion; we're in the middle of a
> years-long transition from the overly complex and overused struct page
> to splitting it into different data types for different purposes.

Ok, thanks. I'll swap it back.

I was under the impression that using folios also gives the advantage
that handles are always head pages, so there is no need to worry about
head/tail pages. If the driver moves to using higher-order pages for
larger block sizes, would folios then make sense, or are pages still
preferred?

> More detail on this here, if you're interested:
> https://kernelnewbies.org/MatthewWilcox/Memdescs

Thanks, that is useful.

Best regards,
Andreas
On Sun, Mar 17, 2024 at 08:09:53AM +0100, Andreas Hindborg wrote:
> I was under the impression that using folios also give the advantage
> that handles are always head pages. No need to worry about head/tail
> pages. If the driver moves to use higher order pages for larger block
> sizes, would it then make sense with folios, or are page still
> preferred?

This is a good question. If you do enhance this driver to support
larger block sizes, I would recommend indexing the xarray by sector
number instead of page number. You would allocate a compound page, but
at no point would you need to ask the page what order it is (you
already know), you don't need to change the page flags, etc, etc.

The page cache is in a bit of a different spot. It has to support
multiple folio sizes for the same file; it has to support folios being
split; it has to support lookup from users who don't know what the
folio size is, etc, etc.

I don't think there's ever a point at which you'd need to call
compound_head(), because you'd always look up the head page. You'd
still want an iterator of some kind to copy to a compound page (because
you can only map one page at a time), and that might look a lot like
the folio copying code. Probably the right way to handle this is for
the folio copying code to call the page copying code, since a folio is
always composed of pages.
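[For the record, the sector-number indexing suggested above can be
sketched in a few lines of arithmetic. This is a hypothetical
illustration in plain userspace Rust, not code from the series; it
assumes 512-byte sectors and a power-of-two block size:]

```rust
// 512-byte sectors, as in the kernel's SECTOR_SHIFT.
const SECTOR_SHIFT: u32 = 9;

/// Number of sectors covered by one backing block.
fn sectors_per_block(block_size: u64) -> u64 {
    block_size >> SECTOR_SHIFT
}

/// The xarray index of the block backing `sector`: the first sector of
/// that block. Lookups always use this canonical index, so nothing like
/// compound_head() is ever needed.
fn backing_index(sector: u64, block_size: u64) -> u64 {
    sector & !(sectors_per_block(block_size) - 1)
}

/// Byte offset of `sector` within its backing block.
fn offset_in_block(sector: u64, block_size: u64) -> u64 {
    (sector << SECTOR_SHIFT) & (block_size - 1)
}

fn main() {
    // With 16 KiB blocks, sectors 0..=31 share the block at index 0.
    assert_eq!(backing_index(31, 16384), 0);
    assert_eq!(backing_index(32, 16384), 32);
    assert_eq!(offset_in_block(3, 16384), 1536);
    println!("ok");
}
```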
From: Andreas Hindborg <a.hindborg@samsung.com>

Hi All!

This is the second version of the Rust block device driver API and the
Rust null block driver. The context and motivation can be seen in the
cover letter of the RFC v1 [1]. If more context is required, a talk
about this effort was recorded at LPC [2]. I hope to be able to discuss
this series at LSF this year [3].

The series is still in RFC state, as many dependencies have not yet
been merged.

Changes from v1:

- Improved request lifetime tracking
- Changed the Rust null block driver name to `rnull`
- Added timer completion support to `rnull`
- Added softirq completion support to `rnull`
- Moved to `xarray` for memory backing
- Adopted the folio abstraction where applicable
- Added `SAFETY` comments to all unsafe blocks
- Improved documentation and added examples
- Fixed up `vtable` default method behavior
- Dropped `unsafe` from the vtable builder
- Fixed a safety issue in `RawWriter`
- Dropped the `TagSetRef` abstraction
- Renamed `Bio::iter` -> `Bio::raw_iter`
- Updated `core::fmt::Display for Bio` for readability
- Simplified `BioIterator::next`
- Documented and refactored the `bvec_iter` functions
- Moved cache alignment out of `Lock`
- Added a MAINTAINERS entry

Thanks for all the input on v1!

Performance
===========

Rather than include an abundance of performance numbers in this letter,
I refer to [4] for a comparison of performance with the C null block
driver. Across all of the benchmark configurations, the mean of the
difference is -2.5%, with individual configurations showing a
[-10%;30%] difference in IO/s.

Request lifetime modeling
=========================

While implementing timer completion, it became clear that the request
lifetime modeling applied in the v1 RFC was not adequate. With that
modeling, combined with the timer callback feature, a developer would
be able to write safe Rust code that violates the Rust invariants for
references, thus leading to undefined behavior.
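[The reference-counting scheme that replaces that modeling can be
illustrated with a toy userspace sketch. This is a simplified,
hypothetical model: the actual series piggybacks on `struct request`'s
`atomic_t ref` and frees via `__blk_mq_free_request`, as described
below, and the real `RequestRef` is considerably more careful:]

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Toy model of a refcounted request: the last handle to drop frees it.
struct Request {
    refcount: AtomicUsize,
}

// A handle analogous to the series' `RequestRef`.
struct RequestRef<'a> {
    req: &'a Request,
}

impl<'a> RequestRef<'a> {
    fn get(req: &'a Request) -> RequestRef<'a> {
        req.refcount.fetch_add(1, Ordering::Relaxed);
        RequestRef { req }
    }
}

impl Drop for RequestRef<'_> {
    fn drop(&mut self) {
        // fetch_sub returns the previous value; 1 means we were last.
        if self.req.refcount.fetch_sub(1, Ordering::Release) == 1 {
            // Stand-in for the free/recycle step: it happens when the
            // last handle drops, not inline at completion time.
            println!("request freed");
        }
    }
}

fn main() {
    let req = Request { refcount: AtomicUsize::new(0) };
    let a = RequestRef::get(&req);
    let b = RequestRef::get(&req);
    drop(a); // refcount 2 -> 1: not freed yet
    drop(b); // refcount 1 -> 0: freed now
}
```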
A similar situation would happen in the completion path when developers
write code to convert from a completion id to a `&Request`.

To make these situations safe, and thus free from undefined behavior,
the `Request` type now applies reference counting. The logic piggybacks
on the `atomic_t ref` field of the C `struct request`. In order to make
this work, the Rust code has to be able to free requests when the
refcount reaches zero. Therefore, the series exposes
(EXPORT_SYMBOL_GPL) the internal blk-mq symbol `__blk_mq_free_request`.
I am curious what the community thinks of this approach, and whether it
is OK to expose this symbol so that Rust can call it.

One consequence of these changes is that requests processed by a Rust
block device driver are freed at a different place in the completion
path than requests processed by C drivers. Requests processed by C
drivers are typically freed and recycled inline during the call to
`blk_mq_complete_request`. Requests processed by a Rust driver are
recycled when the `RequestRef` is dropped, which is usually some time
after the request has been completed. There does not seem to be any
statistically significant effect on performance for the Rust null block
implementation due to this change.

Dependencies
============

This series is based on the branch `rnull-deps-v6.8` of [5].
This tree is based on the upstream `v6.8` tag and has been prepared
with dependencies for this series:

- [17] rust: add `CacheAligned` for easy cache line alignment of values
- [16] rust: xarray: add mutable access through `Guard`
- [15] rust: hrtimer: introduce hrtimer support
- [14] rust: add `Opaque::try_ffi_init`
- [13] rust: make `ARef` repr transparent
- [12] rust: folio: add `with_slice_into_page()`
- [11] rust: page: add `from_raw()`
- [10] rust: page: add `with_slice_into_page()`
- [ 9] IM: implement `ForeignOwnable` for `Pin`
- [ 8] IM: add `module_params` macro
- [ 7] rust: xarray: fix example
- [ 6] LIST: rust: xarray: Add an abstraction for XArray
- [ 5] LIST: rust: types: add FOREIGN_ALIGN to ForeignOwnable
- [ 4] LIST: rust: lock: implement `IrqSaveBackend` for `SpinLock`
- [ 3] LIST: rust: lock: add support for `Lock::lock_irqsave`
- [ 2] LIST: rust: add improved version of `ForeignOwnable::borrow_mut`
- [ 1] LIST: rust: folio: introduce basic support for folios
- [ 0] LIST: rust: add abstraction for `struct page`

Dependencies 0-6 are patches that have appeared on the list but are not
yet merged. Dependencies 8-9 are imports from the `rust` branch [6,7]
that have not yet been submitted. Dependencies 10-17 I will submit
independently in the near future.

If the upstream maintainers agree, then when the dependencies are
merged, I will eventually start submitting PRs to Jens. I fully commit
to maintaining this code as indicated in the MAINTAINERS entry.
Best regards,
Andreas Hindborg

[1] https://lore.kernel.org/rust-for-linux/20230503090708.2524310-1-nmi@metaspace.dk
[2] https://lpc.events/event/17/contributions/1425
[3] https://lore.kernel.org/rust-for-linux/87v87cgsp8.fsf@metaspace.dk
[4] https://rust-for-linux.com/null-block-driver
[5] git https://github.com/metaspace/linux.git rnull-deps-v6.8
[6] git https://github.com/rust-for-Linux/linux.git rust
[7] https://rust-for-linux.com/branches#rust

Andreas Hindborg (5):
  rust: block: introduce `kernel::block::mq` module
  rust: block: introduce `kernel::block::bio` module
  rust: block: allow `hrtimer::Timer` in `RequestData`
  rust: block: add rnull, Rust null_blk implementation
  MAINTAINERS: add entry for Rust block device driver API

 MAINTAINERS                        |  14 ++
 block/blk-mq.c                     |   3 +-
 drivers/block/Kconfig              |   4 +
 drivers/block/Makefile             |   3 +
 drivers/block/rnull.rs             | 323 +++++++++++++++++++++++++++
 include/linux/blk-mq.h             |   1 +
 rust/bindings/bindings_helper.h    |   2 +
 rust/helpers.c                     |  46 ++++
 rust/kernel/block.rs               |   6 +
 rust/kernel/block/bio.rs           | 112 ++++++++++
 rust/kernel/block/bio/vec.rs       | 279 +++++++++++++++++++++++
 rust/kernel/block/mq.rs            | 131 +++++++++++
 rust/kernel/block/mq/gen_disk.rs   | 174 +++++++++++++++
 rust/kernel/block/mq/operations.rs | 346 +++++++++++++++++++++++++++++
 rust/kernel/block/mq/raw_writer.rs |  60 +++++
 rust/kernel/block/mq/request.rs    | 269 ++++++++++++++++++++++
 rust/kernel/block/mq/tag_set.rs    | 117 ++++++++++
 rust/kernel/error.rs               |   5 +
 rust/kernel/lib.rs                 |   1 +
 scripts/Makefile.build             |   2 +-
 20 files changed, 1896 insertions(+), 2 deletions(-)
 create mode 100644 drivers/block/rnull.rs
 create mode 100644 rust/kernel/block.rs
 create mode 100644 rust/kernel/block/bio.rs
 create mode 100644 rust/kernel/block/bio/vec.rs
 create mode 100644 rust/kernel/block/mq.rs
 create mode 100644 rust/kernel/block/mq/gen_disk.rs
 create mode 100644 rust/kernel/block/mq/operations.rs
 create mode 100644 rust/kernel/block/mq/raw_writer.rs
 create mode 100644 rust/kernel/block/mq/request.rs
 create mode 100644 rust/kernel/block/mq/tag_set.rs

base-commit: e8f897f4afef0031fe618a8e94127a0934896aba
prerequisite-patch-id: 299c2cc48c45c8149e7cb18c6146a0308e5d0a44
prerequisite-patch-id: 5153ee1c410dbdf22a6fd40667712943b4b89a97
prerequisite-patch-id: 4d025bab9bc9741aedecc5327ad18f88f9105271
prerequisite-patch-id: a5e932c86fa6c68234764aa3d7f314e5b534b1d9
prerequisite-patch-id: aef3042976c4c678b7aa96154fc280f9061ebaf7
prerequisite-patch-id: 8bf108ad0af2a3ec89acb8d99ee1b49ca2f51c69
prerequisite-patch-id: a803b221c3232db3258406a4075558e85acefd09
prerequisite-patch-id: 5e9cbcd0dc56a83353f0e4a3b5d4e8d5b51f3160
prerequisite-patch-id: 28bae4a7fe83b36afed9892515a6dde1ea51e98a
prerequisite-patch-id: 5b5ea2a21b37afb05fdf655396a6f74d83bb99c4
prerequisite-patch-id: dc53b6ce21f74726d5d13988398c2954da07bcb6
prerequisite-patch-id: b86d2b14f1770c1d4756ca10b93efaada643e560
prerequisite-patch-id: 6a155859eb9a18afcd22a5bda3350d45d92e2fc7
prerequisite-patch-id: c8ca075008f50d3cf1781c1ea1130a8ee735e7d2
prerequisite-patch-id: b000cd190fe80dea3f4dd9172ecf8787f23b72be
prerequisite-patch-id: b72f1fc3bd44b60911d9d91ecc5a45812a75eba3
prerequisite-patch-id: 167c7770e124b9afa44bead742f90a57ac73f2d7
prerequisite-patch-id: cc24a3135b4f258f4b1ea83ac91c2be4ffe31772