Message ID | 20231018122518.128049-1-wedsonaf@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Rust abstractions for VFS | expand |
On 23/10/18 09:24AM, Wedson Almeida Filho wrote: > From: Wedson Almeida Filho <walmeida@microsoft.com> > > This series introduces Rust abstractions that allow page-cache-backed read-only > file systems to be written in Rust. > > There are two file systems that are built on top of these abstractions: tarfs > and puzzlefs. The former has zero unsafe blocks and is included as a patch in > this series; the latter is described elsewhere [1]. We limit the functionality > to the bare minimum needed to implement them. > > Rust file system modules can be declared with the `module_fs` macro and are > required to implement the following functions (which are part of the > `FileSystem` trait): > > impl FileSystem for MyFS { > fn super_params(sb: &NewSuperBlock<Self>) -> Result<SuperParams<Self::Data>>; > fn init_root(sb: &SuperBlock<Self>) -> Result<ARef<INode<Self>>>; > fn read_dir(inode: &INode<Self>, emitter: &mut DirEmitter) -> Result; > fn lookup(parent: &INode<Self>, name: &[u8]) -> Result<ARef<INode<Self>>>; > fn read_folio(inode: &INode<Self>, folio: LockedFolio<'_>) -> Result; > } > > They can optionally implement the following: > > fn read_xattr(inode: &INode<Self>, name: &CStr, outbuf: &mut [u8]) -> Result<usize>; > fn statfs(sb: &SuperBlock<Self>) -> Result<Stat>; > > They may also choose the type of the data they can attach to superblocks and/or > inodes. > > There a couple of issues that are likely to lead to unsoundness that have to do > with the unregistration of file systems. I will send separate emails about > them. > > A git tree is available here: > git://github.com/wedsonaf/linux.git vfs > > Web: > https://github.com/wedsonaf/linux/commits/vfs I've checked out your branch and but it doesn't compile: ``` $ make LLVM=1 -j4 DESCEND objtool CALL scripts/checksyscalls.sh make[4]: 'install_headers' is up to date. RUSTC L rust/kernel.o error[E0425]: cannot find function `folio_alloc` in crate `bindings` --> rust/kernel/folio.rs:43:54 | 43 | let f = ptr::NonNull::new(unsafe { bindings::folio_alloc(bindings::GFP_KERNEL, order) }) | ^^^^^^^^^^^ help: a function with a similar name exists: `__folio_alloc` | ::: /home/amiculas/work/linux/rust/bindings/bindings_generated.rs:17311:5 | 17311 | / pub fn __folio_alloc( 17312 | | gfp: gfp_t, 17313 | | order: core::ffi::c_uint, 17314 | | preferred_nid: core::ffi::c_int, 17315 | | nodemask: *mut nodemask_t, 17316 | | ) -> *mut folio; | |___________________- similarly named function `__folio_alloc` defined here error: aborting due to previous error For more information about this error, try `rustc --explain E0425`. make[2]: *** [rust/Makefile:460: rust/kernel.o] Error 1 make[1]: *** [/home/amiculas/work/linux/Makefile:1208: prepare] Error 2 make: *** [Makefile:234: __sub-make] Error 2 ``` I'm missing `CONFIG_NUMA`, which seems to guard `folio_alloc` (include/linux/gfp.h): ``` #ifdef CONFIG_NUMA struct page *alloc_pages(gfp_t gfp, unsigned int order); struct folio *folio_alloc(gfp_t gfp, unsigned order); struct folio *vma_alloc_folio(gfp_t gfp, int order, struct vm_area_struct *vma, unsigned long addr, bool hugepage); #else ``` > > [1]: The PuzzleFS container filesystem: https://lwn.net/Articles/945320/ > > Wedson Almeida Filho (19): > rust: fs: add registration/unregistration of file systems > rust: fs: introduce the `module_fs` macro > samples: rust: add initial ro file system sample > rust: fs: introduce `FileSystem::super_params` > rust: fs: introduce `INode<T>` > rust: fs: introduce `FileSystem::init_root` > rust: fs: introduce `FileSystem::read_dir` > rust: fs: introduce `FileSystem::lookup` > rust: folio: introduce basic support for folios > rust: fs: introduce `FileSystem::read_folio` > rust: fs: introduce `FileSystem::read_xattr` > rust: fs: introduce `FileSystem::statfs` > rust: fs: introduce more inode types > rust: fs: add per-superblock data > rust: fs: add basic support for fs buffer heads > rust: fs: allow file systems backed by a block device > rust: fs: allow per-inode data > rust: fs: export file type from mode constants > tarfs: introduce tar fs > > fs/Kconfig | 1 + > fs/Makefile | 1 + > fs/tarfs/Kconfig | 16 + > fs/tarfs/Makefile | 8 + > fs/tarfs/defs.rs | 80 ++ > fs/tarfs/tar.rs | 322 +++++++ > rust/bindings/bindings_helper.h | 13 + > rust/bindings/lib.rs | 6 + > rust/helpers.c | 142 ++++ > rust/kernel/error.rs | 6 +- > rust/kernel/folio.rs | 214 +++++ > rust/kernel/fs.rs | 1290 +++++++++++++++++++++++++++++ > rust/kernel/fs/buffer.rs | 60 ++ > rust/kernel/lib.rs | 2 + > rust/kernel/mem_cache.rs | 2 - > samples/rust/Kconfig | 10 + > samples/rust/Makefile | 1 + > samples/rust/rust_rofs.rs | 154 ++++ > scripts/generate_rust_analyzer.py | 2 +- > 19 files changed, 2324 insertions(+), 6 deletions(-) > create mode 100644 fs/tarfs/Kconfig > create mode 100644 fs/tarfs/Makefile > create mode 100644 fs/tarfs/defs.rs > create mode 100644 fs/tarfs/tar.rs > create mode 100644 rust/kernel/folio.rs > create mode 100644 rust/kernel/fs.rs > create mode 100644 rust/kernel/fs/buffer.rs > create mode 100644 samples/rust/rust_rofs.rs > > > base-commit: b0bc357ef7a98904600826dea3de79c0c67eb0a7 > -- > 2.34.1 >
On Wed, 18 Oct 2023 at 10:40, Ariel Miculas (amiculas) <amiculas@cisco.com> wrote: > I'm missing `CONFIG_NUMA`, which seems to guard `folio_alloc` > (include/linux/gfp.h): > ``` > #ifdef CONFIG_NUMA > struct page *alloc_pages(gfp_t gfp, unsigned int order); > struct folio *folio_alloc(gfp_t gfp, unsigned order); > struct folio *vma_alloc_folio(gfp_t gfp, int order, struct vm_area_struct *vma, > unsigned long addr, bool hugepage); > #else > ``` Hey Ariel, thanks for finding this. When CONFIG_NUMA is not defined, `folio_alloc` is a static inline function defined in the header file, so bindgen doesn't generate a binding for it. I'll fix this by adding a helper in rust/helpers.c.
On Wed, Oct 18, 2023 at 09:24:59AM -0300, Wedson Almeida Filho wrote: > Rust file system modules can be declared with the `module_fs` macro and are > required to implement the following functions (which are part of the > `FileSystem` trait): > > impl FileSystem for MyFS { > fn super_params(sb: &NewSuperBlock<Self>) -> Result<SuperParams<Self::Data>>; > fn init_root(sb: &SuperBlock<Self>) -> Result<ARef<INode<Self>>>; > fn read_dir(inode: &INode<Self>, emitter: &mut DirEmitter) -> Result; > fn lookup(parent: &INode<Self>, name: &[u8]) -> Result<ARef<INode<Self>>>; > fn read_folio(inode: &INode<Self>, folio: LockedFolio<'_>) -> Result; > } Does it make sense to describe filesystem methods like this? As I understand (eg) how inodes are laid out, we typically malloc a foofs_inode { x; y; z; struct inode vfs_inode; }; and then the first line of many functions that take an inode is: struct ext2_inode_info *ei = EXT2_I(dir); That feels like unnecessary boilerplate, and might lead to questions like "What if I'm passed an inode that isn't an ext2 inode". Do we want to always pass in the foofs_inode instead of the inode? Also, I see you're passing an inode to read_dir. Why did you decide to do that? There's information in the struct file that's either necessary or useful to have in the filesystem. Maybe not in toy filesystems, but eg network filesystems need user credentials to do readdir, which are stored in struct file. Block filesystems store readahead data in struct file.
On Sun, 29 Oct 2023 at 17:32, Matthew Wilcox <willy@infradead.org> wrote: > > impl FileSystem for MyFS { > > fn super_params(sb: &NewSuperBlock<Self>) -> Result<SuperParams<Self::Data>>; > > fn init_root(sb: &SuperBlock<Self>) -> Result<ARef<INode<Self>>>; > > fn read_dir(inode: &INode<Self>, emitter: &mut DirEmitter) -> Result; > > fn lookup(parent: &INode<Self>, name: &[u8]) -> Result<ARef<INode<Self>>>; > > fn read_folio(inode: &INode<Self>, folio: LockedFolio<'_>) -> Result; > > } > > Does it make sense to describe filesystem methods like this? As I > understand (eg) how inodes are laid out, we typically malloc a > > foofs_inode { > x; y; z; > struct inode vfs_inode; > }; > > and then the first line of many functions that take an inode is: > > struct ext2_inode_info *ei = EXT2_I(dir); > > That feels like unnecessary boilerplate, and might lead to questions like > "What if I'm passed an inode that isn't an ext2 inode". Do we want to > always pass in the foofs_inode instead of the inode? We're well aligned here. :) Note that the type is `&INode<Self>` -- `Self` is an alias for the type implementing this filesystem. For example, in tarfs, the type is really `&INode<TarFs>`, so it is what you're asking for: the TarFs filesystem only sees TarFs inodes and superblocks (through the FileSystem trait, maybe they have to deal with other inodes for other reasons). In fact, when you have inode of type `INode<TarFs>`, and you have a call like: let d = inode.data(); What you get back has the type declared in `TarFs::INodeData`. Similarly, if you do: let d = inode.super_block().data(); What you get back has the type declared in `TarFs::Data`. So all `container_of` calls are hidden away, and we store super-block data in `s_fs_info` and inode data by having a new struct that contains the data the fs wants plus a struct inode (this is done with generics, it's called `INodeWithData`). This is required for type safety: you always get the right type. If someone changes the type in one place but forgets to change it in another place, they'll get a compilation error. > Also, I see you're passing an inode to read_dir. Why did you decide to > do that? There's information in the struct file that's either necessary > or useful to have in the filesystem. Maybe not in toy filesystems, but eg > network filesystems need user credentials to do readdir, which are stored > in struct file. Block filesystems store readahead data in struct file. Because the two file systems we have don't use anything from `struct file` beyond the inode. Passing a `file` to `read_dir` would require us to introduce an unnecessary abstraction that no one uses, which we've been told not to do. There is no technical reason that makes it impractical though. We can add it when the need arises.
On Tue, Oct 31, 2023 at 05:14:08PM -0300, Wedson Almeida Filho wrote: > > Also, I see you're passing an inode to read_dir. Why did you decide to > > do that? There's information in the struct file that's either necessary > > or useful to have in the filesystem. Maybe not in toy filesystems, but eg > > network filesystems need user credentials to do readdir, which are stored > > in struct file. Block filesystems store readahead data in struct file. > > Because the two file systems we have don't use anything from `struct > file` beyond the inode. > > Passing a `file` to `read_dir` would require us to introduce an > unnecessary abstraction that no one uses, which we've been told not to > do. > > There is no technical reason that makes it impractical though. We can > add it when the need arises. Then we shouldn't merge any of this, or even send it out for review again until there is at least one non-toy filesystems implemented. Either stick to the object orientation we've already defined (ie separate aops, iops, fops, ... with substantially similar arguments) or propose changes to the ones we have in C. Dealing only with toy filesystems is leading you to bad architecture.
On Wed, 3 Jan 2024 at 15:02, Matthew Wilcox <willy@infradead.org> wrote: > > On Tue, Oct 31, 2023 at 05:14:08PM -0300, Wedson Almeida Filho wrote: > > > Also, I see you're passing an inode to read_dir. Why did you decide to > > > do that? There's information in the struct file that's either necessary > > > or useful to have in the filesystem. Maybe not in toy filesystems, but eg > > > network filesystems need user credentials to do readdir, which are stored > > > in struct file. Block filesystems store readahead data in struct file. > > > > Because the two file systems we have don't use anything from `struct > > file` beyond the inode. > > > > Passing a `file` to `read_dir` would require us to introduce an > > unnecessary abstraction that no one uses, which we've been told not to > > do. > > > > There is no technical reason that makes it impractical though. We can > > add it when the need arises. > > Then we shouldn't merge any of this, or even send it out for review > again until there is at least one non-toy filesystems implemented. What makes you characterize these filesystems as toys? The fact that they only use the file's inode in iterate_shared? > Either stick to the object orientation we've already defined (ie > separate aops, iops, fops, ... with substantially similar arguments) > or propose changes to the ones we have in C. Dealing only with toy > filesystems is leading you to bad architecture. I'm trying to understand the argument here. Are saying that Rust cannot have different APIs with the same performance characteristics as C's, unless we also fix the C apis? That isn't even a requirement when introducing new C apis, why would it be a requirement for Rust apis? Cheers, -Wedson
On Wed, Jan 03, 2024 at 06:02:40PM +0000, Matthew Wilcox wrote: > On Tue, Oct 31, 2023 at 05:14:08PM -0300, Wedson Almeida Filho wrote: > > > Also, I see you're passing an inode to read_dir. Why did you decide to > > > do that? There's information in the struct file that's either necessary > > > or useful to have in the filesystem. Maybe not in toy filesystems, but eg > > > network filesystems need user credentials to do readdir, which are stored > > > in struct file. Block filesystems store readahead data in struct file. > > > > Because the two file systems we have don't use anything from `struct > > file` beyond the inode. > > > > Passing a `file` to `read_dir` would require us to introduce an > > unnecessary abstraction that no one uses, which we've been told not to > > do. > > > > There is no technical reason that makes it impractical though. We can > > add it when the need arises. > > Then we shouldn't merge any of this, or even send it out for review > again until there is at least one non-toy filesystems implemented. > Either stick to the object orientation we've already defined (ie > separate aops, iops, fops, ... with substantially similar arguments) > or propose changes to the ones we have in C. Dealing only with toy > filesystems is leading you to bad architecture. Not sure I agree - this is a "waterfall vs. incremental" question, and personally I would go with doing things incrementally here. We don't need to copy the C interface as is; we can use this as an opportunity to incrementally design a new API that will obviously take lessons from the C API (since it's wrapping it), but it doesn't have to do things the same and it doesn't have to do everything all at once. Anyways, like you alluded to the C side is a bit of a mess w.r.t. what's in a_ops vs. i_ops, and cleaning that up on the C side is a giant hassle because then you have to fix _everything_ that implements or consumes those interfaces at the same time. So instead, it would seem easier to me to do the cleaner version on the Rust side, and then once we know what that looks like, maybe we update the C version to match - or maybe we light it all on fire and continue with rewriting everything in Rust... *shrug*
On Wed, Jan 03, 2024 at 04:04:26PM -0300, Wedson Almeida Filho wrote: > > Either stick to the object orientation we've already defined (ie > > separate aops, iops, fops, ... with substantially similar arguments) > > or propose changes to the ones we have in C. Dealing only with toy > > filesystems is leading you to bad architecture. > > I'm trying to understand the argument here. Are saying that Rust > cannot have different APIs with the same performance characteristics > as C's, unless we also fix the C apis? Different expressive power, not performance characteristics. It's *NOT* about C vs Rust; we have an existing system of objects and properties of such. Independent from the language being used to work with them. If we have to keep a separate system for your language, feel free to fork the kernel and do whatever you want with it. Just don't expect anybody else to play with your toy. In case it's not entirely obvious - your arguments about not needing something or other for the instances you have tried to work with so far do not hold water. At all. The only acceptable way to use Rust in that space is to treat the existing set of objects and operations as externally given; we *can* change those, with good enough reasons, but "the instances in Rust-using filesystems don't need this and that" doesn't cut it. Changes do happen in that area. Often enough. And the cost of figuring out whether they break things shouldn't be doubled because Rust folks want a universe of their own - the benefits of Rust are not worth that kind of bother.
On Wed, Jan 03, 2024 at 07:53:58PM +0000, Al Viro wrote: > On Wed, Jan 03, 2024 at 04:04:26PM -0300, Wedson Almeida Filho wrote: > > > > Either stick to the object orientation we've already defined (ie > > > separate aops, iops, fops, ... with substantially similar arguments) > > > or propose changes to the ones we have in C. Dealing only with toy > > > filesystems is leading you to bad architecture. > > > > I'm trying to understand the argument here. Are saying that Rust > > cannot have different APIs with the same performance characteristics > > as C's, unless we also fix the C apis? > > Different expressive power, not performance characteristics. > > It's *NOT* about C vs Rust; we have an existing system of objects and > properties of such. Independent from the language being used to work > with them. > > If we have to keep a separate system for your language, feel free to fork > the kernel and do whatever you want with it. Just don't expect anybody > else to play with your toy. The rust people have been getting conflicting advice, and your response is to tell them to fork the kernel and go away? > In case it's not entirely obvious - your arguments about not needing > something or other for the instances you have tried to work with so far > do not hold water. At all. > > The only acceptable way to use Rust in that space is to treat the existing > set of objects and operations as externally given; we *can* change those, > with good enough reasons, but "the instances in Rust-using filesystems > don't need this and that" doesn't cut it. The question was just about whether to add something now that isn't used on the Rust side yet, or wait until later when it is. I think this has gone a bit afield, and gotten a bit dramatic.
On Wed, Jan 03, 2024 at 02:14:34PM -0500, Kent Overstreet wrote: > We don't need to copy the C interface as is; we can use this as an > opportunity to incrementally design a new API that will obviously take > lessons from the C API (since it's wrapping it), but it doesn't have to > do things the same and it doesn't have to do everything all at once. > > Anyways, like you alluded to the C side is a bit of a mess w.r.t. what's > in a_ops vs. i_ops, and cleaning that up on the C side is a giant hassle > because then you have to fix _everything_ that implements or consumes > those interfaces at the same time. > > So instead, it would seem easier to me to do the cleaner version on the > Rust side, and then once we know what that looks like, maybe we update > the C version to match - or maybe we light it all on fire and continue > with rewriting everything in Rust... *shrug* No. This "cleaner version on the Rust side" is nothing of that sort; this "readdir doesn't need any state that might be different for different file instances beyond the current position, because none of our examples have needed that so far" is a good example of the garbage we really do not need to deal with.
On Wed, Jan 03, 2024 at 04:04:26PM -0300, Wedson Almeida Filho wrote: > On Wed, 3 Jan 2024 at 15:02, Matthew Wilcox <willy@infradead.org> wrote: > > > > On Tue, Oct 31, 2023 at 05:14:08PM -0300, Wedson Almeida Filho wrote: > > > > Also, I see you're passing an inode to read_dir. Why did you decide to > > > > do that? There's information in the struct file that's either necessary > > > > or useful to have in the filesystem. Maybe not in toy filesystems, but eg > > > > network filesystems need user credentials to do readdir, which are stored > > > > in struct file. Block filesystems store readahead data in struct file. > > > > > > Because the two file systems we have don't use anything from `struct > > > file` beyond the inode. > > > > > > Passing a `file` to `read_dir` would require us to introduce an > > > unnecessary abstraction that no one uses, which we've been told not to > > > do. > > > > > > There is no technical reason that makes it impractical though. We can > > > add it when the need arises. > > > > Then we shouldn't merge any of this, or even send it out for review > > again until there is at least one non-toy filesystems implemented. > > What makes you characterize these filesystems as toys? The fact that > they only use the file's inode in iterate_shared? They're not real filesystems. You can't put, eg, root or your home directory on one of these filesystems. > > Either stick to the object orientation we've already defined (ie > > separate aops, iops, fops, ... with substantially similar arguments) > > or propose changes to the ones we have in C. Dealing only with toy > > filesystems is leading you to bad architecture. > > I'm trying to understand the argument here. Are saying that Rust > cannot have different APIs with the same performance characteristics > as C's, unless we also fix the C apis? > > That isn't even a requirement when introducing new C apis, why would > it be a requirement for Rust apis? I'm saying that we have the current object orientation (eg each inode is an object with inode methods) for a reason. Don't change it without understanding what that reason is. And moving, eg iterate_shared() from file_operations to struct file_system_type (effectively what you've done) is something we obviously wouldn't want to do.
Kent Overstreet <kent.overstreet@linux.dev> wrote: > So instead, it would seem easier to me to do the cleaner version on the > Rust side, and then once we know what that looks like, maybe we update > the C version to match - or maybe we light it all on fire and continue > with rewriting everything in Rust... *shrug* Please, no. Please keep Rust separate and out of the core of the kernel and subsystems. David
On Fri Jan 5, 2024 at 2:04 AM EET, David Howells wrote: > Kent Overstreet <kent.overstreet@linux.dev> wrote: > > > So instead, it would seem easier to me to do the cleaner version on the > > Rust side, and then once we know what that looks like, maybe we update > > the C version to match - or maybe we light it all on fire and continue > > with rewriting everything in Rust... *shrug* > > Please, no. Please keep Rust separate and out of the core of the kernel and > subsystems. > > David Yeah, if we ignore that code is field-tested in some cases literally decades, is infrastructure critical in global scale and similar QA metrics, any major Rust update to the core code would be pretty hard to manage for stable kernels... Using Rust in core would probably require decisions at least that are not in the scope of a single patch set. BR, Jarkko
On Wed, 3 Jan 2024 at 22:49, Matthew Wilcox <willy@infradead.org> wrote: > > On Wed, Jan 03, 2024 at 04:04:26PM -0300, Wedson Almeida Filho wrote: > > On Wed, 3 Jan 2024 at 15:02, Matthew Wilcox <willy@infradead.org> wrote: > > > > > > On Tue, Oct 31, 2023 at 05:14:08PM -0300, Wedson Almeida Filho wrote: > > > > > Also, I see you're passing an inode to read_dir. Why did you decide to > > > > > do that? There's information in the struct file that's either necessary > > > > > or useful to have in the filesystem. Maybe not in toy filesystems, but eg > > > > > network filesystems need user credentials to do readdir, which are stored > > > > > in struct file. Block filesystems store readahead data in struct file. > > > > > > > > Because the two file systems we have don't use anything from `struct > > > > file` beyond the inode. > > > > > > > > Passing a `file` to `read_dir` would require us to introduce an > > > > unnecessary abstraction that no one uses, which we've been told not to > > > > do. > > > > > > > > There is no technical reason that makes it impractical though. We can > > > > add it when the need arises. > > > > > > Then we shouldn't merge any of this, or even send it out for review > > > again until there is at least one non-toy filesystems implemented. > > > > What makes you characterize these filesystems as toys? The fact that > > they only use the file's inode in iterate_shared? > > They're not real filesystems. You can't put, eg, root or your home > directory on one of these filesystems. tarfs is a real file system, we use it to mount read-only container layers on top of dm-verity for integrity. The root of a container is made of potentially several of these layers, overlaid with overlayfs. We use this in confidential kata containers where we need to enforce authenticity and integrity of data: with tarfs, the original tar file is exposed to confidential VMs, so we can use existing signatures to verify that an attacker hasn't modified the data before the container starts, and dm-verity ensures that we catch any attempts by the host to change data after the container is running. > > > Either stick to the object orientation we've already defined (ie > > > separate aops, iops, fops, ... with substantially similar arguments) > > > or propose changes to the ones we have in C. Dealing only with toy > > > filesystems is leading you to bad architecture. > > > > I'm trying to understand the argument here. Are saying that Rust > > cannot have different APIs with the same performance characteristics > > as C's, unless we also fix the C apis? > > > > That isn't even a requirement when introducing new C apis, why would > > it be a requirement for Rust apis? > > I'm saying that we have the current object orientation (eg each inode > is an object with inode methods) for a reason. Don't change it without > understanding what that reason is. And moving, eg iterate_shared() from > file_operations to struct file_system_type (effectively what you've done) > is something we obviously wouldn't want to do. I don't think I'm changing anything. AFAICT, I'm adding a way to write file systems in Rust. It uses the C API faithfully -- if you find ways in which it doesn't, I'd be happy to fix them. To show its usefulness, I'm providing a real file system that uses it, is simpler than the C version, and contains no unsafe code. So barring bugs in the abstractions, it contains no memory safety issues. Why do you feel I need to mimic the unsafe (in the sense that the compiler doesn't help you prevent safety issues) way C does it _now_? Cheers, -Wedson
On Wed, 3 Jan 2024 at 17:41, Al Viro <viro@zeniv.linux.org.uk> wrote: > > On Wed, Jan 03, 2024 at 02:14:34PM -0500, Kent Overstreet wrote: > > > We don't need to copy the C interface as is; we can use this as an > > opportunity to incrementally design a new API that will obviously take > > lessons from the C API (since it's wrapping it), but it doesn't have to > > do things the same and it doesn't have to do everything all at once. > > > > Anyways, like you alluded to the C side is a bit of a mess w.r.t. what's > > in a_ops vs. i_ops, and cleaning that up on the C side is a giant hassle > > because then you have to fix _everything_ that implements or consumes > > those interfaces at the same time. > > > > So instead, it would seem easier to me to do the cleaner version on the > > Rust side, and then once we know what that looks like, maybe we update > > the C version to match - or maybe we light it all on fire and continue > > with rewriting everything in Rust... *shrug* > > No. This "cleaner version on the Rust side" is nothing of that sort; > this "readdir doesn't need any state that might be different for different > file instances beyond the current position, because none of our examples > have needed that so far" is a good example of the garbage we really do > not need to deal with. What you're calling garbage is what Greg KH asked us to do, namely, not introduce anything for which there are no users. See a couple of quotes below. https://lore.kernel.org/rust-for-linux/2023081411-apache-tubeless-7bb3@gregkh/ The best feedback is "who will use these new interfaces?" Without that, it's really hard to review a patchset as it's difficult to see how the bindings will be used, right? https://lore.kernel.org/rust-for-linux/2023071049-gigabyte-timing-0673@gregkh/ And I'd recommend that we not take any more bindings without real users, as there seems to be just a collection of these and it's hard to actually review them to see how they are used... Cheers, -Wedson
On Tue, Jan 09, 2024 at 04:13:15PM -0300, Wedson Almeida Filho wrote: > On Wed, 3 Jan 2024 at 17:41, Al Viro <viro@zeniv.linux.org.uk> wrote: > > No. This "cleaner version on the Rust side" is nothing of that sort; > > this "readdir doesn't need any state that might be different for different > > file instances beyond the current position, because none of our examples > > have needed that so far" is a good example of the garbage we really do > > not need to deal with. > > What you're calling garbage is what Greg KH asked us to do, namely, > not introduce anything for which there are no users. See a couple of > quotes below. > > https://lore.kernel.org/rust-for-linux/2023081411-apache-tubeless-7bb3@gregkh/ > The best feedback is "who will use these new interfaces?" Without that, > it's really hard to review a patchset as it's difficult to see how the > bindings will be used, right? > > https://lore.kernel.org/rust-for-linux/2023071049-gigabyte-timing-0673@gregkh/ > And I'd recommend that we not take any more bindings without real users, > as there seems to be just a collection of these and it's hard to > actually review them to see how they are used... You've misunderstood Greg. He's saying (effectively) "No fs bindings without a filesystem to use them". And Al, myself and others are saying "Your filesystem interfaces are wrong because they're not usable for real filesystems". And you're saying "But I'm not allowed to change them". And that's not true. Change them to be laid out how a real filesystem would need them to be. Or argue that your current interfaces are the right ones (they aren't).
On Tue, Jan 09, 2024 at 03:25:11PM -0300, Wedson Almeida Filho wrote: > On Wed, 3 Jan 2024 at 22:49, Matthew Wilcox <willy@infradead.org> wrote: > > > What makes you characterize these filesystems as toys? The fact that > > > they only use the file's inode in iterate_shared? > > > > They're not real filesystems. You can't put, eg, root or your home > > directory on one of these filesystems. > > tarfs is a real file system, we use it to mount read-only container > layers on top of dm-verity for integrity. You're using it in production? Oh dear. > > > I'm trying to understand the argument here. Are saying that Rust > > > cannot have different APIs with the same performance characteristics > > > as C's, unless we also fix the C apis? > > > > > > That isn't even a requirement when introducing new C apis, why would > > > it be a requirement for Rust apis? > > > > I'm saying that we have the current object orientation (eg each inode > > is an object with inode methods) for a reason. Don't change it without > > understanding what that reason is. And moving, eg iterate_shared() from > > file_operations to struct file_system_type (effectively what you've done) > > is something we obviously wouldn't want to do. > > I don't think I'm changing anything. AFAICT, I'm adding a way to write > file systems in Rust. It uses the C API faithfully -- if you find ways > in which it doesn't, I'd be happy to fix them. You are changing the _object model_. The C API has separate objects for inodes, files, filesystems, superblocks, dentries, etc, etc. You've just smashed all of it together into a FileSystem which implements all of the inode, file, address_space, etc, etc ops. And this is the wrong approach.
On Tue, Jan 09, 2024 at 07:25:38PM +0000, Matthew Wilcox wrote: > On Tue, Jan 09, 2024 at 04:13:15PM -0300, Wedson Almeida Filho wrote: > > On Wed, 3 Jan 2024 at 17:41, Al Viro <viro@zeniv.linux.org.uk> wrote: > > > No. This "cleaner version on the Rust side" is nothing of that sort; > > > this "readdir doesn't need any state that might be different for different > > > file instances beyond the current position, because none of our examples > > > have needed that so far" is a good example of the garbage we really do > > > not need to deal with. > > > > What you're calling garbage is what Greg KH asked us to do, namely, > > not introduce anything for which there are no users. See a couple of > > quotes below. > > > > https://lore.kernel.org/rust-for-linux/2023081411-apache-tubeless-7bb3@gregkh/ > > The best feedback is "who will use these new interfaces?" Without that, > > it's really hard to review a patchset as it's difficult to see how the > > bindings will be used, right? > > > > https://lore.kernel.org/rust-for-linux/2023071049-gigabyte-timing-0673@gregkh/ > > And I'd recommend that we not take any more bindings without real users, > > as there seems to be just a collection of these and it's hard to > > actually review them to see how they are used... > > You've misunderstood Greg. He's saying (effectively) "No fs bindings > without a filesystem to use them". And Al, myself and others are saying > "Your filesystem interfaces are wrong because they're not usable for real > filesystems". And you're saying "But I'm not allowed to change them". > And that's not true. Change them to be laid out how a real filesystem > would need them to be. Note, I agree, change them to work our a "real" filesystem would need them and then, automatically, all of the "fake" filesystems like currently underway (i.e. tarfs) will work just fine too, right? That way we can drop the .c code for binderfs at the same time, also a nice win. thanks, greg k-h
On Tue, Jan 09, 2024 at 07:25:38PM +0000, Matthew Wilcox wrote: > On Tue, Jan 09, 2024 at 04:13:15PM -0300, Wedson Almeida Filho wrote: > > On Wed, 3 Jan 2024 at 17:41, Al Viro <viro@zeniv.linux.org.uk> wrote: > > > No. This "cleaner version on the Rust side" is nothing of that sort; > > > this "readdir doesn't need any state that might be different for different > > > file instances beyond the current position, because none of our examples > > > have needed that so far" is a good example of the garbage we really do > > > not need to deal with. > > > > What you're calling garbage is what Greg KH asked us to do, namely, > > not introduce anything for which there are no users. See a couple of > > quotes below. > > > > https://lore.kernel.org/rust-for-linux/2023081411-apache-tubeless-7bb3@gregkh/ > > The best feedback is "who will use these new interfaces?" Without that, > > it's really hard to review a patchset as it's difficult to see how the > > bindings will be used, right? > > > > https://lore.kernel.org/rust-for-linux/2023071049-gigabyte-timing-0673@gregkh/ > > And I'd recommend that we not take any more bindings without real users, > > as there seems to be just a collection of these and it's hard to > > actually review them to see how they are used... > > You've misunderstood Greg. He's saying (effectively) "No fs bindings > without a filesystem to use them". And Al, myself and others are saying > "Your filesystem interfaces are wrong because they're not usable for real > filesystems". And that's why I've been saying that the first Rust filesystem that should be implemented is an ext2 clone. That's our "reference filesystem" for people who want to learn how filesystems should be implemented in Linux - it's relatively simple but fully featured and uses much of the generic abstractions and infrastructure. At minimum, we need a filesystem implementation that is fully read-write, supports truncate and rename, and has a fully functional userspace and test infrastructure so that we can actually verify that the Rust code does what it says on the label. ext2 ticks all of these boxes.... -Dave.
On Tue, 9 Jan 2024 at 16:32, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > On Tue, Jan 09, 2024 at 07:25:38PM +0000, Matthew Wilcox wrote: > > You've misunderstood Greg. He's saying (effectively) "No fs bindings > > without a filesystem to use them". And Al, myself and others are saying > > "Your filesystem interfaces are wrong because they're not usable for real > > filesystems". And you're saying "But I'm not allowed to change them". > > And that's not true. Change them to be laid out how a real filesystem > > would need them to be. Ok, then I'll update the code to have 3 additional traits: FileOperations INodeOperations AddressSpaceOperations When one initialises an inode, one gets to pick all three. And FileOperations::read_dir will take a File<T> as its first argument (instead of an INode<T>). Does this sound reasonable? > Note, I agree, change them to work our a "real" filesystem would need > them and then, automatically, all of the "fake" filesystems like > currently underway (i.e. tarfs) will work just fine too, right? That > way we can drop the .c code for binderfs at the same time, also a nice > win. Are you volunteering to rewrite binderfs once rust bindings are available? :) Cheers, -Wedson
On Wed, Jan 10, 2024 at 04:49:02AM -0300, Wedson Almeida Filho wrote: > > Note, I agree, change them to work our a "real" filesystem would need > > them and then, automatically, all of the "fake" filesystems like > > currently underway (i.e. tarfs) will work just fine too, right? That > > way we can drop the .c code for binderfs at the same time, also a nice > > win. > > Are you volunteering to rewrite binderfs once rust bindings are available? :) Sure, would be glad to do so, after the binder conversion to rust is merged :) thanks, greg k-h
On Wed, Jan 10, 2024 at 04:49:02AM -0300, Wedson Almeida Filho wrote: > On Tue, 9 Jan 2024 at 16:32, Greg Kroah-Hartman > <gregkh@linuxfoundation.org> wrote: > > On Tue, Jan 09, 2024 at 07:25:38PM +0000, Matthew Wilcox wrote: > > > You've misunderstood Greg. He's saying (effectively) "No fs bindings > > > without a filesystem to use them". And Al, myself and others are saying > > > "Your filesystem interfaces are wrong because they're not usable for real > > > filesystems". And you're saying "But I'm not allowed to change them". > > > And that's not true. Change them to be laid out how a real filesystem > > > would need them to be. > > Ok, then I'll update the code to have 3 additional traits: > > FileOperations > INodeOperations > AddressSpaceOperations > > When one initialises an inode, one gets to pick all three. That makes sense, yes. > And FileOperations::read_dir will take a File<T> as its first argument > (instead of an INode<T>). > > Does this sound reasonable? yep!
On Wed, Jan 10, 2024 at 09:19:37AM +1100, Dave Chinner wrote: > On Tue, Jan 09, 2024 at 07:25:38PM +0000, Matthew Wilcox wrote: > > On Tue, Jan 09, 2024 at 04:13:15PM -0300, Wedson Almeida Filho wrote: > > > On Wed, 3 Jan 2024 at 17:41, Al Viro <viro@zeniv.linux.org.uk> wrote: > > > > No. This "cleaner version on the Rust side" is nothing of that sort; > > > > this "readdir doesn't need any state that might be different for different > > > > file instances beyond the current position, because none of our examples > > > > have needed that so far" is a good example of the garbage we really do > > > > not need to deal with. > > > > > > What you're calling garbage is what Greg KH asked us to do, namely, > > > not introduce anything for which there are no users. See a couple of > > > quotes below. > > > > > > https://lore.kernel.org/rust-for-linux/2023081411-apache-tubeless-7bb3@gregkh/ > > > The best feedback is "who will use these new interfaces?" Without that, > > > it's really hard to review a patchset as it's difficult to see how the > > > bindings will be used, right? > > > > > > https://lore.kernel.org/rust-for-linux/2023071049-gigabyte-timing-0673@gregkh/ > > > And I'd recommend that we not take any more bindings without real users, > > > as there seems to be just a collection of these and it's hard to > > > actually review them to see how they are used... > > > > You've misunderstood Greg. He's saying (effectively) "No fs bindings > > without a filesystem to use them". And Al, myself and others are saying > > "Your filesystem interfaces are wrong because they're not usable for real > > filesystems". > > And that's why I've been saying that the first Rust filesystem that > should be implemented is an ext2 clone. That's our "reference > filesystem" for people who want to learn how filesystems should be > implemented in Linux - it's relatively simple but fully featured and > uses much of the generic abstractions and infrastructure. > > At minimum, we need a filesystem implementation that is fully > read-write, supports truncate and rename, and has a fully functional > userspace and test infrastructure so that we can actually verify > that the Rust code does what it says on the label. ext2 ticks all of > these boxes.... I think someone was working on that? But I'd prefer that not to be a condition of merging the VFS interfaces; we've got multiple new Rust filesystems being implemented and I'm also planning on merging Rust bcachefs code next merge window.
On Wed, 10 Jan 2024 14:19:41 -0500 Kent Overstreet <kent.overstreet@linux.dev> wrote: > On Wed, Jan 10, 2024 at 09:19:37AM +1100, Dave Chinner wrote: >> On Tue, Jan 09, 2024 at 07:25:38PM +0000, Matthew Wilcox wrote: >> > On Tue, Jan 09, 2024 at 04:13:15PM -0300, Wedson Almeida Filho wrote: >> > > On Wed, 3 Jan 2024 at 17:41, Al Viro <viro@zeniv.linux.org.uk> wrote: >> > > > No. This "cleaner version on the Rust side" is nothing of that sort; >> > > > this "readdir doesn't need any state that might be different for different >> > > > file instances beyond the current position, because none of our examples >> > > > have needed that so far" is a good example of the garbage we really do >> > > > not need to deal with. >> > > >> > > What you're calling garbage is what Greg KH asked us to do, namely, >> > > not introduce anything for which there are no users. See a couple of >> > > quotes below. >> > > >> > > https://lore.kernel.org/rust-for-linux/2023081411-apache-tubeless-7bb3@gregkh/ >> > > The best feedback is "who will use these new interfaces?" Without that, >> > > it's really hard to review a patchset as it's difficult to see how the >> > > bindings will be used, right? >> > > >> > > https://lore.kernel.org/rust-for-linux/2023071049-gigabyte-timing-0673@gregkh/ >> > > And I'd recommend that we not take any more bindings without real users, >> > > as there seems to be just a collection of these and it's hard to >> > > actually review them to see how they are used... >> > >> > You've misunderstood Greg. He's saying (effectively) "No fs bindings >> > without a filesystem to use them". And Al, myself and others are saying >> > "Your filesystem interfaces are wrong because they're not usable for real >> > filesystems". >> >> And that's why I've been saying that the first Rust filesystem that >> should be implemented is an ext2 clone. That's our "reference >> filesystem" for people who want to learn how filesystems should be >> implemented in Linux - it's relatively simple but fully featured and >> uses much of the generic abstractions and infrastructure. >> >> At minimum, we need a filesystem implementation that is fully >> read-write, supports truncate and rename, and has a fully functional >> userspace and test infrastructure so that we can actually verify >> that the Rust code does what it says on the label. ext2 ticks all of >> these boxes.... > > I think someone was working on that? But I'd prefer that not to be a > condition of merging the VFS interfaces; we've got multiple new Rust > filesystems being implemented and I'm also planning on merging Rust > bcachefs code next merge window. It's very far from a fully functional clone of ext2 but the following can do simple read-write to/from files and directories: https://github.com/fujita/linux/tree/ext2-rust/fs/ext2rust For now, all of the code is unsafe Rust, using C structures directly but I could update the code to see how well Rust VFS abstractions for real file systems work.
On Wed, Jan 24, 2024 at 10:08:35PM +0900, FUJITA Tomonori wrote: > On Wed, 10 Jan 2024 14:19:41 -0500 > Kent Overstreet <kent.overstreet@linux.dev> wrote: > > > On Wed, Jan 10, 2024 at 09:19:37AM +1100, Dave Chinner wrote: > >> On Tue, Jan 09, 2024 at 07:25:38PM +0000, Matthew Wilcox wrote: > >> > On Tue, Jan 09, 2024 at 04:13:15PM -0300, Wedson Almeida Filho wrote: > >> > > On Wed, 3 Jan 2024 at 17:41, Al Viro <viro@zeniv.linux.org.uk> wrote: > >> > > > No. This "cleaner version on the Rust side" is nothing of that sort; > >> > > > this "readdir doesn't need any state that might be different for different > >> > > > file instances beyond the current position, because none of our examples > >> > > > have needed that so far" is a good example of the garbage we really do > >> > > > not need to deal with. > >> > > > >> > > What you're calling garbage is what Greg KH asked us to do, namely, > >> > > not introduce anything for which there are no users. See a couple of > >> > > quotes below. > >> > > > >> > > https://lore.kernel.org/rust-for-linux/2023081411-apache-tubeless-7bb3@gregkh/ > >> > > The best feedback is "who will use these new interfaces?" Without that, > >> > > it's really hard to review a patchset as it's difficult to see how the > >> > > bindings will be used, right? > >> > > > >> > > https://lore.kernel.org/rust-for-linux/2023071049-gigabyte-timing-0673@gregkh/ > >> > > And I'd recommend that we not take any more bindings without real users, > >> > > as there seems to be just a collection of these and it's hard to > >> > > actually review them to see how they are used... > >> > > >> > You've misunderstood Greg. He's saying (effectively) "No fs bindings > >> > without a filesystem to use them". And Al, myself and others are saying > >> > "Your filesystem interfaces are wrong because they're not usable for real > >> > filesystems". > >> > >> And that's why I've been saying that the first Rust filesystem that > >> should be implemented is an ext2 clone. That's our "reference > >> filesystem" for people who want to learn how filesystems should be > >> implemented in Linux - it's relatively simple but fully featured and > >> uses much of the generic abstractions and infrastructure. > >> > >> At minimum, we need a filesystem implementation that is fully > >> read-write, supports truncate and rename, and has a fully functional > >> userspace and test infrastructure so that we can actually verify > >> that the Rust code does what it says on the label. ext2 ticks all of > >> these boxes.... > > > > I think someone was working on that? But I'd prefer that not to be a > > condition of merging the VFS interfaces; we've got multiple new Rust > > filesystems being implemented and I'm also planning on merging Rust > > bcachefs code next merge window. > > It's very far from a fully functional clone of ext2 but the following > can do simple read-write to/from files and directories: > > https://github.com/fujita/linux/tree/ext2-rust/fs/ext2rust > > For now, all of the code is unsafe Rust, using C structures directly > but I could update the code to see how well Rust VFS abstractions for > real file systems work. I think that would be well received. I think the biggest hurdle for a lot of people is going to be figuring out the patterns for expressing old idioms in safe rust - a version of ext2 in safe Rust would be the perfect gentle introduction for filesystem people. And if it achieved feature parity with fs/ext2, there'd be a strong argument for it eventually replacing fs/ext2 so that we can more safely mount untrusted filesystem images.
From: Wedson Almeida Filho <walmeida@microsoft.com> This series introduces Rust abstractions that allow page-cache-backed read-only file systems to be written in Rust. There are two file systems that are built on top of these abstractions: tarfs and puzzlefs. The former has zero unsafe blocks and is included as a patch in this series; the latter is described elsewhere [1]. We limit the functionality to the bare minimum needed to implement them. Rust file system modules can be declared with the `module_fs` macro and are required to implement the following functions (which are part of the `FileSystem` trait): impl FileSystem for MyFS { fn super_params(sb: &NewSuperBlock<Self>) -> Result<SuperParams<Self::Data>>; fn init_root(sb: &SuperBlock<Self>) -> Result<ARef<INode<Self>>>; fn read_dir(inode: &INode<Self>, emitter: &mut DirEmitter) -> Result; fn lookup(parent: &INode<Self>, name: &[u8]) -> Result<ARef<INode<Self>>>; fn read_folio(inode: &INode<Self>, folio: LockedFolio<'_>) -> Result; } They can optionally implement the following: fn read_xattr(inode: &INode<Self>, name: &CStr, outbuf: &mut [u8]) -> Result<usize>; fn statfs(sb: &SuperBlock<Self>) -> Result<Stat>; They may also choose the type of the data they can attach to superblocks and/or inodes. There a couple of issues that are likely to lead to unsoundness that have to do with the unregistration of file systems. I will send separate emails about them. A git tree is available here: git://github.com/wedsonaf/linux.git vfs Web: https://github.com/wedsonaf/linux/commits/vfs [1]: The PuzzleFS container filesystem: https://lwn.net/Articles/945320/ Wedson Almeida Filho (19): rust: fs: add registration/unregistration of file systems rust: fs: introduce the `module_fs` macro samples: rust: add initial ro file system sample rust: fs: introduce `FileSystem::super_params` rust: fs: introduce `INode<T>` rust: fs: introduce `FileSystem::init_root` rust: fs: introduce `FileSystem::read_dir` rust: fs: introduce `FileSystem::lookup` rust: folio: introduce basic support for folios rust: fs: introduce `FileSystem::read_folio` rust: fs: introduce `FileSystem::read_xattr` rust: fs: introduce `FileSystem::statfs` rust: fs: introduce more inode types rust: fs: add per-superblock data rust: fs: add basic support for fs buffer heads rust: fs: allow file systems backed by a block device rust: fs: allow per-inode data rust: fs: export file type from mode constants tarfs: introduce tar fs fs/Kconfig | 1 + fs/Makefile | 1 + fs/tarfs/Kconfig | 16 + fs/tarfs/Makefile | 8 + fs/tarfs/defs.rs | 80 ++ fs/tarfs/tar.rs | 322 +++++++ rust/bindings/bindings_helper.h | 13 + rust/bindings/lib.rs | 6 + rust/helpers.c | 142 ++++ rust/kernel/error.rs | 6 +- rust/kernel/folio.rs | 214 +++++ rust/kernel/fs.rs | 1290 +++++++++++++++++++++++++++++ rust/kernel/fs/buffer.rs | 60 ++ rust/kernel/lib.rs | 2 + rust/kernel/mem_cache.rs | 2 - samples/rust/Kconfig | 10 + samples/rust/Makefile | 1 + samples/rust/rust_rofs.rs | 154 ++++ scripts/generate_rust_analyzer.py | 2 +- 19 files changed, 2324 insertions(+), 6 deletions(-) create mode 100644 fs/tarfs/Kconfig create mode 100644 fs/tarfs/Makefile create mode 100644 fs/tarfs/defs.rs create mode 100644 fs/tarfs/tar.rs create mode 100644 rust/kernel/folio.rs create mode 100644 rust/kernel/fs.rs create mode 100644 rust/kernel/fs/buffer.rs create mode 100644 samples/rust/rust_rofs.rs base-commit: b0bc357ef7a98904600826dea3de79c0c67eb0a7