diff mbox series

[4/7] rust: file: add `FileDescriptorReservation`

Message ID 20231129-alice-file-v1-4-f81afe8c7261@google.com (mailing list archive)
State New, archived
Headers show
Series File abstractions needed by Rust Binder | expand

Commit Message

Alice Ryhl Nov. 29, 2023, 1:11 p.m. UTC
From: Wedson Almeida Filho <wedsonaf@gmail.com>

Allow for the creation of a file descriptor in two steps: first, we
reserve a slot for it, then we commit or drop the reservation. The first
step may fail (e.g., the current process ran out of available slots),
but commit and drop never fail (and are mutually exclusive).

This is needed by Rust Binder when fds are sent from one process to
another. It has to be a two-step process to properly handle the case
where multiple fds are sent: The operation must fail or succeed
atomically, which we achieve by first reserving the fds we need, and
only installing the files once we have reserved enough fds to send the
files.

Fd reservations assume that the value of `current` does not change
between the call to get_unused_fd_flags and the call to fd_install (or
put_unused_fd). By not implementing the Send trait, this abstraction
ensures that the `FileDescriptorReservation` cannot be moved into a
different process.

Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/kernel/file.rs | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 63 insertions(+), 1 deletion(-)

Comments

Christian Brauner Nov. 29, 2023, 4:14 p.m. UTC | #1
On Wed, Nov 29, 2023 at 01:11:56PM +0000, Alice Ryhl wrote:
> From: Wedson Almeida Filho <wedsonaf@gmail.com>
> 
> Allow for the creation of a file descriptor in two steps: first, we
> reserve a slot for it, then we commit or drop the reservation. The first
> step may fail (e.g., the current process ran out of available slots),
> but commit and drop never fail (and are mutually exclusive).
> 
> This is needed by Rust Binder when fds are sent from one process to
> another. It has to be a two-step process to properly handle the case
> where multiple fds are sent: The operation must fail or succeed
> atomically, which we achieve by first reserving the fds we need, and
> only installing the files once we have reserved enough fds to send the
> files.
> 
> Fd reservations assume that the value of `current` does not change
> between the call to get_unused_fd_flags and the call to fd_install (or
> put_unused_fd). By not implementing the Send trait, this abstraction
> ensures that the `FileDescriptorReservation` cannot be moved into a
> different process.
> 
> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
> Co-developed-by: Alice Ryhl <aliceryhl@google.com>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  rust/kernel/file.rs | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 63 insertions(+), 1 deletion(-)
> 
> diff --git a/rust/kernel/file.rs b/rust/kernel/file.rs
> index f1f71c3d97e2..2186a6ea3f2f 100644
> --- a/rust/kernel/file.rs
> +++ b/rust/kernel/file.rs
> @@ -11,7 +11,7 @@
>      error::{code::*, Error, Result},
>      types::{ARef, AlwaysRefCounted, Opaque},
>  };
> -use core::ptr;
> +use core::{marker::PhantomData, ptr};
>  
>  /// Flags associated with a [`File`].
>  pub mod flags {
> @@ -180,6 +180,68 @@ unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
>      }
>  }
>  
> +/// A file descriptor reservation.
> +///
> +/// This allows the creation of a file descriptor in two steps: first, we reserve a slot for it,
> +/// then we commit or drop the reservation. The first step may fail (e.g., the current process ran
> +/// out of available slots), but commit and drop never fail (and are mutually exclusive).
> +///
> +/// Dropping the reservation happens in the destructor of this type.
> +///
> +/// # Invariants
> +///
> +/// The fd stored in this struct must correspond to a reserved file descriptor of the current task.
> +pub struct FileDescriptorReservation {

Can we follow the traditional file terminology, i.e.,
get_unused_fd_flags() and fd_install()? At least at the beginning this
might be quite helpful instead of having to mentally map new() and
commit() onto the C functions.

> +    fd: u32,
> +    /// Prevent values of this type from being moved to a different task.
> +    ///
> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
> +    /// owns the fd in question.
> +    _not_send_sync: PhantomData<*mut ()>,

I don't fully understand this. Can you explain in a little more detail
what you mean by this and how this works?

> +}
> +
> +impl FileDescriptorReservation {
> +    /// Creates a new file descriptor reservation.
> +    pub fn new(flags: u32) -> Result<Self> {
> +        // SAFETY: FFI call, there are no safety requirements on `flags`.
> +        let fd: i32 = unsafe { bindings::get_unused_fd_flags(flags) };
> +        if fd < 0 {
> +            return Err(Error::from_errno(fd));
> +        }
> +        Ok(Self {
> +            fd: fd as _,

This is a cast to a u32?

> +            _not_send_sync: PhantomData,

Can you please draft a quick example how that return value would be
expected to be used by a caller? It's really not clear 

> +        })
> +    }
> +
> +    /// Returns the file descriptor number that was reserved.
> +    pub fn reserved_fd(&self) -> u32 {
> +        self.fd
> +    }
> +
> +    /// Commits the reservation.
> +    ///
> +    /// The previously reserved file descriptor is bound to `file`. This method consumes the
> +    /// [`FileDescriptorReservation`], so it will not be usable after this call.
> +    pub fn commit(self, file: ARef<File>) {
> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
> +        // guaranteed to have an owned ref count by its type invariants.
> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };

Why file.0.get()? Where did that come from?

> +
> +        // `fd_install` consumes both the file descriptor and the file reference, so we cannot run
> +        // the destructors.
> +        core::mem::forget(self);
> +        core::mem::forget(file);
> +    }
> +}
> +
> +impl Drop for FileDescriptorReservation {
> +    fn drop(&mut self) {
> +        // SAFETY: `self.fd` was returned by a previous call to `get_unused_fd_flags`.
> +        unsafe { bindings::put_unused_fd(self.fd) };
> +    }
> +}
> +
>  /// Represents the `EBADF` error code.
>  ///
>  /// Used for methods that can only fail with `EBADF`.
> 
> -- 
> 2.43.0.rc1.413.gea7ed67945-goog
>
Alice Ryhl Nov. 29, 2023, 4:55 p.m. UTC | #2
Christian Brauner <brauner@kernel.org> writes:
> Can we follow the traditional file terminology, i.e.,
> get_unused_fd_flags() and fd_install()? At least at the beginning this
> might be quite helpful instead of having to mentally map new() and
> commit() onto the C functions.

Sure, I'll do that in the next version.

>> +    /// Prevent values of this type from being moved to a different task.
>> +    ///
>> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
>> +    /// owns the fd in question.
>> +    _not_send_sync: PhantomData<*mut ()>,
> 
> I don't fully understand this. Can you explain in a little more detail
> what you mean by this and how this works?

Yeah, so, this has to do with the Rust trait `Send` that controls
whether it's okay for a value to get moved from one thread to another.
In this case, we don't want it to be `Send` so that it can't be moved to
another thread, since current might be different there.

The `Send` trait is automatically applied to structs whenever *all*
fields of the struct are `Send`. So to ensure that a struct is not
`Send`, you add a field that is not `Send`.

The `PhantomData` type used here is a special zero-sized type.
Basically, it says "pretend this struct has a field of type `*mut ()`,
but don't actually add the field". So for the purposes of `Send`, it has
a non-Send field, but since its wrapped in `PhantomData`, the field is
not there at runtime.

>> +        Ok(Self {
>> +            fd: fd as _,
> 
> This is a cast to a u32?

Yes.

> Can you please draft a quick example how that return value would be
> expected to be used by a caller? It's really not clear

The most basic usage would look like this:

	// First, reserve the fd.
	let reservation = FileDescriptorReservation::new(O_CLOEXEC)?;

	// Then, somehow get a file to put in it.
	let file = get_file_using_fallible_operation()?;

	// Finally, commit it to the fd.
	reservation.commit(file);

In Rust Binder, reservations are used here:
https://github.com/Darksonn/linux/blob/dca45e6c7848e024709b165a306cdbe88e5b086a/drivers/android/allocation.rs#L199-L210
https://github.com/Darksonn/linux/blob/dca45e6c7848e024709b165a306cdbe88e5b086a/drivers/android/allocation.rs#L512-L541

>> +    pub fn commit(self, file: ARef<File>) {
>> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
>> +        // guaranteed to have an owned ref count by its type invariants.
>> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };
> 
> Why file.0.get()? Where did that come from?

This gets a raw pointer to the C type.

The `.0` part is a field access. `ARef` struct is a tuple struct, so its
fields are unnamed. However, the fields can still be accessed by index.

Alice
Alice Ryhl Nov. 29, 2023, 5:14 p.m. UTC | #3
On Wed, Nov 29, 2023 at 5:55 PM Alice Ryhl <aliceryhl@google.com> wrote:

> >> +    pub fn commit(self, file: ARef<File>) {
> >> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
> >> +        // guaranteed to have an owned ref count by its type invariants.
> >> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };
> >
> > Why file.0.get()? Where did that come from?
>
> This gets a raw pointer to the C type.
>
> The `.0` part is a field access. `ARef` struct is a tuple struct, so its
> fields are unnamed. However, the fields can still be accessed by index.

Oh, sorry, this is wrong. Let me try again:

This gets a raw pointer to the C type. The `.0` part accesses the
field of type `Opaque<bindings::file>` in the Rust wrapper. Recall
that File is defined like this:

pub struct File(Opaque<bindings::file>);

The above syntax defines a tuple struct, which means that the fields
are unnamed. The `.0` syntax accesses the first field of a tuple
struct [1].

The `.get()` method is from the `Opaque` struct, which returns a raw
pointer to the C type being wrapped.

Alice

[1]: https://doc.rust-lang.org/std/keyword.struct.html#:~:text=Tuple%20structs%20are%20similar%20to,with%20regular%20tuples%2C%20namely%20foo.
Christian Brauner Nov. 30, 2023, 9:09 a.m. UTC | #4
On Wed, Nov 29, 2023 at 04:55:51PM +0000, Alice Ryhl wrote:
> Christian Brauner <brauner@kernel.org> writes:
> > Can we follow the traditional file terminology, i.e.,
> > get_unused_fd_flags() and fd_install()? At least at the beginning this
> > might be quite helpful instead of having to mentally map new() and
> > commit() onto the C functions.
> 
> Sure, I'll do that in the next version.
> 
> >> +    /// Prevent values of this type from being moved to a different task.
> >> +    ///
> >> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
> >> +    /// owns the fd in question.
> >> +    _not_send_sync: PhantomData<*mut ()>,
> > 
> > I don't fully understand this. Can you explain in a little more detail
> > what you mean by this and how this works?
> 
> Yeah, so, this has to do with the Rust trait `Send` that controls
> whether it's okay for a value to get moved from one thread to another.
> In this case, we don't want it to be `Send` so that it can't be moved to
> another thread, since current might be different there.
> 
> The `Send` trait is automatically applied to structs whenever *all*
> fields of the struct are `Send`. So to ensure that a struct is not
> `Send`, you add a field that is not `Send`.
> 
> The `PhantomData` type used here is a special zero-sized type.
> Basically, it says "pretend this struct has a field of type `*mut ()`,
> but don't actually add the field". So for the purposes of `Send`, it has
> a non-Send field, but since its wrapped in `PhantomData`, the field is
> not there at runtime.

This probably a stupid suggestion, question. But while PhantomData gives
the right hint of what is happening I wouldn't mind if that was very
explicitly called NoSendTrait or just add the explanatory comment. Yes,
that's a lot of verbiage but you'd help us a lot.

> 
> >> +        Ok(Self {
> >> +            fd: fd as _,
> > 
> > This is a cast to a u32?
> 
> Yes.
> 
> > Can you please draft a quick example how that return value would be
> > expected to be used by a caller? It's really not clear
> 
> The most basic usage would look like this:
> 
> 	// First, reserve the fd.
> 	let reservation = FileDescriptorReservation::new(O_CLOEXEC)?;
> 
> 	// Then, somehow get a file to put in it.
> 	let file = get_file_using_fallible_operation()?;
> 
> 	// Finally, commit it to the fd.
> 	reservation.commit(file);

Ok, the reason I asked was that I was confused about the PhantomData and
how that would figure into using the return value as I hadn't seen that
Ok(Self { }) syntax before. Thanks.
  
> 
> In Rust Binder, reservations are used here:
> https://github.com/Darksonn/linux/blob/dca45e6c7848e024709b165a306cdbe88e5b086a/drivers/android/allocation.rs#L199-L210
> https://github.com/Darksonn/linux/blob/dca45e6c7848e024709b165a306cdbe88e5b086a/drivers/android/allocation.rs#L512-L541
> 
> >> +    pub fn commit(self, file: ARef<File>) {
> >> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
> >> +        // guaranteed to have an owned ref count by its type invariants.
> >> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };
> > 
> > Why file.0.get()? Where did that come from?
> 
> This gets a raw pointer to the C type.
> 
> The `.0` part is a field access. `ARef` struct is a tuple struct, so its

Ah, there we go. It's a bit ugly tbh.

> fields are unnamed. However, the fields can still be accessed by index.
Christian Brauner Nov. 30, 2023, 9:12 a.m. UTC | #5
On Wed, Nov 29, 2023 at 06:14:24PM +0100, Alice Ryhl wrote:
> On Wed, Nov 29, 2023 at 5:55 PM Alice Ryhl <aliceryhl@google.com> wrote:
> 
> > >> +    pub fn commit(self, file: ARef<File>) {
> > >> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
> > >> +        // guaranteed to have an owned ref count by its type invariants.
> > >> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };
> > >
> > > Why file.0.get()? Where did that come from?
> >
> > This gets a raw pointer to the C type.
> >
> > The `.0` part is a field access. `ARef` struct is a tuple struct, so its
> > fields are unnamed. However, the fields can still be accessed by index.
> 
> Oh, sorry, this is wrong. Let me try again:
> 
> This gets a raw pointer to the C type. The `.0` part accesses the
> field of type `Opaque<bindings::file>` in the Rust wrapper. Recall
> that File is defined like this:
> 
> pub struct File(Opaque<bindings::file>);
> 
> The above syntax defines a tuple struct, which means that the fields
> are unnamed. The `.0` syntax accesses the first field of a tuple
> struct [1].
> 
> The `.get()` method is from the `Opaque` struct, which returns a raw
> pointer to the C type being wrapped.

It'd be nice if this could be written in a more obvious/elegant way. And
if not a comment would help. I know there'll be more text then code but
until this is second nature to read I personally won't mind... Because
searching for this specific syntax isn't really possible.
Alice Ryhl Nov. 30, 2023, 9:17 a.m. UTC | #6
Christian Brauner <brauner@kernel.org> writes:
>>>> +    /// Prevent values of this type from being moved to a different task.
>>>> +    ///
>>>> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
>>>> +    /// owns the fd in question.
>>>> +    _not_send_sync: PhantomData<*mut ()>,
>>> 
>>> I don't fully understand this. Can you explain in a little more detail
>>> what you mean by this and how this works?
>> 
>> Yeah, so, this has to do with the Rust trait `Send` that controls
>> whether it's okay for a value to get moved from one thread to another.
>> In this case, we don't want it to be `Send` so that it can't be moved to
>> another thread, since current might be different there.
>> 
>> The `Send` trait is automatically applied to structs whenever *all*
>> fields of the struct are `Send`. So to ensure that a struct is not
>> `Send`, you add a field that is not `Send`.
>> 
>> The `PhantomData` type used here is a special zero-sized type.
>> Basically, it says "pretend this struct has a field of type `*mut ()`,
>> but don't actually add the field". So for the purposes of `Send`, it has
>> a non-Send field, but since its wrapped in `PhantomData`, the field is
>> not there at runtime.
> 
> This probably a stupid suggestion, question. But while PhantomData gives
> the right hint of what is happening I wouldn't mind if that was very
> explicitly called NoSendTrait or just add the explanatory comment. Yes,
> that's a lot of verbiage but you'd help us a lot.

I suppose we could add a typedef:

type NoSendTrait = PhantomData<*mut ()>;

and use that as the field type. The way I did it here is the "standard"
way of doing it, and if you look at code outside the kernel, you will
also find them using `PhantomData` like this. However, I don't mind
adding the typedef if you think it is helpful.

Alice
Alice Ryhl Nov. 30, 2023, 9:23 a.m. UTC | #7
Christian Brauner <brauner@kernel.org> writes:
> On Wed, Nov 29, 2023 at 06:14:24PM +0100, Alice Ryhl wrote:
> > On Wed, Nov 29, 2023 at 5:55 PM Alice Ryhl <aliceryhl@google.com> wrote:
> > 
> > > >> +    pub fn commit(self, file: ARef<File>) {
> > > >> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
> > > >> +        // guaranteed to have an owned ref count by its type invariants.
> > > >> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };
> > > >
> > > > Why file.0.get()? Where did that come from?
> > >
> > > This gets a raw pointer to the C type.
> > >
> > > The `.0` part is a field access. `ARef` struct is a tuple struct, so its
> > > fields are unnamed. However, the fields can still be accessed by index.
> > 
> > Oh, sorry, this is wrong. Let me try again:
> > 
> > This gets a raw pointer to the C type. The `.0` part accesses the
> > field of type `Opaque<bindings::file>` in the Rust wrapper. Recall
> > that File is defined like this:
> > 
> > pub struct File(Opaque<bindings::file>);
> > 
> > The above syntax defines a tuple struct, which means that the fields
> > are unnamed. The `.0` syntax accesses the first field of a tuple
> > struct [1].
> > 
> > The `.get()` method is from the `Opaque` struct, which returns a raw
> > pointer to the C type being wrapped.
> 
> It'd be nice if this could be written in a more obvious/elegant way. And
> if not a comment would help. I know there'll be more text then code but
> until this is second nature to read I personally won't mind... Because
> searching for this specific syntax isn't really possible.

Adding a comment to every instance of this is probably not realisitic.
This kind of code will be very common in abstraction code. However,
there are two other options that I think are reasonable:

1. I can change the definition of `File` so that the field has a name:

struct File {
    inner: Opaque<bindings::file>,
}

Then, it would say `file.inner.get()`.

2. Alternatively, I can add a method to file:

impl File {
    #[inline]
    pub fn as_ptr(&self) -> *mut bindings::file {
        self.0.get()
    }
}

And then write `file.as_ptr()` whenever I want a pointer.

Alice
Christian Brauner Nov. 30, 2023, 10:51 a.m. UTC | #8
On Thu, Nov 30, 2023 at 09:17:56AM +0000, Alice Ryhl wrote:
> Christian Brauner <brauner@kernel.org> writes:
> >>>> +    /// Prevent values of this type from being moved to a different task.
> >>>> +    ///
> >>>> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
> >>>> +    /// owns the fd in question.
> >>>> +    _not_send_sync: PhantomData<*mut ()>,
> >>> 
> >>> I don't fully understand this. Can you explain in a little more detail
> >>> what you mean by this and how this works?
> >> 
> >> Yeah, so, this has to do with the Rust trait `Send` that controls
> >> whether it's okay for a value to get moved from one thread to another.
> >> In this case, we don't want it to be `Send` so that it can't be moved to
> >> another thread, since current might be different there.
> >> 
> >> The `Send` trait is automatically applied to structs whenever *all*
> >> fields of the struct are `Send`. So to ensure that a struct is not
> >> `Send`, you add a field that is not `Send`.
> >> 
> >> The `PhantomData` type used here is a special zero-sized type.
> >> Basically, it says "pretend this struct has a field of type `*mut ()`,
> >> but don't actually add the field". So for the purposes of `Send`, it has
> >> a non-Send field, but since its wrapped in `PhantomData`, the field is
> >> not there at runtime.
> > 
> > This probably a stupid suggestion, question. But while PhantomData gives
> > the right hint of what is happening I wouldn't mind if that was very
> > explicitly called NoSendTrait or just add the explanatory comment. Yes,
> > that's a lot of verbiage but you'd help us a lot.
> 
> I suppose we could add a typedef:
> 
> type NoSendTrait = PhantomData<*mut ()>;
> 
> and use that as the field type. The way I did it here is the "standard"
> way of doing it, and if you look at code outside the kernel, you will
> also find them using `PhantomData` like this. However, I don't mind
> adding the typedef if you think it is helpful.

I'm fine with just a comment as well. I just need to be able to read
this a bit faster. I'm basically losing half a day just dealing with
this patchset and that's not realistic if I want to keep up with other
patches that get sent.

And if you resend and someone else review you might have to answer the
same question again.
Alice Ryhl Nov. 30, 2023, 11:54 a.m. UTC | #9
Christian Brauner <brauner@kernel.org> writes:
> On Thu, Nov 30, 2023 at 09:17:56AM +0000, Alice Ryhl wrote:
>> Christian Brauner <brauner@kernel.org> writes:
>>>>>> +    /// Prevent values of this type from being moved to a different task.
>>>>>> +    ///
>>>>>> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
>>>>>> +    /// owns the fd in question.
>>>>>> +    _not_send_sync: PhantomData<*mut ()>,
>>>>> 
>>>>> I don't fully understand this. Can you explain in a little more detail
>>>>> what you mean by this and how this works?
>>>> 
>>>> Yeah, so, this has to do with the Rust trait `Send` that controls
>>>> whether it's okay for a value to get moved from one thread to another.
>>>> In this case, we don't want it to be `Send` so that it can't be moved to
>>>> another thread, since current might be different there.
>>>> 
>>>> The `Send` trait is automatically applied to structs whenever *all*
>>>> fields of the struct are `Send`. So to ensure that a struct is not
>>>> `Send`, you add a field that is not `Send`.
>>>> 
>>>> The `PhantomData` type used here is a special zero-sized type.
>>>> Basically, it says "pretend this struct has a field of type `*mut ()`,
>>>> but don't actually add the field". So for the purposes of `Send`, it has
>>>> a non-Send field, but since its wrapped in `PhantomData`, the field is
>>>> not there at runtime.
>>> 
>>> This probably a stupid suggestion, question. But while PhantomData gives
>>> the right hint of what is happening I wouldn't mind if that was very
>>> explicitly called NoSendTrait or just add the explanatory comment. Yes,
>>> that's a lot of verbiage but you'd help us a lot.
>> 
>> I suppose we could add a typedef:
>> 
>> type NoSendTrait = PhantomData<*mut ()>;
>> 
>> and use that as the field type. The way I did it here is the "standard"
>> way of doing it, and if you look at code outside the kernel, you will
>> also find them using `PhantomData` like this. However, I don't mind
>> adding the typedef if you think it is helpful.
> 
> I'm fine with just a comment as well. I just need to be able to read
> this a bit faster. I'm basically losing half a day just dealing with
> this patchset and that's not realistic if I want to keep up with other
> patches that get sent.
> 
> And if you resend and someone else review you might have to answer the
> same question again.

What do you think about this wording?

/// Prevent values of this type from being moved to a different task.
/// 
/// This field has the type `PhantomData<*mut ()>`, which does not
/// implement the Send trait. By adding a field with this property, we
/// ensure that the `FileDescriptorReservation` struct will not
/// implement the Send trait either. This has the consequence that the
/// compiler will prevent you from moving values of type
/// `FileDescriptorReservation` into a different task, which we want
/// because other tasks might have a different value of `current`. We
/// want to avoid that because `fd_install` assumes that the value of
/// `current` is unchanged since the call to `get_unused_fd_flags`.
/// 
/// The `PhantomData` type has size zero, so the field does not exist at
/// runtime.

Alice
Benno Lossin Nov. 30, 2023, 12:17 p.m. UTC | #10
On 30.11.23 12:54, Alice Ryhl wrote:
> Christian Brauner <brauner@kernel.org> writes:
>> On Thu, Nov 30, 2023 at 09:17:56AM +0000, Alice Ryhl wrote:
>>> Christian Brauner <brauner@kernel.org> writes:
>>>>>>> +    /// Prevent values of this type from being moved to a different task.
>>>>>>> +    ///
>>>>>>> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
>>>>>>> +    /// owns the fd in question.
>>>>>>> +    _not_send_sync: PhantomData<*mut ()>,
>>>>>>
>>>>>> I don't fully understand this. Can you explain in a little more detail
>>>>>> what you mean by this and how this works?
>>>>>
>>>>> Yeah, so, this has to do with the Rust trait `Send` that controls
>>>>> whether it's okay for a value to get moved from one thread to another.
>>>>> In this case, we don't want it to be `Send` so that it can't be moved to
>>>>> another thread, since current might be different there.
>>>>>
>>>>> The `Send` trait is automatically applied to structs whenever *all*
>>>>> fields of the struct are `Send`. So to ensure that a struct is not
>>>>> `Send`, you add a field that is not `Send`.
>>>>>
>>>>> The `PhantomData` type used here is a special zero-sized type.
>>>>> Basically, it says "pretend this struct has a field of type `*mut ()`,
>>>>> but don't actually add the field". So for the purposes of `Send`, it has
>>>>> a non-Send field, but since its wrapped in `PhantomData`, the field is
>>>>> not there at runtime.
>>>>
>>>> This probably a stupid suggestion, question. But while PhantomData gives
>>>> the right hint of what is happening I wouldn't mind if that was very
>>>> explicitly called NoSendTrait or just add the explanatory comment. Yes,
>>>> that's a lot of verbiage but you'd help us a lot.
>>>
>>> I suppose we could add a typedef:
>>>
>>> type NoSendTrait = PhantomData<*mut ()>;
>>>
>>> and use that as the field type. The way I did it here is the "standard"
>>> way of doing it, and if you look at code outside the kernel, you will
>>> also find them using `PhantomData` like this. However, I don't mind
>>> adding the typedef if you think it is helpful.
>>
>> I'm fine with just a comment as well. I just need to be able to read
>> this a bit faster. I'm basically losing half a day just dealing with
>> this patchset and that's not realistic if I want to keep up with other
>> patches that get sent.
>>
>> And if you resend and someone else review you might have to answer the
>> same question again.
> 
> What do you think about this wording?
> 
> /// Prevent values of this type from being moved to a different task.
> ///
> /// This field has the type `PhantomData<*mut ()>`, which does not
> /// implement the Send trait. By adding a field with this property, we
> /// ensure that the `FileDescriptorReservation` struct will not
> /// implement the Send trait either. This has the consequence that the
> /// compiler will prevent you from moving values of type
> /// `FileDescriptorReservation` into a different task, which we want
> /// because other tasks might have a different value of `current`. We
> /// want to avoid that because `fd_install` assumes that the value of
> /// `current` is unchanged since the call to `get_unused_fd_flags`.
> ///
> /// The `PhantomData` type has size zero, so the field does not exist at
> /// runtime.
> 
> Alice

I don't think it is a good idea to add this big comment to every
`PhantomData` field. I would much rather have a type alias:

    /// Zero-sized type to mark types not [`Send`].
    ///
    /// Add this type as a field to your struct if your type should not be sent to a different task.
    /// Since [`Send`] is an auto trait, adding a single field that is [`!Send`] will ensure that the
    /// whole type is [`!Send`].
    ///
    /// If a type is [`!Send`] it is impossible to give control over an instance of the type to another
    /// task. This is useful when a type stores task-local information for example file descriptors.
    pub type NotSend = PhantomData<*mut ()>;

If you have suggestions for improving the doc comment or the name,
please go ahead.

This doesn't mean that there should be no comment on the `NotSend`
field of `FileDescriptorReservation`, but I don't want to repeat
the `Send` stuff all over the place (since it comes up a lot):

    /// Ensure that `FileDescriptorReservation` cannot be sent to a different task, since there the
    /// value of `current` is different. We want to avoid that because `fd_install` assumes that the
    /// value of `current` is unchanged since the call to `get_unused_fd_flags`.
    _not_send: NotSend,
Christian Brauner Nov. 30, 2023, 12:33 p.m. UTC | #11
On Thu, Nov 30, 2023 at 12:17:14PM +0000, Benno Lossin wrote:
> On 30.11.23 12:54, Alice Ryhl wrote:
> > Christian Brauner <brauner@kernel.org> writes:
> >> On Thu, Nov 30, 2023 at 09:17:56AM +0000, Alice Ryhl wrote:
> >>> Christian Brauner <brauner@kernel.org> writes:
> >>>>>>> +    /// Prevent values of this type from being moved to a different task.
> >>>>>>> +    ///
> >>>>>>> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
> >>>>>>> +    /// owns the fd in question.
> >>>>>>> +    _not_send_sync: PhantomData<*mut ()>,
> >>>>>>
> >>>>>> I don't fully understand this. Can you explain in a little more detail
> >>>>>> what you mean by this and how this works?
> >>>>>
> >>>>> Yeah, so, this has to do with the Rust trait `Send` that controls
> >>>>> whether it's okay for a value to get moved from one thread to another.
> >>>>> In this case, we don't want it to be `Send` so that it can't be moved to
> >>>>> another thread, since current might be different there.
> >>>>>
> >>>>> The `Send` trait is automatically applied to structs whenever *all*
> >>>>> fields of the struct are `Send`. So to ensure that a struct is not
> >>>>> `Send`, you add a field that is not `Send`.
> >>>>>
> >>>>> The `PhantomData` type used here is a special zero-sized type.
> >>>>> Basically, it says "pretend this struct has a field of type `*mut ()`,
> >>>>> but don't actually add the field". So for the purposes of `Send`, it has
> >>>>> a non-Send field, but since its wrapped in `PhantomData`, the field is
> >>>>> not there at runtime.
> >>>>
> >>>> This probably a stupid suggestion, question. But while PhantomData gives
> >>>> the right hint of what is happening I wouldn't mind if that was very
> >>>> explicitly called NoSendTrait or just add the explanatory comment. Yes,
> >>>> that's a lot of verbiage but you'd help us a lot.
> >>>
> >>> I suppose we could add a typedef:
> >>>
> >>> type NoSendTrait = PhantomData<*mut ()>;
> >>>
> >>> and use that as the field type. The way I did it here is the "standard"
> >>> way of doing it, and if you look at code outside the kernel, you will
> >>> also find them using `PhantomData` like this. However, I don't mind
> >>> adding the typedef if you think it is helpful.
> >>
> >> I'm fine with just a comment as well. I just need to be able to read
> >> this a bit faster. I'm basically losing half a day just dealing with
> >> this patchset and that's not realistic if I want to keep up with other
> >> patches that get sent.
> >>
> >> And if you resend and someone else review you might have to answer the
> >> same question again.
> > 
> > What do you think about this wording?
> > 
> > /// Prevent values of this type from being moved to a different task.
> > ///
> > /// This field has the type `PhantomData<*mut ()>`, which does not
> > /// implement the Send trait. By adding a field with this property, we
> > /// ensure that the `FileDescriptorReservation` struct will not
> > /// implement the Send trait either. This has the consequence that the
> > /// compiler will prevent you from moving values of type
> > /// `FileDescriptorReservation` into a different task, which we want
> > /// because other tasks might have a different value of `current`. We
> > /// want to avoid that because `fd_install` assumes that the value of
> > /// `current` is unchanged since the call to `get_unused_fd_flags`.
> > ///
> > /// The `PhantomData` type has size zero, so the field does not exist at
> > /// runtime.
> > 
> > Alice
> 
> I don't think it is a good idea to add this big comment to every
> `PhantomData` field. I would much rather have a type alias:
> 
>     /// Zero-sized type to mark types not [`Send`].
>     ///
>     /// Add this type as a field to your struct if your type should not be sent to a different task.
>     /// Since [`Send`] is an auto trait, adding a single field that is [`!Send`] will ensure that the
>     /// whole type is [`!Send`].
>     ///
>     /// If a type is [`!Send`] it is impossible to give control over an instance of the type to another
>     /// task. This is useful when a type stores task-local information for example file descriptors.
>     pub type NotSend = PhantomData<*mut ()>;
> 
> If you have suggestions for improving the doc comment or the name,
> please go ahead.
> 
> This doesn't mean that there should be no comment on the `NotSend`
> field of `FileDescriptorReservation`, but I don't want to repeat
> the `Send` stuff all over the place (since it comes up a lot):
> 
>     /// Ensure that `FileDescriptorReservation` cannot be sent to a different task, since there the
>     /// value of `current` is different. We want to avoid that because `fd_install` assumes that the
>     /// value of `current` is unchanged since the call to `get_unused_fd_flags`.
>     _not_send: NotSend,

Seems sane to me. But I would suggest to move away from the "send"
terminology?

* CurrentOnly
* AccessCurrentTask vs AccessForeignTask
* NoForeignTaskAccess
* TaskLocalContext
* TaskCurrentAccess

Or some other variant thereof.
Benno Lossin Nov. 30, 2023, 4:40 p.m. UTC | #12
On 11/29/23 14:11, Alice Ryhl wrote:
> +impl FileDescriptorReservation {
> +    /// Creates a new file descriptor reservation.
> +    pub fn new(flags: u32) -> Result<Self> {
> +        // SAFETY: FFI call, there are no safety requirements on `flags`.
> +        let fd: i32 = unsafe { bindings::get_unused_fd_flags(flags) };
> +        if fd < 0 {
> +            return Err(Error::from_errno(fd));
> +        }

I think here we could also use the modified `to_result` function that
returns a `u32` if the value is non-negative.

> +        Ok(Self {
> +            fd: fd as _,
> +            _not_send_sync: PhantomData,
> +        })
> +    }
> +
> +    /// Returns the file descriptor number that was reserved.
> +    pub fn reserved_fd(&self) -> u32 {
> +        self.fd
> +    }
> +
> +    /// Commits the reservation.
> +    ///
> +    /// The previously reserved file descriptor is bound to `file`. This method consumes the
> +    /// [`FileDescriptorReservation`], so it will not be usable after this call.
> +    pub fn commit(self, file: ARef<File>) {
> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
> +        // guaranteed to have an owned ref count by its type invariants.
> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };
> +
> +        // `fd_install` consumes both the file descriptor and the file reference, so we cannot run
> +        // the destructors.
> +        core::mem::forget(self);
> +        core::mem::forget(file);

Would be useful to have an `ARef::into_raw` function that would do
the `forget` for us.
Alice Ryhl Dec. 1, 2023, 11:32 a.m. UTC | #13
Benno Lossin <benno.lossin@proton.me> writes:
> On 11/29/23 14:11, Alice Ryhl wrote:
> > +impl FileDescriptorReservation {
> > +    /// Creates a new file descriptor reservation.
> > +    pub fn new(flags: u32) -> Result<Self> {
> > +        // SAFETY: FFI call, there are no safety requirements on `flags`.
> > +        let fd: i32 = unsafe { bindings::get_unused_fd_flags(flags) };
> > +        if fd < 0 {
> > +            return Err(Error::from_errno(fd));
> > +        }
> 
> I think here we could also use the modified `to_result` function that
> returns a `u32` if the value is non-negative.

I'll look into that for the next version.

>> +    /// Commits the reservation.
>> +    ///
>> +    /// The previously reserved file descriptor is bound to `file`. This method consumes the
>> +    /// [`FileDescriptorReservation`], so it will not be usable after this call.
>> +    pub fn commit(self, file: ARef<File>) {
>> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
>> +        // guaranteed to have an owned ref count by its type invariants.
>> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };
>> +
>> +        // `fd_install` consumes both the file descriptor and the file reference, so we cannot run
>> +        // the destructors.
>> +        core::mem::forget(self);
>> +        core::mem::forget(file);
> 
> Would be useful to have an `ARef::into_raw` function that would do
> the `forget` for us.

That makes sense to me, but I don't think it needs to happen in this patchset.

Alice
diff mbox series

Patch

diff --git a/rust/kernel/file.rs b/rust/kernel/file.rs
index f1f71c3d97e2..2186a6ea3f2f 100644
--- a/rust/kernel/file.rs
+++ b/rust/kernel/file.rs
@@ -11,7 +11,7 @@ 
     error::{code::*, Error, Result},
     types::{ARef, AlwaysRefCounted, Opaque},
 };
-use core::ptr;
+use core::{marker::PhantomData, ptr};
 
 /// Flags associated with a [`File`].
 pub mod flags {
@@ -180,6 +180,68 @@  unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
     }
 }
 
+/// A file descriptor reservation.
+///
+/// This allows the creation of a file descriptor in two steps: first, we reserve a slot for it,
+/// then we commit or drop the reservation. The first step may fail (e.g., the current process ran
+/// out of available slots), but commit and drop never fail (and are mutually exclusive).
+///
+/// Dropping the reservation happens in the destructor of this type.
+///
+/// # Invariants
+///
+/// The fd stored in this struct must correspond to a reserved file descriptor of the current task.
+pub struct FileDescriptorReservation {
+    fd: u32,
+    /// Prevent values of this type from being moved to a different task.
+    ///
+    /// This is necessary because the C FFI calls assume that `current` is set to the task that
+    /// owns the fd in question.
+    _not_send_sync: PhantomData<*mut ()>,
+}
+
+impl FileDescriptorReservation {
+    /// Creates a new file descriptor reservation.
+    pub fn new(flags: u32) -> Result<Self> {
+        // SAFETY: FFI call, there are no safety requirements on `flags`.
+        let fd: i32 = unsafe { bindings::get_unused_fd_flags(flags) };
+        if fd < 0 {
+            return Err(Error::from_errno(fd));
+        }
+        Ok(Self {
+            fd: fd as _,
+            _not_send_sync: PhantomData,
+        })
+    }
+
+    /// Returns the file descriptor number that was reserved.
+    pub fn reserved_fd(&self) -> u32 {
+        self.fd
+    }
+
+    /// Commits the reservation.
+    ///
+    /// The previously reserved file descriptor is bound to `file`. This method consumes the
+    /// [`FileDescriptorReservation`], so it will not be usable after this call.
+    pub fn commit(self, file: ARef<File>) {
+        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
+        // guaranteed to have an owned ref count by its type invariants.
+        unsafe { bindings::fd_install(self.fd, file.0.get()) };
+
+        // `fd_install` consumes both the file descriptor and the file reference, so we cannot run
+        // the destructors.
+        core::mem::forget(self);
+        core::mem::forget(file);
+    }
+}
+
+impl Drop for FileDescriptorReservation {
+    fn drop(&mut self) {
+        // SAFETY: `self.fd` was returned by a previous call to `get_unused_fd_flags`.
+        unsafe { bindings::put_unused_fd(self.fd) };
+    }
+}
+
 /// Represents the `EBADF` error code.
 ///
 /// Used for methods that can only fail with `EBADF`.