diff mbox series

[v5,1/4] rust: uaccess: add userspace pointers

Message ID 20240415-alice-mm-v5-1-6f55e4d8ef51@google.com (mailing list archive)
State New
Headers show
Series Memory management patches needed by Rust Binder | expand

Commit Message

Alice Ryhl April 15, 2024, 7:13 a.m. UTC
From: Wedson Almeida Filho <wedsonaf@gmail.com>

A pointer to an area in userspace memory, which can be either read-only
or read-write.

All methods on this struct are safe: attempting to read or write on bad
addresses (either out of the bound of the slice or unmapped addresses)
will return `EFAULT`. Concurrent access, *including data races to/from
userspace memory*, is permitted, because fundamentally another userspace
thread/process could always be modifying memory at the same time (in the
same way that userspace Rust's `std::io` permits data races with the
contents of files on disk). In the presence of a race, the exact byte
values read/written are unspecified but the operation is well-defined.
Kernelspace code should validate its copy of data after completing a
read, and not expect that multiple reads of the same address will return
the same value.

These APIs are designed to make it difficult to accidentally write
TOCTOU bugs. Every time you read from a memory location, the pointer is
advanced by the length so that you cannot use that reader to read the
same memory location twice. Preventing double-fetches avoids TOCTOU
bugs. This is accomplished by taking `self` by value to prevent
obtaining multiple readers on a given `UserSlicePtr`, and the readers
only permitting forward reads. If double-fetching a memory location is
necessary for some reason, then that is done by creating multiple
readers to the same memory location.

Constructing a `UserSlicePtr` performs no checks on the provided
address and length, it can safely be constructed inside a kernel thread
with no current userspace process. Reads and writes wrap the kernel APIs
`copy_from_user` and `copy_to_user`, which check the memory map of the
current process and enforce that the address range is within the user
range (no additional calls to `access_ok` are needed).

This code is based on something that was originally written by Wedson on
the old rust branch. It was modified by Alice by removing the
`IoBufferReader` and `IoBufferWriter` traits, and various other changes.

Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/helpers.c         |  14 +++
 rust/kernel/lib.rs     |   1 +
 rust/kernel/uaccess.rs | 304 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 319 insertions(+)

Comments

Benno Lossin April 15, 2024, 9:36 a.m. UTC | #1
On 15.04.24 09:13, Alice Ryhl wrote:
> +impl UserSlice {
> +    /// Constructs a user slice from a raw pointer and a length in bytes.
> +    ///
> +    /// Constructing a [`UserSlice`] performs no checks on the provided address and length, it can
> +    /// safely be constructed inside a kernel thread with no current userspace process. Reads and
> +    /// writes wrap the kernel APIs `copy_from_user` and `copy_to_user`, which check the memory map
> +    /// of the current process and enforce that the address range is within the user range (no
> +    /// additional calls to `access_ok` are needed).
> +    ///
> +    /// Callers must be careful to avoid time-of-check-time-of-use (TOCTOU) issues. The simplest way
> +    /// is to create a single instance of [`UserSlice`] per user memory block as it reads each byte
> +    /// at most once.
> +    pub fn new(ptr: *mut c_void, length: usize) -> Self {

What would happen if I call this with a kernel pointer and then
read/write to it? For example

     let mut arr = [MaybeUninit::uninit(); 64];
     let ptr: *mut [MaybeUninit<u8>] = &mut arr;
     let ptr = ptr.cast::<c_void>();

     let slice = UserSlice::new(ptr, 64);
     let (mut r, mut w) = slice.reader_writer();

     r.read_raw(&mut arr)?;
     // SAFETY: `arr` was initialized above.
     w.write_slice(unsafe { MaybeUninit::slice_assume_init_ref(&arr) })?;

I think this would violate the exclusivity of `&mut` without any
`unsafe` code. (the `unsafe` block at the end cannot possibly be wrong)

> +        UserSlice { ptr, length }
> +    }

[...]

> +    /// Returns `true` if no data is available in the io buffer.
> +    pub fn is_empty(&self) -> bool {
> +        self.length == 0
> +    }
> +
> +    /// Reads raw data from the user slice into a kernel buffer.
> +    ///
> +    /// After a successful call to this method, all bytes in `out` are initialized.

I think we should put things like this into a `# Guarantees` section.
Alice Ryhl April 15, 2024, 9:44 a.m. UTC | #2
On Mon, Apr 15, 2024 at 11:37 AM Benno Lossin <benno.lossin@proton.me> wrote:
>
> On 15.04.24 09:13, Alice Ryhl wrote:
> > +impl UserSlice {
> > +    /// Constructs a user slice from a raw pointer and a length in bytes.
> > +    ///
> > +    /// Constructing a [`UserSlice`] performs no checks on the provided address and length, it can
> > +    /// safely be constructed inside a kernel thread with no current userspace process. Reads and
> > +    /// writes wrap the kernel APIs `copy_from_user` and `copy_to_user`, which check the memory map
> > +    /// of the current process and enforce that the address range is within the user range (no
> > +    /// additional calls to `access_ok` are needed).
> > +    ///
> > +    /// Callers must be careful to avoid time-of-check-time-of-use (TOCTOU) issues. The simplest way
> > +    /// is to create a single instance of [`UserSlice`] per user memory block as it reads each byte
> > +    /// at most once.
> > +    pub fn new(ptr: *mut c_void, length: usize) -> Self {
>
> What would happen if I call this with a kernel pointer and then
> read/write to it? For example
>
>      let mut arr = [MaybeUninit::uninit(); 64];
>      let ptr: *mut [MaybeUninit<u8>] = &mut arr;
>      let ptr = ptr.cast::<c_void>();
>
>      let slice = UserSlice::new(ptr, 64);
>      let (mut r, mut w) = slice.reader_writer();
>
>      r.read_raw(&mut arr)?;
>      // SAFETY: `arr` was initialized above.
>      w.write_slice(unsafe { MaybeUninit::slice_assume_init_ref(&arr) })?;
>
> I think this would violate the exclusivity of `&mut` without any
> `unsafe` code. (the `unsafe` block at the end cannot possibly be wrong)

This will fail with an EFAULT error. There is a check on the C side
that verifies that the address is in userspace. (The access_ok call.)

Alice
Benno Lossin April 15, 2024, 9:51 a.m. UTC | #3
On 15.04.24 11:44, Alice Ryhl wrote:
> On Mon, Apr 15, 2024 at 11:37 AM Benno Lossin <benno.lossin@proton.me> wrote:
>>
>> On 15.04.24 09:13, Alice Ryhl wrote:
>>> +impl UserSlice {
>>> +    /// Constructs a user slice from a raw pointer and a length in bytes.
>>> +    ///
>>> +    /// Constructing a [`UserSlice`] performs no checks on the provided address and length, it can
>>> +    /// safely be constructed inside a kernel thread with no current userspace process. Reads and
>>> +    /// writes wrap the kernel APIs `copy_from_user` and `copy_to_user`, which check the memory map
>>> +    /// of the current process and enforce that the address range is within the user range (no
>>> +    /// additional calls to `access_ok` are needed).
>>> +    ///
>>> +    /// Callers must be careful to avoid time-of-check-time-of-use (TOCTOU) issues. The simplest way
>>> +    /// is to create a single instance of [`UserSlice`] per user memory block as it reads each byte
>>> +    /// at most once.
>>> +    pub fn new(ptr: *mut c_void, length: usize) -> Self {
>>
>> What would happen if I call this with a kernel pointer and then
>> read/write to it? For example
>>
>>       let mut arr = [MaybeUninit::uninit(); 64];
>>       let ptr: *mut [MaybeUninit<u8>] = &mut arr;
>>       let ptr = ptr.cast::<c_void>();
>>
>>       let slice = UserSlice::new(ptr, 64);
>>       let (mut r, mut w) = slice.reader_writer();
>>
>>       r.read_raw(&mut arr)?;
>>       // SAFETY: `arr` was initialized above.
>>       w.write_slice(unsafe { MaybeUninit::slice_assume_init_ref(&arr) })?;
>>
>> I think this would violate the exclusivity of `&mut` without any
>> `unsafe` code. (the `unsafe` block at the end cannot possibly be wrong)
> 
> This will fail with an EFAULT error. There is a check on the C side
> that verifies that the address is in userspace. (The access_ok call.)

I see, that makes a lot of sense.

Regardless of whether you fix the nit about the guarantees section:

Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Boqun Feng April 15, 2024, 9:53 p.m. UTC | #4
On Mon, Apr 15, 2024 at 07:13:53AM +0000, Alice Ryhl wrote:
> From: Wedson Almeida Filho <wedsonaf@gmail.com>
> 
> A pointer to an area in userspace memory, which can be either read-only
> or read-write.
> 
> All methods on this struct are safe: attempting to read or write on bad
> addresses (either out of the bound of the slice or unmapped addresses)
> will return `EFAULT`. Concurrent access, *including data races to/from
> userspace memory*, is permitted, because fundamentally another userspace
> thread/process could always be modifying memory at the same time (in the
> same way that userspace Rust's `std::io` permits data races with the
> contents of files on disk). In the presence of a race, the exact byte
> values read/written are unspecified but the operation is well-defined.
> Kernelspace code should validate its copy of data after completing a
> read, and not expect that multiple reads of the same address will return
> the same value.
> 
> These APIs are designed to make it difficult to accidentally write
> TOCTOU bugs. Every time you read from a memory location, the pointer is
> advanced by the length so that you cannot use that reader to read the
> same memory location twice. Preventing double-fetches avoids TOCTOU
> bugs. This is accomplished by taking `self` by value to prevent
> obtaining multiple readers on a given `UserSlicePtr`, and the readers
> only permitting forward reads. If double-fetching a memory location is
> necessary for some reason, then that is done by creating multiple
> readers to the same memory location.
> 
> Constructing a `UserSlicePtr` performs no checks on the provided
> address and length, it can safely be constructed inside a kernel thread
> with no current userspace process. Reads and writes wrap the kernel APIs
> `copy_from_user` and `copy_to_user`, which check the memory map of the
> current process and enforce that the address range is within the user
> range (no additional calls to `access_ok` are needed).
> 
> This code is based on something that was originally written by Wedson on
> the old rust branch. It was modified by Alice by removing the
> `IoBufferReader` and `IoBufferWriter` traits, and various other changes.
> 
> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
> Co-developed-by: Alice Ryhl <aliceryhl@google.com>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>

Thanks!

Reviewed-by: Boqun Feng <boqun.feng@gmail.com>

Two small nits below..

> ---
>  rust/helpers.c         |  14 +++
>  rust/kernel/lib.rs     |   1 +
>  rust/kernel/uaccess.rs | 304 +++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 319 insertions(+)
> 
[...]
> +    /// Reads raw data from the user slice into a kernel buffer.
> +    ///
> +    /// Fails with `EFAULT` if the read happens on a bad address.

... we probably want to mention that `out` may get modified even in
failure cases.

> +    pub fn read_slice(&mut self, out: &mut [u8]) -> Result {
> +        // SAFETY: The types are compatible and `read_raw` doesn't write uninitialized bytes to
> +        // `out`.
> +        let out = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
> +        self.read_raw(out)
> +    }
> +
[...]
> +
> +impl UserSliceWriter {
[...]
> +
> +    /// Writes raw data to this user pointer from a kernel buffer.
> +    ///
> +    /// Fails with `EFAULT` if the write happens on a bad address.

Same here, probably mention that: the userspace memory may be modified
even in failure cases.

Anyway, they are not correctness critical, so we can do these in later
patches.

Regards,
Boqun

> +    pub fn write_slice(&mut self, data: &[u8]) -> Result {
> +        let len = data.len();
> +        let data_ptr = data.as_ptr().cast::<c_void>();
> +        if len > self.length {
> +            return Err(EFAULT);
> +        }
> +        let Ok(len_ulong) = c_ulong::try_from(len) else {
> +            return Err(EFAULT);
> +        };
> +        // SAFETY: `data_ptr` points into an immutable slice of length `len_ulong`, so we may read
> +        // that many bytes from it.
> +        let res = unsafe { bindings::copy_to_user(self.ptr, data_ptr, len_ulong) };
> +        if res != 0 {
> +            return Err(EFAULT);
> +        }
> +        // Userspace pointers are not directly dereferencable by the kernel, so
> +        // we cannot use `add`, which has C-style rules for defined behavior.
> +        self.ptr = self.ptr.wrapping_byte_add(len);
> +        self.length -= len;
> +        Ok(())
> +    }
> +}
> 
> -- 
> 2.44.0.683.g7961c838ac-goog
> 
>
Trevor Gross April 16, 2024, 5:05 a.m. UTC | #5
On Mon, Apr 15, 2024 at 3:14 AM Alice Ryhl <aliceryhl@google.com> wrote:
>
> From: Wedson Almeida Filho <wedsonaf@gmail.com>
>
> A pointer to an area in userspace memory, which can be either read-only
> or read-write.
>
> All methods on this struct are safe: attempting to read or write on bad
> addresses (either out of the bound of the slice or unmapped addresses)
> will return `EFAULT`. Concurrent access, *including data races to/from
> userspace memory*, is permitted, because fundamentally another userspace
> thread/process could always be modifying memory at the same time (in the
> same way that userspace Rust's `std::io` permits data races with the
> contents of files on disk). In the presence of a race, the exact byte
> values read/written are unspecified but the operation is well-defined.
> Kernelspace code should validate its copy of data after completing a
> read, and not expect that multiple reads of the same address will return
> the same value.
>
> These APIs are designed to make it difficult to accidentally write
> TOCTOU bugs. Every time you read from a memory location, the pointer is
> advanced by the length so that you cannot use that reader to read the
> same memory location twice. Preventing double-fetches avoids TOCTOU
> bugs. This is accomplished by taking `self` by value to prevent
> obtaining multiple readers on a given `UserSlicePtr`, and the readers
> only permitting forward reads. If double-fetching a memory location is
> necessary for some reason, then that is done by creating multiple
> readers to the same memory location.
>
> Constructing a `UserSlicePtr` performs no checks on the provided
> address and length, it can safely be constructed inside a kernel thread
> with no current userspace process. Reads and writes wrap the kernel APIs
> `copy_from_user` and `copy_to_user`, which check the memory map of the
> current process and enforce that the address range is within the user
> range (no additional calls to `access_ok` are needed).
>
> This code is based on something that was originally written by Wedson on
> the old rust branch. It was modified by Alice by removing the
> `IoBufferReader` and `IoBufferWriter` traits, and various other changes.
>
> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
> Co-developed-by: Alice Ryhl <aliceryhl@google.com>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>

Reviewed-by: Trevor Gross <tmgross@umich.edu>

I left some suggestions for documentation improvements and one
question, but mostly LGTM.

> ---
>  rust/helpers.c         |  14 +++
>  rust/kernel/lib.rs     |   1 +
>  rust/kernel/uaccess.rs | 304 +++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 319 insertions(+)
>
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index be68d5e567b1..37f84223b83f 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -49,6 +49,7 @@
>  pub mod task;
>  pub mod time;
>  pub mod types;
> +pub mod uaccess;
>  pub mod workqueue;
>
>  #[doc(hidden)]
> diff --git a/rust/kernel/uaccess.rs b/rust/kernel/uaccess.rs
> new file mode 100644
> index 000000000000..c97029cdeba1
> --- /dev/null
> +++ b/rust/kernel/uaccess.rs
> @@ -0,0 +1,304 @@
> [...]
> +impl UserSlice {
> +    /// Constructs a user slice from a raw pointer and a length in bytes.
> +    ///
> +    /// Constructing a [`UserSlice`] performs no checks on the provided address and length, it can
> +    /// safely be constructed inside a kernel thread with no current userspace process. Reads and
> +    /// writes wrap the kernel APIs `copy_from_user` and `copy_to_user`, which check the memory map
> +    /// of the current process and enforce that the address range is within the user range (no
> +    /// additional calls to `access_ok` are needed).

I would just add a note that pointer should be a valid userspace
pointer, but that gets checked at read/write time

> +    /// Callers must be careful to avoid time-of-check-time-of-use (TOCTOU) issues. The simplest way
> +    /// is to create a single instance of [`UserSlice`] per user memory block as it reads each byte
> +    /// at most once.
> +    pub fn new(ptr: *mut c_void, length: usize) -> Self {
> +        UserSlice { ptr, length }
> +    }

> +impl UserSliceReader {
> [...]
> +    /// Reads raw data from the user slice into a kernel buffer.
> +    ///
> +    /// After a successful call to this method, all bytes in `out` are initialized.

If this is guaranteed, could it return `Result<&mut [u8]>`? So the
caller doesn't need to unsafely `assume_init` anything.

> +    /// Fails with `EFAULT` if the read happens on a bad address.

This should also mention that the slice cannot be bigger than the
reader's length.

> +    pub fn read_raw(&mut self, out: &mut [MaybeUninit<u8>]) -> Result {
> +        let len = out.len();
> +        let out_ptr = out.as_mut_ptr().cast::<c_void>();
> +        if len > self.length {
> +            return Err(EFAULT);
> +        }
> +        let Ok(len_ulong) = c_ulong::try_from(len) else {
> +            return Err(EFAULT);
> +        };
> +        // SAFETY: `out_ptr` points into a mutable slice of length `len_ulong`, so we may write
> +        // that many bytes to it.
> +        let res = unsafe { bindings::copy_from_user(out_ptr, self.ptr, len_ulong) };
> +        if res != 0 {
> +            return Err(EFAULT);
> +        }
> +        // Userspace pointers are not directly dereferencable by the kernel, so we cannot use `add`,
> +        // which has C-style rules for defined behavior.
> +        self.ptr = self.ptr.wrapping_byte_add(len);
> +        self.length -= len;
> +        Ok(())
> +    }
> +
> +    /// Reads raw data from the user slice into a kernel buffer.
> +    ///
> +    /// Fails with `EFAULT` if the read happens on a bad address.
> +    pub fn read_slice(&mut self, out: &mut [u8]) -> Result {
> +        // SAFETY: The types are compatible and `read_raw` doesn't write uninitialized bytes to
> +        // `out`.
> +        let out = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
> +        self.read_raw(out)
> +    }

If this is just a safe version of read_raw, could you crosslink the docs?

> +impl UserSliceWriter {
> +
> +    /// Writes raw data to this user pointer from a kernel buffer.
> +    ///
> +    /// Fails with `EFAULT` if the write happens on a bad address.
> +    pub fn write_slice(&mut self, data: &[u8]) -> Result {
>   [...]
> +    }

Could use a note about length like `read_raw`.
Alice Ryhl April 16, 2024, 9:53 a.m. UTC | #6
Boqun Feng <boqun.feng@gmail.com> writes:
> On Mon, Apr 15, 2024 at 07:13:53AM +0000, Alice Ryhl wrote:
>> From: Wedson Almeida Filho <wedsonaf@gmail.com>
>> 
>> A pointer to an area in userspace memory, which can be either read-only
>> or read-write.
>> 
>> All methods on this struct are safe: attempting to read or write on bad
>> addresses (either out of the bound of the slice or unmapped addresses)
>> will return `EFAULT`. Concurrent access, *including data races to/from
>> userspace memory*, is permitted, because fundamentally another userspace
>> thread/process could always be modifying memory at the same time (in the
>> same way that userspace Rust's `std::io` permits data races with the
>> contents of files on disk). In the presence of a race, the exact byte
>> values read/written are unspecified but the operation is well-defined.
>> Kernelspace code should validate its copy of data after completing a
>> read, and not expect that multiple reads of the same address will return
>> the same value.
>> 
>> These APIs are designed to make it difficult to accidentally write
>> TOCTOU bugs. Every time you read from a memory location, the pointer is
>> advanced by the length so that you cannot use that reader to read the
>> same memory location twice. Preventing double-fetches avoids TOCTOU
>> bugs. This is accomplished by taking `self` by value to prevent
>> obtaining multiple readers on a given `UserSlicePtr`, and the readers
>> only permitting forward reads. If double-fetching a memory location is
>> necessary for some reason, then that is done by creating multiple
>> readers to the same memory location.
>> 
>> Constructing a `UserSlicePtr` performs no checks on the provided
>> address and length, it can safely be constructed inside a kernel thread
>> with no current userspace process. Reads and writes wrap the kernel APIs
>> `copy_from_user` and `copy_to_user`, which check the memory map of the
>> current process and enforce that the address range is within the user
>> range (no additional calls to `access_ok` are needed).
>> 
>> This code is based on something that was originally written by Wedson on
>> the old rust branch. It was modified by Alice by removing the
>> `IoBufferReader` and `IoBufferWriter` traits, and various other changes.
>> 
>> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
>> Co-developed-by: Alice Ryhl <aliceryhl@google.com>
>> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> 
> Thanks!
> 
> Reviewed-by: Boqun Feng <boqun.feng@gmail.com>

Thanks for taking a look!

>> ---
>>  rust/helpers.c         |  14 +++
>>  rust/kernel/lib.rs     |   1 +
>>  rust/kernel/uaccess.rs | 304 +++++++++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 319 insertions(+)
>> 
> [...]
>> +    /// Reads raw data from the user slice into a kernel buffer.
>> +    ///
>> +    /// Fails with `EFAULT` if the read happens on a bad address.
> 
> ... we probably want to mention that `out` may get modified even in
> failure cases.

Will do.

>> +    pub fn read_slice(&mut self, out: &mut [u8]) -> Result {
>> +        // SAFETY: The types are compatible and `read_raw` doesn't write uninitialized bytes to
>> +        // `out`.
>> +        let out = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
>> +        self.read_raw(out)
>> +    }
>> +
> [...]
>> +
>> +impl UserSliceWriter {
> [...]
>> +
>> +    /// Writes raw data to this user pointer from a kernel buffer.
>> +    ///
>> +    /// Fails with `EFAULT` if the write happens on a bad address.
> 
> Same here, probably mention that: the userspace memory may be modified
> even in failure cases.

Will do.

> Anyway, they are not correctness critical, so we can do these in later
> patches.

It looks like I'll have to send another version anyway due to the
conflict with [1], so I can take care of it.

Alice

[1]: https://lore.kernel.org/rust-for-linux/20240328013603.206764-1-wedsonaf@gmail.com/
Alice Ryhl April 16, 2024, 9:53 a.m. UTC | #7
Trevor Gross <tmgross@umich.edu> writes:
> On Mon, Apr 15, 2024 at 3:14 AM Alice Ryhl <aliceryhl@google.com> wrote:
>>
>> From: Wedson Almeida Filho <wedsonaf@gmail.com>
>>
>> A pointer to an area in userspace memory, which can be either read-only
>> or read-write.
>>
>> All methods on this struct are safe: attempting to read or write on bad
>> addresses (either out of the bound of the slice or unmapped addresses)
>> will return `EFAULT`. Concurrent access, *including data races to/from
>> userspace memory*, is permitted, because fundamentally another userspace
>> thread/process could always be modifying memory at the same time (in the
>> same way that userspace Rust's `std::io` permits data races with the
>> contents of files on disk). In the presence of a race, the exact byte
>> values read/written are unspecified but the operation is well-defined.
>> Kernelspace code should validate its copy of data after completing a
>> read, and not expect that multiple reads of the same address will return
>> the same value.
>>
>> These APIs are designed to make it difficult to accidentally write
>> TOCTOU bugs. Every time you read from a memory location, the pointer is
>> advanced by the length so that you cannot use that reader to read the
>> same memory location twice. Preventing double-fetches avoids TOCTOU
>> bugs. This is accomplished by taking `self` by value to prevent
>> obtaining multiple readers on a given `UserSlicePtr`, and the readers
>> only permitting forward reads. If double-fetching a memory location is
>> necessary for some reason, then that is done by creating multiple
>> readers to the same memory location.
>>
>> Constructing a `UserSlicePtr` performs no checks on the provided
>> address and length, it can safely be constructed inside a kernel thread
>> with no current userspace process. Reads and writes wrap the kernel APIs
>> `copy_from_user` and `copy_to_user`, which check the memory map of the
>> current process and enforce that the address range is within the user
>> range (no additional calls to `access_ok` are needed).
>>
>> This code is based on something that was originally written by Wedson on
>> the old rust branch. It was modified by Alice by removing the
>> `IoBufferReader` and `IoBufferWriter` traits, and various other changes.
>>
>> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
>> Co-developed-by: Alice Ryhl <aliceryhl@google.com>
>> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> 
> Reviewed-by: Trevor Gross <tmgross@umich.edu>
> 
> I left some suggestions for documentation improvements and one
> question, but mostly LGTM.

Thanks for taking a look!

>> +impl UserSlice {
>> +    /// Constructs a user slice from a raw pointer and a length in bytes.
>> +    ///
>> +    /// Constructing a [`UserSlice`] performs no checks on the provided address and length, it can
>> +    /// safely be constructed inside a kernel thread with no current userspace process. Reads and
>> +    /// writes wrap the kernel APIs `copy_from_user` and `copy_to_user`, which check the memory map
>> +    /// of the current process and enforce that the address range is within the user range (no
>> +    /// additional calls to `access_ok` are needed).
> 
> I would just add a note that pointer should be a valid userspace
> pointer, but that gets checked at read/write time

Will do.

>> +    /// Callers must be careful to avoid time-of-check-time-of-use (TOCTOU) issues. The simplest way
>> +    /// is to create a single instance of [`UserSlice`] per user memory block as it reads each byte
>> +    /// at most once.
>> +    pub fn new(ptr: *mut c_void, length: usize) -> Self {
>> +        UserSlice { ptr, length }
>> +    }
> 
>> +impl UserSliceReader {
>> [...]
>> +    /// Reads raw data from the user slice into a kernel buffer.
>> +    ///
>> +    /// After a successful call to this method, all bytes in `out` are initialized.
> 
> If this is guaranteed, could it return `Result<&mut [u8]>`? So the
> caller doesn't need to unsafely `assume_init` anything.

It could, but I don't think it's that useful. All existing callers will
want to record it somewhere with something like `Vec::set_len`, which
this doesn't help with. There are ways to do something like that, but it
complicates the API further which I am not interested in.

>> +    /// Fails with `EFAULT` if the read happens on a bad address.
> 
> This should also mention that the slice cannot be bigger than the
> reader's length.

I can add a note.
 
>> +    pub fn read_raw(&mut self, out: &mut [MaybeUninit<u8>]) -> Result {
>> +        let len = out.len();
>> +        let out_ptr = out.as_mut_ptr().cast::<c_void>();
>> +        if len > self.length {
>> +            return Err(EFAULT);
>> +        }
>> +        let Ok(len_ulong) = c_ulong::try_from(len) else {
>> +            return Err(EFAULT);
>> +        };
>> +        // SAFETY: `out_ptr` points into a mutable slice of length `len_ulong`, so we may write
>> +        // that many bytes to it.
>> +        let res = unsafe { bindings::copy_from_user(out_ptr, self.ptr, len_ulong) };
>> +        if res != 0 {
>> +            return Err(EFAULT);
>> +        }
>> +        // Userspace pointers are not directly dereferencable by the kernel, so we cannot use `add`,
>> +        // which has C-style rules for defined behavior.
>> +        self.ptr = self.ptr.wrapping_byte_add(len);
>> +        self.length -= len;
>> +        Ok(())
>> +    }
>> +
>> +    /// Reads raw data from the user slice into a kernel buffer.
>> +    ///
>> +    /// Fails with `EFAULT` if the read happens on a bad address.
>> +    pub fn read_slice(&mut self, out: &mut [u8]) -> Result {
>> +        // SAFETY: The types are compatible and `read_raw` doesn't write uninitialized bytes to
>> +        // `out`.
>> +        let out = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
>> +        self.read_raw(out)
>> +    }
> 
> If this is just a safe version of read_raw, could you crosslink the docs?

Okay.

>> +impl UserSliceWriter {
>> +
>> +    /// Writes raw data to this user pointer from a kernel buffer.
>> +    ///
>> +    /// Fails with `EFAULT` if the write happens on a bad address.
>> +    pub fn write_slice(&mut self, data: &[u8]) -> Result {
>>   [...]
>> +    }
> 
> Could use a note about length like `read_raw`.

Okay.

Alice
Gary Guo April 17, 2024, 2:28 p.m. UTC | #8
On Mon, 15 Apr 2024 07:13:53 +0000
Alice Ryhl <aliceryhl@google.com> wrote:

> From: Wedson Almeida Filho <wedsonaf@gmail.com>
> 
> A pointer to an area in userspace memory, which can be either read-only
> or read-write.
> 
> All methods on this struct are safe: attempting to read or write on bad
> addresses (either out of the bound of the slice or unmapped addresses)
> will return `EFAULT`. Concurrent access, *including data races to/from
> userspace memory*, is permitted, because fundamentally another userspace
> thread/process could always be modifying memory at the same time (in the
> same way that userspace Rust's `std::io` permits data races with the
> contents of files on disk). In the presence of a race, the exact byte
> values read/written are unspecified but the operation is well-defined.
> Kernelspace code should validate its copy of data after completing a
> read, and not expect that multiple reads of the same address will return
> the same value.
> 
> These APIs are designed to make it difficult to accidentally write
> TOCTOU bugs. Every time you read from a memory location, the pointer is
> advanced by the length so that you cannot use that reader to read the
> same memory location twice. Preventing double-fetches avoids TOCTOU
> bugs. This is accomplished by taking `self` by value to prevent
> obtaining multiple readers on a given `UserSlicePtr`, and the readers
> only permitting forward reads. If double-fetching a memory location is
> necessary for some reason, then that is done by creating multiple
> readers to the same memory location.
> 
> Constructing a `UserSlicePtr` performs no checks on the provided
> address and length, it can safely be constructed inside a kernel thread
> with no current userspace process. Reads and writes wrap the kernel APIs
> `copy_from_user` and `copy_to_user`, which check the memory map of the
> current process and enforce that the address range is within the user
> range (no additional calls to `access_ok` are needed).
> 
> This code is based on something that was originally written by Wedson on
> the old rust branch. It was modified by Alice by removing the
> `IoBufferReader` and `IoBufferWriter` traits, and various other changes.
> 
> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
> Co-developed-by: Alice Ryhl <aliceryhl@google.com>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  rust/helpers.c         |  14 +++
>  rust/kernel/lib.rs     |   1 +
>  rust/kernel/uaccess.rs | 304 +++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 319 insertions(+)
> 
> diff --git a/rust/kernel/uaccess.rs b/rust/kernel/uaccess.rs

> +/// [`std::io`]: https://doc.rust-lang.org/std/io/index.html
> +/// [`clone_reader`]: UserSliceReader::clone_reader
> +pub struct UserSlice {
> +    ptr: *mut c_void,
> +    length: usize,
> +}

How useful is the `c_void` in the struct and new signature? They tend
to not be very useful in Rust. Given that provenance doesn't matter
for userspace pointers, could this be `usize` simply?

I think `*mut u8` or `*mut ()` makes more sense than `*mut c_void` for
Rust code even if we don't want to use `usize`.

---

Some thinking aloud and brainstorming bits about the API.

I wonder if it make sense to have `User<[u8]>` instead of `UserSlice`?
The `User` type can be defined like this:

```rust
struct User<T: ?Sized> {
   ptr: *mut T,
}
```

and this allows arbitrary T as long as it's POD. So we could have
`User<[u8]>`, `User<u32>`, `User<PodStruct>`. I imagine the
`User<[u8]>` would be the general usage and the latter ones can be
especially helpful if you are trying to implement ioctl and need to
copy fixed size data structs from userspace.

Best,
Gary
Alice Ryhl April 17, 2024, 2:40 p.m. UTC | #9
On Wed, Apr 17, 2024 at 4:28 PM Gary Guo <gary@garyguo.net> wrote:
>
> On Mon, 15 Apr 2024 07:13:53 +0000
> Alice Ryhl <aliceryhl@google.com> wrote:
>
> > From: Wedson Almeida Filho <wedsonaf@gmail.com>
> >
> > A pointer to an area in userspace memory, which can be either read-only
> > or read-write.
> >
> > All methods on this struct are safe: attempting to read or write on bad
> > addresses (either out of the bound of the slice or unmapped addresses)
> > will return `EFAULT`. Concurrent access, *including data races to/from
> > userspace memory*, is permitted, because fundamentally another userspace
> > thread/process could always be modifying memory at the same time (in the
> > same way that userspace Rust's `std::io` permits data races with the
> > contents of files on disk). In the presence of a race, the exact byte
> > values read/written are unspecified but the operation is well-defined.
> > Kernelspace code should validate its copy of data after completing a
> > read, and not expect that multiple reads of the same address will return
> > the same value.
> >
> > These APIs are designed to make it difficult to accidentally write
> > TOCTOU bugs. Every time you read from a memory location, the pointer is
> > advanced by the length so that you cannot use that reader to read the
> > same memory location twice. Preventing double-fetches avoids TOCTOU
> > bugs. This is accomplished by taking `self` by value to prevent
> > obtaining multiple readers on a given `UserSlicePtr`, and the readers
> > only permitting forward reads. If double-fetching a memory location is
> > necessary for some reason, then that is done by creating multiple
> > readers to the same memory location.
> >
> > Constructing a `UserSlicePtr` performs no checks on the provided
> > address and length, it can safely be constructed inside a kernel thread
> > with no current userspace process. Reads and writes wrap the kernel APIs
> > `copy_from_user` and `copy_to_user`, which check the memory map of the
> > current process and enforce that the address range is within the user
> > range (no additional calls to `access_ok` are needed).
> >
> > This code is based on something that was originally written by Wedson on
> > the old rust branch. It was modified by Alice by removing the
> > `IoBufferReader` and `IoBufferWriter` traits, and various other changes.
> >
> > Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
> > Co-developed-by: Alice Ryhl <aliceryhl@google.com>
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> > ---
> >  rust/helpers.c         |  14 +++
> >  rust/kernel/lib.rs     |   1 +
> >  rust/kernel/uaccess.rs | 304 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 319 insertions(+)
> >
> > diff --git a/rust/kernel/uaccess.rs b/rust/kernel/uaccess.rs
>
> > +/// [`std::io`]: https://doc.rust-lang.org/std/io/index.html
> > +/// [`clone_reader`]: UserSliceReader::clone_reader
> > +pub struct UserSlice {
> > +    ptr: *mut c_void,
> > +    length: usize,
> > +}
>
> How useful is the `c_void` in the struct and new signature? They tend
> to not be very useful in Rust. Given that provenance doesn't matter
> for userspace pointers, could this be `usize` simply?
>
> I think `*mut u8` or `*mut ()` makes more sense than `*mut c_void` for
> Rust code even if we don't want to use `usize`.

I don't have a strong opinion here. I suppose a usize could make
sense. But I also think c_void is fine, and I lean towards not
changing it. :)

> Some thinking aloud and brainstorming bits about the API.
>
> I wonder if it make sense to have `User<[u8]>` instead of `UserSlice`?
> The `User` type can be defined like this:
>
> ```rust
> struct User<T: ?Sized> {
>    ptr: *mut T,
> }
> ```
>
> and this allows arbitrary T as long as it's POD. So we could have
> `User<[u8]>`, `User<u32>`, `User<PodStruct>`. I imagine the
> `User<[u8]>` would be the general usage and the latter ones can be
> especially helpful if you are trying to implement ioctl and need to
> copy fixed size data structs from userspace.

Hmm, we have to be careful here. Generally, when you get a user slice
via an ioctl, you should make sure to use the length you get from
userspace. In binder, it looks like this:

let user_slice = UserSlice::new(arg, _IOC_SIZE(cmd));

so whichever API we use, we must make sure to get the length as an
argument in bytes. What should we do if the length is not a multiple
of size_of(T)?

Another issue is that there's no stable way to get the length from a
`*mut [T]` without creating a reference, which is not okay for a user
slice.

Alice
Benno Lossin April 17, 2024, 3:27 p.m. UTC | #10
On 17.04.24 16:40, Alice Ryhl wrote:
> On Wed, Apr 17, 2024 at 4:28 PM Gary Guo <gary@garyguo.net> wrote:
>>
>> On Mon, 15 Apr 2024 07:13:53 +0000
>> Alice Ryhl <aliceryhl@google.com> wrote:
>>
>>> From: Wedson Almeida Filho <wedsonaf@gmail.com>
>>>
>>> A pointer to an area in userspace memory, which can be either read-only
>>> or read-write.
>>>
>>> All methods on this struct are safe: attempting to read or write on bad
>>> addresses (either out of the bound of the slice or unmapped addresses)
>>> will return `EFAULT`. Concurrent access, *including data races to/from
>>> userspace memory*, is permitted, because fundamentally another userspace
>>> thread/process could always be modifying memory at the same time (in the
>>> same way that userspace Rust's `std::io` permits data races with the
>>> contents of files on disk). In the presence of a race, the exact byte
>>> values read/written are unspecified but the operation is well-defined.
>>> Kernelspace code should validate its copy of data after completing a
>>> read, and not expect that multiple reads of the same address will return
>>> the same value.
>>>
>>> These APIs are designed to make it difficult to accidentally write
>>> TOCTOU bugs. Every time you read from a memory location, the pointer is
>>> advanced by the length so that you cannot use that reader to read the
>>> same memory location twice. Preventing double-fetches avoids TOCTOU
>>> bugs. This is accomplished by taking `self` by value to prevent
>>> obtaining multiple readers on a given `UserSlicePtr`, and the readers
>>> only permitting forward reads. If double-fetching a memory location is
>>> necessary for some reason, then that is done by creating multiple
>>> readers to the same memory location.
>>>
>>> Constructing a `UserSlicePtr` performs no checks on the provided
>>> address and length, it can safely be constructed inside a kernel thread
>>> with no current userspace process. Reads and writes wrap the kernel APIs
>>> `copy_from_user` and `copy_to_user`, which check the memory map of the
>>> current process and enforce that the address range is within the user
>>> range (no additional calls to `access_ok` are needed).
>>>
>>> This code is based on something that was originally written by Wedson on
>>> the old rust branch. It was modified by Alice by removing the
>>> `IoBufferReader` and `IoBufferWriter` traits, and various other changes.
>>>
>>> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
>>> Co-developed-by: Alice Ryhl <aliceryhl@google.com>
>>> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
>>> ---
>>>   rust/helpers.c         |  14 +++
>>>   rust/kernel/lib.rs     |   1 +
>>>   rust/kernel/uaccess.rs | 304 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>   3 files changed, 319 insertions(+)
>>>
>>> diff --git a/rust/kernel/uaccess.rs b/rust/kernel/uaccess.rs
>>
>>> +/// [`std::io`]: https://doc.rust-lang.org/std/io/index.html
>>> +/// [`clone_reader`]: UserSliceReader::clone_reader
>>> +pub struct UserSlice {
>>> +    ptr: *mut c_void,
>>> +    length: usize,
>>> +}
>>
>> How useful is the `c_void` in the struct and new signature? They tend
>> to not be very useful in Rust. Given that provenance doesn't matter
>> for userspace pointers, could this be `usize` simply?
>>
>> I think `*mut u8` or `*mut ()` makes more sense than `*mut c_void` for
>> Rust code even if we don't want to use `usize`.
> 
> I don't have a strong opinion here. I suppose a usize could make
> sense. But I also think c_void is fine, and I lean towards not
> changing it. :)
> 
>> Some thinking aloud and brainstorming bits about the API.
>>
>> I wonder if it make sense to have `User<[u8]>` instead of `UserSlice`?
>> The `User` type can be defined like this:
>>
>> ```rust
>> struct User<T: ?Sized> {
>>     ptr: *mut T,
>> }
>> ```
>>
>> and this allows arbitrary T as long as it's POD. So we could have
>> `User<[u8]>`, `User<u32>`, `User<PodStruct>`. I imagine the
>> `User<[u8]>` would be the general usage and the latter ones can be
>> especially helpful if you are trying to implement ioctl and need to
>> copy fixed size data structs from userspace.
> 
> Hmm, we have to be careful here. Generally, when you get a user slice
> via an ioctl, you should make sure to use the length you get from
> userspace. In binder, it looks like this:
> 
> let user_slice = UserSlice::new(arg, _IOC_SIZE(cmd));
> 
> so whichever API we use, we must make sure to get the length as an
> argument in bytes. What should we do if the length is not a multiple
> of size_of(T)?

We could print a warning and then just floor to the next multiple of
`size_of::<T>()`. I agree that is not perfect, but if one uses the
current API, one also needs to do the length check eventually.

> Another issue is that there's no stable way to get the length from a
> `*mut [T]` without creating a reference, which is not okay for a user
> slice.

Seems like `<* const [T]>::len` (feature `slice_ptr_len`) [1] was just
stabilized 5 days ago [1].

[1]: https://doc.rust-lang.org/std/primitive.pointer.html#method.len-1
[2]: https://github.com/rust-lang/rust/pull/123868
Alice Ryhl April 17, 2024, 3:35 p.m. UTC | #11
On Wed, Apr 17, 2024 at 5:27 PM Benno Lossin <benno.lossin@proton.me> wrote:
>
> On 17.04.24 16:40, Alice Ryhl wrote:
> > On Wed, Apr 17, 2024 at 4:28 PM Gary Guo <gary@garyguo.net> wrote:
> >>
> >> On Mon, 15 Apr 2024 07:13:53 +0000
> >> Alice Ryhl <aliceryhl@google.com> wrote:
> >>
> >>> From: Wedson Almeida Filho <wedsonaf@gmail.com>
> >>>
> >>> A pointer to an area in userspace memory, which can be either read-only
> >>> or read-write.
> >>>
> >>> All methods on this struct are safe: attempting to read or write on bad
> >>> addresses (either out of the bound of the slice or unmapped addresses)
> >>> will return `EFAULT`. Concurrent access, *including data races to/from
> >>> userspace memory*, is permitted, because fundamentally another userspace
> >>> thread/process could always be modifying memory at the same time (in the
> >>> same way that userspace Rust's `std::io` permits data races with the
> >>> contents of files on disk). In the presence of a race, the exact byte
> >>> values read/written are unspecified but the operation is well-defined.
> >>> Kernelspace code should validate its copy of data after completing a
> >>> read, and not expect that multiple reads of the same address will return
> >>> the same value.
> >>>
> >>> These APIs are designed to make it difficult to accidentally write
> >>> TOCTOU bugs. Every time you read from a memory location, the pointer is
> >>> advanced by the length so that you cannot use that reader to read the
> >>> same memory location twice. Preventing double-fetches avoids TOCTOU
> >>> bugs. This is accomplished by taking `self` by value to prevent
> >>> obtaining multiple readers on a given `UserSlicePtr`, and the readers
> >>> only permitting forward reads. If double-fetching a memory location is
> >>> necessary for some reason, then that is done by creating multiple
> >>> readers to the same memory location.
> >>>
> >>> Constructing a `UserSlicePtr` performs no checks on the provided
> >>> address and length, it can safely be constructed inside a kernel thread
> >>> with no current userspace process. Reads and writes wrap the kernel APIs
> >>> `copy_from_user` and `copy_to_user`, which check the memory map of the
> >>> current process and enforce that the address range is within the user
> >>> range (no additional calls to `access_ok` are needed).
> >>>
> >>> This code is based on something that was originally written by Wedson on
> >>> the old rust branch. It was modified by Alice by removing the
> >>> `IoBufferReader` and `IoBufferWriter` traits, and various other changes.
> >>>
> >>> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
> >>> Co-developed-by: Alice Ryhl <aliceryhl@google.com>
> >>> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> >>> ---
> >>>   rust/helpers.c         |  14 +++
> >>>   rust/kernel/lib.rs     |   1 +
> >>>   rust/kernel/uaccess.rs | 304 +++++++++++++++++++++++++++++++++++++++++++++++++
> >>>   3 files changed, 319 insertions(+)
> >>>
> >>> diff --git a/rust/kernel/uaccess.rs b/rust/kernel/uaccess.rs
> >>
> >>> +/// [`std::io`]: https://doc.rust-lang.org/std/io/index.html
> >>> +/// [`clone_reader`]: UserSliceReader::clone_reader
> >>> +pub struct UserSlice {
> >>> +    ptr: *mut c_void,
> >>> +    length: usize,
> >>> +}
> >>
> >> How useful is the `c_void` in the struct and new signature? They tend
> >> to not be very useful in Rust. Given that provenance doesn't matter
> >> for userspace pointers, could this be `usize` simply?
> >>
> >> I think `*mut u8` or `*mut ()` makes more sense than `*mut c_void` for
> >> Rust code even if we don't want to use `usize`.
> >
> > I don't have a strong opinion here. I suppose a usize could make
> > sense. But I also think c_void is fine, and I lean towards not
> > changing it. :)
> >
> >> Some thinking aloud and brainstorming bits about the API.
> >>
> >> I wonder if it make sense to have `User<[u8]>` instead of `UserSlice`?
> >> The `User` type can be defined like this:
> >>
> >> ```rust
> >> struct User<T: ?Sized> {
> >>     ptr: *mut T,
> >> }
> >> ```
> >>
> >> and this allows arbitrary T as long as it's POD. So we could have
> >> `User<[u8]>`, `User<u32>`, `User<PodStruct>`. I imagine the
> >> `User<[u8]>` would be the general usage and the latter ones can be
> >> especially helpful if you are trying to implement ioctl and need to
> >> copy fixed size data structs from userspace.
> >
> > Hmm, we have to be careful here. Generally, when you get a user slice
> > via an ioctl, you should make sure to use the length you get from
> > userspace. In binder, it looks like this:
> >
> > let user_slice = UserSlice::new(arg, _IOC_SIZE(cmd));
> >
> > so whichever API we use, we must make sure to get the length as an
> > argument in bytes. What should we do if the length is not a multiple
> > of size_of(T)?
>
> We could print a warning and then just floor to the next multiple of
> `size_of::<T>()`. I agree that is not perfect, but if one uses the
> current API, one also needs to do the length check eventually.

Right now, the length check happens when you call `read::<T>` and get
EFAULT if the size of T is greater than the length of the user slice.
That works pretty well. And there are real-world use-cases for
userspace passing in a length longer than what the kernel expects -
often adding fields to the end of the struct is how the kernel makes
ioctls extensible. So I don't think printing a warning in that case
would be good.

In Binder, I also have use-cases where I alternate between reading
bytes and various different structs. Basically, I read two user slices
in lockstep, where the next value in one userslice determines whether
I should read some amount of bytes or a specific struct from the other
user slice. That's much easier with the current API than this
proposal.

> > Another issue is that there's no stable way to get the length from a
> > `*mut [T]` without creating a reference, which is not okay for a user
> > slice.
>
> Seems like `<* const [T]>::len` (feature `slice_ptr_len`) [1] was just
> stabilized 5 days ago [1].
>
> [1]: https://doc.rust-lang.org/std/primitive.pointer.html#method.len-1
> [2]: https://github.com/rust-lang/rust/pull/123868

Okay.

Alice
David Laight April 21, 2024, 6:08 p.m. UTC | #12
Should you be implementing 'struct iov_iter' ?

Even if it means creating an IO_UBUF for ioctls?
(Although that might take some 'fettling' for read+write for ioctls.)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Alice Ryhl April 21, 2024, 6:37 p.m. UTC | #13
On Sun, Apr 21, 2024 at 8:08 PM David Laight <David.Laight@aculab.com> wrote:
>
> Should you be implementing 'struct iov_iter' ?
>
> Even if it means creating an IO_UBUF for ioctls?
> (Although that might take some 'fettling' for read+write for ioctls.)

That seems to be intended for when you have several chunks of memory
in userspace that you want to treat as one contiguous chunk. That's
not something I need in the Android Binder driver.

Alice
David Laight April 21, 2024, 7:48 p.m. UTC | #14
From: Alice Ryhl
> Sent: 21 April 2024 19:38
> 
> On Sun, Apr 21, 2024 at 8:08 PM David Laight <David.Laight@aculab.com> wrote:
> >
> > Should you be implementing 'struct iov_iter' ?
> >
> > Even if it means creating an IO_UBUF for ioctls?
> > (Although that might take some 'fettling' for read+write for ioctls.)
> 
> That seems to be intended for when you have several chunks of memory
> in userspace that you want to treat as one contiguous chunk. That's
> not something I need in the Android Binder driver.

It also transparently supports in-kernel users and some other cases.

I think there is a patch intended for 6.10 that removes the 'read'
and 'write' driver 'ops' and requires that drivers support 'read_iter'
and 'write_iter'.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Alice Ryhl April 22, 2024, 6:31 a.m. UTC | #15
On Sun, Apr 21, 2024 at 9:49 PM David Laight <David.Laight@aculab.com> wrote:
>
> From: Alice Ryhl
> > Sent: 21 April 2024 19:38
> >
> > On Sun, Apr 21, 2024 at 8:08 PM David Laight <David.Laight@aculab.com> wrote:
> > >
> > > Should you be implementing 'struct iov_iter' ?
> > >
> > > Even if it means creating an IO_UBUF for ioctls?
> > > (Although that might take some 'fettling' for read+write for ioctls.)
> >
> > That seems to be intended for when you have several chunks of memory
> > in userspace that you want to treat as one contiguous chunk. That's
> > not something I need in the Android Binder driver.
>
> It also transparently supports in-kernel users and some other cases.
>
> I think there is a patch intended for 6.10 that removes the 'read'
> and 'write' driver 'ops' and requires that drivers support 'read_iter'
> and 'write_iter'.

Binder uses an ioctl, not read/write, so even if Binder could use it,
there's no real advantage for Binder to do so. But it does sound like
something we want to support eventually in the kernel. However, I've
spent a long time on polishing this API, and it fits my needs well. I
don't want to start over now that I am almost finished.

So, I think that this could be a good follow-up to this patch that
someone else is welcome to submit.

Alice
diff mbox series

Patch

diff --git a/rust/helpers.c b/rust/helpers.c
index 70e59efd92bc..312b6fcb49d5 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -38,6 +38,20 @@  __noreturn void rust_helper_BUG(void)
 }
 EXPORT_SYMBOL_GPL(rust_helper_BUG);
 
+unsigned long rust_helper_copy_from_user(void *to, const void __user *from,
+					 unsigned long n)
+{
+	return copy_from_user(to, from, n);
+}
+EXPORT_SYMBOL_GPL(rust_helper_copy_from_user);
+
+unsigned long rust_helper_copy_to_user(void __user *to, const void *from,
+				       unsigned long n)
+{
+	return copy_to_user(to, from, n);
+}
+EXPORT_SYMBOL_GPL(rust_helper_copy_to_user);
+
 void rust_helper_mutex_lock(struct mutex *lock)
 {
 	mutex_lock(lock);
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index be68d5e567b1..37f84223b83f 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -49,6 +49,7 @@ 
 pub mod task;
 pub mod time;
 pub mod types;
+pub mod uaccess;
 pub mod workqueue;
 
 #[doc(hidden)]
diff --git a/rust/kernel/uaccess.rs b/rust/kernel/uaccess.rs
new file mode 100644
index 000000000000..c97029cdeba1
--- /dev/null
+++ b/rust/kernel/uaccess.rs
@@ -0,0 +1,304 @@ 
+// SPDX-License-Identifier: GPL-2.0
+
+//! Slices to user space memory regions.
+//!
+//! C header: [`include/linux/uaccess.h`](srctree/include/linux/uaccess.h)
+
+use crate::{bindings, error::code::*, error::Result};
+use alloc::vec::Vec;
+use core::ffi::{c_ulong, c_void};
+use core::mem::MaybeUninit;
+
+/// A pointer to an area in userspace memory, which can be either read-only or read-write.
+///
+/// All methods on this struct are safe: attempting to read or write on bad addresses (either out of
+/// the bound of the slice or unmapped addresses) will return `EFAULT`. Concurrent access,
+/// *including data races to/from userspace memory*, is permitted, because fundamentally another
+/// userspace thread/process could always be modifying memory at the same time (in the same way that
+/// userspace Rust's [`std::io`] permits data races with the contents of files on disk). In the
+/// presence of a race, the exact byte values read/written are unspecified but the operation is
+/// well-defined. Kernelspace code should validate its copy of data after completing a read, and not
+/// expect that multiple reads of the same address will return the same value.
+///
+/// These APIs are designed to make it difficult to accidentally write TOCTOU (time-of-check to
+/// time-of-use) bugs. Every time a memory location is read, the reader's position is advanced by
+/// the read length and the next read will start from there. This helps prevent accidentally reading
+/// the same location twice and causing a TOCTOU bug.
+///
+/// Creating a [`UserSliceReader`] and/or [`UserSliceWriter`] consumes the `UserSlice`, helping
+/// ensure that there aren't multiple readers or writers to the same location.
+///
+/// If double-fetching a memory location is necessary for some reason, then that is done by creating
+/// multiple readers to the same memory location, e.g. using [`clone_reader`].
+///
+/// # Examples
+///
+/// Takes a region of userspace memory from the current process, and modify it by adding one to
+/// every byte in the region.
+///
+/// ```no_run
+/// use alloc::vec::Vec;
+/// use core::ffi::c_void;
+/// use kernel::error::Result;
+/// use kernel::uaccess::UserSlice;
+///
+/// fn bytes_add_one(uptr: *mut c_void, len: usize) -> Result<()> {
+///     let (read, mut write) = UserSlice::new(uptr, len).reader_writer();
+///
+///     let mut buf = Vec::new();
+///     read.read_all(&mut buf)?;
+///
+///     for b in &mut buf {
+///         *b = b.wrapping_add(1);
+///     }
+///
+///     write.write_slice(&buf)?;
+///     Ok(())
+/// }
+/// ```
+///
+/// Example illustrating a TOCTOU (time-of-check to time-of-use) bug.
+///
+/// ```no_run
+/// use alloc::vec::Vec;
+/// use core::ffi::c_void;
+/// use kernel::error::{code::EINVAL, Result};
+/// use kernel::uaccess::UserSlice;
+///
+/// /// Returns whether the data in this region is valid.
+/// fn is_valid(uptr: *mut c_void, len: usize) -> Result<bool> {
+///     let read = UserSlice::new(uptr, len).reader();
+///
+///     let mut buf = Vec::new();
+///     read.read_all(&mut buf)?;
+///
+///     todo!()
+/// }
+///
+/// /// Returns the bytes behind this user pointer if they are valid.
+/// fn get_bytes_if_valid(uptr: *mut c_void, len: usize) -> Result<Vec<u8>> {
+///     if !is_valid(uptr, len)? {
+///         return Err(EINVAL);
+///     }
+///
+///     let read = UserSlice::new(uptr, len).reader();
+///
+///     let mut buf = Vec::new();
+///     read.read_all(&mut buf)?;
+///
+///     // THIS IS A BUG! The bytes could have changed since we checked them.
+///     //
+///     // To avoid this kind of bug, don't call `UserSlice::new` multiple
+///     // times with the same address.
+///     Ok(buf)
+/// }
+/// ```
+///
+/// [`std::io`]: https://doc.rust-lang.org/std/io/index.html
+/// [`clone_reader`]: UserSliceReader::clone_reader
+pub struct UserSlice {
+    ptr: *mut c_void,
+    length: usize,
+}
+
+impl UserSlice {
+    /// Constructs a user slice from a raw pointer and a length in bytes.
+    ///
+    /// Constructing a [`UserSlice`] performs no checks on the provided address and length, it can
+    /// safely be constructed inside a kernel thread with no current userspace process. Reads and
+    /// writes wrap the kernel APIs `copy_from_user` and `copy_to_user`, which check the memory map
+    /// of the current process and enforce that the address range is within the user range (no
+    /// additional calls to `access_ok` are needed).
+    ///
+    /// Callers must be careful to avoid time-of-check-time-of-use (TOCTOU) issues. The simplest way
+    /// is to create a single instance of [`UserSlice`] per user memory block as it reads each byte
+    /// at most once.
+    pub fn new(ptr: *mut c_void, length: usize) -> Self {
+        UserSlice { ptr, length }
+    }
+
+    /// Reads the entirety of the user slice, appending it to the end of the provided buffer.
+    ///
+    /// Fails with `EFAULT` if the read happens on a bad address.
+    pub fn read_all(self, buf: &mut Vec<u8>) -> Result {
+        self.reader().read_all(buf)
+    }
+
+    /// Constructs a [`UserSliceReader`].
+    pub fn reader(self) -> UserSliceReader {
+        UserSliceReader {
+            ptr: self.ptr,
+            length: self.length,
+        }
+    }
+
+    /// Constructs a [`UserSliceWriter`].
+    pub fn writer(self) -> UserSliceWriter {
+        UserSliceWriter {
+            ptr: self.ptr,
+            length: self.length,
+        }
+    }
+
+    /// Constructs both a [`UserSliceReader`] and a [`UserSliceWriter`].
+    ///
+    /// Usually when this is used, you will first read the data, and then overwrite it afterwards.
+    pub fn reader_writer(self) -> (UserSliceReader, UserSliceWriter) {
+        (
+            UserSliceReader {
+                ptr: self.ptr,
+                length: self.length,
+            },
+            UserSliceWriter {
+                ptr: self.ptr,
+                length: self.length,
+            },
+        )
+    }
+}
+
+/// A reader for [`UserSlice`].
+///
+/// Used to incrementally read from the user slice.
+pub struct UserSliceReader {
+    ptr: *mut c_void,
+    length: usize,
+}
+
+impl UserSliceReader {
+    /// Skip the provided number of bytes.
+    ///
+    /// Returns an error if skipping more than the length of the buffer.
+    pub fn skip(&mut self, num_skip: usize) -> Result {
+        // Update `self.length` first since that's the fallible part of this operation.
+        self.length = self.length.checked_sub(num_skip).ok_or(EFAULT)?;
+        self.ptr = self.ptr.wrapping_byte_add(num_skip);
+        Ok(())
+    }
+
+    /// Create a reader that can access the same range of data.
+    ///
+    /// Reading from the clone does not advance the current reader.
+    ///
+    /// The caller should take care to not introduce TOCTOU issues, as described in the
+    /// documentation for [`UserSlice`].
+    pub fn clone_reader(&self) -> UserSliceReader {
+        UserSliceReader {
+            ptr: self.ptr,
+            length: self.length,
+        }
+    }
+
+    /// Returns the number of bytes left to be read from this reader.
+    ///
+    /// Note that even reading less than this number of bytes may fail.
+    pub fn len(&self) -> usize {
+        self.length
+    }
+
+    /// Returns `true` if no data is available in the io buffer.
+    pub fn is_empty(&self) -> bool {
+        self.length == 0
+    }
+
+    /// Reads raw data from the user slice into a kernel buffer.
+    ///
+    /// After a successful call to this method, all bytes in `out` are initialized.
+    ///
+    /// Fails with `EFAULT` if the read happens on a bad address.
+    pub fn read_raw(&mut self, out: &mut [MaybeUninit<u8>]) -> Result {
+        let len = out.len();
+        let out_ptr = out.as_mut_ptr().cast::<c_void>();
+        if len > self.length {
+            return Err(EFAULT);
+        }
+        let Ok(len_ulong) = c_ulong::try_from(len) else {
+            return Err(EFAULT);
+        };
+        // SAFETY: `out_ptr` points into a mutable slice of length `len_ulong`, so we may write
+        // that many bytes to it.
+        let res = unsafe { bindings::copy_from_user(out_ptr, self.ptr, len_ulong) };
+        if res != 0 {
+            return Err(EFAULT);
+        }
+        // Userspace pointers are not directly dereferencable by the kernel, so we cannot use `add`,
+        // which has C-style rules for defined behavior.
+        self.ptr = self.ptr.wrapping_byte_add(len);
+        self.length -= len;
+        Ok(())
+    }
+
+    /// Reads raw data from the user slice into a kernel buffer.
+    ///
+    /// Fails with `EFAULT` if the read happens on a bad address.
+    pub fn read_slice(&mut self, out: &mut [u8]) -> Result {
+        // SAFETY: The types are compatible and `read_raw` doesn't write uninitialized bytes to
+        // `out`.
+        let out = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
+        self.read_raw(out)
+    }
+
+    /// Reads the entirety of the user slice, appending it to the end of the provided buffer.
+    ///
+    /// Fails with `EFAULT` if the read happens on a bad address.
+    pub fn read_all(mut self, buf: &mut Vec<u8>) -> Result {
+        let len = self.length;
+        buf.try_reserve(len)?;
+
+        // The call to `try_reserve` was successful, so the spare capacity is at least `len` bytes
+        // long.
+        self.read_raw(&mut buf.spare_capacity_mut()[..len])?;
+
+        // SAFETY: Since the call to `read_raw` was successful, so the next `len` bytes of the
+        // vector have been initialized.
+        unsafe { buf.set_len(buf.len() + len) };
+        Ok(())
+    }
+}
+
+/// A writer for [`UserSlice`].
+///
+/// Used to incrementally write into the user slice.
+pub struct UserSliceWriter {
+    ptr: *mut c_void,
+    length: usize,
+}
+
+impl UserSliceWriter {
+    /// Returns the amount of space remaining in this buffer.
+    ///
+    /// Note that even writing less than this number of bytes may fail.
+    pub fn len(&self) -> usize {
+        self.length
+    }
+
+    /// Returns `true` if no more data can be written to this buffer.
+    pub fn is_empty(&self) -> bool {
+        self.length == 0
+    }
+
+    /// Writes raw data to this user pointer from a kernel buffer.
+    ///
+    /// Fails with `EFAULT` if the write happens on a bad address.
+    pub fn write_slice(&mut self, data: &[u8]) -> Result {
+        let len = data.len();
+        let data_ptr = data.as_ptr().cast::<c_void>();
+        if len > self.length {
+            return Err(EFAULT);
+        }
+        let Ok(len_ulong) = c_ulong::try_from(len) else {
+            return Err(EFAULT);
+        };
+        // SAFETY: `data_ptr` points into an immutable slice of length `len_ulong`, so we may read
+        // that many bytes from it.
+        let res = unsafe { bindings::copy_to_user(self.ptr, data_ptr, len_ulong) };
+        if res != 0 {
+            return Err(EFAULT);
+        }
+        // Userspace pointers are not directly dereferencable by the kernel, so
+        // we cannot use `add`, which has C-style rules for defined behavior.
+        self.ptr = self.ptr.wrapping_byte_add(len);
+        self.length -= len;
+        Ok(())
+    }
+}