diff mbox series

[v3,17/25] rust: alloc: implement `collect` for `IntoIter`

Message ID 20240801000641.1882-18-dakr@kernel.org (mailing list archive)
State New
Headers show
Series Generic `Allocator` support for Rust | expand

Commit Message

Danilo Krummrich Aug. 1, 2024, 12:02 a.m. UTC
Currently, we can't implement `FromIterator`. There are a couple of
issues with this trait in the kernel, namely:

  - Rust's specialization feature is unstable. This prevents us to
    optimze for the special case where `I::IntoIter` equals `Vec`'s
    `IntoIter` type.
  - We also can't use `I::IntoIter`'s type ID either to work around this,
    since `FromIterator` doesn't require this type to be `'static`.
  - `FromIterator::from_iter` does return `Self` instead of
    `Result<Self, AllocError>`, hence we can't properly handle allocation
    failures.
  - Neither `Iterator::collect` nor `FromIterator::from_iter` can handle
    additional allocation flags.

Instead, provide `IntoIter::collect`, such that we can at least convert
`IntoIter` into a `Vec` again.

Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 rust/kernel/alloc/kvec.rs | 78 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

Comments

Alice Ryhl Aug. 1, 2024, 3:10 p.m. UTC | #1
On Thu, Aug 1, 2024 at 2:08 AM Danilo Krummrich <dakr@kernel.org> wrote:
>
> Currently, we can't implement `FromIterator`. There are a couple of
> issues with this trait in the kernel, namely:
>
>   - Rust's specialization feature is unstable. This prevents us to
>     optimze for the special case where `I::IntoIter` equals `Vec`'s
>     `IntoIter` type.
>   - We also can't use `I::IntoIter`'s type ID either to work around this,
>     since `FromIterator` doesn't require this type to be `'static`.
>   - `FromIterator::from_iter` does return `Self` instead of
>     `Result<Self, AllocError>`, hence we can't properly handle allocation
>     failures.
>   - Neither `Iterator::collect` nor `FromIterator::from_iter` can handle
>     additional allocation flags.
>
> Instead, provide `IntoIter::collect`, such that we can at least convert
> `IntoIter` into a `Vec` again.
>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>

I'm not convinced a collect implementation specific to IntoIter is necessary?

> +
> +        // SAFETY: `buf` points to the start of the backing buffer and `len` is guaranteed to be
> +        // smaller than `cap`. Depending on `alloc` this operation may shrink the buffer or leaves
> +        // it as it is.
> +        ptr = match unsafe { A::realloc(Some(buf.cast()), layout, flags) } {

Why would you shrink it? You can just keep the capacity.

Alice
Danilo Krummrich Aug. 1, 2024, 3:37 p.m. UTC | #2
On Thu, Aug 01, 2024 at 05:10:22PM +0200, Alice Ryhl wrote:
> On Thu, Aug 1, 2024 at 2:08 AM Danilo Krummrich <dakr@kernel.org> wrote:
> >
> > Currently, we can't implement `FromIterator`. There are a couple of
> > issues with this trait in the kernel, namely:
> >
> >   - Rust's specialization feature is unstable. This prevents us to
> >     optimze for the special case where `I::IntoIter` equals `Vec`'s
> >     `IntoIter` type.
> >   - We also can't use `I::IntoIter`'s type ID either to work around this,
> >     since `FromIterator` doesn't require this type to be `'static`.
> >   - `FromIterator::from_iter` does return `Self` instead of
> >     `Result<Self, AllocError>`, hence we can't properly handle allocation
> >     failures.
> >   - Neither `Iterator::collect` nor `FromIterator::from_iter` can handle
> >     additional allocation flags.
> >
> > Instead, provide `IntoIter::collect`, such that we can at least convert
> > `IntoIter` into a `Vec` again.
> >
> > Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> 
> I'm not convinced a collect implementation specific to IntoIter is necessary?

For the reasons above, we can't implement `FromIterator`. At some point we may
want to implement our own kernel `FromIterator` trait, but that's out of scope
for this series.

For now, I just want to provide a way to get a `Vec` from `IntoIter` again,
which without `Vec::collect` would be impossible.

> 
> > +
> > +        // SAFETY: `buf` points to the start of the backing buffer and `len` is guaranteed to be
> > +        // smaller than `cap`. Depending on `alloc` this operation may shrink the buffer or leaves
> > +        // it as it is.
> > +        ptr = match unsafe { A::realloc(Some(buf.cast()), layout, flags) } {
> 
> Why would you shrink it? You can just keep the capacity.

What if the vector was pretty huge and meanwhile as advanced by a lot? I think
we don't want to waste those resources.

Ideally the corresponding `Allocator` implements a proper heuristic for when to
actually shrink. For instance, krealloc() never shrinks, and it's probably not
worth it. For vrealloc() though we clearly want to shrink properly (i.e. unmap
and free spare pages) at some point.

> 
> Alice
>
Alice Ryhl Aug. 2, 2024, 7:08 a.m. UTC | #3
On Thu, Aug 1, 2024 at 5:37 PM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Thu, Aug 01, 2024 at 05:10:22PM +0200, Alice Ryhl wrote:
> > On Thu, Aug 1, 2024 at 2:08 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > >
> > > Currently, we can't implement `FromIterator`. There are a couple of
> > > issues with this trait in the kernel, namely:
> > >
> > >   - Rust's specialization feature is unstable. This prevents us to
> > >     optimze for the special case where `I::IntoIter` equals `Vec`'s
> > >     `IntoIter` type.
> > >   - We also can't use `I::IntoIter`'s type ID either to work around this,
> > >     since `FromIterator` doesn't require this type to be `'static`.
> > >   - `FromIterator::from_iter` does return `Self` instead of
> > >     `Result<Self, AllocError>`, hence we can't properly handle allocation
> > >     failures.
> > >   - Neither `Iterator::collect` nor `FromIterator::from_iter` can handle
> > >     additional allocation flags.
> > >
> > > Instead, provide `IntoIter::collect`, such that we can at least convert
> > > `IntoIter` into a `Vec` again.
> > >
> > > Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> >
> > I'm not convinced a collect implementation specific to IntoIter is necessary?
>
> For the reasons above, we can't implement `FromIterator`. At some point we may
> want to implement our own kernel `FromIterator` trait, but that's out of scope
> for this series.
>
> For now, I just want to provide a way to get a `Vec` from `IntoIter` again,
> which without `Vec::collect` would be impossible.

If you have a need for a collect on this particular kind of iterator, then okay.

> > > +
> > > +        // SAFETY: `buf` points to the start of the backing buffer and `len` is guaranteed to be
> > > +        // smaller than `cap`. Depending on `alloc` this operation may shrink the buffer or leaves
> > > +        // it as it is.
> > > +        ptr = match unsafe { A::realloc(Some(buf.cast()), layout, flags) } {
> >
> > Why would you shrink it? You can just keep the capacity.
>
> What if the vector was pretty huge and meanwhile as advanced by a lot? I think
> we don't want to waste those resources.
>
> Ideally the corresponding `Allocator` implements a proper heuristic for when to
> actually shrink. For instance, krealloc() never shrinks, and it's probably not
> worth it. For vrealloc() though we clearly want to shrink properly (i.e. unmap
> and free spare pages) at some point.

The Rust Vec never shrinks unless explicitly requested. But I guess
it's okay either way.

Alice
Danilo Krummrich Aug. 2, 2024, 12:02 p.m. UTC | #4
On Fri, Aug 02, 2024 at 09:08:48AM +0200, Alice Ryhl wrote:
> On Thu, Aug 1, 2024 at 5:37 PM Danilo Krummrich <dakr@kernel.org> wrote:
> >
> > On Thu, Aug 01, 2024 at 05:10:22PM +0200, Alice Ryhl wrote:
> > > On Thu, Aug 1, 2024 at 2:08 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > >
> > > > Currently, we can't implement `FromIterator`. There are a couple of
> > > > issues with this trait in the kernel, namely:
> > > >
> > > >   - Rust's specialization feature is unstable. This prevents us to
> > > >     optimze for the special case where `I::IntoIter` equals `Vec`'s
> > > >     `IntoIter` type.
> > > >   - We also can't use `I::IntoIter`'s type ID either to work around this,
> > > >     since `FromIterator` doesn't require this type to be `'static`.
> > > >   - `FromIterator::from_iter` does return `Self` instead of
> > > >     `Result<Self, AllocError>`, hence we can't properly handle allocation
> > > >     failures.
> > > >   - Neither `Iterator::collect` nor `FromIterator::from_iter` can handle
> > > >     additional allocation flags.
> > > >
> > > > Instead, provide `IntoIter::collect`, such that we can at least convert
> > > > `IntoIter` into a `Vec` again.
> > > >
> > > > Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> > >
> > > I'm not convinced a collect implementation specific to IntoIter is necessary?
> >
> > For the reasons above, we can't implement `FromIterator`. At some point we may
> > want to implement our own kernel `FromIterator` trait, but that's out of scope
> > for this series.
> >
> > For now, I just want to provide a way to get a `Vec` from `IntoIter` again,
> > which without `Vec::collect` would be impossible.
> 
> If you have a need for a collect on this particular kind of iterator, then okay.

Even once we have our own `FromIterator` trait, we probably want to keep this
specific one besides the generic `collect` for optimization. With the generic
one we'd otherwise copy into a new `Vec`.

> 
> > > > +
> > > > +        // SAFETY: `buf` points to the start of the backing buffer and `len` is guaranteed to be
> > > > +        // smaller than `cap`. Depending on `alloc` this operation may shrink the buffer or leaves
> > > > +        // it as it is.
> > > > +        ptr = match unsafe { A::realloc(Some(buf.cast()), layout, flags) } {
> > >
> > > Why would you shrink it? You can just keep the capacity.
> >
> > What if the vector was pretty huge and meanwhile as advanced by a lot? I think
> > we don't want to waste those resources.
> >
> > Ideally the corresponding `Allocator` implements a proper heuristic for when to
> > actually shrink. For instance, krealloc() never shrinks, and it's probably not
> > worth it. For vrealloc() though we clearly want to shrink properly (i.e. unmap
> > and free spare pages) at some point.
> 
> The Rust Vec never shrinks unless explicitly requested. But I guess
> it's okay either way.

It actually does, see [1]. But Rust's `Vec` does it less efficiently. It either
keeps the `Vec` as it is, or creates a whole new one.

[1] https://github.com/rust-lang/rust/blob/master/library/alloc/src/vec/spec_from_iter.rs#L39

> 
> Alice
>
Alice Ryhl Aug. 2, 2024, 12:08 p.m. UTC | #5
On Fri, Aug 2, 2024 at 2:02 PM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Fri, Aug 02, 2024 at 09:08:48AM +0200, Alice Ryhl wrote:
> > On Thu, Aug 1, 2024 at 5:37 PM Danilo Krummrich <dakr@kernel.org> wrote:
> > >
> > > On Thu, Aug 01, 2024 at 05:10:22PM +0200, Alice Ryhl wrote:
> > > > On Thu, Aug 1, 2024 at 2:08 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > >
> > > > > Currently, we can't implement `FromIterator`. There are a couple of
> > > > > issues with this trait in the kernel, namely:
> > > > >
> > > > >   - Rust's specialization feature is unstable. This prevents us to
> > > > >     optimze for the special case where `I::IntoIter` equals `Vec`'s
> > > > >     `IntoIter` type.
> > > > >   - We also can't use `I::IntoIter`'s type ID either to work around this,
> > > > >     since `FromIterator` doesn't require this type to be `'static`.
> > > > >   - `FromIterator::from_iter` does return `Self` instead of
> > > > >     `Result<Self, AllocError>`, hence we can't properly handle allocation
> > > > >     failures.
> > > > >   - Neither `Iterator::collect` nor `FromIterator::from_iter` can handle
> > > > >     additional allocation flags.
> > > > >
> > > > > Instead, provide `IntoIter::collect`, such that we can at least convert
> > > > > `IntoIter` into a `Vec` again.
> > > > >
> > > > > Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> > > >
> > > > I'm not convinced a collect implementation specific to IntoIter is necessary?
> > >
> > > For the reasons above, we can't implement `FromIterator`. At some point we may
> > > want to implement our own kernel `FromIterator` trait, but that's out of scope
> > > for this series.
> > >
> > > For now, I just want to provide a way to get a `Vec` from `IntoIter` again,
> > > which without `Vec::collect` would be impossible.
> >
> > If you have a need for a collect on this particular kind of iterator, then okay.
>
> Even once we have our own `FromIterator` trait, we probably want to keep this
> specific one besides the generic `collect` for optimization. With the generic
> one we'd otherwise copy into a new `Vec`.
>
> >
> > > > > +
> > > > > +        // SAFETY: `buf` points to the start of the backing buffer and `len` is guaranteed to be
> > > > > +        // smaller than `cap`. Depending on `alloc` this operation may shrink the buffer or leaves
> > > > > +        // it as it is.
> > > > > +        ptr = match unsafe { A::realloc(Some(buf.cast()), layout, flags) } {
> > > >
> > > > Why would you shrink it? You can just keep the capacity.
> > >
> > > What if the vector was pretty huge and meanwhile as advanced by a lot? I think
> > > we don't want to waste those resources.
> > >
> > > Ideally the corresponding `Allocator` implements a proper heuristic for when to
> > > actually shrink. For instance, krealloc() never shrinks, and it's probably not
> > > worth it. For vrealloc() though we clearly want to shrink properly (i.e. unmap
> > > and free spare pages) at some point.
> >
> > The Rust Vec never shrinks unless explicitly requested. But I guess
> > it's okay either way.
>
> It actually does, see [1]. But Rust's `Vec` does it less efficiently. It either
> keeps the `Vec` as it is, or creates a whole new one.
>
> [1] https://github.com/rust-lang/rust/blob/master/library/alloc/src/vec/spec_from_iter.rs#L39

Huh, surprising.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>

Alice
diff mbox series

Patch

diff --git a/rust/kernel/alloc/kvec.rs b/rust/kernel/alloc/kvec.rs
index 50e7705e5686..6f151ef5c988 100644
--- a/rust/kernel/alloc/kvec.rs
+++ b/rust/kernel/alloc/kvec.rs
@@ -636,6 +636,84 @@  impl<T, A> IntoIter<T, A>
     fn as_raw_mut_slice(&mut self) -> *mut [T] {
         ptr::slice_from_raw_parts_mut(self.ptr, self.len)
     }
+
+    fn into_raw_parts(self) -> (*mut T, NonNull<T>, usize, usize) {
+        let me = ManuallyDrop::new(self);
+        let ptr = me.ptr;
+        let buf = me.buf;
+        let len = me.len;
+        let cap = me.cap;
+        (ptr, buf, len, cap)
+    }
+
+    /// Same as `Iterator::collect` but specialized for `Vec`'s `IntoIter`.
+    ///
+    /// Currently, we can't implement `FromIterator`. There are a couple of issues with this trait
+    /// in the kernel, namely:
+    ///
+    /// - Rust's specialization feature is unstable. This prevents us to optimze for the special
+    ///   case where `I::IntoIter` equals `Vec`'s `IntoIter` type.
+    /// - We also can't use `I::IntoIter`'s type ID either to work around this, since `FromIterator`
+    ///   doesn't require this type to be `'static`.
+    /// - `FromIterator::from_iter` does return `Self` instead of `Result<Self, AllocError>`, hence
+    ///   we can't properly handle allocation failures.
+    /// - Neither `Iterator::collect` nor `FromIterator::from_iter` can handle additional allocation
+    ///   flags.
+    ///
+    /// Instead, provide `IntoIter::collect`, such that we can at least convert a `IntoIter` into a
+    /// `Vec` again.
+    ///
+    /// Note that `IntoIter::collect` doesn't require `Flags`, since it re-uses the existing backing
+    /// buffer. However, this backing buffer may be shrunk to the actual count of elements.
+    ///
+    /// # Examples
+    ///
+    /// ```
+    /// let v = kernel::kvec![1, 2, 3]?;
+    /// let mut it = v.into_iter();
+    ///
+    /// assert_eq!(it.next(), Some(1));
+    ///
+    /// let v = it.collect(GFP_KERNEL);
+    /// assert_eq!(v, [2, 3]);
+    ///
+    /// # Ok::<(), Error>(())
+    /// ```
+    pub fn collect(self, flags: Flags) -> Vec<T, A> {
+        let (mut ptr, buf, len, mut cap) = self.into_raw_parts();
+        let has_advanced = ptr != buf.as_ptr();
+
+        if has_advanced {
+            // SAFETY: Copy the contents we have advanced to at the beginning of the buffer.
+            // `ptr` is guaranteed to be between `buf` and `buf.add(cap)` and `ptr.add(len)` is
+            // guaranteed to be smaller than `buf.add(cap)`.
+            unsafe { ptr::copy(ptr, buf.as_ptr(), len) };
+            ptr = buf.as_ptr();
+        }
+
+        // This can never fail, `len` is guaranteed to be smaller than `cap`.
+        let layout = core::alloc::Layout::array::<T>(len).unwrap();
+
+        // SAFETY: `buf` points to the start of the backing buffer and `len` is guaranteed to be
+        // smaller than `cap`. Depending on `alloc` this operation may shrink the buffer or leaves
+        // it as it is.
+        ptr = match unsafe { A::realloc(Some(buf.cast()), layout, flags) } {
+            // If we fail to shrink, which likely can't even happen, continue with the existing
+            // buffer.
+            Err(_) => ptr,
+            Ok(ptr) => {
+                cap = len;
+                ptr.as_ptr().cast()
+            }
+        };
+
+        // SAFETY: If the iterator has been advanced, the advanced elements have been copied to
+        // the beginning of the buffer and `len` has been adjusted accordingly. `ptr` is guaranteed
+        // to point to the start of the backing buffer. `cap` is either the original capacity or,
+        // after shrinking the buffer, equal to `len`. `alloc` is guaranteed to be unchanged since
+        // `into_iter` has been called on the original `Vec`.
+        unsafe { Vec::from_raw_parts(ptr, len, cap) }
+    }
 }
 
 impl<T, A> Iterator for IntoIter<T, A>