diff mbox series

[v8,2/2] rust: xarray: Add an abstraction for XArray

Message ID 20240309235927.168915-4-mcanal@igalia.com (mailing list archive)
State New
Headers show
Series rust: xarray: Add an abstraction for XArray | expand

Commit Message

Maíra Canal March 9, 2024, 11:57 p.m. UTC
From: Asahi Lina <lina@asahilina.net>

The XArray is an abstract data type which behaves like a very large
array of pointers. Add a Rust abstraction for this data type.

The initial implementation uses explicit locking on get operations and
returns a guard which blocks mutation, ensuring that the referenced
object remains alive. To avoid excessive serialization, users are
expected to use an inner type that can be efficiently cloned (such as
Arc<T>), and eagerly clone and drop the guard to unblock other users
after a lookup.

Future variants may support using RCU instead to avoid mutex locking.

This abstraction also introduces a reservation mechanism, which can be
used by alloc-capable XArrays to reserve a free slot without immediately
filling it, and then do so at a later time. If the reservation is
dropped without being filled, the slot is freed again for other users,
which eliminates the need for explicit cleanup code.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Co-developed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com>
---
 rust/bindings/bindings_helper.h |  17 ++
 rust/helpers.c                  |  37 +++
 rust/kernel/lib.rs              |   1 +
 rust/kernel/xarray.rs           | 407 ++++++++++++++++++++++++++++++++
 4 files changed, 462 insertions(+)
 create mode 100644 rust/kernel/xarray.rs

Comments

Alice Ryhl March 18, 2024, 12:10 p.m. UTC | #1
On Sun, Mar 10, 2024 at 1:00 AM Maíra Canal <mcanal@igalia.com> wrote:
>
> From: Asahi Lina <lina@asahilina.net>
>
> The XArray is an abstract data type which behaves like a very large
> array of pointers. Add a Rust abstraction for this data type.
>
> The initial implementation uses explicit locking on get operations and
> returns a guard which blocks mutation, ensuring that the referenced
> object remains alive. To avoid excessive serialization, users are
> expected to use an inner type that can be efficiently cloned (such as
> Arc<T>), and eagerly clone and drop the guard to unblock other users
> after a lookup.
>
> Future variants may support using RCU instead to avoid mutex locking.
>
> This abstraction also introduces a reservation mechanism, which can be
> used by alloc-capable XArrays to reserve a free slot without immediately
> filling it, and then do so at a later time. If the reservation is
> dropped without being filled, the slot is freed again for other users,
> which eliminates the need for explicit cleanup code.
>
> Signed-off-by: Asahi Lina <lina@asahilina.net>
> Co-developed-by: Maíra Canal <mcanal@igalia.com>
> Signed-off-by: Maíra Canal <mcanal@igalia.com>
> Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com>

Overall looks good to me.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>

> +        if ret < 0 {
> +            Err(Error::from_errno(ret))
> +        } else {
> +            guard.dismiss();
> +            Ok(id as usize)
> +        }

You could make this easier to read using to_result.

to_result(ret)?;
guard.dismiss();
Ok(id as usize)

Alice
Philipp Stanner March 19, 2024, 9:32 a.m. UTC | #2
On Mon, 2024-03-18 at 13:10 +0100, Alice Ryhl wrote:
> On Sun, Mar 10, 2024 at 1:00 AM Maíra Canal <mcanal@igalia.com>
> wrote:
> > 
> > From: Asahi Lina <lina@asahilina.net>
> > 
> > The XArray is an abstract data type which behaves like a very large
> > array of pointers. Add a Rust abstraction for this data type.
> > 
> > The initial implementation uses explicit locking on get operations
> > and
> > returns a guard which blocks mutation, ensuring that the referenced
> > object remains alive. To avoid excessive serialization, users are
> > expected to use an inner type that can be efficiently cloned (such
> > as
> > Arc<T>), and eagerly clone and drop the guard to unblock other
> > users
> > after a lookup.
> > 
> > Future variants may support using RCU instead to avoid mutex
> > locking.
> > 
> > This abstraction also introduces a reservation mechanism, which can
> > be
> > used by alloc-capable XArrays to reserve a free slot without
> > immediately
> > filling it, and then do so at a later time. If the reservation is
> > dropped without being filled, the slot is freed again for other
> > users,
> > which eliminates the need for explicit cleanup code.
> > 
> > Signed-off-by: Asahi Lina <lina@asahilina.net>
> > Co-developed-by: Maíra Canal <mcanal@igalia.com>
> > Signed-off-by: Maíra Canal <mcanal@igalia.com>
> > Reviewed-by: Andreas Hindborg <a.hindborg@samsung.com>
> 
> Overall looks good to me.
> 
> Reviewed-by: Alice Ryhl <aliceryhl@google.com>
> 
> > +        if ret < 0 {
> > +            Err(Error::from_errno(ret))
> > +        } else {
> > +            guard.dismiss();
> > +            Ok(id as usize)
> > +        }
> 
> You could make this easier to read using to_result.
> 
> to_result(ret)?;
> guard.dismiss();
> Ok(id as usize)

My 2 cents, I'd go for classic kernel style:


if ret < 0 {
    return Err(...);
}

guard.dismiss();
Ok(id as usize)


P.

> 
> Alice
>
John Hubbard March 22, 2024, 12:22 a.m. UTC | #3
On 3/19/24 2:32 AM, Philipp Stanner wrote:
...
>>
>>> +        if ret < 0 {
>>> +            Err(Error::from_errno(ret))
>>> +        } else {
>>> +            guard.dismiss();
>>> +            Ok(id as usize)
>>> +        }
>>
>> You could make this easier to read using to_result.
>>
>> to_result(ret)?;
>> guard.dismiss();
>> Ok(id as usize)
> 
> My 2 cents, I'd go for classic kernel style:
> 
> 
> if ret < 0 {
>      return Err(...);
> }
> 
> guard.dismiss();
> Ok(id as usize)
> 

Yes.

As a "standard" C-based kernel person who is trying to move into Rust
for Linux, I hereby invoke my Rust-newbie powers to confirm that the
"clasic kernel style" above goes into my head much faster and easier,
yes. :)

I hope I'm not violating any "this is how Rust idioms must be"
conventions. But if not, then all other things being equal, it is of
course a nice touch to make the code more readable to the rest of the
kernel folks.


thanks,
Boqun Feng March 22, 2024, 1:20 a.m. UTC | #4
On Thu, Mar 21, 2024 at 05:22:11PM -0700, John Hubbard wrote:
> On 3/19/24 2:32 AM, Philipp Stanner wrote:
> ...
> > > 
> > > > +        if ret < 0 {
> > > > +            Err(Error::from_errno(ret))
> > > > +        } else {
> > > > +            guard.dismiss();
> > > > +            Ok(id as usize)
> > > > +        }
> > > 
> > > You could make this easier to read using to_result.
> > > 
> > > to_result(ret)?;
> > > guard.dismiss();
> > > Ok(id as usize)
> > 
> > My 2 cents, I'd go for classic kernel style:
> > 
> > 
> > if ret < 0 {
> >      return Err(...);
> > }
> > 
> > guard.dismiss();
> > Ok(id as usize)
> > 
> 
> Yes.
> 
> As a "standard" C-based kernel person who is trying to move into Rust
> for Linux, I hereby invoke my Rust-newbie powers to confirm that the
> "clasic kernel style" above goes into my head much faster and easier,
> yes. :)
> 

Hope I'm still belong to a "standard C-based kernel person" ;-) My
problem on "if ret < 0 { ... }" is: what's the type of "ret", and is it
"negative means failure" or "non-zero means failure"? Now for this
particular code, the assignment of "ret" and the use of "ret" is pretty
close, so it doesn't matter. But in the code where "ret" is used for
multiple function calls and there is code in-between, then I'd prefer we
use `to_result` (i.e. `Result` type and question mark operator).

> I hope I'm not violating any "this is how Rust idioms must be"
> conventions. But if not, then all other things being equal, it is of
> course a nice touch to make the code more readable to the rest of the
> kernel folks.
> 

One more extra point from myself only: if one is using Rust for drivers
or subsystem they are going to take care of it in the future, it's
totally fine for them to pick coding styles that they feel comfortable,
I don't want to make people who do the real work feel frustrated because
"this is how Rust idioms must be", also I don't believe tools should
restrict people. But in the "kernel" crate (i.e. for core kernel part),
I want to make it "Rusty" since it's the point of the experiement ("you
asked for it ;-)).

Hope we can find a balanced point when we learn more and more ;-)

Regards,
Boqun

> 
> thanks,
> -- 
> John Hubbard
> NVIDIA
>
John Hubbard March 22, 2024, 1:34 a.m. UTC | #5
On 3/21/24 6:20 PM, Boqun Feng wrote:
> On Thu, Mar 21, 2024 at 05:22:11PM -0700, John Hubbard wrote:
>> On 3/19/24 2:32 AM, Philipp Stanner wrote:
>> ...
>>>>
>>>>> +        if ret < 0 {
>>>>> +            Err(Error::from_errno(ret))
>>>>> +        } else {
>>>>> +            guard.dismiss();
>>>>> +            Ok(id as usize)
>>>>> +        }
>>>>
>>>> You could make this easier to read using to_result.
>>>>
>>>> to_result(ret)?;
>>>> guard.dismiss();
>>>> Ok(id as usize)
>>>
>>> My 2 cents, I'd go for classic kernel style:
>>>
>>>
>>> if ret < 0 {
>>>       return Err(...);
>>> }
>>>
>>> guard.dismiss();
>>> Ok(id as usize)
>>>
>>
>> Yes.
>>
>> As a "standard" C-based kernel person who is trying to move into Rust
>> for Linux, I hereby invoke my Rust-newbie powers to confirm that the
>> "clasic kernel style" above goes into my head much faster and easier,
>> yes. :)
>>
> 
> Hope I'm still belong to a "standard C-based kernel person" ;-) My
> problem on "if ret < 0 { ... }" is: what's the type of "ret", and is it
> "negative means failure" or "non-zero means failure"? Now for this
> particular code, the assignment of "ret" and the use of "ret" is pretty
> close, so it doesn't matter. But in the code where "ret" is used for
> multiple function calls and there is code in-between, then I'd prefer we
> use `to_result` (i.e. `Result` type and question mark operator).

No argument there. I was more focused on the readability of: do a clean
check for failure, and returning directly in that case. And then
continuing on with non-failure-case code. The to_result() part makes
sense and can fit right into that.

  
>> I hope I'm not violating any "this is how Rust idioms must be"
>> conventions. But if not, then all other things being equal, it is of
>> course a nice touch to make the code more readable to the rest of the
>> kernel folks.
>>
> 
> One more extra point from myself only: if one is using Rust for drivers
> or subsystem they are going to take care of it in the future, it's
> totally fine for them to pick coding styles that they feel comfortable,
> I don't want to make people who do the real work feel frustrated because
> "this is how Rust idioms must be", also I don't believe tools should
> restrict people. But in the "kernel" crate (i.e. for core kernel part),
> I want to make it "Rusty" since it's the point of the experiement ("you
> asked for it ;-)).
>

OK, I understand that reasoning.
  
> Hope we can find a balanced point when we learn more and more ;-)

Yes. And to continue with the metaphor, I'm attempting here to put a
light thumb on the scale. :)


thanks,
Benno Lossin March 22, 2024, 11:12 a.m. UTC | #6
On 3/10/24 00:57, Maíra Canal wrote:

[...]

> +// Convenience impl for [`ForeignOwnable`] types whose [`Borrowed`] form
> +// implements [`Deref`].
> +impl<'a, T: ForeignOwnable> Deref for Guard<'a, T>
> +where
> +    T::Borrowed<'a>: Deref,
> +    for<'b> T::Borrowed<'b>: Into<&'b <T::Borrowed<'a> as Deref>::Target>,

I took a look at this again and it currently is not really helpful.
`ArcBorrow<'b, T>` does not implement `Into<&'b T>` and it never is
going to, since it is impossible due to Rust's orphan rules. We have two
options:
1. Remove this impl.
2. Create a `IntoRef` trait that does the job instead and implement it
    for `ArcBorrow`.

If you go for option 2, I think `IntoRef` should look like this:
     pub trait IntoRef<'a, T: ?Sized> {
         fn into_ref(self) -> &'a T;
     }

But this can also easily be done in a future patch, so feel free to
choose option 1.

> +{
> +    type Target = <T::Borrowed<'a> as Deref>::Target;
> +
> +    fn deref(&self) -> &Self::Target {
> +        self.borrow().into()
> +    }
> +}
> +
> +impl<'a, T: ForeignOwnable> Drop for Guard<'a, T> {
> +    fn drop(&mut self) {
> +        // SAFETY: By the type invariant, we own the [`XArray`] lock, so we must
> +        // unlock it here.

I think the "so we must unlock it here" part is unnecessary.

> +        unsafe { bindings::xa_unlock(self.1.xa.get()) };
> +    }
> +}

[...]

> +/// # Examples
> +///
> +/// ```rust
> +/// use kernel::xarray::{XArray, flags};
> +/// use kernel::sync::Arc;
> +///
> +/// struct Foo {
> +///     a: u32,
> +///     b: u32,
> +/// }
> +///
> +/// let arr = Box::pin_init(XArray::<Arc<Foo>>::new(flags::ALLOC1))?;
> +///
> +/// let item = Arc::try_new(Foo { a : 1, b: 2 })?;
> +/// let index = arr.alloc(item)?;
> +///
> +/// if let Some(guard) = arr.get_locked(index) {
> +///     assert_eq!(guard.borrow().a, 1);
> +///     assert_eq!(guard.borrow().b, 2);
> +/// } else {
> +///     pr_info!("No value found in index {}", index);
> +/// }
> +///
> +/// let item = Arc::try_new(Foo { a : 3, b: 4 })?;
> +/// let index = arr.alloc(item)?;
> +///
> +/// if let Some(removed_data) = arr.remove(index) {
> +///     assert_eq!(removed_data.a, 3);
> +///     assert_eq!(removed_data.b, 4);
> +/// } else {
> +///     pr_info!("No value found in index {}", index);
> +/// }
> +/// # Ok::<(), Error>(())
> +/// ```

I think it would make more sense to move these examples into the
documentation of `XArray<T>` itself.

> +impl<T: ForeignOwnable> XArray<T> {
> +    /// Creates a new [`XArray`] with the given flags.
> +    pub fn new(flags: Flags) -> impl PinInit<Self> {
> +        pin_init!(Self {
> +            // SAFETY: `xa` is valid while the closure is called.
> +            xa <- Opaque::ffi_init(|xa| unsafe {
> +                bindings::xa_init_flags(xa, flags)
> +            }),
> +            _p: PhantomData,
> +        })
> +    }
> +
> +    /// Converts [`usize`] to `unsigned long`.
> +    fn to_index(i: usize) -> c_ulong {
> +        // The type is `unsigned long`, which is always the same as `usize` in
> +        // the kernel.
> +        build_assert!(mem::size_of::<usize>() == mem::size_of::<c_ulong>());
> +        i as c_ulong
> +    }
> +
> +    /// Replaces an entry with a new value, returning the old value (if any).
> +    pub fn replace(&self, index: usize, value: T) -> Result<Option<T>> {
> +        let new = value.into_foreign();
> +
> +        build_assert!(T::FOREIGN_ALIGN >= 4);

This looks good. Which unsafe function call required this again? I can't
find it in the safety comments below.

As I wrote in my mail to the first patch, `ForeignOwnable` needs to
guarantee that `T::FOREIGN_ALIGN` actually is the alignment of the
pointers returned by `into_foreign()`, so it must be an `unsafe` trait.

> +
> +        // SAFETY: `new` just came from [`into_foreign()`], and we dismiss this guard
> +        // if the `xa_store` operation succeeds and takes ownership of the pointer.
> +        let guard = ScopeGuard::new(|| unsafe {
> +            drop(T::from_foreign(new));
> +        });
> +
> +        // SAFETY: `self.xa` is always valid by the type invariant, and we are
> +        // storing a [`T::into_foreign()`] result which upholds the later invariants.

The second part of this comment should be an `INVARIANT` comment
instead. So "INVARIANT: `new` came from [`T::into_foreign()`]."

> +        let old = unsafe {
> +            bindings::xa_store(
> +                self.xa.get(),
> +                Self::to_index(index),
> +                new.cast_mut(),
> +                bindings::GFP_KERNEL,
> +            )
> +        };
> +
> +        // SAFETY: `xa_store` returns the old entry at this index on success or
> +        // a [`XArray`] result, which can be turn into an errno through `xa_err`.
> +        to_result(unsafe { bindings::xa_err(old) })?;
> +        guard.dismiss();
> +
> +        Ok(if old.is_null() {
> +            None
> +        } else {
> +            // SAFETY: The old value must have been stored by either this function or
> +            // `insert_between`, both of which ensure non-NULL entries are valid
> +            // [`ForeignOwnable`] pointers.

This should be replaced by "by the type invariant of `Self`, `old` is
either NULL, or came from `T::into_foreign()`.".

> +            Some(unsafe { T::from_foreign(old) })
> +        })
> +    }

[...]

> +    /// Allocates a new index in the array, optionally storing a new value into
> +    /// it, with configurable bounds for the index range to allocate from.
> +    ///
> +    /// If `value` is [`None`], then the index is reserved from further allocation
> +    /// but remains free for storing a value into it.
> +    fn insert_between(&self, value: Option<T>, min: u32, max: u32) -> Result<usize> {
> +        let new = value.map_or(core::ptr::null(), |a| a.into_foreign());
> +        let mut id: u32 = 0;
> +
> +        let guard = ScopeGuard::new(|| {
> +            if !new.is_null() {
> +                // SAFETY: If `new` is not NULL, it came from the [`ForeignOwnable`]
> +                // we got from the caller.
> +                unsafe { drop(T::from_foreign(new)) };
> +            }
> +        });
> +
> +        // SAFETY: `self.xa` is always valid by the type invariant.
> +        //
> +        // If this succeeds, it takes ownership of the passed `T` (if any).
> +        // If it fails, we must drop the `T` again.
> +        let ret = unsafe {
> +            bindings::xa_alloc(
> +                self.xa.get(),
> +                &mut id,
> +                new.cast_mut(),
> +                bindings::xa_limit { min, max },
> +                bindings::GFP_KERNEL,
> +            )
> +        };

This function call is missing an INVARIANT comment stating that `new` is
either NULL or came from `T::into_foreign()`.

> +
> +        if ret < 0 {
> +            Err(Error::from_errno(ret))
> +        } else {
> +            guard.dismiss();
> +            Ok(id as usize)
> +        }
> +    }

[...]

> +// SAFETY: It is safe to send `XArray<T>` to another thread when the underlying
> +// `T` is `Send` because XArray is thread-safe and all mutation operations are
> +// internally locked. `T` must be `ForeignOwnable` because all pointers stored
> +// in the array are pointers obtained by calling `T::into_foreign` or are NULL
> +// pointers.
> +unsafe impl<T: Send + ForeignOwnable> Send for XArray<T> {}
> +//
> +// SAFETY: It is safe to send `&XArray<T>` to another thread when the underlying
> +// `T` is `Sync` because it effectively means sharing `&T` (which is safe because
> +// `T` is `Sync`). Additionally, `T` is `Send` because XArray is thread-safe and
> +// all mutation operations are internally locked. `T` must be `ForeignOwnable`
> +// because all pointers stored in the array are pointers obtained by calling
> +// `T::into_foreign` or are NULL pointers.
> +unsafe impl<T: Send + Sync + ForeignOwnable> Sync for XArray<T> {}

The part about "`T` must be `ForeignOwnable`..." is not needed here.
Philipp Stanner March 26, 2024, 7:41 a.m. UTC | #7
On Thu, 2024-03-21 at 18:20 -0700, Boqun Feng wrote:
> On Thu, Mar 21, 2024 at 05:22:11PM -0700, John Hubbard wrote:
> > On 3/19/24 2:32 AM, Philipp Stanner wrote:
> > ...
> > > > 
> > > > > +        if ret < 0 {
> > > > > +            Err(Error::from_errno(ret))
> > > > > +        } else {
> > > > > +            guard.dismiss();
> > > > > +            Ok(id as usize)
> > > > > +        }
> > > > 
> > > > You could make this easier to read using to_result.
> > > > 
> > > > to_result(ret)?;
> > > > guard.dismiss();
> > > > Ok(id as usize)
> > > 
> > > My 2 cents, I'd go for classic kernel style:
> > > 
> > > 
> > > if ret < 0 {
> > >      return Err(...);
> > > }
> > > 
> > > guard.dismiss();
> > > Ok(id as usize)
> > > 
> > 
> > Yes.
> > 
> > As a "standard" C-based kernel person who is trying to move into
> > Rust
> > for Linux, I hereby invoke my Rust-newbie powers to confirm that
> > the
> > "clasic kernel style" above goes into my head much faster and
> > easier,
> > yes. :)
> > 
> 
> Hope I'm still belong to a "standard C-based kernel person" ;-) My
> problem on "if ret < 0 { ... }" is: what's the type of "ret", and is
> it
> "negative means failure" or "non-zero means failure"? Now for this
> particular code, the assignment of "ret" and the use of "ret" is
> pretty
> close, so it doesn't matter. But in the code where "ret" is used for
> multiple function calls and there is code in-between, then I'd prefer
> we
> use `to_result` (i.e. `Result` type and question mark operator).

Yeah, the issue we're running into here is that it's a quite different
way of programming since you create bindings on the fly with `let`.

I actually think it would be nice to have variable declarations at the
top of the functions, but that would obviously very frequently collide
with Rust's concept of data being immutable by default.

Regarding our example, my hope would be that `if ret < 0 ` should
always be fine, because, what else could it be? A float?
And such patterns should only appear where you interact with C, since
pure Rust should definitely always have all result status packed in
Result or Option.

> 
> > I hope I'm not violating any "this is how Rust idioms must be"
> > conventions. But if not, then all other things being equal, it is
> > of
> > course a nice touch to make the code more readable to the rest of
> > the
> > kernel folks.
> > 
> 
> One more extra point from myself only: if one is using Rust for
> drivers
> or subsystem they are going to take care of it in the future, it's
> totally fine for them to pick coding styles that they feel
> comfortable,
> I don't want to make people who do the real work feel frustrated
> because
> "this is how Rust idioms must be", also I don't believe tools should
> restrict people.

I'm very happy to hear that we're on sync here :)

Though there will be a lot to be discussed and done. As I see it, so
far the clippy rules for example are global, aren't they?

In any case I don't even think they shouldn't be global, but that they
should behave more like checkpatch.

I think the ideal end result would be one where we have rules and
enforcement similar to how it's done in C:
C has checkpatch, which fires errors for some things, and warnings for
others. And, if it makes sense, you or the maintainer can ignore the
warning.

When the build chain checks global rules we can't really ignore any
rule without some kind of annotation in the code, because otherwise
we'd permanently spam the build log


>  But in the "kernel" crate (i.e. for core kernel part),
> I want to make it "Rusty" since it's the point of the experiement
> ("you
> asked for it ;-)).
> 
> Hope we can find a balanced point when we learn more and more ;-)

Yeah, I get where that comes from. Many new languages attempt to end
endless style discussions etc. by axiomatically defining a canonical
style.

But we're living in this special world called kernel with its own laws
of physics so to speak. To some degree we already have to use a
"dialect / flavor" of Rust (CStr, try_push()?)
Will be interesting to see how it all plays out for us once the users
and use cases increase in number more and more


Many greetings,
P.

> 
> Regards,
> Boqun
> 
> > 
> > thanks,
> > -- 
> > John Hubbard
> > NVIDIA
> > 
>
Miguel Ojeda March 26, 2024, 12:16 p.m. UTC | #8
On Fri, Mar 22, 2024 at 2:20 AM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> One more extra point from myself only: if one is using Rust for drivers
> or subsystem they are going to take care of it in the future, it's
> totally fine for them to pick coding styles that they feel comfortable,
> I don't want to make people who do the real work feel frustrated because
> "this is how Rust idioms must be", also I don't believe tools should
> restrict people. But in the "kernel" crate (i.e. for core kernel part),
> I want to make it "Rusty" since it's the point of the experiement ("you
> asked for it ;-)).

We should aim to be as consistent as possible for all the kernel, not
just the "core kernel".

Yes, there should be flexibility. In fact, sometimes it is just
impossible, unreasonable, impractical and/or annoying to be
consistent. And, as you say, we should definitely avoid making people
frustrated for inane reasons.

But we should not introduce flexibility for bad reasons either.

And this is the kind of thing that it is very hard to restrict later.
So even if we are a bit "over the top" in some cases now (the current
compiler/Clippy flags could be already arguably so, at times), the
point is to figure out what makes sense to keep long term. In other
words, instead of discussing how to create local coding styles, we
should be discussing what should be changed for everybody (e.g.
because a particular diagnostic may too annoying now, because a style
is found to be better than another like `if ret < 0 ...` vs.
`to_result` and so on).

Cheers,
Miguel
Miguel Ojeda March 26, 2024, 12:23 p.m. UTC | #9
On Tue, Mar 26, 2024 at 8:41 AM Philipp Stanner <pstanner@redhat.com> wrote:
>
> I actually think it would be nice to have variable declarations at the
> top of the functions, but that would obviously very frequently collide
> with Rust's concept of data being immutable by default.

It is true that deferring bindings sometimes allows for more
immutability (this also applies to kernel C to some degree with
variables at the top, by the way). However, it is more about other
things like when new objects can be constructed (when can invariants
be set), when destructors run and in what order, even the type itself
(rebinding).

But it is not just this (where to place variables/bindings) -- there
are other kernel C guidelines that do not make sense for Rust, and
trying to apply them is a bad idea.

Where possible and where it makes sense, we should try to keep it
consistent, of course. For instance, naming concerns have been
discussed many times (avoiding different names for existing things) --
that can be indeed be confusing and introduce mental overhead, and
thus we try to avoid it, even if sometimes it may have been locally
the best solution.

> Regarding our example, my hope would be that `if ret < 0 ` should
> always be fine, because, what else could it be? A float?

I think one of the things Boqun was referring to were the semantics of
the value, which is why one advantage of `to_result` is that, when
dealing with a C function that does not follow the usual pattern, it
stands out more.

> I'm very happy to hear that we're on sync here :)
>
> Though there will be a lot to be discussed and done. As I see it, so
> far the clippy rules for example are global, aren't they?

Yeah, and that is intentional.

Of course, as with anything else, there can be exceptions where they
make sense. But the point is to avoid diverging in how we write the
code in different parts of the kernel unless there is a good reason.

This is essentially the same as the formatting discussion. The flags
for `rustfmt`, Clippy or the compiler diagnostics can and will change
to try to come up with the optimal set as we learn more (and as new
flags appear or if they change behavior in a way that we didn't expect
-- hopefully not -- etc.), but they should be consistent for the
entire kernel, except where there is a good reason not to.

> I think the ideal end result would be one where we have rules and
> enforcement similar to how it's done in C:
> C has checkpatch, which fires errors for some things, and warnings for
> others. And, if it makes sense, you or the maintainer can ignore the
> warning.

I may not be understanding this part, but that is already the case, no?

In any case, I am not sure we should look at `checkpatch.pl` as a
reference here, because it is a tool that we apply against patches,
i.e. something done at a given point in time once against "ephemeral"
entities.

In other words, it is always possible to ignore any and all of its
warnings/errors, at the discretion of the maintainer, and thus it is
not as big of a deal to have false positives for the checks (in fact,
it is good that it does, because it allows `checkpatch.pl` more leeway
to catch more issues).

We could have different sets like with `W=`, though, including perhaps
one for "development" where it is even more relaxed (e.g. no missing
documentation warning).

> When the build chain checks global rules we can't really ignore any
> rule without some kind of annotation in the code, because otherwise
> we'd permanently spam the build log

That is the intention, i.e. that they are enforced as much as
reasonably possible, but one can still ignore the rule if really
needed (and even better, one can do it as locally as possible, e.g.
right at the item that requires it instead of file-wide).

> Yeah, I get where that comes from. Many new languages attempt to end
> endless style discussions etc. by axiomatically defining a canonical
> style.
>
> But we're living in this special world called kernel with its own laws
> of physics so to speak. To some degree we already have to use a
> "dialect / flavor" of Rust (CStr, try_push()?)
> Will be interesting to see how it all plays out for us once the users
> and use cases increase in number more and more

The "dialect" you are speaking about is global for the kernel. That is
completely fine, and as you say, needed in some aspects.

But what we do not want is to end up with different "dialects" within
the kernel, unless there is a very good reason for exceptional cases.

Cheers,
Miguel
Philipp Stanner March 26, 2024, 1:53 p.m. UTC | #10
On Tue, 2024-03-26 at 13:23 +0100, Miguel Ojeda wrote:
> On Tue, Mar 26, 2024 at 8:41 AM Philipp Stanner <pstanner@redhat.com>
> wrote:
> > 
> > I actually think it would be nice to have variable declarations at
> > the
> > top of the functions, but that would obviously very frequently
> > collide
> > with Rust's concept of data being immutable by default.
> 
> It is true that deferring bindings sometimes allows for more
> immutability (this also applies to kernel C to some degree with
> variables at the top, by the way). However, it is more about other
> things like when new objects can be constructed (when can invariants
> be set), when destructors run and in what order, even the type itself
> (rebinding).
> 
> But it is not just this (where to place variables/bindings) -- there
> are other kernel C guidelines that do not make sense for Rust, and
> trying to apply them is a bad idea.
> 
> Where possible and where it makes sense, we should try to keep it
> consistent, of course. For instance, naming concerns have been
> discussed many times (avoiding different names for existing things) -
> -
> that can be indeed be confusing and introduce mental overhead, and
> thus we try to avoid it, even if sometimes it may have been locally
> the best solution.
> 
> > Regarding our example, my hope would be that `if ret < 0 ` should
> > always be fine, because, what else could it be? A float?
> 
> I think one of the things Boqun was referring to were the semantics
> of
> the value, which is why one advantage of `to_result` is that, when
> dealing with a C function that does not follow the usual pattern, it
> stands out more.

You mean if `ret` was a pointer that, consequently, would have to be
checked for equality rather then for less-than 0?

> 
> > I'm very happy to hear that we're on sync here :)
> > 
> > Though there will be a lot to be discussed and done. As I see it,
> > so
> > far the clippy rules for example are global, aren't they?
> 
> Yeah, and that is intentional.
> 
> Of course, as with anything else, there can be exceptions where they
> make sense. But the point is to avoid diverging in how we write the
> code in different parts of the kernel unless there is a good reason.
> 
> This is essentially the same as the formatting discussion. The flags
> for `rustfmt`, Clippy or the compiler diagnostics can and will change
> to try to come up with the optimal set as we learn more (and as new
> flags appear or if they change behavior in a way that we didn't
> expect
> -- hopefully not -- etc.), but they should be consistent for the
> entire kernel, except where there is a good reason not to.
> 
> > I think the ideal end result would be one where we have rules and
> > enforcement similar to how it's done in C:
> > C has checkpatch, which fires errors for some things, and warnings
> > for
> > others. And, if it makes sense, you or the maintainer can ignore
> > the
> > warning.
> 
> I may not be understanding this part, but that is already the case,
> no?

The difference is that checkpatch is an additional tool that is not
tied to the compiler or build chain.

> 
> In any case, I am not sure we should look at `checkpatch.pl` as a
> reference here, because it is a tool that we apply against patches,
> i.e. something done at a given point in time once against "ephemeral"
> entities.
> 
> In other words, it is always possible to ignore any and all of its
> warnings/errors, at the discretion of the maintainer, and thus it is
> not as big of a deal to have false positives for the checks (in fact,
> it is good that it does, because it allows `checkpatch.pl` more
> leeway
> to catch more issues).
> 
> We could have different sets like with `W=`, though, including
> perhaps
> one for "development" where it is even more relaxed (e.g. no missing
> documentation warning).
> 
> > When the build chain checks global rules we can't really ignore any
> > rule without some kind of annotation in the code, because otherwise
> > we'd permanently spam the build log
> 
> That is the intention, i.e. that they are enforced as much as
> reasonably possible, but one can still ignore the rule if really
> needed (and even better, one can do it as locally as possible, e.g.
> right at the item that requires it instead of file-wide).

Yes, and I think that shouldn't be done. The C coding guideline is more
of a recommendation.

Nothing in C complains about you doing a `return;` over a `goto out;`.
The current setting, however, would fire warnings (or isn't it even
classified as a build error?) if you add `return` in Rust at the end of
a code path.

I believe that memory-safety-related mechanisms and rules should be
"enforced as much as reasonably possible", because that's the entire
point of Rust in the kernel.

Everything else should in my opinion be as free as in C.

> 
> > Yeah, I get where that comes from. Many new languages attempt to
> > end
> > endless style discussions etc. by axiomatically defining a
> > canonical
> > style.
> > 
> > But we're living in this special world called kernel with its own
> > laws
> > of physics so to speak. To some degree we already have to use a
> > "dialect / flavor" of Rust (CStr, try_push()?)
> > Will be interesting to see how it all plays out for us once the
> > users
> > and use cases increase in number more and more
> 
> The "dialect" you are speaking about is global for the kernel. That
> is
> completely fine, and as you say, needed in some aspects.
> 
> But what we do not want is to end up with different "dialects" within
> the kernel, unless there is a very good reason for exceptional cases.

Agreed, that should be global.

In fact, almost everything should be global.

The only thing we don't agree on is what should be a per-default
strictly enforced rule, and what should be a recommendation.


P.

> 
> Cheers,
> Miguel
>
Miguel Ojeda March 26, 2024, 5:03 p.m. UTC | #11
On Tue, Mar 26, 2024 at 2:53 PM Philipp Stanner <pstanner@redhat.com> wrote:
>
> You mean if `ret` was a pointer that, consequently, would have to be
> checked for equality rather then for less-than 0?

No, I meant a C function that happens to return an `int` with
different semantics (e.g. encoding for the error) than the usual ones
(Rust does not allow to compare a raw pointer with an integer).

> The difference is that checkpatch is an additional tool that is not
> tied to the compiler or build chain.

Perhaps a better comparison would be Sparse or Coccinelle then? (i.e.
in that we ideally would test the source code, not the patches, all
the time).

In any case, if you mean to have kernel-only lints, Gary created klint
[1] for that and Julia and Tathagata announced Coccinelle for Rust [2]
some months ago.

We hope we can start putting these tools into good use soon (not to
say they are not useful already, e.g. klint already found actual
issues in the past), ideally running them all the time for all code.

Unless you mean replicating all the compiler diagnostics -- that is
not really feasible (bandwidth-wise) and I am not sure what we would
gain in practice. We don't do that either for kernel C.

[1] https://rust-for-linux.com/klint
[2] https://rust-for-linux.com/coccinelle-for-rust

> Yes, and I think that shouldn't be done. The C coding guideline is more
> of a recommendation.

Kernel C is quite consistent (relative to what you could randomly do
in C), even including the big differences from some subsystems. So, in
practice, some of the guidelines are rules.

For Rust, we had the chance to be even more consistent (thanks to
having the tooling available from the start), and we went for that,
because we see that as a good thing that will pay off long term.

Cheers,
Miguel
diff mbox series

Patch

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 65b98831b975..7a6be07038ce 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -17,8 +17,25 @@ 
 #include <linux/slab.h>
 #include <linux/wait.h>
 #include <linux/workqueue.h>
+#include <linux/xarray.h>
 
 /* `bindgen` gets confused at certain things. */
 const size_t RUST_CONST_HELPER_ARCH_SLAB_MINALIGN = ARCH_SLAB_MINALIGN;
 const gfp_t RUST_CONST_HELPER_GFP_KERNEL = GFP_KERNEL;
 const gfp_t RUST_CONST_HELPER___GFP_ZERO = __GFP_ZERO;
+
+const gfp_t RUST_CONST_HELPER_XA_FLAGS_LOCK_IRQ = XA_FLAGS_LOCK_IRQ;
+const gfp_t RUST_CONST_HELPER_XA_FLAGS_LOCK_BH = XA_FLAGS_LOCK_BH;
+const gfp_t RUST_CONST_HELPER_XA_FLAGS_TRACK_FREE = XA_FLAGS_TRACK_FREE;
+const gfp_t RUST_CONST_HELPER_XA_FLAGS_ZERO_BUSY = XA_FLAGS_ZERO_BUSY;
+const gfp_t RUST_CONST_HELPER_XA_FLAGS_ALLOC_WRAPPED = XA_FLAGS_ALLOC_WRAPPED;
+const gfp_t RUST_CONST_HELPER_XA_FLAGS_ACCOUNT = XA_FLAGS_ACCOUNT;
+const gfp_t RUST_CONST_HELPER_XA_FLAGS_ALLOC = XA_FLAGS_ALLOC;
+const gfp_t RUST_CONST_HELPER_XA_FLAGS_ALLOC1 = XA_FLAGS_ALLOC1;
+
+const xa_mark_t RUST_CONST_HELPER_XA_MARK_0 = XA_MARK_0;
+const xa_mark_t RUST_CONST_HELPER_XA_MARK_1 = XA_MARK_1;
+const xa_mark_t RUST_CONST_HELPER_XA_MARK_2 = XA_MARK_2;
+const xa_mark_t RUST_CONST_HELPER_XA_PRESENT = XA_PRESENT;
+const xa_mark_t RUST_CONST_HELPER_XA_MARK_MAX = XA_MARK_MAX;
+const xa_mark_t RUST_CONST_HELPER_XA_FREE_MARK = XA_FREE_MARK;
diff --git a/rust/helpers.c b/rust/helpers.c
index 70e59efd92bc..72a7f9c596ad 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -31,6 +31,7 @@ 
 #include <linux/spinlock.h>
 #include <linux/wait.h>
 #include <linux/workqueue.h>
+#include <linux/xarray.h>
 
 __noreturn void rust_helper_BUG(void)
 {
@@ -157,6 +158,42 @@  void rust_helper_init_work_with_key(struct work_struct *work, work_func_t func,
 }
 EXPORT_SYMBOL_GPL(rust_helper_init_work_with_key);
 
+void rust_helper_xa_init_flags(struct xarray *xa, gfp_t flags)
+{
+	xa_init_flags(xa, flags);
+}
+EXPORT_SYMBOL_GPL(rust_helper_xa_init_flags);
+
+bool rust_helper_xa_empty(struct xarray *xa)
+{
+	return xa_empty(xa);
+}
+EXPORT_SYMBOL_GPL(rust_helper_xa_empty);
+
+int rust_helper_xa_alloc(struct xarray *xa, u32 *id, void *entry, struct xa_limit limit, gfp_t gfp)
+{
+	return xa_alloc(xa, id, entry, limit, gfp);
+}
+EXPORT_SYMBOL_GPL(rust_helper_xa_alloc);
+
+void rust_helper_xa_lock(struct xarray *xa)
+{
+	xa_lock(xa);
+}
+EXPORT_SYMBOL_GPL(rust_helper_xa_lock);
+
+void rust_helper_xa_unlock(struct xarray *xa)
+{
+	xa_unlock(xa);
+}
+EXPORT_SYMBOL_GPL(rust_helper_xa_unlock);
+
+int rust_helper_xa_err(void *entry)
+{
+	return xa_err(entry);
+}
+EXPORT_SYMBOL_GPL(rust_helper_xa_err);
+
 /*
  * `bindgen` binds the C `size_t` type as the Rust `usize` type, so we can
  * use it in contexts where Rust expects a `usize` like slice (array) indices.
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index be68d5e567b1..ec7e5b5c2d0a 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -50,6 +50,7 @@ 
 pub mod time;
 pub mod types;
 pub mod workqueue;
+pub mod xarray;
 
 #[doc(hidden)]
 pub use bindings;
diff --git a/rust/kernel/xarray.rs b/rust/kernel/xarray.rs
new file mode 100644
index 000000000000..02ca6ae2d160
--- /dev/null
+++ b/rust/kernel/xarray.rs
@@ -0,0 +1,407 @@ 
+// SPDX-License-Identifier: GPL-2.0
+
+//! XArray abstraction.
+//!
+//! C header: [`include/linux/xarray.h`](srctree/include/linux/xarray.h)
+
+use crate::{
+    bindings,
+    error::{to_result, Error, Result},
+    prelude::*,
+    types::{ForeignOwnable, Opaque, ScopeGuard},
+};
+use core::{ffi::{c_void, c_ulong}, marker::PhantomData, mem, ops::Deref, ptr::NonNull};
+
+/// Flags passed to [`XArray::new`] to configure the [`XArray`].
+type Flags = bindings::gfp_t;
+
+/// Flag values passed to [`XArray::new`] to configure the [`XArray`].
+pub mod flags {
+    /// Use IRQ-safe locking.
+    pub const LOCK_IRQ: super::Flags = bindings::XA_FLAGS_LOCK_IRQ;
+    /// Use softirq-safe locking.
+    pub const LOCK_BH: super::Flags = bindings::XA_FLAGS_LOCK_BH;
+    /// Track which entries are free (distinct from [`None`]).
+    pub const TRACK_FREE: super::Flags = bindings::XA_FLAGS_TRACK_FREE;
+    /// Initialize array index 0 as busy.
+    pub const ZERO_BUSY: super::Flags = bindings::XA_FLAGS_ZERO_BUSY;
+    /// Use GFP_ACCOUNT for internal memory allocations.
+    pub const ACCOUNT: super::Flags = bindings::XA_FLAGS_ACCOUNT;
+    /// Create an allocating [`XArray`] starting at index 0.
+    pub const ALLOC: super::Flags = bindings::XA_FLAGS_ALLOC;
+    /// Create an allocating [`XArray`] starting at index 1.
+    pub const ALLOC1: super::Flags = bindings::XA_FLAGS_ALLOC1;
+}
+
+/// An array which efficiently maps sparse integer indices to owned objects.
+///
+/// This is similar to a [`Vec<Option<T>>`], but more efficient when there are
+/// holes in the index space, and can be efficiently grown.
+///
+/// This structure is expected to often be used with an inner type that
+/// can be efficiently cloned, such as an [`Arc<T>`].
+///
+/// # Invariants
+///
+/// All pointers stored in the array are pointers obtained by calling
+/// `T::into_foreign` or are NULL pointers. `self.xa` is always an
+/// initialized and valid [`XArray`].
+#[pin_data(PinnedDrop)]
+pub struct XArray<T: ForeignOwnable> {
+    #[pin]
+    xa: Opaque<bindings::xarray>,
+    _p: PhantomData<T>,
+}
+
+/// Wrapper for a value owned by the [`XArray`] which holds the [`XArray`] lock until
+/// dropped.
+///
+/// You can use the [`into_foreign`] method to obtain a pointer to the foreign
+/// representation of the owned value, which is valid for the lifetime of the [`Guard`].
+///
+/// # Invariants
+///
+/// [`Guard`] holds a reference (`self.0`) to the underlying value owned by the
+/// [`XArray`] (`self.1`) with its lock held.
+pub struct Guard<'a, T: ForeignOwnable>(NonNull<c_void>, &'a XArray<T>);
+
+impl<'a, T: ForeignOwnable> Guard<'a, T> {
+    /// Borrow the underlying value wrapped by the [`Guard`].
+    ///
+    /// Returns a [`T::Borrowed`] type for the owned [`ForeignOwnable`] type.
+    pub fn borrow(&self) -> T::Borrowed<'_> {
+        // SAFETY: The value is owned by the [`XArray`], the lifetime it is
+        // borrowed for must not outlive the [`XArray`] itself, nor the [`Guard`]
+        // that holds the lock ensuring the value remains in the [`XArray`].
+        //
+        // By the type invariant of [`Guard`], we can guarantee that [`Guard`]
+        // holds this reference (`self.0`).
+        unsafe { T::borrow(self.0.as_ptr()) }
+    }
+}
+
+// Convenience impl for [`ForeignOwnable`] types whose [`Borrowed`] form
+// implements [`Deref`].
+impl<'a, T: ForeignOwnable> Deref for Guard<'a, T>
+where
+    T::Borrowed<'a>: Deref,
+    for<'b> T::Borrowed<'b>: Into<&'b <T::Borrowed<'a> as Deref>::Target>,
+{
+    type Target = <T::Borrowed<'a> as Deref>::Target;
+
+    fn deref(&self) -> &Self::Target {
+        self.borrow().into()
+    }
+}
+
+impl<'a, T: ForeignOwnable> Drop for Guard<'a, T> {
+    fn drop(&mut self) {
+        // SAFETY: By the type invariant, we own the [`XArray`] lock, so we must
+        // unlock it here.
+        unsafe { bindings::xa_unlock(self.1.xa.get()) };
+    }
+}
+
+/// Represents a reserved slot in an [`XArray`], which does not yet have a value
+/// but has an assigned index and may not be allocated by any other user. If the
+/// [`Reservation`] is dropped without being filled, the entry is marked as
+/// available again.
+///
+/// Users must ensure that reserved slots are not filled by other mechanisms, or
+/// otherwise their contents may be dropped and replaced (which will print a warning).
+pub struct Reservation<'a, T: ForeignOwnable>(&'a XArray<T>, usize);
+
+impl<'a, T: ForeignOwnable> Reservation<'a, T> {
+    /// Stores a value into the reserved slot.
+    pub fn store(self, value: T) -> Result<usize> {
+        if self.0.replace(self.1, value)?.is_some() {
+            crate::pr_err!("XArray: Reservation stored but the entry already had data!\n");
+            // Consider it a success anyway, not much we can do.
+        }
+        let index = self.1;
+        // The reservation is now fulfilled, so do not run our destructor.
+        core::mem::forget(self);
+        Ok(index)
+    }
+
+    /// Returns the index of this reservation.
+    pub fn index(&self) -> usize {
+        self.1
+    }
+}
+
+impl<'a, T: ForeignOwnable> Drop for Reservation<'a, T> {
+    fn drop(&mut self) {
+        if self.0.remove(self.1).is_some() {
+            crate::pr_err!("XArray: Reservation dropped but the entry was not empty!\n");
+        }
+    }
+}
+
+/// # Examples
+///
+/// ```rust
+/// use kernel::xarray::{XArray, flags};
+/// use kernel::sync::Arc;
+///
+/// struct Foo {
+///     a: u32,
+///     b: u32,
+/// }
+///
+/// let arr = Box::pin_init(XArray::<Arc<Foo>>::new(flags::ALLOC1))?;
+///
+/// let item = Arc::try_new(Foo { a : 1, b: 2 })?;
+/// let index = arr.alloc(item)?;
+///
+/// if let Some(guard) = arr.get_locked(index) {
+///     assert_eq!(guard.borrow().a, 1);
+///     assert_eq!(guard.borrow().b, 2);
+/// } else {
+///     pr_info!("No value found in index {}", index);
+/// }
+///
+/// let item = Arc::try_new(Foo { a : 3, b: 4 })?;
+/// let index = arr.alloc(item)?;
+///
+/// if let Some(removed_data) = arr.remove(index) {
+///     assert_eq!(removed_data.a, 3);
+///     assert_eq!(removed_data.b, 4);
+/// } else {
+///     pr_info!("No value found in index {}", index);
+/// }
+/// # Ok::<(), Error>(())
+/// ```
+impl<T: ForeignOwnable> XArray<T> {
+    /// Creates a new [`XArray`] with the given flags.
+    pub fn new(flags: Flags) -> impl PinInit<Self> {
+        pin_init!(Self {
+            // SAFETY: `xa` is valid while the closure is called.
+            xa <- Opaque::ffi_init(|xa| unsafe {
+                bindings::xa_init_flags(xa, flags)
+            }),
+            _p: PhantomData,
+        })
+    }
+
+    /// Converts [`usize`] to `unsigned long`.
+    fn to_index(i: usize) -> c_ulong {
+        // The type is `unsigned long`, which is always the same as `usize` in
+        // the kernel.
+        build_assert!(mem::size_of::<usize>() == mem::size_of::<c_ulong>());
+        i as c_ulong
+    }
+
+    /// Replaces an entry with a new value, returning the old value (if any).
+    pub fn replace(&self, index: usize, value: T) -> Result<Option<T>> {
+        let new = value.into_foreign();
+
+        build_assert!(T::FOREIGN_ALIGN >= 4);
+
+        // SAFETY: `new` just came from [`into_foreign()`], and we dismiss this guard
+        // if the `xa_store` operation succeeds and takes ownership of the pointer.
+        let guard = ScopeGuard::new(|| unsafe {
+            drop(T::from_foreign(new));
+        });
+
+        // SAFETY: `self.xa` is always valid by the type invariant, and we are
+        // storing a [`T::into_foreign()`] result which upholds the later invariants.
+        let old = unsafe {
+            bindings::xa_store(
+                self.xa.get(),
+                Self::to_index(index),
+                new.cast_mut(),
+                bindings::GFP_KERNEL,
+            )
+        };
+
+        // SAFETY: `xa_store` returns the old entry at this index on success or
+        // a [`XArray`] result, which can be turn into an errno through `xa_err`.
+        to_result(unsafe { bindings::xa_err(old) })?;
+        guard.dismiss();
+
+        Ok(if old.is_null() {
+            None
+        } else {
+            // SAFETY: The old value must have been stored by either this function or
+            // `insert_between`, both of which ensure non-NULL entries are valid
+            // [`ForeignOwnable`] pointers.
+            Some(unsafe { T::from_foreign(old) })
+        })
+    }
+
+    /// Replaces an entry with a new value, dropping the old value (if any).
+    pub fn set(&self, index: usize, value: T) -> Result {
+        self.replace(index, value)?;
+        Ok(())
+    }
+
+    /// Looks up and returns a reference to an entry in the array, returning a
+    /// [`Guard`] if it exists.
+    ///
+    /// This guard blocks all other actions on the [`XArray`]. Callers are expected
+    /// to drop the [`Guard`] eagerly to avoid blocking other users, such as by
+    /// taking a clone of the value.
+    pub fn get_locked(&self, index: usize) -> Option<Guard<'_, T>> {
+        // SAFETY: `self.xa` is always valid by the type invariant.
+        unsafe { bindings::xa_lock(self.xa.get()) };
+
+        // SAFETY: `self.xa` is always valid by the type invariant.
+        let guard = ScopeGuard::new(|| unsafe { bindings::xa_unlock(self.xa.get()) });
+
+        // SAFETY: `self.xa` is always valid by the type invariant.
+        //
+        // We currently hold the `xa_lock`, which is allowed by `xa_load`. The
+        // returned pointer `p` is only valid until we release the lock, which
+        // is okay here, since the returned [`Guard`] ensures that the pointer can
+        // only be used while the lock is still held.
+        let p = unsafe { bindings::xa_load(self.xa.get(), Self::to_index(index)) };
+
+        let p = NonNull::new(p)?;
+        guard.dismiss();
+
+        // INVARIANT: We just dismissed the `guard`, so we can pass ownership of
+        // the lock to the returned [`Guard`].
+        Some(Guard(p, self))
+    }
+
+    /// Removes and returns an entry, returning it if it existed.
+    pub fn remove(&self, index: usize) -> Option<T> {
+        // SAFETY: `self.xa` is always valid by the type invariant.
+        let p = unsafe { bindings::xa_erase(self.xa.get(), Self::to_index(index)) };
+        if p.is_null() {
+            None
+        } else {
+            // SAFETY: By the type invariant of `Self`, all pointers stored in
+            // the array are pointers obtained by calling [`T::into_foreign`].
+            Some(unsafe { T::from_foreign(p) })
+        }
+    }
+
+    /// Allocates a new index in the array, optionally storing a new value into
+    /// it, with configurable bounds for the index range to allocate from.
+    ///
+    /// If `value` is [`None`], then the index is reserved from further allocation
+    /// but remains free for storing a value into it.
+    fn insert_between(&self, value: Option<T>, min: u32, max: u32) -> Result<usize> {
+        let new = value.map_or(core::ptr::null(), |a| a.into_foreign());
+        let mut id: u32 = 0;
+
+        let guard = ScopeGuard::new(|| {
+            if !new.is_null() {
+                // SAFETY: If `new` is not NULL, it came from the [`ForeignOwnable`]
+                // we got from the caller.
+                unsafe { drop(T::from_foreign(new)) };
+            }
+        });
+
+        // SAFETY: `self.xa` is always valid by the type invariant.
+        //
+        // If this succeeds, it takes ownership of the passed `T` (if any).
+        // If it fails, we must drop the `T` again.
+        let ret = unsafe {
+            bindings::xa_alloc(
+                self.xa.get(),
+                &mut id,
+                new.cast_mut(),
+                bindings::xa_limit { min, max },
+                bindings::GFP_KERNEL,
+            )
+        };
+
+        if ret < 0 {
+            Err(Error::from_errno(ret))
+        } else {
+            guard.dismiss();
+            Ok(id as usize)
+        }
+    }
+
+    /// Allocates a new index in the array, storing a new value into it, with
+    /// configurable bounds for the index range to allocate from.
+    pub fn alloc_limits(&self, value: T, min: u32, max: u32) -> Result<usize> {
+        self.insert_between(Some(value), min, max)
+    }
+
+    /// Allocates a new index in the array, storing a new value into it.
+    pub fn alloc(&self, value: T) -> Result<usize> {
+        self.alloc_limits(value, 0, u32::MAX)
+    }
+
+    /// Reserves a new index in the array within configurable bounds for the index.
+    ///
+    /// Returns a [`Reservation`] object, which can then be used to store a value
+    /// at this index or otherwise free it for reuse.
+    pub fn reserve_limits(&self, min: u32, max: u32) -> Result<Reservation<'_, T>> {
+        Ok(Reservation(
+            self,
+            self.insert_between(None, min, max)?,
+        ))
+    }
+
+    /// Reserves a new index in the array.
+    ///
+    /// Returns a `Reservation` object, which can then be used to store a value
+    /// at this index or otherwise free it for reuse.
+    pub fn reserve(&self) -> Result<Reservation<'_, T>> {
+        Ok(Reservation(
+            self,
+            self.insert_between(None, 0, u32::MAX)?,
+        ))
+    }
+}
+
+#[pinned_drop]
+impl<T: ForeignOwnable> PinnedDrop for XArray<T> {
+    fn drop(self: Pin<&mut Self>) {
+        let mut index:c_ulong = 0;
+        let mut entry;
+
+        // SAFETY: `self.xa` is valid by the type invariant.
+        unsafe {
+            entry = bindings::xa_find(
+                self.xa.get(),
+                &mut index,
+                c_ulong::MAX,
+                bindings::XA_PRESENT,
+            );
+        }
+
+        while !entry.is_null() {
+            // SAFETY: All pointers stored in the array are pointers obtained by
+            // calling [`T::into_foreign`].
+            unsafe { drop(T::from_foreign(entry)) };
+
+            // SAFETY: `self.xa` is valid by the type invariant, and as we have
+            // the only reference to the [`XArray`] we can safely iterate its
+            // contents and drop everything.
+            unsafe {
+                entry = bindings::xa_find_after(
+                    self.xa.get(),
+                    &mut index,
+                    c_ulong::MAX,
+                    bindings::XA_PRESENT,
+                );
+            }
+        }
+
+        // SAFETY: By the type invariants, we have ownership of the [`XArray`],
+        // and we don't use it after this call, so we can destroy it.
+        unsafe { bindings::xa_destroy(self.xa.get()) };
+    }
+}
+
+// SAFETY: It is safe to send `XArray<T>` to another thread when the underlying
+// `T` is `Send` because XArray is thread-safe and all mutation operations are
+// internally locked. `T` must be `ForeignOwnable` because all pointers stored
+// in the array are pointers obtained by calling `T::into_foreign` or are NULL
+// pointers.
+unsafe impl<T: Send + ForeignOwnable> Send for XArray<T> {}
+//
+// SAFETY: It is safe to send `&XArray<T>` to another thread when the underlying
+// `T` is `Sync` because it effectively means sharing `&T` (which is safe because
+// `T` is `Sync`). Additionally, `T` is `Send` because XArray is thread-safe and
+// all mutation operations are internally locked. `T` must be `ForeignOwnable`
+// because all pointers stored in the array are pointers obtained by calling
+// `T::into_foreign` or are NULL pointers.
+unsafe impl<T: Send + Sync + ForeignOwnable> Sync for XArray<T> {}