
[3/3] rust: block: convert `block::mq` to use `Refcount`

Message ID 20241004155247.2210469-4-gary@garyguo.net

Commit Message

Gary Guo Oct. 4, 2024, 3:52 p.m. UTC
Currently there is a custom reference counting implementation in
`block::mq`, which uses the Rust `AtomicU64` atomic. This type doesn't
exist on some 32-bit architectures. We cannot simply change it to use
32-bit atomics, because doing so would make it vulnerable to refcount
overflow. So switch it to the kernel refcount type,
`kernel::sync::Refcount`, instead.

One operation needed by `block::mq`, atomically decreasing the
refcount from 2 to 0, is not available through refcount.h, so expose
`Refcount::as_atomic`, which allows accessing the underlying atomic
counter directly.

Cc: Will Deacon <will@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Gary Guo <gary@garyguo.net>
---
 rust/kernel/block/mq/operations.rs |  7 +--
 rust/kernel/block/mq/request.rs    | 70 ++++++++++--------------------
 rust/kernel/sync/refcount.rs       | 12 +++++
 3 files changed, 38 insertions(+), 51 deletions(-)
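
As a standalone sketch of that 2 -> 0 operation (illustrative only;
`try_hand_back` is a hypothetical free function, while the real code is
the `try_set_end` hunk in request.rs below):

use std::sync::atomic::{AtomicI32, Ordering};

/// Model of the atomic 2 -> 0 handback that refcount.h cannot express;
/// the in-tree code reaches the counter through `Refcount::as_atomic`.
fn try_hand_back(refcount: &AtomicI32) -> bool {
    // Succeed only when exactly two references remain (the caller's
    // `ARef` plus the "in flight" reference). A racing `tag_to_rq`
    // increment makes the compare-exchange fail, and ownership stays
    // with the Rust side.
    refcount
        .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed)
        .is_ok()
}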

Comments

Dirk Behme Oct. 4, 2024, 4:43 p.m. UTC | #1
On 04.10.24 at 17:52, Gary Guo wrote:
> Currently there is a custom reference counting implementation in
> `block::mq`, which uses the Rust `AtomicU64` atomic. This type doesn't
> exist on some 32-bit architectures. We cannot simply change it to use
> 32-bit atomics, because doing so would make it vulnerable to refcount
> overflow. So switch it to the kernel refcount type,
> `kernel::sync::Refcount`, instead.
> 
> One operation needed by `block::mq`, atomically decreasing the
> refcount from 2 to 0, is not available through refcount.h, so expose
> `Refcount::as_atomic`, which allows accessing the underlying atomic
> counter directly.
> 
> Cc: Will Deacon <will@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Gary Guo <gary@garyguo.net>
> ---
...
> diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
> index a0e22827f3f4..7b63c02bdce7 100644
> --- a/rust/kernel/block/mq/request.rs
> +++ b/rust/kernel/block/mq/request.rs
....
> @@ -37,6 +38,9 @@
>   /// We need to track C and D to ensure that it is safe to end the request and hand
>   /// back ownership to the block layer.
>   ///
> +/// Note that driver can still obtain new `ARef` even there is no `ARef`s in existence by using

Would you like to check this sentence? Maybe:

Note that "a" driver can still obtain "a" new `ARef` even "if" there 
"are" no `ARef`s in existence by using ....

?

Best regards

Dirk
Andreas Hindborg Oct. 4, 2024, 6:05 p.m. UTC | #2
Hi Gary,

"Gary Guo" <gary@garyguo.net> writes:

> Currently there is a custom reference counting implementation in
> `block::mq`, which uses the Rust `AtomicU64` atomic. This type doesn't
> exist on some 32-bit architectures. We cannot simply change it to use
> 32-bit atomics, because doing so would make it vulnerable to refcount
> overflow. So switch it to the kernel refcount type,
> `kernel::sync::Refcount`, instead.
>
> One operation needed by `block::mq`, atomically decreasing the
> refcount from 2 to 0, is not available through refcount.h, so expose
> `Refcount::as_atomic`, which allows accessing the underlying atomic
> counter directly.

I would rather hold this patch until the helper LTO patches land
upstream, or at least let me run the benchmarks to see the effect of not
inlining these refcount operations.

Best regards,
Andreas
Andreas Hindborg Oct. 4, 2024, 6:34 p.m. UTC | #3
Hi Gary,

"Gary Guo" <gary@garyguo.net> writes:

[...]

>  // SAFETY: All instances of `Request<T>` are reference counted. This
>  // implementation of `AlwaysRefCounted` ensure that increments to the ref count
>  // keeps the object alive in memory at least until a matching reference count
>  // decrement is executed.
>  unsafe impl<T: Operations> AlwaysRefCounted for Request<T> {
>      fn inc_ref(&self) {
> -        let refcount = &self.wrapper_ref().refcount();
> -
> -        #[cfg_attr(not(CONFIG_DEBUG_MISC), allow(unused_variables))]
> -        let updated = atomic_relaxed_op_unless(refcount, |x| x + 1, 0);
> -
> -        #[cfg(CONFIG_DEBUG_MISC)]
> -        if !updated {
> -            panic!("Request refcount zero on clone")
> -        }
> +        self.wrapper_ref().refcount().inc();

What happened to the debug code?


BR Andreas
Gary Guo Oct. 4, 2024, 6:43 p.m. UTC | #4
On Fri, 04 Oct 2024 20:34:01 +0200
Andreas Hindborg <a.hindborg@kernel.org> wrote:

> Hi Gary,
> 
> "Gary Guo" <gary@garyguo.net> writes:
> 
> [...]
> 
> >  // SAFETY: All instances of `Request<T>` are reference counted. This
> >  // implementation of `AlwaysRefCounted` ensure that increments to the ref count
> >  // keeps the object alive in memory at least until a matching reference count
> >  // decrement is executed.
> >  unsafe impl<T: Operations> AlwaysRefCounted for Request<T> {
> >      fn inc_ref(&self) {
> > -        let refcount = &self.wrapper_ref().refcount();
> > -
> > -        #[cfg_attr(not(CONFIG_DEBUG_MISC), allow(unused_variables))]
> > -        let updated = atomic_relaxed_op_unless(refcount, |x| x + 1, 0);
> > -
> > -        #[cfg(CONFIG_DEBUG_MISC)]
> > -        if !updated {
> > -            panic!("Request refcount zero on clone")
> > -        }
> > +        self.wrapper_ref().refcount().inc();  
> 
> What happened to the debug code?
> 
> 
> BR Andreas
> 

`refcount_inc` will WARN and saturate the refcount when trying to
increment from zero. This is true regardless of config.

I've already captured this in the `Refcount::inc` documentation.
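
Roughly, as a userspace model of that behaviour (hypothetical code, not
the refcount.h implementation; in the real header REFCOUNT_SATURATED is
INT_MIN / 2):

use std::sync::atomic::{AtomicI32, Ordering};

/// Model of `refcount_inc` when asked to increment from zero: warn and
/// saturate, instead of resurrecting a possibly-freed object.
fn refcount_inc_model(r: &AtomicI32) {
    if r.fetch_add(1, Ordering::Relaxed) == 0 {
        // Pin the counter at the saturation value so the object is
        // leaked rather than freed a second time.
        r.store(i32::MIN / 2, Ordering::Relaxed); // REFCOUNT_SATURATED
        eprintln!("refcount_t: addition on 0; use-after-free.");
    }
}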

Best,
Gary
Andreas Hindborg Oct. 4, 2024, 7:18 p.m. UTC | #5
"Gary Guo" <gary@garyguo.net> writes:

> On Fri, 04 Oct 2024 20:34:01 +0200
> Andreas Hindborg <a.hindborg@kernel.org> wrote:
>
>> Hi Gary,
>>
>> "Gary Guo" <gary@garyguo.net> writes:
>>
>> [...]
>>
>> >  // SAFETY: All instances of `Request<T>` are reference counted. This
>> >  // implementation of `AlwaysRefCounted` ensure that increments to the ref count
>> >  // keeps the object alive in memory at least until a matching reference count
>> >  // decrement is executed.
>> >  unsafe impl<T: Operations> AlwaysRefCounted for Request<T> {
>> >      fn inc_ref(&self) {
>> > -        let refcount = &self.wrapper_ref().refcount();
>> > -
>> > -        #[cfg_attr(not(CONFIG_DEBUG_MISC), allow(unused_variables))]
>> > -        let updated = atomic_relaxed_op_unless(refcount, |x| x + 1, 0);
>> > -
>> > -        #[cfg(CONFIG_DEBUG_MISC)]
>> > -        if !updated {
>> > -            panic!("Request refcount zero on clone")
>> > -        }
>> > +        self.wrapper_ref().refcount().inc();
>>
>> What happened to the debug code?
>>
>>
>> BR Andreas
>>
>
> `refcount_inc` will WARN and saturate the refcount when trying to
> increment from zero. This is true regardless of config.
>
> I've already captured this in the `Refcount::inc` documentation.

I did not know. Thanks!

BR Andreas

Patch

diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index 9ba7fdfeb4b2..36ee5f96c66d 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -9,9 +9,10 @@ 
     block::mq::request::RequestDataWrapper,
     block::mq::Request,
     error::{from_result, Result},
+    sync::Refcount,
     types::ARef,
 };
-use core::{marker::PhantomData, sync::atomic::AtomicU64, sync::atomic::Ordering};
+use core::marker::PhantomData;
 
 /// Implement this trait to interface blk-mq as block devices.
 ///
@@ -77,7 +78,7 @@  impl<T: Operations> OperationsVTable<T> {
         let request = unsafe { &*(*bd).rq.cast::<Request<T>>() };
 
         // One refcount for the ARef, one for being in flight
-        request.wrapper_ref().refcount().store(2, Ordering::Relaxed);
+        request.wrapper_ref().refcount().set(2);
 
         // SAFETY:
         //  - We own a refcount that we took above. We pass that to `ARef`.
@@ -186,7 +187,7 @@  impl<T: Operations> OperationsVTable<T> {
 
             // SAFETY: The refcount field is allocated but not initialized, so
             // it is valid for writes.
-            unsafe { RequestDataWrapper::refcount_ptr(pdu.as_ptr()).write(AtomicU64::new(0)) };
+            unsafe { RequestDataWrapper::refcount_ptr(pdu.as_ptr()).write(Refcount::new(0)) };
 
             Ok(0)
         })
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index a0e22827f3f4..7b63c02bdce7 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -8,12 +8,13 @@ 
     bindings,
     block::mq::Operations,
     error::Result,
+    sync::Refcount,
     types::{ARef, AlwaysRefCounted, Opaque},
 };
 use core::{
     marker::PhantomData,
     ptr::{addr_of_mut, NonNull},
-    sync::atomic::{AtomicU64, Ordering},
+    sync::atomic::Ordering,
 };
 
 /// A wrapper around a blk-mq `struct request`. This represents an IO request.
@@ -37,6 +38,9 @@ 
 /// We need to track C and D to ensure that it is safe to end the request and hand
 /// back ownership to the block layer.
 ///
+/// Note that a driver can still obtain a new `ARef`, even if no `ARef`s are in existence, by
+/// using `tag_to_rq`, hence the need to distinguish between B and C.
+///
 /// The states are tracked through the private `refcount` field of
 /// `RequestDataWrapper`. This structure lives in the private data area of the C
 /// `struct request`.
@@ -91,13 +95,17 @@  pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) {
     /// C `struct request`. If the operation fails, `this` is returned in the
     /// `Err` variant.
     fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> {
-        // We can race with `TagSet::tag_to_rq`
-        if let Err(_old) = this.wrapper_ref().refcount().compare_exchange(
-            2,
-            0,
-            Ordering::Relaxed,
-            Ordering::Relaxed,
-        ) {
+        // To hand back ownership, we need the current refcount to be 2. Since we
+        // can race with `TagSet::tag_to_rq`, this needs to atomically reduce the
+        // refcount to 0. `Refcount` does not provide a way to do this, so use the
+        // underlying atomics directly.
+        if this
+            .wrapper_ref()
+            .refcount()
+            .as_atomic()
+            .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed)
+            .is_err()
+        {
             return Err(this);
         }
 
@@ -159,13 +167,13 @@  pub(crate) struct RequestDataWrapper {
     /// - 0: The request is owned by C block layer.
     /// - 1: The request is owned by Rust abstractions but there are no ARef references to it.
     /// - 2+: There are `ARef` references to the request.
-    refcount: AtomicU64,
+    refcount: Refcount,
 }
 
 impl RequestDataWrapper {
     /// Return a reference to the refcount of the request that is embedding
     /// `self`.
-    pub(crate) fn refcount(&self) -> &AtomicU64 {
+    pub(crate) fn refcount(&self) -> &Refcount {
         &self.refcount
     }
 
@@ -175,7 +183,7 @@  pub(crate) fn refcount(&self) -> &AtomicU64 {
     /// # Safety
     ///
     /// - `this` must point to a live allocation of at least the size of `Self`.
-    pub(crate) unsafe fn refcount_ptr(this: *mut Self) -> *mut AtomicU64 {
+    pub(crate) unsafe fn refcount_ptr(this: *mut Self) -> *mut Refcount {
         // SAFETY: Because of the safety requirements of this function, the
         // field projection is safe.
         unsafe { addr_of_mut!((*this).refcount) }
@@ -191,47 +199,13 @@  unsafe impl<T: Operations> Send for Request<T> {}
 // mutate `self` are internally synchronized`
 unsafe impl<T: Operations> Sync for Request<T> {}
 
-/// Store the result of `op(target.load())` in target, returning new value of
-/// target.
-fn atomic_relaxed_op_return(target: &AtomicU64, op: impl Fn(u64) -> u64) -> u64 {
-    let old = target.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |x| Some(op(x)));
-
-    // SAFETY: Because the operation passed to `fetch_update` above always
-    // return `Some`, `old` will always be `Ok`.
-    let old = unsafe { old.unwrap_unchecked() };
-
-    op(old)
-}
-
-/// Store the result of `op(target.load)` in `target` if `target.load() !=
-/// pred`, returning true if the target was updated.
-fn atomic_relaxed_op_unless(target: &AtomicU64, op: impl Fn(u64) -> u64, pred: u64) -> bool {
-    target
-        .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |x| {
-            if x == pred {
-                None
-            } else {
-                Some(op(x))
-            }
-        })
-        .is_ok()
-}
-
 // SAFETY: All instances of `Request<T>` are reference counted. This
 // implementation of `AlwaysRefCounted` ensure that increments to the ref count
 // keeps the object alive in memory at least until a matching reference count
 // decrement is executed.
 unsafe impl<T: Operations> AlwaysRefCounted for Request<T> {
     fn inc_ref(&self) {
-        let refcount = &self.wrapper_ref().refcount();
-
-        #[cfg_attr(not(CONFIG_DEBUG_MISC), allow(unused_variables))]
-        let updated = atomic_relaxed_op_unless(refcount, |x| x + 1, 0);
-
-        #[cfg(CONFIG_DEBUG_MISC)]
-        if !updated {
-            panic!("Request refcount zero on clone")
-        }
+        self.wrapper_ref().refcount().inc();
     }
 
     unsafe fn dec_ref(obj: core::ptr::NonNull<Self>) {
@@ -243,10 +217,10 @@  unsafe fn dec_ref(obj: core::ptr::NonNull<Self>) {
         let refcount = unsafe { &*RequestDataWrapper::refcount_ptr(wrapper_ptr) };
 
         #[cfg_attr(not(CONFIG_DEBUG_MISC), allow(unused_variables))]
-        let new_refcount = atomic_relaxed_op_return(refcount, |x| x - 1);
+        let is_zero = refcount.dec_and_test();
 
         #[cfg(CONFIG_DEBUG_MISC)]
-        if new_refcount == 0 {
+        if is_zero {
             panic!("Request reached refcount zero in Rust abstractions");
         }
     }
diff --git a/rust/kernel/sync/refcount.rs b/rust/kernel/sync/refcount.rs
index 4a5b815adc05..2cdcb20e8ee2 100644
--- a/rust/kernel/sync/refcount.rs
+++ b/rust/kernel/sync/refcount.rs
@@ -31,6 +31,18 @@  fn as_ptr(&self) -> *mut bindings::refcount_t {
         self.0.get()
     }
 
+    /// Get the underlying atomic counter that backs the refcount.
+    ///
+    /// NOTE: This will be changed to an LKMM atomic in the future.
+    #[inline]
+    pub fn as_atomic(&self) -> &AtomicI32 {
+        let ptr = self.0.get() as *const AtomicI32;
+    // SAFETY: `refcount_t` is a transparent wrapper of `atomic_t`, which is an atomic 32-bit
+    // integer that is layout-wise compatible with `AtomicI32`. All values are valid for
+    // `refcount_t`, even though some of them are considered saturated and "bad".
+        unsafe { &*ptr }
+    }
+
     /// Get a refcount's value.
     #[inline]
     pub fn read(&self) -> i32 {
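
For reference, the per-request refcount lifecycle after this patch can
be condensed into the following standalone model (illustrative only;
the method names mirror the hunks above, but this is not additional
patch content):

use std::sync::atomic::{AtomicI32, Ordering};

/// Illustrative model of the `RequestDataWrapper` refcount lifecycle.
struct ModelRefcount(AtomicI32);

impl ModelRefcount {
    /// `refcount().set(2)` when a request starts: one count for the
    /// returned `ARef`, one for the request being in flight.
    fn start(&self) {
        self.0.store(2, Ordering::Relaxed);
    }

    /// `Refcount::inc` in `inc_ref`; the real call WARNs and saturates
    /// on an increment from zero (see the thread above).
    fn inc_ref(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }

    /// `Refcount::dec_and_test` in `dec_ref`; returning `true` (the
    /// count reached zero) is a bug, because the final 2 -> 0 step
    /// belongs to `try_set_end`.
    fn dec_ref(&self) -> bool {
        self.0.fetch_sub(1, Ordering::Release) == 1
    }

    /// The `as_atomic` compare-exchange in `try_set_end`, handing
    /// ownership back to the C block layer.
    fn try_set_end(&self) -> bool {
        self.0
            .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed)
            .is_ok()
    }
}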