diff mbox series

[RFC] rust: add bindings for interrupt sources + interior mutability discussion

Message ID 20241104085159.76841-1-pbonzini@redhat.com (mailing list archive)
State New
Headers show
Series [RFC] rust: add bindings for interrupt sources + interior mutability discussion | expand

Commit Message

Paolo Bonzini Nov. 4, 2024, 8:51 a.m. UTC
The InterruptSource bindings let us call qemu_set_irq() and
sysbus_init_irq() as safe code.  Apart from this, they are a good starting
point to think more about Rust shared/exclusive reference requirements
and what this means for QEMU.

Interrupt sources, qemu_irq in C code, are pointers to IRQState objects.
They are QOM link properties and can be written to outside the control
of the device (i.e. from a shared reference); therefore their Rust
representation must be an interior-mutable field.

But this actually applies to _all_ of the device struct!  Once a pointer
to the device has been handed out (for example via memory_region_init_io
or qdev_init_clock_in), accesses to the device struct must not use
&mut anymore.  It does not matter if C code handed you a *mut, such as
in a MemoryRegion or CharBackend callback: Rust disallows altogether
creating mutable references to data that has an active shared reference.
Instead, individual regions of the device must use cell types to make
_parts_ of the device structs mutable.

Back to interrupt sources, for now this patch uses a Cell, but this is
only an approximation of what is actually going on; a Cell can only
live within one thread, while here the semantics are "accessible by
multiple threads but only under the Big QEMU Lock".  It seems to me that
the way to go is for QEMU to provide its own "cell" types that check
locking rules with respect to the "Big QEMU Lock" or to ``AioContext``s.
For example, qemu_api::cell::Cell, which is for Copy types like
std::cell:Cell, would only allow get()/set() under BQL protection and
therefore could be Send/Sync.  Likewise, qemu_api::cell::RefCell would be
a RefCell that is Send/Sync, because it checks that borrow()/borrow_mut()
is only done under BQL.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
	Do people think this is the right way forward for interior
	mutability?  Should these bindings be committed already with
	the FIXME comment, or should qemu_api::cell:Cell be written and
	committed first?  Should the implementation and use be split
	in separate patches or is the diff small enough?

 rust/hw/char/pl011/src/device.rs | 22 +++++++-----
 rust/qemu-api/meson.build        |  2 ++
 rust/qemu-api/src/irq.rs         | 61 ++++++++++++++++++++++++++++++++
 rust/qemu-api/src/lib.rs         |  2 ++
 rust/qemu-api/src/sysbus.rs      | 26 ++++++++++++++
 5 files changed, 104 insertions(+), 9 deletions(-)
 create mode 100644 rust/qemu-api/src/irq.rs
 create mode 100644 rust/qemu-api/src/sysbus.rs

Comments

Junjie Mao Nov. 8, 2024, 6:21 a.m. UTC | #1
Paolo Bonzini <pbonzini@redhat.com> writes:

> The InterruptSource bindings let us call qemu_set_irq() and
> sysbus_init_irq() as safe code.  Apart from this, they are a good starting
> point to think more about Rust shared/exclusive reference requirements
> and what this means for QEMU.
>
> Interrupt sources, qemu_irq in C code, are pointers to IRQState objects.
> They are QOM link properties and can be written to outside the control
> of the device (i.e. from a shared reference); therefore their Rust
> representation must be an interior-mutable field.

I like the idea of modeling device struct fields as interior-mutable.
That reflects how such fields are referenced and can help forbid invalid
accesses early.

To showcase the benefit of modeling interior-mutable fields, a more
complex device example (probably involving coroutines or other async
mechanisms) may be needed.

>
> But this actually applies to _all_ of the device struct! Once a pointer
> to the device has been handed out (for example via memory_region_init_io
> or qdev_init_clock_in), accesses to the device struct must not use
> &mut anymore.  It does not matter if C code handed you a *mut, such as
> in a MemoryRegion or CharBackend callback: Rust disallows altogether
> creating mutable references to data that has an active shared reference.
> Instead, individual regions of the device must use cell types to make
> _parts_ of the device structs mutable.
>
> Back to interrupt sources, for now this patch uses a Cell, but this is
> only an approximation of what is actually going on; a Cell can only
> live within one thread, while here the semantics are "accessible by
> multiple threads but only under the Big QEMU Lock".  It seems to me that
> the way to go is for QEMU to provide its own "cell" types that check
> locking rules with respect to the "Big QEMU Lock" or to ``AioContext``s.
> For example, qemu_api::cell::Cell, which is for Copy types like
> std::cell:Cell, would only allow get()/set() under BQL protection and
> therefore could be Send/Sync.  Likewise, qemu_api::cell::RefCell would be
> a RefCell that is Send/Sync, because it checks that borrow()/borrow_mut()
> is only done under BQL.

To me a container that check locking rules sound more like Lock, not
Cell. Name it as a Cell can be misleading to those being used to Rust
cells.

That said, one of its primary differences from the standard locking
types in Rust is that the lock involved is global instead of being bound
to a specific object. There are two alternative APIs in my mind:

1. get(&self) -> T / set(&self, T) which internally checks if BQL is
held by the current thread, and panics if it is not. This is more
straightforward.

2. get(&self, _: &BqlLockGuard) -> T / set(&self, _: T,
_: &BqlLockGuard) which takes an extra evidence of BQL being
held. BqlLockGuard can only be got as an argument to BQL-protected
callbacks (macro-generated glue code here) or Bql::lock() which calls
bql_lock().

The second approach looks more idiomatic to me and may allow the
compiler to error (via analyzing lifetimes) on derefs of
borrow()/borrow_mut()-returned refs after BQL is unlocked (I need a
double check on this). However, temporarily unlocking BQL is extremely
rare in devices. I'm not sure if that's worthwhile at the cost of making
the APIs more tedious to use.

What do you think?

--
Best Regards
Junjie Mao

>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> 	Do people think this is the right way forward for interior
> 	mutability?  Should these bindings be committed already with
> 	the FIXME comment, or should qemu_api::cell:Cell be written and
> 	committed first?  Should the implementation and use be split
> 	in separate patches or is the diff small enough?
>
>  rust/hw/char/pl011/src/device.rs | 22 +++++++-----
>  rust/qemu-api/meson.build        |  2 ++
>  rust/qemu-api/src/irq.rs         | 61 ++++++++++++++++++++++++++++++++
>  rust/qemu-api/src/lib.rs         |  2 ++
>  rust/qemu-api/src/sysbus.rs      | 26 ++++++++++++++
>  5 files changed, 104 insertions(+), 9 deletions(-)
>  create mode 100644 rust/qemu-api/src/irq.rs
>  create mode 100644 rust/qemu-api/src/sysbus.rs
>
Paolo Bonzini Nov. 11, 2024, 10:09 a.m. UTC | #2
On Fri, Nov 8, 2024 at 10:16 AM Junjie Mao <junjie.mao@hotmail.com> wrote:
> I like the idea of modeling device struct fields as interior-mutable.
> That reflects how such fields are referenced and can help forbid invalid
> accesses early.
>
> To showcase the benefit of modeling interior-mutable fields, a more
> complex device example (probably involving coroutines or other async
> mechanisms) may be needed.

I don't think there's really a benefit, it's more like avoiding a lie.
Even though right now (or in general) it's extremely unlikely that
this causes miscompilation, Rust's expectation is that a &mut is not
aliased except if it is formed from a shared reference via
UnsafeCell<>. Right now that's violated in both realize and the
Chardev callbacks.

> To me a container that check locking rules sound more like Lock, not
> Cell. Name it as a Cell can be misleading to those being used to Rust
> cells.

Right, it still performs the function and has the API of a Cell but
I've settled on BqlCell and BqlRefCell for clarity.

Especially if you have multiple imports of std::cell::Cell and QEMU's
own BQL-based (lowercase) cell, you don't want them to have the same
name.

> That said, one of its primary differences from the standard locking
> types in Rust is that the lock involved is global instead of being bound
> to a specific object. There are two alternative APIs in my mind:
>
> 1. get(&self) -> T / set(&self, T) which internally checks if BQL is
> held by the current thread, and panics if it is not. This is more
> straightforward.
>
> 2. get(&self, _: &BqlLockGuard) -> T / set(&self, _: T,
> _: &BqlLockGuard) which takes an extra evidence of BQL being
> held. BqlLockGuard can only be got as an argument to BQL-protected
> callbacks (macro-generated glue code here) or Bql::lock() which calls
> bql_lock().
>
> The second approach looks more idiomatic to me and may allow the
> compiler to error (via analyzing lifetimes) on derefs of
> borrow()/borrow_mut()-returned refs after BQL is unlocked (I need a
> double check on this). However, temporarily unlocking BQL is extremely
> rare in devices. I'm not sure if that's worthwhile at the cost of making
> the APIs more tedious to use.

Yes, this would be better in terms of compile-time safety, but on the
other hand the guard would propagate to all APIs that require the BQL.
The nice thing about the cell approach is that, as far as most devices
are concerned, QEMU has a single program flow (though not a single
thread) with very well defined preemption points but with an abundance
of reentrancy. If your callbacks get a BQL guard from the outside,
it's true that checks move from run-time to compile-time, on the other
hand you still need a RefCell-like mechanism to protect from
unexpected reentrancy. I'd rather avoid types like Bql<RefCell<T>>,
they are verbose in the definitions and the usage because they require
both a guard and explicit borrowing.

However, there are two points that are related to your observation:

1) when BqlRefCell is added, the C code needs to abort if the BQL is
released while a BqlRefCell borrow (either shared or mutable) is
alive. That's because you can never be sure that C code won't come in
and mutate the data that is protected by BqlRefCell.

2) We probably want a custom guard object sooner or later, maybe for
the block layer to assert that you're in the right AioContext, but for
now the risk of lock-related runtime failures of BqlRefCell seems
small to me (especially compared to already-borrowed panics due to
Rust->C->Rust reentrancy).


There is another issue that is solved by custom cell types compared to
RefCell, by the way, and it's that you cannot do offset_of! of
something within a RefCell. However we can make the BqlRefCell start
with the inner type:

#[repr(C)]
pub struct BqlRefCell<T> {
    inner: T,
    ...
}

and then use offset_of!(DeviceStruct, cell) + offset_of!(DeviceRegs,
reg_name). This however is very much left for later, as there are more
pressing issues and I think no one has put much thought into VMState
yet.

Thanks for sharing. I'll post a version of the BqlCell sometime within
the next week or two.

Paolo
diff mbox series

Patch

diff --git a/rust/hw/char/pl011/src/device.rs b/rust/hw/char/pl011/src/device.rs
index c05c2972716..bd6dc513bf6 100644
--- a/rust/hw/char/pl011/src/device.rs
+++ b/rust/hw/char/pl011/src/device.rs
@@ -13,6 +13,7 @@ 
     c_str,
     definitions::ObjectImpl,
     device_class::TYPE_SYS_BUS_DEVICE,
+    irq::InterruptSource,
 };
 
 use crate::{
@@ -94,7 +95,7 @@  pub struct PL011State {
     ///  * sysbus IRQ 5: `UARTEINTR` (error interrupt line)
     /// ```
     #[doc(alias = "irq")]
-    pub interrupts: [qemu_irq; 6usize],
+    pub interrupts: [InterruptSource; IRQMASK.len()],
     #[doc(alias = "clk")]
     pub clock: NonNull<Clock>,
     #[doc(alias = "migrate_clk")]
@@ -139,7 +140,8 @@  impl PL011State {
     unsafe fn init(&mut self) {
         const CLK_NAME: &CStr = c_str!("clk");
 
-        let dev = addr_of_mut!(*self).cast::<DeviceState>();
+        let sbd = unsafe { &mut *(addr_of_mut!(*self).cast::<SysBusDevice>()) };
+
         // SAFETY:
         //
         // self and self.iomem are guaranteed to be valid at this point since callers
@@ -153,12 +155,15 @@  unsafe fn init(&mut self) {
                 Self::TYPE_INFO.name,
                 0x1000,
             );
-            let sbd = addr_of_mut!(*self).cast::<SysBusDevice>();
             sysbus_init_mmio(sbd, addr_of_mut!(self.iomem));
-            for irq in self.interrupts.iter_mut() {
-                sysbus_init_irq(sbd, irq);
-            }
         }
+
+        for irq in self.interrupts.iter() {
+            sbd.init_irq(irq);
+        }
+
+        let dev = addr_of_mut!(*self).cast::<DeviceState>();
+
         // SAFETY:
         //
         // self.clock is not initialized at this point; but since `NonNull<_>` is Copy,
@@ -498,8 +503,7 @@  pub fn put_fifo(&mut self, value: c_uint) {
     pub fn update(&self) {
         let flags = self.int_level & self.int_enabled;
         for (irq, i) in self.interrupts.iter().zip(IRQMASK) {
-            // SAFETY: self.interrupts have been initialized in init().
-            unsafe { qemu_set_irq(*irq, i32::from(flags & i != 0)); }
+            irq.set(flags & i != 0);
         }
     }
 
@@ -526,7 +530,7 @@  pub fn post_load(&mut self, _version_id: u32) -> Result<(), ()> {
 }
 
 /// Which bits in the interrupt status matter for each outbound IRQ line ?
-pub const IRQMASK: [u32; 6] = [
+const IRQMASK: [u32; 6] = [
     /* combined IRQ */
     Interrupt::E
         | Interrupt::MS
diff --git a/rust/qemu-api/meson.build b/rust/qemu-api/meson.build
index e3870e901e3..a3a3d0101fd 100644
--- a/rust/qemu-api/meson.build
+++ b/rust/qemu-api/meson.build
@@ -13,7 +13,9 @@  _qemu_api_rs = static_library(
       'src/c_str.rs',
       'src/definitions.rs',
       'src/device_class.rs',
+      'src/irq.rs',
       'src/offset_of.rs',
+      'src/sysbus.rs',
       'src/vmstate.rs',
       'src/zeroable.rs',
     ],
diff --git a/rust/qemu-api/src/irq.rs b/rust/qemu-api/src/irq.rs
new file mode 100644
index 00000000000..813330ad802
--- /dev/null
+++ b/rust/qemu-api/src/irq.rs
@@ -0,0 +1,61 @@ 
+// Copyright 2024 Red Hat, Inc.
+// Author(s): Paolo Bonzini <pbonzini@redhat.com>
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+use core::{cell::Cell, ptr};
+
+use crate::bindings::{qemu_set_irq, IRQState};
+
+/// Interrupt sources are used by devices to pass changes to a boolean value to
+/// other devices (typically interrupt or GPIO controllers).  QEMU interrupt
+/// sources are always active-high.
+///
+/// Interrupts are implemented as a pointer to the interrupt "sink", which has
+/// type [`IRQState`].  A device exposes its source as a QOM link property using
+/// a function such as
+/// [`SysBusDevice::init_irq`](crate::sysbus::SysBusDevice::init_irq), and
+/// initially leaves the pointer to a NULL value, representing an unconnected
+/// interrupt. To connect it, whoever creates the device fills the pointer with
+/// the sink's `IRQState *`, for example using `sysbus_connect_irq`.  Because
+/// devices are generally shared objects, interrupt sources are an example of
+/// the interior mutability pattern.
+///
+/// FIXME: Interrupt sources can only be triggered under the Big QEMU Lock, but
+/// they _are_ both `Send` and `Sync`.  `core::cell::Cell` does not capture that.
+#[derive(Debug)]
+pub struct InterruptSource(Cell<*mut IRQState>);
+
+impl InterruptSource {
+    /// Send a low (`false`) value to the interrupt sink.
+    pub fn lower(&self) {
+        self.set(false);
+    }
+
+    /// Send a high-low pulse to the interrupt sink.
+    pub fn pulse(&self) {
+        self.set(true);
+        self.set(false);
+    }
+
+    /// Send a high (`true`) value to the interrupt sink.
+    pub fn raise(&self) {
+        self.set(true);
+    }
+
+    /// Send `level` to the interrupt sink.
+    pub fn set(&self, level: bool) {
+        unsafe {
+            qemu_set_irq(self.0.get(), level as i32);
+        }
+    }
+
+    pub(crate) const fn as_ptr(&self) -> *mut *mut IRQState {
+        self.0.as_ptr()
+    }
+}
+
+impl Default for InterruptSource {
+    fn default() -> Self {
+        InterruptSource(Cell::new(ptr::null_mut()))
+    }
+}
diff --git a/rust/qemu-api/src/lib.rs b/rust/qemu-api/src/lib.rs
index 440aff3817d..fdda5f1d94f 100644
--- a/rust/qemu-api/src/lib.rs
+++ b/rust/qemu-api/src/lib.rs
@@ -10,7 +10,9 @@ 
 pub mod c_str;
 pub mod definitions;
 pub mod device_class;
+pub mod irq;
 pub mod offset_of;
+pub mod sysbus;
 pub mod vmstate;
 pub mod zeroable;
 
diff --git a/rust/qemu-api/src/sysbus.rs b/rust/qemu-api/src/sysbus.rs
new file mode 100644
index 00000000000..1a9b8a1f971
--- /dev/null
+++ b/rust/qemu-api/src/sysbus.rs
@@ -0,0 +1,26 @@ 
+// Copyright 2024 Red Hat, Inc.
+// Author(s): Paolo Bonzini <pbonzini@redhat.com>
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+use std::ptr::addr_of;
+
+pub use bindings::{SysBusDevice, SysBusDeviceClass};
+
+use crate::{bindings, irq::InterruptSource};
+
+impl SysBusDevice {
+    /// Return `self` cast to a mutable pointer, for use in calls to C code.
+    const fn as_mut_ptr(&self) -> *mut SysBusDevice {
+        addr_of!(*self) as *mut _
+    }
+
+    /// Expose an interrupt source outside the device as a qdev GPIO output.
+    /// Note that the ordering of calls to `init_irq` is important, since
+    /// whoever creates the sysbus device will refer to the interrupts with
+    /// a number that corresponds to the order of calls to `init_irq`.
+    pub fn init_irq(&self, irq: &InterruptSource) {
+        unsafe {
+            bindings::sysbus_init_irq(self.as_mut_ptr(), irq.as_ptr());
+        }
+    }
+}