diff mbox series

[RFC,1/9] rust: vmstate: add new type safe implementation

Message ID 20241231002336.25931-2-pbonzini@redhat.com (mailing list archive)
State New, archived
Headers show
Series rust: (mostly) type safe VMState | expand

Commit Message

Paolo Bonzini Dec. 31, 2024, 12:23 a.m. UTC
The existing translation of the C macros for vmstate does not make
any attempt to type-check vmstate declarations against the struct, so
introduce a new system that computes VMStateField based on the actual
struct declaration.

Macros do not have full access to the type system, therefore a full
implementation of this scheme requires a helper trait to analyze the
type and produce a VMStateField from it; a macro "vmstate_of!" accepts
arguments similar to "offset_of!" and tricks the compiler into looking
up the trait for the right type.

The patch introduces not just vmstate_of!, but also the slightly too
clever enabling macro call_func_with_field!.  The particular trick used
here was proposed on the users.rust-lang.org forum, so I take no merit
and all the blame.

Introduce the trait and some functions to access it; the actual
implementation comes later.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 rust/qemu-api/src/prelude.rs |   2 +
 rust/qemu-api/src/vmstate.rs | 110 +++++++++++++++++++++++++++++++++--
 2 files changed, 106 insertions(+), 6 deletions(-)

Comments

Zhao Liu Jan. 7, 2025, 8:58 a.m. UTC | #1
On Tue, Dec 31, 2024 at 01:23:28AM +0100, Paolo Bonzini wrote:
> Date: Tue, 31 Dec 2024 01:23:28 +0100
> From: Paolo Bonzini <pbonzini@redhat.com>
> Subject: [RFC PATCH 1/9] rust: vmstate: add new type safe implementation
> X-Mailer: git-send-email 2.47.1
> 
> The existing translation of the C macros for vmstate does not make
> any attempt to type-check vmstate declarations against the struct, so
> introduce a new system that computes VMStateField based on the actual
> struct declaration.
> 
> Macros do not have full access to the type system, therefore a full
> implementation of this scheme requires a helper trait to analyze the
> type and produce a VMStateField from it; a macro "vmstate_of!" accepts
> arguments similar to "offset_of!" and tricks the compiler into looking
> up the trait for the right type.
> 
> The patch introduces not just vmstate_of!, but also the slightly too
> clever enabling macro call_func_with_field!.  The particular trick used
> here was proposed on the users.rust-lang.org forum, so I take no merit
> and all the blame.

This is very good work! I am curious about how QEMU plays with Rust
forum:

Rust forum's disscussion is under MIT and Apache 2.0 licenses [1], and
since vmstate.rs is under the GPLv2 license, do we need to specify that
certain code retains the MIT license?

[1]: https://users.rust-lang.org/t/tos-updated-to-match-rfcs-and-rust-repository/45650

> Introduce the trait and some functions to access it; the actual
> implementation comes later.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  rust/qemu-api/src/prelude.rs |   2 +
>  rust/qemu-api/src/vmstate.rs | 110 +++++++++++++++++++++++++++++++++--
>  2 files changed, 106 insertions(+), 6 deletions(-)
> 
> diff --git a/rust/qemu-api/src/prelude.rs b/rust/qemu-api/src/prelude.rs
> index 4ea70b9c823..2dc86e19b29 100644
> --- a/rust/qemu-api/src/prelude.rs
> +++ b/rust/qemu-api/src/prelude.rs
> @@ -18,3 +18,5 @@
>  pub use crate::qom_isa;
>  
>  pub use crate::sysbus::SysBusDeviceMethods;
> +
> +pub use crate::vmstate::VMState;
> diff --git a/rust/qemu-api/src/vmstate.rs b/rust/qemu-api/src/vmstate.rs
> index 63c897abcdf..bfcf06e8f1d 100644
> --- a/rust/qemu-api/src/vmstate.rs
> +++ b/rust/qemu-api/src/vmstate.rs
> @@ -4,13 +4,111 @@
>  
>  //! Helper macros to declare migration state for device models.
>  //!
> -//! Some macros are direct equivalents to the C macros declared in
> -//! `include/migration/vmstate.h` while
> -//! [`vmstate_subsections`](crate::vmstate_subsections) and
> -//! [`vmstate_fields`](crate::vmstate_fields) are meant to be used when
> -//! declaring a device model state struct.
> +//! This module includes three families of macros:
> +//!
> +//! * [`vmstate_unused!`](crate::vmstate_unused) and
> +//!   [`vmstate_of!`](crate::vmstate_of), which are used to express the
> +//!   migration format for a struct.  This is based on the [`VMState`] trait,
> +//!   which is defined by all migrateable types.
> +//!
> +//! * helper macros to declare a device model state struct, in particular
> +//!   [`vmstate_subsections`](crate::vmstate_subsections) and
> +//!   [`vmstate_fields`](crate::vmstate_fields).
> +//!
> +//! * direct equivalents to the C macros declared in
> +//!   `include/migration/vmstate.h`. These are not type-safe and should not be
> +//!   used if the equivalent functionality is available with `vmstate_of!`.
>  
> -pub use crate::bindings::VMStateDescription;
> +use core::marker::PhantomData;
> +
> +pub use crate::bindings::{VMStateDescription, VMStateField};
> +
> +/// This macro is used to call a function with a generic argument bound
> +/// to the type of a field.  The function must take a
> +/// [`PhantomData`]`<T>` argument; `T` is the type of
> +/// field `$field` in the `$typ` type.
> +///
> +/// # Examples
> +///
> +/// ```
> +/// # use qemu_api::call_func_with_field;
> +/// # use core::marker::PhantomData;
> +/// const fn size_of_field<T>(_: PhantomData<T>) -> usize {
> +///     std::mem::size_of::<T>()
> +/// }
> +///
> +/// struct Foo {
> +///     x: u16,
> +/// };
> +/// // calls size_of_field::<u16>()
> +/// assert_eq!(call_func_with_field!(size_of_field, Foo, x), 2);
> +/// ```
> +#[macro_export]
> +macro_rules! call_func_with_field {
> +    ($func:expr, $typ:ty, $($field:tt).+) => {
> +        $func(loop {
> +            #![allow(unreachable_code)]
> +            const fn phantom__<T>(_: &T) -> ::core::marker::PhantomData<T> { ::core::marker::PhantomData }
> +            // Unreachable code is exempt from checks on uninitialized values.
> +            // Use that trick to infer the type of this PhantomData.
> +            break ::core::marker::PhantomData;
> +            break phantom__(&{ let value__: $typ; value__.$($field).+ });
> +        })
> +    };
> +}

Very flexible and powerful. (I even think this code could be released as
a new public crate.)

> +/// A trait for types that can be included in a device's migration stream.  It
> +/// provides the base contents of a `VMStateField` (minus the name and offset).
> +///
> +/// # Safety
> +///
> +/// The contents of this trait go straight into structs that are parsed by C
> +/// code and used to introspect into other structs.  Be careful.
> +pub unsafe trait VMState {
> +    /// The base contents of a `VMStateField` (minus the name and offset) for
> +    /// the type that is implementing the trait.
> +    const BASE: VMStateField;
> +}
> +
> +/// Internal utility function to retrieve a type's `VMStateField`;
> +/// used by [`vmstate_of!`](crate::vmstate_of).
> +pub const fn vmstate_base<T: VMState>(_: PhantomData<T>) -> VMStateField {
> +    T::BASE
> +}
> +
> +/// Return the `VMStateField` for a field of a struct.  The field must be
> +/// visible in the current scope.
> +///
> +/// In order to support other types, the trait `VMState` must be implemented
> +/// for them.
> +#[macro_export]
> +macro_rules! vmstate_of {
> +    ($struct_name:ty, $field_name:ident $(,)?) => {

why allow a comma at the end? It seems other patches don't use that
style.

> +        $crate::bindings::VMStateField {
> +            name: ::core::concat!(::core::stringify!($field_name), "\0")
> +                .as_bytes()
> +                .as_ptr() as *const ::std::os::raw::c_char,
> +            offset: $crate::offset_of!($struct_name, $field_name),
> +            // Compute most of the VMStateField from the type of the field.
> +            ..$crate::call_func_with_field!(
> +                $crate::vmstate::vmstate_base,
> +                $struct_name,
> +                $field_name
> +            )
> +        }
> +    };
> +}
> +
> +// Add a couple builder-style methods to VMStateField, allowing
> +// easy derivation of VMStateField constants from other types.
> +impl VMStateField {
> +    #[must_use]
> +    pub const fn with_version_id(mut self, version_id: i32) -> Self {

Why not use u32 (and omit an assert)?

> +        assert!(version_id >= 0);
> +        self.version_id = version_id;
> +        self
> +    }
> +}
>  
>  #[doc(alias = "VMSTATE_UNUSED_BUFFER")]
>  #[macro_export]
> -- 
> 2.47.1

Good design! Look good to me.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Paolo Bonzini Jan. 7, 2025, 12:23 p.m. UTC | #2
On Tue, Jan 7, 2025 at 9:39 AM Zhao Liu <zhao1.liu@intel.com> wrote:
> This is very good work! I am curious about how QEMU plays with Rust
> forum:
>
> Rust forum's disscussion is under MIT and Apache 2.0 licenses [1], and
> since vmstate.rs is under the GPLv2 license, do we need to specify that
> certain code retains the MIT license?

I will add a note similar to the one that is already there in offset_of.

> > +/// ```
> > +/// # use qemu_api::call_func_with_field;
> > +/// # use core::marker::PhantomData;
> > +/// const fn size_of_field<T>(_: PhantomData<T>) -> usize {
> > +///     std::mem::size_of::<T>()
> > +/// }
> > +///
> > +/// struct Foo {
> > +///     x: u16,
> > +/// };
> > +/// // calls size_of_field::<u16>()
> > +/// assert_eq!(call_func_with_field!(size_of_field, Foo, x), 2);
> > +/// ```
> > +#[macro_export]
> > +macro_rules! call_func_with_field {
> > +    ($func:expr, $typ:ty, $($field:tt).+) => {
> > +        $func(loop {
> > +            #![allow(unreachable_code)]
> > +            const fn phantom__<T>(_: &T) -> ::core::marker::PhantomData<T> { ::core::marker::PhantomData }
> > +            // Unreachable code is exempt from checks on uninitialized values.
> > +            // Use that trick to infer the type of this PhantomData.
> > +            break ::core::marker::PhantomData;
> > +            break phantom__(&{ let value__: $typ; value__.$($field).+ });
> > +        })
> > +    };
> > +}
>
> Very flexible and powerful. (I even think this code could be released as
> a new public crate.)

It's probably not _that_ useful in general, unless you're implementing
this kind of reflection... otherwise I would have found an existing
solution. :) But yes, it's very powerful.

Out of curiosity, I asked claude.ai to explain it and it said "This is
a rather advanced use of Rust's type system and macro capabilities to
do compile-time reflection - basically inspecting the types of struct
fields without runtime overhead. While creative, this pattern isn't
commonly needed in everyday Rust code."

When fed the initial comment from the Rust forum it said "your comment
about wanting to access <T as SomeTrait>::SOMETHING for a field's type
is a classic serialization pattern - often used to get things like
type IDs, serialization formats, or field metadata at compile time".
That's actually pretty impressive; the LLM was also impressed and it
started asking me more about it ("Are you building a custom
serialization framework from scratch, or extending an existing one?").

> > +/// Return the `VMStateField` for a field of a struct.  The field must be
> > +/// visible in the current scope.
> > +///
> > +/// In order to support other types, the trait `VMState` must be implemented
> > +/// for them.
> > +#[macro_export]
> > +macro_rules! vmstate_of {
> > +    ($struct_name:ty, $field_name:ident $(,)?) => {
>
> why allow a comma at the end? It seems other patches don't use that
> style.

I copied it from the standard library offset_of, but I can remove it too.

> > +// Add a couple builder-style methods to VMStateField, allowing
> > +// easy derivation of VMStateField constants from other types.
> > +impl VMStateField {
> > +    #[must_use]
> > +    pub const fn with_version_id(mut self, version_id: i32) -> Self {
>
> Why not use u32 (and omit an assert)?

The version_id field is an int in the C struct; you'd still need to
assert that it's <= i32::MAX, and you'd also need an 'as i32'. In
practice it will always be a constant, therefore it will auto-cast to
either i32 or u32.

Paolo
Zhao Liu Jan. 7, 2025, 2:01 p.m. UTC | #3
> > Very flexible and powerful. (I even think this code could be released as
> > a new public crate.)
> 
> It's probably not _that_ useful in general, unless you're implementing
> this kind of reflection... otherwise I would have found an existing
> solution. :) But yes, it's very powerful.

Personally, I feel that projects that glue C and Rust together require
similar tricks more, making them more challenging.

> Out of curiosity, I asked claude.ai to explain it and it said "This is
> a rather advanced use of Rust's type system and macro capabilities to
> do compile-time reflection - basically inspecting the types of struct
> fields without runtime overhead. While creative, this pattern isn't
> commonly needed in everyday Rust code."
> 
> When fed the initial comment from the Rust forum it said "your comment
> about wanting to access <T as SomeTrait>::SOMETHING for a field's type
> is a classic serialization pattern - often used to get things like
> type IDs, serialization formats, or field metadata at compile time".
> That's actually pretty impressive; the LLM was also impressed and it
> started asking me more about it ("Are you building a custom
> serialization framework from scratch, or extending an existing one?").

Incredible, commercial LLMs are so proficient in Rust and provide such
professional comments (even a bit off-topic, it feels like LLMs could
even review patches).

Thank you for providing this interesting example. LLMs are indeed the
good tool to help get started with and practice Rust.

Thanks,
Zhao
diff mbox series

Patch

diff --git a/rust/qemu-api/src/prelude.rs b/rust/qemu-api/src/prelude.rs
index 4ea70b9c823..2dc86e19b29 100644
--- a/rust/qemu-api/src/prelude.rs
+++ b/rust/qemu-api/src/prelude.rs
@@ -18,3 +18,5 @@ 
 pub use crate::qom_isa;
 
 pub use crate::sysbus::SysBusDeviceMethods;
+
+pub use crate::vmstate::VMState;
diff --git a/rust/qemu-api/src/vmstate.rs b/rust/qemu-api/src/vmstate.rs
index 63c897abcdf..bfcf06e8f1d 100644
--- a/rust/qemu-api/src/vmstate.rs
+++ b/rust/qemu-api/src/vmstate.rs
@@ -4,13 +4,111 @@ 
 
 //! Helper macros to declare migration state for device models.
 //!
-//! Some macros are direct equivalents to the C macros declared in
-//! `include/migration/vmstate.h` while
-//! [`vmstate_subsections`](crate::vmstate_subsections) and
-//! [`vmstate_fields`](crate::vmstate_fields) are meant to be used when
-//! declaring a device model state struct.
+//! This module includes three families of macros:
+//!
+//! * [`vmstate_unused!`](crate::vmstate_unused) and
+//!   [`vmstate_of!`](crate::vmstate_of), which are used to express the
+//!   migration format for a struct.  This is based on the [`VMState`] trait,
+//!   which is defined by all migrateable types.
+//!
+//! * helper macros to declare a device model state struct, in particular
+//!   [`vmstate_subsections`](crate::vmstate_subsections) and
+//!   [`vmstate_fields`](crate::vmstate_fields).
+//!
+//! * direct equivalents to the C macros declared in
+//!   `include/migration/vmstate.h`. These are not type-safe and should not be
+//!   used if the equivalent functionality is available with `vmstate_of!`.
 
-pub use crate::bindings::VMStateDescription;
+use core::marker::PhantomData;
+
+pub use crate::bindings::{VMStateDescription, VMStateField};
+
+/// This macro is used to call a function with a generic argument bound
+/// to the type of a field.  The function must take a
+/// [`PhantomData`]`<T>` argument; `T` is the type of
+/// field `$field` in the `$typ` type.
+///
+/// # Examples
+///
+/// ```
+/// # use qemu_api::call_func_with_field;
+/// # use core::marker::PhantomData;
+/// const fn size_of_field<T>(_: PhantomData<T>) -> usize {
+///     std::mem::size_of::<T>()
+/// }
+///
+/// struct Foo {
+///     x: u16,
+/// };
+/// // calls size_of_field::<u16>()
+/// assert_eq!(call_func_with_field!(size_of_field, Foo, x), 2);
+/// ```
+#[macro_export]
+macro_rules! call_func_with_field {
+    ($func:expr, $typ:ty, $($field:tt).+) => {
+        $func(loop {
+            #![allow(unreachable_code)]
+            const fn phantom__<T>(_: &T) -> ::core::marker::PhantomData<T> { ::core::marker::PhantomData }
+            // Unreachable code is exempt from checks on uninitialized values.
+            // Use that trick to infer the type of this PhantomData.
+            break ::core::marker::PhantomData;
+            break phantom__(&{ let value__: $typ; value__.$($field).+ });
+        })
+    };
+}
+
+/// A trait for types that can be included in a device's migration stream.  It
+/// provides the base contents of a `VMStateField` (minus the name and offset).
+///
+/// # Safety
+///
+/// The contents of this trait go straight into structs that are parsed by C
+/// code and used to introspect into other structs.  Be careful.
+pub unsafe trait VMState {
+    /// The base contents of a `VMStateField` (minus the name and offset) for
+    /// the type that is implementing the trait.
+    const BASE: VMStateField;
+}
+
+/// Internal utility function to retrieve a type's `VMStateField`;
+/// used by [`vmstate_of!`](crate::vmstate_of).
+pub const fn vmstate_base<T: VMState>(_: PhantomData<T>) -> VMStateField {
+    T::BASE
+}
+
+/// Return the `VMStateField` for a field of a struct.  The field must be
+/// visible in the current scope.
+///
+/// In order to support other types, the trait `VMState` must be implemented
+/// for them.
+#[macro_export]
+macro_rules! vmstate_of {
+    ($struct_name:ty, $field_name:ident $(,)?) => {
+        $crate::bindings::VMStateField {
+            name: ::core::concat!(::core::stringify!($field_name), "\0")
+                .as_bytes()
+                .as_ptr() as *const ::std::os::raw::c_char,
+            offset: $crate::offset_of!($struct_name, $field_name),
+            // Compute most of the VMStateField from the type of the field.
+            ..$crate::call_func_with_field!(
+                $crate::vmstate::vmstate_base,
+                $struct_name,
+                $field_name
+            )
+        }
+    };
+}
+
+// Add a couple builder-style methods to VMStateField, allowing
+// easy derivation of VMStateField constants from other types.
+impl VMStateField {
+    #[must_use]
+    pub const fn with_version_id(mut self, version_id: i32) -> Self {
+        assert!(version_id >= 0);
+        self.version_id = version_id;
+        self
+    }
+}
 
 #[doc(alias = "VMSTATE_UNUSED_BUFFER")]
 #[macro_export]