diff mbox series

[03/14] rust: define traits and pointer wrappers to convert from/to C representations

Message ID 20240701145853.1394967-4-pbonzini@redhat.com (mailing list archive)
State New, archived
Headers show
Series rust: example of bindings code for Rust in QEMU | expand

Commit Message

Paolo Bonzini July 1, 2024, 2:58 p.m. UTC
The qemu::util::foreign module provides:

- A trait for structs that can be converted to a C ("foreign") representation
  (CloneToForeign)

- A trait for structs that can be built from a C ("foreign") representation
  (FromForeign), and the utility IntoNative that can be used with less typing
  (similar to the standard library's From and Into pair)

- Automatic implementations of the above traits for Option<>, supporting NULL
  pointers

- A wrapper for a pointer that automatically frees the contained data.  If
  a struct XYZ implements CloneToForeign, you can build an OwnedPointer<XYZ>
  and it will free the contents automatically unless you retrieve it with
  owned_ptr.into_inner()

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 qemu/src/lib.rs          |   6 +
 qemu/src/util/foreign.rs | 247 +++++++++++++++++++++++++++++++++++++++
 qemu/src/util/mod.rs     |   1 +
 3 files changed, 254 insertions(+)
 create mode 100644 qemu/src/util/foreign.rs
 create mode 100644 qemu/src/util/mod.rs

Comments

Marc-André Lureau July 3, 2024, 12:48 p.m. UTC | #1
Hi

(adding Sebastian, one of the glib-rs developers in CC)

On Mon, Jul 1, 2024 at 7:02 PM Paolo Bonzini <pbonzini@redhat.com> wrote:

> The qemu::util::foreign module provides:
>
> - A trait for structs that can be converted to a C ("foreign")
> representation
>   (CloneToForeign)
>
> - A trait for structs that can be built from a C ("foreign") representation
>   (FromForeign), and the utility IntoNative that can be used with less
> typing
>   (similar to the standard library's From and Into pair)
>
> - Automatic implementations of the above traits for Option<>, supporting
> NULL
>   pointers
>
> - A wrapper for a pointer that automatically frees the contained data.  If
>   a struct XYZ implements CloneToForeign, you can build an
> OwnedPointer<XYZ>
>   and it will free the contents automatically unless you retrieve it with
>   owned_ptr.into_inner()
>

You worry about technical debt, and I do too. Here you introduce quite
different traits than what glib-rs offers. We already touched this subject
2y ago, my opinion didn't change much (
https://patchew.org/QEMU/20210907121943.3498701-1-marcandre.lureau@redhat.com/20210907121943.3498701-13-marcandre.lureau@redhat.com/).
Also, you don't offer the equivalent of "to_glib_none" which uses a
temporary stash and is quite useful, as a majority of functions don't take
ownership.

Because much of our code is using GLib types and API style, I think we
should strive for something that is close (if not just the same) to what
glib-rs offers. It's already hard enough to handle one binding concept,
having 2 will only make the matter worse. Consider a type like
GHashTable<GUuid, QOM>, it will be very annoying to deal with if we have
different bindings traits and implementations and we will likely end up
duplicating glib-rs effort.

As for naming & consistency, glib-rs settled on something clearer imho:

from_glib_full
from_glib_none
to_glib_full
to_glib_none

vs

from_foreign
cloned_from_foreign
clone_to_foreign
/nothing/

but overall, this is still close enough that we shouldn't reinvent it.

It may be worth studying what the kernel offers, I haven't checked yet.


>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  qemu/src/lib.rs          |   6 +
>  qemu/src/util/foreign.rs | 247 +++++++++++++++++++++++++++++++++++++++
>  qemu/src/util/mod.rs     |   1 +
>  3 files changed, 254 insertions(+)
>  create mode 100644 qemu/src/util/foreign.rs
>  create mode 100644 qemu/src/util/mod.rs
>
> diff --git a/qemu/src/lib.rs b/qemu/src/lib.rs
> index fce21d7..c48edcf 100644
> --- a/qemu/src/lib.rs
> +++ b/qemu/src/lib.rs
> @@ -2,3 +2,9 @@
>  #![allow(dead_code)]
>
>  pub mod bindings;
> +
> +pub mod util;
> +pub use util::foreign::CloneToForeign;
> +pub use util::foreign::FromForeign;
> +pub use util::foreign::IntoNative;
> +pub use util::foreign::OwnedPointer;
> diff --git a/qemu/src/util/foreign.rs b/qemu/src/util/foreign.rs
> new file mode 100644
> index 0000000..a591925
> --- /dev/null
> +++ b/qemu/src/util/foreign.rs
> @@ -0,0 +1,247 @@
> +// TODO: change to use .cast() etc.
> +#![allow(clippy::ptr_as_ptr)]
> +
> +/// Traits to map between C structs and native Rust types.
> +/// Similar to glib-rs but a bit simpler and possibly more
> +/// idiomatic.
> +use std::borrow::Cow;
> +use std::fmt;
> +use std::fmt::Debug;
> +use std::mem;
> +use std::ptr;
> +
> +/// A type for which there is a canonical representation as a C datum.
> +pub trait CloneToForeign {
> +    /// The representation of `Self` as a C datum.  Typically a
> +    /// `struct`, though there are exceptions for example `c_char`
> +    /// for strings, since C strings are of `char *` type).
> +    type Foreign;
> +
> +    /// Free the C datum pointed to by `p`.
> +    ///
> +    /// # Safety
> +    ///
> +    /// `p` must be `NULL` or point to valid data.
> +    unsafe fn free_foreign(p: *mut Self::Foreign);
> +
> +    /// Convert a native Rust object to a foreign C struct, copying
> +    /// everything pointed to by `self` (same as `to_glib_full` in
> `glib-rs`)
> +    fn clone_to_foreign(&self) -> OwnedPointer<Self>;
> +
> +    /// Convert a native Rust object to a foreign C pointer, copying
> +    /// everything pointed to by `self`.  The returned pointer must
> +    /// be freed with the `free_foreign` associated function.
> +    fn clone_to_foreign_ptr(&self) -> *mut Self::Foreign {
> +        self.clone_to_foreign().into_inner()
> +    }
> +}
> +
> +impl<T> CloneToForeign for Option<T>
> +where
> +    T: CloneToForeign,
> +{
> +    type Foreign = <T as CloneToForeign>::Foreign;
> +
> +    unsafe fn free_foreign(x: *mut Self::Foreign) {
> +        T::free_foreign(x)
> +    }
> +
> +    fn clone_to_foreign(&self) -> OwnedPointer<Self> {
> +        // Same as the underlying implementation, but also convert `None`
> +        // to a `NULL` pointer.
> +        self.as_ref()
> +            .map(CloneToForeign::clone_to_foreign)
> +            .map(OwnedPointer::into)
> +            .unwrap_or_default()
> +    }
> +}
> +
> +impl<T> FromForeign for Option<T>
> +where
> +    T: FromForeign,
> +{
> +    unsafe fn cloned_from_foreign(p: *const Self::Foreign) -> Self {
> +        // Same as the underlying implementation, but also accept a
> `NULL` pointer.
> +        if p.is_null() {
> +            None
> +        } else {
> +            Some(T::cloned_from_foreign(p))
> +        }
> +    }
> +}
> +
> +impl<T> CloneToForeign for Box<T>
> +where
> +    T: CloneToForeign,
> +{
> +    type Foreign = <T as CloneToForeign>::Foreign;
> +
> +    unsafe fn free_foreign(x: *mut Self::Foreign) {
> +        T::free_foreign(x)
> +    }
> +
> +    fn clone_to_foreign(&self) -> OwnedPointer<Self> {
> +        self.as_ref().clone_to_foreign().into()
> +    }
> +}
> +
> +impl<T> FromForeign for Box<T>
> +where
> +    T: FromForeign,
> +{
> +    unsafe fn cloned_from_foreign(p: *const Self::Foreign) -> Self {
> +        Box::new(T::cloned_from_foreign(p))
> +    }
> +}
> +
> +impl<'a, B> CloneToForeign for Cow<'a, B>
> +    where B: 'a + ToOwned + ?Sized + CloneToForeign,
> +{
> +    type Foreign = B::Foreign;
> +
> +    unsafe fn free_foreign(ptr: *mut B::Foreign) {
> +        B::free_foreign(ptr);
> +    }
> +
> +    fn clone_to_foreign(&self) -> OwnedPointer<Self> {
> +        self.as_ref().clone_to_foreign().into()
> +    }
> +}
> +
> +/// Convert a C datum into a native Rust object, taking ownership of
> +/// the C datum.  You should not need to implement this trait
> +/// as long as Rust types implement `FromForeign`.
> +pub trait IntoNative<T> {
> +    /// Convert a C datum to a native Rust object, taking ownership of
> +    /// the pointer or Rust object (same as `from_glib_full` in `glib-rs`)
> +    ///
> +    /// # Safety
> +    ///
> +    /// `p` must point to valid data, or can be `NULL` if Self is an
> +    /// `Option` type.  It becomes invalid after the function returns.
> +    unsafe fn into_native(self) -> T;
> +}
> +
> +impl<T, U> IntoNative<U> for *mut T
> +where
> +    U: FromForeign<Foreign = T>,
> +{
> +    unsafe fn into_native(self) -> U {
> +        U::from_foreign(self)
> +    }
> +}
> +
> +/// A type which can be constructed from a canonical representation as a
> +/// C datum.
> +pub trait FromForeign: CloneToForeign + Sized {
> +    /// Convert a C datum to a native Rust object, copying everything
> +    /// pointed to by `p` (same as `from_glib_none` in `glib-rs`)
> +    ///
> +    /// # Safety
> +    ///
> +    /// `p` must point to valid data, or can be `NULL` is `Self` is an
> +    /// `Option` type.
> +    unsafe fn cloned_from_foreign(p: *const Self::Foreign) -> Self;
> +
> +    /// Convert a C datum to a native Rust object, taking ownership of
> +    /// the pointer or Rust object (same as `from_glib_full` in `glib-rs`)
> +    ///
> +    /// The default implementation calls `cloned_from_foreign` and frees
> `p`.
> +    ///
> +    /// # Safety
> +    ///
> +    /// `p` must point to valid data, or can be `NULL` is `Self` is an
> +    /// `Option` type.  `p` becomes invalid after the function returns.
> +    unsafe fn from_foreign(p: *mut Self::Foreign) -> Self {
> +        let result = Self::cloned_from_foreign(p);
> +        Self::free_foreign(p);
> +        result
> +    }
> +}
> +
> +pub struct OwnedPointer<T: CloneToForeign + ?Sized> {
> +    ptr: *mut <T as CloneToForeign>::Foreign,
> +}
> +
> +impl<T: CloneToForeign + ?Sized> OwnedPointer<T> {
> +    /// Return a new `OwnedPointer` that wraps the pointer `ptr`.
> +    ///
> +    /// # Safety
> +    ///
> +    /// The pointer must be valid and live until the returned
> `OwnedPointer`
> +    /// is dropped.
> +    pub unsafe fn new(ptr: *mut <T as CloneToForeign>::Foreign) -> Self {
> +        OwnedPointer { ptr }
> +    }
> +
> +    /// Safely create an `OwnedPointer` from one that has the same
> +    /// freeing function.
> +    pub fn from<U>(x: OwnedPointer<U>) -> Self
> +    where
> +        U: CloneToForeign<Foreign = <T as CloneToForeign>::Foreign> +
> ?Sized,
> +    {
> +        unsafe {
> +            // SAFETY: the pointer type and free function are the same,
> +            // only the type changes
> +            OwnedPointer::new(x.into_inner())
> +        }
> +    }
> +
> +    /// Safely convert an `OwnedPointer` into one that has the same
> +    /// freeing function.
> +    pub fn into<U>(self) -> OwnedPointer<U>
> +    where
> +        U: CloneToForeign<Foreign = <T as CloneToForeign>::Foreign>,
> +    {
> +        OwnedPointer::from(self)
> +    }
> +
> +    /// Return the pointer that is stored in the `OwnedPointer`.  The
> +    /// pointer is valid for as long as the `OwnedPointer` itself.
> +    pub fn as_ptr(&self) -> *const <T as CloneToForeign>::Foreign {
> +        self.ptr
> +    }
> +
> +    pub fn as_mut_ptr(&self) -> *mut <T as CloneToForeign>::Foreign {
> +        self.ptr
> +    }
> +
> +    /// Return the pointer that is stored in the `OwnedPointer`,
> +    /// consuming the `OwnedPointer` but not freeing the pointer.
> +    pub fn into_inner(mut self) -> *mut <T as CloneToForeign>::Foreign {
> +        let result = mem::replace(&mut self.ptr, ptr::null_mut());
> +        mem::forget(self);
> +        result
> +    }
> +}
> +
> +impl<T: FromForeign + ?Sized> OwnedPointer<T> {
> +    /// Convert a C datum to a native Rust object, taking ownership of
> +    /// the pointer or Rust object (same as `from_glib_full` in `glib-rs`)
> +    pub fn into_native(self) -> T {
> +        // SAFETY: the pointer was passed to the unsafe constructor
> OwnedPointer::new
> +        unsafe { T::from_foreign(self.into_inner()) }
> +    }
> +}
> +
> +impl<T: CloneToForeign + ?Sized> Debug for OwnedPointer<T> {
> +    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
> +        let name = std::any::type_name::<T>();
> +        let name = format!("OwnedPointer<{}>", name);
> +        f.debug_tuple(&name).field(&self.as_ptr()).finish()
> +    }
> +}
> +
> +impl<T: CloneToForeign + Default + ?Sized> Default for OwnedPointer<T> {
> +    fn default() -> Self {
> +        <T as Default>::default().clone_to_foreign()
> +    }
> +}
> +
> +impl<T: CloneToForeign + ?Sized> Drop for OwnedPointer<T> {
> +    fn drop(&mut self) {
> +        let p = mem::replace(&mut self.ptr, ptr::null_mut());
> +        // SAFETY: the pointer was passed to the unsafe constructor
> OwnedPointer::new
> +        unsafe { T::free_foreign(p) }
> +    }
> +}
> diff --git a/qemu/src/util/mod.rs b/qemu/src/util/mod.rs
> new file mode 100644
> index 0000000..be00d0c
> --- /dev/null
> +++ b/qemu/src/util/mod.rs
> @@ -0,0 +1 @@
> +pub mod foreign;
> --
> 2.45.2
>
>
>
Marc-André Lureau July 3, 2024, 12:58 p.m. UTC | #2
Hi

On Wed, Jul 3, 2024 at 4:48 PM Marc-André Lureau <marcandre.lureau@gmail.com>
wrote:

> Hi
>
> (adding Sebastian, one of the glib-rs developers in CC)
>
> On Mon, Jul 1, 2024 at 7:02 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>> The qemu::util::foreign module provides:
>>
>> - A trait for structs that can be converted to a C ("foreign")
>> representation
>>   (CloneToForeign)
>>
>> - A trait for structs that can be built from a C ("foreign")
>> representation
>>   (FromForeign), and the utility IntoNative that can be used with less
>> typing
>>   (similar to the standard library's From and Into pair)
>>
>> - Automatic implementations of the above traits for Option<>, supporting
>> NULL
>>   pointers
>>
>> - A wrapper for a pointer that automatically frees the contained data.  If
>>   a struct XYZ implements CloneToForeign, you can build an
>> OwnedPointer<XYZ>
>>   and it will free the contents automatically unless you retrieve it with
>>   owned_ptr.into_inner()
>>
>
> [...] Also, you don't offer the equivalent of "to_glib_none" which uses a
> temporary stash and is quite useful, as a majority of functions don't take
> ownership.
>

I realize this may be somewhat offered by OwnedPointer (although not
mentioned explicitly in the doc). Its usage seems unsafe, as you have to
handle the foreign pointer lifetime manually...
Paolo Bonzini July 3, 2024, 3:55 p.m. UTC | #3
[warning: long email]

On Wed, Jul 3, 2024 at 2:48 PM Marc-André Lureau
<marcandre.lureau@gmail.com> wrote:
>
> Hi
>
> (adding Sebastian, one of the glib-rs developers in CC)
>
> On Mon, Jul 1, 2024 at 7:02 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> The qemu::util::foreign module provides:
>>
>> - A trait for structs that can be converted to a C ("foreign") representation
>> - A trait for structs that can be built from a C ("foreign") representation
>> - A wrapper for a pointer that automatically frees the contained data.
>
> You worry about technical debt, and I do too. Here you introduce quite different traits than what glib-rs offers.
> We already touched this subject 2y ago, my opinion didn't change much
> (https://patchew.org/QEMU/20210907121943.3498701-1-marcandre.lureau@redhat.com/20210907121943.3498701-13-marcandre.lureau@redhat.com/)

Hi Marc-André, first of all thanks for reviewing and, probably, sorry
for not going more into detail as to why I reinvented this particular
wheel.

Like two years ago, I find the full/none nomenclature very confusing
and there is no trace of them in QEMU, but I could get over that
easily.

But also, I think the two have different focus and design. All I
wanted was a place to put common code to convert Rust objects (mostly
CStr/String/str and Error) to and from C representations (char * and
QEMU's Error*), to avoid rewriting it over and over. So I wanted the
traits to be as simple as possible, and I didn't like that glib puts
to_glib_none and to_glib_full into the same trait, though I understand
why it does so. I also didn't like that to_glib_full() is an express
ticket to a memory leak. :)

OwnedPointer<> and free_foreign solve all the qualms I had with glib-rs:

- OwnedPointer<> takes care of freeing at the cost of extra verbosity
for ".as_ptr()".

- because clone_to_foreign() does not leak, I don't need to think
about to_glib_none() from the get go.

This has a ripple effect on other parts of the design. For example,
because glib has to_glib_none(), its basic design is to never allocate
malloc-ed memory unless ownership of the pointer is passed to C.
glib-rs's to_glib_none can and will "clone to Rust-managed memory" if
needed. Instead, because I have free_foreign(), I can malloc freely.

In the future we could add something like Stash/to_glib_none(), but it
would be very different and with performance properties enforced by
the type system. It would "create a pointer into *existing*
Rust-managed memory" cheaply but, every time a cheap borrow is not
possible, users will have to call clone_to_foreign() explicitly. In
other words there will be a difference between OwnedPointer<> as the
result of an expensive operation, and [the equivalent of glib-rs]
Stash as the result of a cheap operation. This is unlike
to_glib_none() which is e.g. cheap for CStr but not for String. More
on this below.

> Also, you don't offer the equivalent of "to_glib_none" which uses a temporary stash and is quite useful, as a majority of functions don't take ownership.

Right, in part you have already answered that: OwnedPointer handles
freeing in the same way as g_autofree, so you can use it as a
to_glib_none() replacement. If you need to pass the pointer and forget
about it, you can use clone_to_foreign_ptr(). If you need something
temporary that needs to be freed before returning, you use
clone_to_foreign() and as_ptr().

This is not any more unsafe than to_glib_none(). If you pass a
temporary pointer (x.clone_to_foreign().as_ptr()) where C code expects
full ownership, you get a dangling pointer either way. It is less
leak-happy than to_glib_full() though.

Regarding how to obtain a pointer cheaply, there are two use cases:

- borrow:

  fn borrow_foreign<'a>(&'a self) -> Stash<'a,
      <Self as CloneToForeign>::Foreign, &'a Self>;

Useful for slices and elementary integer types - including cases where
C parameters are passed by reference - and also for CStr. However
right now my main user is strings and it doesn't buy much there.
Because String and &str are not null-terminated, a "borrow" would
allocate memory, as in to_glib_none(), which is pretty weird. At this
point, I find it better to have the more explicit clone_to_foreign()
that tells you how expensive it is.

- consumption, which does not exist in glib-rs:

    fn into_foreign(self) -> Stash<'static,
      <Self as CloneToForeign>::Foreign, Self>;

This works for types that, like String, deref to a stable address even
when moved---this way we don't need to deal with pinning. And it is in
fact especially nice for the String case and therefore for anything
that uses format!(), because it could be implemented as

   s.reserve_exact(1);
   s.push('\0');

The two would be separate traits. There was one case in which I could
have used into_foreign(), namely the call to error_setg_internal() in
patch 5. But I am not sure if this will be more common than functions
taking &str, so I left it for later.

> Because much of our code is using GLib types and API style, I
> think we should strive for something that is close (if not just the
> same) to what glib-rs offers. It's already hard enough to handle
> one binding concept, having 2 will only make the matter worse.
> Consider a type like GHashTable<GUuid, QOM>, it will be very
> annoying to deal with if we have different bindings traits and
> implementations and we will likely end up duplicating glib-rs
> effort.

I'm not even sure if it's even true that much QEMU code is using GLib
types. We use quite a bit of GLib, and took a lot of inspiration from
it because GLib is good. But it's notable that we use very little GLib
*across modules*. There are only a handful of occurrences to
GList/GList/GHashTable in function arguments in include/, for all of
QEMU, and all of them are very rarely used functions. Why? Possibly
because, even where we take inspiration (the Error** idiom, QOM
casts), the implementation ends up completely different for reasons
that aren't just NIH.

In any case I don't see us using glib-rs much, if at all; uses of hash
tables in Rust code can just use HashMap<>. Yes, there is no
equivalent of the expensive, O(n) conversion
self.some_g_hash_table.to_glib_none(); but that's a feature, not a
limitation.

Also, anyway we couldn't extend the glib traits a lot, and glib-rs
only supports HashMap<String, String>.

> As for naming & consistency, glib-rs settled on something clearer imho:
>
> from_glib_full
> from_glib_none
> to_glib_full
> to_glib_none
>
> vs
>
> from_foreign
> cloned_from_foreign
> clone_to_foreign
> /nothing/

These last two are respectively clone_to_foreign_ptr and
clone_to_foreign. I am not wed to my names but I'm not sure the glib
ones are clearer.

What I want from names is clarity on who allocates or frees; I do that
with "clone", which is a common Rust term, to indicate that the C side
still needs to be freed after the call. I find it confusing that
from_glib_full() requires no freeing but to_glib_full() requires
freeing, for example - I understand _why_ but still "full" and "none"
are GIR terms that are not really familiar to most Rust or QEMU
developers.

And secondarily, clarity on what is cheap and what isn't. I don't have
that functionality at all for now, but the nomenclature is easily
extended to borrow_foreign() and into_foreign(). Talking about Rust
terms, glib-rs lacks "into" which is the standard Rust way to do
consuming conversions, while my code has into_native().

All in all, I think there are some significant differences between
glib-rs and this implementation that justify the effort.


Paolo
Kevin Wolf Sept. 27, 2024, 4:09 p.m. UTC | #4
Hi Paolo,

as you asked me at KVM Forum to have a look at this, I'm going through
it now.

Am 01.07.2024 um 16:58 hat Paolo Bonzini geschrieben:
> The qemu::util::foreign module provides:
> 
> - A trait for structs that can be converted to a C ("foreign") representation
>   (CloneToForeign)
> 
> - A trait for structs that can be built from a C ("foreign") representation
>   (FromForeign), and the utility IntoNative that can be used with less typing
>   (similar to the standard library's From and Into pair)

It makes sense to me that we'll need something to convert data and that
this usually means creating a new instance, i.e. cloning. However, while
it's obvious that this is similar to From/Into, the part I'm missing
here is what's different from it.

In other words, couldn't we just implement the normal From trait between
FFI types and the native equivalent?

> - Automatic implementations of the above traits for Option<>, supporting NULL
>   pointers

This is nice.

> - A wrapper for a pointer that automatically frees the contained data.  If
>   a struct XYZ implements CloneToForeign, you can build an OwnedPointer<XYZ>
>   and it will free the contents automatically unless you retrieve it with
>   owned_ptr.into_inner()

Something here feels off to me.

At first, I thought it might be only about naming. This is not about
owning the pointer (which you probably do anyway), but that the pointer
owns the object it points to. This concept has in fact a name in Rust:
It's a Box.

The major difference compared to Box is that we're using a different
allocator. Not sure if the allocator APIs would actually be viable, but
they're not stable anyway - but let's at least name this thing in way
that draws the obvious parallel. Maybe ForeignBox.

But the other thing that doesn't feel quite right is how this is coupled
with CloneToForeign. Freeing is different from cloning, and it really
relates to the foreign type itself, and not to the one that can be
cloned into a foreign type.

Bringing both together, what a Box doesn't usually have is a function
pointer for freeing. We probably don't need it here either, almost
everything needs g_free(). There is a different free_foreign()
implementation for Error, but arguably this could be changed:
bindings::Error should then implement Drop for the inner value (with an
error_free_inner() which needs to be exported separately from C first),
and then ForeignBox can just drop the Error object and g_free() the
pointer itself like it would do with any other value.

(Your implementation actually calls free() instead of g_free(). We
generally try to avoid that in our C code, so we should probably avoid
it in Rust, too.)

Kevin
Stefan Hajnoczi Sept. 27, 2024, 7:36 p.m. UTC | #5
On Mon, 1 Jul 2024 at 11:02, Paolo Bonzini <pbonzini@redhat.com> wrote:
> +/// A type for which there is a canonical representation as a C datum.
> +pub trait CloneToForeign {
> +    /// The representation of `Self` as a C datum.  Typically a
> +    /// `struct`, though there are exceptions for example `c_char`
> +    /// for strings, since C strings are of `char *` type).
> +    type Foreign;
> +
> +    /// Free the C datum pointed to by `p`.
> +    ///
> +    /// # Safety
> +    ///
> +    /// `p` must be `NULL` or point to valid data.
> +    unsafe fn free_foreign(p: *mut Self::Foreign);
> +
> +    /// Convert a native Rust object to a foreign C struct, copying
> +    /// everything pointed to by `self` (same as `to_glib_full` in `glib-rs`)
> +    fn clone_to_foreign(&self) -> OwnedPointer<Self>;

I expected the return type to be OwnedPointer<Self::Foreign>. Is this a typo?

Also, why is the return type OwnedPointer<T> instead of just T? I
guess it's common to want a heap-allocated value here so you decided
to hard-code OwnedPointer<>, but I'm not sure.
Paolo Bonzini Sept. 27, 2024, 8:19 p.m. UTC | #6
On Fri, Sep 27, 2024 at 6:10 PM Kevin Wolf <kwolf@redhat.com> wrote:
> Am 01.07.2024 um 16:58 hat Paolo Bonzini geschrieben:
> > The qemu::util::foreign module provides:
> >
> > - A trait for structs that can be converted to a C ("foreign") representation
> >   (CloneToForeign)
> >
> > - A trait for structs that can be built from a C ("foreign") representation
> >   (FromForeign), and the utility IntoNative that can be used with less typing
> >   (similar to the standard library's From and Into pair)
>
> It makes sense to me that we'll need something to convert data and that
> this usually means creating a new instance, i.e. cloning. However, while
> it's obvious that this is similar to From/Into, the part I'm missing
> here is what's different from it.
>
> In other words, couldn't we just implement the normal From trait between
> FFI types and the native equivalent?

In general yes. Using new traits has two advantages (all IMHO of
course). First, it makes it possible to tie the implementation to the
availability of a freeing function; second, it makes it possible to
define this always, whereas From is limited by the orphan rule (you
cannot provide implementations of a trait for a struct unless you are
the crate that defines either the struct or the trait).

> > - Automatic implementations of the above traits for Option<>, supporting NULL
> >   pointers
>
> This is nice.

... for example, you can't have such a blanket implementation "impl<T,
U: From<T>> From<T> for Option<U> {}".

> > - A wrapper for a pointer that automatically frees the contained data.  If
> >   a struct XYZ implements CloneToForeign, you can build an OwnedPointer<XYZ>
> >   and it will free the contents automatically unless you retrieve it with
> >   owned_ptr.into_inner()
>
> Something here feels off to me.
>
> At first, I thought it might be only about naming. This is not about
> owning the pointer (which you probably do anyway), but that the pointer
> owns the object it points to. This concept has in fact a name in Rust:
> It's a Box.
>
> The major difference compared to Box is that we're using a different
> allocator. Not sure if the allocator APIs would actually be viable, but
> they're not stable anyway - but let's at least name this thing in way
> that draws the obvious parallel. Maybe ForeignBox.
> But the other thing that doesn't feel quite right is how this is coupled
> with CloneToForeign. Freeing is different from cloning, and it really
> relates to the foreign type itself, and not to the one that can be
> cloned into a foreign type.

I am not 100% convinced of the ForeignBox name but I see your point.
You'd prefer to have "impl FreeForeign for c_char" and "impl
CloneToForeign for str", where it's the CloneToForeign::Foreign
associated type (in this case c_char) that must implement FreeForeign.
Also, clone_to_foreign() for a str would return an
OwnedPointer<c_char>, not an OwnedPointer<str>. Never be too
optimistic about Rust, but that should be doable and I agree it is
better.

> Bringing both together, what a Box doesn't usually have is a function
> pointer for freeing. We probably don't need it here either, almost
> everything needs g_free().

Is that true? For example we could have qobject_unref, or qapi_free_*
as implementations of FreeForeign.

> There is a different free_foreign()
> implementation for Error, but arguably this could be changed:
> bindings::Error should then implement Drop for the inner value (with an
> error_free_inner() which needs to be exported separately from C first),
> and then ForeignBox can just drop the Error object and g_free() the
> pointer itself like it would do with any other value.

I think that's a bit too magical. Making Error always represent an
owned object seems a bit risky; instead I'm treating the bindings::XYZ
structs as very C-like objects, that are usually managed through
either OwnedPointer<...> (e.g. for a parameter) or *mut XYZ (for a
return value).

Thanks for looking at this. This kind of review is something that
we'll have to do a lot and it's better to have some early experience
of what they look like, independent of whether this code ultimately
will or will not be part of QEMU.

Paolo
Paolo Bonzini Sept. 27, 2024, 8:26 p.m. UTC | #7
On Fri, Sep 27, 2024 at 9:36 PM Stefan Hajnoczi <stefanha@gmail.com> wrote:
>
> On Mon, 1 Jul 2024 at 11:02, Paolo Bonzini <pbonzini@redhat.com> wrote:
> > +/// A type for which there is a canonical representation as a C datum.
> > +pub trait CloneToForeign {
> > +    /// The representation of `Self` as a C datum.  Typically a
> > +    /// `struct`, though there are exceptions for example `c_char`
> > +    /// for strings, since C strings are of `char *` type).
> > +    type Foreign;
> > +
> > +    /// Free the C datum pointed to by `p`.
> > +    ///
> > +    /// # Safety
> > +    ///
> > +    /// `p` must be `NULL` or point to valid data.
> > +    unsafe fn free_foreign(p: *mut Self::Foreign);
> > +
> > +    /// Convert a native Rust object to a foreign C struct, copying
> > +    /// everything pointed to by `self` (same as `to_glib_full` in `glib-rs`)
> > +    fn clone_to_foreign(&self) -> OwnedPointer<Self>;
>
> I expected the return type to be OwnedPointer<Self::Foreign>. Is this a typo?

Kevin noticed the same. I'd have to check if I am missing something
but it seems to be just tunnel vision.

> Also, why is the return type OwnedPointer<T> instead of just T? I
> guess it's common to want a heap-allocated value here so you decided
> to hard-code OwnedPointer<>, but I'm not sure.

Because at this point it looks like the most important conversion is
to have a clone (meaning its lifetime is independent of the copy) and
a value that is not movable (moves can be unpredictable and then usage
in C is messy). The main example is creating a QEMU Error from
something that implements the Rust std::error::Error trait.

Actually I have written extra code that _borrows_ into a foreign
object (so a CStr can be used as a *const c_char for as long as the
CStr is alive), but I didn't really have a user and wrote it only to
validate the concept.

Paolo
diff mbox series

Patch

diff --git a/qemu/src/lib.rs b/qemu/src/lib.rs
index fce21d7..c48edcf 100644
--- a/qemu/src/lib.rs
+++ b/qemu/src/lib.rs
@@ -2,3 +2,9 @@ 
 #![allow(dead_code)]
 
 pub mod bindings;
+
+pub mod util;
+pub use util::foreign::CloneToForeign;
+pub use util::foreign::FromForeign;
+pub use util::foreign::IntoNative;
+pub use util::foreign::OwnedPointer;
diff --git a/qemu/src/util/foreign.rs b/qemu/src/util/foreign.rs
new file mode 100644
index 0000000..a591925
--- /dev/null
+++ b/qemu/src/util/foreign.rs
@@ -0,0 +1,247 @@ 
+// TODO: change to use .cast() etc.
+#![allow(clippy::ptr_as_ptr)]
+
+/// Traits to map between C structs and native Rust types.
+/// Similar to glib-rs but a bit simpler and possibly more
+/// idiomatic.
+use std::borrow::Cow;
+use std::fmt;
+use std::fmt::Debug;
+use std::mem;
+use std::ptr;
+
+/// A type for which there is a canonical representation as a C datum.
+pub trait CloneToForeign {
+    /// The representation of `Self` as a C datum.  Typically a
+    /// `struct`, though there are exceptions for example `c_char`
+    /// for strings, since C strings are of `char *` type).
+    type Foreign;
+
+    /// Free the C datum pointed to by `p`.
+    ///
+    /// # Safety
+    ///
+    /// `p` must be `NULL` or point to valid data.
+    unsafe fn free_foreign(p: *mut Self::Foreign);
+
+    /// Convert a native Rust object to a foreign C struct, copying
+    /// everything pointed to by `self` (same as `to_glib_full` in `glib-rs`)
+    fn clone_to_foreign(&self) -> OwnedPointer<Self>;
+
+    /// Convert a native Rust object to a foreign C pointer, copying
+    /// everything pointed to by `self`.  The returned pointer must
+    /// be freed with the `free_foreign` associated function.
+    fn clone_to_foreign_ptr(&self) -> *mut Self::Foreign {
+        self.clone_to_foreign().into_inner()
+    }
+}
+
+impl<T> CloneToForeign for Option<T>
+where
+    T: CloneToForeign,
+{
+    type Foreign = <T as CloneToForeign>::Foreign;
+
+    unsafe fn free_foreign(x: *mut Self::Foreign) {
+        T::free_foreign(x)
+    }
+
+    fn clone_to_foreign(&self) -> OwnedPointer<Self> {
+        // Same as the underlying implementation, but also convert `None`
+        // to a `NULL` pointer.
+        self.as_ref()
+            .map(CloneToForeign::clone_to_foreign)
+            .map(OwnedPointer::into)
+            .unwrap_or_default()
+    }
+}
+
+impl<T> FromForeign for Option<T>
+where
+    T: FromForeign,
+{
+    unsafe fn cloned_from_foreign(p: *const Self::Foreign) -> Self {
+        // Same as the underlying implementation, but also accept a `NULL` pointer.
+        if p.is_null() {
+            None
+        } else {
+            Some(T::cloned_from_foreign(p))
+        }
+    }
+}
+
+impl<T> CloneToForeign for Box<T>
+where
+    T: CloneToForeign,
+{
+    type Foreign = <T as CloneToForeign>::Foreign;
+
+    unsafe fn free_foreign(x: *mut Self::Foreign) {
+        T::free_foreign(x)
+    }
+
+    fn clone_to_foreign(&self) -> OwnedPointer<Self> {
+        self.as_ref().clone_to_foreign().into()
+    }
+}
+
+impl<T> FromForeign for Box<T>
+where
+    T: FromForeign,
+{
+    unsafe fn cloned_from_foreign(p: *const Self::Foreign) -> Self {
+        Box::new(T::cloned_from_foreign(p))
+    }
+}
+
+impl<'a, B> CloneToForeign for Cow<'a, B>
+    where B: 'a + ToOwned + ?Sized + CloneToForeign,
+{
+    type Foreign = B::Foreign;
+
+    unsafe fn free_foreign(ptr: *mut B::Foreign) {
+        B::free_foreign(ptr);
+    }
+
+    fn clone_to_foreign(&self) -> OwnedPointer<Self> {
+        self.as_ref().clone_to_foreign().into()
+    }
+}
+
+/// Convert a C datum into a native Rust object, taking ownership of
+/// the C datum.  You should not need to implement this trait
+/// as long as Rust types implement `FromForeign`.
+pub trait IntoNative<T> {
+    /// Convert a C datum to a native Rust object, taking ownership of
+    /// the pointer or Rust object (same as `from_glib_full` in `glib-rs`)
+    ///
+    /// # Safety
+    ///
+    /// `p` must point to valid data, or can be `NULL` if Self is an
+    /// `Option` type.  It becomes invalid after the function returns.
+    unsafe fn into_native(self) -> T;
+}
+
+impl<T, U> IntoNative<U> for *mut T
+where
+    U: FromForeign<Foreign = T>,
+{
+    unsafe fn into_native(self) -> U {
+        U::from_foreign(self)
+    }
+}
+
+/// A type which can be constructed from a canonical representation as a
+/// C datum.
+pub trait FromForeign: CloneToForeign + Sized {
+    /// Convert a C datum to a native Rust object, copying everything
+    /// pointed to by `p` (same as `from_glib_none` in `glib-rs`)
+    ///
+    /// # Safety
+    ///
+    /// `p` must point to valid data, or can be `NULL` is `Self` is an
+    /// `Option` type.
+    unsafe fn cloned_from_foreign(p: *const Self::Foreign) -> Self;
+
+    /// Convert a C datum to a native Rust object, taking ownership of
+    /// the pointer or Rust object (same as `from_glib_full` in `glib-rs`)
+    ///
+    /// The default implementation calls `cloned_from_foreign` and frees `p`.
+    ///
+    /// # Safety
+    ///
+    /// `p` must point to valid data, or can be `NULL` is `Self` is an
+    /// `Option` type.  `p` becomes invalid after the function returns.
+    unsafe fn from_foreign(p: *mut Self::Foreign) -> Self {
+        let result = Self::cloned_from_foreign(p);
+        Self::free_foreign(p);
+        result
+    }
+}
+
+pub struct OwnedPointer<T: CloneToForeign + ?Sized> {
+    ptr: *mut <T as CloneToForeign>::Foreign,
+}
+
+impl<T: CloneToForeign + ?Sized> OwnedPointer<T> {
+    /// Return a new `OwnedPointer` that wraps the pointer `ptr`.
+    ///
+    /// # Safety
+    ///
+    /// The pointer must be valid and live until the returned `OwnedPointer`
+    /// is dropped.
+    pub unsafe fn new(ptr: *mut <T as CloneToForeign>::Foreign) -> Self {
+        OwnedPointer { ptr }
+    }
+
+    /// Safely create an `OwnedPointer` from one that has the same
+    /// freeing function.
+    pub fn from<U>(x: OwnedPointer<U>) -> Self
+    where
+        U: CloneToForeign<Foreign = <T as CloneToForeign>::Foreign> + ?Sized,
+    {
+        unsafe {
+            // SAFETY: the pointer type and free function are the same,
+            // only the type changes
+            OwnedPointer::new(x.into_inner())
+        }
+    }
+
+    /// Safely convert an `OwnedPointer` into one that has the same
+    /// freeing function.
+    pub fn into<U>(self) -> OwnedPointer<U>
+    where
+        U: CloneToForeign<Foreign = <T as CloneToForeign>::Foreign>,
+    {
+        OwnedPointer::from(self)
+    }
+
+    /// Return the pointer that is stored in the `OwnedPointer`.  The
+    /// pointer is valid for as long as the `OwnedPointer` itself.
+    pub fn as_ptr(&self) -> *const <T as CloneToForeign>::Foreign {
+        self.ptr
+    }
+
+    pub fn as_mut_ptr(&self) -> *mut <T as CloneToForeign>::Foreign {
+        self.ptr
+    }
+
+    /// Return the pointer that is stored in the `OwnedPointer`,
+    /// consuming the `OwnedPointer` but not freeing the pointer.
+    pub fn into_inner(mut self) -> *mut <T as CloneToForeign>::Foreign {
+        let result = mem::replace(&mut self.ptr, ptr::null_mut());
+        mem::forget(self);
+        result
+    }
+}
+
+impl<T: FromForeign + ?Sized> OwnedPointer<T> {
+    /// Convert a C datum to a native Rust object, taking ownership of
+    /// the pointer or Rust object (same as `from_glib_full` in `glib-rs`)
+    pub fn into_native(self) -> T {
+        // SAFETY: the pointer was passed to the unsafe constructor OwnedPointer::new
+        unsafe { T::from_foreign(self.into_inner()) }
+    }
+}
+
+impl<T: CloneToForeign + ?Sized> Debug for OwnedPointer<T> {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        let name = std::any::type_name::<T>();
+        let name = format!("OwnedPointer<{}>", name);
+        f.debug_tuple(&name).field(&self.as_ptr()).finish()
+    }
+}
+
+impl<T: CloneToForeign + Default + ?Sized> Default for OwnedPointer<T> {
+    fn default() -> Self {
+        <T as Default>::default().clone_to_foreign()
+    }
+}
+
+impl<T: CloneToForeign + ?Sized> Drop for OwnedPointer<T> {
+    fn drop(&mut self) {
+        let p = mem::replace(&mut self.ptr, ptr::null_mut());
+        // SAFETY: the pointer was passed to the unsafe constructor OwnedPointer::new
+        unsafe { T::free_foreign(p) }
+    }
+}
diff --git a/qemu/src/util/mod.rs b/qemu/src/util/mod.rs
new file mode 100644
index 0000000..be00d0c
--- /dev/null
+++ b/qemu/src/util/mod.rs
@@ -0,0 +1 @@ 
+pub mod foreign;