Message ID | 20240701145853.1394967-4-pbonzini@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | rust: example of bindings code for Rust in QEMU | expand |
Hi (adding Sebastian, one of the glib-rs developers in CC) On Mon, Jul 1, 2024 at 7:02 PM Paolo Bonzini <pbonzini@redhat.com> wrote: > The qemu::util::foreign module provides: > > - A trait for structs that can be converted to a C ("foreign") > representation > (CloneToForeign) > > - A trait for structs that can be built from a C ("foreign") representation > (FromForeign), and the utility IntoNative that can be used with less > typing > (similar to the standard library's From and Into pair) > > - Automatic implementations of the above traits for Option<>, supporting > NULL > pointers > > - A wrapper for a pointer that automatically frees the contained data. If > a struct XYZ implements CloneToForeign, you can build an > OwnedPointer<XYZ> > and it will free the contents automatically unless you retrieve it with > owned_ptr.into_inner() > You worry about technical debt, and I do too. Here you introduce quite different traits than what glib-rs offers. We already touched this subject 2y ago, my opinion didn't change much ( https://patchew.org/QEMU/20210907121943.3498701-1-marcandre.lureau@redhat.com/20210907121943.3498701-13-marcandre.lureau@redhat.com/). Also, you don't offer the equivalent of "to_glib_none" which uses a temporary stash and is quite useful, as a majority of functions don't take ownership. Because much of our code is using GLib types and API style, I think we should strive for something that is close (if not just the same) to what glib-rs offers. It's already hard enough to handle one binding concept, having 2 will only make the matter worse. Consider a type like GHashTable<GUuid, QOM>, it will be very annoying to deal with if we have different bindings traits and implementations and we will likely end up duplicating glib-rs effort. As for naming & consistency, glib-rs settled on something clearer imho: from_glib_full from_glib_none to_glib_full to_glib_none vs from_foreign cloned_from_foreign clone_to_foreign /nothing/ but overall, this is still close enough that we shouldn't reinvent it. It may be worth studying what the kernel offers, I haven't checked yet. > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > qemu/src/lib.rs | 6 + > qemu/src/util/foreign.rs | 247 +++++++++++++++++++++++++++++++++++++++ > qemu/src/util/mod.rs | 1 + > 3 files changed, 254 insertions(+) > create mode 100644 qemu/src/util/foreign.rs > create mode 100644 qemu/src/util/mod.rs > > diff --git a/qemu/src/lib.rs b/qemu/src/lib.rs > index fce21d7..c48edcf 100644 > --- a/qemu/src/lib.rs > +++ b/qemu/src/lib.rs > @@ -2,3 +2,9 @@ > #![allow(dead_code)] > > pub mod bindings; > + > +pub mod util; > +pub use util::foreign::CloneToForeign; > +pub use util::foreign::FromForeign; > +pub use util::foreign::IntoNative; > +pub use util::foreign::OwnedPointer; > diff --git a/qemu/src/util/foreign.rs b/qemu/src/util/foreign.rs > new file mode 100644 > index 0000000..a591925 > --- /dev/null > +++ b/qemu/src/util/foreign.rs > @@ -0,0 +1,247 @@ > +// TODO: change to use .cast() etc. > +#![allow(clippy::ptr_as_ptr)] > + > +/// Traits to map between C structs and native Rust types. > +/// Similar to glib-rs but a bit simpler and possibly more > +/// idiomatic. > +use std::borrow::Cow; > +use std::fmt; > +use std::fmt::Debug; > +use std::mem; > +use std::ptr; > + > +/// A type for which there is a canonical representation as a C datum. > +pub trait CloneToForeign { > + /// The representation of `Self` as a C datum. Typically a > + /// `struct`, though there are exceptions for example `c_char` > + /// for strings, since C strings are of `char *` type). > + type Foreign; > + > + /// Free the C datum pointed to by `p`. > + /// > + /// # Safety > + /// > + /// `p` must be `NULL` or point to valid data. > + unsafe fn free_foreign(p: *mut Self::Foreign); > + > + /// Convert a native Rust object to a foreign C struct, copying > + /// everything pointed to by `self` (same as `to_glib_full` in > `glib-rs`) > + fn clone_to_foreign(&self) -> OwnedPointer<Self>; > + > + /// Convert a native Rust object to a foreign C pointer, copying > + /// everything pointed to by `self`. The returned pointer must > + /// be freed with the `free_foreign` associated function. > + fn clone_to_foreign_ptr(&self) -> *mut Self::Foreign { > + self.clone_to_foreign().into_inner() > + } > +} > + > +impl<T> CloneToForeign for Option<T> > +where > + T: CloneToForeign, > +{ > + type Foreign = <T as CloneToForeign>::Foreign; > + > + unsafe fn free_foreign(x: *mut Self::Foreign) { > + T::free_foreign(x) > + } > + > + fn clone_to_foreign(&self) -> OwnedPointer<Self> { > + // Same as the underlying implementation, but also convert `None` > + // to a `NULL` pointer. > + self.as_ref() > + .map(CloneToForeign::clone_to_foreign) > + .map(OwnedPointer::into) > + .unwrap_or_default() > + } > +} > + > +impl<T> FromForeign for Option<T> > +where > + T: FromForeign, > +{ > + unsafe fn cloned_from_foreign(p: *const Self::Foreign) -> Self { > + // Same as the underlying implementation, but also accept a > `NULL` pointer. > + if p.is_null() { > + None > + } else { > + Some(T::cloned_from_foreign(p)) > + } > + } > +} > + > +impl<T> CloneToForeign for Box<T> > +where > + T: CloneToForeign, > +{ > + type Foreign = <T as CloneToForeign>::Foreign; > + > + unsafe fn free_foreign(x: *mut Self::Foreign) { > + T::free_foreign(x) > + } > + > + fn clone_to_foreign(&self) -> OwnedPointer<Self> { > + self.as_ref().clone_to_foreign().into() > + } > +} > + > +impl<T> FromForeign for Box<T> > +where > + T: FromForeign, > +{ > + unsafe fn cloned_from_foreign(p: *const Self::Foreign) -> Self { > + Box::new(T::cloned_from_foreign(p)) > + } > +} > + > +impl<'a, B> CloneToForeign for Cow<'a, B> > + where B: 'a + ToOwned + ?Sized + CloneToForeign, > +{ > + type Foreign = B::Foreign; > + > + unsafe fn free_foreign(ptr: *mut B::Foreign) { > + B::free_foreign(ptr); > + } > + > + fn clone_to_foreign(&self) -> OwnedPointer<Self> { > + self.as_ref().clone_to_foreign().into() > + } > +} > + > +/// Convert a C datum into a native Rust object, taking ownership of > +/// the C datum. You should not need to implement this trait > +/// as long as Rust types implement `FromForeign`. > +pub trait IntoNative<T> { > + /// Convert a C datum to a native Rust object, taking ownership of > + /// the pointer or Rust object (same as `from_glib_full` in `glib-rs`) > + /// > + /// # Safety > + /// > + /// `p` must point to valid data, or can be `NULL` if Self is an > + /// `Option` type. It becomes invalid after the function returns. > + unsafe fn into_native(self) -> T; > +} > + > +impl<T, U> IntoNative<U> for *mut T > +where > + U: FromForeign<Foreign = T>, > +{ > + unsafe fn into_native(self) -> U { > + U::from_foreign(self) > + } > +} > + > +/// A type which can be constructed from a canonical representation as a > +/// C datum. > +pub trait FromForeign: CloneToForeign + Sized { > + /// Convert a C datum to a native Rust object, copying everything > + /// pointed to by `p` (same as `from_glib_none` in `glib-rs`) > + /// > + /// # Safety > + /// > + /// `p` must point to valid data, or can be `NULL` is `Self` is an > + /// `Option` type. > + unsafe fn cloned_from_foreign(p: *const Self::Foreign) -> Self; > + > + /// Convert a C datum to a native Rust object, taking ownership of > + /// the pointer or Rust object (same as `from_glib_full` in `glib-rs`) > + /// > + /// The default implementation calls `cloned_from_foreign` and frees > `p`. > + /// > + /// # Safety > + /// > + /// `p` must point to valid data, or can be `NULL` is `Self` is an > + /// `Option` type. `p` becomes invalid after the function returns. > + unsafe fn from_foreign(p: *mut Self::Foreign) -> Self { > + let result = Self::cloned_from_foreign(p); > + Self::free_foreign(p); > + result > + } > +} > + > +pub struct OwnedPointer<T: CloneToForeign + ?Sized> { > + ptr: *mut <T as CloneToForeign>::Foreign, > +} > + > +impl<T: CloneToForeign + ?Sized> OwnedPointer<T> { > + /// Return a new `OwnedPointer` that wraps the pointer `ptr`. > + /// > + /// # Safety > + /// > + /// The pointer must be valid and live until the returned > `OwnedPointer` > + /// is dropped. > + pub unsafe fn new(ptr: *mut <T as CloneToForeign>::Foreign) -> Self { > + OwnedPointer { ptr } > + } > + > + /// Safely create an `OwnedPointer` from one that has the same > + /// freeing function. > + pub fn from<U>(x: OwnedPointer<U>) -> Self > + where > + U: CloneToForeign<Foreign = <T as CloneToForeign>::Foreign> + > ?Sized, > + { > + unsafe { > + // SAFETY: the pointer type and free function are the same, > + // only the type changes > + OwnedPointer::new(x.into_inner()) > + } > + } > + > + /// Safely convert an `OwnedPointer` into one that has the same > + /// freeing function. > + pub fn into<U>(self) -> OwnedPointer<U> > + where > + U: CloneToForeign<Foreign = <T as CloneToForeign>::Foreign>, > + { > + OwnedPointer::from(self) > + } > + > + /// Return the pointer that is stored in the `OwnedPointer`. The > + /// pointer is valid for as long as the `OwnedPointer` itself. > + pub fn as_ptr(&self) -> *const <T as CloneToForeign>::Foreign { > + self.ptr > + } > + > + pub fn as_mut_ptr(&self) -> *mut <T as CloneToForeign>::Foreign { > + self.ptr > + } > + > + /// Return the pointer that is stored in the `OwnedPointer`, > + /// consuming the `OwnedPointer` but not freeing the pointer. > + pub fn into_inner(mut self) -> *mut <T as CloneToForeign>::Foreign { > + let result = mem::replace(&mut self.ptr, ptr::null_mut()); > + mem::forget(self); > + result > + } > +} > + > +impl<T: FromForeign + ?Sized> OwnedPointer<T> { > + /// Convert a C datum to a native Rust object, taking ownership of > + /// the pointer or Rust object (same as `from_glib_full` in `glib-rs`) > + pub fn into_native(self) -> T { > + // SAFETY: the pointer was passed to the unsafe constructor > OwnedPointer::new > + unsafe { T::from_foreign(self.into_inner()) } > + } > +} > + > +impl<T: CloneToForeign + ?Sized> Debug for OwnedPointer<T> { > + fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { > + let name = std::any::type_name::<T>(); > + let name = format!("OwnedPointer<{}>", name); > + f.debug_tuple(&name).field(&self.as_ptr()).finish() > + } > +} > + > +impl<T: CloneToForeign + Default + ?Sized> Default for OwnedPointer<T> { > + fn default() -> Self { > + <T as Default>::default().clone_to_foreign() > + } > +} > + > +impl<T: CloneToForeign + ?Sized> Drop for OwnedPointer<T> { > + fn drop(&mut self) { > + let p = mem::replace(&mut self.ptr, ptr::null_mut()); > + // SAFETY: the pointer was passed to the unsafe constructor > OwnedPointer::new > + unsafe { T::free_foreign(p) } > + } > +} > diff --git a/qemu/src/util/mod.rs b/qemu/src/util/mod.rs > new file mode 100644 > index 0000000..be00d0c > --- /dev/null > +++ b/qemu/src/util/mod.rs > @@ -0,0 +1 @@ > +pub mod foreign; > -- > 2.45.2 > > >
Hi On Wed, Jul 3, 2024 at 4:48 PM Marc-André Lureau <marcandre.lureau@gmail.com> wrote: > Hi > > (adding Sebastian, one of the glib-rs developers in CC) > > On Mon, Jul 1, 2024 at 7:02 PM Paolo Bonzini <pbonzini@redhat.com> wrote: > >> The qemu::util::foreign module provides: >> >> - A trait for structs that can be converted to a C ("foreign") >> representation >> (CloneToForeign) >> >> - A trait for structs that can be built from a C ("foreign") >> representation >> (FromForeign), and the utility IntoNative that can be used with less >> typing >> (similar to the standard library's From and Into pair) >> >> - Automatic implementations of the above traits for Option<>, supporting >> NULL >> pointers >> >> - A wrapper for a pointer that automatically frees the contained data. If >> a struct XYZ implements CloneToForeign, you can build an >> OwnedPointer<XYZ> >> and it will free the contents automatically unless you retrieve it with >> owned_ptr.into_inner() >> > > [...] Also, you don't offer the equivalent of "to_glib_none" which uses a > temporary stash and is quite useful, as a majority of functions don't take > ownership. > I realize this may be somewhat offered by OwnedPointer (although not mentioned explicitly in the doc). Its usage seems unsafe, as you have to handle the foreign pointer lifetime manually...
[warning: long email] On Wed, Jul 3, 2024 at 2:48 PM Marc-André Lureau <marcandre.lureau@gmail.com> wrote: > > Hi > > (adding Sebastian, one of the glib-rs developers in CC) > > On Mon, Jul 1, 2024 at 7:02 PM Paolo Bonzini <pbonzini@redhat.com> wrote: >> >> The qemu::util::foreign module provides: >> >> - A trait for structs that can be converted to a C ("foreign") representation >> - A trait for structs that can be built from a C ("foreign") representation >> - A wrapper for a pointer that automatically frees the contained data. > > You worry about technical debt, and I do too. Here you introduce quite different traits than what glib-rs offers. > We already touched this subject 2y ago, my opinion didn't change much > (https://patchew.org/QEMU/20210907121943.3498701-1-marcandre.lureau@redhat.com/20210907121943.3498701-13-marcandre.lureau@redhat.com/) Hi Marc-André, first of all thanks for reviewing and, probably, sorry for not going more into detail as to why I reinvented this particular wheel. Like two years ago, I find the full/none nomenclature very confusing and there is no trace of them in QEMU, but I could get over that easily. But also, I think the two have different focus and design. All I wanted was a place to put common code to convert Rust objects (mostly CStr/String/str and Error) to and from C representations (char * and QEMU's Error*), to avoid rewriting it over and over. So I wanted the traits to be as simple as possible, and I didn't like that glib puts to_glib_none and to_glib_full into the same trait, though I understand why it does so. I also didn't like that to_glib_full() is an express ticket to a memory leak. :) OwnedPointer<> and free_foreign solve all the qualms I had with glib-rs: - OwnedPointer<> takes care of freeing at the cost of extra verbosity for ".as_ptr()". - because clone_to_foreign() does not leak, I don't need to think about to_glib_none() from the get go. This has a ripple effect on other parts of the design. For example, because glib has to_glib_none(), its basic design is to never allocate malloc-ed memory unless ownership of the pointer is passed to C. glib-rs's to_glib_none can and will "clone to Rust-managed memory" if needed. Instead, because I have free_foreign(), I can malloc freely. In the future we could add something like Stash/to_glib_none(), but it would be very different and with performance properties enforced by the type system. It would "create a pointer into *existing* Rust-managed memory" cheaply but, every time a cheap borrow is not possible, users will have to call clone_to_foreign() explicitly. In other words there will be a difference between OwnedPointer<> as the result of an expensive operation, and [the equivalent of glib-rs] Stash as the result of a cheap operation. This is unlike to_glib_none() which is e.g. cheap for CStr but not for String. More on this below. > Also, you don't offer the equivalent of "to_glib_none" which uses a temporary stash and is quite useful, as a majority of functions don't take ownership. Right, in part you have already answered that: OwnedPointer handles freeing in the same way as g_autofree, so you can use it as a to_glib_none() replacement. If you need to pass the pointer and forget about it, you can use clone_to_foreign_ptr(). If you need something temporary that needs to be freed before returning, you use clone_to_foreign() and as_ptr(). This is not any more unsafe than to_glib_none(). If you pass a temporary pointer (x.clone_to_foreign().as_ptr()) where C code expects full ownership, you get a dangling pointer either way. It is less leak-happy than to_glib_full() though. Regarding how to obtain a pointer cheaply, there are two use cases: - borrow: fn borrow_foreign<'a>(&'a self) -> Stash<'a, <Self as CloneToForeign>::Foreign, &'a Self>; Useful for slices and elementary integer types - including cases where C parameters are passed by reference - and also for CStr. However right now my main user is strings and it doesn't buy much there. Because String and &str are not null-terminated, a "borrow" would allocate memory, as in to_glib_none(), which is pretty weird. At this point, I find it better to have the more explicit clone_to_foreign() that tells you how expensive it is. - consumption, which does not exist in glib-rs: fn into_foreign(self) -> Stash<'static, <Self as CloneToForeign>::Foreign, Self>; This works for types that, like String, deref to a stable address even when moved---this way we don't need to deal with pinning. And it is in fact especially nice for the String case and therefore for anything that uses format!(), because it could be implemented as s.reserve_exact(1); s.push('\0'); The two would be separate traits. There was one case in which I could have used into_foreign(), namely the call to error_setg_internal() in patch 5. But I am not sure if this will be more common than functions taking &str, so I left it for later. > Because much of our code is using GLib types and API style, I > think we should strive for something that is close (if not just the > same) to what glib-rs offers. It's already hard enough to handle > one binding concept, having 2 will only make the matter worse. > Consider a type like GHashTable<GUuid, QOM>, it will be very > annoying to deal with if we have different bindings traits and > implementations and we will likely end up duplicating glib-rs > effort. I'm not even sure if it's even true that much QEMU code is using GLib types. We use quite a bit of GLib, and took a lot of inspiration from it because GLib is good. But it's notable that we use very little GLib *across modules*. There are only a handful of occurrences to GList/GList/GHashTable in function arguments in include/, for all of QEMU, and all of them are very rarely used functions. Why? Possibly because, even where we take inspiration (the Error** idiom, QOM casts), the implementation ends up completely different for reasons that aren't just NIH. In any case I don't see us using glib-rs much, if at all; uses of hash tables in Rust code can just use HashMap<>. Yes, there is no equivalent of the expensive, O(n) conversion self.some_g_hash_table.to_glib_none(); but that's a feature, not a limitation. Also, anyway we couldn't extend the glib traits a lot, and glib-rs only supports HashMap<String, String>. > As for naming & consistency, glib-rs settled on something clearer imho: > > from_glib_full > from_glib_none > to_glib_full > to_glib_none > > vs > > from_foreign > cloned_from_foreign > clone_to_foreign > /nothing/ These last two are respectively clone_to_foreign_ptr and clone_to_foreign. I am not wed to my names but I'm not sure the glib ones are clearer. What I want from names is clarity on who allocates or frees; I do that with "clone", which is a common Rust term, to indicate that the C side still needs to be freed after the call. I find it confusing that from_glib_full() requires no freeing but to_glib_full() requires freeing, for example - I understand _why_ but still "full" and "none" are GIR terms that are not really familiar to most Rust or QEMU developers. And secondarily, clarity on what is cheap and what isn't. I don't have that functionality at all for now, but the nomenclature is easily extended to borrow_foreign() and into_foreign(). Talking about Rust terms, glib-rs lacks "into" which is the standard Rust way to do consuming conversions, while my code has into_native(). All in all, I think there are some significant differences between glib-rs and this implementation that justify the effort. Paolo
Hi Paolo, as you asked me at KVM Forum to have a look at this, I'm going through it now. Am 01.07.2024 um 16:58 hat Paolo Bonzini geschrieben: > The qemu::util::foreign module provides: > > - A trait for structs that can be converted to a C ("foreign") representation > (CloneToForeign) > > - A trait for structs that can be built from a C ("foreign") representation > (FromForeign), and the utility IntoNative that can be used with less typing > (similar to the standard library's From and Into pair) It makes sense to me that we'll need something to convert data and that this usually means creating a new instance, i.e. cloning. However, while it's obvious that this is similar to From/Into, the part I'm missing here is what's different from it. In other words, couldn't we just implement the normal From trait between FFI types and the native equivalent? > - Automatic implementations of the above traits for Option<>, supporting NULL > pointers This is nice. > - A wrapper for a pointer that automatically frees the contained data. If > a struct XYZ implements CloneToForeign, you can build an OwnedPointer<XYZ> > and it will free the contents automatically unless you retrieve it with > owned_ptr.into_inner() Something here feels off to me. At first, I thought it might be only about naming. This is not about owning the pointer (which you probably do anyway), but that the pointer owns the object it points to. This concept has in fact a name in Rust: It's a Box. The major difference compared to Box is that we're using a different allocator. Not sure if the allocator APIs would actually be viable, but they're not stable anyway - but let's at least name this thing in way that draws the obvious parallel. Maybe ForeignBox. But the other thing that doesn't feel quite right is how this is coupled with CloneToForeign. Freeing is different from cloning, and it really relates to the foreign type itself, and not to the one that can be cloned into a foreign type. Bringing both together, what a Box doesn't usually have is a function pointer for freeing. We probably don't need it here either, almost everything needs g_free(). There is a different free_foreign() implementation for Error, but arguably this could be changed: bindings::Error should then implement Drop for the inner value (with an error_free_inner() which needs to be exported separately from C first), and then ForeignBox can just drop the Error object and g_free() the pointer itself like it would do with any other value. (Your implementation actually calls free() instead of g_free(). We generally try to avoid that in our C code, so we should probably avoid it in Rust, too.) Kevin
On Mon, 1 Jul 2024 at 11:02, Paolo Bonzini <pbonzini@redhat.com> wrote: > +/// A type for which there is a canonical representation as a C datum. > +pub trait CloneToForeign { > + /// The representation of `Self` as a C datum. Typically a > + /// `struct`, though there are exceptions for example `c_char` > + /// for strings, since C strings are of `char *` type). > + type Foreign; > + > + /// Free the C datum pointed to by `p`. > + /// > + /// # Safety > + /// > + /// `p` must be `NULL` or point to valid data. > + unsafe fn free_foreign(p: *mut Self::Foreign); > + > + /// Convert a native Rust object to a foreign C struct, copying > + /// everything pointed to by `self` (same as `to_glib_full` in `glib-rs`) > + fn clone_to_foreign(&self) -> OwnedPointer<Self>; I expected the return type to be OwnedPointer<Self::Foreign>. Is this a typo? Also, why is the return type OwnedPointer<T> instead of just T? I guess it's common to want a heap-allocated value here so you decided to hard-code OwnedPointer<>, but I'm not sure.
On Fri, Sep 27, 2024 at 6:10 PM Kevin Wolf <kwolf@redhat.com> wrote: > Am 01.07.2024 um 16:58 hat Paolo Bonzini geschrieben: > > The qemu::util::foreign module provides: > > > > - A trait for structs that can be converted to a C ("foreign") representation > > (CloneToForeign) > > > > - A trait for structs that can be built from a C ("foreign") representation > > (FromForeign), and the utility IntoNative that can be used with less typing > > (similar to the standard library's From and Into pair) > > It makes sense to me that we'll need something to convert data and that > this usually means creating a new instance, i.e. cloning. However, while > it's obvious that this is similar to From/Into, the part I'm missing > here is what's different from it. > > In other words, couldn't we just implement the normal From trait between > FFI types and the native equivalent? In general yes. Using new traits has two advantages (all IMHO of course). First, it makes it possible to tie the implementation to the availability of a freeing function; second, it makes it possible to define this always, whereas From is limited by the orphan rule (you cannot provide implementations of a trait for a struct unless you are the crate that defines either the struct or the trait). > > - Automatic implementations of the above traits for Option<>, supporting NULL > > pointers > > This is nice. ... for example, you can't have such a blanket implementation "impl<T, U: From<T>> From<T> for Option<U> {}". > > - A wrapper for a pointer that automatically frees the contained data. If > > a struct XYZ implements CloneToForeign, you can build an OwnedPointer<XYZ> > > and it will free the contents automatically unless you retrieve it with > > owned_ptr.into_inner() > > Something here feels off to me. > > At first, I thought it might be only about naming. This is not about > owning the pointer (which you probably do anyway), but that the pointer > owns the object it points to. This concept has in fact a name in Rust: > It's a Box. > > The major difference compared to Box is that we're using a different > allocator. Not sure if the allocator APIs would actually be viable, but > they're not stable anyway - but let's at least name this thing in way > that draws the obvious parallel. Maybe ForeignBox. > But the other thing that doesn't feel quite right is how this is coupled > with CloneToForeign. Freeing is different from cloning, and it really > relates to the foreign type itself, and not to the one that can be > cloned into a foreign type. I am not 100% convinced of the ForeignBox name but I see your point. You'd prefer to have "impl FreeForeign for c_char" and "impl CloneToForeign for str", where it's the CloneToForeign::Foreign associated type (in this case c_char) that must implement FreeForeign. Also, clone_to_foreign() for a str would return an OwnedPointer<c_char>, not an OwnedPointer<str>. Never be too optimistic about Rust, but that should be doable and I agree it is better. > Bringing both together, what a Box doesn't usually have is a function > pointer for freeing. We probably don't need it here either, almost > everything needs g_free(). Is that true? For example we could have qobject_unref, or qapi_free_* as implementations of FreeForeign. > There is a different free_foreign() > implementation for Error, but arguably this could be changed: > bindings::Error should then implement Drop for the inner value (with an > error_free_inner() which needs to be exported separately from C first), > and then ForeignBox can just drop the Error object and g_free() the > pointer itself like it would do with any other value. I think that's a bit too magical. Making Error always represent an owned object seems a bit risky; instead I'm treating the bindings::XYZ structs as very C-like objects, that are usually managed through either OwnedPointer<...> (e.g. for a parameter) or *mut XYZ (for a return value). Thanks for looking at this. This kind of review is something that we'll have to do a lot and it's better to have some early experience of what they look like, independent of whether this code ultimately will or will not be part of QEMU. Paolo
On Fri, Sep 27, 2024 at 9:36 PM Stefan Hajnoczi <stefanha@gmail.com> wrote: > > On Mon, 1 Jul 2024 at 11:02, Paolo Bonzini <pbonzini@redhat.com> wrote: > > +/// A type for which there is a canonical representation as a C datum. > > +pub trait CloneToForeign { > > + /// The representation of `Self` as a C datum. Typically a > > + /// `struct`, though there are exceptions for example `c_char` > > + /// for strings, since C strings are of `char *` type). > > + type Foreign; > > + > > + /// Free the C datum pointed to by `p`. > > + /// > > + /// # Safety > > + /// > > + /// `p` must be `NULL` or point to valid data. > > + unsafe fn free_foreign(p: *mut Self::Foreign); > > + > > + /// Convert a native Rust object to a foreign C struct, copying > > + /// everything pointed to by `self` (same as `to_glib_full` in `glib-rs`) > > + fn clone_to_foreign(&self) -> OwnedPointer<Self>; > > I expected the return type to be OwnedPointer<Self::Foreign>. Is this a typo? Kevin noticed the same. I'd have to check if I am missing something but it seems to be just tunnel vision. > Also, why is the return type OwnedPointer<T> instead of just T? I > guess it's common to want a heap-allocated value here so you decided > to hard-code OwnedPointer<>, but I'm not sure. Because at this point it looks like the most important conversion is to have a clone (meaning its lifetime is independent of the copy) and a value that is not movable (moves can be unpredictable and then usage in C is messy). The main example is creating a QEMU Error from something that implements the Rust std::error::Error trait. Actually I have written extra code that _borrows_ into a foreign object (so a CStr can be used as a *const c_char for as long as the CStr is alive), but I didn't really have a user and wrote it only to validate the concept. Paolo
diff --git a/qemu/src/lib.rs b/qemu/src/lib.rs index fce21d7..c48edcf 100644 --- a/qemu/src/lib.rs +++ b/qemu/src/lib.rs @@ -2,3 +2,9 @@ #![allow(dead_code)] pub mod bindings; + +pub mod util; +pub use util::foreign::CloneToForeign; +pub use util::foreign::FromForeign; +pub use util::foreign::IntoNative; +pub use util::foreign::OwnedPointer; diff --git a/qemu/src/util/foreign.rs b/qemu/src/util/foreign.rs new file mode 100644 index 0000000..a591925 --- /dev/null +++ b/qemu/src/util/foreign.rs @@ -0,0 +1,247 @@ +// TODO: change to use .cast() etc. +#![allow(clippy::ptr_as_ptr)] + +/// Traits to map between C structs and native Rust types. +/// Similar to glib-rs but a bit simpler and possibly more +/// idiomatic. +use std::borrow::Cow; +use std::fmt; +use std::fmt::Debug; +use std::mem; +use std::ptr; + +/// A type for which there is a canonical representation as a C datum. +pub trait CloneToForeign { + /// The representation of `Self` as a C datum. Typically a + /// `struct`, though there are exceptions for example `c_char` + /// for strings, since C strings are of `char *` type). + type Foreign; + + /// Free the C datum pointed to by `p`. + /// + /// # Safety + /// + /// `p` must be `NULL` or point to valid data. + unsafe fn free_foreign(p: *mut Self::Foreign); + + /// Convert a native Rust object to a foreign C struct, copying + /// everything pointed to by `self` (same as `to_glib_full` in `glib-rs`) + fn clone_to_foreign(&self) -> OwnedPointer<Self>; + + /// Convert a native Rust object to a foreign C pointer, copying + /// everything pointed to by `self`. The returned pointer must + /// be freed with the `free_foreign` associated function. + fn clone_to_foreign_ptr(&self) -> *mut Self::Foreign { + self.clone_to_foreign().into_inner() + } +} + +impl<T> CloneToForeign for Option<T> +where + T: CloneToForeign, +{ + type Foreign = <T as CloneToForeign>::Foreign; + + unsafe fn free_foreign(x: *mut Self::Foreign) { + T::free_foreign(x) + } + + fn clone_to_foreign(&self) -> OwnedPointer<Self> { + // Same as the underlying implementation, but also convert `None` + // to a `NULL` pointer. + self.as_ref() + .map(CloneToForeign::clone_to_foreign) + .map(OwnedPointer::into) + .unwrap_or_default() + } +} + +impl<T> FromForeign for Option<T> +where + T: FromForeign, +{ + unsafe fn cloned_from_foreign(p: *const Self::Foreign) -> Self { + // Same as the underlying implementation, but also accept a `NULL` pointer. + if p.is_null() { + None + } else { + Some(T::cloned_from_foreign(p)) + } + } +} + +impl<T> CloneToForeign for Box<T> +where + T: CloneToForeign, +{ + type Foreign = <T as CloneToForeign>::Foreign; + + unsafe fn free_foreign(x: *mut Self::Foreign) { + T::free_foreign(x) + } + + fn clone_to_foreign(&self) -> OwnedPointer<Self> { + self.as_ref().clone_to_foreign().into() + } +} + +impl<T> FromForeign for Box<T> +where + T: FromForeign, +{ + unsafe fn cloned_from_foreign(p: *const Self::Foreign) -> Self { + Box::new(T::cloned_from_foreign(p)) + } +} + +impl<'a, B> CloneToForeign for Cow<'a, B> + where B: 'a + ToOwned + ?Sized + CloneToForeign, +{ + type Foreign = B::Foreign; + + unsafe fn free_foreign(ptr: *mut B::Foreign) { + B::free_foreign(ptr); + } + + fn clone_to_foreign(&self) -> OwnedPointer<Self> { + self.as_ref().clone_to_foreign().into() + } +} + +/// Convert a C datum into a native Rust object, taking ownership of +/// the C datum. You should not need to implement this trait +/// as long as Rust types implement `FromForeign`. +pub trait IntoNative<T> { + /// Convert a C datum to a native Rust object, taking ownership of + /// the pointer or Rust object (same as `from_glib_full` in `glib-rs`) + /// + /// # Safety + /// + /// `p` must point to valid data, or can be `NULL` if Self is an + /// `Option` type. It becomes invalid after the function returns. + unsafe fn into_native(self) -> T; +} + +impl<T, U> IntoNative<U> for *mut T +where + U: FromForeign<Foreign = T>, +{ + unsafe fn into_native(self) -> U { + U::from_foreign(self) + } +} + +/// A type which can be constructed from a canonical representation as a +/// C datum. +pub trait FromForeign: CloneToForeign + Sized { + /// Convert a C datum to a native Rust object, copying everything + /// pointed to by `p` (same as `from_glib_none` in `glib-rs`) + /// + /// # Safety + /// + /// `p` must point to valid data, or can be `NULL` is `Self` is an + /// `Option` type. + unsafe fn cloned_from_foreign(p: *const Self::Foreign) -> Self; + + /// Convert a C datum to a native Rust object, taking ownership of + /// the pointer or Rust object (same as `from_glib_full` in `glib-rs`) + /// + /// The default implementation calls `cloned_from_foreign` and frees `p`. + /// + /// # Safety + /// + /// `p` must point to valid data, or can be `NULL` is `Self` is an + /// `Option` type. `p` becomes invalid after the function returns. + unsafe fn from_foreign(p: *mut Self::Foreign) -> Self { + let result = Self::cloned_from_foreign(p); + Self::free_foreign(p); + result + } +} + +pub struct OwnedPointer<T: CloneToForeign + ?Sized> { + ptr: *mut <T as CloneToForeign>::Foreign, +} + +impl<T: CloneToForeign + ?Sized> OwnedPointer<T> { + /// Return a new `OwnedPointer` that wraps the pointer `ptr`. + /// + /// # Safety + /// + /// The pointer must be valid and live until the returned `OwnedPointer` + /// is dropped. + pub unsafe fn new(ptr: *mut <T as CloneToForeign>::Foreign) -> Self { + OwnedPointer { ptr } + } + + /// Safely create an `OwnedPointer` from one that has the same + /// freeing function. + pub fn from<U>(x: OwnedPointer<U>) -> Self + where + U: CloneToForeign<Foreign = <T as CloneToForeign>::Foreign> + ?Sized, + { + unsafe { + // SAFETY: the pointer type and free function are the same, + // only the type changes + OwnedPointer::new(x.into_inner()) + } + } + + /// Safely convert an `OwnedPointer` into one that has the same + /// freeing function. + pub fn into<U>(self) -> OwnedPointer<U> + where + U: CloneToForeign<Foreign = <T as CloneToForeign>::Foreign>, + { + OwnedPointer::from(self) + } + + /// Return the pointer that is stored in the `OwnedPointer`. The + /// pointer is valid for as long as the `OwnedPointer` itself. + pub fn as_ptr(&self) -> *const <T as CloneToForeign>::Foreign { + self.ptr + } + + pub fn as_mut_ptr(&self) -> *mut <T as CloneToForeign>::Foreign { + self.ptr + } + + /// Return the pointer that is stored in the `OwnedPointer`, + /// consuming the `OwnedPointer` but not freeing the pointer. + pub fn into_inner(mut self) -> *mut <T as CloneToForeign>::Foreign { + let result = mem::replace(&mut self.ptr, ptr::null_mut()); + mem::forget(self); + result + } +} + +impl<T: FromForeign + ?Sized> OwnedPointer<T> { + /// Convert a C datum to a native Rust object, taking ownership of + /// the pointer or Rust object (same as `from_glib_full` in `glib-rs`) + pub fn into_native(self) -> T { + // SAFETY: the pointer was passed to the unsafe constructor OwnedPointer::new + unsafe { T::from_foreign(self.into_inner()) } + } +} + +impl<T: CloneToForeign + ?Sized> Debug for OwnedPointer<T> { + fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { + let name = std::any::type_name::<T>(); + let name = format!("OwnedPointer<{}>", name); + f.debug_tuple(&name).field(&self.as_ptr()).finish() + } +} + +impl<T: CloneToForeign + Default + ?Sized> Default for OwnedPointer<T> { + fn default() -> Self { + <T as Default>::default().clone_to_foreign() + } +} + +impl<T: CloneToForeign + ?Sized> Drop for OwnedPointer<T> { + fn drop(&mut self) { + let p = mem::replace(&mut self.ptr, ptr::null_mut()); + // SAFETY: the pointer was passed to the unsafe constructor OwnedPointer::new + unsafe { T::free_foreign(p) } + } +} diff --git a/qemu/src/util/mod.rs b/qemu/src/util/mod.rs new file mode 100644 index 0000000..be00d0c --- /dev/null +++ b/qemu/src/util/mod.rs @@ -0,0 +1 @@ +pub mod foreign;
The qemu::util::foreign module provides: - A trait for structs that can be converted to a C ("foreign") representation (CloneToForeign) - A trait for structs that can be built from a C ("foreign") representation (FromForeign), and the utility IntoNative that can be used with less typing (similar to the standard library's From and Into pair) - Automatic implementations of the above traits for Option<>, supporting NULL pointers - A wrapper for a pointer that automatically frees the contained data. If a struct XYZ implements CloneToForeign, you can build an OwnedPointer<XYZ> and it will free the contents automatically unless you retrieve it with owned_ptr.into_inner() Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- qemu/src/lib.rs | 6 + qemu/src/util/foreign.rs | 247 +++++++++++++++++++++++++++++++++++++++ qemu/src/util/mod.rs | 1 + 3 files changed, 254 insertions(+) create mode 100644 qemu/src/util/foreign.rs create mode 100644 qemu/src/util/mod.rs