mbox series

[00/14] rust: example of bindings code for Rust in QEMU

Message ID 20240701145853.1394967-1-pbonzini@redhat.com (mailing list archive)
Headers show
Series rust: example of bindings code for Rust in QEMU | expand

Message

Paolo Bonzini July 1, 2024, 2:58 p.m. UTC
Hi all,

this is an example of what some bindings code for QEMU would look like.
Note that some parts of this code are barely expected to compile, and
are probably full of bugs, but still should look like finished code
(in fact, because they compile, the type system parts should be okay;
though a conditional "should" is required).

This code is not integrated in the QEMU source tree, because again it
is just a example of what kind of Rust code would exist to handle the
C<->Rust FFI.  The translation of a handful of C structs and function
prototypes is done by hand rather than with bindgen, in particular.

The patches are organized as follows:

Patches 1-2 introduce the skeleton for the rest of the code and are
not particularly interesting, since that skeleton would be provided
by the patches that introduce Rust usage in QEMU.


Patches 3-4 define common code to handle conversion of data structures
between Rust and C.  I couldn't find an existing crate to do this,
though there are similar concepts in glib-rs.  The crate defines
variants of Clone, From and Into that convert a Rust struct to a
C pointer, for example:

   let s = "abc".clone_to_foreign();        // mallocs a NULL-terminated copy
   let p = s.as_ptr();
   drop(s);                                 // p is freed now

or the other way round:

   let s = String::cloned_from_foreign(p);  // p is a *const c_char

   let t: String = p.into_native();         // p is a *mut c_char and is free()d

The second patch defines what you need to convert strings from and to
C.  It's used by tests but also by a couple of QOM functions implemented
below; it lets you forget the differences between String, &str, CString,
&CStr and Cow<'_, str>.

This is the only complete part of the skeleton (in fact I wrote it
a couple years ago), and it comes with testcases that pass in both
"#[test]" and "doctest" (i.e. snippets integrated in the documentation
comments) styles.


Patch 5 (mostly complete except for &error_abort support and missing
tests) allows using the above functionality for the QEMU Error*.


Patches 6-8 provide an example of how to use QOM from Rust.  They
define all the functionality to:

- handle reference counting

- handle typesafe casting (i.e. code doesn't compile if an upcast
  is invalid, or if a downcast cannot possibly succeed).

- faking inheritance since Rust doesn't support it.

While I started with some implementation ideas in glib-rs, this has
evolved quite a bit and doesn't really look much like glib-rs anymore.
I provided a handful of functions to wrap C APIs.  For example:

   object_property_add_child(obj, "foo", object_new(TYPE_BLAH))

becomes

   obj.property_add_child("foo", Blah::new());

... except that the former leaks a reference and the latter does not
(gotcha :)).  The idea is that these functions would be written, in
general, as soon as Rust code uses them, but I included a few as an
example of using the various abstractions for typecasting, reference
counting, and converting from/to C data strcutures.

A large part of this code is of the "it compiles and it looks nice, it
must be perfect" kind.  Rust is _very_ picky on type safety, in case you
didn't know.  One day we're all going to be programming in Haskell,
without even noticing it.


Patches 9-10 deal with how to define new subclasses in Rust.  They are
a lot less polished and less ready.  There is probably a lot of polish
that could be applied to make the code look nicer, but I guess there is
always time to make the code look nicer until the details are sorted out.

The things that I considered here are:

- splitting configuration from runtime state.  Configuration is
  immutable throughout the lifetime of the object, and holds the
  value of user-configured properties.  State will typically be a
  Mutex<> or RefCell<> since the QOM bindings make wide use of interior
  mutability---almost all functions are declared as &self---following
  the lead of glib-rs.

- automatic generation of instance_init and instance_finalize.  For
  this I opted to introduce a new initialization step that is tailored to
  Rust, called instance_mem_init(), that is executed before the default
  value of properties is set.  This makes sure that user code only ever
  sees valid values for the whole struct, including after a downcast;
  it matters because some Rust types (notably references) cannot be
  initialized to a zero-bytes pattern.  The default implementation of
  instance_mem_init is simply a memset(), since the callback replaces the

    memset(obj, 0, type->instance_size);

  line in object_initialize_with_type().  I have prototyped this change
  in QEMU already.

- generation of C vtables from safe code that is written in Rust. I
  chose a trait that only contains associated constants as a way to
  access the vtables generically.  For example:

    impl ObjectImpl for TestDevice {
        const UNPARENT: Option<fn(&TestDevice)> = Some(TestDevice::unparent);
    }

    impl DeviceImpl for TestDevice {
        const REALIZE: Option<fn(&TestDevice) -> Result<()>> = Some(TestDevice::realize);
    }

  This works and it seems like a style that (in the future) we could apply
  macro or even procedural macro magic to.

- generation of qdev property tables.  While only boolean properties are
  implemented here, one idea that I experimented with, is that the
  default value of properties is derived from the ConstDefault trait.
  (ConstDefault is provided by the const_default external crate).  Again,
  this is material for future conversion to procedural macros.

I absolutely didn't look at vmstate, but it shouldn't be too different
from properties, at least for the common cases.


Patches 11-14 finally are an example of the changes that are needed
to respect a minimum supported Rust version consistent with what is in
Debian Bullseye.  It's not too bad, especially since the current version
of the QOM bindings does not require generic associated types anymore.


Why am I posting this?  Because this kind of glue code is the ultimate
source of technical debt.  It is the thing that we should be scared of
when introducing a new language in QEMU.  It makes it harder to change C
code, and it is hard to change once Rust code becomes more widespread.
If we think a C API is not fully baked, we probably shouldn't write
Rust code that uses it (including bindings code).  If we think a Rust
API is not fully baked, we probably shouldn't add too much Rust code
that uses it.

We should have an idea of what this glue code looks like, in order to make
an informed choice.  If we think we're not comfortable with reviewing it,
well, we should be ready to say so and stick with C until we are.

The alternative could be to use Rust without this kind of binding.  I
think it's a bad idea.  It removes many of the advantages of Rust 
(which are exemplified by the above object_property_add_child one-liner),
and it also introduces _new_ kinds of memory errors, since Rust has
its own undefined behavior conditions that are not there in C/C++.
For example:

    impl Struct {
        pub fn f(&self) {
            call_some_c_function(Self::g, self as *const Self as *mut _);
        }

        fn do_g(&mut self) {
            ...
        }

        extern "C" fn g(ptr: *mut Self) {
            unsafe { &mut *ptr }.do_g();
        }
    }

is invalid because a &mut reference (exclusive) is alive at the same time
as a & reference (shared).  It is left as an exercise to the reader to
figure out all the possible ways in which we can shoot our own feet,
considering the pervasive use of callbacks in QEMU.


With respect to callbacks, that's something that is missing in this
prototype.  Fortunately, that's also something that will be tackled very
soon if the PL011 example is merged, because memory regions and character
devices both introduce them.  Also, as I understand it, Rust code using
callbacks is not particularly nice anyway, though it is of course doable.
Instead, this exercise are about being able to write *nice* Rust code,
with all the advantages provided by the language, and the cost of
writing/maintaining the glue code that makes it possible.  I expect
that we'll use a technique similar to the extern_c crate (it's 20 lines of
code; https://docs.rs/crate/extern-c/) to convert something that implements
Fn(), including a member function, into an extern "C" function.


Anyhow: I think we can do it, otherwise I would not have written 2000 lines
of code (some of it two or three times).  But if people are now scared and
think we shouldn't, well, that's also a success of its own kind.

Paolo


Paolo Bonzini (14):
  add skeleton
  set expectations
  rust: define traits and pointer wrappers to convert from/to C
    representations
  rust: add tests for util::foreign
  rust: define wrappers for Error
  rust: define wrappers for basic QOM concepts
  rust: define wrappers for methods of the QOM Object class
  rust: define wrappers for methods of the QOM Device class
  rust: add idiomatic bindings to define Object subclasses
  rust: add idiomatic bindings to define Device subclasses
  rust: replace std::ffi::c_char with libc::c_char
  rust: replace c"" literals with cstr crate
  rust: introduce alternative to offset_of!
  rust: use version of toml_edit that does not require new Rust

Comments

Pierrick Bouvier July 4, 2024, 7:26 p.m. UTC | #1
Hi Paolo,

thanks for this series!
Some comments below.

On 7/1/24 07:58, Paolo Bonzini wrote:
> Hi all,
> 
> this is an example of what some bindings code for QEMU would look like.
> Note that some parts of this code are barely expected to compile, and
> are probably full of bugs, but still should look like finished code
> (in fact, because they compile, the type system parts should be okay;
> though a conditional "should" is required).
> 
> This code is not integrated in the QEMU source tree, because again it
> is just a example of what kind of Rust code would exist to handle the
> C<->Rust FFI.  The translation of a handful of C structs and function
> prototypes is done by hand rather than with bindgen, in particular.
> 
> The patches are organized as follows:
> 
> Patches 1-2 introduce the skeleton for the rest of the code and are
> not particularly interesting, since that skeleton would be provided
> by the patches that introduce Rust usage in QEMU.
> 
> 
> Patches 3-4 define common code to handle conversion of data structures
> between Rust and C.  I couldn't find an existing crate to do this,
> though there are similar concepts in glib-rs.  The crate defines
> variants of Clone, From and Into that convert a Rust struct to a
> C pointer, for example:
> 
>     let s = "abc".clone_to_foreign();        // mallocs a NULL-terminated copy
>     let p = s.as_ptr();
>     drop(s);                                 // p is freed now
> 
> or the other way round:
> 
>     let s = String::cloned_from_foreign(p);  // p is a *const c_char
> 
>     let t: String = p.into_native();         // p is a *mut c_char and is free()d
> 
> The second patch defines what you need to convert strings from and to
> C.  It's used by tests but also by a couple of QOM functions implemented
> below; it lets you forget the differences between String, &str, CString,
> &CStr and Cow<'_, str>.
> 
> This is the only complete part of the skeleton (in fact I wrote it
> a couple years ago), and it comes with testcases that pass in both
> "#[test]" and "doctest" (i.e. snippets integrated in the documentation
> comments) styles.
> 
> 
> Patch 5 (mostly complete except for &error_abort support and missing
> tests) allows using the above functionality for the QEMU Error*.
> 
> 
> Patches 6-8 provide an example of how to use QOM from Rust.  They
> define all the functionality to:
> 
> - handle reference counting
> 
> - handle typesafe casting (i.e. code doesn't compile if an upcast
>    is invalid, or if a downcast cannot possibly succeed).
> 
> - faking inheritance since Rust doesn't support it.
> 
> While I started with some implementation ideas in glib-rs, this has
> evolved quite a bit and doesn't really look much like glib-rs anymore.
> I provided a handful of functions to wrap C APIs.  For example:
> 
>     object_property_add_child(obj, "foo", object_new(TYPE_BLAH))
> 
> becomes
> 
>     obj.property_add_child("foo", Blah::new());
> 
> ... except that the former leaks a reference and the latter does not
> (gotcha :)).  The idea is that these functions would be written, in
> general, as soon as Rust code uses them, but I included a few as an
> example of using the various abstractions for typecasting, reference
> counting, and converting from/to C data strcutures.
> 
> A large part of this code is of the "it compiles and it looks nice, it
> must be perfect" kind.  Rust is _very_ picky on type safety, in case you
> didn't know.  One day we're all going to be programming in Haskell,
> without even noticing it.
> 
> 
> Patches 9-10 deal with how to define new subclasses in Rust.  They are
> a lot less polished and less ready.  There is probably a lot of polish
> that could be applied to make the code look nicer, but I guess there is
> always time to make the code look nicer until the details are sorted out.
> 
> The things that I considered here are:
> 
> - splitting configuration from runtime state.  Configuration is
>    immutable throughout the lifetime of the object, and holds the
>    value of user-configured properties.  State will typically be a
>    Mutex<> or RefCell<> since the QOM bindings make wide use of interior
>    mutability---almost all functions are declared as &self---following
>    the lead of glib-rs.
> 
> - automatic generation of instance_init and instance_finalize.  For
>    this I opted to introduce a new initialization step that is tailored to
>    Rust, called instance_mem_init(), that is executed before the default
>    value of properties is set.  This makes sure that user code only ever
>    sees valid values for the whole struct, including after a downcast;
>    it matters because some Rust types (notably references) cannot be
>    initialized to a zero-bytes pattern.  The default implementation of
>    instance_mem_init is simply a memset(), since the callback replaces the
> 
>      memset(obj, 0, type->instance_size);
> 
>    line in object_initialize_with_type().  I have prototyped this change
>    in QEMU already.
> 
> - generation of C vtables from safe code that is written in Rust. I
>    chose a trait that only contains associated constants as a way to
>    access the vtables generically.  For example:
> 
>      impl ObjectImpl for TestDevice {
>          const UNPARENT: Option<fn(&TestDevice)> = Some(TestDevice::unparent);
>      }
> 
>      impl DeviceImpl for TestDevice {
>          const REALIZE: Option<fn(&TestDevice) -> Result<()>> = Some(TestDevice::realize);
>      }
> 
>    This works and it seems like a style that (in the future) we could apply
>    macro or even procedural macro magic to.
> 
> - generation of qdev property tables.  While only boolean properties are
>    implemented here, one idea that I experimented with, is that the
>    default value of properties is derived from the ConstDefault trait.
>    (ConstDefault is provided by the const_default external crate).  Again,
>    this is material for future conversion to procedural macros.
> 
> I absolutely didn't look at vmstate, but it shouldn't be too different
> from properties, at least for the common cases.
> 
> 
> Patches 11-14 finally are an example of the changes that are needed
> to respect a minimum supported Rust version consistent with what is in
> Debian Bullseye.  It's not too bad, especially since the current version
> of the QOM bindings does not require generic associated types anymore.
> 
> 
> Why am I posting this?  Because this kind of glue code is the ultimate
> source of technical debt.  It is the thing that we should be scared of
> when introducing a new language in QEMU.  It makes it harder to change C
> code, and it is hard to change once Rust code becomes more widespread.
> If we think a C API is not fully baked, we probably shouldn't write
> Rust code that uses it (including bindings code).  If we think a Rust
> API is not fully baked, we probably shouldn't add too much Rust code
> that uses it.
> 

Shouldn't we focus more on how we want to write a QOM device in Rust 
instead, by making abstraction of existing C implementation first?
Once we have satisfying idea, we could evaluate how it maps to existing 
implementation, and which compromise should be made for efficiency.

It seems that the current approach is to stick to existing model, and 
derive Rust bindings from this.

I agree this glue can be a source of technical debt, but on the other 
side, it should be easy to refactor it, if we decided first what the 
"clean and idiomatic" Rust API should look like.

A compromise where you have to manually translate some structs, or copy 
memory to traverse languages borders at some point, could be worth the 
safety and expressiveness.

> We should have an idea of what this glue code looks like, in order to make
> an informed choice.  If we think we're not comfortable with reviewing it,
> well, we should be ready to say so and stick with C until we are.
>

While it is important that this glue code is maintainable and looks 
great, I don't think it should be the main reason to accept usage of a 
new language.

> The alternative could be to use Rust without this kind of binding.  I
> think it's a bad idea.  It removes many of the advantages of Rust
> (which are exemplified by the above object_property_add_child one-liner),
> and it also introduces _new_ kinds of memory errors, since Rust has
> its own undefined behavior conditions that are not there in C/C++.
> For example:
> 
>      impl Struct {
>          pub fn f(&self) {
>              call_some_c_function(Self::g, self as *const Self as *mut _);
>          }
> 
>          fn do_g(&mut self) {
>              ...
>          }
> 
>          extern "C" fn g(ptr: *mut Self) {
>              unsafe { &mut *ptr }.do_g();
>          }
>      }
> 
> is invalid because a &mut reference (exclusive) is alive at the same time
> as a & reference (shared).  It is left as an exercise to the reader to
> figure out all the possible ways in which we can shoot our own feet,
> considering the pervasive use of callbacks in QEMU.
> 

I agree with this point that you made on the original RFC series from 
Manos. Just calling qemu C API through unsafe code is not making the 
best possible usage of Rust.

> 
> With respect to callbacks, that's something that is missing in this
> prototype.  Fortunately, that's also something that will be tackled very
> soon if the PL011 example is merged, because memory regions and character
> devices both introduce them.  Also, as I understand it, Rust code using
> callbacks is not particularly nice anyway, though it is of course doable.
> Instead, this exercise are about being able to write *nice* Rust code,
> with all the advantages provided by the language, and the cost of
> writing/maintaining the glue code that makes it possible.  I expect
> that we'll use a technique similar to the extern_c crate (it's 20 lines of
> code; https://docs.rs/crate/extern-c/) to convert something that implements
> Fn(), including a member function, into an extern "C" function.
> 

It could be an interesting thing to explore. Is the best solution to use 
specific traits (no callbacks involved from Rust code), or should we 
have Rust closures that mimic this behavior.

We could have a specific trait with functions to handle various kind of 
events. And the glue code could be responsible to translate this to 
callbacks for the C side (and calling Rust code accordingly, eventually 
serializing this on a single thread to avoid any race issues).

> 
> Anyhow: I think we can do it, otherwise I would not have written 2000 lines
> of code (some of it two or three times).  But if people are now scared and
> think we shouldn't, well, that's also a success of its own kind.
> 
> Paolo
> 
> 
> Paolo Bonzini (14):
>    add skeleton
>    set expectations
>    rust: define traits and pointer wrappers to convert from/to C
>      representations
>    rust: add tests for util::foreign
>    rust: define wrappers for Error
>    rust: define wrappers for basic QOM concepts
>    rust: define wrappers for methods of the QOM Object class
>    rust: define wrappers for methods of the QOM Device class
>    rust: add idiomatic bindings to define Object subclasses
>    rust: add idiomatic bindings to define Device subclasses
>    rust: replace std::ffi::c_char with libc::c_char
>    rust: replace c"" literals with cstr crate
>    rust: introduce alternative to offset_of!
>    rust: use version of toml_edit that does not require new Rust
>
Paolo Bonzini July 5, 2024, 8:06 a.m. UTC | #2
On Thu, Jul 4, 2024 at 9:26 PM Pierrick Bouvier
<pierrick.bouvier@linaro.org> wrote:
> > Patches 9-10 deal with how to define new subclasses in Rust.  They are
> > a lot less polished and less ready.  There is probably a lot of polish
> > that could be applied to make the code look nicer, but I guess there is
> > always time to make the code look nicer until the details are sorted out.
> >
> > The things that I considered here are:
> >
> > - splitting configuration from runtime state
> > - automatic generation of instance_init and instance_finalize
> > - generation of C vtables from safe code that is written in Rust
> > - generation of qdev property tables
>
> Shouldn't we focus more on how we want to write a QOM device in Rust
> instead, by making abstraction of existing C implementation first?
> Once we have satisfying idea, we could evaluate how it maps to existing
> implementation, and which compromise should be made for efficiency.
> It seems that the current approach is to stick to existing model, and
> derive Rust bindings from this.

I think I tried to do a little bit of that. For example, the split of
configuration from runtime state is done to isolate the interior
mutability, and the automatic generation of instance_init and
instance_finalize relies on Rust traits like Default and Drop; the
realize implementation returns a qemu::Result<()>; and so on.

But there are many steps towards converting the PL011 device to Rust:

- declarative definitions of bitfields and registers (done)
- using RefCell for mutable state
- using wrappers for class and property definition
- defining and using wrappers for Chardev/CharBackend
- defining and using wrappers for MemoryRegion callbacks
- defining and using wrappers for SysBusDevice functionality
- defining and using wrappers for VMState functionality

At each step we need to design "how we want to do that in Rust" and
all the things you mention above. This prototype covers the second and
third steps, and it's already big enough! :)

I expect that code like this one would be merged together with the
corresponding changes to PL011: the glue code is added to the qemu/
crate and used in the hw/core/pl011/ directory. This way, both authors
and reviewers will have a clue of what becomes more readable/idiomatic
in the resulting code.

> I agree this glue can be a source of technical debt, but on the other
> side, it should be easy to refactor it, if we decided first what the
> "clean and idiomatic" Rust API should look like.

That's true, especially if we want to add more macro magic on top. We
can decide later where to draw the line between complexity of glue
code and cleanliness of Rust code, and also move the line as we see
fit.

On the other hand, if we believe that _this_ code is already too much,
that's also a data point. Again: I don't think it is, but I want us to
make an informed decision.

> A compromise where you have to manually translate some structs, or copy
> memory to traverse languages borders at some point, could be worth the
> safety and expressiveness.

Yep, Error is an example of this. It's not the common case, so it's
totally fine to convert to and from C Error* (which also includes
copying the inner error string, from a String to malloc-ed char*) to
keep the APIs nicer.

> > We should have an idea of what this glue code looks like, in order to make
> > an informed choice.  If we think we're not comfortable with reviewing it,
> > well, we should be ready to say so and stick with C until we are.
>
> While it is important that this glue code is maintainable and looks
> great, I don't think it should be the main reason to accept usage of a
> new language.

Not the main reason, but an important factor in judging if we are
*able* to accept usage of a new language. Maybe it's just a formal
step, but it feels safer to have _some_ code to show and to read, even
if it's just a prototype that barely compiles.

> We could have a specific trait with functions to handle various kind of
> events. And the glue code could be responsible to translate this to
> callbacks for the C side (and calling Rust code accordingly, eventually
> serializing this on a single thread to avoid any race issues).

That's a possibility, yes. Something like (totally random):

impl CharBackend {
    pub fn start<T: CharBackendCallbacks>(&mut self, obj: &T) {
        qemu_chr_fe_set_handlers(self.0,
                             Some(obj::can_receive),
Some(obj::receive, obj::event),
                             Some(obj::be_change), obj as *const _ as
*const c_void,
                             ptr::null(), true);
    }
    pub fn stop(&mut self) {
        qemu_chr_fe_set_handlers(self.0, None, None,
                             None, ptr::null(), ptr::null(), true);
    }
}

but I left for later because I focused just on the lowest levels code
where you could have "one" design. For example, for memory regions
some devices are going to have more than one, so there could be
reasons to do callbacks in different ways. By the way, all of this
passing of Rust references as opaque pointers needs, at least in
theory, to be annotated for pinning. Do we have to make sure that
pinning is handled correctly, or can we just assume that QOM objects
are pinned? I am not sure I am ready to answer the question...

Paolo
Pierrick Bouvier July 5, 2024, 6:52 p.m. UTC | #3
On 7/5/24 01:06, Paolo Bonzini wrote:
> On Thu, Jul 4, 2024 at 9:26 PM Pierrick Bouvier
> <pierrick.bouvier@linaro.org> wrote:
>>> Patches 9-10 deal with how to define new subclasses in Rust.  They are
>>> a lot less polished and less ready.  There is probably a lot of polish
>>> that could be applied to make the code look nicer, but I guess there is
>>> always time to make the code look nicer until the details are sorted out.
>>>
>>> The things that I considered here are:
>>>
>>> - splitting configuration from runtime state
>>> - automatic generation of instance_init and instance_finalize
>>> - generation of C vtables from safe code that is written in Rust
>>> - generation of qdev property tables
>>
>> Shouldn't we focus more on how we want to write a QOM device in Rust
>> instead, by making abstraction of existing C implementation first?
>> Once we have satisfying idea, we could evaluate how it maps to existing
>> implementation, and which compromise should be made for efficiency.
>> It seems that the current approach is to stick to existing model, and
>> derive Rust bindings from this.
> 
> I think I tried to do a little bit of that. For example, the split of
> configuration from runtime state is done to isolate the interior
> mutability, and the automatic generation of instance_init and
> instance_finalize relies on Rust traits like Default and Drop; the
> realize implementation returns a qemu::Result<()>; and so on.
> 

Patches 9/10 have a concrete example of what a TestDevice look like, 
mixed with some glue definition. But definitely, this example code is 
what would be the most interesting to review for introducing Rust.

To give an example of questions we could have:

Why should we have distinct structs for a device, the object 
represented, and its state?

Properties could be returned by a specific function (properties()), 
called at some given point. Would that be a good approach?

Having to implement ObjectImpl and DeviceImpl looks like an 
implementation detail, that could be derived automatically from a 
device. What's wrong with calling a realize that does nothing in the end?

Could device state be serialized with something like serde?

This is the kind of things that could be discussed, on a reduced 
example, without specially looking at how to implement that concretely, 
in a first time.

> But there are many steps towards converting the PL011 device to Rust:
> 
> - declarative definitions of bitfields and registers (done)
> - using RefCell for mutable state
> - using wrappers for class and property definition
> - defining and using wrappers for Chardev/CharBackend
> - defining and using wrappers for MemoryRegion callbacks
> - defining and using wrappers for SysBusDevice functionality
> - defining and using wrappers for VMState functionality
> 
> At each step we need to design "how we want to do that in Rust" and
> all the things you mention above. This prototype covers the second and
> third steps, and it's already big enough! :)
> 
> I expect that code like this one would be merged together with the
> corresponding changes to PL011: the glue code is added to the qemu/
> crate and used in the hw/core/pl011/ directory. This way, both authors
> and reviewers will have a clue of what becomes more readable/idiomatic
> in the resulting code.
> 
>> I agree this glue can be a source of technical debt, but on the other
>> side, it should be easy to refactor it, if we decided first what the
>> "clean and idiomatic" Rust API should look like.
> 
> That's true, especially if we want to add more macro magic on top. We
> can decide later where to draw the line between complexity of glue
> code and cleanliness of Rust code, and also move the line as we see
> fit.
>

Automatic derivation macros could be ok (to derive some trait or 
internal stuff), but usage of magic macros as syntactic sugar should 
indeed be thought twice.

Return a collection of QDevProperty could be better than having a magic 
@properties syntactic sugar.

Another thing that could be discussed is: do we want to have the whole 
inheritance mechanism for Rust devices? Is it really needed?

> On the other hand, if we believe that _this_ code is already too much,
> that's also a data point. Again: I don't think it is, but I want us to
> make an informed decision.
> 

Between having a clean and simple Device definition, with a bit more of 
magic glue (hidden internally), and requires devices to do some magic 
initialization to satisfy existing architecture, what would be the best?

Even though it takes 1000 more lines, I would be in favor to have 
Devices that are as clean and simple as possible. Because this is the 
kind of code what will be the most added/modified, compared to the glue 
part.

>> A compromise where you have to manually translate some structs, or copy
>> memory to traverse languages borders at some point, could be worth the
>> safety and expressiveness.
> 
> Yep, Error is an example of this. It's not the common case, so it's
> totally fine to convert to and from C Error* (which also includes
> copying the inner error string, from a String to malloc-ed char*) to
> keep the APIs nicer.
> 

Yes, it's a good approach!

>>> We should have an idea of what this glue code looks like, in order to make
>>> an informed choice.  If we think we're not comfortable with reviewing it,
>>> well, we should be ready to say so and stick with C until we are.
>>
>> While it is important that this glue code is maintainable and looks
>> great, I don't think it should be the main reason to accept usage of a
>> new language.
> 
> Not the main reason, but an important factor in judging if we are
> *able* to accept usage of a new language. Maybe it's just a formal
> step, but it feels safer to have _some_ code to show and to read, even
> if it's just a prototype that barely compiles.
> 
>> We could have a specific trait with functions to handle various kind of
>> events. And the glue code could be responsible to translate this to
>> callbacks for the C side (and calling Rust code accordingly, eventually
>> serializing this on a single thread to avoid any race issues).
> 
> That's a possibility, yes. Something like (totally random):
> 
> impl CharBackend {
>      pub fn start<T: CharBackendCallbacks>(&mut self, obj: &T) {
>          qemu_chr_fe_set_handlers(self.0,
>                               Some(obj::can_receive),
> Some(obj::receive, obj::event),
>                               Some(obj::be_change), obj as *const _ as
> *const c_void,
>                               ptr::null(), true);
>      }
>      pub fn stop(&mut self) {
>          qemu_chr_fe_set_handlers(self.0, None, None,
>                               None, ptr::null(), ptr::null(), true);
>      }
> }

Or have multiple traits matching every possible operation, and allow a 
device to implement it or not, like Read/Write traits. And the glue code 
could call qemu_chr_fe_set_handlers.

> 
> but I left for later because I focused just on the lowest levels code
> where you could have "one" design. For example, for memory regions
> some devices are going to have more than one, so there could be
> reasons to do callbacks in different ways. By the way, all of this
> passing of Rust references as opaque pointers needs, at least in
> theory, to be annotated for pinning. Do we have to make sure that
> pinning is handled correctly, or can we just assume that QOM objects
> are pinned? I am not sure I am ready to answer the question...
> 

I hope my answer helped to understand more my point that discussing the 
interface is more important than discussing the glue needed.

> Paolo
>
Paolo Bonzini July 5, 2024, 9:08 p.m. UTC | #4
Hi, first of all I want to clarify the raison d'etre for this posting,
which I have also explained to Manos. Nothing you see here is code
that will be certainly included in QEMU; it's (mostly) throwaway by
design. I don't have much attachment to any of the code except perhaps
the casting and reference counting stuff (which is in, like, the third
rewrite... I got nerd-sniped there :)). But at the same time I think
it's plausibly not too different (in look and complexity) from what
the actual code will look like.

It's also not an attempt to bypass/leapfrog other people in the
design; it's more a "let's be ready for this to happen" than a design
document. The first step and the more immediate focus remains the
build system integration.

But you have good questions and observations so I'll put something
below about the design as well.

On Fri, Jul 5, 2024 at 8:52 PM Pierrick Bouvier
<pierrick.bouvier@linaro.org> wrote:
> To give an example of questions we could have:
>
> Why should we have distinct structs for a device, the object
> represented, and its state?

If you refer to the conf and state, a bit because of the different
traits required (ConstDefault is needed for properties but not the
rest) and especially because we can enforce immutability of
configuration vs. interior mutability of state.

In particular, from Manos's prototype I noticed that you need to
access the Chardev while the state is *not* borrowed; methods on the
Chardev call back into the device and the callback needs mutable
access to the state. So some kind of separation seems to be necessary,
for better or worse, unless interior mutability is achieved with a
dozen Cells (not great).

> Having to implement ObjectImpl and DeviceImpl looks like an
> implementation detail, that could be derived automatically from a
> device. What's wrong with calling a realize that does nothing in the end?

The problem is not calling a realize that does nothing, it's that you
are forced to override the superclass implementation (which might be
in C) if you don't go through Option<>. class_init methods update a
few superclass methods but not all of them.

The vtable in ObjectImpl/DeviceImpl could be generated via macros; for
now I did it by hand to avoid getting carried away too much with
syntactic sugar.

> Could device state be serialized with something like serde?

Probably not, there's extra complications for versioning. A long time
ago there was an attempt to have some kind of IDL embedded in C code
(not unlike Rust attributes) to automatically generate properties and
vmstate, but it was too ambitious.
(https://www.linux-kvm.org/images/b/b5/2012-forum-qidl-talk.pdf,
slides 1-9).

Generating property and vmstate declarations could be done with
"normal" macros just like in C, or with attribute macros (needs a lot
more code and experience, wouldn't do it right away). However, serde
for QAPI and/or visitors may be a possibility, after all JSON is
serde's bread and butter.

> This is the kind of things that could be discussed, on a reduced
> example, without specially looking at how to implement that concretely,
> in a first time.

There are some things that Rust really hates that you do, and they
aren't always obvious. Therefore in this exercise I tried to let
intuition guide me, and see how much the type system fought that
intuition (not much actually!). I started from the more technical and
less artistic part to see if I was able to get somewhere.

But yes, it's a great exercise to do this experiment from the opposite
end. Then the actual glue code will "meet in the middle", applying
lessons learnt from both experiments, with individual pieces of the
real interface implemented and applied to the PL011 sample at the same
time. The main missing thing in PL011 is DMA, otherwise it's a nice
playground.

> usage of magic macros as syntactic sugar should indeed be thought twice.
> Return a collection of QDevProperty could be better than having a magic
> @properties syntactic sugar.

No hard opinions there, sure. I went for the magic syntax because the
properties cannot be a const and I wanted to hide the ugly "static
mut", but certainly there could be better ways to do it.

> Another thing that could be discussed is: do we want to have the whole
> inheritance mechanism for Rust devices? Is it really needed?

Eh, this time unfortunately I think it is. Stuff like children and
properties, which are methods of Object, need to be available to
subclasses in instance_init and/or realize. Reset is in Device and we
have the whole Device/SysBusDevice/PCIDevice set of classes, so you
need to access superclass methods from subclasses. It's not a lot of
code though.

> Between having a clean and simple Device definition, with a bit more of
> magic glue (hidden internally), and requires devices to do some magic
> initialization to satisfy existing architecture, what would be the best?
> Even though it takes 1000 more lines, I would be in favor to have
> Devices that are as clean and simple as possible. Because this is the
> kind of code what will be the most added/modified, compared to the glue
> part.

That's an opinion I share but I wasn't sure it's universal, which was
another reason to prototype and post some "scary but perhaps
necessary" code.

> Or have multiple traits matching every possible operation, and allow a
> device to implement it or not, like Read/Write traits. And the glue code
> could call qemu_chr_fe_set_handlers.

Possibly, yes. https://lwn.net/Articles/863459/ you see many such
techniques: traits (gpio::Chip etc.), composition (device::Data),
lambdas (only for initialization).

The advantage of the lambda approach is that it scales to multiple
backends or multiple memory regions. We'll see.

> I hope my answer helped to understand more my point that discussing the
> interface is more important than discussing the glue needed.

Sure, and I don't think we are very much in disagreement, if at all.

Paolo