diff mbox series

[v2,4/7] device-core: use atomic_set on .realized property

Message ID 20200511160951.8733-5-mlevitsk@redhat.com (mailing list archive)
State New, archived
Headers show
Series RFC/WIP: Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread | expand

Commit Message

Maxim Levitsky May 11, 2020, 4:09 p.m. UTC
Some code might race with placement of new devices on a bus.
We currently first place a (unrealized) device on the bus
and then realize it.

As a workaround, users that scan the child device list, can
check the realized property to see if it is safe to access such a device.
Use an atomic write here too to aid with this.

A separate discussion is what to do with devices that are unrealized:
It looks like for this case we only call the hotplug handler's unplug
callback and its up to it to unrealize the device.
An atomic operation doesn't cause harm for this code path though.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/core/qdev.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

Comments

Stefan Hajnoczi May 27, 2020, 3 p.m. UTC | #1
On Mon, May 11, 2020 at 07:09:48PM +0300, Maxim Levitsky wrote:
> Some code might race with placement of new devices on a bus.
> We currently first place a (unrealized) device on the bus
> and then realize it.
> 
> As a workaround, users that scan the child device list, can
> check the realized property to see if it is safe to access such a device.
> Use an atomic write here too to aid with this.
> 
> A separate discussion is what to do with devices that are unrealized:
> It looks like for this case we only call the hotplug handler's unplug
> callback and its up to it to unrealize the device.
> An atomic operation doesn't cause harm for this code path though.
> 
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  hw/core/qdev.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)

Please add a comment to struct DeviceState saying the realized field
must be accessed with atomic_load_acquire() when used outside the QEMU
global mutex.

> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 732789e2b7..d530c5922f 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -964,7 +964,20 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
>              }
>         }
>  
> +       atomic_store_release(&dev->realized, value);
> +
>      } else if (!value && dev->realized) {
> +
> +        /*
> +         * Change the value so that any concurrent users are aware
> +         * that the device is going to be unrealized
> +         *
> +         * TODO: change .realized property to enum that states
> +         * each phase of the device realization/unrealization
> +         */
> +
> +        atomic_store_release(&dev->realized, value);

I'm not sure if atomic_store_release() is strong enough in the true ->
false case:

  Operations coming after ``atomic_store_release()`` can still be
  reordered before it.

A reader may already seen changes made to unrealize the DeviceState even
though realized still appears to be true. A full write memory barrier
seems safer here.
Maxim Levitsky July 9, 2020, 10:24 a.m. UTC | #2
On Wed, 2020-05-27 at 16:00 +0100, Stefan Hajnoczi wrote:
> On Mon, May 11, 2020 at 07:09:48PM +0300, Maxim Levitsky wrote:
> > Some code might race with placement of new devices on a bus.
> > We currently first place a (unrealized) device on the bus
> > and then realize it.
> > 
> > As a workaround, users that scan the child device list, can
> > check the realized property to see if it is safe to access such a device.
> > Use an atomic write here too to aid with this.
> > 
> > A separate discussion is what to do with devices that are unrealized:
> > It looks like for this case we only call the hotplug handler's unplug
> > callback and its up to it to unrealize the device.
> > An atomic operation doesn't cause harm for this code path though.
> > 
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> >  hw/core/qdev.c | 15 ++++++++++++++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> Please add a comment to struct DeviceState saying the realized field
> must be accessed with atomic_load_acquire() when used outside the QEMU
> global mutex.
> 
> > diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> > index 732789e2b7..d530c5922f 100644
> > --- a/hw/core/qdev.c
> > +++ b/hw/core/qdev.c
> > @@ -964,7 +964,20 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
> >              }
> >         }
> >  
> > +       atomic_store_release(&dev->realized, value);
> > +
> >      } else if (!value && dev->realized) {
> > +
> > +        /*
> > +         * Change the value so that any concurrent users are aware
> > +         * that the device is going to be unrealized
> > +         *
> > +         * TODO: change .realized property to enum that states
> > +         * each phase of the device realization/unrealization
> > +         */
> > +
> > +        atomic_store_release(&dev->realized, value);
> 
> I'm not sure if atomic_store_release() is strong enough in the true ->
> false case:
> 
>   Operations coming after ``atomic_store_release()`` can still be
>   reordered before it.
> 
> A reader may already seen changes made to unrealize the DeviceState even
> though realized still appears to be true. A full write memory barrier
> seems safer here.
Done.

Best regards,
	Maxim Levitsky
diff mbox series

Patch

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 732789e2b7..d530c5922f 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -964,7 +964,20 @@  static void device_set_realized(Object *obj, bool value, Error **errp)
             }
        }
 
+       atomic_store_release(&dev->realized, value);
+
     } else if (!value && dev->realized) {
+
+        /*
+         * Change the value so that any concurrent users are aware
+         * that the device is going to be unrealized
+         *
+         * TODO: change .realized property to enum that states
+         * each phase of the device realization/unrealization
+         */
+
+        atomic_store_release(&dev->realized, value);
+
         /* We want local_err to track only the first error */
         QLIST_FOREACH(bus, &dev->child_bus, sibling) {
             object_property_set_bool(OBJECT(bus), false, "realized",
@@ -980,12 +993,12 @@  static void device_set_realized(Object *obj, bool value, Error **errp)
         DEVICE_LISTENER_CALL(unrealize, Reverse, dev);
 
         if (local_err != NULL) {
+            atomic_store_release(&dev->realized, true);
             goto fail;
         }
     }
 
     assert(local_err == NULL);
-    dev->realized = value;
     return;
 
 child_realize_fail: