diff mbox

[1/2] libxl/devd: fix a race with concurrent device addition/removal

Message ID 20170510101239.33080-2-roger.pau@citrix.com (mailing list archive)
State New, archived
Headers show

Commit Message

Roger Pau Monné May 10, 2017, 10:12 a.m. UTC
Current code can free the libxl__device inside of the libxl__ddomain_device
before the addition has finished if a removal happens while an addition is
still in process:

  backend_watch_callback
            |
            v
       add_device
            |                 backend_watch_callback
    (async operation)                   |
            |                           v
            |                     remove_device
            |                           |
            |                           V
            |                    device_complete
            |                 (free libxl__device)
            v
     device_complete
  (deref libxl__device)

Fix this by creating a temporary copy of the libxl__device, that's tracked by
the GC of the nested async operation. This ensures that the libxl__device used
by the async operations cannot be freed while being used.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_device.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

Comments

Ian Jackson May 10, 2017, 10:32 a.m. UTC | #1
Roger Pau Monne writes ("[PATCH 1/2] libxl/devd: fix a race with concurrent device addition/removal"):
> Current code can free the libxl__device inside of the libxl__ddomain_device
> before the addition has finished if a removal happens while an addition is
> still in process:
> 
>   backend_watch_callback
>             |
>             v
>        add_device
>             |                 backend_watch_callback
>     (async operation)                   |
>             |                           v
>             |                     remove_device
>             |                           |
>             |                           V
>             |                    device_complete
>             |                 (free libxl__device)
>             v
>      device_complete
>   (deref libxl__device)
> 
> Fix this by creating a temporary copy of the libxl__device, that's
> tracked by the GC of the nested async operation. This ensures that
> the libxl__device used by the async operations cannot be freed while
> being used.

Doesn't this arrange that the remove hotplug script will be invoked
while the add hotplug script is still running ?  Is that really
desirable (or allowed!) ?

Ian.
Roger Pau Monné May 10, 2017, 10:43 a.m. UTC | #2
On Wed, May 10, 2017 at 11:32:45AM +0100, Ian Jackson wrote:
> Roger Pau Monne writes ("[PATCH 1/2] libxl/devd: fix a race with concurrent device addition/removal"):
> > Current code can free the libxl__device inside of the libxl__ddomain_device
> > before the addition has finished if a removal happens while an addition is
> > still in process:
> > 
> >   backend_watch_callback
> >             |
> >             v
> >        add_device
> >             |                 backend_watch_callback
> >     (async operation)                   |
> >             |                           v
> >             |                     remove_device
> >             |                           |
> >             |                           V
> >             |                    device_complete
> >             |                 (free libxl__device)
> >             v
> >      device_complete
> >   (deref libxl__device)
> > 
> > Fix this by creating a temporary copy of the libxl__device, that's
> > tracked by the GC of the nested async operation. This ensures that
> > the libxl__device used by the async operations cannot be freed while
> > being used.
> 
> Doesn't this arrange that the remove hotplug script will be invoked
> while the add hotplug script is still running ?

That's indeed possible (either with the current code or with this patch),
although unlikely. The async code called by remove_device will wait for the
backend to switch to state 6, while the add_device code will wait for state 2
IIRC (one can change those states to make them clash probably).

> Is that really desirable (or allowed!) ?

Hm, no, I don't think it's desirable at all. I still think this is better that
the previous code (at lest it doesn't dereference libxl__device anymore), but
clearly needs further improvements.

Also, it seems to me the same can happen even without driver domains, if a user
executes concurrent block-{attach/detach} operations, but maybe I'm missing
something?

Roger.
Wei Liu May 10, 2017, 10:47 a.m. UTC | #3
On Wed, May 10, 2017 at 11:43:57AM +0100, Roger Pau Monne wrote:
> On Wed, May 10, 2017 at 11:32:45AM +0100, Ian Jackson wrote:
> > Roger Pau Monne writes ("[PATCH 1/2] libxl/devd: fix a race with concurrent device addition/removal"):
> > > Current code can free the libxl__device inside of the libxl__ddomain_device
> > > before the addition has finished if a removal happens while an addition is
> > > still in process:
> > > 
> > >   backend_watch_callback
> > >             |
> > >             v
> > >        add_device
> > >             |                 backend_watch_callback
> > >     (async operation)                   |
> > >             |                           v
> > >             |                     remove_device
> > >             |                           |
> > >             |                           V
> > >             |                    device_complete
> > >             |                 (free libxl__device)
> > >             v
> > >      device_complete
> > >   (deref libxl__device)
> > > 
> > > Fix this by creating a temporary copy of the libxl__device, that's
> > > tracked by the GC of the nested async operation. This ensures that
> > > the libxl__device used by the async operations cannot be freed while
> > > being used.
> > 
> > Doesn't this arrange that the remove hotplug script will be invoked
> > while the add hotplug script is still running ?
> 
> That's indeed possible (either with the current code or with this patch),
> although unlikely. The async code called by remove_device will wait for the
> backend to switch to state 6, while the add_device code will wait for state 2
> IIRC (one can change those states to make them clash probably).
> 
> > Is that really desirable (or allowed!) ?
> 
> Hm, no, I don't think it's desirable at all. I still think this is better that
> the previous code (at lest it doesn't dereference libxl__device anymore), but
> clearly needs further improvements.
> 
> Also, it seems to me the same can happen even without driver domains, if a user
> executes concurrent block-{attach/detach} operations, but maybe I'm missing
> something?
> 

There is a lot of locking for all the device add / remove code. See
libxl_internal.h L2588.

> Roger.
Roger Pau Monné May 10, 2017, 1:03 p.m. UTC | #4
On Wed, May 10, 2017 at 11:47:31AM +0100, Wei Liu wrote:
> On Wed, May 10, 2017 at 11:43:57AM +0100, Roger Pau Monne wrote:
> > On Wed, May 10, 2017 at 11:32:45AM +0100, Ian Jackson wrote:
> > > Roger Pau Monne writes ("[PATCH 1/2] libxl/devd: fix a race with concurrent device addition/removal"):
> > > > Current code can free the libxl__device inside of the libxl__ddomain_device
> > > > before the addition has finished if a removal happens while an addition is
> > > > still in process:
> > > > 
> > > >   backend_watch_callback
> > > >             |
> > > >             v
> > > >        add_device
> > > >             |                 backend_watch_callback
> > > >     (async operation)                   |
> > > >             |                           v
> > > >             |                     remove_device
> > > >             |                           |
> > > >             |                           V
> > > >             |                    device_complete
> > > >             |                 (free libxl__device)
> > > >             v
> > > >      device_complete
> > > >   (deref libxl__device)
> > > > 
> > > > Fix this by creating a temporary copy of the libxl__device, that's
> > > > tracked by the GC of the nested async operation. This ensures that
> > > > the libxl__device used by the async operations cannot be freed while
> > > > being used.
> > > 
> > > Doesn't this arrange that the remove hotplug script will be invoked
> > > while the add hotplug script is still running ?
> > 
> > That's indeed possible (either with the current code or with this patch),
> > although unlikely. The async code called by remove_device will wait for the
> > backend to switch to state 6, while the add_device code will wait for state 2
> > IIRC (one can change those states to make them clash probably).
> > 
> > > Is that really desirable (or allowed!) ?
> > 
> > Hm, no, I don't think it's desirable at all. I still think this is better that
> > the previous code (at lest it doesn't dereference libxl__device anymore), but
> > clearly needs further improvements.
> > 
> > Also, it seems to me the same can happen even without driver domains, if a user
> > executes concurrent block-{attach/detach} operations, but maybe I'm missing
> > something?
> > 
> 
> There is a lot of locking for all the device add / remove code. See
> libxl_internal.h L2588.

It's been a long time since I've played with libxl, last time none of this
existed, sorry. Sadly devd completely bypasses all this, a simple way to fix
this would be to call libxl__lock_domain_userdata from {add/remove}_device
maybe? (and drop the lock at device_complete).

Roger.
Roger Pau Monné May 10, 2017, 2:23 p.m. UTC | #5
On Wed, May 10, 2017 at 02:03:58PM +0100, Roger Pau Monne wrote:
> On Wed, May 10, 2017 at 11:47:31AM +0100, Wei Liu wrote:
> > On Wed, May 10, 2017 at 11:43:57AM +0100, Roger Pau Monne wrote:
> > > On Wed, May 10, 2017 at 11:32:45AM +0100, Ian Jackson wrote:
> > > > Roger Pau Monne writes ("[PATCH 1/2] libxl/devd: fix a race with concurrent device addition/removal"):
> > > > > Current code can free the libxl__device inside of the libxl__ddomain_device
> > > > > before the addition has finished if a removal happens while an addition is
> > > > > still in process:
> > > > > 
> > > > >   backend_watch_callback
> > > > >             |
> > > > >             v
> > > > >        add_device
> > > > >             |                 backend_watch_callback
> > > > >     (async operation)                   |
> > > > >             |                           v
> > > > >             |                     remove_device
> > > > >             |                           |
> > > > >             |                           V
> > > > >             |                    device_complete
> > > > >             |                 (free libxl__device)
> > > > >             v
> > > > >      device_complete
> > > > >   (deref libxl__device)
> > > > > 
> > > > > Fix this by creating a temporary copy of the libxl__device, that's
> > > > > tracked by the GC of the nested async operation. This ensures that
> > > > > the libxl__device used by the async operations cannot be freed while
> > > > > being used.
> > > > 
> > > > Doesn't this arrange that the remove hotplug script will be invoked
> > > > while the add hotplug script is still running ?
> > > 
> > > That's indeed possible (either with the current code or with this patch),
> > > although unlikely. The async code called by remove_device will wait for the
> > > backend to switch to state 6, while the add_device code will wait for state 2
> > > IIRC (one can change those states to make them clash probably).
> > > 
> > > > Is that really desirable (or allowed!) ?
> > > 
> > > Hm, no, I don't think it's desirable at all. I still think this is better that
> > > the previous code (at lest it doesn't dereference libxl__device anymore), but
> > > clearly needs further improvements.
> > > 
> > > Also, it seems to me the same can happen even without driver domains, if a user
> > > executes concurrent block-{attach/detach} operations, but maybe I'm missing
> > > something?
> > > 
> > 
> > There is a lot of locking for all the device add / remove code. See
> > libxl_internal.h L2588.
> 
> It's been a long time since I've played with libxl, last time none of this
> existed, sorry. Sadly devd completely bypasses all this, a simple way to fix
> this would be to call libxl__lock_domain_userdata from {add/remove}_device
> maybe? (and drop the lock at device_complete).

As discussed on IRC, this is not possible because in order to lock the user
data lock the CTX lock must be held, and leaving libxl with the CTX lock held
is not allowed (as would happen when waiting for an async operation to
finish).

I think we have concluded that this, although nice to have, is not worse than
the situation when running hotplug scripts from Dom0, hence the patch still
stands.

Thanks, Roger.
Wei Liu May 10, 2017, 4:26 p.m. UTC | #6
On Wed, May 10, 2017 at 11:12:38AM +0100, Roger Pau Monne wrote:
> Current code can free the libxl__device inside of the libxl__ddomain_device
> before the addition has finished if a removal happens while an addition is
> still in process:
> 
>   backend_watch_callback
>             |
>             v
>        add_device
>             |                 backend_watch_callback
>     (async operation)                   |
>             |                           v
>             |                     remove_device
>             |                           |
>             |                           V
>             |                    device_complete
>             |                 (free libxl__device)
>             v
>      device_complete
>   (deref libxl__device)
> 
> Fix this by creating a temporary copy of the libxl__device, that's tracked by
> the GC of the nested async operation. This ensures that the libxl__device used
> by the async operations cannot be freed while being used.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Reported-by: Ian Jackson <ian.jackson@eu.citrix.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>
diff mbox

Patch

diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 5e966762c6..cd4ad05a6f 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -1415,9 +1415,6 @@  static void device_complete(libxl__egc *egc, libxl__ao_device *aodev)
                libxl__device_action_to_string(aodev->action),
                aodev->rc ? "failed" : "succeed");
 
-    if (aodev->action == LIBXL__DEVICE_ACTION_REMOVE)
-        free(aodev->dev);
-
     libxl__nested_ao_free(aodev->ao);
 }
 
@@ -1521,7 +1518,12 @@  static int add_device(libxl__egc *egc, libxl__ao *ao,
 
         GCNEW(aodev);
         libxl__prepare_ao_device(ao, aodev);
-        aodev->dev = dev;
+        /*
+         * Clone the libxl__device to avoid races if remove_device is called
+         * before the device addition has finished.
+         */
+        GCNEW(aodev->dev);
+        *aodev->dev = *dev;
         aodev->action = LIBXL__DEVICE_ACTION_ADD;
         aodev->callback = device_complete;
         libxl__wait_device_connection(egc, aodev);
@@ -1564,7 +1566,12 @@  static int remove_device(libxl__egc *egc, libxl__ao *ao,
 
         GCNEW(aodev);
         libxl__prepare_ao_device(ao, aodev);
-        aodev->dev = dev;
+        /*
+         * Clone the libxl__device to avoid races if there's a add_device
+         * running in parallel.
+         */
+        GCNEW(aodev->dev);
+        *aodev->dev = *dev;
         aodev->action = LIBXL__DEVICE_ACTION_REMOVE;
         aodev->callback = device_complete;
         libxl__initiate_device_generic_remove(egc, aodev);
@@ -1576,7 +1583,6 @@  static int remove_device(libxl__egc *egc, libxl__ao *ao,
                 goto out;
         }
         libxl__device_destroy(gc, dev);
-        free(dev);
         /* Fall through to return > 0, no ao has been dispatched */
     default:
         rc = 1;
@@ -1597,7 +1603,7 @@  static void backend_watch_callback(libxl__egc *egc, libxl__ev_xswatch *watch,
     char *p, *path;
     const char *sstate, *sonline;
     int state, online, rc, num_devs;
-    libxl__device *dev = NULL;
+    libxl__device *dev;
     libxl__ddomain_device *ddev = NULL;
     libxl__ddomain_guest *dguest = NULL;
     bool free_ao = false;
@@ -1625,7 +1631,7 @@  static void backend_watch_callback(libxl__egc *egc, libxl__ev_xswatch *watch,
         goto skip;
     online = atoi(sonline);
 
-    dev = libxl__zalloc(NOGC, sizeof(*dev));
+    GCNEW(dev);
     rc = libxl__parse_backend_path(gc, path, dev);
     if (rc)
         goto skip;
@@ -1659,7 +1665,8 @@  static void backend_watch_callback(libxl__egc *egc, libxl__ev_xswatch *watch,
          * to the list of active devices for a given guest.
          */
         ddev = libxl__zalloc(NOGC, sizeof(*ddev));
-        ddev->dev = dev;
+        ddev->dev = libxl__zalloc(NOGC, sizeof(*ddev->dev));
+        *ddev->dev = *dev;
         LIBXL_SLIST_INSERT_HEAD(&dguest->devices, ddev, next);
         LOGD(DEBUG, dev->domid, "Added device %s to the list of active devices",
              path);
@@ -1670,9 +1677,6 @@  static void backend_watch_callback(libxl__egc *egc, libxl__ev_xswatch *watch,
         /*
          * Removal of an active device, remove it from the list and
          * free it's data structures if they are no longer needed.
-         *
-         * The free of the associated libxl__device is left to the
-         * helper remove_device function.
          */
         LIBXL_SLIST_REMOVE(&dguest->devices, ddev, libxl__ddomain_device,
                            next);
@@ -1682,6 +1686,7 @@  static void backend_watch_callback(libxl__egc *egc, libxl__ev_xswatch *watch,
         if (rc > 0)
             free_ao = true;
 
+        free(ddev->dev);
         free(ddev);
         /* If this was the last device in the domain, remove it from the list */
         num_devs = dguest->num_vifs + dguest->num_vbds + dguest->num_qdisks;
@@ -1703,7 +1708,8 @@  static void backend_watch_callback(libxl__egc *egc, libxl__ev_xswatch *watch,
 
 skip:
     libxl__nested_ao_free(nested_ao);
-    free(dev);
+    if (ddev)
+        free(ddev->dev);
     free(ddev);
     free(dguest);
     return;