diff mbox series

drivers/base: use a worker for sysfs unbind

Message ID 20181207093133.15648-1-daniel.vetter@ffwll.ch (mailing list archive)
State New, archived
Headers show
Series drivers/base: use a worker for sysfs unbind | expand

Commit Message

Daniel Vetter Dec. 7, 2018, 9:31 a.m. UTC
Drivers might want to remove some sysfs files, which needs the same
locks and ends up angering lockdep. Relevant snippet of the stack
trace:

  kernfs_remove_by_name_ns+0x3b/0x80
  bus_remove_driver+0x92/0xa0
  acpi_video_unregister+0x24/0x40
  i915_driver_unload+0x42/0x130 [i915]
  i915_pci_remove+0x19/0x30 [i915]
  pci_device_remove+0x36/0xb0
  device_release_driver_internal+0x185/0x250
  unbind_store+0xaf/0x180
  kernfs_fop_write+0x104/0x190

I've stumbled over this because some new patches by Ram connect the
snd-hda-intel unload (where we do use sysfs unbind) with the locking
chains in the i915 unload code (but without creating a new loop),
which upset our CI. But the bug is already there and can be easily
reproduced by unbind i915 directly.

No idea whether this is the correct place to fix this, should at least
get CI happy again.

Cc: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/base/bus.c | 33 +++++++++++++++++++++++++++------
 1 file changed, 27 insertions(+), 6 deletions(-)

Comments

Chris Wilson Dec. 7, 2018, 9:58 a.m. UTC | #1
Quoting Daniel Vetter (2018-12-07 09:31:33)
> +void unbind_work_fn(struct work_struct *work)
> +{
> +       struct unbind_work *unbind_work =
> +               container_of(work, struct unbind_work, work);
> +
> +       device_release_driver(unbind_work->dev);
> +       put_device(unbind_work->dev);
> +}
> +
>  /* Manually detach a device from its associated driver. */
>  static ssize_t unbind_store(struct device_driver *drv, const char *buf,
>                             size_t count)
>  {
>         struct bus_type *bus = bus_get(drv->bus);
> +       struct unbind_work *unbind_work;
>         struct device *dev;
>         int err = -ENODEV;
>  
>         dev = bus_find_device_by_name(bus, NULL, buf);
>         if (dev && dev->driver == drv) {
> -               if (dev->parent && dev->bus->need_parent_lock)
> -                       device_lock(dev->parent);

Do we not need to keep this locking in the worker?
-Chris
Daniel Vetter Dec. 7, 2018, 11 a.m. UTC | #2
On Fri, Dec 07, 2018 at 09:58:06AM +0000, Chris Wilson wrote:
> Quoting Daniel Vetter (2018-12-07 09:31:33)
> > +void unbind_work_fn(struct work_struct *work)
> > +{
> > +       struct unbind_work *unbind_work =
> > +               container_of(work, struct unbind_work, work);
> > +
> > +       device_release_driver(unbind_work->dev);
> > +       put_device(unbind_work->dev);
> > +}
> > +
> >  /* Manually detach a device from its associated driver. */
> >  static ssize_t unbind_store(struct device_driver *drv, const char *buf,
> >                             size_t count)
> >  {
> >         struct bus_type *bus = bus_get(drv->bus);
> > +       struct unbind_work *unbind_work;
> >         struct device *dev;
> >         int err = -ENODEV;
> >  
> >         dev = bus_find_device_by_name(bus, NULL, buf);
> >         if (dev && dev->driver == drv) {
> > -               if (dev->parent && dev->bus->need_parent_lock)
> > -                       device_lock(dev->parent);
> 
> Do we not need to keep this locking in the worker?

Ah forgot to mention that in the commit message: This locking is already
done in device_release_driver -> device_release_driver_internal. I'll fix
up the commit message, thanks for the reminder.
-Daniel
Chris Wilson Dec. 7, 2018, 12:24 p.m. UTC | #3
Quoting Patchwork (2018-12-07 12:15:14)
> == Series Details ==
> 
> Series: drivers/base: use a worker for sysfs unbind
> URL   : https://patchwork.freedesktop.org/series/53734/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_5282 -> Patchwork_11046
> ====================================================
> 
> Summary
> -------
> 
>   **FAILURE**
> 
>   Serious unknown changes coming with Patchwork_11046 absolutely need to be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in Patchwork_11046, please notify your bug team to allow them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   External URL: https://patchwork.freedesktop.org/api/1.0/series/53734/revisions/1/mbox/
> 
> Possible new issues
> -------------------
> 
>   Here are the unknown changes that may have been introduced in Patchwork_11046:
> 
> ### IGT changes ###
> 
> #### Possible regressions ####
> 
>   * igt@i915_module_load@reload-no-display:
>     - fi-skl-6700hq:      PASS -> INCOMPLETE

Do we need to hold a reference to the entire driver tree? Or perhaps an
ordered wq.
-Chris
diff mbox series

Patch

diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 8bfd27ec73d6..864412df86a9 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -174,22 +174,43 @@  static const struct kset_uevent_ops bus_uevent_ops = {
 
 static struct kset *bus_kset;
 
+struct unbind_work {
+	struct work_struct work;
+	struct device *dev;
+};
+
+void unbind_work_fn(struct work_struct *work)
+{
+	struct unbind_work *unbind_work =
+		container_of(work, struct unbind_work, work);
+
+	device_release_driver(unbind_work->dev);
+	put_device(unbind_work->dev);
+}
+
 /* Manually detach a device from its associated driver. */
 static ssize_t unbind_store(struct device_driver *drv, const char *buf,
 			    size_t count)
 {
 	struct bus_type *bus = bus_get(drv->bus);
+	struct unbind_work *unbind_work;
 	struct device *dev;
 	int err = -ENODEV;
 
 	dev = bus_find_device_by_name(bus, NULL, buf);
 	if (dev && dev->driver == drv) {
-		if (dev->parent && dev->bus->need_parent_lock)
-			device_lock(dev->parent);
-		device_release_driver(dev);
-		if (dev->parent && dev->bus->need_parent_lock)
-			device_unlock(dev->parent);
-		err = count;
+		unbind_work = kmalloc(sizeof(*unbind_work), GFP_KERNEL);
+		if (unbind_work) {
+			unbind_work->dev = dev;
+			get_device(dev);
+			INIT_WORK(&unbind_work->work, unbind_work_fn);
+
+			schedule_work(&unbind_work->work);
+
+			err = count;
+		} else {
+			err = -ENOMEM;
+		}
 	}
 	put_device(dev);
 	bus_put(bus);