diff mbox series

[4/4] drm: Nerv drm_global_mutex BKL for good drivers

Message ID 20200128104602.1459802-4-daniel.vetter@ffwll.ch (mailing list archive)
State New, archived
Headers show
Series [1/4] drm: Complain if drivers still use the ->load callback | expand

Commit Message

Daniel Vetter Jan. 28, 2020, 10:46 a.m. UTC
This catches the majority of drivers (unfortunately not if we take
users into account, because all the big drivers have at least a
lastclose hook).

With the prep patches out of the way all drm state is fully protected
and either prevents or can deal with the races from dropping the BKL
around open/close. The only thing left to audit are the various driver
hooks - by keeping the BKL around if any of them are set we have a
very simple cop-out!

Note that one of the biggest prep pieces to get here was making
dev->open_count atomic, which was done in

commit 7e13ad896484a0165a68197a2e64091ea28c9602
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jan 24 13:01:07 2020 +0000

    drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/gpu/drm/drm_drv.c      |  6 +++--
 drivers/gpu/drm/drm_file.c     | 46 ++++++++++++++++++++++++++++++----
 drivers/gpu/drm/drm_internal.h |  1 +
 3 files changed, 46 insertions(+), 7 deletions(-)

Comments

Chris Wilson Jan. 28, 2020, 11 a.m. UTC | #1
Quoting Daniel Vetter (2020-01-28 10:46:01)
> This catches the majority of drivers (unfortunately not if we take
> users into account, because all the big drivers have at least a
> lastclose hook).

Yeah, that lastclose for switching back to fbcon, and ensuring that
switch is started before the next client takes over the console, does
nerf the ability of avoiding the global_mutex.

> 
> With the prep patches out of the way all drm state is fully protected
> and either prevents or can deal with the races from dropping the BKL
> around open/close. The only thing left to audit are the various driver
> hooks - by keeping the BKL around if any of them are set we have a
> very simple cop-out!
> 
> Note that one of the biggest prep pieces to get here was making
> dev->open_count atomic, which was done in
> 
> commit 7e13ad896484a0165a68197a2e64091ea28c9602
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Fri Jan 24 13:01:07 2020 +0000
> 
>     drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
>  drivers/gpu/drm/drm_drv.c      |  6 +++--
>  drivers/gpu/drm/drm_file.c     | 46 ++++++++++++++++++++++++++++++----
>  drivers/gpu/drm/drm_internal.h |  1 +
>  3 files changed, 46 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 05bdf0b9d2b3..9fcd6ab3c154 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -946,7 +946,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
>         struct drm_driver *driver = dev->driver;
>         int ret;
>  
> -       mutex_lock(&drm_global_mutex);
> +       if (drm_dev_needs_global_mutex(dev))
> +               mutex_lock(&drm_global_mutex);
>  
>         if (dev->driver->load) {
>                 if (!drm_core_check_feature(dev, DRIVER_LEGACY))
> @@ -992,7 +993,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
>         drm_minor_unregister(dev, DRM_MINOR_PRIMARY);
>         drm_minor_unregister(dev, DRM_MINOR_RENDER);
>  out_unlock:
> -       mutex_unlock(&drm_global_mutex);
> +       if (drm_dev_needs_global_mutex(dev))
> +               mutex_unlock(&drm_global_mutex);
>         return ret;
>  }
>  EXPORT_SYMBOL(drm_dev_register);
> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> index d36cb74ebe0c..efd6fe0b6b4f 100644
> --- a/drivers/gpu/drm/drm_file.c
> +++ b/drivers/gpu/drm/drm_file.c
> @@ -51,6 +51,37 @@
>  /* from BKL pushdown */
>  DEFINE_MUTEX(drm_global_mutex);
>  
> +bool drm_dev_needs_global_mutex(struct drm_device *dev)
> +{
> +       /*
> +        * Legacy drivers rely on all kinds of BKL locking semantics, don't
> +        * bother. They also still need BKL locking for their ioctls, so better
> +        * safe than sorry.
> +        */
> +       if (drm_core_check_feature(dev, DRIVER_LEGACY))
> +               return true;
> +
> +       /*
> +        * The deprecated ->load callback must be called after the driver is
> +        * already registered. This means such drivers rely on the BKL to make
> +        * sure an open can't proceed until the driver is actually fully set up.
> +        * Similar hilarity holds for the unload callback.
> +        */
> +       if (dev->driver->load || dev->driver->unload)
> +               return true;
> +
> +       /*
> +        * Drivers with the lastclose callback assume that it's synchronized
> +        * against concurrent opens, which again needs the BKL. The proper fix
> +        * is to use the drm_client infrastructure with proper locking for each
> +        * client.
> +        */
> +       if (dev->driver->lastclose)
> +               return true;
> +
> +       return false;

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

I'm not particularly fussed by this patch, maybe a bit too obviously
midlayer.

I was wondering if we should have a *dev->global_mutex which drivers can
set to be any level they need (with a bit of care on our part to make
sure it is not destroyed across dev->driver->lastclose).
-Chris
Daniel Vetter Jan. 28, 2020, 2:59 p.m. UTC | #2
On Tue, Jan 28, 2020 at 11:00:32AM +0000, Chris Wilson wrote:
> Quoting Daniel Vetter (2020-01-28 10:46:01)
> > This catches the majority of drivers (unfortunately not if we take
> > users into account, because all the big drivers have at least a
> > lastclose hook).
> 
> Yeah, that lastclose for switching back to fbcon, and ensuring that
> switch is started before the next client takes over the console, does
> nerf the ability of avoiding the global_mutex.
> 
> > 
> > With the prep patches out of the way all drm state is fully protected
> > and either prevents or can deal with the races from dropping the BKL
> > around open/close. The only thing left to audit are the various driver
> > hooks - by keeping the BKL around if any of them are set we have a
> > very simple cop-out!
> > 
> > Note that one of the biggest prep pieces to get here was making
> > dev->open_count atomic, which was done in
> > 
> > commit 7e13ad896484a0165a68197a2e64091ea28c9602
> > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > Date:   Fri Jan 24 13:01:07 2020 +0000
> > 
> >     drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count
> > 
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > ---
> >  drivers/gpu/drm/drm_drv.c      |  6 +++--
> >  drivers/gpu/drm/drm_file.c     | 46 ++++++++++++++++++++++++++++++----
> >  drivers/gpu/drm/drm_internal.h |  1 +
> >  3 files changed, 46 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > index 05bdf0b9d2b3..9fcd6ab3c154 100644
> > --- a/drivers/gpu/drm/drm_drv.c
> > +++ b/drivers/gpu/drm/drm_drv.c
> > @@ -946,7 +946,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
> >         struct drm_driver *driver = dev->driver;
> >         int ret;
> >  
> > -       mutex_lock(&drm_global_mutex);
> > +       if (drm_dev_needs_global_mutex(dev))
> > +               mutex_lock(&drm_global_mutex);
> >  
> >         if (dev->driver->load) {
> >                 if (!drm_core_check_feature(dev, DRIVER_LEGACY))
> > @@ -992,7 +993,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
> >         drm_minor_unregister(dev, DRM_MINOR_PRIMARY);
> >         drm_minor_unregister(dev, DRM_MINOR_RENDER);
> >  out_unlock:
> > -       mutex_unlock(&drm_global_mutex);
> > +       if (drm_dev_needs_global_mutex(dev))
> > +               mutex_unlock(&drm_global_mutex);
> >         return ret;
> >  }
> >  EXPORT_SYMBOL(drm_dev_register);
> > diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> > index d36cb74ebe0c..efd6fe0b6b4f 100644
> > --- a/drivers/gpu/drm/drm_file.c
> > +++ b/drivers/gpu/drm/drm_file.c
> > @@ -51,6 +51,37 @@
> >  /* from BKL pushdown */
> >  DEFINE_MUTEX(drm_global_mutex);
> >  
> > +bool drm_dev_needs_global_mutex(struct drm_device *dev)
> > +{
> > +       /*
> > +        * Legacy drivers rely on all kinds of BKL locking semantics, don't
> > +        * bother. They also still need BKL locking for their ioctls, so better
> > +        * safe than sorry.
> > +        */
> > +       if (drm_core_check_feature(dev, DRIVER_LEGACY))
> > +               return true;
> > +
> > +       /*
> > +        * The deprecated ->load callback must be called after the driver is
> > +        * already registered. This means such drivers rely on the BKL to make
> > +        * sure an open can't proceed until the driver is actually fully set up.
> > +        * Similar hilarity holds for the unload callback.
> > +        */
> > +       if (dev->driver->load || dev->driver->unload)
> > +               return true;
> > +
> > +       /*
> > +        * Drivers with the lastclose callback assume that it's synchronized
> > +        * against concurrent opens, which again needs the BKL. The proper fix
> > +        * is to use the drm_client infrastructure with proper locking for each
> > +        * client.
> > +        */
> > +       if (dev->driver->lastclose)
> > +               return true;
> > +
> > +       return false;
> 
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> I'm not particularly fussed by this patch, maybe a bit too obviously
> midlayer.

Yeah it's not great. But I think long-term we might have a chance to limit
this to DRIVER_LEGACY:
- load/unload is on the way out
- generic fbdev won't need lastclose
- I have an idea for fixing the vgaswitcheroo locking, now that I starred
  at this wall a bit more. Should be a good undependent cleanup.
- we could perhaps do a ->lastclose_unlocked for reducing that midlayer
  smell a tad. Same way we slowly managed to get rid of the
  dev->struct_mutex locking midlayer.

> I was wondering if we should have a *dev->global_mutex which drivers can
> set to be any level they need (with a bit of care on our part to make
> sure it is not destroyed across dev->driver->lastclose).

I feel like outside of legacy driver the drm_global_mutex protection is
limited enough that there's no need to give driver writers ideas :-)
-Daniel
diff mbox series

Patch

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 05bdf0b9d2b3..9fcd6ab3c154 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -946,7 +946,8 @@  int drm_dev_register(struct drm_device *dev, unsigned long flags)
 	struct drm_driver *driver = dev->driver;
 	int ret;
 
-	mutex_lock(&drm_global_mutex);
+	if (drm_dev_needs_global_mutex(dev))
+		mutex_lock(&drm_global_mutex);
 
 	if (dev->driver->load) {
 		if (!drm_core_check_feature(dev, DRIVER_LEGACY))
@@ -992,7 +993,8 @@  int drm_dev_register(struct drm_device *dev, unsigned long flags)
 	drm_minor_unregister(dev, DRM_MINOR_PRIMARY);
 	drm_minor_unregister(dev, DRM_MINOR_RENDER);
 out_unlock:
-	mutex_unlock(&drm_global_mutex);
+	if (drm_dev_needs_global_mutex(dev))
+		mutex_unlock(&drm_global_mutex);
 	return ret;
 }
 EXPORT_SYMBOL(drm_dev_register);
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index d36cb74ebe0c..efd6fe0b6b4f 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -51,6 +51,37 @@ 
 /* from BKL pushdown */
 DEFINE_MUTEX(drm_global_mutex);
 
+bool drm_dev_needs_global_mutex(struct drm_device *dev)
+{
+	/*
+	 * Legacy drivers rely on all kinds of BKL locking semantics, don't
+	 * bother. They also still need BKL locking for their ioctls, so better
+	 * safe than sorry.
+	 */
+	if (drm_core_check_feature(dev, DRIVER_LEGACY))
+		return true;
+
+	/*
+	 * The deprecated ->load callback must be called after the driver is
+	 * already registered. This means such drivers rely on the BKL to make
+	 * sure an open can't proceed until the driver is actually fully set up.
+	 * Similar hilarity holds for the unload callback.
+	 */
+	if (dev->driver->load || dev->driver->unload)
+		return true;
+
+	/*
+	 * Drivers with the lastclose callback assume that it's synchronized
+	 * against concurrent opens, which again needs the BKL. The proper fix
+	 * is to use the drm_client infrastructure with proper locking for each
+	 * client.
+	 */
+	if (dev->driver->lastclose)
+		return true;
+
+	return false;
+}
+
 /**
  * DOC: file operations
  *
@@ -378,9 +409,10 @@  int drm_open(struct inode *inode, struct file *filp)
 	if (IS_ERR(minor))
 		return PTR_ERR(minor);
 
-	mutex_unlock(&drm_global_mutex);
-
 	dev = minor->dev;
+	if (drm_dev_needs_global_mutex(dev))
+		mutex_unlock(&drm_global_mutex);
+
 	if (!atomic_fetch_inc(&dev->open_count))
 		need_setup = 1;
 
@@ -398,13 +430,15 @@  int drm_open(struct inode *inode, struct file *filp)
 		}
 	}
 
-	mutex_unlock(&drm_global_mutex);
+	if (drm_dev_needs_global_mutex(dev))
+		mutex_unlock(&drm_global_mutex);
 
 	return 0;
 
 err_undo:
 	atomic_dec(&dev->open_count);
-	mutex_unlock(&drm_global_mutex);
+	if (drm_dev_needs_global_mutex(dev))
+		mutex_unlock(&drm_global_mutex);
 	drm_minor_release(minor);
 	return retcode;
 }
@@ -444,6 +478,7 @@  int drm_release(struct inode *inode, struct file *filp)
 	struct drm_minor *minor = file_priv->minor;
 	struct drm_device *dev = minor->dev;
 
+	if (drm_dev_needs_global_mutex(dev))
 	mutex_lock(&drm_global_mutex);
 
 	DRM_DEBUG("open_count = %d\n", atomic_read(&dev->open_count));
@@ -453,7 +488,8 @@  int drm_release(struct inode *inode, struct file *filp)
 	if (atomic_dec_and_test(&dev->open_count))
 		drm_lastclose(dev);
 
-	mutex_unlock(&drm_global_mutex);
+	if (drm_dev_needs_global_mutex(dev))
+		mutex_unlock(&drm_global_mutex);
 
 	drm_minor_release(minor);
 
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 6937bf923f05..aeec2e68d772 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -41,6 +41,7 @@  struct drm_printer;
 
 /* drm_file.c */
 extern struct mutex drm_global_mutex;
+bool drm_dev_needs_global_mutex(struct drm_device *dev);
 struct drm_file *drm_file_alloc(struct drm_minor *minor);
 void drm_file_free(struct drm_file *file);
 void drm_lastclose(struct drm_device *dev);