drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count
diff mbox series

Message ID 20200124130107.125404-1-chris@chris-wilson.co.uk
State New
Headers show
Series
  • drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count
Related show

Commit Message

Chris Wilson Jan. 24, 2020, 1:01 p.m. UTC
Since drm_global_mutex is a true global mutex across devices, we don't
want to acquire it unless absolutely necessary. For maintaining the
device local open_count, we can use atomic operations on the counter
itself, except when making the transition to/from 0. Here, we tackle the
easy portion of delaying acquiring the drm_global_mutex for the final
release by using atomic_dec_and_mutex_lock(), leaving the global
serialisation across the device opens.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org>
---
atomic_dec_and_mutex_lock needs pairing with mutex_unlock (you fool)
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +-
 drivers/gpu/drm/drm_file.c                 | 16 ++++++++--------
 drivers/gpu/drm/i915/i915_switcheroo.c     |  2 +-
 drivers/gpu/drm/nouveau/nouveau_vga.c      |  2 +-
 drivers/gpu/drm/radeon/radeon_device.c     |  2 +-
 include/drm/drm_device.h                   |  2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

Comments

Thomas Hellström (VMware) Jan. 24, 2020, 1:37 p.m. UTC | #1
On 1/24/20 2:01 PM, Chris Wilson wrote:
> Since drm_global_mutex is a true global mutex across devices, we don't
> want to acquire it unless absolutely necessary. For maintaining the
> device local open_count, we can use atomic operations on the counter
> itself, except when making the transition to/from 0. Here, we tackle the
> easy portion of delaying acquiring the drm_global_mutex for the final
> release by using atomic_dec_and_mutex_lock(), leaving the global
> serialisation across the device opens.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org>

For the series:

Reviewed-by: Thomas Hellström <thellstrom@vmware.com>

Now the only remaining (though pre-existing) problem I can see is that 
there is no corresponding mutex lock in drm_open() so that firstopen 
might race with lastclose.. Or I might be missing something..

/Thomas
Chris Wilson Jan. 24, 2020, 1:40 p.m. UTC | #2
Quoting Thomas Hellström (VMware) (2020-01-24 13:37:47)
> On 1/24/20 2:01 PM, Chris Wilson wrote:
> > Since drm_global_mutex is a true global mutex across devices, we don't
> > want to acquire it unless absolutely necessary. For maintaining the
> > device local open_count, we can use atomic operations on the counter
> > itself, except when making the transition to/from 0. Here, we tackle the
> > easy portion of delaying acquiring the drm_global_mutex for the final
> > release by using atomic_dec_and_mutex_lock(), leaving the global
> > serialisation across the device opens.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org>
> 
> For the series:
> 
> Reviewed-by: Thomas Hellström <thellstrom@vmware.com>
> 
> Now the only remaining (though pre-existing) problem I can see is that 
> there is no corresponding mutex lock in drm_open() so that firstopen 
> might race with lastclose.. Or I might be missing something..

iirc, it's a complicated dance where it goes through drm_stub_open()
first which acquires the drm_global_mutex.
-Chris
Chris Wilson Jan. 24, 2020, 6:39 p.m. UTC | #3
Quoting Thomas Hellström (VMware) (2020-01-24 13:37:47)
> On 1/24/20 2:01 PM, Chris Wilson wrote:
> > Since drm_global_mutex is a true global mutex across devices, we don't
> > want to acquire it unless absolutely necessary. For maintaining the
> > device local open_count, we can use atomic operations on the counter
> > itself, except when making the transition to/from 0. Here, we tackle the
> > easy portion of delaying acquiring the drm_global_mutex for the final
> > release by using atomic_dec_and_mutex_lock(), leaving the global
> > serialisation across the device opens.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org>
> 
> For the series:
> 
> Reviewed-by: Thomas Hellström <thellstrom@vmware.com>

Now being opt-in, it is fairly limited in scope and will not randomly
break others (touch wood) and the close() racing in BAT didn't throw
anything up, so pushed to drm-misc-next. Thanks for the review and
suggestions,

Next task is to suggest others might like to use it as well.
-Chris
Thomas Hellström (VMware) Jan. 24, 2020, 8:32 p.m. UTC | #4
On 1/24/20 7:39 PM, Chris Wilson wrote:
> Quoting Thomas Hellström (VMware) (2020-01-24 13:37:47)
>> On 1/24/20 2:01 PM, Chris Wilson wrote:
>>> Since drm_global_mutex is a true global mutex across devices, we don't
>>> want to acquire it unless absolutely necessary. For maintaining the
>>> device local open_count, we can use atomic operations on the counter
>>> itself, except when making the transition to/from 0. Here, we tackle the
>>> easy portion of delaying acquiring the drm_global_mutex for the final
>>> release by using atomic_dec_and_mutex_lock(), leaving the global
>>> serialisation across the device opens.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org>
>> For the series:
>>
>> Reviewed-by: Thomas Hellström <thellstrom@vmware.com>
> Now being opt-in, it is fairly limited in scope and will not randomly
> break others (touch wood) and the close() racing in BAT didn't throw
> anything up, so pushed to drm-misc-next. Thanks for the review and
> suggestions,
>
> Next task is to suggest others might like to use it as well.
> -Chris

Thanks. I'll look at doing the same for those drivers I audited.

/Thomas
Daniel Vetter Jan. 27, 2020, 9:19 a.m. UTC | #5
On Fri, Jan 24, 2020 at 06:39:26PM +0000, Chris Wilson wrote:
> Quoting Thomas Hellström (VMware) (2020-01-24 13:37:47)
> > On 1/24/20 2:01 PM, Chris Wilson wrote:
> > > Since drm_global_mutex is a true global mutex across devices, we don't
> > > want to acquire it unless absolutely necessary. For maintaining the
> > > device local open_count, we can use atomic operations on the counter
> > > itself, except when making the transition to/from 0. Here, we tackle the
> > > easy portion of delaying acquiring the drm_global_mutex for the final
> > > release by using atomic_dec_and_mutex_lock(), leaving the global
> > > serialisation across the device opens.
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org>
> > 
> > For the series:
> > 
> > Reviewed-by: Thomas Hellström <thellstrom@vmware.com>
> 
> Now being opt-in, it is fairly limited in scope and will not randomly
> break others (touch wood) and the close() racing in BAT didn't throw
> anything up, so pushed to drm-misc-next. Thanks for the review and
> suggestions,

Yeah this version looks reasonable compared to the previous few (I'm
catching up on dri-devel). I've looked at getting rid of the global_mutex,
and all I have is a simple patch with a pile of notes. It's real nasty.

This one here is a neat trick that I missed, and I'm semi-convinced it's safe :-)

> Next task is to suggest others might like to use it as well.

My idea for the opt-in was to look at whether ->load/->unload exists. And
ofc not bother with any of this for DRIVER_LEGACY. So maybe next step
would be to define a

drm_can_noglobal()
{
	return !DRIVER_LEGACY && !->load && !->unload;
}

and inline the close helper again and see what breaks? At least from what
I've looked trying to duplicate paths and opt-in is going to be real tough
on the open side of things. Best I've done thus far is minor pushing of
the critical section.
-Daniel

Patch
diff mbox series

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 53d882000101..c3c0356dfa61 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1136,7 +1136,7 @@  static bool amdgpu_switcheroo_can_switch(struct pci_dev *pdev)
 	* locking inversion with the driver load path. And the access here is
 	* completely racy anyway. So don't bother with locking for now.
 	*/
-	return dev->open_count == 0;
+	return atomic_read(&dev->open_count) == 0;
 }
 
 static const struct vga_switcheroo_client_ops amdgpu_switcheroo_ops = {
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index e25306c49cc6..1075b3a8b5b1 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -220,7 +220,7 @@  void drm_file_free(struct drm_file *file)
 	DRM_DEBUG("pid = %d, device = 0x%lx, open_count = %d\n",
 		  task_pid_nr(current),
 		  (long)old_encode_dev(file->minor->kdev->devt),
-		  READ_ONCE(dev->open_count));
+		  atomic_read(&dev->open_count));
 
 	if (drm_core_check_feature(dev, DRIVER_LEGACY) &&
 	    dev->driver->preclose)
@@ -379,7 +379,7 @@  int drm_open(struct inode *inode, struct file *filp)
 		return PTR_ERR(minor);
 
 	dev = minor->dev;
-	if (!dev->open_count++)
+	if (!atomic_fetch_inc(&dev->open_count))
 		need_setup = 1;
 
 	/* share address_space across all char-devs of a single device */
@@ -398,7 +398,7 @@  int drm_open(struct inode *inode, struct file *filp)
 	return 0;
 
 err_undo:
-	dev->open_count--;
+	atomic_dec(&dev->open_count);
 	drm_minor_release(minor);
 	return retcode;
 }
@@ -440,11 +440,11 @@  int drm_release(struct inode *inode, struct file *filp)
 
 	mutex_lock(&drm_global_mutex);
 
-	DRM_DEBUG("open_count = %d\n", dev->open_count);
+	DRM_DEBUG("open_count = %d\n", atomic_read(&dev->open_count));
 
 	drm_close_helper(filp);
 
-	if (!--dev->open_count)
+	if (atomic_dec_and_test(&dev->open_count))
 		drm_lastclose(dev);
 
 	mutex_unlock(&drm_global_mutex);
@@ -478,10 +478,10 @@  int drm_release_noglobal(struct inode *inode, struct file *filp)
 
 	drm_close_helper(filp);
 
-	mutex_lock(&drm_global_mutex);
-	if (!--dev->open_count)
+	if (atomic_dec_and_mutex_lock(&dev->open_count, &drm_global_mutex)) {
 		drm_lastclose(dev);
-	mutex_unlock(&drm_global_mutex);
+		mutex_unlock(&drm_global_mutex);
+	}
 
 	drm_minor_release(minor);
 
diff --git a/drivers/gpu/drm/i915/i915_switcheroo.c b/drivers/gpu/drm/i915/i915_switcheroo.c
index 39c79e1c5b52..ed69b5d4a375 100644
--- a/drivers/gpu/drm/i915/i915_switcheroo.c
+++ b/drivers/gpu/drm/i915/i915_switcheroo.c
@@ -43,7 +43,7 @@  static bool i915_switcheroo_can_switch(struct pci_dev *pdev)
 	 * locking inversion with the driver load path. And the access here is
 	 * completely racy anyway. So don't bother with locking for now.
 	 */
-	return i915 && i915->drm.open_count == 0;
+	return i915 && atomic_read(&i915->drm.open_count) == 0;
 }
 
 static const struct vga_switcheroo_client_ops i915_switcheroo_ops = {
diff --git a/drivers/gpu/drm/nouveau/nouveau_vga.c b/drivers/gpu/drm/nouveau/nouveau_vga.c
index d865d8aeac3c..c85dd8afa3c3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_vga.c
+++ b/drivers/gpu/drm/nouveau/nouveau_vga.c
@@ -72,7 +72,7 @@  nouveau_switcheroo_can_switch(struct pci_dev *pdev)
 	 * locking inversion with the driver load path. And the access here is
 	 * completely racy anyway. So don't bother with locking for now.
 	 */
-	return dev->open_count == 0;
+	return atomic_read(&dev->open_count) == 0;
 }
 
 static const struct vga_switcheroo_client_ops
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index a522e092038b..266e3cbbd09b 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1263,7 +1263,7 @@  static bool radeon_switcheroo_can_switch(struct pci_dev *pdev)
 	 * locking inversion with the driver load path. And the access here is
 	 * completely racy anyway. So don't bother with locking for now.
 	 */
-	return dev->open_count == 0;
+	return atomic_read(&dev->open_count) == 0;
 }
 
 static const struct vga_switcheroo_client_ops radeon_switcheroo_ops = {
diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
index 1acfc3bbd3fb..bb60a949f416 100644
--- a/include/drm/drm_device.h
+++ b/include/drm/drm_device.h
@@ -144,7 +144,7 @@  struct drm_device {
 	 * Usage counter for outstanding files open,
 	 * protected by drm_global_mutex
 	 */
-	int open_count;
+	atomic_t open_count;
 
 	/** @filelist_mutex: Protects @filelist. */
 	struct mutex filelist_mutex;