diff mbox

[v2] drm: Release driver references to handle before making it available again

Message ID 1452295625-17520-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Chris Wilson Jan. 8, 2016, 11:27 p.m. UTC
When userspace closes a handle, we remove it from the file->object_idr
and then tell the driver to drop its references to that file/handle.
However, as the file/handle is already available again for reuse, it may
be reallocated back to userspace and active on a new object before the
driver has had a chance to drop the old file/handle references.

Whilst calling back into the driver, we have to drop the
file->table_lock spinlock and so to prevent reusing the closed handle we
mark that handle as stale in the idr, perform the callback and then
remove the handle. We set the stale handle to point to the NULL object,
then any idr_find() whilst the driver is removing the handle will return
NULL, just as if the handle is already removed from idr.

v2: Use NULL rather than an ERR_PTR to avoid having to adjust callers.
idr_alloc() tracks existing handles using an internal bitmap, so we are
free to use the NULL object as our stale identifier.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: dri-devel@lists.freedesktop.org
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Thierry Reding <treding@nvidia.com>
---
 drivers/gpu/drm/drm_gem.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Comments

Ville Syrjala Jan. 11, 2016, 5:51 p.m. UTC | #1
On Fri, Jan 08, 2016 at 11:27:05PM +0000, Chris Wilson wrote:
> When userspace closes a handle, we remove it from the file->object_idr
> and then tell the driver to drop its references to that file/handle.
> However, as the file/handle is already available again for reuse, it may
> be reallocated back to userspace and active on a new object before the
> driver has had a chance to drop the old file/handle references.

Hmm. What's the problem with another object starting to reuse the same
handle while we're still deleting the old one? So far I didn't spot
anything in the code that would go boom if there's another object
already around with the same handle.

> 
> Whilst calling back into the driver, we have to drop the
> file->table_lock spinlock and so to prevent reusing the closed handle we
> mark that handle as stale in the idr, perform the callback and then
> remove the handle. We set the stale handle to point to the NULL object,
> then any idr_find() whilst the driver is removing the handle will return
> NULL, just as if the handle is already removed from idr.
> 
> v2: Use NULL rather than an ERR_PTR to avoid having to adjust callers.
> idr_alloc() tracks existing handles using an internal bitmap, so we are
> free to use the NULL object as our stale identifier.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: dri-devel@lists.freedesktop.org
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Rob Clark <robdclark@gmail.com>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/gpu/drm/drm_gem.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 2e8c77e71e1f..d1909d1a1eb4 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -294,18 +294,21 @@ drm_gem_handle_delete(struct drm_file *filp, u32 handle)
>  	spin_lock(&filp->table_lock);
>  
>  	/* Check if we currently have a reference on the object */
> -	obj = idr_find(&filp->object_idr, handle);
> -	if (obj == NULL) {
> +	obj = idr_replace(&filp->object_idr, NULL, handle);
> +	if (IS_ERR(obj)) {
>  		spin_unlock(&filp->table_lock);
>  		return -EINVAL;
>  	}
>  	dev = obj->dev;
> +	spin_unlock(&filp->table_lock);
>  
>  	/* Release reference and decrement refcount. */
> +	drm_gem_object_release_handle(handle, obj, filp);
> +
> +	spin_lock(&filp->table_lock);
>  	idr_remove(&filp->object_idr, handle);
>  	spin_unlock(&filp->table_lock);
>  
> -	drm_gem_object_release_handle(handle, obj, filp);
>  	return 0;
>  }
>  EXPORT_SYMBOL(drm_gem_handle_delete);
> -- 
> 2.7.0.rc3
Chris Wilson Jan. 11, 2016, 8:44 p.m. UTC | #2
On Mon, Jan 11, 2016 at 07:51:03PM +0200, Ville Syrjälä wrote:
> On Fri, Jan 08, 2016 at 11:27:05PM +0000, Chris Wilson wrote:
> > When userspace closes a handle, we remove it from the file->object_idr
> > and then tell the driver to drop its references to that file/handle.
> > However, as the file/handle is already available again for reuse, it may
> > be reallocated back to userspace and active on a new object before the
> > driver has had a chance to drop the old file/handle references.
> 
> Hmm. What's the problem with another object starting to reuse the same
> handle while we're still deleting the old one? So far I didn't spot
> anything in the code that would go boom if there's another object
> already around with the same handle.

Imagine a driver storing a hashtable to contract the handle->object->vma
lookup into just handle->vma, like the old idea of replacing the
object_idr with an ida plus hashtable of objects. (This saves the double
step lookup that caused a regression with the ppgtt work, and the linear
walk of object->vma_list which is a major slowdown in various full-pggtt
OpenGL tests). In such a scheme, where the driver has a parallel lut to
the core, the driver needs to be notified before the handle is then
accessible again to userspace. Or in any other scenario where the driver
is using the handle, as would be implied by having the open/close
callbacks.
-Chris
Ville Syrjala Jan. 12, 2016, 10:17 a.m. UTC | #3
On Mon, Jan 11, 2016 at 08:44:03PM +0000, Chris Wilson wrote:
> On Mon, Jan 11, 2016 at 07:51:03PM +0200, Ville Syrjälä wrote:
> > On Fri, Jan 08, 2016 at 11:27:05PM +0000, Chris Wilson wrote:
> > > When userspace closes a handle, we remove it from the file->object_idr
> > > and then tell the driver to drop its references to that file/handle.
> > > However, as the file/handle is already available again for reuse, it may
> > > be reallocated back to userspace and active on a new object before the
> > > driver has had a chance to drop the old file/handle references.
> > 
> > Hmm. What's the problem with another object starting to reuse the same
> > handle while we're still deleting the old one? So far I didn't spot
> > anything in the code that would go boom if there's another object
> > already around with the same handle.
> 
> Imagine a driver storing a hashtable to contract the handle->object->vma
> lookup into just handle->vma, like the old idea of replacing the
> object_idr with an ida plus hashtable of objects. (This saves the double
> step lookup that caused a regression with the ppgtt work, and the linear
> walk of object->vma_list which is a major slowdown in various full-pggtt
> OpenGL tests). In such a scheme, where the driver has a parallel lut to
> the core, the driver needs to be notified before the handle is then
> accessible again to userspace. Or in any other scenario where the driver
> is using the handle, as would be implied by having the open/close
> callbacks.

I see. Yeah, that makes sense. I was just a bit confused since I
couldn't find any real problem in the tree currently.
Ville Syrjala Jan. 12, 2016, 10:19 a.m. UTC | #4
On Fri, Jan 08, 2016 at 11:27:05PM +0000, Chris Wilson wrote:
> When userspace closes a handle, we remove it from the file->object_idr
> and then tell the driver to drop its references to that file/handle.
> However, as the file/handle is already available again for reuse, it may
> be reallocated back to userspace and active on a new object before the
> driver has had a chance to drop the old file/handle references.
> 
> Whilst calling back into the driver, we have to drop the
> file->table_lock spinlock and so to prevent reusing the closed handle we
> mark that handle as stale in the idr, perform the callback and then
> remove the handle. We set the stale handle to point to the NULL object,
> then any idr_find() whilst the driver is removing the handle will return
> NULL, just as if the handle is already removed from idr.
> 
> v2: Use NULL rather than an ERR_PTR to avoid having to adjust callers.
> idr_alloc() tracks existing handles using an internal bitmap, so we are
> free to use the NULL object as our stale identifier.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: dri-devel@lists.freedesktop.org
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Rob Clark <robdclark@gmail.com>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/gpu/drm/drm_gem.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 2e8c77e71e1f..d1909d1a1eb4 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -294,18 +294,21 @@ drm_gem_handle_delete(struct drm_file *filp, u32 handle)
>  	spin_lock(&filp->table_lock);
>  
>  	/* Check if we currently have a reference on the object */
> -	obj = idr_find(&filp->object_idr, handle);
> -	if (obj == NULL) {
> +	obj = idr_replace(&filp->object_idr, NULL, handle);
> +	if (IS_ERR(obj)) {
>  		spin_unlock(&filp->table_lock);
>  		return -EINVAL;
>  	}
>  	dev = obj->dev;
> +	spin_unlock(&filp->table_lock);

Could shrink the spinlocked section to be just the idr_replace()
call I suppose, and thus avoid the spin_unlock() in the error path.

Otherwise makes sense so
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

>  
>  	/* Release reference and decrement refcount. */
> +	drm_gem_object_release_handle(handle, obj, filp);
> +
> +	spin_lock(&filp->table_lock);
>  	idr_remove(&filp->object_idr, handle);
>  	spin_unlock(&filp->table_lock);
>  
> -	drm_gem_object_release_handle(handle, obj, filp);
>  	return 0;
>  }
>  EXPORT_SYMBOL(drm_gem_handle_delete);
> -- 
> 2.7.0.rc3
Chris Wilson Jan. 12, 2016, 10:54 a.m. UTC | #5
On Tue, Jan 12, 2016 at 12:19:12PM +0200, Ville Syrjälä wrote:
> On Fri, Jan 08, 2016 at 11:27:05PM +0000, Chris Wilson wrote:
> > When userspace closes a handle, we remove it from the file->object_idr
> > and then tell the driver to drop its references to that file/handle.
> > However, as the file/handle is already available again for reuse, it may
> > be reallocated back to userspace and active on a new object before the
> > driver has had a chance to drop the old file/handle references.
> > 
> > Whilst calling back into the driver, we have to drop the
> > file->table_lock spinlock and so to prevent reusing the closed handle we
> > mark that handle as stale in the idr, perform the callback and then
> > remove the handle. We set the stale handle to point to the NULL object,
> > then any idr_find() whilst the driver is removing the handle will return
> > NULL, just as if the handle is already removed from idr.
> > 
> > v2: Use NULL rather than an ERR_PTR to avoid having to adjust callers.
> > idr_alloc() tracks existing handles using an internal bitmap, so we are
> > free to use the NULL object as our stale identifier.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Rob Clark <robdclark@gmail.com>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Cc: Thierry Reding <treding@nvidia.com>
> > ---
> >  drivers/gpu/drm/drm_gem.c | 9 ++++++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> > index 2e8c77e71e1f..d1909d1a1eb4 100644
> > --- a/drivers/gpu/drm/drm_gem.c
> > +++ b/drivers/gpu/drm/drm_gem.c
> > @@ -294,18 +294,21 @@ drm_gem_handle_delete(struct drm_file *filp, u32 handle)
> >  	spin_lock(&filp->table_lock);
> >  
> >  	/* Check if we currently have a reference on the object */
> > -	obj = idr_find(&filp->object_idr, handle);
> > -	if (obj == NULL) {
> > +	obj = idr_replace(&filp->object_idr, NULL, handle);
> > +	if (IS_ERR(obj)) {
> >  		spin_unlock(&filp->table_lock);
> >  		return -EINVAL;
> >  	}
> >  	dev = obj->dev;
> > +	spin_unlock(&filp->table_lock);
> 
> Could shrink the spinlocked section to be just the idr_replace()
> call I suppose, and thus avoid the spin_unlock() in the error path.

Indeed, missed that. I also missed in v2 that the IS_ERR(obj) test needed
to become IS_ERR_OR_NULL(obj) to catch the concurrent deletion.
-Chris
diff mbox

Patch

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 2e8c77e71e1f..d1909d1a1eb4 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -294,18 +294,21 @@  drm_gem_handle_delete(struct drm_file *filp, u32 handle)
 	spin_lock(&filp->table_lock);
 
 	/* Check if we currently have a reference on the object */
-	obj = idr_find(&filp->object_idr, handle);
-	if (obj == NULL) {
+	obj = idr_replace(&filp->object_idr, NULL, handle);
+	if (IS_ERR(obj)) {
 		spin_unlock(&filp->table_lock);
 		return -EINVAL;
 	}
 	dev = obj->dev;
+	spin_unlock(&filp->table_lock);
 
 	/* Release reference and decrement refcount. */
+	drm_gem_object_release_handle(handle, obj, filp);
+
+	spin_lock(&filp->table_lock);
 	idr_remove(&filp->object_idr, handle);
 	spin_unlock(&filp->table_lock);
 
-	drm_gem_object_release_handle(handle, obj, filp);
 	return 0;
 }
 EXPORT_SYMBOL(drm_gem_handle_delete);