Message ID | 1442927760-21865-1-git-send-email-tvrtko.ursulin@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 9/22/2015 2:16 PM, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > Prevent leaking VMAs and PPGTT VMs when objects are imported > via flink. > > Scenario is that any VMAs created by the importer will be left > dangling after the importer exits, or destroys the PPGTT context > with which they are associated. > > This is caused by object destruction not running when the > importer closes the buffer object handle due the reference held > by the exporter. This also leaks the VM since the VMA has a > reference on it. > > In practice these leaks can be observed by stopping and starting > the X server on a kernel with fbcon compiled in. Every time > X server exits another VMA will be leaked against the fbcon's > frame buffer object. > > Also on systems where flink buffer sharing is used extensively, > like Android, this leak has even more serious consequences. > > This version is takes a general approach from the earlier work > by Rafael Barabalho (drm/i915: Clean-up PPGTT on context ^Barbalho > destruction) and tries to incorporate the subsequent discussion > between Chris Wilson and Daniel Vetter. > > v2: > > Removed immediate cleanup on object retire - it was causing a > recursive VMA unbind via i915_gem_object_wait_rendering. And > it is in fact not even needed since by definition context > cleanup worker runs only after the last context reference has > been dropped, hence all VMAs against the VM belonging to the > context are already on the inactive list. > > v3: > > Previous version could deadlock since VMA unbind waits on any > rendering on an object to complete. Objects can be busy in a > different VM which would mean that the cleanup loop would do > the wait with the struct mutex held. > > This is an even simpler approach where we just unbind VMAs > without waiting since we know all VMAs belonging to this VM > are idle, and there is nothing in flight, at the point > context destructor runs. > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > Testcase: igt/gem_ppgtt.c/flink-and-exit-vma-leak I guess we'll have to wait a bit more to have a fix for flink-and-close-vma-leak subtest... Anyway, this addresses the flink leak (also seen in bug 87729 - igt/gem_close_race) > Cc: Daniel Vetter <daniel.vetter@ffwll.ch> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Rafael Barbalho <rafael.barbalho@intel.com> > Cc: Michel Thierry <michel.thierry@intel.com> > --- > drivers/gpu/drm/i915/i915_drv.h | 5 +++++ > drivers/gpu/drm/i915/i915_gem.c | 20 ++++++++++++++++---- > drivers/gpu/drm/i915/i915_gem_context.c | 24 ++++++++++++++++++++++++ > 3 files changed, 45 insertions(+), 4 deletions(-) > > @@ -3274,6 +3276,16 @@ int i915_vma_unbind(struct i915_vma *vma) > return 0; > } > > +int i915_vma_unbind(struct i915_vma *vma) > +{ > + return __i915_vma_unbind(vma, true); > +} > + > +int _i915_vma_unbind_no_wait(struct i915_vma *vma) ^ it needs one more underscore (__i915_vma_unbind_no_wait). > +{ > + return __i915_vma_unbind(vma, false); > +} > + Reviewed-by: Michel Thierry <michel.thierry@intel.com>
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index bc9c3a58d498..6dfd8830fe6b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2832,6 +2832,11 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level, u32 flags); int __must_check i915_vma_unbind(struct i915_vma *vma); +/* + * BEWARE: Do not use the function below unless you can _absolutely_ + * _guarantee_ VMA in question is _not in use_ anywhere. + */ +int __must_check _i915_vma_unbind_no_wait(struct i915_vma *vma); int i915_gem_object_put_pages(struct drm_i915_gem_object *obj); void i915_gem_release_all_mmaps(struct drm_i915_private *dev_priv); void i915_gem_release_mmap(struct drm_i915_gem_object *obj); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 64a5847f5107..3685080985df 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3208,7 +3208,7 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) old_write_domain); } -int i915_vma_unbind(struct i915_vma *vma) +static int __i915_vma_unbind(struct i915_vma *vma, bool wait) { struct drm_i915_gem_object *obj = vma->obj; struct drm_i915_private *dev_priv = obj->base.dev->dev_private; @@ -3227,9 +3227,11 @@ int i915_vma_unbind(struct i915_vma *vma) BUG_ON(obj->pages == NULL); - ret = i915_gem_object_wait_rendering(obj, false); - if (ret) - return ret; + if (wait) { + ret = i915_gem_object_wait_rendering(obj, false); + if (ret) + return ret; + } if (i915_is_ggtt(vma->vm) && vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) { @@ -3274,6 +3276,16 @@ int i915_vma_unbind(struct i915_vma *vma) return 0; } +int i915_vma_unbind(struct i915_vma *vma) +{ + return __i915_vma_unbind(vma, true); +} + +int _i915_vma_unbind_no_wait(struct i915_vma *vma) +{ + return __i915_vma_unbind(vma, false); +} + int i915_gpu_idle(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 74aa0c9929ba..d9d7ba8af3b4 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -133,6 +133,23 @@ static int get_context_size(struct drm_device *dev) return ret; } +static void i915_gem_context_clean(struct intel_context *ctx) +{ + struct i915_hw_ppgtt *ppgtt = ctx->ppgtt; + struct i915_vma *vma, *next; + + if (WARN_ON_ONCE(!ppgtt)) + return; + + WARN_ON(!list_empty(&ppgtt->base.active_list)); + + list_for_each_entry_safe(vma, next, &ppgtt->base.inactive_list, + mm_list) { + if (WARN_ON(_i915_vma_unbind_no_wait(vma))) + break; + } +} + void i915_gem_context_free(struct kref *ctx_ref) { struct intel_context *ctx = container_of(ctx_ref, typeof(*ctx), ref); @@ -142,6 +159,13 @@ void i915_gem_context_free(struct kref *ctx_ref) if (i915.enable_execlists) intel_lr_context_free(ctx); + /* + * This context is going away and we need to remove all VMAs still + * around. This is to handle imported shared objects for which + * destructor did not run when their handles were closed. + */ + i915_gem_context_clean(ctx); + i915_ppgtt_put(ctx->ppgtt); if (ctx->legacy_hw_ctx.rcs_state)