drm/i915: Ensure associated VMAs are inactive when contexts are destroyed

Message ID 1445857503-26621-1-git-send-email-tvrtko.ursulin@linux.intel.com
State New, archived

Commit Message

Tvrtko Ursulin Oct. 26, 2015, 11:05 a.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

In the following commit:

    commit e9f24d5fb7cf3628b195b18ff3ac4e37937ceeae
    Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Date:   Mon Oct 5 13:26:36 2015 +0100

        drm/i915: Clean up associated VMAs on context destruction

I added a WARN_ON assertion that the VM's active list must be empty
at the time the owning context is freed, but that turned out to be
a wrong assumption.

Due to the ordering of operations in i915_gem_object_retire__read,
where contexts are unreferenced before VMAs are moved to the inactive
list, the described situation can in fact happen.

It feels wrong to do things in that order, so this fix makes sure
a reference to the context is held until the move to the inactive
list is completed.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92638
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

Comments

Chris Wilson Oct. 26, 2015, 11:23 a.m. UTC | #1
On Mon, Oct 26, 2015 at 11:05:03AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> In the following commit:
> 
>     commit e9f24d5fb7cf3628b195b18ff3ac4e37937ceeae
>     Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>     Date:   Mon Oct 5 13:26:36 2015 +0100
> 
>         drm/i915: Clean up associated VMAs on context destruction
> 
> I added a WARN_ON assertion that VM's active list must be empty
> at the time of owning context is getting freed, but that turned
> out to be a wrong assumption.
> 
> Due ordering of operations in i915_gem_object_retire__read, where
> contexts are unreferenced before VMAs are moved to the inactive
> list, the described situation can in fact happen.

The context is being unreferenced indirectly. Adding a direct reference
here is even more bizarre.
-Chris

Tvrtko Ursulin Oct. 26, 2015, 12:00 p.m. UTC | #2
On 26/10/15 11:23, Chris Wilson wrote:
> On Mon, Oct 26, 2015 at 11:05:03AM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> In the following commit:
>>
>>      commit e9f24d5fb7cf3628b195b18ff3ac4e37937ceeae
>>      Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>      Date:   Mon Oct 5 13:26:36 2015 +0100
>>
>>          drm/i915: Clean up associated VMAs on context destruction
>>
>> I added a WARN_ON assertion that VM's active list must be empty
>> at the time of owning context is getting freed, but that turned
>> out to be a wrong assumption.
>>
>> Due ordering of operations in i915_gem_object_retire__read, where
>> contexts are unreferenced before VMAs are moved to the inactive
>> list, the described situation can in fact happen.
>
> The context is being unreferenced indirectly. Adding a direct reference
> here is even more bizarre.

Perhaps it is not the prettiest, but it sounds logical to me to ensure
that the destruction of the involved object hierarchy proceeds from the
bottom up and is not interleaved.

If you consider the move between the active and inactive lists to be
part of the retire process, then doing it at that very place in the
code, and on the very object that appeared to be destroyed out of
sequence, sounded logical to me.

How would you do it? Can you think of a better way?

Regards,

Tvrtko

Chris Wilson Oct. 26, 2015, 12:10 p.m. UTC | #3
On Mon, Oct 26, 2015 at 12:00:06PM +0000, Tvrtko Ursulin wrote:
> 
> On 26/10/15 11:23, Chris Wilson wrote:
> >On Mon, Oct 26, 2015 at 11:05:03AM +0000, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >>In the following commit:
> >>
> >>     commit e9f24d5fb7cf3628b195b18ff3ac4e37937ceeae
> >>     Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>     Date:   Mon Oct 5 13:26:36 2015 +0100
> >>
> >>         drm/i915: Clean up associated VMAs on context destruction
> >>
> >>I added a WARN_ON assertion that VM's active list must be empty
> >>at the time of owning context is getting freed, but that turned
> >>out to be a wrong assumption.
> >>
> >>Due ordering of operations in i915_gem_object_retire__read, where
> >>contexts are unreferenced before VMAs are moved to the inactive
> >>list, the described situation can in fact happen.
> >
> >The context is being unreferenced indirectly. Adding a direct reference
> >here is even more bizarre.
> 
> Perhaps is not the prettiest, but it sounds logical to me to ensure
> that order of destruction of involved object hierarchy goes from the
> bottom-up and is not interleaved.
> 
> If you consider the active/inactive list position as part of the
> retire process, doing it at the very place in code, and the very
> object that looked to be destroyed out of sequence, to me sounded
> logical.
> 
> How would you do it, can you think of a better way?

The reference is via the request. We are handling requests, it makes
more sense that you take the reference on the request.

I would just revert the patch; it doesn't fix the problem you tried to
solve and just adds more.
-Chris
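
One possible reading of the suggestion in comment #3 (pin the request rather than the context) can be sketched as a userspace toy model. Everything below is hypothetical: the sk_* types and helpers are illustrative stand-ins, not i915 structures or functions, and the "inactive list move" is reduced to a counter. The sketch only shows that holding a request reference across the move also delays the indirect context release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real i915 types. */
struct sk_ctx {
	int refcount;
	int active_vmas;         /* VMAs still on the VM's active list */
	bool freed;
	bool freed_while_active; /* the condition the WARN_ON checks for */
};

struct sk_req {
	int refcount;
	struct sk_ctx *ctx;      /* the request owns one context reference */
};

struct sk_obj {
	struct sk_req *last_read_req;
};

static void sk_ctx_put(struct sk_ctx *ctx)
{
	if (--ctx->refcount == 0) {
		ctx->freed = true;
		ctx->freed_while_active = ctx->active_vmas > 0;
	}
}

static void sk_req_get(struct sk_req *req)
{
	req->refcount++;
}

/* Releasing the last request reference indirectly releases the context. */
static void sk_req_put(struct sk_req *req)
{
	if (--req->refcount == 0)
		sk_ctx_put(req->ctx);
}

static void sk_retire(struct sk_obj *obj)
{
	struct sk_req *req = obj->last_read_req;

	sk_req_get(req);            /* pin the request across the retire */
	obj->last_read_req = NULL;  /* the object drops its own reference */
	sk_req_put(req);
	req->ctx->active_vmas--;    /* stands in for the inactive list move */
	sk_req_put(req);            /* last reference: context released here */
}

/* Returns true when the context was freed only after the VMA went inactive. */
static bool sk_run(void)
{
	struct sk_ctx ctx = { .refcount = 1, .active_vmas = 1 };
	struct sk_req req = { .refcount = 1, .ctx = &ctx };
	struct sk_obj obj = { .last_read_req = &req };

	sk_retire(&obj);
	return ctx.freed && !ctx.freed_while_active;
}
```

Calling sk_run() exercises a single retire with one object, one request and one context. Whether this ordering is acceptable in the real driver is exactly what the thread is debating, so the sketch takes no position on that.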
Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9b2048c7077d..6cbe3fdbca96 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2373,19 +2373,27 @@  static void
 i915_gem_object_retire__read(struct drm_i915_gem_object *obj, int ring)
 {
 	struct i915_vma *vma;
+	struct intel_context *ctx;
 
 	RQ_BUG_ON(obj->last_read_req[ring] == NULL);
 	RQ_BUG_ON(!(obj->active & (1 << ring)));
 
 	list_del_init(&obj->ring_list[ring]);
+
+	/* Ensure context cannot be destroyed with VMAs on the active list. */
+	ctx = obj->last_read_req[ring]->ctx;
+	i915_gem_context_reference(ctx);
+
 	i915_gem_request_assign(&obj->last_read_req[ring], NULL);
 
 	if (obj->last_write_req && obj->last_write_req->ring->id == ring)
 		i915_gem_object_retire__write(obj);
 
 	obj->active &= ~(1 << ring);
-	if (obj->active)
+	if (obj->active) {
+		i915_gem_context_unreference(ctx);
 		return;
+	}
 
 	/* Bump our place on the bound list to keep it roughly in LRU order
 	 * so that we don't steal from recently used but inactive objects
@@ -2399,6 +2407,8 @@  i915_gem_object_retire__read(struct drm_i915_gem_object *obj, int ring)
 			list_move_tail(&vma->mm_list, &vma->vm->inactive_list);
 	}
 
+	i915_gem_context_unreference(ctx);
+
 	i915_gem_request_assign(&obj->last_fenced_req, NULL);
 	drm_gem_object_unreference(&obj->base);
 }
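
The ordering issue from the commit message, and the effect of holding an extra context reference as the patch does, can be illustrated with a small userspace model. This is a sketch under stated assumptions: the toy_* names are hypothetical, reference counting is reduced to plain integers, and the context is never really deallocated (a flag is set instead) so the model can be inspected afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real i915 types. */
struct toy_ctx {
	int refcount;
	int active_vmas;         /* VMAs still on the VM's active list */
	bool freed;
	bool freed_while_active; /* the condition the WARN_ON checks for */
};

struct toy_req {
	struct toy_ctx *ctx;     /* the request owns one context reference */
};

static void toy_ctx_get(struct toy_ctx *ctx)
{
	ctx->refcount++;
}

static void toy_ctx_put(struct toy_ctx *ctx)
{
	if (--ctx->refcount == 0) {
		/* Only flagged, not deallocated, so the model stays readable. */
		ctx->freed = true;
		ctx->freed_while_active = ctx->active_vmas > 0;
	}
}

/* Dropping the request indirectly drops its context reference. */
static void toy_req_release(struct toy_req *req)
{
	toy_ctx_put(req->ctx);
	req->ctx = NULL;
}

static void toy_retire(struct toy_req *req, bool pin_ctx)
{
	struct toy_ctx *ctx = req->ctx;

	if (pin_ctx)
		toy_ctx_get(ctx);  /* mirrors the reference added by the patch */

	toy_req_release(req);      /* without the pin, ctx is "freed" here */
	ctx->active_vmas--;        /* stands in for the inactive list move */

	if (pin_ctx)
		toy_ctx_put(ctx);  /* context released after the move */
}

/* Returns true if the context was freed with a VMA still active. */
static bool toy_run(bool pin_ctx)
{
	struct toy_ctx ctx = { .refcount = 1, .active_vmas = 1 };
	struct toy_req req = { .ctx = &ctx };

	toy_retire(&req, pin_ctx);
	assert(ctx.freed);
	return ctx.freed_while_active;
}
```

In this model toy_run(false) reports the out-of-order free and toy_run(true) does not. It says nothing about whether taking the reference on the context is the right place in the real driver, which is the point disputed in the comments.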