[3/3] drm/i915: Fix premature LRC unpin in GuC mode

Message ID 1453297257-4707-3-git-send-email-tvrtko.ursulin@linux.intel.com
State New

Commit Message

Tvrtko Ursulin Jan. 20, 2016, 1:40 p.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

In GuC mode LRC pinning lifetime depends exclusively on the
request lifetime. Since that is terminated by the seqno update,
that opens up a race condition between GPU finishing writing
out the context image and the driver unpinning the LRC.

To extend the LRC lifetime we will employ a similar approach
to what legacy ringbuffer submission does.

We will start tracking the last submitted context per engine
and keep it pinned until it is replaced by another one.

Note that the driver unload path is a bit fragile and could
benefit greatly from efforts to unify the legacy and exec
list submission code paths.

At the moment i915_gem_context_fini has special casing for the
two which are potentially not needed, and also depends on
i915_gem_cleanup_ringbuffer running before itself.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Issue: VIZ-4277
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Nick Hoath <nicholas.hoath@intel.com>
---
I cannot test this with GuC but it passes BAT with execlists
and some real world smoke tests.
---
 drivers/gpu/drm/i915/i915_gem_context.c | 4 +++-
 drivers/gpu/drm/i915/intel_lrc.c        | 7 +++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

Comments

Chris Wilson Jan. 20, 2016, 1:55 p.m. UTC | #1
On Wed, Jan 20, 2016 at 01:40:57PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> In GuC mode LRC pinning lifetime depends exclusively on the
> request lifetime. Since that is terminated by the seqno update,
> that opens up a race condition between GPU finishing writing
> out the context image and the driver unpinning the LRC.
> 
> To extend the LRC lifetime we will employ a similar approach
> to what legacy ringbuffer submission does.
> 
> We will start tracking the last submitted context per engine
> and keep it pinned until it is replaced by another one.
> 
> Note that the driver unload path is a bit fragile and could
> benefit greatly from efforts to unify the legacy and exec
> list submission code paths.
> 
> At the moment i915_gem_context_fini has special casing for the
> two which are potentially not needed, and also depends on
> i915_gem_cleanup_ringbuffer running before itself.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Issue: VIZ-4277
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Nick Hoath <nicholas.hoath@intel.com>
> ---
> I cannot test this with GuC but it passes BAT with execlists
> and some real world smoke tests.
> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 4 +++-
>  drivers/gpu/drm/i915/intel_lrc.c        | 7 +++++++
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index c25083c78ba7..0b419e165836 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -438,7 +438,9 @@ void i915_gem_context_fini(struct drm_device *dev)
>  	for (i = 0; i < I915_NUM_RINGS; i++) {
>  		struct intel_engine_cs *ring = &dev_priv->ring[i];
>  
> -		if (ring->last_context)
> +		if (ring->last_context && i915.enable_execlists)
> +			intel_lr_context_unpin(ring->last_context, ring);
> +		else if (ring->last_context)
>  			i915_gem_context_unreference(ring->last_context);
>  
>  		ring->default_context = NULL;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 5c3f57fed916..b8a7e126d6d2 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -918,6 +918,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
>  	struct intel_engine_cs  *ring = params->ring;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_ringbuffer *ringbuf = params->ctx->engine[ring->id].ringbuf;
> +	struct intel_context    *ctx = params->request->ctx;
>  	u64 exec_start;
>  	int instp_mode;
>  	u32 instp_mask;
> @@ -982,6 +983,12 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
>  
>  	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>  
> +	if (ring->last_context && ring->last_context != ctx) {
> +		intel_lr_context_unpin(ring->last_context, ring);
> +		intel_lr_context_pin(ctx, ring);
> +		ring->last_context = ctx;
> +	}

I think this is the wrong location; it should be part of submitting the
context inside the engine (intel_execlists_submission should not exist,
as it entirely duplicates the common GEM batch submission code and
the only unique part is engine->add_request()).

Note that it should be:

if (engine->last_context != request->ctx) {
	if (engine->last_context)
		intel_lr_context_unpin(engine->last_context, engine);
	engine->last_context = request->ctx;
	intel_lr_context_pin(engine->last_context, engine);
}
-Chris
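
To make the difference between the posted condition and the suggested one concrete, here is a minimal user-space model. It is only a sketch: the `model_*` types and the two `track_*` helpers are hypothetical stand-ins, not the i915 structures or functions, and it assumes nothing else assigns `last_context` in this mode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; not the real i915 types. */
struct model_context {
	int pin_count;
};

struct model_engine {
	struct model_context *last_context;
};

void model_pin(struct model_context *ctx)
{
	ctx->pin_count++;
}

void model_unpin(struct model_context *ctx)
{
	/* Unpinning something that was never pinned is a bug in the model. */
	assert(ctx->pin_count > 0);
	ctx->pin_count--;
}

/* Condition as posted in the patch: skipped whenever last_context is NULL. */
void track_as_posted(struct model_engine *engine, struct model_context *ctx)
{
	if (engine->last_context && engine->last_context != ctx) {
		model_unpin(engine->last_context);
		model_pin(ctx);
		engine->last_context = ctx;
	}
}

/* Condition as suggested in the review: also handles the first submission. */
void track_as_suggested(struct model_engine *engine, struct model_context *ctx)
{
	if (engine->last_context != ctx) {
		if (engine->last_context)
			model_unpin(engine->last_context);
		engine->last_context = ctx;
		model_pin(engine->last_context);
	}
}
```

Starting from an engine whose `last_context` is NULL, `track_as_posted()` never enters its branch, so under the stated assumption no context is ever pinned or tracked. `track_as_suggested()` pins the first context it sees and swaps the pin whenever a different context is submitted.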
Tvrtko Ursulin Jan. 20, 2016, 2:06 p.m. UTC | #2
On 20/01/16 13:55, Chris Wilson wrote:
> On Wed, Jan 20, 2016 at 01:40:57PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> In GuC mode LRC pinning lifetime depends exclusively on the
>> request lifetime. Since that is terminated by the seqno update,
>> that opens up a race condition between GPU finishing writing
>> out the context image and the driver unpinning the LRC.
>>
>> To extend the LRC lifetime we will employ a similar approach
>> to what legacy ringbuffer submission does.
>>
>> We will start tracking the last submitted context per engine
>> and keep it pinned until it is replaced by another one.
>>
>> Note that the driver unload path is a bit fragile and could
>> benefit greatly from efforts to unify the legacy and exec
>> list submission code paths.
>>
>> At the moment i915_gem_context_fini has special casing for the
>> two which are potentially not needed, and also depends on
>> i915_gem_cleanup_ringbuffer running before itself.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Issue: VIZ-4277
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Nick Hoath <nicholas.hoath@intel.com>
>> ---
>> I cannot test this with GuC but it passes BAT with execlists
>> and some real world smoke tests.
>> ---
>>   drivers/gpu/drm/i915/i915_gem_context.c | 4 +++-
>>   drivers/gpu/drm/i915/intel_lrc.c        | 7 +++++++
>>   2 files changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>> index c25083c78ba7..0b419e165836 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>> @@ -438,7 +438,9 @@ void i915_gem_context_fini(struct drm_device *dev)
>>   	for (i = 0; i < I915_NUM_RINGS; i++) {
>>   		struct intel_engine_cs *ring = &dev_priv->ring[i];
>>
>> -		if (ring->last_context)
>> +		if (ring->last_context && i915.enable_execlists)
>> +			intel_lr_context_unpin(ring->last_context, ring);
>> +		else if (ring->last_context)
>>   			i915_gem_context_unreference(ring->last_context);
>>
>>   		ring->default_context = NULL;
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 5c3f57fed916..b8a7e126d6d2 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -918,6 +918,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
>>   	struct intel_engine_cs  *ring = params->ring;
>>   	struct drm_i915_private *dev_priv = dev->dev_private;
>>   	struct intel_ringbuffer *ringbuf = params->ctx->engine[ring->id].ringbuf;
>> +	struct intel_context    *ctx = params->request->ctx;
>>   	u64 exec_start;
>>   	int instp_mode;
>>   	u32 instp_mask;
>> @@ -982,6 +983,12 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
>>
>>   	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>>
>> +	if (ring->last_context && ring->last_context != ctx) {
>> +		intel_lr_context_unpin(ring->last_context, ring);
>> +		intel_lr_context_pin(ctx, ring);
>> +		ring->last_context = ctx;
>> +	}
>
> I think this is the wrong location and should be part of submitting the
> context inside the engine (because intel_execlists_submission should not
> as it is entirely duplicating the common GEM batch submission code and
> the unique part is engine->add_request()).

So into engine->emit_request you are saying? That works just as well 
AFAICS, just making sure I understood correctly.

> Note that it should be:
>
> if (engine->last_context != request->ctx) {
> 	if (engine->last_context)
> 		intel_lr_context_unpin(engine->last_context, engine);
> 	engine->last_context = request->ctx;
> 	intel_lr_context_pin(engine->last_context, engine);
> }

Ooops!

Regards,

Tvrtko
Chris Wilson Jan. 20, 2016, 2:18 p.m. UTC | #3
On Wed, Jan 20, 2016 at 02:06:43PM +0000, Tvrtko Ursulin wrote:
> 
> On 20/01/16 13:55, Chris Wilson wrote:
> >On Wed, Jan 20, 2016 at 01:40:57PM +0000, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >>In GuC mode LRC pinning lifetime depends exclusively on the
> >>request lifetime. Since that is terminated by the seqno update,
> >>that opens up a race condition between GPU finishing writing
> >>out the context image and the driver unpinning the LRC.
> >>
> >>To extend the LRC lifetime we will employ a similar approach
> >>to what legacy ringbuffer submission does.
> >>
> >>We will start tracking the last submitted context per engine
> >>and keep it pinned until it is replaced by another one.
> >>
> >>Note that the driver unload path is a bit fragile and could
> >>benefit greatly from efforts to unify the legacy and exec
> >>list submission code paths.
> >>
> >>At the moment i915_gem_context_fini has special casing for the
> >>two which are potentially not needed, and also depends on
> >>i915_gem_cleanup_ringbuffer running before itself.
> >>
> >>Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>Issue: VIZ-4277
> >>Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >>Cc: Nick Hoath <nicholas.hoath@intel.com>
> >>---
> >>I cannot test this with GuC but it passes BAT with execlists
> >>and some real world smoke tests.
> >>---
> >>  drivers/gpu/drm/i915/i915_gem_context.c | 4 +++-
> >>  drivers/gpu/drm/i915/intel_lrc.c        | 7 +++++++
> >>  2 files changed, 10 insertions(+), 1 deletion(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> >>index c25083c78ba7..0b419e165836 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem_context.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem_context.c
> >>@@ -438,7 +438,9 @@ void i915_gem_context_fini(struct drm_device *dev)
> >>  	for (i = 0; i < I915_NUM_RINGS; i++) {
> >>  		struct intel_engine_cs *ring = &dev_priv->ring[i];
> >>
> >>-		if (ring->last_context)
> >>+		if (ring->last_context && i915.enable_execlists)
> >>+			intel_lr_context_unpin(ring->last_context, ring);
> >>+		else if (ring->last_context)
> >>  			i915_gem_context_unreference(ring->last_context);
> >>
> >>  		ring->default_context = NULL;
> >>diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >>index 5c3f57fed916..b8a7e126d6d2 100644
> >>--- a/drivers/gpu/drm/i915/intel_lrc.c
> >>+++ b/drivers/gpu/drm/i915/intel_lrc.c
> >>@@ -918,6 +918,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
> >>  	struct intel_engine_cs  *ring = params->ring;
> >>  	struct drm_i915_private *dev_priv = dev->dev_private;
> >>  	struct intel_ringbuffer *ringbuf = params->ctx->engine[ring->id].ringbuf;
> >>+	struct intel_context    *ctx = params->request->ctx;
> >>  	u64 exec_start;
> >>  	int instp_mode;
> >>  	u32 instp_mask;
> >>@@ -982,6 +983,12 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
> >>
> >>  	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
> >>
> >>+	if (ring->last_context && ring->last_context != ctx) {
> >>+		intel_lr_context_unpin(ring->last_context, ring);
> >>+		intel_lr_context_pin(ctx, ring);
> >>+		ring->last_context = ctx;
> >>+	}
> >
> >I think this is the wrong location and should be part of submitting the
> >context inside the engine (because intel_execlists_submission should not
> >as it is entirely duplicating the common GEM batch submission code and
> >the unique part is engine->add_request()).
> 
> So into engine->emit_request you are saying? That works just as well
> AFAICS, just making sure I understood correctly.

Oh, yeah you haven't got the unify add_request/emit_request patch...

So gen8_emit_request(), but considering the other patches in flight,
intel_logical_ring_advance_and_submit() will be the ultimate location.
-Chris
Nick Hoath Jan. 20, 2016, 2:21 p.m. UTC | #4
On 20/01/2016 14:06, Tvrtko Ursulin wrote:
>
> On 20/01/16 13:55, Chris Wilson wrote:
>> On Wed, Jan 20, 2016 at 01:40:57PM +0000, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>
>>> In GuC mode LRC pinning lifetime depends exclusively on the
>>> request lifetime. Since that is terminated by the seqno update,
>>> that opens up a race condition between GPU finishing writing
>>> out the context image and the driver unpinning the LRC.
>>>
>>> To extend the LRC lifetime we will employ a similar approach
>>> to what legacy ringbuffer submission does.
>>>
>>> We will start tracking the last submitted context per engine
>>> and keep it pinned until it is replaced by another one.
>>>
>>> Note that the driver unload path is a bit fragile and could
>>> benefit greatly from efforts to unify the legacy and exec
>>> list submission code paths.
>>>
>>> At the moment i915_gem_context_fini has special casing for the
>>> two which are potentially not needed, and also depends on
>>> i915_gem_cleanup_ringbuffer running before itself.
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Issue: VIZ-4277
>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Nick Hoath <nicholas.hoath@intel.com>
>>> ---
>>> I cannot test this with GuC but it passes BAT with execlists
>>> and some real world smoke tests.
>>> ---
>>>    drivers/gpu/drm/i915/i915_gem_context.c | 4 +++-
>>>    drivers/gpu/drm/i915/intel_lrc.c        | 7 +++++++
>>>    2 files changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>>> index c25083c78ba7..0b419e165836 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>>> @@ -438,7 +438,9 @@ void i915_gem_context_fini(struct drm_device *dev)
>>>    	for (i = 0; i < I915_NUM_RINGS; i++) {
>>>    		struct intel_engine_cs *ring = &dev_priv->ring[i];
>>>
>>> -		if (ring->last_context)
>>> +		if (ring->last_context && i915.enable_execlists)
>>> +			intel_lr_context_unpin(ring->last_context, ring);
>>> +		else if (ring->last_context)
>>>    			i915_gem_context_unreference(ring->last_context);
>>>
>>>    		ring->default_context = NULL;
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>> index 5c3f57fed916..b8a7e126d6d2 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -918,6 +918,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
>>>    	struct intel_engine_cs  *ring = params->ring;
>>>    	struct drm_i915_private *dev_priv = dev->dev_private;
>>>    	struct intel_ringbuffer *ringbuf = params->ctx->engine[ring->id].ringbuf;
>>> +	struct intel_context    *ctx = params->request->ctx;
>>>    	u64 exec_start;
>>>    	int instp_mode;
>>>    	u32 instp_mask;
>>> @@ -982,6 +983,12 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
>>>
>>>    	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>>>
>>> +	if (ring->last_context && ring->last_context != ctx) {
>>> +		intel_lr_context_unpin(ring->last_context, ring);
>>> +		intel_lr_context_pin(ctx, ring);
>>> +		ring->last_context = ctx;
>>> +	}
>>
>> I think this is the wrong location and should be part of submitting the
>> context inside the engine (because intel_execlists_submission should not
>> as it is entirely duplicating the common GEM batch submission code and
>> the unique part is engine->add_request()).
>
> So into engine->emit_request you are saying? That works just as well
> AFAICS, just making sure I understood correctly.

I think it should go into intel_logical_ring_advance_and_submit. The
extra pinning is being put in place to cover GPU usage of the pin. It
should probably therefore go into the last common place between
execlists & GuC, as close to hardware submission as possible.
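
A rough sketch of that placement, as a user-space model rather than driver code: the context swap sits in one common function, immediately before the backend-specific hand-off. All `sketch_*` names, the `use_guc` flag and the submit counters are hypothetical and only illustrate the ordering; the real intel_logical_ring_advance_and_submit is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; not the real i915 structures. */
struct sketch_context {
	int pin_count;
};

struct sketch_engine {
	struct sketch_context *last_context;
	bool use_guc;          /* hypothetical backend selector */
	int execlists_submits; /* counts hand-offs to the execlists path */
	int guc_submits;       /* counts hand-offs to the GuC path */
};

void sketch_pin(struct sketch_context *ctx)
{
	ctx->pin_count++;
}

void sketch_unpin(struct sketch_context *ctx)
{
	assert(ctx->pin_count > 0);
	ctx->pin_count--;
}

/*
 * Last point shared by both backends: keep the submitted context pinned
 * until another one replaces it, then hand off to the selected backend.
 */
void sketch_advance_and_submit(struct sketch_engine *engine,
			       struct sketch_context *ctx)
{
	if (engine->last_context != ctx) {
		if (engine->last_context)
			sketch_unpin(engine->last_context);
		engine->last_context = ctx;
		sketch_pin(engine->last_context);
	}

	if (engine->use_guc)
		engine->guc_submits++;
	else
		engine->execlists_submits++;
}
```

Because the swap happens before the branch, both submission paths get the extended pin without each needing its own copy of the logic.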

>
>> Note that it should be:
>>
>> if (engine->last_context != request->ctx) {
>> 	if (engine->last_context)
>> 		intel_lr_context_unpin(engine->last_context, engine);
>> 	engine->last_context = request->ctx;
>> 	intel_lr_context_pin(engine->last_context, engine);
>> }
>
> Ooops!
>
> Regards,
>
> Tvrtko
>

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index c25083c78ba7..0b419e165836 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -438,7 +438,9 @@  void i915_gem_context_fini(struct drm_device *dev)
 	for (i = 0; i < I915_NUM_RINGS; i++) {
 		struct intel_engine_cs *ring = &dev_priv->ring[i];
 
-		if (ring->last_context)
+		if (ring->last_context && i915.enable_execlists)
+			intel_lr_context_unpin(ring->last_context, ring);
+		else if (ring->last_context)
 			i915_gem_context_unreference(ring->last_context);
 
 		ring->default_context = NULL;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 5c3f57fed916..b8a7e126d6d2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -918,6 +918,7 @@  int intel_execlists_submission(struct i915_execbuffer_params *params,
 	struct intel_engine_cs  *ring = params->ring;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_ringbuffer *ringbuf = params->ctx->engine[ring->id].ringbuf;
+	struct intel_context    *ctx = params->request->ctx;
 	u64 exec_start;
 	int instp_mode;
 	u32 instp_mask;
@@ -982,6 +983,12 @@  int intel_execlists_submission(struct i915_execbuffer_params *params,
 
 	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
 
+	if (ring->last_context && ring->last_context != ctx) {
+		intel_lr_context_unpin(ring->last_context, ring);
+		intel_lr_context_pin(ctx, ring);
+		ring->last_context = ctx;
+	}
+
 	i915_gem_execbuffer_move_to_active(vmas, params->request);
 	i915_gem_execbuffer_retire_commands(params);