
[v2,1/7] drm/i915: Flush submission tasklet after bumping priority

Message ID 20180507135731.10587-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson May 7, 2018, 1:57 p.m. UTC
When called from process context, tasklet_schedule() defers itself to
ksoftirqd. From experience this may cause unacceptable latencies of over
200ms in executing the submission tasklet. Our goal is to reprioritise
the HW execution queue and trigger HW preemption immediately, so disable
bh over the call to schedule and force the tasklet to run afterwards if
scheduled.

v2: Keep rcu_read_lock() around for PREEMPT_RCU

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
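
For context, a minimal sketch (illustrative only; the helper name below is
made up and is not part of this patch) of the generic pattern the commit
message relies on: tasklet_schedule() called from process context only marks
the tasklet pending and leaves it to ksoftirqd, whereas wrapping the call in
local_bh_disable()/local_bh_enable() runs the pending softirq work
synchronously on this CPU as soon as bottom halves are re-enabled.

#include <linux/interrupt.h>

/* Illustrative helper, not part of the patch. */
static void kick_tasklet_now(struct tasklet_struct *t)
{
	local_bh_disable();	/* softirqs (and hence tasklets) deferred */
	tasklet_schedule(t);	/* only marks the tasklet pending */
	local_bh_enable();	/* pending softirqs run here, on this CPU,
				 * instead of waiting for ksoftirqd */
}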

Comments

Tvrtko Ursulin May 8, 2018, 9:40 a.m. UTC | #1
On 07/05/2018 14:57, Chris Wilson wrote:
> When called from process context tasklet_schedule() defers itself to
> ksoftirqd. From experience this may cause unacceptable latencies of over
> 200ms in executing the submission tasklet, our goal is to reprioritise
> the HW execution queue and trigger HW preemption immediately, so disable
> bh over the call to schedule and force the tasklet to run afterwards if
> scheduled.
> 
> v2: Keep rcu_read_lock() around for PREEMPT_RCU
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 5ece6ae4bdff..89bf5d67cb74 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -578,10 +578,12 @@ static void __fence_set_priority(struct dma_fence *fence,
>   	rq = to_request(fence);
>   	engine = rq->engine;
>   
> -	rcu_read_lock();
> +	local_bh_disable();
> +	rcu_read_lock(); /* RCU serialisation for set-wedged protection */
>   	if (engine->schedule)
>   		engine->schedule(rq, attr);
>   	rcu_read_unlock();
> +	local_bh_enable(); /* kick the tasklets if queues were reprioritised */
>   }
>   
>   static void fence_set_priority(struct dma_fence *fence,
> 

Is the sequence not the wrong way around? For the disable order I would 
expect disable preemption, then disable softirq, and of course the 
opposite for enable.

Regards,

Tvrtko
Chris Wilson May 8, 2018, 9:45 a.m. UTC | #2
Quoting Tvrtko Ursulin (2018-05-08 10:40:43)
> 
> On 07/05/2018 14:57, Chris Wilson wrote:
> > When called from process context tasklet_schedule() defers itself to
> > ksoftirqd. From experience this may cause unacceptable latencies of over
> > 200ms in executing the submission tasklet, our goal is to reprioritise
> > the HW execution queue and trigger HW preemption immediately, so disable
> > bh over the call to schedule and force the tasklet to run afterwards if
> > scheduled.
> > 
> > v2: Keep rcu_read_lock() around for PREEMPT_RCU
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_gem.c | 4 +++-
> >   1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 5ece6ae4bdff..89bf5d67cb74 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -578,10 +578,12 @@ static void __fence_set_priority(struct dma_fence *fence,
> >       rq = to_request(fence);
> >       engine = rq->engine;
> >   
> > -     rcu_read_lock();
> > +     local_bh_disable();
> > +     rcu_read_lock(); /* RCU serialisation for set-wedged protection */
> >       if (engine->schedule)
> >               engine->schedule(rq, attr);
> >       rcu_read_unlock();
> > +     local_bh_enable(); /* kick the tasklets if queues were reprioritised */
> >   }
> >   
> >   static void fence_set_priority(struct dma_fence *fence,
> > 
> 
> Is the sequence not the wrong way around? I would expect disable 
> preemption, then disable softirq, order for disable, on ofc opposite for 
> enable.

We disable preemption (i.e. softirq), then mark ->schedule as being RCU
protected; unwrap and re-enable preemption (kicking the softirq tasklets).

I felt it better to keep the RCU read lock tight around ->schedule than to
let it wrap local_bh_enable(), which would suggest that the tasklets might
need additional protection.
-Chris
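
A side-by-side sketch (illustrative only, not part of the patch) of the two
nestings under discussion. Assuming non-preemptible RCU, local_bh_disable()
already implies an RCU read-side critical section, so either order would be
safe; under PREEMPT_RCU it is the explicit rcu_read_lock() that provides the
read-side protection, which is why v2 keeps it, nested tightly around
->schedule().

	/* As in the patch: bh section outermost, RCU read side tight
	 * around ->schedule(); the tasklet kick in local_bh_enable()
	 * happens outside the RCU read side. */
	local_bh_disable();
	rcu_read_lock();
	if (engine->schedule)
		engine->schedule(rq, attr);
	rcu_read_unlock();
	local_bh_enable();

	/* The alternative ordering asked about: also correct, but the
	 * tasklet kick would then sit inside the RCU read side, hinting
	 * that it needs RCU protection when it does not. */
	rcu_read_lock();
	local_bh_disable();
	if (engine->schedule)
		engine->schedule(rq, attr);
	local_bh_enable();
	rcu_read_unlock();
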
Tvrtko Ursulin May 8, 2018, 9:57 a.m. UTC | #3
On 08/05/2018 10:45, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-05-08 10:40:43)
>>
>> On 07/05/2018 14:57, Chris Wilson wrote:
>>> When called from process context tasklet_schedule() defers itself to
>>> ksoftirqd. From experience this may cause unacceptable latencies of over
>>> 200ms in executing the submission tasklet, our goal is to reprioritise
>>> the HW execution queue and trigger HW preemption immediately, so disable
>>> bh over the call to schedule and force the tasklet to run afterwards if
>>> scheduled.
>>>
>>> v2: Keep rcu_read_lock() around for PREEMPT_RCU
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_gem.c | 4 +++-
>>>    1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>> index 5ece6ae4bdff..89bf5d67cb74 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -578,10 +578,12 @@ static void __fence_set_priority(struct dma_fence *fence,
>>>        rq = to_request(fence);
>>>        engine = rq->engine;
>>>    
>>> -     rcu_read_lock();
>>> +     local_bh_disable();
>>> +     rcu_read_lock(); /* RCU serialisation for set-wedged protection */
>>>        if (engine->schedule)
>>>                engine->schedule(rq, attr);
>>>        rcu_read_unlock();
>>> +     local_bh_enable(); /* kick the tasklets if queues were reprioritised */
>>>    }
>>>    
>>>    static void fence_set_priority(struct dma_fence *fence,
>>>
>>
>> Is the sequence not the wrong way around? I would expect disable
>> preemption, then disable softirq, order for disable, on ofc opposite for
>> enable.
> 
> We disable preemption (i.e. softirq) then mark ->schedule as being RCU
> protected; unwrap and re-enable preemption (kicking softirq tasklets).
> 
> I felt it better to keep the RCU tight to ->schedule than let it wrap
> local_bh_enable() suggesting that the tasklets might need additional
> protection.

Later I noticed that in a different thread Mika pointed out there is 
preemptible RCU as well, so my argument about ordering falls apart a bit.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5ece6ae4bdff..89bf5d67cb74 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -578,10 +578,12 @@  static void __fence_set_priority(struct dma_fence *fence,
 	rq = to_request(fence);
 	engine = rq->engine;
 
-	rcu_read_lock();
+	local_bh_disable();
+	rcu_read_lock(); /* RCU serialisation for set-wedged protection */
 	if (engine->schedule)
 		engine->schedule(rq, attr);
 	rcu_read_unlock();
+	local_bh_enable(); /* kick the tasklets if queues were reprioritised */
 }
 
 static void fence_set_priority(struct dma_fence *fence,