Message ID | 20180507135731.10587-2-chris@chris-wilson.co.uk (mailing list archive)
---|---
State | New, archived |
On 07/05/2018 14:57, Chris Wilson wrote:
> During request submission, we call the engine->schedule() function so
> that we may reorder the active requests as required for inheriting the
> new request's priority. This may schedule several tasklets to run on the
> local CPU, but we will need to schedule the tasklets again for the new
> request. Delay all the local tasklets until the end, so that we only
> have to process the queue just once.
>
> v2: Beware PREEMPT_RCU, as local_bh_disable() is then not a
> superset of rcu_read_lock().
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_request.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index e4cf76ec14a6..f336942229cf 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1110,12 +1110,11 @@ void __i915_request_add(struct i915_request *request, bool flush_caches)
>  	 * decide whether to preempt the entire chain so that it is ready to
>  	 * run at the earliest possible convenience.
>  	 */
> -	rcu_read_lock();
> +	local_bh_disable();
> +	rcu_read_lock(); /* RCU serialisation for set-wedged protection */
>  	if (engine->schedule)
>  		engine->schedule(request, &request->ctx->sched);
>  	rcu_read_unlock();
> -
> -	local_bh_disable();
>  	i915_sw_fence_commit(&request->submit);
>  	local_bh_enable(); /* Kick the execlists tasklet if just scheduled */

Before I wasn't sure about this one, since it lengthens the softirq-off
section and still doesn't make the re-schedule atomic. But today I am
thinking that in normal use dependency chains should not be deep enough
for it to become an issue. In the light of that, go for it:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
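The v2 note quoted above is the reason the explicit rcu_read_lock() survives inside the local_bh_disable() window: with CONFIG_PREEMPT_RCU the read-side critical section is tracked by its own per-task nesting count, so disabling bottom halves by itself does not make the task an RCU reader. The toy userspace sketch below models that distinction only; every name in it is invented for illustration and none of it is kernel or i915 API.

```c
/* Toy model of the v2 caveat: with PREEMPT_RCU, only the explicit
 * reader nesting makes a task an RCU reader; "bh off" alone does not.
 * All names are hypothetical; this is not kernel code. */
#include <stdbool.h>
#include <stdio.h>

static int bh_off_nesting;    /* stands in for the softirq-off count */
static int rcu_read_nesting;  /* stands in for the per-task reader count */

/* Would a grace period have to wait for this task right now? */
static bool in_rcu_read_section(void)
{
	/* Under PREEMPT_RCU only the explicit reader nesting matters. */
	return rcu_read_nesting > 0;
}

int main(void)
{
	bh_off_nesting++;    /* models "local_bh_disable()" */
	printf("bh off only  : rcu reader? %s\n",
	       in_rcu_read_section() ? "yes" : "no");

	rcu_read_nesting++;  /* models "rcu_read_lock()" */
	printf("bh off + rcu : rcu reader? %s\n",
	       in_rcu_read_section() ? "yes" : "no");

	rcu_read_nesting--;  /* models "rcu_read_unlock()" */
	bh_off_nesting--;    /* models "local_bh_enable()" */
	return 0;
}
```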
Quoting Tvrtko Ursulin (2018-05-08 11:02:38)
> On 07/05/2018 14:57, Chris Wilson wrote:
> > During request submission, we call the engine->schedule() function so
> > that we may reorder the active requests as required for inheriting the
> > new request's priority. This may schedule several tasklets to run on the
> > local CPU, but we will need to schedule the tasklets again for the new
> > request. Delay all the local tasklets until the end, so that we only
> > have to process the queue just once.
> >
> > v2: Beware PREEMPT_RCU, as local_bh_disable() is then not a
> > superset of rcu_read_lock().
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_request.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index e4cf76ec14a6..f336942229cf 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1110,12 +1110,11 @@ void __i915_request_add(struct i915_request *request, bool flush_caches)
> >  	 * decide whether to preempt the entire chain so that it is ready to
> >  	 * run at the earliest possible convenience.
> >  	 */
> > -	rcu_read_lock();
> > +	local_bh_disable();
> > +	rcu_read_lock(); /* RCU serialisation for set-wedged protection */
> >  	if (engine->schedule)
> >  		engine->schedule(request, &request->ctx->sched);
> >  	rcu_read_unlock();
> > -
> > -	local_bh_disable();
> >  	i915_sw_fence_commit(&request->submit);
> >  	local_bh_enable(); /* Kick the execlists tasklet if just scheduled */
>
> Before I wasn't sure about this one, since it lengthens the softirq-off
> section and still doesn't make the re-schedule atomic. But today I am
> thinking that in normal use dependency chains should not be deep enough
> for it to become an issue. In the light of that, go for it:

We have big problems when the dependency chain grows, simply because we
recursively build a linear list and then walk it again. It's not the worst
nightmare in the code, and I don't think anyone has complained about it
yet, but we do have a few pathological igt tests (gem_exec_schedule
deep/wide) to show that it can be the rate-limiting step.

As far as blocking tasklets across this goes: only the local tasklet is
blocked, and for good reason, as we reorder the queues. A tasklet on
another CPU is also (mostly) blocked by the spinlock (and local irq off).
Overall I don't think the problem is exacerbated at all by the
local_bh_disable().
-Chris
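To make the "recursively build a linear list and then walk it again" remark concrete, here is a toy userspace sketch of that two-pass shape: first flatten the new request's dependency chain into a flat list, then walk the list bumping priorities, so the cost per submission grows with chain depth. The struct and function names below are invented for this sketch and are not the real i915 scheduler, which additionally handles locking, duplicate dependencies and priority buckets.

```c
/* Hypothetical sketch of priority inheritance as a two-pass walk:
 * pass 1 flattens the dependency chain, pass 2 raises priorities.
 * Names are illustrative only, not i915 code. */
#include <stdio.h>

#define MAX_DEPS 4

struct fake_request {
	int priority;
	int nr_deps;
	struct fake_request *deps[MAX_DEPS];
};

/* Pass 1: recursively flatten the dependency chain into a linear list. */
static int flatten(struct fake_request *rq, struct fake_request **list, int n)
{
	int i;

	list[n++] = rq;
	for (i = 0; i < rq->nr_deps; i++)
		n = flatten(rq->deps[i], list, n);
	return n;
}

int main(void)
{
	struct fake_request leaf = { .priority = 0 };
	struct fake_request mid = { .priority = 1, .nr_deps = 1, .deps = { &leaf } };
	struct fake_request new_rq = { .priority = 3, .nr_deps = 1, .deps = { &mid } };
	struct fake_request *list[16];
	int i, count;

	/* Pass 2: walk the flat list, raising ancestors to the new priority. */
	count = flatten(&new_rq, list, 0);
	for (i = 0; i < count; i++)
		if (list[i]->priority < new_rq.priority)
			list[i]->priority = new_rq.priority;

	printf("leaf priority bumped from 0 to %d\n", leaf.priority);
	return 0;
}
```

Because both passes are linear in the chain length, a sufficiently deep or wide chain (as exercised by gem_exec_schedule's deep/wide subtests) makes this walk the rate-limiting step Chris refers to.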
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index e4cf76ec14a6..f336942229cf 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1110,12 +1110,11 @@ void __i915_request_add(struct i915_request *request, bool flush_caches)
 	 * decide whether to preempt the entire chain so that it is ready to
 	 * run at the earliest possible convenience.
 	 */
-	rcu_read_lock();
+	local_bh_disable();
+	rcu_read_lock(); /* RCU serialisation for set-wedged protection */
 	if (engine->schedule)
 		engine->schedule(request, &request->ctx->sched);
 	rcu_read_unlock();
-
-	local_bh_disable();
 	i915_sw_fence_commit(&request->submit);
 	local_bh_enable(); /* Kick the execlists tasklet if just scheduled */
During request submission, we call the engine->schedule() function so
that we may reorder the active requests as required for inheriting the
new request's priority. This may schedule several tasklets to run on the
local CPU, but we will need to schedule the tasklets again for the new
request. Delay all the local tasklets until the end, so that we only
have to process the queue just once.

v2: Beware PREEMPT_RCU, as local_bh_disable() is then not a
superset of rcu_read_lock().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_request.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
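The batching the commit message describes can be modelled in a few lines of userspace C: while a bottom-half-off count is raised, tasklet kicks are only recorded, and a single deferred run happens when the count drops back to zero. The toy_* names below are invented for this sketch; the real mechanism is the softirq machinery behind local_bh_disable()/local_bh_enable(), which the patch relies on rather than reimplements.

```c
/* Toy model of deferring local tasklet kicks until bottom halves are
 * re-enabled, so a burst of kicks costs one queue pass. Hypothetical
 * names; not kernel or i915 code. */
#include <stdbool.h>
#include <stdio.h>

static int bh_disable_count;  /* models the local softirq-off nesting */
static bool tasklet_pending;  /* models a tasklet scheduled on this CPU */

static void toy_local_bh_disable(void)
{
	bh_disable_count++;
}

static void toy_tasklet_kick(const char *who)
{
	tasklet_pending = true;
	if (bh_disable_count == 0) {
		printf("%s: tasklet runs immediately\n", who);
		tasklet_pending = false;
	} else {
		printf("%s: kick deferred\n", who);
	}
}

static void toy_local_bh_enable(void)
{
	if (--bh_disable_count == 0 && tasklet_pending) {
		/* One pass over the queue, however many kicks arrived. */
		printf("bh_enable: tasklet runs once\n");
		tasklet_pending = false;
	}
}

int main(void)
{
	toy_local_bh_disable();
	toy_tasklet_kick("engine->schedule() reordering");
	toy_tasklet_kick("i915_sw_fence_commit()");
	toy_local_bh_enable();  /* the single deferred kick happens here */
	return 0;
}
```

With the patch, both the reordering done by engine->schedule() and the fence commit fall inside one such window, so the execlists tasklet is kicked once at local_bh_enable() instead of once per scheduling step.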