| Message ID | 20180531185204.19520-5-chris@chris-wilson.co.uk (mailing list archive) |
|---|---|
| State | New, archived |
On 31/05/2018 19:51, Chris Wilson wrote:
> In the following patch, we will process the CSB interrupt under the
> timeline.lock and not under the tasklet lock. This also means that we

Implied tasklet lock? Or tasklet serialization?

> will need to protect access to common variables such as
> execlists->csb_head with the timeline.lock during reset.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index d207a1bf9dc9..ec54c29b610f 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1927,8 +1927,7 @@ static void execlists_reset(struct intel_engine_cs *engine,
>  		    engine->name, request ? request->global_seqno : 0,
>  		    intel_engine_get_seqno(engine));
>
> -	/* See execlists_cancel_requests() for the irq/spinlock split. */
> -	local_irq_save(flags);
> +	spin_lock_irqsave(&engine->timeline.lock, flags);
>
>  	/*
>  	 * Catch up with any missed context-switch interrupts.
> @@ -1943,14 +1942,12 @@ static void execlists_reset(struct intel_engine_cs *engine,
>  	reset_irq(engine);

There is a synchronize_hardirq in this one so this could be a deadlock I
think depending on lock ordering. If another CPU enters the irq handler
and tries to take the timeline lock which this thread already has, then
they deadlock each other.

Regards,

Tvrtko

>
>  	/* Push back any incomplete requests for replay after the reset. */
> -	spin_lock(&engine->timeline.lock);
>  	__unwind_incomplete_requests(engine);
> -	spin_unlock(&engine->timeline.lock);
>
>  	/* Following the reset, we need to reload the CSB read/write pointers */
>  	engine->execlists.csb_head = GEN8_CSB_ENTRIES - 1;
>
> -	local_irq_restore(flags);
> +	spin_unlock_irqrestore(&engine->timeline.lock, flags);
>
>  	/*
>  	 * If the request was innocent, we leave the request in the ELSP
>
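For readers following the locking argument: the scenario Tvrtko describes reduces to the generic pattern below, in which one CPU holds a spinlock with local interrupts off and then waits for the hard irq handler to drain, while that handler, running on another CPU, is spinning on the same lock. This is only a sketch of the pattern with hypothetical names (my_lock, my_irq, my_handler); it is not the actual i915 reset or interrupt code.

```c
#include <linux/interrupt.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(my_lock);	/* hypothetical stand-in for engine->timeline.lock */
static unsigned int my_irq;		/* hypothetical irq number */

/* Hard irq handler, possibly running on another CPU. */
static irqreturn_t my_handler(int irq, void *data)
{
	spin_lock(&my_lock);		/* B: spins until the reset path drops the lock */
	/* ... consume CSB entries ... */
	spin_unlock(&my_lock);
	return IRQ_HANDLED;
}

/* Reset path. */
static void my_reset(void)
{
	unsigned long flags;

	spin_lock_irqsave(&my_lock, flags);	/* A: lock held, local irqs off */

	/*
	 * A now waits for any in-flight my_handler() to return, but B cannot
	 * return until A releases my_lock: neither side makes progress.
	 */
	synchronize_hardirq(my_irq);

	spin_unlock_irqrestore(&my_lock, flags);
}
```

Whether this bites in practice depends on whether the interrupt handler really takes timeline.lock, which is exactly the ordering question raised above.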
Quoting Tvrtko Ursulin (2018-06-04 15:25:26)
>
> On 31/05/2018 19:51, Chris Wilson wrote:
> > In the following patch, we will process the CSB interrupt under the
> > timeline.lock and not under the tasklet lock. This also means that we
>
> Implied tasklet lock? Or tasklet serialization?

We were using the tasklet to provide exclusive access to the port[], so
were using it as a serialisation primitive and lock.

> > will need to protect access to common variables such as
> > execlists->csb_head with the timeline.lock during reset.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/intel_lrc.c | 7 ++-----
> >  1 file changed, 2 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index d207a1bf9dc9..ec54c29b610f 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -1927,8 +1927,7 @@ static void execlists_reset(struct intel_engine_cs *engine,
> >  		    engine->name, request ? request->global_seqno : 0,
> >  		    intel_engine_get_seqno(engine));
> >
> > -	/* See execlists_cancel_requests() for the irq/spinlock split. */
> > -	local_irq_save(flags);
> > +	spin_lock_irqsave(&engine->timeline.lock, flags);
> >
> >  	/*
> >  	 * Catch up with any missed context-switch interrupts.
> > @@ -1943,14 +1942,12 @@ static void execlists_reset(struct intel_engine_cs *engine,
> >  	reset_irq(engine);
>
> There is a synchronize_hardirq in this one so this could be a deadlock I
> think depending on lock ordering. If another CPU enters the irq handler
> and tries to take the timeline lock which this thread already has, then
> they deadlock each other.

Nothing, and that would explain the deadlock in the other thread.
-Chris
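As a rough illustration of Chris's point about the tasklet acting as a serialisation primitive: a given tasklet never runs concurrently with itself, so state touched only from its callback needs no explicit lock, and other contexts can exclude it with tasklet_disable()/tasklet_enable(). The names below (my_tasklet, my_func, my_state) are hypothetical; this shows the idiom, not the execlists code.

```c
#include <linux/interrupt.h>

static unsigned long my_state;			/* hypothetical; think execlists->port[] */
static struct tasklet_struct my_tasklet;

/* Runs only as the tasklet; a tasklet never runs against itself,
 * so my_state needs no explicit lock here. */
static void my_func(unsigned long data)
{
	my_state++;
}

static void my_setup(void)
{
	tasklet_init(&my_tasklet, my_func, 0);
}

static void my_kick(void)
{
	tasklet_schedule(&my_tasklet);		/* runs my_func() once, serially */
}

/* Another path (e.g. reset) "takes the lock" by keeping the tasklet from running. */
static void my_exclusive_section(void)
{
	tasklet_disable(&my_tasklet);		/* also waits for a running callback to finish */
	my_state = 0;				/* safe: my_func() cannot run concurrently */
	tasklet_enable(&my_tasklet);
}
```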
On 04/06/2018 16:29, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-06-04 15:25:26)
>>
>> On 31/05/2018 19:51, Chris Wilson wrote:
>>> In the following patch, we will process the CSB interrupt under the
>>> timeline.lock and not under the tasklet lock. This also means that we
>>
>> Implied tasklet lock? Or tasklet serialization?
>
> We were using the tasklet to provide exclusive access to the port[], so
> were using it as a serialisation primitive and lock.

Yes of course, I was just thinking whether it is worth being more verbose
with the "tasklet lock" wording. I just think of it more as "serialized by
the tasklet" and not so much as the lock.

>>> will need to protect access to common variables such as
>>> execlists->csb_head with the timeline.lock during reset.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>   drivers/gpu/drm/i915/intel_lrc.c | 7 ++-----
>>>   1 file changed, 2 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>> index d207a1bf9dc9..ec54c29b610f 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -1927,8 +1927,7 @@ static void execlists_reset(struct intel_engine_cs *engine,
>>>   		    engine->name, request ? request->global_seqno : 0,
>>>   		    intel_engine_get_seqno(engine));
>>>
>>> -	/* See execlists_cancel_requests() for the irq/spinlock split. */
>>> -	local_irq_save(flags);
>>> +	spin_lock_irqsave(&engine->timeline.lock, flags);
>>>
>>>   	/*
>>>   	 * Catch up with any missed context-switch interrupts.
>>> @@ -1943,14 +1942,12 @@ static void execlists_reset(struct intel_engine_cs *engine,
>>>   	reset_irq(engine);
>>
>> There is a synchronize_hardirq in this one so this could be a deadlock I
>> think depending on lock ordering. If another CPU enters the irq handler
>> and tries to take the timeline lock which this thread already has, then
>> they deadlock each other.
>
> Nothing, and that would explain the deadlock in the other thread.

Nothing what? :)

Regards,

Tvrtko
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d207a1bf9dc9..ec54c29b610f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1927,8 +1927,7 @@ static void execlists_reset(struct intel_engine_cs *engine,
 		    engine->name, request ? request->global_seqno : 0,
 		    intel_engine_get_seqno(engine));
 
-	/* See execlists_cancel_requests() for the irq/spinlock split. */
-	local_irq_save(flags);
+	spin_lock_irqsave(&engine->timeline.lock, flags);
 
 	/*
 	 * Catch up with any missed context-switch interrupts.
@@ -1943,14 +1942,12 @@ static void execlists_reset(struct intel_engine_cs *engine,
 	reset_irq(engine);
 
 	/* Push back any incomplete requests for replay after the reset. */
-	spin_lock(&engine->timeline.lock);
 	__unwind_incomplete_requests(engine);
-	spin_unlock(&engine->timeline.lock);
 
 	/* Following the reset, we need to reload the CSB read/write pointers */
 	engine->execlists.csb_head = GEN8_CSB_ENTRIES - 1;
 
-	local_irq_restore(flags);
+	spin_unlock_irqrestore(&engine->timeline.lock, flags);
 
 	/*
 	 * If the request was innocent, we leave the request in the ELSP
In the following patch, we will process the CSB interrupt under the
timeline.lock and not under the tasklet lock. This also means that we
will need to protect access to common variables such as
execlists->csb_head with the timeline.lock during reset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_lrc.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)
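The net effect of the change described above is that the reset path and the CSB-processing path agree on a single lock, so shared fields such as csb_head are only touched with engine->timeline.lock held. A minimal sketch of that shared-lock pattern, with hypothetical names (csb_lock, csb_head_sketch, CSB_ENTRIES) standing in for the driver's structures:

```c
#include <linux/spinlock.h>

#define CSB_ENTRIES 6				/* hypothetical; mirrors GEN8_CSB_ENTRIES */

static DEFINE_SPINLOCK(csb_lock);		/* stand-in for engine->timeline.lock */
static unsigned int csb_head_sketch;		/* stand-in for execlists->csb_head */

/* Interrupt/tasklet side: advances the read pointer under the lock. */
static void process_csb_sketch(void)
{
	spin_lock(&csb_lock);
	csb_head_sketch = (csb_head_sketch + 1) % CSB_ENTRIES;
	spin_unlock(&csb_lock);
}

/* Reset side: reloads the read pointer under the same lock, so it cannot
 * race with process_csb_sketch() running on another CPU. */
static void reset_csb_sketch(void)
{
	unsigned long flags;

	spin_lock_irqsave(&csb_lock, flags);
	csb_head_sketch = CSB_ENTRIES - 1;
	spin_unlock_irqrestore(&csb_lock, flags);
}
```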