drm/i915/gt: Ensure irqs' status does not change with spin_unlock

Message ID gtrmxhovj2qpmcica76f7uxharhztbt7fdoromyxbsd7ltjvuq@lhnv2wcxm7or (mailing list archive)
State New
Series drm/i915/gt: Ensure irqs' status does not change with spin_unlock

Commit Message

Krzysztof Karas Jan. 10, 2025, 2:08 p.m. UTC
The spin_unlock() function enables irqs regardless of their state
before spin_lock() was called. This might result in an interrupt
arriving while a lock is held further down in the execution, as seen
in GitLab issue #13399.

Try to remedy the problem by saving the irq state before acquiring
the spin lock and restoring it on unlock.

Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
---
This issue is hit rarely on CI and I was not able to reproduce
it locally. There might be more places where we should save and
restore irq state, so I am not adding a "Closes" tag for the
issue yet.

 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
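
For readers unfamiliar with the save/restore pattern the patch switches to, here is a minimal userspace sketch of its intent. It is not kernel code: every model_* name is hypothetical, the "irq state" is just a boolean, and the model returns the flags from the lock helper, whereas the real spin_lock_irqsave() is a macro that writes into its flags argument. It only illustrates that the state found on entry is the state left on exit, whether irqs were on or off beforehand.

```c
/* Userspace model only: this is NOT kernel code. Every model_* name is hypothetical. */
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the local CPU's interrupt-enable flag. */
static bool model_irqs_enabled = true;

/* Stand-in for a spinlock; it only tracks whether it is held. */
struct model_lock {
	bool held;
};

/* Models spin_lock_irqsave(): record the current irq state, disable, lock. */
static unsigned long model_lock_irqsave(struct model_lock *lock)
{
	unsigned long flags = model_irqs_enabled ? 1UL : 0UL;

	model_irqs_enabled = false;
	assert(!lock->held);
	lock->held = true;
	return flags;
}

/* Models spin_unlock_irqrestore(): unlock, then put the recorded state back. */
static void model_unlock_irqrestore(struct model_lock *lock, unsigned long flags)
{
	assert(lock->held);
	lock->held = false;
	model_irqs_enabled = (flags != 0);
}

/*
 * Runs one critical section starting from the given irq state and
 * reports the irq state observed after the unlock.
 */
static bool model_state_after_section(bool irqs_on_entry)
{
	struct model_lock lock = { .held = false };
	unsigned long flags;

	model_irqs_enabled = irqs_on_entry;
	flags = model_lock_irqsave(&lock);
	assert(!model_irqs_enabled);	/* irqs are off inside the section */
	model_unlock_irqrestore(&lock, flags);
	return model_irqs_enabled;
}
```

Calling model_state_after_section(true) and model_state_after_section(false) returns the entry state in both cases, which is the property the patch relies on for guc_state.lock.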

Comments

Krzysztof Karas Jan. 13, 2025, 2:06 p.m. UTC | #1
Hi Maciej,

> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index 12f1ba7ca9c1..e9102f7246f5 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -4338,10 +4338,11 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
> >   static void guc_retire_inflight_request_prio(struct i915_request *rq)
> >   {
> >   	struct intel_context *ce = request_to_scheduling_context(rq);
> > +	unsigned long flags;
> > -	spin_lock(&ce->guc_state.lock);
> > +	spin_lock_irqsave(&ce->guc_state.lock, flags);
> >   	guc_prio_fini(rq, ce);
> > -	spin_unlock(&ce->guc_state.lock);
> > +	spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> >   }
> >   static void sanitize_hwsp(struct intel_engine_cs *engine)
> 
> guc_retire_inflight_request_prio() is called from signal_irq_work()
> in intel_breadcrumbs.c.
> 
> There is a similar situation at:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c?h=v6.13-rc7#n255
> 
> There, too, spin_(un)lock is used while IRQs are potentially disabled.
> 
> Should it also be addressed?
Thanks for spotting this. Yes, I believe we should also address
other spin locks/unlocks on this path
(inside list_for_each_entry_rcu loops for example), as they
might cause similar problems. I will include these in v2.

Krzysztof
> 
> Regards,
> 
> Maciej
>
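
As a rough illustration of the concern discussed in this reply (locks taken inside a loop on a path where irqs may already be disabled), the following userspace sketch compares two inner unlock styles. It is an assumption-laden model, not the i915 code: all sim_* names are hypothetical, the loop stands in for a list walk, and the "unconditional" branch models the behaviour of the kernel's spin_lock_irq()/spin_unlock_irq() pair, which enables irqs on unlock without looking at the prior state.

```c
/* Userspace model only, not kernel code. Every sim_* name is hypothetical. */
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the local CPU's interrupt-enable flag. */
static bool sim_irqs_enabled;

/* Models local_irq_save(): return the prior state, then disable. */
static unsigned long sim_irq_save(void)
{
	unsigned long flags = sim_irqs_enabled ? 1UL : 0UL;

	sim_irqs_enabled = false;
	return flags;
}

/* Models local_irq_restore(): put back exactly what was saved. */
static void sim_irq_restore(unsigned long flags)
{
	sim_irqs_enabled = (flags != 0);
}

/*
 * Walks `entries` items with irqs disabled by the caller and takes an
 * inner "lock" per item. Returns how many times irqs were found enabled
 * after an inner unlock, i.e. while the outer section was still running.
 */
static int sim_walk(int entries, bool unconditional_enable)
{
	unsigned long outer_flags, inner_flags;
	int leaked = 0;

	assert(entries >= 0);
	sim_irqs_enabled = true;
	outer_flags = sim_irq_save();		/* outer section: irqs now off */

	for (int i = 0; i < entries; i++) {
		inner_flags = sim_irq_save();	/* inner lock */
		if (unconditional_enable)
			sim_irqs_enabled = true;	/* spin_unlock_irq()-like */
		else
			sim_irq_restore(inner_flags);	/* irqrestore-like */
		if (sim_irqs_enabled)
			leaked++;
	}

	sim_irq_restore(outer_flags);
	return leaked;
}
```

In this model sim_walk(3, false) reports no iteration with irqs enabled inside the outer section, while sim_walk(3, true) reports one per entry; the save/restore form is therefore the one that is safe to nest regardless of the caller's irq state.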

Patch

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 12f1ba7ca9c1..e9102f7246f5 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -4338,10 +4338,11 @@  static void guc_bump_inflight_request_prio(struct i915_request *rq,
 static void guc_retire_inflight_request_prio(struct i915_request *rq)
 {
 	struct intel_context *ce = request_to_scheduling_context(rq);
+	unsigned long flags;
 
-	spin_lock(&ce->guc_state.lock);
+	spin_lock_irqsave(&ce->guc_state.lock, flags);
 	guc_prio_fini(rq, ce);
-	spin_unlock(&ce->guc_state.lock);
+	spin_unlock_irqrestore(&ce->guc_state.lock, flags);
 }
 
 static void sanitize_hwsp(struct intel_engine_cs *engine)