[v1] FOR-CI: drm/i915/guc: Disable ct receive tasklet during reset preparation

Message ID 20241028232632.1951286-1-zhanjun.dong@intel.com (mailing list archive)
State New

Commit Message

Zhanjun Dong Oct. 28, 2024, 11:26 p.m. UTC
During GuC reset preparation, interrupts are disabled. If an interrupt
event has already fired and is still being handled, there is always some
latency between the interrupt and the tasklet actually running. When
that latency is long, two rare race conditions are possible:
1. The tasklet runs after the IRQ flush and adds a request to the queue
after the worker flush has started. This causes unexpected G2H message
request processing after the reset prepare code has already destroyed
the context, and the request handler reports an error about bad context
state.
2. The tasklet runs after intel_guc_submission_reset_prepare, so
ct_try_receive_message starts to run after intel_uc_reset_prepare has
already finished the GuC sanitize and set ct->enable to false. This
causes a warning about an incorrect ct->enable state.

Fix this by disabling the CT receive tasklet during reset preparation to
avoid the above race conditions.
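
The gating behaviour this fix relies on can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: the
model_* names and struct model_tasklet are hypothetical and are not kernel
APIs, and the model is single-threaded, so it does not capture
tasklet_disable() waiting for a handler that is already running. It shows
only that a tasklet scheduled before or during the prepare phase stays
pending while disabled and runs once it is enabled again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; NOT the kernel's struct tasklet_struct. */
struct model_tasklet {
	int disable_count;	/* > 0 means the handler must not run */
	bool scheduled;		/* set by the "interrupt", cleared when the handler runs */
	int runs;		/* number of times the handler actually ran */
};

/* Models the interrupt handler queuing the tasklet. */
void model_schedule(struct model_tasklet *t)
{
	t->scheduled = true;
}

/* Models one softirq pass: the handler runs only if scheduled and not disabled. */
bool model_softirq_pass(struct model_tasklet *t)
{
	if (!t->scheduled || t->disable_count > 0)
		return false;	/* stays pending while disabled */
	t->scheduled = false;
	t->runs++;
	return true;
}

void model_disable(struct model_tasklet *t)
{
	t->disable_count++;
}

void model_enable(struct model_tasklet *t)
{
	assert(t->disable_count > 0);	/* enable must pair with a disable */
	t->disable_count--;
}

/*
 * Mirrors the ordering in this patch: disable, do the flush work, enable.
 * Returns how many times the handler ran while the flush was in progress.
 */
int model_reset_prepare(struct model_tasklet *t)
{
	int before = t->runs;
	int during;

	model_disable(t);
	model_softirq_pass(t);	/* a late softirq pass arriving mid-prepare */
	during = t->runs - before;
	model_enable(t);

	return during;
}
```

In this model, a tasklet scheduled before model_reset_prepare() never runs
inside it, and the first softirq pass after the function returns picks it
up.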

Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

Patch

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 9ede6f240d79..f82fec33c432 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1684,15 +1684,20 @@  void intel_guc_submission_reset_prepare(struct intel_guc *guc)
 	guc->interrupts.disable(guc);
 	__reset_guc_busyness_stats(guc);
 
-	/* Flush IRQ handler */
-	spin_lock_irq(guc_to_gt(guc)->irq_lock);
-	spin_unlock_irq(guc_to_gt(guc)->irq_lock);
+	/*
+	 * Disable the tasklet until the end of prepare. If the tasklet is
+	 * running, tasklet_disable() waits until it has finished.
+	 */
+	tasklet_disable(&guc->ct.receive_tasklet);
 
 	guc_flush_submissions(guc);
 	guc_flush_destroyed_contexts(guc);
 	flush_work(&guc->ct.requests.worker);
 
 	scrub_guc_desc_for_outstanding_g2h(guc);
+
+	/* Enable tasklet at the end, before HW reset */
+	tasklet_enable(&guc->ct.receive_tasklet);
 }
 
 static struct intel_engine_cs *