[24/24] drm/i915: Kill context before taking ctx->mutex

Message ID	20200603145713.3835124-24-maarten.lankhorst@linux.intel.com (mailing list archive)
State	New, archived
Headers
Return-Path: <SRS0=1CNh=7Q=lists.freedesktop.org=intel-gfx-bounces@kernel.org>
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AEC7520679
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
To: intel-gfx@lists.freedesktop.org
Date: Wed, 3 Jun 2020 16:57:13 +0200
Message-Id: <20200603145713.3835124-24-maarten.lankhorst@linux.intel.com>
In-Reply-To: <20200603145713.3835124-1-maarten.lankhorst@linux.intel.com>
References: <20200603145713.3835124-1-maarten.lankhorst@linux.intel.com>
MIME-Version: 1.0
Subject: [Intel-gfx] [PATCH 24/24] drm/i915: Kill context before taking ctx->mutex
Precedence: list
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>

Series
[01/24] Revert "drm/i915/gem: Drop relocation slowpath".
[02/24] drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.
[03/24] drm/i915: Remove locking from i915_gem_object_prepare_read/write
[04/24] drm/i915: Parse command buffer earlier in eb_relocate(slow)
[05/24] Revert "drm/i915/gem: Split eb_vma into its own allocation"
[06/24] drm/i915/gem: Make eb_add_lut interruptible wait on object lock.
[07/24] drm/i915: Use per object locking in execbuf, v11.
[08/24] drm/i915: Use ww locking in intel_renderstate.
[09/24] drm/i915: Add ww context handling to context_barrier_task
[10/24] drm/i915: Nuke arguments to eb_pin_engine
[11/24] drm/i915: Pin engine before pinning all objects, v4.
[12/24] drm/i915: Rework intel_context pinning to do everything outside of pin_mutex
[13/24] drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin.
[14/24] drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2.
[15/24] drm/i915: Kill last user of intel_context_create_request outside of selftests
[16/24] drm/i915: Convert i915_perf to ww locking as well
[17/24] drm/i915: Dirty hack to fix selftests locking inversion
[18/24] drm/i915/selftests: Fix locking inversion in lrc selftest.
[19/24] drm/i915: Use ww pinning for intel_context_create_request()
[20/24] drm/i915: Move i915_vma_lock in the selftests to avoid lock inversion, v2.
[21/24] drm/i915: Add ww locking to vm_fault_gtt
[22/24] drm/i915: Add ww locking to pin_to_display_plane
[23/24] drm/i915: Ensure we hold the pin mutex
[24/24] drm/i915: Kill context before taking ctx->mutex

Commit Message

Maarten Lankhorst June 3, 2020, 2:57 p.m. UTC

Killing the context before taking ctx->mutex fixes a hang in
gem_ctx_persistence.close-replace-race, where lut_close tries to
take obj->resv.lock, which is already held by execbuf, and stalls
indefinitely.

[ 1904.342847] 2 locks held by gem_ctx_persist/11520:
[ 1904.342849]  #0: ffff8882188e4968 (&ctx->mutex){+.+.}-{3:3}, at: context_close+0xe6/0x850 [i915]
[ 1904.342941]  #1: ffff88821c58a5a8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: lut_close+0x2c2/0xba0 [i915]
[ 1904.343033] 3 locks held by gem_ctx_persist/11521:
[ 1904.343035]  #0: ffffc900008ff938 (reservation_ww_class_acquire){+.+.}-{0:0}, at: i915_gem_do_execbuffer+0x103d/0x54c0 [i915]
[ 1904.343157]  #1: ffff88821c58a5a8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: eb_validate_vmas+0x602/0x2010 [i915]
[ 1904.343267]  #2: ffff88820afd9200 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x335/0x2300 [i915]
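
The ordering problem can be illustrated with a small single-threaded toy
model. This is only a sketch and not i915 code: every toy_* name is
hypothetical, and it assumes that the stalled execbuf drops the object
lock only once the context has been killed, which the lock dump above
suggests but does not prove.

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in i915. */
struct toy_ctx {
	bool killed;	/* set once the context has been killed */
	bool resv_held;	/* object lock held by a stalled "execbuf" */
};

/*
 * Assumption: killing the context lets the stalled execbuf finish
 * and release the object lock.
 */
static void toy_kill_context(struct toy_ctx *ctx)
{
	ctx->killed = true;
	ctx->resv_held = false;
}

/*
 * Stand-in for lut_close(): returns false where the real code would
 * block forever waiting for the object lock.
 */
static bool toy_lut_close(const struct toy_ctx *ctx)
{
	return !ctx->resv_held;
}

/* Old order: lut_close under ctx->mutex first, kill afterwards. */
static bool toy_close_old(struct toy_ctx *ctx)
{
	if (!toy_lut_close(ctx))
		return false;	/* hang: the kill below is never reached */
	toy_kill_context(ctx);
	return true;
}

/* New order from this patch: kill first, then lut_close. */
static bool toy_close_new(struct toy_ctx *ctx)
{
	toy_kill_context(ctx);
	return toy_lut_close(ctx);
}
```

Starting from a context with resv_held set, toy_close_old() returns
false without ever killing the context, while toy_close_new() returns
true. The model ignores persistence and hangcheck settings, which in the
real code decide whether kill_context() runs at all.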

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 24 ++++++++++-----------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 2048b21ac8b2..05df7ffff624 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -623,6 +623,18 @@  static void context_close(struct i915_gem_context *ctx)
 	i915_gem_context_set_closed(ctx);
 	mutex_unlock(&ctx->engines_mutex);
 
+	/*
+	 * If the user has disabled hangchecking, we can not be sure that
+	 * the batches will ever complete after the context is closed,
+	 * keeping the context and all resources pinned forever. So in this
+	 * case we opt to forcibly kill off all remaining requests on
+	 * context close.
+	 */
+	if (!i915_gem_context_is_persistent(ctx) ||
+	    !i915_modparams.enable_hangcheck)
+		kill_context(ctx);
+
+
 	mutex_lock(&ctx->mutex);
 
 	set_closed_name(ctx);
@@ -641,18 +653,6 @@  static void context_close(struct i915_gem_context *ctx)
 	lut_close(ctx);
 
 	mutex_unlock(&ctx->mutex);
-
-	/*
-	 * If the user has disabled hangchecking, we can not be sure that
-	 * the batches will ever complete after the context is closed,
-	 * keeping the context and all resources pinned forever. So in this
-	 * case we opt to forcibly kill off all remaining requests on
-	 * context close.
-	 */
-	if (!i915_gem_context_is_persistent(ctx) ||
-	    !i915_modparams.enable_hangcheck)
-		kill_context(ctx);
-
 	i915_gem_context_put(ctx);
 }
