diff mbox

[1/5] i915: avoid kernel hang caused by synchronize rcu struct_mutex deadlock

Message ID 20170406232347.988-2-aarcange@redhat.com (mailing list archive)
State New, archived

Commit Message

Andrea Arcangeli April 6, 2017, 11:23 p.m. UTC
synchronize_rcu(), synchronize_sched() and synchronize_rcu_expedited()
block until their own workqueues have run. The i915 GEM workqueues, in
turn, wait for struct_mutex to be released. So we cannot wait for a
quiescent state with those RCU primitives while holding struct_mutex:
doing so creates a circular lock dependency that hangs the kernel. The
hang is reproducible but goes undetected by lockdep.

This started in commit 3d3d18f086cdda72ee18a454db70ca72c6e3246c.

kswapd0         D    0   700      2 0x00000000
Call Trace:
? __schedule+0x1a5/0x660
? schedule+0x36/0x80
? _synchronize_rcu_expedited.constprop.65+0x2ef/0x300
? wake_up_bit+0x20/0x20
? rcu_stall_kick_kthreads.part.54+0xc0/0xc0
? rcu_exp_wait_wake+0x530/0x530
? i915_gem_shrink+0x34b/0x4b0
? i915_gem_shrinker_scan+0x7c/0x90
? i915_gem_shrinker_scan+0x7c/0x90
? shrink_slab.part.61.constprop.72+0x1c1/0x3a0
? shrink_zone+0x154/0x160
? kswapd+0x40a/0x720
? kthread+0xf4/0x130
? try_to_free_pages+0x450/0x450
? kthread_create_on_node+0x40/0x40
? ret_from_fork+0x23/0x30
plasmashell     D    0  4657   4614 0x00000000
Call Trace:
? __schedule+0x1a5/0x660
? schedule+0x36/0x80
? schedule_preempt_disabled+0xe/0x10
? __mutex_lock.isra.4+0x1c9/0x790
? i915_gem_close_object+0x26/0xc0
? i915_gem_close_object+0x26/0xc0
? drm_gem_object_release_handle+0x48/0x90
? drm_gem_handle_delete+0x50/0x80
? drm_ioctl+0x1fa/0x420
? drm_gem_handle_create+0x40/0x40
? pipe_write+0x391/0x410
? __vfs_write+0xc6/0x120
? do_vfs_ioctl+0x8b/0x5d0
? SyS_ioctl+0x3b/0x70
? entry_SYSCALL_64_fastpath+0x13/0x94
kworker/0:0     D    0 29186      2 0x00000000
Workqueue: events __i915_gem_free_work
Call Trace:
? __schedule+0x1a5/0x660
? schedule+0x36/0x80
? schedule_preempt_disabled+0xe/0x10
? __mutex_lock.isra.4+0x1c9/0x790
? del_timer_sync+0x44/0x50
? update_curr+0x57/0x110
? __i915_gem_free_objects+0x31/0x300
? __i915_gem_free_objects+0x31/0x300
? __i915_gem_free_work+0x2d/0x40
? process_one_work+0x13a/0x3b0
? worker_thread+0x4a/0x460
? kthread+0xf4/0x130
? process_one_work+0x3b0/0x3b0
? kthread_create_on_node+0x40/0x40
? ret_from_fork+0x23/0x30

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 drivers/gpu/drm/i915/i915_gem.c          |  9 +++++++++
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 14 ++++++++++----
 2 files changed, 19 insertions(+), 4 deletions(-)

Comments

Joonas Lahtinen April 7, 2017, 9:05 a.m. UTC | #1
On Fri, 2017-04-07 at 01:23 +0200, Andrea Arcangeli wrote:
> synchronize_rcu(), synchronize_sched() and synchronize_rcu_expedited()
> block until their own workqueues have run. The i915 GEM workqueues, in
> turn, wait for struct_mutex to be released. So we cannot wait for a
> quiescent state with those RCU primitives while holding struct_mutex:
> doing so creates a circular lock dependency that hangs the kernel. The
> hang is reproducible but goes undetected by lockdep.
> 
> This started in commit 3d3d18f086cdda72ee18a454db70ca72c6e3246c.

The correct format is:

Fixes: 3d3d18f086cd ("drm/i915: Avoid rcu_barrier() from reclaim paths (shrinker)")

> @@ -324,6 +320,16 @@ i915_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
>  	if (unlock)
>  		mutex_unlock(&dev->struct_mutex);
>  
> +	if (likely(__mutex_owner(&dev->struct_mutex) != current))

This check can be dropped: synchronize_rcu_expedited() should instead be
called directly inside the if (unlock) branch, which is functionally
equivalent. The same applies to all the unlock cases, not just this
one, and should be the correct way to avoid the deadlock. I've sent
a patch that does this (Cc'd you); can you verify that it gets rid of the
problem for you?
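
One possible reading of this suggestion, sketched as a small userspace mock. It is not the patch referred to above, which is not shown here. The *_stub functions and shrinker_scan_tail() are hypothetical stand-ins that only record call order; the sketch assumes that `unlock` is true only when this call path took struct_mutex itself, which is what would make the ownership check redundant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Call log: 'U' = mutex unlocked, 'S' = RCU grace period waited for. */
char call_log[8];
size_t call_len;

static void record(char c)
{
	assert(call_len + 1 < sizeof(call_log));
	call_log[call_len++] = c;
	call_log[call_len] = '\0';
}

void reset_log(void)
{
	call_len = 0;
	call_log[0] = '\0';
}

/* Hypothetical stand-ins for mutex_unlock() and synchronize_rcu_expedited(). */
void mutex_unlock_stub(void)              { record('U'); }
void synchronize_rcu_expedited_stub(void) { record('S'); }

/*
 * Tail of a shrinker callback with the wait folded into the unlock branch:
 * we only wait for a grace period when we took the lock ourselves and have
 * just dropped it, so the wait never runs with the mutex held.
 */
void shrinker_scan_tail(bool unlock)
{
	if (unlock) {
		mutex_unlock_stub();
		synchronize_rcu_expedited_stub();
	}
}
```

With unlock set, the log reads "US" (unlock, then wait); with unlock clear, nothing is called, so a caller that entered reclaim while already holding the mutex never waits for a grace period.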

> +		/*
> +		 * If reclaim was invoked by an allocation done while
> +		 * holding the struct mutex, we cannot call
> +		 * synchronize_rcu_expedited() as it depends on
> +		 * workqueues to run but the running workqueue may be
> +		 * blocked waiting on us to release struct_mutex.
> +		 */
> +		synchronize_rcu_expedited();
> +
>  	return freed;
>  }
>  

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 67b1fc5..3982489 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4742,6 +4742,13 @@  int i915_gem_freeze(struct drm_i915_private *dev_priv)
 	i915_gem_shrink_all(dev_priv);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
+	/*
+	 * Cannot call synchronize_rcu() inside the struct_mutex
+	 * because it may block until workqueues complete, and the
+	 * running workqueue may wait on the struct_mutex.
+	 */
+	synchronize_rcu(); /* wait for our earlier RCU delayed slab frees */
+
 	intel_runtime_pm_put(dev_priv);
 
 	return 0;
@@ -4781,6 +4788,8 @@  int i915_gem_freeze_late(struct drm_i915_private *dev_priv)
 	}
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
+	synchronize_rcu_expedited();
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index d5d2b4c..fea1454 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -235,9 +235,6 @@  i915_gem_shrink(struct drm_i915_private *dev_priv,
 	if (unlock)
 		mutex_unlock(&dev_priv->drm.struct_mutex);
 
-	/* expedite the RCU grace period to free some request slabs */
-	synchronize_rcu_expedited();
-
 	return count;
 }
 
@@ -263,7 +260,6 @@  unsigned long i915_gem_shrink_all(struct drm_i915_private *dev_priv)
 				I915_SHRINK_BOUND |
 				I915_SHRINK_UNBOUND |
 				I915_SHRINK_ACTIVE);
-	synchronize_rcu(); /* wait for our earlier RCU delayed slab frees */
 
 	return freed;
 }
@@ -324,6 +320,16 @@  i915_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 	if (unlock)
 		mutex_unlock(&dev->struct_mutex);
 
+	if (likely(__mutex_owner(&dev->struct_mutex) != current))
+		/*
+		 * If reclaim was invoked by an allocation done while
+		 * holding the struct mutex, we cannot call
+		 * synchronize_rcu_expedited() as it depends on
+		 * workqueues to run but the running workqueue may be
+		 * blocked waiting on us to release struct_mutex.
+		 */
+		synchronize_rcu_expedited();
+
 	return freed;
 }