Message ID | 20171009164401.16035-2-daniel.vetter@ffwll.ch (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Quoting Daniel Vetter (2017-10-09 17:44:01) > stop_machine is not really a locking primitive we should use, except > when the hw folks tell us the hw is broken and that's the only way to > work around it. > > This patch tries to address the locking abuse of stop_machine() from > > commit 20e4933c478a1ca694b38fa4ac44d99e659941f5 > Author: Chris Wilson <chris@chris-wilson.co.uk> > Date: Tue Nov 22 14:41:21 2016 +0000 > > drm/i915: Stop the machine as we install the wedged submit_request handler > > Chris said parts of the reasons for going with stop_machine() was that > it's no overhead for the fast-path. But these callbacks use irqsave > spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast. > > To stay as close as possible to the stop_machine semantics we first > update all the submit function pointers to the nop handler, then call > synchronize_rcu() to make sure no new requests can be submitted. This > should give us exactly the huge barrier we want. > > I pondered whether we should annotate engine->submit_request as __rcu > and use rcu_assign_pointer and rcu_dereference on it. But the reason > behind those is to make sure the compiler/cpu barriers are there for > when you have an actual data structure you point at, to make sure all > the writes are seen correctly on the read side. But we just have a > function pointer, and .text isn't changed, so no need for these > barriers and hence no need for annotations. > > Unfortunately there's a complication with the call to > intel_engine_init_global_seqno: This is still broken in the same way as nop_submit_request may execute while you sleep, breaking cancel_requests. -Chris
Daniel Vetter <daniel.vetter@ffwll.ch> writes: > stop_machine is not really a locking primitive we should use, except > when the hw folks tell us the hw is broken and that's the only way to > work around it. > > This patch tries to address the locking abuse of stop_machine() from > > commit 20e4933c478a1ca694b38fa4ac44d99e659941f5 > Author: Chris Wilson <chris@chris-wilson.co.uk> > Date: Tue Nov 22 14:41:21 2016 +0000 > > drm/i915: Stop the machine as we install the wedged submit_request handler > > Chris said parts of the reasons for going with stop_machine() was that > it's no overhead for the fast-path. But these callbacks use irqsave > spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast. > > To stay as close as possible to the stop_machine semantics we first > update all the submit function pointers to the nop handler, then call > synchronize_rcu() to make sure no new requests can be submitted. This > should give us exactly the huge barrier we want. > > I pondered whether we should annotate engine->submit_request as __rcu > and use rcu_assign_pointer and rcu_dereference on it. But the reason > behind those is to make sure the compiler/cpu barriers are there for > when you have an actual data structure you point at, to make sure all > the writes are seen correctly on the read side. But we just have a > function pointer, and .text isn't changed, so no need for these > barriers and hence no need for annotations. > > Unfortunately there's a complication with the call to > intel_engine_init_global_seqno: > > - Without stop_machine we must hold the corresponding spinlock. > > - Without stop_machine we must ensure that all requests are marked as > having failed with dma_fence_set_error() before we call it. That > means we need to split the nop request submission into two phases, > both synchronized with rcu: > > 1. Only stop submitting the requests to hw and mark them as failed. > > 2. After all pending requests in the scheduler/ring are suitably > marked up as failed and we can force complete them all, also force > complete by calling intel_engine_init_global_seqno(). > > This should fix the followwing lockdep splat: > > ====================================================== > WARNING: possible circular locking dependency detected > 4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G U > ------------------------------------------------------ > kworker/3:4/562 is trying to acquire lock: > (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8113d4bc>] stop_machine+0x1c/0x40 > > but task is already holding lock: > (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915] > > which lock already depends on the new lock. > > the existing dependency chain (in reverse order) is: > > -> #6 (&dev->struct_mutex){+.+.}: > __lock_acquire+0x1420/0x15e0 > lock_acquire+0xb0/0x200 > __mutex_lock+0x86/0x9b0 > mutex_lock_interruptible_nested+0x1b/0x20 > i915_mutex_lock_interruptible+0x51/0x130 [i915] > i915_gem_fault+0x209/0x650 [i915] > __do_fault+0x1e/0x80 > __handle_mm_fault+0xa08/0xed0 > handle_mm_fault+0x156/0x300 > __do_page_fault+0x2c5/0x570 > do_page_fault+0x28/0x250 > page_fault+0x22/0x30 > > -> #5 (&mm->mmap_sem){++++}: > __lock_acquire+0x1420/0x15e0 > lock_acquire+0xb0/0x200 > __might_fault+0x68/0x90 > _copy_to_user+0x23/0x70 > filldir+0xa5/0x120 > dcache_readdir+0xf9/0x170 > iterate_dir+0x69/0x1a0 > SyS_getdents+0xa5/0x140 > entry_SYSCALL_64_fastpath+0x1c/0xb1 > > -> #4 (&sb->s_type->i_mutex_key#5){++++}: > down_write+0x3b/0x70 > handle_create+0xcb/0x1e0 > devtmpfsd+0x139/0x180 > kthread+0x152/0x190 > ret_from_fork+0x27/0x40 > > -> #3 ((complete)&req.done){+.+.}: > __lock_acquire+0x1420/0x15e0 > lock_acquire+0xb0/0x200 > wait_for_common+0x58/0x210 > wait_for_completion+0x1d/0x20 > devtmpfs_create_node+0x13d/0x160 > device_add+0x5eb/0x620 > device_create_groups_vargs+0xe0/0xf0 > device_create+0x3a/0x40 > msr_device_create+0x2b/0x40 > cpuhp_invoke_callback+0xc9/0xbf0 > cpuhp_thread_fun+0x17b/0x240 > smpboot_thread_fn+0x18a/0x280 > kthread+0x152/0x190 > ret_from_fork+0x27/0x40 > > -> #2 (cpuhp_state-up){+.+.}: > __lock_acquire+0x1420/0x15e0 > lock_acquire+0xb0/0x200 > cpuhp_issue_call+0x133/0x1c0 > __cpuhp_setup_state_cpuslocked+0x139/0x2a0 > __cpuhp_setup_state+0x46/0x60 > page_writeback_init+0x43/0x67 > pagecache_init+0x3d/0x42 > start_kernel+0x3a8/0x3fc > x86_64_start_reservations+0x2a/0x2c > x86_64_start_kernel+0x6d/0x70 > verify_cpu+0x0/0xfb > > -> #1 (cpuhp_state_mutex){+.+.}: > __lock_acquire+0x1420/0x15e0 > lock_acquire+0xb0/0x200 > __mutex_lock+0x86/0x9b0 > mutex_lock_nested+0x1b/0x20 > __cpuhp_setup_state_cpuslocked+0x53/0x2a0 > __cpuhp_setup_state+0x46/0x60 > page_alloc_init+0x28/0x30 > start_kernel+0x145/0x3fc > x86_64_start_reservations+0x2a/0x2c > x86_64_start_kernel+0x6d/0x70 > verify_cpu+0x0/0xfb > > -> #0 (cpu_hotplug_lock.rw_sem){++++}: > check_prev_add+0x430/0x840 > __lock_acquire+0x1420/0x15e0 > lock_acquire+0xb0/0x200 > cpus_read_lock+0x3d/0xb0 > stop_machine+0x1c/0x40 > i915_gem_set_wedged+0x1a/0x20 [i915] > i915_reset+0xb9/0x230 [i915] > i915_reset_device+0x1f6/0x260 [i915] > i915_handle_error+0x2d8/0x430 [i915] > hangcheck_declare_hang+0xd3/0xf0 [i915] > i915_hangcheck_elapsed+0x262/0x2d0 [i915] > process_one_work+0x233/0x660 > worker_thread+0x4e/0x3b0 > kthread+0x152/0x190 > ret_from_fork+0x27/0x40 > > other info that might help us debug this: > > Chain exists of: > cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev->struct_mutex > > Possible unsafe locking scenario: > > CPU0 CPU1 > ---- ---- > lock(&dev->struct_mutex); > lock(&mm->mmap_sem); > lock(&dev->struct_mutex); > lock(cpu_hotplug_lock.rw_sem); > > *** DEADLOCK *** > > 3 locks held by kworker/3:4/562: > #0: ("events_long"){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660 > #1: ((&(&i915->gpu_error.hangcheck_work)->work)){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660 > #2: (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915] > > stack backtrace: > CPU: 3 PID: 562 Comm: kworker/3:4 Tainted: G U 4.14.0-rc3-CI-CI_DRM_3179+ #1 > Hardware name: /NUC7i5BNB, BIOS BNKBL357.86A.0048.2017.0704.1415 07/04/2017 > Workqueue: events_long i915_hangcheck_elapsed [i915] > Call Trace: > dump_stack+0x68/0x9f > print_circular_bug+0x235/0x3c0 > ? lockdep_init_map_crosslock+0x20/0x20 > check_prev_add+0x430/0x840 > ? irq_work_queue+0x86/0xe0 > ? wake_up_klogd+0x53/0x70 > __lock_acquire+0x1420/0x15e0 > ? __lock_acquire+0x1420/0x15e0 > ? lockdep_init_map_crosslock+0x20/0x20 > lock_acquire+0xb0/0x200 > ? stop_machine+0x1c/0x40 > ? i915_gem_object_truncate+0x50/0x50 [i915] > cpus_read_lock+0x3d/0xb0 > ? stop_machine+0x1c/0x40 > stop_machine+0x1c/0x40 > i915_gem_set_wedged+0x1a/0x20 [i915] > i915_reset+0xb9/0x230 [i915] > i915_reset_device+0x1f6/0x260 [i915] > ? gen8_gt_irq_ack+0x170/0x170 [i915] > ? work_on_cpu_safe+0x60/0x60 > i915_handle_error+0x2d8/0x430 [i915] > ? vsnprintf+0xd1/0x4b0 > ? scnprintf+0x3a/0x70 > hangcheck_declare_hang+0xd3/0xf0 [i915] > ? intel_runtime_pm_put+0x56/0xa0 [i915] > i915_hangcheck_elapsed+0x262/0x2d0 [i915] > process_one_work+0x233/0x660 > worker_thread+0x4e/0x3b0 > kthread+0x152/0x190 > ? process_one_work+0x660/0x660 > ? kthread_create_on_node+0x40/0x40 > ret_from_fork+0x27/0x40 > Setting dangerous option reset - tainting kernel > i915 0000:00:02.0: Resetting chip after gpu hang > Setting dangerous option reset - tainting kernel > i915 0000:00:02.0: Resetting chip after gpu hang > > v2: Have 1 global synchronize_rcu() barrier across all engines, and > improve commit message. > > v3: We need to protect the seqno update with the timeline spinlock (in > set_wedged) to avoid racing with other updates of the seqno, like we > already do in nop_submit_request (Chris). > > v4: Use two-phase sequence to plug the race Chris spotted where we can > complete requests before they're marked up with -EIO. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102886 > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103096 > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Marta Lofstedt <marta.lofstedt@intel.com> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > --- > drivers/gpu/drm/i915/i915_gem.c | 76 +++++++++++++++-------- > drivers/gpu/drm/i915/i915_gem_request.c | 2 + > drivers/gpu/drm/i915/selftests/i915_gem_request.c | 2 + > 3 files changed, 53 insertions(+), 27 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 82a10036fb38..8d7d8d1f78db 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -3057,49 +3057,71 @@ static void nop_submit_request(struct drm_i915_gem_request *request) > > spin_lock_irqsave(&request->engine->timeline->lock, flags); > __i915_gem_request_submit(request); > - intel_engine_init_global_seqno(request->engine, request->global_seqno); > spin_unlock_irqrestore(&request->engine->timeline->lock, flags); > } > > -static void engine_set_wedged(struct intel_engine_cs *engine) > +static void nop_complete_submit_request(struct drm_i915_gem_request *request) > { > - /* We need to be sure that no thread is running the old callback as > - * we install the nop handler (otherwise we would submit a request > - * to hardware that will never complete). In order to prevent this > - * race, we wait until the machine is idle before making the swap > - * (using stop_machine()). > - */ > - engine->submit_request = nop_submit_request; > + unsigned long flags; > > - /* Mark all executing requests as skipped */ > - engine->cancel_requests(engine); > + GEM_BUG_ON(!i915_terminally_wedged(&request->i915->gpu_error)); > + dma_fence_set_error(&request->fence, -EIO); > > - /* Mark all pending requests as complete so that any concurrent > - * (lockless) lookup doesn't try and wait upon the request as we > - * reset it. > - */ > - intel_engine_init_global_seqno(engine, > - intel_engine_last_submit(engine)); > + spin_lock_irqsave(&request->engine->timeline->lock, flags); > + __i915_gem_request_submit(request); > + intel_engine_init_global_seqno(request->engine, request->global_seqno); > + spin_unlock_irqrestore(&request->engine->timeline->lock, flags); > } > > -static int __i915_gem_set_wedged_BKL(void *data) > +void i915_gem_set_wedged(struct drm_i915_private *i915) > { > - struct drm_i915_private *i915 = data; > struct intel_engine_cs *engine; > enum intel_engine_id id; > > + /* First, stop submission to hw, but do not yet complete requests by > + * rolling the global seqno forward (since this would complete requests > + * for which we haven't set the fence error to EIO yet). > + */ > for_each_engine(engine, i915, id) > - engine_set_wedged(engine); > + engine->submit_request = nop_submit_request; > > - set_bit(I915_WEDGED, &i915->gpu_error.flags); > - wake_up_all(&i915->gpu_error.reset_queue); > + /* Make sure no one is running the old callback before we proceed with > + * cancelling requests and resetting the completion tracking. Otherwise > + * we might submit a request to the hardware which never completes. > + */ > + synchronize_rcu(); > > - return 0; > -} > + for_each_engine(engine, i915, id) { > + /* Mark all executing requests as skipped */ > + engine->cancel_requests(engine); > > -void i915_gem_set_wedged(struct drm_i915_private *dev_priv) > -{ > - stop_machine(__i915_gem_set_wedged_BKL, dev_priv, NULL); > + /* Only once we've force-cancelled all in-flight requests can we > + * start to complete all requests. > + */ > + engine->submit_request = nop_complete_submit_request; > + } > + > + /* Make sure no request can slip through without getting completed by > + * either this call here to intel_engine_init_global_seqno, or the one > + * in nop_complete_submit_request. > + */ > + synchronize_rcu(); > + > + for_each_engine(engine, i915, id) { > + unsigned long flags; > + > + /* Mark all pending requests as complete so that any concurrent > + * (lockless) lookup doesn't try and wait upon the request as we > + * reset it. > + */ > + spin_lock_irqsave(&engine->timeline->lock, flags); > + intel_engine_init_global_seqno(engine, > + intel_engine_last_submit(engine)); > + spin_unlock_irqrestore(&engine->timeline->lock, flags); > + } > + > + set_bit(I915_WEDGED, &i915->gpu_error.flags); Can we this set_bit above the loop, right after synchronize_rcu()? Thinking is that we stop accepting requests we can't complete the earliest. -Mika > + wake_up_all(&i915->gpu_error.reset_queue); > } > > bool i915_gem_unset_wedged(struct drm_i915_private *i915) > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c > index b100b38f1dd2..ef78a85cb845 100644 > --- a/drivers/gpu/drm/i915/i915_gem_request.c > +++ b/drivers/gpu/drm/i915/i915_gem_request.c > @@ -556,7 +556,9 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) > switch (state) { > case FENCE_COMPLETE: > trace_i915_gem_request_submit(request); > + rcu_read_lock(); > request->engine->submit_request(request); > + rcu_read_unlock(); > break; > > case FENCE_FREE: > diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_request.c b/drivers/gpu/drm/i915/selftests/i915_gem_request.c > index 78b9f811707f..a999161e8db1 100644 > --- a/drivers/gpu/drm/i915/selftests/i915_gem_request.c > +++ b/drivers/gpu/drm/i915/selftests/i915_gem_request.c > @@ -215,7 +215,9 @@ static int igt_request_rewind(void *arg) > } > i915_gem_request_get(vip); > i915_add_request(vip); > + rcu_read_lock(); > request->engine->submit_request(request); > + rcu_read_unlock(); > > mutex_unlock(&i915->drm.struct_mutex); > > -- > 2.14.1
On Tue, Oct 10, 2017 at 10:21:45AM +0100, Chris Wilson wrote: > Quoting Daniel Vetter (2017-10-09 17:44:01) > > stop_machine is not really a locking primitive we should use, except > > when the hw folks tell us the hw is broken and that's the only way to > > work around it. > > > > This patch tries to address the locking abuse of stop_machine() from > > > > commit 20e4933c478a1ca694b38fa4ac44d99e659941f5 > > Author: Chris Wilson <chris@chris-wilson.co.uk> > > Date: Tue Nov 22 14:41:21 2016 +0000 > > > > drm/i915: Stop the machine as we install the wedged submit_request handler > > > > Chris said parts of the reasons for going with stop_machine() was that > > it's no overhead for the fast-path. But these callbacks use irqsave > > spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast. > > > > To stay as close as possible to the stop_machine semantics we first > > update all the submit function pointers to the nop handler, then call > > synchronize_rcu() to make sure no new requests can be submitted. This > > should give us exactly the huge barrier we want. > > > > I pondered whether we should annotate engine->submit_request as __rcu > > and use rcu_assign_pointer and rcu_dereference on it. But the reason > > behind those is to make sure the compiler/cpu barriers are there for > > when you have an actual data structure you point at, to make sure all > > the writes are seen correctly on the read side. But we just have a > > function pointer, and .text isn't changed, so no need for these > > barriers and hence no need for annotations. > > > > Unfortunately there's a complication with the call to > > intel_engine_init_global_seqno: > > This is still broken in the same way as nop_submit_request may execute > while you sleep, breaking cancel_requests. I guess then I didn't understand which race you mean, since I think the one I've found should be fixed now. Can you pls explain in more detail what exactly is racing against what else? From what I can tell legacy cancel_request is entirely under the spinlock, so hard to race, same for lrc. And with the global seqno update untangled, they shouldn't complete too early anymore, which I thought was the race you pointed out on the previous thread. I did reply to that to check my understanding, but didn't get a reply. Thanks, Daniel
On Tue, Oct 10, 2017 at 12:50:45PM +0300, Mika Kuoppala wrote: > Daniel Vetter <daniel.vetter@ffwll.ch> writes: > > > stop_machine is not really a locking primitive we should use, except > > when the hw folks tell us the hw is broken and that's the only way to > > work around it. > > > > This patch tries to address the locking abuse of stop_machine() from > > > > commit 20e4933c478a1ca694b38fa4ac44d99e659941f5 > > Author: Chris Wilson <chris@chris-wilson.co.uk> > > Date: Tue Nov 22 14:41:21 2016 +0000 > > > > drm/i915: Stop the machine as we install the wedged submit_request handler > > > > Chris said parts of the reasons for going with stop_machine() was that > > it's no overhead for the fast-path. But these callbacks use irqsave > > spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast. > > > > To stay as close as possible to the stop_machine semantics we first > > update all the submit function pointers to the nop handler, then call > > synchronize_rcu() to make sure no new requests can be submitted. This > > should give us exactly the huge barrier we want. > > > > I pondered whether we should annotate engine->submit_request as __rcu > > and use rcu_assign_pointer and rcu_dereference on it. But the reason > > behind those is to make sure the compiler/cpu barriers are there for > > when you have an actual data structure you point at, to make sure all > > the writes are seen correctly on the read side. But we just have a > > function pointer, and .text isn't changed, so no need for these > > barriers and hence no need for annotations. > > > > Unfortunately there's a complication with the call to > > intel_engine_init_global_seqno: > > > > - Without stop_machine we must hold the corresponding spinlock. > > > > - Without stop_machine we must ensure that all requests are marked as > > having failed with dma_fence_set_error() before we call it. That > > means we need to split the nop request submission into two phases, > > both synchronized with rcu: > > > > 1. Only stop submitting the requests to hw and mark them as failed. > > > > 2. After all pending requests in the scheduler/ring are suitably > > marked up as failed and we can force complete them all, also force > > complete by calling intel_engine_init_global_seqno(). > > > > This should fix the followwing lockdep splat: > > > > ====================================================== > > WARNING: possible circular locking dependency detected > > 4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G U > > ------------------------------------------------------ > > kworker/3:4/562 is trying to acquire lock: > > (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8113d4bc>] stop_machine+0x1c/0x40 > > > > but task is already holding lock: > > (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915] > > > > which lock already depends on the new lock. > > > > the existing dependency chain (in reverse order) is: > > > > -> #6 (&dev->struct_mutex){+.+.}: > > __lock_acquire+0x1420/0x15e0 > > lock_acquire+0xb0/0x200 > > __mutex_lock+0x86/0x9b0 > > mutex_lock_interruptible_nested+0x1b/0x20 > > i915_mutex_lock_interruptible+0x51/0x130 [i915] > > i915_gem_fault+0x209/0x650 [i915] > > __do_fault+0x1e/0x80 > > __handle_mm_fault+0xa08/0xed0 > > handle_mm_fault+0x156/0x300 > > __do_page_fault+0x2c5/0x570 > > do_page_fault+0x28/0x250 > > page_fault+0x22/0x30 > > > > -> #5 (&mm->mmap_sem){++++}: > > __lock_acquire+0x1420/0x15e0 > > lock_acquire+0xb0/0x200 > > __might_fault+0x68/0x90 > > _copy_to_user+0x23/0x70 > > filldir+0xa5/0x120 > > dcache_readdir+0xf9/0x170 > > iterate_dir+0x69/0x1a0 > > SyS_getdents+0xa5/0x140 > > entry_SYSCALL_64_fastpath+0x1c/0xb1 > > > > -> #4 (&sb->s_type->i_mutex_key#5){++++}: > > down_write+0x3b/0x70 > > handle_create+0xcb/0x1e0 > > devtmpfsd+0x139/0x180 > > kthread+0x152/0x190 > > ret_from_fork+0x27/0x40 > > > > -> #3 ((complete)&req.done){+.+.}: > > __lock_acquire+0x1420/0x15e0 > > lock_acquire+0xb0/0x200 > > wait_for_common+0x58/0x210 > > wait_for_completion+0x1d/0x20 > > devtmpfs_create_node+0x13d/0x160 > > device_add+0x5eb/0x620 > > device_create_groups_vargs+0xe0/0xf0 > > device_create+0x3a/0x40 > > msr_device_create+0x2b/0x40 > > cpuhp_invoke_callback+0xc9/0xbf0 > > cpuhp_thread_fun+0x17b/0x240 > > smpboot_thread_fn+0x18a/0x280 > > kthread+0x152/0x190 > > ret_from_fork+0x27/0x40 > > > > -> #2 (cpuhp_state-up){+.+.}: > > __lock_acquire+0x1420/0x15e0 > > lock_acquire+0xb0/0x200 > > cpuhp_issue_call+0x133/0x1c0 > > __cpuhp_setup_state_cpuslocked+0x139/0x2a0 > > __cpuhp_setup_state+0x46/0x60 > > page_writeback_init+0x43/0x67 > > pagecache_init+0x3d/0x42 > > start_kernel+0x3a8/0x3fc > > x86_64_start_reservations+0x2a/0x2c > > x86_64_start_kernel+0x6d/0x70 > > verify_cpu+0x0/0xfb > > > > -> #1 (cpuhp_state_mutex){+.+.}: > > __lock_acquire+0x1420/0x15e0 > > lock_acquire+0xb0/0x200 > > __mutex_lock+0x86/0x9b0 > > mutex_lock_nested+0x1b/0x20 > > __cpuhp_setup_state_cpuslocked+0x53/0x2a0 > > __cpuhp_setup_state+0x46/0x60 > > page_alloc_init+0x28/0x30 > > start_kernel+0x145/0x3fc > > x86_64_start_reservations+0x2a/0x2c > > x86_64_start_kernel+0x6d/0x70 > > verify_cpu+0x0/0xfb > > > > -> #0 (cpu_hotplug_lock.rw_sem){++++}: > > check_prev_add+0x430/0x840 > > __lock_acquire+0x1420/0x15e0 > > lock_acquire+0xb0/0x200 > > cpus_read_lock+0x3d/0xb0 > > stop_machine+0x1c/0x40 > > i915_gem_set_wedged+0x1a/0x20 [i915] > > i915_reset+0xb9/0x230 [i915] > > i915_reset_device+0x1f6/0x260 [i915] > > i915_handle_error+0x2d8/0x430 [i915] > > hangcheck_declare_hang+0xd3/0xf0 [i915] > > i915_hangcheck_elapsed+0x262/0x2d0 [i915] > > process_one_work+0x233/0x660 > > worker_thread+0x4e/0x3b0 > > kthread+0x152/0x190 > > ret_from_fork+0x27/0x40 > > > > other info that might help us debug this: > > > > Chain exists of: > > cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev->struct_mutex > > > > Possible unsafe locking scenario: > > > > CPU0 CPU1 > > ---- ---- > > lock(&dev->struct_mutex); > > lock(&mm->mmap_sem); > > lock(&dev->struct_mutex); > > lock(cpu_hotplug_lock.rw_sem); > > > > *** DEADLOCK *** > > > > 3 locks held by kworker/3:4/562: > > #0: ("events_long"){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660 > > #1: ((&(&i915->gpu_error.hangcheck_work)->work)){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660 > > #2: (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915] > > > > stack backtrace: > > CPU: 3 PID: 562 Comm: kworker/3:4 Tainted: G U 4.14.0-rc3-CI-CI_DRM_3179+ #1 > > Hardware name: /NUC7i5BNB, BIOS BNKBL357.86A.0048.2017.0704.1415 07/04/2017 > > Workqueue: events_long i915_hangcheck_elapsed [i915] > > Call Trace: > > dump_stack+0x68/0x9f > > print_circular_bug+0x235/0x3c0 > > ? lockdep_init_map_crosslock+0x20/0x20 > > check_prev_add+0x430/0x840 > > ? irq_work_queue+0x86/0xe0 > > ? wake_up_klogd+0x53/0x70 > > __lock_acquire+0x1420/0x15e0 > > ? __lock_acquire+0x1420/0x15e0 > > ? lockdep_init_map_crosslock+0x20/0x20 > > lock_acquire+0xb0/0x200 > > ? stop_machine+0x1c/0x40 > > ? i915_gem_object_truncate+0x50/0x50 [i915] > > cpus_read_lock+0x3d/0xb0 > > ? stop_machine+0x1c/0x40 > > stop_machine+0x1c/0x40 > > i915_gem_set_wedged+0x1a/0x20 [i915] > > i915_reset+0xb9/0x230 [i915] > > i915_reset_device+0x1f6/0x260 [i915] > > ? gen8_gt_irq_ack+0x170/0x170 [i915] > > ? work_on_cpu_safe+0x60/0x60 > > i915_handle_error+0x2d8/0x430 [i915] > > ? vsnprintf+0xd1/0x4b0 > > ? scnprintf+0x3a/0x70 > > hangcheck_declare_hang+0xd3/0xf0 [i915] > > ? intel_runtime_pm_put+0x56/0xa0 [i915] > > i915_hangcheck_elapsed+0x262/0x2d0 [i915] > > process_one_work+0x233/0x660 > > worker_thread+0x4e/0x3b0 > > kthread+0x152/0x190 > > ? process_one_work+0x660/0x660 > > ? kthread_create_on_node+0x40/0x40 > > ret_from_fork+0x27/0x40 > > Setting dangerous option reset - tainting kernel > > i915 0000:00:02.0: Resetting chip after gpu hang > > Setting dangerous option reset - tainting kernel > > i915 0000:00:02.0: Resetting chip after gpu hang > > > > v2: Have 1 global synchronize_rcu() barrier across all engines, and > > improve commit message. > > > > v3: We need to protect the seqno update with the timeline spinlock (in > > set_wedged) to avoid racing with other updates of the seqno, like we > > already do in nop_submit_request (Chris). > > > > v4: Use two-phase sequence to plug the race Chris spotted where we can > > complete requests before they're marked up with -EIO. > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102886 > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103096 > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > > Cc: Thomas Gleixner <tglx@linutronix.de> > > Cc: Marta Lofstedt <marta.lofstedt@intel.com> > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > > --- > > drivers/gpu/drm/i915/i915_gem.c | 76 +++++++++++++++-------- > > drivers/gpu/drm/i915/i915_gem_request.c | 2 + > > drivers/gpu/drm/i915/selftests/i915_gem_request.c | 2 + > > 3 files changed, 53 insertions(+), 27 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > > index 82a10036fb38..8d7d8d1f78db 100644 > > --- a/drivers/gpu/drm/i915/i915_gem.c > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > @@ -3057,49 +3057,71 @@ static void nop_submit_request(struct drm_i915_gem_request *request) > > > > spin_lock_irqsave(&request->engine->timeline->lock, flags); > > __i915_gem_request_submit(request); > > - intel_engine_init_global_seqno(request->engine, request->global_seqno); > > spin_unlock_irqrestore(&request->engine->timeline->lock, flags); > > } > > > > -static void engine_set_wedged(struct intel_engine_cs *engine) > > +static void nop_complete_submit_request(struct drm_i915_gem_request *request) > > { > > - /* We need to be sure that no thread is running the old callback as > > - * we install the nop handler (otherwise we would submit a request > > - * to hardware that will never complete). In order to prevent this > > - * race, we wait until the machine is idle before making the swap > > - * (using stop_machine()). > > - */ > > - engine->submit_request = nop_submit_request; > > + unsigned long flags; > > > > - /* Mark all executing requests as skipped */ > > - engine->cancel_requests(engine); > > + GEM_BUG_ON(!i915_terminally_wedged(&request->i915->gpu_error)); > > + dma_fence_set_error(&request->fence, -EIO); > > > > - /* Mark all pending requests as complete so that any concurrent > > - * (lockless) lookup doesn't try and wait upon the request as we > > - * reset it. > > - */ > > - intel_engine_init_global_seqno(engine, > > - intel_engine_last_submit(engine)); > > + spin_lock_irqsave(&request->engine->timeline->lock, flags); > > + __i915_gem_request_submit(request); > > + intel_engine_init_global_seqno(request->engine, request->global_seqno); > > + spin_unlock_irqrestore(&request->engine->timeline->lock, flags); > > } > > > > -static int __i915_gem_set_wedged_BKL(void *data) > > +void i915_gem_set_wedged(struct drm_i915_private *i915) > > { > > - struct drm_i915_private *i915 = data; > > struct intel_engine_cs *engine; > > enum intel_engine_id id; > > > > + /* First, stop submission to hw, but do not yet complete requests by > > + * rolling the global seqno forward (since this would complete requests > > + * for which we haven't set the fence error to EIO yet). > > + */ > > for_each_engine(engine, i915, id) > > - engine_set_wedged(engine); > > + engine->submit_request = nop_submit_request; > > > > - set_bit(I915_WEDGED, &i915->gpu_error.flags); > > - wake_up_all(&i915->gpu_error.reset_queue); > > + /* Make sure no one is running the old callback before we proceed with > > + * cancelling requests and resetting the completion tracking. Otherwise > > + * we might submit a request to the hardware which never completes. > > + */ > > + synchronize_rcu(); > > > > - return 0; > > -} > > + for_each_engine(engine, i915, id) { > > + /* Mark all executing requests as skipped */ > > + engine->cancel_requests(engine); > > > > -void i915_gem_set_wedged(struct drm_i915_private *dev_priv) > > -{ > > - stop_machine(__i915_gem_set_wedged_BKL, dev_priv, NULL); > > + /* Only once we've force-cancelled all in-flight requests can we > > + * start to complete all requests. > > + */ > > + engine->submit_request = nop_complete_submit_request; > > + } > > + > > + /* Make sure no request can slip through without getting completed by > > + * either this call here to intel_engine_init_global_seqno, or the one > > + * in nop_complete_submit_request. > > + */ > > + synchronize_rcu(); > > + > > + for_each_engine(engine, i915, id) { > > + unsigned long flags; > > + > > + /* Mark all pending requests as complete so that any concurrent > > + * (lockless) lookup doesn't try and wait upon the request as we > > + * reset it. > > + */ > > + spin_lock_irqsave(&engine->timeline->lock, flags); > > + intel_engine_init_global_seqno(engine, > > + intel_engine_last_submit(engine)); > > + spin_unlock_irqrestore(&engine->timeline->lock, flags); > > + } > > + > > + set_bit(I915_WEDGED, &i915->gpu_error.flags); > > Can we this set_bit above the loop, right after synchronize_rcu()? > Thinking is that we stop accepting requests we can't complete > the earliest. tbh I have no idea what this thing does there anymore, it looks like this will terminally wedge the gpu and refuse all subsequent requests (see i915_gem_request_alloc), except that i915_gem_set_wedged seems to be the new way to recover a fully hung gpu (and then later on requests are ok again). In short, I have no idea what this does nor where it should be put. Can you pls explain more? Thanks, Daniel > > -Mika > > > + wake_up_all(&i915->gpu_error.reset_queue); > > } > > > > bool i915_gem_unset_wedged(struct drm_i915_private *i915) > > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c > > index b100b38f1dd2..ef78a85cb845 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_request.c > > +++ b/drivers/gpu/drm/i915/i915_gem_request.c > > @@ -556,7 +556,9 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) > > switch (state) { > > case FENCE_COMPLETE: > > trace_i915_gem_request_submit(request); > > + rcu_read_lock(); > > request->engine->submit_request(request); > > + rcu_read_unlock(); > > break; > > > > case FENCE_FREE: > > diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_request.c b/drivers/gpu/drm/i915/selftests/i915_gem_request.c > > index 78b9f811707f..a999161e8db1 100644 > > --- a/drivers/gpu/drm/i915/selftests/i915_gem_request.c > > +++ b/drivers/gpu/drm/i915/selftests/i915_gem_request.c > > @@ -215,7 +215,9 @@ static int igt_request_rewind(void *arg) > > } > > i915_gem_request_get(vip); > > i915_add_request(vip); > > + rcu_read_lock(); > > request->engine->submit_request(request); > > + rcu_read_unlock(); > > > > mutex_unlock(&i915->drm.struct_mutex); > > > > -- > > 2.14.1
Quoting Chris Wilson (2017-10-10 10:21:45) > Quoting Daniel Vetter (2017-10-09 17:44:01) > > stop_machine is not really a locking primitive we should use, except > > when the hw folks tell us the hw is broken and that's the only way to > > work around it. > > > > This patch tries to address the locking abuse of stop_machine() from > > > > commit 20e4933c478a1ca694b38fa4ac44d99e659941f5 > > Author: Chris Wilson <chris@chris-wilson.co.uk> > > Date: Tue Nov 22 14:41:21 2016 +0000 > > > > drm/i915: Stop the machine as we install the wedged submit_request handler > > > > Chris said parts of the reasons for going with stop_machine() was that > > it's no overhead for the fast-path. But these callbacks use irqsave > > spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast. > > > > To stay as close as possible to the stop_machine semantics we first > > update all the submit function pointers to the nop handler, then call > > synchronize_rcu() to make sure no new requests can be submitted. This > > should give us exactly the huge barrier we want. > > > > I pondered whether we should annotate engine->submit_request as __rcu > > and use rcu_assign_pointer and rcu_dereference on it. But the reason > > behind those is to make sure the compiler/cpu barriers are there for > > when you have an actual data structure you point at, to make sure all > > the writes are seen correctly on the read side. But we just have a > > function pointer, and .text isn't changed, so no need for these > > barriers and hence no need for annotations. > > > > Unfortunately there's a complication with the call to > > intel_engine_init_global_seqno: > > This is still broken in the same way as nop_submit_request may execute > while you sleep, breaking cancel_requests. My apologies, I misread the diff and didn't catch the removal of init_seqno. From that pov, the execlists should be intact and I can't see a way for a lack of -EIO being reported. (That we have them is a topic for another day, as set-wedged is not meant to be regarded as a normal operation, but user data loss.) The residual panic I have is that we had to throw set-wedged vs unwedge ordering out of the window; it used to be struct_mutex. Making set-wedged prolonged only makes the window wider, it didn't create that window, and at the moment it's carefully ordered inside the i915_handle_error fallback. (We blissfully ignore debugfs.) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> -Chris
Style nits: Quoting Daniel Vetter (2017-10-09 17:44:01) > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 82a10036fb38..8d7d8d1f78db 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -3057,49 +3057,71 @@ static void nop_submit_request(struct drm_i915_gem_request *request) > > spin_lock_irqsave(&request->engine->timeline->lock, flags); > __i915_gem_request_submit(request); > - intel_engine_init_global_seqno(request->engine, request->global_seqno); > spin_unlock_irqrestore(&request->engine->timeline->lock, flags); This reduced to i915_gem_request_submit(). Hah, you can just engine->submit_request = i915_gem_request_submit, and keep the nop_submit_request for the second phase! That may make the diff neater? > -static int __i915_gem_set_wedged_BKL(void *data) > +void i915_gem_set_wedged(struct drm_i915_private *i915) > { > - struct drm_i915_private *i915 = data; > struct intel_engine_cs *engine; > enum intel_engine_id id; > > + /* First, stop submission to hw, but do not yet complete requests by > + * rolling the global seqno forward (since this would complete requests > + * for which we haven't set the fence error to EIO yet). > + */ I'm doing a quiet fixup of all my comments to follow /* * The core convention, which you normally use anyway. */ > for_each_engine(engine, i915, id) > - engine_set_wedged(engine); > + engine->submit_request = nop_submit_request; > > - set_bit(I915_WEDGED, &i915->gpu_error.flags); > - wake_up_all(&i915->gpu_error.reset_queue); > + /* Make sure no one is running the old callback before we proceed with > + * cancelling requests and resetting the completion tracking. Otherwise > + * we might submit a request to the hardware which never completes. > + */ > + synchronize_rcu(); > > - return 0; > -} > + for_each_engine(engine, i915, id) { > + /* Mark all executing requests as skipped */ > + engine->cancel_requests(engine); > > -void i915_gem_set_wedged(struct drm_i915_private *dev_priv) > -{ > - stop_machine(__i915_gem_set_wedged_BKL, dev_priv, NULL); > + /* Only once we've force-cancelled all in-flight requests can we > + * start to complete all requests. > + */ > + engine->submit_request = nop_complete_submit_request; > + } > + > + /* Make sure no request can slip through without getting completed by > + * either this call here to intel_engine_init_global_seqno, or the one > + * in nop_complete_submit_request. > + */ > + synchronize_rcu(); > + > + for_each_engine(engine, i915, id) { > + unsigned long flags; > + > + /* Mark all pending requests as complete so that any concurrent > + * (lockless) lookup doesn't try and wait upon the request as we > + * reset it. > + */ > + spin_lock_irqsave(&engine->timeline->lock, flags); > + intel_engine_init_global_seqno(engine, > + intel_engine_last_submit(engine)); > + spin_unlock_irqrestore(&engine->timeline->lock, flags); > + } > + > + set_bit(I915_WEDGED, &i915->gpu_error.flags); > + wake_up_all(&i915->gpu_error.reset_queue); > } > > bool i915_gem_unset_wedged(struct drm_i915_private *i915) > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c > index b100b38f1dd2..ef78a85cb845 100644 > --- a/drivers/gpu/drm/i915/i915_gem_request.c > +++ b/drivers/gpu/drm/i915/i915_gem_request.c > @@ -556,7 +556,9 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) > switch (state) { > case FENCE_COMPLETE: > trace_i915_gem_request_submit(request); > + rcu_read_lock(); This trick needs a comment. /* * We need to serialize use of the submit_request() callback with its * hotplugging performed during an emergency i915_gem_set_wedged(). * We use the RCU mechanism to mark the critical section in order to * force i915_gem_set_wedged() to wait until the submit_request() is * completed before proceeding. */ -Chris
On Tue, Oct 10, 2017 at 3:39 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote: > Style nits: > > Quoting Daniel Vetter (2017-10-09 17:44:01) >> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c >> index 82a10036fb38..8d7d8d1f78db 100644 >> --- a/drivers/gpu/drm/i915/i915_gem.c >> +++ b/drivers/gpu/drm/i915/i915_gem.c >> @@ -3057,49 +3057,71 @@ static void nop_submit_request(struct drm_i915_gem_request *request) >> >> spin_lock_irqsave(&request->engine->timeline->lock, flags); >> __i915_gem_request_submit(request); >> - intel_engine_init_global_seqno(request->engine, request->global_seqno); >> spin_unlock_irqrestore(&request->engine->timeline->lock, flags); > > This reduced to i915_gem_request_submit(). > > Hah, you can just engine->submit_request = i915_gem_request_submit, and > keep the nop_submit_request for the second phase! That may make the > diff neater? We need to call dma_fence_set_error still, and we need to start doing that before we start calling ->cancel_request, or we might miss some requests. I'll do all the other suggestions and resubmit. -Daniel >> -static int __i915_gem_set_wedged_BKL(void *data) >> +void i915_gem_set_wedged(struct drm_i915_private *i915) >> { >> - struct drm_i915_private *i915 = data; >> struct intel_engine_cs *engine; >> enum intel_engine_id id; >> >> + /* First, stop submission to hw, but do not yet complete requests by >> + * rolling the global seqno forward (since this would complete requests >> + * for which we haven't set the fence error to EIO yet). >> + */ > > I'm doing a quiet fixup of all my comments to follow > /* > * The core convention, which you normally use anyway. > */ > >> for_each_engine(engine, i915, id) >> - engine_set_wedged(engine); >> + engine->submit_request = nop_submit_request; >> >> - set_bit(I915_WEDGED, &i915->gpu_error.flags); >> - wake_up_all(&i915->gpu_error.reset_queue); >> + /* Make sure no one is running the old callback before we proceed with >> + * cancelling requests and resetting the completion tracking. Otherwise >> + * we might submit a request to the hardware which never completes. >> + */ >> + synchronize_rcu(); >> >> - return 0; >> -} >> + for_each_engine(engine, i915, id) { >> + /* Mark all executing requests as skipped */ >> + engine->cancel_requests(engine); >> >> -void i915_gem_set_wedged(struct drm_i915_private *dev_priv) >> -{ >> - stop_machine(__i915_gem_set_wedged_BKL, dev_priv, NULL); >> + /* Only once we've force-cancelled all in-flight requests can we >> + * start to complete all requests. >> + */ >> + engine->submit_request = nop_complete_submit_request; >> + } >> + >> + /* Make sure no request can slip through without getting completed by >> + * either this call here to intel_engine_init_global_seqno, or the one >> + * in nop_complete_submit_request. >> + */ >> + synchronize_rcu(); >> + >> + for_each_engine(engine, i915, id) { >> + unsigned long flags; >> + >> + /* Mark all pending requests as complete so that any concurrent >> + * (lockless) lookup doesn't try and wait upon the request as we >> + * reset it. >> + */ >> + spin_lock_irqsave(&engine->timeline->lock, flags); >> + intel_engine_init_global_seqno(engine, >> + intel_engine_last_submit(engine)); >> + spin_unlock_irqrestore(&engine->timeline->lock, flags); >> + } >> + >> + set_bit(I915_WEDGED, &i915->gpu_error.flags); >> + wake_up_all(&i915->gpu_error.reset_queue); >> } >> >> bool i915_gem_unset_wedged(struct drm_i915_private *i915) >> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c >> index b100b38f1dd2..ef78a85cb845 100644 >> --- a/drivers/gpu/drm/i915/i915_gem_request.c >> +++ b/drivers/gpu/drm/i915/i915_gem_request.c >> @@ -556,7 +556,9 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) >> switch (state) { >> case FENCE_COMPLETE: >> trace_i915_gem_request_submit(request); >> + rcu_read_lock(); > > This trick needs a comment. > > /* > * We need to serialize use of the submit_request() callback with its > * hotplugging performed during an emergency i915_gem_set_wedged(). > * We use the RCU mechanism to mark the critical section in order to > * force i915_gem_set_wedged() to wait until the submit_request() is > * completed before proceeding. > */ > -Chris
====================================================== WARNING: possible circular locking dependency detected 4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G U ------------------------------------------------------ kworker/3:4/562 is trying to acquire lock: (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8113d4bc>] stop_machine+0x1c/0x40 but task is already holding lock: (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #6 (&dev->struct_mutex){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 __mutex_lock+0x86/0x9b0 mutex_lock_interruptible_nested+0x1b/0x20 i915_mutex_lock_interruptible+0x51/0x130 [i915] i915_gem_fault+0x209/0x650 [i915] __do_fault+0x1e/0x80 __handle_mm_fault+0xa08/0xed0 handle_mm_fault+0x156/0x300 __do_page_fault+0x2c5/0x570 do_page_fault+0x28/0x250 page_fault+0x22/0x30 -> #5 (&mm->mmap_sem){++++}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 __might_fault+0x68/0x90 _copy_to_user+0x23/0x70 filldir+0xa5/0x120 dcache_readdir+0xf9/0x170 iterate_dir+0x69/0x1a0 SyS_getdents+0xa5/0x140 entry_SYSCALL_64_fastpath+0x1c/0xb1 -> #4 (&sb->s_type->i_mutex_key#5){++++}: down_write+0x3b/0x70 handle_create+0xcb/0x1e0 devtmpfsd+0x139/0x180 kthread+0x152/0x190 ret_from_fork+0x27/0x40 -> #3 ((complete)&req.done){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 wait_for_common+0x58/0x210 wait_for_completion+0x1d/0x20 devtmpfs_create_node+0x13d/0x160 device_add+0x5eb/0x620 device_create_groups_vargs+0xe0/0xf0 device_create+0x3a/0x40 msr_device_create+0x2b/0x40 cpuhp_invoke_callback+0xc9/0xbf0 cpuhp_thread_fun+0x17b/0x240 smpboot_thread_fn+0x18a/0x280 kthread+0x152/0x190 ret_from_fork+0x27/0x40 -> #2 (cpuhp_state-up){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 cpuhp_issue_call+0x133/0x1c0 __cpuhp_setup_state_cpuslocked+0x139/0x2a0 __cpuhp_setup_state+0x46/0x60 page_writeback_init+0x43/0x67 pagecache_init+0x3d/0x42 start_kernel+0x3a8/0x3fc x86_64_start_reservations+0x2a/0x2c x86_64_start_kernel+0x6d/0x70 verify_cpu+0x0/0xfb -> #1 (cpuhp_state_mutex){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 __mutex_lock+0x86/0x9b0 mutex_lock_nested+0x1b/0x20 __cpuhp_setup_state_cpuslocked+0x53/0x2a0 __cpuhp_setup_state+0x46/0x60 page_alloc_init+0x28/0x30 start_kernel+0x145/0x3fc x86_64_start_reservations+0x2a/0x2c x86_64_start_kernel+0x6d/0x70 verify_cpu+0x0/0xfb -> #0 (cpu_hotplug_lock.rw_sem){++++}: check_prev_add+0x430/0x840 __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 cpus_read_lock+0x3d/0xb0 stop_machine+0x1c/0x40 i915_gem_set_wedged+0x1a/0x20 [i915] i915_reset+0xb9/0x230 [i915] i915_reset_device+0x1f6/0x260 [i915] i915_handle_error+0x2d8/0x430 [i915] hangcheck_declare_hang+0xd3/0xf0 [i915] i915_hangcheck_elapsed+0x262/0x2d0 [i915] process_one_work+0x233/0x660 worker_thread+0x4e/0x3b0 kthread+0x152/0x190 ret_from_fork+0x27/0x40 other info that might help us debug this: Chain exists of: cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev->struct_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&dev->struct_mutex); lock(&mm->mmap_sem); lock(&dev->struct_mutex); lock(cpu_hotplug_lock.rw_sem); *** DEADLOCK *** 3 locks held by kworker/3:4/562: #0: ("events_long"){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660 #1: ((&(&i915->gpu_error.hangcheck_work)->work)){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660 #2: (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915] stack backtrace: CPU: 3 PID: 562 Comm: kworker/3:4 Tainted: G U 4.14.0-rc3-CI-CI_DRM_3179+ #1 Hardware name: /NUC7i5BNB, BIOS BNKBL357.86A.0048.2017.0704.1415 07/04/2017 Workqueue: events_long i915_hangcheck_elapsed [i915] Call Trace: dump_stack+0x68/0x9f print_circular_bug+0x235/0x3c0 ? lockdep_init_map_crosslock+0x20/0x20 check_prev_add+0x430/0x840 ? irq_work_queue+0x86/0xe0 ? wake_up_klogd+0x53/0x70 __lock_acquire+0x1420/0x15e0 ? __lock_acquire+0x1420/0x15e0 ? lockdep_init_map_crosslock+0x20/0x20 lock_acquire+0xb0/0x200 ? stop_machine+0x1c/0x40 ? i915_gem_object_truncate+0x50/0x50 [i915] cpus_read_lock+0x3d/0xb0 ? stop_machine+0x1c/0x40 stop_machine+0x1c/0x40 i915_gem_set_wedged+0x1a/0x20 [i915] i915_reset+0xb9/0x230 [i915] i915_reset_device+0x1f6/0x260 [i915] ? gen8_gt_irq_ack+0x170/0x170 [i915] ? work_on_cpu_safe+0x60/0x60 i915_handle_error+0x2d8/0x430 [i915] ? vsnprintf+0xd1/0x4b0 ? scnprintf+0x3a/0x70 hangcheck_declare_hang+0xd3/0xf0 [i915] ? intel_runtime_pm_put+0x56/0xa0 [i915] i915_hangcheck_elapsed+0x262/0x2d0 [i915] process_one_work+0x233/0x660 worker_thread+0x4e/0x3b0 kthread+0x152/0x190 ? process_one_work+0x660/0x660 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x27/0x40 Setting dangerous option reset - tainting kernel i915 0000:00:02.0: Resetting chip after gpu hang Setting dangerous option reset - tainting kernel i915 0000:00:02.0: Resetting chip after gpu hang v2: Have 1 global synchronize_rcu() barrier across all engines, and improve commit message. v3: We need to protect the seqno update with the timeline spinlock (in set_wedged) to avoid racing with other updates of the seqno, like we already do in nop_submit_request (Chris). v4: Use two-phase sequence to plug the race Chris spotted where we can complete requests before they're marked up with -EIO. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102886 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103096 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marta Lofstedt <marta.lofstedt@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> --- drivers/gpu/drm/i915/i915_gem.c | 76 +++++++++++++++-------- drivers/gpu/drm/i915/i915_gem_request.c | 2 + drivers/gpu/drm/i915/selftests/i915_gem_request.c | 2 + 3 files changed, 53 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 82a10036fb38..8d7d8d1f78db 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3057,49 +3057,71 @@ static void nop_submit_request(struct drm_i915_gem_request *request) spin_lock_irqsave(&request->engine->timeline->lock, flags); __i915_gem_request_submit(request); - intel_engine_init_global_seqno(request->engine, request->global_seqno); spin_unlock_irqrestore(&request->engine->timeline->lock, flags); } -static void engine_set_wedged(struct intel_engine_cs *engine) +static void nop_complete_submit_request(struct drm_i915_gem_request *request) { - /* We need to be sure that no thread is running the old callback as - * we install the nop handler (otherwise we would submit a request - * to hardware that will never complete). In order to prevent this - * race, we wait until the machine is idle before making the swap - * (using stop_machine()). - */ - engine->submit_request = nop_submit_request; + unsigned long flags; - /* Mark all executing requests as skipped */ - engine->cancel_requests(engine); + GEM_BUG_ON(!i915_terminally_wedged(&request->i915->gpu_error)); + dma_fence_set_error(&request->fence, -EIO); - /* Mark all pending requests as complete so that any concurrent - * (lockless) lookup doesn't try and wait upon the request as we - * reset it. - */ - intel_engine_init_global_seqno(engine, - intel_engine_last_submit(engine)); + spin_lock_irqsave(&request->engine->timeline->lock, flags); + __i915_gem_request_submit(request); + intel_engine_init_global_seqno(request->engine, request->global_seqno); + spin_unlock_irqrestore(&request->engine->timeline->lock, flags); } -static int __i915_gem_set_wedged_BKL(void *data) +void i915_gem_set_wedged(struct drm_i915_private *i915) { - struct drm_i915_private *i915 = data; struct intel_engine_cs *engine; enum intel_engine_id id; + /* First, stop submission to hw, but do not yet complete requests by + * rolling the global seqno forward (since this would complete requests + * for which we haven't set the fence error to EIO yet). + */ for_each_engine(engine, i915, id) - engine_set_wedged(engine); + engine->submit_request = nop_submit_request; - set_bit(I915_WEDGED, &i915->gpu_error.flags); - wake_up_all(&i915->gpu_error.reset_queue); + /* Make sure no one is running the old callback before we proceed with + * cancelling requests and resetting the completion tracking. Otherwise + * we might submit a request to the hardware which never completes. + */ + synchronize_rcu(); - return 0; -} + for_each_engine(engine, i915, id) { + /* Mark all executing requests as skipped */ + engine->cancel_requests(engine); -void i915_gem_set_wedged(struct drm_i915_private *dev_priv) -{ - stop_machine(__i915_gem_set_wedged_BKL, dev_priv, NULL); + /* Only once we've force-cancelled all in-flight requests can we + * start to complete all requests. + */ + engine->submit_request = nop_complete_submit_request; + } + + /* Make sure no request can slip through without getting completed by + * either this call here to intel_engine_init_global_seqno, or the one + * in nop_complete_submit_request. + */ + synchronize_rcu(); + + for_each_engine(engine, i915, id) { + unsigned long flags; + + /* Mark all pending requests as complete so that any concurrent + * (lockless) lookup doesn't try and wait upon the request as we + * reset it. + */ + spin_lock_irqsave(&engine->timeline->lock, flags); + intel_engine_init_global_seqno(engine, + intel_engine_last_submit(engine)); + spin_unlock_irqrestore(&engine->timeline->lock, flags); + } + + set_bit(I915_WEDGED, &i915->gpu_error.flags); + wake_up_all(&i915->gpu_error.reset_queue); } bool i915_gem_unset_wedged(struct drm_i915_private *i915) diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index b100b38f1dd2..ef78a85cb845 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -556,7 +556,9 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) switch (state) { case FENCE_COMPLETE: trace_i915_gem_request_submit(request); + rcu_read_lock(); request->engine->submit_request(request); + rcu_read_unlock(); break; case FENCE_FREE: diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_request.c b/drivers/gpu/drm/i915/selftests/i915_gem_request.c index 78b9f811707f..a999161e8db1 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_request.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_request.c @@ -215,7 +215,9 @@ static int igt_request_rewind(void *arg) } i915_gem_request_get(vip); i915_add_request(vip); + rcu_read_lock(); request->engine->submit_request(request); + rcu_read_unlock(); mutex_unlock(&i915->drm.struct_mutex);