Message ID | 1360940866-22435-2-git-send-email-ville.syrjala@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Feb 15, 2013 at 05:07:44PM +0200, ville.syrjala@linux.intel.com wrote: > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Someone may be waiting for a flip that will never complete due to a GPU > reset. Wake up all such waiters after the GPU reset processing has > finished. > > v2: Dropped the wake_up_all() from i915_handle_error() since > we no longer wait for pending flips with struct_mutex held. Isn't the wake_up(pending_flip_queue) superseded by performing the explicit do_intel_finish_page_flip() in patch 3? -Chris
On Fri, Feb 15, 2013 at 11:53:14PM +0000, Chris Wilson wrote: > On Fri, Feb 15, 2013 at 05:07:44PM +0200, ville.syrjala@linux.intel.com wrote: > > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > > Someone may be waiting for a flip that will never complete due to a GPU > > reset. Wake up all such waiters after the GPU reset processing has > > finished. > > > > v2: Dropped the wake_up_all() from i915_handle_error() since > > we no longer wait for pending flips with struct_mutex held. > > Isn't the wake_up(pending_flip_queue) superseded by performing the > explicit do_intel_finish_page_flip() in patch 3? Yes that's correct. But I actually forgot to remove the wake_up patch from my tree when I tested this. I'll run a few more tests just to make sure it still works.
On Mon, Feb 18, 2013 at 11:58:23AM +0200, Ville Syrjälä wrote: > On Fri, Feb 15, 2013 at 11:53:14PM +0000, Chris Wilson wrote: > > On Fri, Feb 15, 2013 at 05:07:44PM +0200, ville.syrjala@linux.intel.com wrote: > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > > > > Someone may be waiting for a flip that will never complete due to a GPU > > > reset. Wake up all such waiters after the GPU reset processing has > > > finished. > > > > > > v2: Dropped the wake_up_all() from i915_handle_error() since > > > we no longer wait for pending flips with struct_mutex held. > > > > Isn't the wake_up(pending_flip_queue) superseded by performing the > > explicit do_intel_finish_page_flip() in patch 3? > > Yes that's correct. But I actually forgot to remove the wake_up patch > from my tree when I tested this. I'll run a few more tests just to make > sure it still works. I just tried it w/o the wake_up_all() and unfortunately it hung :( Need to think about it a bit more I suppose.
On Mon, Feb 18, 2013 at 02:41:00PM +0200, Ville Syrjälä wrote: > On Mon, Feb 18, 2013 at 11:58:23AM +0200, Ville Syrjälä wrote: > > On Fri, Feb 15, 2013 at 11:53:14PM +0000, Chris Wilson wrote: > > > On Fri, Feb 15, 2013 at 05:07:44PM +0200, ville.syrjala@linux.intel.com wrote: > > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > > > > > > Someone may be waiting for a flip that will never complete due to a GPU > > > > reset. Wake up all such waiters after the GPU reset processing has > > > > finished. > > > > > > > > v2: Dropped the wake_up_all() from i915_handle_error() since > > > > we no longer wait for pending flips with struct_mutex held. > > > > > > Isn't the wake_up(pending_flip_queue) superseded by performing the > > > explicit do_intel_finish_page_flip() in patch 3? > > > > Yes that's correct. But I actually forgot to remove the wake_up patch > > from my tree when I tested this. I'll run a few more tests just to make > > sure it still works. > > I just tried it w/o the wake_up_all() and unfortunately it hung :( > > Need to think about it a bit more I suppose. Well after a bit more though I think I know what happened: 1. page flip submitted on crtc which is not the first crtc in the list 2. mode set operation on any crtc (will grab all crtc locks) 3. reset handling loops over crtcs 3.1 first crtc doesn't have a pending flip, so pending_flip_queue is not woken up 3.2 code tries to grab first crtc mutex to update the base address -> deadlock So either I need to keep the wake_up_all() call in the reset handling, or I need to do two loops over the crtcs, first loop would complete all page flips, and second loop would update the base address registers. I don't really care which way we go. Anyone else have a preference?
On Mon, Feb 18, 2013 at 05:31:14PM +0200, Ville Syrjälä wrote: > On Mon, Feb 18, 2013 at 02:41:00PM +0200, Ville Syrjälä wrote: > > On Mon, Feb 18, 2013 at 11:58:23AM +0200, Ville Syrjälä wrote: > > > On Fri, Feb 15, 2013 at 11:53:14PM +0000, Chris Wilson wrote: > > > > On Fri, Feb 15, 2013 at 05:07:44PM +0200, ville.syrjala@linux.intel.com wrote: > > > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > > > > > > > > Someone may be waiting for a flip that will never complete due to a GPU > > > > > reset. Wake up all such waiters after the GPU reset processing has > > > > > finished. > > > > > > > > > > v2: Dropped the wake_up_all() from i915_handle_error() since > > > > > we no longer wait for pending flips with struct_mutex held. > > > > > > > > Isn't the wake_up(pending_flip_queue) superseded by performing the > > > > explicit do_intel_finish_page_flip() in patch 3? > > > > > > Yes that's correct. But I actually forgot to remove the wake_up patch > > > from my tree when I tested this. I'll run a few more tests just to make > > > sure it still works. > > > > I just tried it w/o the wake_up_all() and unfortunately it hung :( > > > > Need to think about it a bit more I suppose. > > Well after a bit more though I think I know what happened: > > 1. page flip submitted on crtc which is not the first crtc in the list > 2. mode set operation on any crtc (will grab all crtc locks) > 3. reset handling loops over crtcs > 3.1 first crtc doesn't have a pending flip, so pending_flip_queue is not woken up > 3.2 code tries to grab first crtc mutex to update the base address > -> deadlock > > So either I need to keep the wake_up_all() call in the reset handling, > or I need to do two loops over the crtcs, first loop would complete all > page flips, and second loop would update the base address registers. > I don't really care which way we go. Anyone else have a preference? I'd vote for the second approach of first waking everyone up, and then doing the modeset updates. I guess you could even use drm_modeset_lock_all, not really worth optimizing things here imo. Also please add a comment to explain this ordering constraint in the reset code. -Daniel
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 2cd97d1..672816a 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -915,6 +915,8 @@ static void i915_error_work_func(struct work_struct *work) for_each_ring(ring, dev_priv, i) wake_up_all(&ring->irq_queue); + wake_up_all(&dev_priv->pending_flip_queue); + wake_up_all(&dev_priv->gpu_error.reset_queue); } }