diff mbox

[2/5] drm/i915: use wait_event_timeout when waiting for flip completions

Message ID 1380311846-1581-2-git-send-email-jbarnes@virtuousgeek.org (mailing list archive)
State New, archived
Headers show

Commit Message

Jesse Barnes Sept. 27, 2013, 7:57 p.m. UTC
We're shutting the crtc off and don't want to hang forever.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
---
 drivers/gpu/drm/i915/intel_display.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Chris Wilson Sept. 27, 2013, 8:45 p.m. UTC | #1
On Fri, Sep 27, 2013 at 12:57:23PM -0700, Jesse Barnes wrote:
> We're shutting the crtc off and don't want to hang forever.

That scares me ever so slightly, especially ignoring the failure. Do you
want to reset the display controller if the interrupts die, or just
report a WARN, or propagate the failure back and make userspace try
again?

But not taking any action at all is a recipe for a silent disaster. It
has taken years to get to the point where we can turn off the pipes
reliably...
-Chris
Jesse Barnes Sept. 27, 2013, 9:27 p.m. UTC | #2
On Fri, 27 Sep 2013 21:45:39 +0100
Chris Wilson <chris@chris-wilson.co.uk> wrote:

> On Fri, Sep 27, 2013 at 12:57:23PM -0700, Jesse Barnes wrote:
> > We're shutting the crtc off and don't want to hang forever.
> 
> That scares me ever so slightly, especially ignoring the failure. Do you
> want to reset the display controller if the interrupts die, or just
> report a WARN, or propagate the failure back and make userspace try
> again?
> 
> But not taking any action at all is a recipe for a silent disaster. It
> has taken years to get to the point where we can turn off the pipes
> reliably...

Yeah this one is definitely a WIP.  I'd like to warn when this happens
and clean things up rather than just timing out and letting disaster
befall us later.
Daniel Vetter Sept. 28, 2013, 9:35 a.m. UTC | #3
On Fri, Sep 27, 2013 at 9:57 PM, Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
> We're shutting the crtc off and don't want to hang forever.

Reading the source and the test-suite is advisable ;-)
- We actually don't hang here if the MI_DISPLAY_FLIP doesn't happen.
- We already recover the display state (not so relevant here where we
shut it off, but in the set_base path where we have the same problem
with waiting for flips).
- We have extensive test coverage for gpu hangs vs. flips in all kinds
of contrived situations in igt.

That leaves us with the flip not completing in the hw after the
MI_DISPLAY_FLIP has executed. Usually that just means we miss a
workaround or have a bug in our code, and again we have extensive
testcases for this.

Furthermore the hang recover code is ridiculously tricky - just for
3.12-fixes I've hunted down 3 deadlocks in there. By bailing out too
early you have a good chance to confuse the code and actually make
matters worse ;-)

Cheers, Daniel
Jesse Barnes Sept. 28, 2013, 2:58 p.m. UTC | #4
On Sat, 28 Sep 2013 11:35:11 +0200
Daniel Vetter <daniel@ffwll.ch> wrote:

> On Fri, Sep 27, 2013 at 9:57 PM, Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
> > We're shutting the crtc off and don't want to hang forever.
> 
> Reading the source and the test-suite is advisable ;-)
> - We actually don't hang here if the MI_DISPLAY_FLIP doesn't happen.
> - We already recover the display state (not so relevant here where we
> shut it off, but in the set_base path where we have the same problem
> with waiting for flips).
> - We have extensive test coverage for gpu hangs vs. flips in all kinds
> of contrived situations in igt.
> 
> That leaves us with the flip not completing in the hw after the
> MI_DISPLAY_FLIP has executed. Usually that just means we miss a
> workaround or have a bug in our code, and again we have extensive
> testcases for this.
> 
> Furthermore the hang recover code is ridiculously tricky - just for
> 3.12-fixes I've hunted down 3 deadlocks in there. By bailing out too
> early you have a good chance to confuse the code and actually make
> matters worse ;-)

Sorry replied from my phone earlier.

Maybe you should read my earlier reply to Chris.

It's all very nice to say "it doesn't hang here".  However it does hang
waiting in this function on this machine.

It could be that the DPLL CRI clock source fix fixes that, I haven't
re-tested yet.

But this was also not meant for submission, and the earlier reply made
that clear (plus the obvious stuff Chris pointed out), so I don't need
the extra snark in this message.

On top of that, if this code has become so complex and fragile, maybe
it's time to re-think it.  Maybe we should drop ring based flips
altogether and just go with MMIO, and forget about all the races and
even pipe off cases, and just emit completions at vblank time or
immediately after the MMIO is written.
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index bc25481..8da1c96 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2912,8 +2912,9 @@  static void intel_crtc_wait_for_pending_flips(struct drm_crtc *crtc)
 
 	WARN_ON(waitqueue_active(&dev_priv->pending_flip_queue));
 
-	wait_event(dev_priv->pending_flip_queue,
-		   !intel_crtc_has_pending_flip(crtc));
+	wait_event_timeout(dev_priv->pending_flip_queue,
+			   !intel_crtc_has_pending_flip(crtc),
+			   msecs_to_jiffies(50));
 
 	mutex_lock(&dev->struct_mutex);
 	intel_finish_fb(crtc->fb);