diff mbox

[Intel-gfx] drm/i915: WARN_ON_ONCE(!crtc_clock || cdclk < crtc_clock)

Message ID 1477342074.1872.24.camel@tiscali.nl (mailing list archive)
State New, archived
Headers show

Commit Message

Paul Bolle Oct. 24, 2016, 8:47 p.m. UTC
[Detailed post, but please give it a quick scan.]

On Wed, 2016-10-12 at 14:06 +0200, Paul Bolle wrote:
> On Wed, 2016-10-12 at 14:08 +0300, Joonas Lahtinen wrote:
> > Bisecting the offending commit between v4.8 and v4.8.1 would be a good
> > start.
> 
> That would be between v4.7 and v4.8. (I guess my report was
> ambiguous.)
> 
> That might take some time. Because bisecting always takes a long time
> and especially since hitting this WARNING sometimes takes over an hour.
> Anyhow, please prod me if I stay silent for too long.

0) So I've lost my courage to do a bisect when my first attempt landed
me in v4.6-rc3. This is about for issue popping up between v4.7 and
v4.8-rc1.

1) So I used the most reliable debugging tool that I actually
understand: printk():


2) This taught me that crtc_clock always is 373250 on my machine. cdclk
mostly is 450000, but every now and then it briefly is 337500.

3) Now the interesting pattern is that cdclk drops to 337500 only after
two quick calls of skl_max_scale() with cdclk set to 450000, and a
roughly 300ms pause before the third call of that function. Example:

<7>[23758.501933] i915: cdclk >= crtc_clock: 450000 >= 373250
<7>[23758.515211] i915: cdclk >= crtc_clock: 450000 >= 373250
<7>[23758.869057] i915: cdclk < crtc_clock: 337500 < 373250

This pattern of cdclk being 337500 after roughly 300ms is surprisingly
stable.

4) So _perhaps_ there's some roughly 300ms timeout, somehow, somewhere,
that sets cdclk to 337500 and triggers this issue. Ideas?

To be continued,


Paul Bolle

Comments

Jani Nikula Oct. 25, 2016, 8:37 a.m. UTC | #1
On Mon, 24 Oct 2016, Paul Bolle <pebolle@tiscali.nl> wrote:
> [Detailed post, but please give it a quick scan.]

Please file the information in the bug you filed. Please attach dmesg
(again, on the bug) with drm.debug=14 and running your patch.

BR,
Jani.

>
> On Wed, 2016-10-12 at 14:06 +0200, Paul Bolle wrote:
>> On Wed, 2016-10-12 at 14:08 +0300, Joonas Lahtinen wrote:
>> > Bisecting the offending commit between v4.8 and v4.8.1 would be a good
>> > start.
>> 
>> That would be between v4.7 and v4.8. (I guess my report was
>> ambiguous.)
>> 
>> That might take some time. Because bisecting always takes a long time
>> and especially since hitting this WARNING sometimes takes over an hour.
>> Anyhow, please prod me if I stay silent for too long.
>
> 0) So I've lost my courage to do a bisect when my first attempt landed
> me in v4.6-rc3. This is about for issue popping up between v4.7 and
> v4.8-rc1.
>
> 1) So I used the most reliable debugging tool that I actually
> understand: printk():
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index fbcfed63a76e..791de414cf1e 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -14771,10 +14771,16 @@ skl_max_scale(struct intel_crtc *intel_crtc, struct intel_crtc_state *crtc_state
>                 return DRM_PLANE_HELPER_NO_SCALING;
>  
>         crtc_clock = crtc_state->base.adjusted_mode.crtc_clock;
> -       cdclk = to_intel_atomic_state(crtc_state->base.state)->cdclk;
> +       if (WARN_ON_ONCE(!crtc_clock))
> +               return DRM_PLANE_HELPER_NO_SCALING;
>  
> -       if (WARN_ON_ONCE(!crtc_clock || cdclk < crtc_clock))
> +       cdclk = to_intel_atomic_state(crtc_state->base.state)->cdclk;
> +       if (WARN_ON_ONCE(cdclk < crtc_clock)) {
> +               printk(KERN_DEBUG "i915: cdclk < crtc_clock: %d < %d\n", cdclk, crtc_clock);
>                 return DRM_PLANE_HELPER_NO_SCALING;
> +       }
> +
> +       printk_ratelimited(KERN_DEBUG "i915: cdclk >= crtc_clock: %d >= %d\n", cdclk, crtc_clock);
>  
>         /*
>          * skl max scale is lower of:
>
> 2) This taught me that crtc_clock always is 373250 on my machine. cdclk
> mostly is 450000, but every now and then it briefly is 337500.
>
> 3) Now the interesting pattern is that cdclk drops to 337500 only after
> two quick calls of skl_max_scale() with cdclk set to 450000, and a
> roughly 300ms pause before the third call of that function. Example:
>
> <7>[23758.501933] i915: cdclk >= crtc_clock: 450000 >= 373250
> <7>[23758.515211] i915: cdclk >= crtc_clock: 450000 >= 373250
> <7>[23758.869057] i915: cdclk < crtc_clock: 337500 < 373250
>
> This pattern of cdclk being 337500 after roughly 300ms is surprisingly
> stable.
>
> 4) So _perhaps_ there's some roughly 300ms timeout, somehow, somewhere,
> that sets cdclk to 337500 and triggers this issue. Ideas?
>
> To be continued,
>
>
> Paul Bolle
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index fbcfed63a76e..791de414cf1e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -14771,10 +14771,16 @@  skl_max_scale(struct intel_crtc *intel_crtc, struct intel_crtc_state *crtc_state
                return DRM_PLANE_HELPER_NO_SCALING;
 
        crtc_clock = crtc_state->base.adjusted_mode.crtc_clock;
-       cdclk = to_intel_atomic_state(crtc_state->base.state)->cdclk;
+       if (WARN_ON_ONCE(!crtc_clock))
+               return DRM_PLANE_HELPER_NO_SCALING;
 
-       if (WARN_ON_ONCE(!crtc_clock || cdclk < crtc_clock))
+       cdclk = to_intel_atomic_state(crtc_state->base.state)->cdclk;
+       if (WARN_ON_ONCE(cdclk < crtc_clock)) {
+               printk(KERN_DEBUG "i915: cdclk < crtc_clock: %d < %d\n", cdclk, crtc_clock);
                return DRM_PLANE_HELPER_NO_SCALING;
+       }
+
+       printk_ratelimited(KERN_DEBUG "i915: cdclk >= crtc_clock: %d >= %d\n", cdclk, crtc_clock);
 
        /*
         * skl max scale is lower of: