diff mbox

drm/i915: Fix cdclk vs. dev_cdclk mess when not recomputing things

Message ID 1477673960-3274-1-git-send-email-ville.syrjala@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ville Syrjala Oct. 28, 2016, 4:59 p.m. UTC
From: Ville Syrjälä <ville.syrjala@linux.intel.com>

When we end up not recomputing the cdclk, we need to populate
intel_state->cdclk with the "atomic_cdclk_freq" instead of the
current cdclk_freq. When no pipes are active, the actual cdclk_freq
may be lower than what the configuration of the planes and
pipes would require from the point of view of the software state.

intel_state->dev_cdclk is the computed actual cdclk in such cases,
so let's populate that with the current cdclk value. Although basically
nothing should ever use dev_cdclk for any checks and whatnot.

This fixes bogus WARNS from skl_max_scale() which is trying to check
the plane software state against the cdclk frequency. So any time
it got called during DPMS off for instance, we might have tripped
the warn if the current mode would have required a higher than
minimum cdclk.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: bruno.pagani@ens-lyon.org
Cc: Daniel J Blueman <daniel.blueman@gmail.com>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: Joseph Yasi <joe.yasi@gmail.com>
Tested-by: Paul Bolle <pebolle@tiscali.nl>
Tested-by: Joseph Yasi <joe.yasi@gmail.com>
Cc: stable@vger.kernel.org
Fixes: 1a617b77658e ("drm/i915: Keep track of the cdclk as if all crtc's were active.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98214
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

Comments

Paul Bolle Oct. 28, 2016, 9:05 p.m. UTC | #1
On Fri, 2016-10-28 at 19:59 +0300, ville.syrjala@linux.intel.com wrote:
> Fixes: 1a617b77658e ("drm/i915: Keep track of the cdclk as if all crtc's were active.")

Obviously, I'm pretty happy with this patch. One question though: this
fixes a commit that shipped in v4.6. Do you have any idea why this
issue apparently never surfaced before v4.8?

Thanks,


Paul Bolle
Ville Syrjala Oct. 28, 2016, 9:30 p.m. UTC | #2
On Fri, Oct 28, 2016 at 11:05:48PM +0200, Paul Bolle wrote:
> On Fri, 2016-10-28 at 19:59 +0300, ville.syrjala@linux.intel.com wrote:
> > Fixes: 1a617b77658e ("drm/i915: Keep track of the cdclk as if all crtc's were active.")
> 
> Obviously, I'm pretty happy with this patch. One question though: this
> fixes a commit that shipped in v4.6. Do you have any idea why this
> issue apparently never surfaced before v4.8?

Pop quiz time, eh? :)

I think it was probably due to

commit 44d1240d006c9cd0249263b5449c8e4752500f6a
Author: Marek Szyprowski <m.szyprowski@samsung.com>
Date:   Mon Jun 13 11:11:26 2016 +0200

    drm: add generic zpos property

If you want want to grade my answer all you have to do is revert that
on 4.8 and see what happens. Might be interesting to see if I'm right
actually since the link between the WARN and that commit isn't entirely
obvious.
Paul Bolle Oct. 28, 2016, 9:38 p.m. UTC | #3
On Sat, 2016-10-29 at 00:30 +0300, Ville Syrjälä wrote:
> I think it was probably due to
> 
> commit 44d1240d006c9cd0249263b5449c8e4752500f6a
> Author: Marek Szyprowski <m.szyprowski@samsung.com>
> Date:   Mon Jun 13 11:11:26 2016 +0200
> 
>     drm: add generic zpos property
> 
> If you want want to grade my answer all you have to do is revert that
> on 4.8 and see what happens. Might be interesting to see if I'm right
> actually since the link between the WARN and that commit isn't entirely
> obvious.

You mean reverting commit 44d1240d006c ("drm: add generic zpos
property") and not applying this fix, right?

Anyway, I'm happy to grade your answer in a few days. Please prod if
notifying you of your grade takes too long.


Paul Bolle
Ville Syrjala Oct. 28, 2016, 10:01 p.m. UTC | #4
On Fri, Oct 28, 2016 at 11:38:14PM +0200, Paul Bolle wrote:
> On Sat, 2016-10-29 at 00:30 +0300, Ville Syrjälä wrote:
> > I think it was probably due to
> > 
> > commit 44d1240d006c9cd0249263b5449c8e4752500f6a
> > Author: Marek Szyprowski <m.szyprowski@samsung.com>
> > Date:   Mon Jun 13 11:11:26 2016 +0200
> > 
> >     drm: add generic zpos property
> > 
> > If you want want to grade my answer all you have to do is revert that
> > on 4.8 and see what happens. Might be interesting to see if I'm right
> > actually since the link between the WARN and that commit isn't entirely
> > obvious.
> 
> You mean reverting commit 44d1240d006c ("drm: add generic zpos
> property") and not applying this fix, right?

Yep.

> Anyway, I'm happy to grade your answer in a few days. Please prod if
> notifying you of your grade takes too long.
> 
> 
> Paul Bolle
Maarten Lankhorst Nov. 1, 2016, 8:57 a.m. UTC | #5
Op 28-10-16 om 18:59 schreef ville.syrjala@linux.intel.com:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> When we end up not recomputing the cdclk, we need to populate
> intel_state->cdclk with the "atomic_cdclk_freq" instead of the
> current cdclk_freq. When no pipes are active, the actual cdclk_freq
> may be lower than what the configuration of the planes and
> pipes would require from the point of view of the software state.
>
> intel_state->dev_cdclk is the computed actual cdclk in such cases,
> so let's populate that with the current cdclk value. Although basically
> nothing should ever use dev_cdclk for any checks and whatnot.
>
> This fixes bogus WARNS from skl_max_scale() which is trying to check
> the plane software state against the cdclk frequency. So any time
> it got called during DPMS off for instance, we might have tripped
> the warn if the current mode would have required a higher than
> minimum cdclk.
>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Mika Kahola <mika.kahola@intel.com>
> Cc: bruno.pagani@ens-lyon.org
> Cc: Daniel J Blueman <daniel.blueman@gmail.com>
> Cc: Paul Bolle <pebolle@tiscali.nl>
> Cc: Joseph Yasi <joe.yasi@gmail.com>
> Tested-by: Paul Bolle <pebolle@tiscali.nl>
> Tested-by: Joseph Yasi <joe.yasi@gmail.com>
> Cc: stable@vger.kernel.org
> Fixes: 1a617b77658e ("drm/i915: Keep track of the cdclk as if all crtc's were active.")
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98214
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 895b3dc50e3f..f010e154e33e 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -14040,8 +14040,10 @@ static int intel_modeset_checks(struct drm_atomic_state *state)
>  
>  		DRM_DEBUG_KMS("New cdclk calculated to be atomic %u, actual %u\n",
>  			      intel_state->cdclk, intel_state->dev_cdclk);
> -	} else
> +	} else {
>  		to_intel_atomic_state(state)->cdclk = dev_priv->atomic_cdclk_freq;
> +		to_intel_atomic_state(state)->dev_cdclk = dev_priv->cdclk_freq;
> +	}
This shouldn't be required in this case, but might as well do so since it doesn't hurt either.
>  	intel_modeset_clear_plls(state);
>  
> @@ -14142,8 +14144,10 @@ static int intel_atomic_check(struct drm_device *dev,
>  
>  		if (ret)
>  			return ret;
> -	} else
> -		intel_state->cdclk = dev_priv->cdclk_freq;
> +	} else {
> +		intel_state->cdclk = dev_priv->atomic_cdclk_freq;
> +		intel_state->dev_cdclk = dev_priv->cdclk_freq;
> +	}
We shouldn't rely on dev_cdclk being valid for the !modeset case.
Best to keep it zero there, the global cdclk can't be changed and the non-modeset case shouldn't rely on the current setting.

Otherwise looks sane, I have a similar patch in my tree. I didn't submit it yet but the fix was similar. Except for the
dev_cdclk stuff.

With the last dev_cdclk assignment removed:

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Paul Bolle Nov. 1, 2016, 9:07 a.m. UTC | #6
On Tue, 2016-11-01 at 09:57 +0100, Maarten Lankhorst wrote:
> Otherwise looks sane, I have a similar patch in my tree. I didn't
> submit it yet but the fix was similar. Except for the
> dev_cdclk stuff.
> 
> With the last dev_cdclk assignment removed:
> 
> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

So I've been running this patch for a few days now. First I tested it
on top of v4.8.4. Now I'm running it on top of v4.8.5.

My current v4.8.5 boot saw this new (for me) *ERROR*, twice:
    <3>[43483.521341] [drm:skl_set_cdclk [i915]] *ERROR* failed to inform PCU about cdclk change
    <3>[108639.090776] [drm:skl_set_cdclk [i915]] *ERROR* failed to inform PCU about cdclk change

Related or a coincidence?

Thanks,


Paul Bolle
Ville Syrjala Nov. 1, 2016, 12:40 p.m. UTC | #7
On Tue, Nov 01, 2016 at 09:57:43AM +0100, Maarten Lankhorst wrote:
> Op 28-10-16 om 18:59 schreef ville.syrjala@linux.intel.com:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > When we end up not recomputing the cdclk, we need to populate
> > intel_state->cdclk with the "atomic_cdclk_freq" instead of the
> > current cdclk_freq. When no pipes are active, the actual cdclk_freq
> > may be lower than what the configuration of the planes and
> > pipes would require from the point of view of the software state.
> >
> > intel_state->dev_cdclk is the computed actual cdclk in such cases,
> > so let's populate that with the current cdclk value. Although basically
> > nothing should ever use dev_cdclk for any checks and whatnot.
> >
> > This fixes bogus WARNS from skl_max_scale() which is trying to check
> > the plane software state against the cdclk frequency. So any time
> > it got called during DPMS off for instance, we might have tripped
> > the warn if the current mode would have required a higher than
> > minimum cdclk.
> >
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Mika Kahola <mika.kahola@intel.com>
> > Cc: bruno.pagani@ens-lyon.org
> > Cc: Daniel J Blueman <daniel.blueman@gmail.com>
> > Cc: Paul Bolle <pebolle@tiscali.nl>
> > Cc: Joseph Yasi <joe.yasi@gmail.com>
> > Tested-by: Paul Bolle <pebolle@tiscali.nl>
> > Tested-by: Joseph Yasi <joe.yasi@gmail.com>
> > Cc: stable@vger.kernel.org
> > Fixes: 1a617b77658e ("drm/i915: Keep track of the cdclk as if all crtc's were active.")
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98214
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_display.c | 10 +++++++---
> >  1 file changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 895b3dc50e3f..f010e154e33e 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -14040,8 +14040,10 @@ static int intel_modeset_checks(struct drm_atomic_state *state)
> >  
> >  		DRM_DEBUG_KMS("New cdclk calculated to be atomic %u, actual %u\n",
> >  			      intel_state->cdclk, intel_state->dev_cdclk);
> > -	} else
> > +	} else {
> >  		to_intel_atomic_state(state)->cdclk = dev_priv->atomic_cdclk_freq;
> > +		to_intel_atomic_state(state)->dev_cdclk = dev_priv->cdclk_freq;
> > +	}
> This shouldn't be required in this case, but might as well do so since it doesn't hurt either.
> >  	intel_modeset_clear_plls(state);
> >  
> > @@ -14142,8 +14144,10 @@ static int intel_atomic_check(struct drm_device *dev,
> >  
> >  		if (ret)
> >  			return ret;
> > -	} else
> > -		intel_state->cdclk = dev_priv->cdclk_freq;
> > +	} else {
> > +		intel_state->cdclk = dev_priv->atomic_cdclk_freq;
> > +		intel_state->dev_cdclk = dev_priv->cdclk_freq;
> > +	}
> We shouldn't rely on dev_cdclk being valid for the !modeset case.
> Best to keep it zero there, the global cdclk can't be changed and the non-modeset case shouldn't rely on the current setting.

It should pretty much be protected by any of the crtc locks, at least
for now since we don't allow changing it w/o modesetting all the pipes.
But yeah, nothing should be using it for any checks so could just leave
it unset.

But this got me thinking about dev_priv->atomic_cdclk_freq. Essentially
that one is protected by connection_mutex, which we won't be holding
for the !modeset case. So I think using it there is a bit dubious. I
guess it would require a modeset on one pipe that doesn't actually
end up changing the cdclk frequency but which changes atomic_cdclk_freq,
and a parallel plane update on another pipe. I guess that would mean
both pipes would have be !active at the time so that dev_cdclk remains
stable. Seems to me that we'd need to lock all the crtcs (without
forcing a modeset on them) when atomic_cdclk changes.

> 
> Otherwise looks sane, I have a similar patch in my tree. I didn't submit it yet but the fix was similar. Except for the
> dev_cdclk stuff.
> 
> With the last dev_cdclk assignment removed:
> 
> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Ville Syrjala Nov. 1, 2016, 12:41 p.m. UTC | #8
On Tue, Nov 01, 2016 at 10:07:51AM +0100, Paul Bolle wrote:
> On Tue, 2016-11-01 at 09:57 +0100, Maarten Lankhorst wrote:
> > Otherwise looks sane, I have a similar patch in my tree. I didn't
> > submit it yet but the fix was similar. Except for the
> > dev_cdclk stuff.
> > 
> > With the last dev_cdclk assignment removed:
> > 
> > Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> 
> So I've been running this patch for a few days now. First I tested it
> on top of v4.8.4. Now I'm running it on top of v4.8.5.
> 
> My current v4.8.5 boot saw this new (for me) *ERROR*, twice:
>     <3>[43483.521341] [drm:skl_set_cdclk [i915]] *ERROR* failed to inform PCU about cdclk change
>     <3>[108639.090776] [drm:skl_set_cdclk [i915]] *ERROR* failed to inform PCU about cdclk change
> 
> Related or a coincidence?

Not directly related. That's actually a more serious problem we really
need to figure out. I already tried to fix it but apparently I failed.
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 895b3dc50e3f..f010e154e33e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -14040,8 +14040,10 @@  static int intel_modeset_checks(struct drm_atomic_state *state)
 
 		DRM_DEBUG_KMS("New cdclk calculated to be atomic %u, actual %u\n",
 			      intel_state->cdclk, intel_state->dev_cdclk);
-	} else
+	} else {
 		to_intel_atomic_state(state)->cdclk = dev_priv->atomic_cdclk_freq;
+		to_intel_atomic_state(state)->dev_cdclk = dev_priv->cdclk_freq;
+	}
 
 	intel_modeset_clear_plls(state);
 
@@ -14142,8 +14144,10 @@  static int intel_atomic_check(struct drm_device *dev,
 
 		if (ret)
 			return ret;
-	} else
-		intel_state->cdclk = dev_priv->cdclk_freq;
+	} else {
+		intel_state->cdclk = dev_priv->atomic_cdclk_freq;
+		intel_state->dev_cdclk = dev_priv->cdclk_freq;
+	}
 
 	ret = drm_atomic_helper_check_planes(dev, state);
 	if (ret)