diff mbox

[4/3] drm/i915: Read RPS frequencies earlier on non-VLV/CHV

Message ID 1457377029-526-1-git-send-email-ville.syrjala@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ville Syrjälä March 7, 2016, 6:57 p.m. UTC
From: Ville Syrjälä <ville.syrjala@linux.intel.com>

Read out the RPS frequencies already in intel_init_gt_powersave()
on all the platforms. So far we only did that on VLV/CHV, and the
rest of the platforms read them out at rps enable time, which happens
asynchronously from a workqueue. Reading them out earlier prevents
userspace from reading out invalid (zero) values via the relevant
sysfs files before the rps enable work has been executed.

This used to be prevented by the flush_delayed_work() + locking
in the sysfs code, but now that we no longer do that, we run the
risk of letting userspace see the initial zeroed values.

Note that it's still possible to read out cur_freq as 0, since that
only gets initialized from the delayed rps enable. Should that pose
a real problem, I guess we could always add the flush+locking back
for the cur_freq read.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

Comments

Tom.O'Rourke@intel.com March 8, 2016, 12:49 a.m. UTC | #1
On Mon, Mar 07, 2016 at 08:57:09PM +0200, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Read out the RPS frequencies already in intel_init_gt_powersave()
> on all the platforms. So far we only did that on VLV/CHV, and the
> rest of the platforms read them out at rps enable time, which happens
> asynchronously from a workqueue. Reading them out earlier prevents
> userspace from reading out invalid (zero) values via the relevant
> sysfs files before the rps enable work has been executed.
> 
> This used to be prevented by the flush_delayed_work() + locking
> in the sysfs code, but now that we no longer do that, we run the
> risk of letting userspace see the initial zeroed values.
> 
> Note that it's still possible to read out cur_freq as 0, since that
> only gets initialized from the delayed rps enable. Should that pose
> a real problem, I guess we could always add the flush+locking back
> for the cur_freq read.
> 
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---

Hello,

The flush_delayed_work() calls were added in:
"commit 5c9669cee534cbb834d51aae115267f5e561b622
Author: Tom O'Rourke <Tom.O'Rourke@intel.com>
Date:   Mon Sep 16 14:56:43 2013 -0700

    drm/i915: Finish enabling rps before use by sysfs or debugfs"

This change was made to address use before initialization
problems in min/max/cur frequency values.  The work queue
being flushed was added in:
"commit 1a01ab3b2dc4394c46c4c3230805748f632f6f74
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Fri Nov 2 11:14:00 2012 -0700

    drm/i915: put ring frequency and turbo setup into a work queue v5

    Communicating via the mailbox registers with the PCU can take quite
    awhile.  And updating the ring frequency or enabling turbo is not
    something that needs to happen synchronously, so take it out of our init
    and resume paths to speed things up (~200ms on my T420)."

This change was made to reduce driver load times.

The two concerns I have are:
1) Similar changes would be needed in debugfs files.
2) This change may increase driver load times due to pcode
mailbox command used in gen6_rps_init_frequencies() that
is now in the init path.

I am not objecting to these changes but we should
be aware of the impact to driver load latency.

Thanks,
Tom
Ville Syrjälä March 8, 2016, 1:34 p.m. UTC | #2
On Mon, Mar 07, 2016 at 04:49:14PM -0800, O'Rourke, Tom wrote:
> On Mon, Mar 07, 2016 at 08:57:09PM +0200, ville.syrjala@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > Read out the RPS frequencies already in intel_init_gt_powersave()
> > on all the platforms. So far we only did that on VLV/CHV, and the
> > rest of the platforms read them out at rps enable time, which happens
> > asynchronously from a workqueue. Reading them out earlier prevents
> > userspace from reading out invalid (zero) values via the relevant
> > sysfs files before the rps enable work has been executed.
> > 
> > This used to be prevented by the flush_delayed_work() + locking
> > in the sysfs code, but now that we no longer do that, we run the
> > risk of letting userspace see the initial zeroed values.
> > 
> > Note that it's still possible to read out cur_freq as 0, since that
> > only gets initialized from the delayed rps enable. Should that pose
> > a real problem, I guess we could always add the flush+locking back
> > for the cur_freq read.
> > 
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> 
> Hello,
> 
> The flush_delayed_work() calls were added in:
> "commit 5c9669cee534cbb834d51aae115267f5e561b622
> Author: Tom O'Rourke <Tom.O'Rourke@intel.com>
> Date:   Mon Sep 16 14:56:43 2013 -0700
> 
>     drm/i915: Finish enabling rps before use by sysfs or debugfs"
> 
> This change was made to address use before initialization
> problems in min/max/cur frequency values.  The work queue
> being flushed was added in:
> "commit 1a01ab3b2dc4394c46c4c3230805748f632f6f74
> Author: Jesse Barnes <jbarnes@virtuousgeek.org>
> Date:   Fri Nov 2 11:14:00 2012 -0700
> 
>     drm/i915: put ring frequency and turbo setup into a work queue v5
> 
>     Communicating via the mailbox registers with the PCU can take quite
>     awhile.  And updating the ring frequency or enabling turbo is not
>     something that needs to happen synchronously, so take it out of our init
>     and resume paths to speed things up (~200ms on my T420)."
> 
> This change was made to reduce driver load times.
> 
> The two concerns I have are:
> 1) Similar changes would be needed in debugfs files.
> 2) This change may increase driver load times due to pcode
> mailbox command used in gen6_rps_init_frequencies() that
> is now in the init path.
> 
> I am not objecting to these changes but we should
> be aware of the impact to driver load latency.

If a single pcode command takes singificant amount of time there must
be something seriously wrong. TBH I never really bought the whole two
stage rps enable thing. Seemed like extra complexity for little gain.
Someone should definitely measure it again though, especially after
commit 9848de082ff4 ("drm/i915: Use usleep_range() in wait_for()")
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 2591d533a895..5fe098b94f06 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4782,8 +4782,6 @@  static void gen9_enable_rps(struct drm_device *dev)
 
 	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
 
-	gen6_init_rps_frequencies(dev);
-
 	/* WaGsvDisableTurbo: Workaround to disable turbo on BXT A* */
 	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
 		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
@@ -4896,9 +4894,6 @@  static void gen8_enable_rps(struct drm_device *dev)
 	/* 2a: Disable RC states. */
 	I915_WRITE(GEN6_RC_CONTROL, 0);
 
-	/* Initialize rps frequencies */
-	gen6_init_rps_frequencies(dev);
-
 	/* 2b: Program RC6 thresholds.*/
 	I915_WRITE(GEN6_RC6_WAKE_RATE_LIMIT, 40 << 16);
 	I915_WRITE(GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */
@@ -4988,9 +4983,6 @@  static void gen6_enable_rps(struct drm_device *dev)
 
 	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
 
-	/* Initialize rps frequencies */
-	gen6_init_rps_frequencies(dev);
-
 	/* disable the counters and set deterministic thresholds */
 	I915_WRITE(GEN6_RC_CONTROL, 0);
 
@@ -6197,6 +6189,8 @@  void intel_init_gt_powersave(struct drm_device *dev)
 		cherryview_init_gt_powersave(dev);
 	else if (IS_VALLEYVIEW(dev))
 		valleyview_init_gt_powersave(dev);
+	else if (INTEL_INFO(dev_priv)->gen >= 6)
+		gen6_init_rps_frequencies(dev);
 }
 
 void intel_cleanup_gt_powersave(struct drm_device *dev)