drm/i915: Implement a better i945gm vblank irq vs. C-states workaround
diff mbox series

Message ID 20191003140231.24408-1-ville.syrjala@linux.intel.com
State New
Headers show
Series
  • drm/i915: Implement a better i945gm vblank irq vs. C-states workaround
Related show

Commit Message

Ville Syrjälä Oct. 3, 2019, 2:02 p.m. UTC
From: Ville Syrjälä <ville.syrjala@linux.intel.com>

The current "disable C3+" workaround for the delayed vblank
irqs on i945gm no longer works. I'm not sure what changed, but
now I need to also disable C2. I also got my hands on a i915gm
machine that suffers from the same issue.

After some furious poking of registers I managed to find a
better workaround: The "Do not Turn off Core Render Clock in C
states" bit. With that I no longer have to disable any C-states,
and as a nice bonus the power cost is only ~1/4 of the
"disable C3+" method (which mind you doesn't even work anymore,
and so would have an even higher power cost if we made it work
by also disabling C2).

So let's throw out all the cpuidle/qos crap and just toggle
the magic bit as needed. And we extend the workaround to cover
i915gm as well.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c | 10 +--
 drivers/gpu/drm/i915/i915_drv.h              |  9 +--
 drivers/gpu/drm/i915/i915_irq.c              | 79 +++-----------------
 drivers/gpu/drm/i915/i915_irq.h              |  4 +-
 drivers/gpu/drm/i915/i915_reg.h              |  1 +
 5 files changed, 22 insertions(+), 81 deletions(-)

Comments

Chris Wilson Oct. 3, 2019, 2:12 p.m. UTC | #1
Quoting Ville Syrjala (2019-10-03 15:02:31)
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> The current "disable C3+" workaround for the delayed vblank
> irqs on i945gm no longer works. I'm not sure what changed, but
> now I need to also disable C2. I also got my hands on a i915gm
> machine that suffers from the same issue.
> 
> After some furious poking of registers I managed to find a
> better workaround: The "Do not Turn off Core Render Clock in C
> states" bit. With that I no longer have to disable any C-states,
> and as a nice bonus the power cost is only ~1/4 of the
> "disable C3+" method (which mind you doesn't even work anymore,
> and so would have an even higher power cost if we made it work
> by also disabling C2).
> 
> So let's throw out all the cpuidle/qos crap and just toggle
> the magic bit as needed. And we extend the workaround to cover
> i915gm as well.

Nice discovery. ScratchPad0 suggests that it may have been a late
addition, there might be some chips out there that don't have the magic
bit. Working on most is better than broken on all.

> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Empirical results rule, and I for one am much happier without needing
qos dma_latency,
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
Ville Syrjälä Oct. 3, 2019, 2:18 p.m. UTC | #2
On Thu, Oct 03, 2019 at 03:12:11PM +0100, Chris Wilson wrote:
> Quoting Ville Syrjala (2019-10-03 15:02:31)
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > The current "disable C3+" workaround for the delayed vblank
> > irqs on i945gm no longer works. I'm not sure what changed, but
> > now I need to also disable C2. I also got my hands on a i915gm
> > machine that suffers from the same issue.
> > 
> > After some furious poking of registers I managed to find a
> > better workaround: The "Do not Turn off Core Render Clock in C
> > states" bit. With that I no longer have to disable any C-states,
> > and as a nice bonus the power cost is only ~1/4 of the
> > "disable C3+" method (which mind you doesn't even work anymore,
> > and so would have an even higher power cost if we made it work
> > by also disabling C2).
> > 
> > So let's throw out all the cpuidle/qos crap and just toggle
> > the magic bit as needed. And we extend the workaround to cover
> > i915gm as well.
> 
> Nice discovery. ScratchPad0 suggests that it may have been a late
> addition, there might be some chips out there that don't have the magic
> bit. Working on most is better than broken on all.

Yeah. And at least 100% of *my* broken machines now work ;)

> 
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Empirical results rule, and I for one am much happier without needing
> qos dma_latency,
> Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
> -Chris

Patch
diff mbox series

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index c3ac5a5c5185..8f077e2d0dfd 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -15092,12 +15092,12 @@  static const struct drm_crtc_funcs i965_crtc_funcs = {
 	.disable_vblank = i965_disable_vblank,
 };
 
-static const struct drm_crtc_funcs i945gm_crtc_funcs = {
+static const struct drm_crtc_funcs i915gm_crtc_funcs = {
 	INTEL_CRTC_FUNCS,
 
 	.get_vblank_counter = i915_get_vblank_counter,
-	.enable_vblank = i945gm_enable_vblank,
-	.disable_vblank = i945gm_disable_vblank,
+	.enable_vblank = i915gm_enable_vblank,
+	.disable_vblank = i915gm_disable_vblank,
 };
 
 static const struct drm_crtc_funcs i915_crtc_funcs = {
@@ -15168,8 +15168,8 @@  static int intel_crtc_init(struct drm_i915_private *dev_priv, enum pipe pipe)
 			funcs = &g4x_crtc_funcs;
 		else if (IS_GEN(dev_priv, 4))
 			funcs = &i965_crtc_funcs;
-		else if (IS_I945GM(dev_priv))
-			funcs = &i945gm_crtc_funcs;
+		else if (IS_I945GM(dev_priv) || IS_I915GM(dev_priv))
+			funcs = &i915gm_crtc_funcs;
 		else if (IS_GEN(dev_priv, 3))
 			funcs = &i915_crtc_funcs;
 		else
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 337d8306416a..5a6f2de2c175 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1730,13 +1730,8 @@  struct drm_i915_private {
 		struct work_struct idle_work;
 	} gem;
 
-	/* For i945gm vblank irq vs. C3 workaround */
-	struct {
-		struct work_struct work;
-		struct pm_qos_request pm_qos;
-		u8 c3_disable_latency;
-		u8 enabled;
-	} i945gm_vblank;
+	/* For i915gm/i945gm vblank irq workaround */
+	u8 vblank_enabled;
 
 	/* perform PHY state sanity checks? */
 	bool chv_phy_assert[2];
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index bc83f094065a..f2371b6083c6 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -29,7 +29,6 @@ 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/circ_buf.h>
-#include <linux/cpuidle.h>
 #include <linux/slab.h>
 #include <linux/sysrq.h>
 
@@ -2910,12 +2909,18 @@  int i8xx_enable_vblank(struct drm_crtc *crtc)
 	return 0;
 }
 
-int i945gm_enable_vblank(struct drm_crtc *crtc)
+int i915gm_enable_vblank(struct drm_crtc *crtc)
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc->dev);
 
-	if (dev_priv->i945gm_vblank.enabled++ == 0)
-		schedule_work(&dev_priv->i945gm_vblank.work);
+	/*
+	 * Vblank interrupts fail to wake the device up from C2+.
+	 * Disabling render clock gating during C-states avoids
+	 * the problem. There is a small power cost so we do this
+	 * only when vblank interrupts are actually enabled.
+	 */
+	if (dev_priv->vblank_enabled++ == 0)
+		I915_WRITE(SCPD0, _MASKED_BIT_ENABLE(CSTATE_RENDER_CLOCK_GATE_DISABLE));
 
 	return i8xx_enable_vblank(crtc);
 }
@@ -2988,14 +2993,14 @@  void i8xx_disable_vblank(struct drm_crtc *crtc)
 	spin_unlock_irqrestore(&dev_priv->irq_lock, irqflags);
 }
 
-void i945gm_disable_vblank(struct drm_crtc *crtc)
+void i915gm_disable_vblank(struct drm_crtc *crtc)
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc->dev);
 
 	i8xx_disable_vblank(crtc);
 
-	if (--dev_priv->i945gm_vblank.enabled == 0)
-		schedule_work(&dev_priv->i945gm_vblank.work);
+	if (--dev_priv->vblank_enabled == 0)
+		I915_WRITE(SCPD0, _MASKED_BIT_DISABLE(CSTATE_RENDER_CLOCK_GATE_DISABLE));
 }
 
 void i965_disable_vblank(struct drm_crtc *crtc)
@@ -3034,60 +3039,6 @@  void bdw_disable_vblank(struct drm_crtc *crtc)
 	spin_unlock_irqrestore(&dev_priv->irq_lock, irqflags);
 }
 
-static void i945gm_vblank_work_func(struct work_struct *work)
-{
-	struct drm_i915_private *dev_priv =
-		container_of(work, struct drm_i915_private, i945gm_vblank.work);
-
-	/*
-	 * Vblank interrupts fail to wake up the device from C3,
-	 * hence we want to prevent C3 usage while vblank interrupts
-	 * are enabled.
-	 */
-	pm_qos_update_request(&dev_priv->i945gm_vblank.pm_qos,
-			      READ_ONCE(dev_priv->i945gm_vblank.enabled) ?
-			      dev_priv->i945gm_vblank.c3_disable_latency :
-			      PM_QOS_DEFAULT_VALUE);
-}
-
-static int cstate_disable_latency(const char *name)
-{
-	const struct cpuidle_driver *drv;
-	int i;
-
-	drv = cpuidle_get_driver();
-	if (!drv)
-		return 0;
-
-	for (i = 0; i < drv->state_count; i++) {
-		const struct cpuidle_state *state = &drv->states[i];
-
-		if (!strcmp(state->name, name))
-			return state->exit_latency ?
-				state->exit_latency - 1 : 0;
-	}
-
-	return 0;
-}
-
-static void i945gm_vblank_work_init(struct drm_i915_private *dev_priv)
-{
-	INIT_WORK(&dev_priv->i945gm_vblank.work,
-		  i945gm_vblank_work_func);
-
-	dev_priv->i945gm_vblank.c3_disable_latency =
-		cstate_disable_latency("C3");
-	pm_qos_add_request(&dev_priv->i945gm_vblank.pm_qos,
-			   PM_QOS_CPU_DMA_LATENCY,
-			   PM_QOS_DEFAULT_VALUE);
-}
-
-static void i945gm_vblank_work_fini(struct drm_i915_private *dev_priv)
-{
-	cancel_work_sync(&dev_priv->i945gm_vblank.work);
-	pm_qos_remove_request(&dev_priv->i945gm_vblank.pm_qos);
-}
-
 static void ibx_irq_reset(struct drm_i915_private *dev_priv)
 {
 	struct intel_uncore *uncore = &dev_priv->uncore;
@@ -4298,9 +4249,6 @@  void intel_irq_init(struct drm_i915_private *dev_priv)
 	struct intel_rps *rps = &dev_priv->gt_pm.rps;
 	int i;
 
-	if (IS_I945GM(dev_priv))
-		i945gm_vblank_work_init(dev_priv);
-
 	intel_hpd_init_work(dev_priv);
 
 	INIT_WORK(&rps->work, gen6_pm_rps_work);
@@ -4388,9 +4336,6 @@  void intel_irq_fini(struct drm_i915_private *i915)
 {
 	int i;
 
-	if (IS_I945GM(i915))
-		i945gm_vblank_work_fini(i915);
-
 	for (i = 0; i < MAX_L3_SLICES; ++i)
 		kfree(i915->l3_parity.remap_info[i]);
 }
diff --git a/drivers/gpu/drm/i915/i915_irq.h b/drivers/gpu/drm/i915/i915_irq.h
index 8e7e6071777e..19a3bc019535 100644
--- a/drivers/gpu/drm/i915/i915_irq.h
+++ b/drivers/gpu/drm/i915/i915_irq.h
@@ -122,12 +122,12 @@  u32 i915_get_vblank_counter(struct drm_crtc *crtc);
 u32 g4x_get_vblank_counter(struct drm_crtc *crtc);
 
 int i8xx_enable_vblank(struct drm_crtc *crtc);
-int i945gm_enable_vblank(struct drm_crtc *crtc);
+int i915gm_enable_vblank(struct drm_crtc *crtc);
 int i965_enable_vblank(struct drm_crtc *crtc);
 int ilk_enable_vblank(struct drm_crtc *crtc);
 int bdw_enable_vblank(struct drm_crtc *crtc);
 void i8xx_disable_vblank(struct drm_crtc *crtc);
-void i945gm_disable_vblank(struct drm_crtc *crtc);
+void i915gm_disable_vblank(struct drm_crtc *crtc);
 void i965_disable_vblank(struct drm_crtc *crtc);
 void ilk_disable_vblank(struct drm_crtc *crtc);
 void bdw_disable_vblank(struct drm_crtc *crtc);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index eefd789b9a28..13dd1df0c853 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2709,6 +2709,7 @@  static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define VLV_GU_CTL0	_MMIO(VLV_DISPLAY_BASE + 0x2030)
 #define VLV_GU_CTL1	_MMIO(VLV_DISPLAY_BASE + 0x2034)
 #define SCPD0		_MMIO(0x209c) /* 915+ only */
+#define  CSTATE_RENDER_CLOCK_GATE_DISABLE	(1 << 5)
 #define GEN2_IER	_MMIO(0x20a0)
 #define GEN2_IIR	_MMIO(0x20a4)
 #define GEN2_IMR	_MMIO(0x20a8)