drm/i915/skl: Add support for the SAGV, fix underrun hangs

Message ID	1468274446-29900-1-git-send-email-cpaul@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <intel-gfx-bounces@lists.freedesktop.org> From: Lyude <cpaul@redhat.com> To: intel-gfx@lists.freedesktop.org Date: Mon, 11 Jul 2016 18:00:46 -0400 Message-Id: <1468274446-29900-1-git-send-email-cpaul@redhat.com> MIME-Version: 1.0 Cc: David Airlie <airlied@linux.ie>, Daniel Vetter <daniel.vetter@ffwll.ch>, "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., linux-kernel@vger.kernel.org open list" <dri-devel@lists.freedesktop.org>, Daniel Vetter <daniel.vetter@intel.com> Subject: [Intel-gfx] [PATCH] drm/i915/skl: Add support for the SAGV, fix underrun hangs Precedence: list Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>

Message ID

1468274446-29900-1-git-send-email-cpaul@redhat.com (mailing list archive)

State

New, archived

Headers

From: Lyude <cpaul@redhat.com>
To: intel-gfx@lists.freedesktop.org
Date: Mon, 11 Jul 2016 18:00:46 -0400
Message-Id: <1468274446-29900-1-git-send-email-cpaul@redhat.com>
MIME-Version: 1.0
Cc: David Airlie <airlied@linux.ie>, Daniel Vetter <daniel.vetter@ffwll.ch>, 
	"open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., 
	linux-kernel@vger.kernel.org open list"
	<dri-devel@lists.freedesktop.org>, 
	Daniel Vetter <daniel.vetter@intel.com>
Subject: [Intel-gfx] [PATCH] drm/i915/skl: Add support for the SAGV,
	fix underrun hangs
Precedence: list
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>

Commit Message

cpaul@redhat.com July 11, 2016, 10 p.m. UTC

Since the watermark calculations for Skylake are still broken, we're apt
to hitting underruns very easily under multi-monitor configurations.
While it would be lovely if this was fixed, it's not. Another problem
that's been coming from this however, is the mysterious issue of
underruns causing full system hangs. An easy way to reproduce this with
a skylake system:

- Get a laptop with a skylake GPU, and hook up two external monitors to
  it
- Move the cursor from the built-in LCD to one of the external displays
  as quickly as you can
- You'll get a few pipe underruns, and eventually the entire system will
  just freeze.

After doing a lot of investigation and reading through the bspec, I
found the existence of the SAGV, which is responsible for adjusting the
system agent voltage and clock frequencies depending on how much power
we need. According to the bspec:

"The display engine access to system memory is blocked during the
 adjustment time. SAGV defaults to enabled. Software must use the
 GT-driver pcode mailbox to disable SAGV when the display engine is not
 able to tolerate the blocking time."

The rest of the bspec goes on to explain that software can simply leave
the SAGV enabled, and disable it when we use interlaced pipes/have more
then one pipe active.

Sure enough, with this patchset the system hangs resulting from pipe
underruns on Skylake have completely vanished on my T460s.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lyude <cpaul@redhat.com>
---
 drivers/gpu/drm/i915/i915_drv.h |   2 +
 drivers/gpu/drm/i915/i915_reg.h |   5 ++
 drivers/gpu/drm/i915/intel_pm.c | 103 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 110 insertions(+)

Comments

Matt Roper July 12, 2016, 3:59 p.m. UTC | #1

On Mon, Jul 11, 2016 at 06:00:46PM -0400, Lyude wrote:
> Since the watermark calculations for Skylake are still broken, we're apt
> to hitting underruns very easily under multi-monitor configurations.
> While it would be lovely if this was fixed, it's not. Another problem
> that's been coming from this however, is the mysterious issue of
> underruns causing full system hangs. An easy way to reproduce this with
> a skylake system:
> 
> - Get a laptop with a skylake GPU, and hook up two external monitors to
>   it
> - Move the cursor from the built-in LCD to one of the external displays
>   as quickly as you can
> - You'll get a few pipe underruns, and eventually the entire system will
>   just freeze.
> 
> After doing a lot of investigation and reading through the bspec, I
> found the existence of the SAGV, which is responsible for adjusting the
> system agent voltage and clock frequencies depending on how much power
> we need. According to the bspec:
> 
> "The display engine access to system memory is blocked during the
>  adjustment time. SAGV defaults to enabled. Software must use the
>  GT-driver pcode mailbox to disable SAGV when the display engine is not
>  able to tolerate the blocking time."
> 
> The rest of the bspec goes on to explain that software can simply leave
> the SAGV enabled, and disable it when we use interlaced pipes/have more
> then one pipe active.
> 
> Sure enough, with this patchset the system hangs resulting from pipe
> underruns on Skylake have completely vanished on my T460s.
> 
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Signed-off-by: Lyude <cpaul@redhat.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h |   2 +
>  drivers/gpu/drm/i915/i915_reg.h |   5 ++
>  drivers/gpu/drm/i915/intel_pm.c | 103 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 110 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 03e1bfa..660d0a6 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1959,6 +1959,8 @@ struct drm_i915_private {
>  	struct i915_suspend_saved_registers regfile;
>  	struct vlv_s0ix_state vlv_s0ix_state;
>  
> +	bool skl_sagv_enabled;
> +
>  	struct {
>  		/*
>  		 * Raw watermark latency values:
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 8bfde75..9b2eb0b 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -7162,6 +7162,11 @@ enum {
>  #define   HSW_PCODE_DE_WRITE_FREQ_REQ		0x17
>  #define   DISPLAY_IPS_CONTROL			0x19
>  #define	  HSW_PCODE_DYNAMIC_DUTY_CYCLE_CONTROL	0x1A
> +#define   GEN9_PCODE_SAGV_CONTROL		0x21
> +#define     GEN9_SAGV_DISABLE			0x0
> +#define     GEN9_SAGV_LOW_FREQ			0x1
> +#define     GEN9_SAGV_HIGH_FREQ			0x2
> +#define     GEN9_SAGV_DYNAMIC_FREQ              0x3
>  #define GEN6_PCODE_DATA				_MMIO(0x138128)
>  #define   GEN6_PCODE_FREQ_IA_RATIO_SHIFT	8
>  #define   GEN6_PCODE_FREQ_RING_RATIO_SHIFT	16
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 5a8ee0c..07807aa 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2876,6 +2876,102 @@ skl_wm_plane_id(const struct intel_plane *plane)
>  }
>  
>  static void
> +skl_sagv_get_hw_state(struct drm_i915_private *dev_priv)
> +{
> +	u32 temp;
> +	int ret;
> +
> +	mutex_lock(&dev_priv->rps.hw_lock);
> +	ret = sandybridge_pcode_read(dev_priv, GEN9_PCODE_SAGV_CONTROL, &temp);
> +	mutex_unlock(&dev_priv->rps.hw_lock);
> +
> +	if (!ret) {
> +		dev_priv->skl_sagv_enabled = !!(temp & GEN9_SAGV_DYNAMIC_FREQ);
> +	} else {
> +		/*
> +		 * If for some reason we can't access the SAGV state, follow
> +		 * the bspec and assume it's enabled
> +		 */
> +		DRM_ERROR("Failed to get SAGV state, assuming enabled\n");
> +		dev_priv->skl_sagv_enabled = true;
> +	}
> +}
> +
> +/*
> + * SAGV dynamically adjusts the system agent voltage and clock frequencies
> + * depending on power and performance requirements. The display engine access
> + * to system memory is blocked during the adjustment time. Having this enabled
> + * in multi-pipe configurations can cause issues (such as underruns causing
> + * full system hangs), and the bspec also suggests that software disable it
> + * when more then one pipe is enabled.
> + */
> +static int
> +skl_enable_sagv(struct drm_i915_private *dev_priv)
> +{
> +	int ret;
> +
> +	if (dev_priv->skl_sagv_enabled)
> +		return 0;
> +
> +	mutex_lock(&dev_priv->rps.hw_lock);
> +	DRM_DEBUG_KMS("Enabling the SAGV\n");
> +
> +	ret = sandybridge_pcode_write(dev_priv, GEN9_PCODE_SAGV_CONTROL,
> +				      GEN9_SAGV_DYNAMIC_FREQ);
> +	if (!ret)
> +		dev_priv->skl_sagv_enabled = true;
> +	else
> +		DRM_ERROR("Failed to enable the SAGV\n");
> +
> +	/* We don't need to wait for SAGV when enabling */
> +	mutex_unlock(&dev_priv->rps.hw_lock);
> +	return ret;
> +}
> +
> +static int
> +skl_disable_sagv(struct drm_i915_private *dev_priv)
> +{
> +	int ret = 0;
> +	unsigned long timeout;
> +	u32 temp;
> +
> +	if (!dev_priv->skl_sagv_enabled)
> +		return 0;
> +
> +	mutex_lock(&dev_priv->rps.hw_lock);
> +	DRM_DEBUG_KMS("Disabling the SAGV\n");
> +
> +	/* bspec says to keep retrying for at least 1 ms */
> +	timeout = jiffies + msecs_to_jiffies(1);
> +	do {
> +		ret = sandybridge_pcode_write(dev_priv, GEN9_PCODE_SAGV_CONTROL,
> +					      GEN9_SAGV_DISABLE);
> +		if (ret) {
> +			DRM_ERROR("Failed to disable the SAGV\n");
> +			goto out;
> +		}
> +
> +		ret = sandybridge_pcode_read(dev_priv, GEN9_PCODE_SAGV_CONTROL,
> +					     &temp);
> +		if (ret) {
> +			DRM_ERROR("Failed to check the status of the SAGV\n");
> +			goto out;
> +		}
> +	} while (!(temp & 0x1) && jiffies < timeout);
> +
> +	if (temp & 0x1)
> +		dev_priv->skl_sagv_enabled = false;
> +	else {

Minor coding style nitpick; if/else branches need to have braces on both
branches if either branch needs them.

> +		ret = -1;
> +		DRM_ERROR("Request to disable SAGV timed out\n");
> +	}
> +
> +out:
> +	mutex_unlock(&dev_priv->rps.hw_lock);
> +	return ret;
> +}
> +
> +static void
>  skl_ddb_get_pipe_allocation_limits(struct drm_device *dev,
>  				   const struct intel_crtc_state *cstate,
>  				   struct skl_ddb_entry *alloc, /* out */
> @@ -3686,6 +3782,11 @@ static void skl_write_wm_values(struct drm_i915_private *dev_priv,
>  	struct drm_device *dev = &dev_priv->drm;
>  	struct intel_crtc *crtc;
>  
> +	if (dev_priv->active_crtcs == 1)
> +		skl_enable_sagv(dev_priv);
> +	else
> +		skl_disable_sagv(dev_priv);
> +

I think you want to make this conditional on !BROXTON.  The bspec isn't
super clear, but it says "On core CPUs..." which I believe indicates
that this isn't relevant to the Atom processors like Broxton.  The SAGV
block time table also lists "SKL, EXCLUDE(BXT)."

This would also explain why I've been unable to reproduce your SKL
problems on my Broxton hardware.


Matt


>  	for_each_intel_crtc(dev, crtc) {
>  		int i, level, max_level = ilk_wm_max_level(dev);
>  		enum pipe pipe = crtc->pipe;
> @@ -4228,6 +4329,8 @@ void skl_wm_get_hw_state(struct drm_device *dev)
>  		/* Easy/common case; just sanitize DDB now if everything off */
>  		memset(ddb, 0, sizeof(*ddb));
>  	}
> +
> +	skl_sagv_get_hw_state(dev_priv);
>  }
>  
>  static void ilk_pipe_wm_get_hw_state(struct drm_crtc *crtc)
> -- 
> 2.7.4
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 03e1bfa..660d0a6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1959,6 +1959,8 @@  struct drm_i915_private {
 	struct i915_suspend_saved_registers regfile;
 	struct vlv_s0ix_state vlv_s0ix_state;
 
+	bool skl_sagv_enabled;
+
 	struct {
 		/*
 		 * Raw watermark latency values:
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 8bfde75..9b2eb0b 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7162,6 +7162,11 @@  enum {
 #define   HSW_PCODE_DE_WRITE_FREQ_REQ		0x17
 #define   DISPLAY_IPS_CONTROL			0x19
 #define	  HSW_PCODE_DYNAMIC_DUTY_CYCLE_CONTROL	0x1A
+#define   GEN9_PCODE_SAGV_CONTROL		0x21
+#define     GEN9_SAGV_DISABLE			0x0
+#define     GEN9_SAGV_LOW_FREQ			0x1
+#define     GEN9_SAGV_HIGH_FREQ			0x2
+#define     GEN9_SAGV_DYNAMIC_FREQ              0x3
 #define GEN6_PCODE_DATA				_MMIO(0x138128)
 #define   GEN6_PCODE_FREQ_IA_RATIO_SHIFT	8
 #define   GEN6_PCODE_FREQ_RING_RATIO_SHIFT	16
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 5a8ee0c..07807aa 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2876,6 +2876,102 @@  skl_wm_plane_id(const struct intel_plane *plane)
 }
 
 static void
+skl_sagv_get_hw_state(struct drm_i915_private *dev_priv)
+{
+	u32 temp;
+	int ret;
+
+	mutex_lock(&dev_priv->rps.hw_lock);
+	ret = sandybridge_pcode_read(dev_priv, GEN9_PCODE_SAGV_CONTROL, &temp);
+	mutex_unlock(&dev_priv->rps.hw_lock);
+
+	if (!ret) {
+		dev_priv->skl_sagv_enabled = !!(temp & GEN9_SAGV_DYNAMIC_FREQ);
+	} else {
+		/*
+		 * If for some reason we can't access the SAGV state, follow
+		 * the bspec and assume it's enabled
+		 */
+		DRM_ERROR("Failed to get SAGV state, assuming enabled\n");
+		dev_priv->skl_sagv_enabled = true;
+	}
+}
+
+/*
+ * SAGV dynamically adjusts the system agent voltage and clock frequencies
+ * depending on power and performance requirements. The display engine access
+ * to system memory is blocked during the adjustment time. Having this enabled
+ * in multi-pipe configurations can cause issues (such as underruns causing
+ * full system hangs), and the bspec also suggests that software disable it
+ * when more then one pipe is enabled.
+ */
+static int
+skl_enable_sagv(struct drm_i915_private *dev_priv)
+{
+	int ret;
+
+	if (dev_priv->skl_sagv_enabled)
+		return 0;
+
+	mutex_lock(&dev_priv->rps.hw_lock);
+	DRM_DEBUG_KMS("Enabling the SAGV\n");
+
+	ret = sandybridge_pcode_write(dev_priv, GEN9_PCODE_SAGV_CONTROL,
+				      GEN9_SAGV_DYNAMIC_FREQ);
+	if (!ret)
+		dev_priv->skl_sagv_enabled = true;
+	else
+		DRM_ERROR("Failed to enable the SAGV\n");
+
+	/* We don't need to wait for SAGV when enabling */
+	mutex_unlock(&dev_priv->rps.hw_lock);
+	return ret;
+}
+
+static int
+skl_disable_sagv(struct drm_i915_private *dev_priv)
+{
+	int ret = 0;
+	unsigned long timeout;
+	u32 temp;
+
+	if (!dev_priv->skl_sagv_enabled)
+		return 0;
+
+	mutex_lock(&dev_priv->rps.hw_lock);
+	DRM_DEBUG_KMS("Disabling the SAGV\n");
+
+	/* bspec says to keep retrying for at least 1 ms */
+	timeout = jiffies + msecs_to_jiffies(1);
+	do {
+		ret = sandybridge_pcode_write(dev_priv, GEN9_PCODE_SAGV_CONTROL,
+					      GEN9_SAGV_DISABLE);
+		if (ret) {
+			DRM_ERROR("Failed to disable the SAGV\n");
+			goto out;
+		}
+
+		ret = sandybridge_pcode_read(dev_priv, GEN9_PCODE_SAGV_CONTROL,
+					     &temp);
+		if (ret) {
+			DRM_ERROR("Failed to check the status of the SAGV\n");
+			goto out;
+		}
+	} while (!(temp & 0x1) && jiffies < timeout);
+
+	if (temp & 0x1)
+		dev_priv->skl_sagv_enabled = false;
+	else {
+		ret = -1;
+		DRM_ERROR("Request to disable SAGV timed out\n");
+	}
+
+out:
+	mutex_unlock(&dev_priv->rps.hw_lock);
+	return ret;
+}
+
+static void
 skl_ddb_get_pipe_allocation_limits(struct drm_device *dev,
 				   const struct intel_crtc_state *cstate,
 				   struct skl_ddb_entry *alloc, /* out */
@@ -3686,6 +3782,11 @@  static void skl_write_wm_values(struct drm_i915_private *dev_priv,
 	struct drm_device *dev = &dev_priv->drm;
 	struct intel_crtc *crtc;
 
+	if (dev_priv->active_crtcs == 1)
+		skl_enable_sagv(dev_priv);
+	else
+		skl_disable_sagv(dev_priv);
+
 	for_each_intel_crtc(dev, crtc) {
 		int i, level, max_level = ilk_wm_max_level(dev);
 		enum pipe pipe = crtc->pipe;
@@ -4228,6 +4329,8 @@  void skl_wm_get_hw_state(struct drm_device *dev)
 		/* Easy/common case; just sanitize DDB now if everything off */
 		memset(ddb, 0, sizeof(*ddb));
 	}
+
+	skl_sagv_get_hw_state(dev_priv);
 }
 
 static void ilk_pipe_wm_get_hw_state(struct drm_crtc *crtc)

drm/i915/skl: Add support for the SAGV, fix underrun hangs

Commit Message

Comments

Patch