
drm/i915/guc: Prevent ggtt->invalidate assert during GuC reload

Message ID 20170630174149.11202-1-michel.thierry@intel.com (mailing list archive)
State New, archived

Commit Message

Michel Thierry June 30, 2017, 5:41 p.m. UTC
The driver reloads the GuC firmware after full gpu reset or
suspend/resume, but it never disables the GuC beforehand.
This leads us to hit the assert inside i915_ggtt_enable_guc added
by commit 04f7b24eccdf ("drm/i915/guc: Assert that we switch between
known ggtt->invalidate functions").

As a workaround, don't call i915_ggtt_enable_guc if there is a GuC
execbuf_client; because if there isn't one we are either loading in a
fresh system or we called intel_uc_fini_hw.

I'm inclined to this approach because even if intel_uc_fini_hw could be
added to the suspend path, we will still need to keep this for the
full gpu reset case.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/intel_uc.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
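The failure mode described in the commit message can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: every name prefixed with model_ is a hypothetical stand-in rather than a driver symbol, and the checked switch returns false at the point where the real driver would hit its assert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All model_* names are hypothetical stand-ins, not i915 symbols. */
struct model_ggtt {
	void (*invalidate)(struct model_ggtt *ggtt);
	unsigned int gen6_flushes;
	unsigned int guc_flushes;
};

struct model_dev {
	struct model_ggtt ggtt;
	void *execbuf_client;	/* non-NULL once a GuC client exists */
};

static void model_gen6_invalidate(struct model_ggtt *ggtt)
{
	ggtt->gen6_flushes++;
}

static void model_guc_invalidate(struct model_ggtt *ggtt)
{
	ggtt->guc_flushes++;
}

/*
 * Checked switch: only legal when the non-GuC callback is installed.
 * Returns false where the real driver would trip its assert.
 */
static bool model_ggtt_enable_guc(struct model_dev *dev)
{
	if (dev->ggtt.invalidate != model_gen6_invalidate)
		return false;
	dev->ggtt.invalidate = model_guc_invalidate;
	return true;
}

/* Guarded call as in the patch: skip the switch when reloading. */
bool model_uc_init_hw(struct model_dev *dev)
{
	if (!dev->execbuf_client)
		return model_ggtt_enable_guc(dev);
	return true;
}

/* Walks through a fresh load followed by a reload. */
void model_demo(void)
{
	struct model_dev dev = { .ggtt = { .invalidate = model_gen6_invalidate } };
	int client = 0;

	assert(model_uc_init_hw(&dev));		/* fresh load switches the callback */
	dev.execbuf_client = &client;		/* a client now exists */
	assert(model_uc_init_hw(&dev));		/* reload skips the switch */
	assert(!model_ggtt_enable_guc(&dev));	/* unguarded call would fail the check */
}
```

The guard works only as long as the presence of execbuf_client tracks whether the callback was already switched, which is the assumption the workaround relies on.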

Comments

Michel Thierry July 26, 2017, 8:59 p.m. UTC | #1
On 6/30/2017 10:41 AM, Michel Thierry wrote:
> The driver reloads the GuC firmware after full gpu reset or
> suspend/resume, but it never disables the GuC beforehand.
> This leads us to hit the assert inside i915_ggtt_enable_guc added
> by commit 04f7b24eccdf ("drm/i915/guc: Assert that we switch between
> known ggtt->invalidate functions").
> 

Ping...
I know this isn't important, since who is using guc? (I only know people 
who hate it).

Or we just remove the assert in i915_ggtt_enable_guc.

> As a workaround, don't call i915_ggtt_enable_guc if there is a GuC
> execbuf_client; because if there isn't one we are either loading in a
> fresh system or we called intel_uc_fini_hw.
> 
> I'm inclined to this approach because even if intel_uc_fini_hw could be
> added to the suspend path, we will still need to keep this for the
> full gpu reset case.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michal Winiarski <michal.winiarski@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_uc.c | 10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
> index a8930f2feacf..564b0d2b8842 100644
> --- a/drivers/gpu/drm/i915/intel_uc.c
> +++ b/drivers/gpu/drm/i915/intel_uc.c
> @@ -339,8 +339,14 @@ int intel_uc_init_hw(struct drm_i915_private *dev_priv)
>          guc_disable_communication(guc);
>          gen9_reset_guc_interrupts(dev_priv);
> 
> -       /* We need to notify the guc whenever we change the GGTT */
> -       i915_ggtt_enable_guc(dev_priv);
> +       /*
> +        * We need to notify the guc whenever we change the GGTT; but if we
> +        * are reloading the firmware (after full gpu reset or suspend/resume),
> +        * we should skip this since ggtt->invalidate was already set (or we hit
> +        * an assert).
> +        */
> +       if (!dev_priv->guc.execbuf_client)
> +               i915_ggtt_enable_guc(dev_priv);
> 
>          if (i915.enable_guc_submission) {
>                  /*
> --
> 2.11.0
> 
Chris Wilson July 27, 2017, 6:44 a.m. UTC | #2
Quoting Michel Thierry (2017-07-26 21:59:07)
> On 6/30/2017 10:41 AM, Michel Thierry wrote:
> > The driver reloads the GuC firmware after full gpu reset or
> > suspend/resume, but it never disables the GuC beforehand.
> > This leads us to hit the assert inside i915_ggtt_enable_guc added
> > by commit 04f7b24eccdf ("drm/i915/guc: Assert that we switch between
> > known ggtt->invalidate functions").
> > 
> 
> Ping...
> I know this isn't important, since who is using guc? (I only know people 
> who hate it).
> 
> Or we just remove the assert in i915_ggtt_enable_guc.

Kind of. This implies we've lost control over the guc-driver state. My
feeling is that we need a post-reset hook for the guc so that we can
throw away the lost state.
-Chris
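
One possible reading of the post-reset hook suggested above, sketched as a userspace model. The reply does not say what such a hook would do, so everything here is an assumption: the sketch_* names are hypothetical, and the hook simply drops the driver-side GuC state back to a known baseline so that the checked switch succeeds again on reload.

```c
#include <assert.h>
#include <stdbool.h>

/* All sketch_* names are hypothetical, not i915 symbols. */
enum sketch_invalidate {
	SKETCH_INVALIDATE_GEN6,	/* plain GGTT invalidation */
	SKETCH_INVALIDATE_GUC,	/* invalidation that also notifies the GuC */
};

struct sketch_dev {
	enum sketch_invalidate invalidate;
	bool guc_loaded;
};

/* Checked switch: returns false where the driver would assert. */
static bool sketch_enable_guc(struct sketch_dev *dev)
{
	if (dev->invalidate != SKETCH_INVALIDATE_GEN6)
		return false;
	dev->invalidate = SKETCH_INVALIDATE_GUC;
	dev->guc_loaded = true;
	return true;
}

/*
 * Hypothetical post-reset hook: the reset wiped the firmware, so throw
 * away the driver's view of it and return to the non-GuC baseline.
 */
static void sketch_guc_post_reset(struct sketch_dev *dev)
{
	dev->guc_loaded = false;
	dev->invalidate = SKETCH_INVALIDATE_GEN6;
}

/* Reset followed by reload: the hook makes the checked switch legal. */
bool sketch_reset_and_reload(struct sketch_dev *dev)
{
	sketch_guc_post_reset(dev);
	assert(dev->invalidate == SKETCH_INVALIDATE_GEN6);
	return sketch_enable_guc(dev);
}
```

Compared with skipping the call, this keeps the assert meaningful: the driver state is made to match the hardware before the switch, rather than the switch being bypassed.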

Patch

diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index a8930f2feacf..564b0d2b8842 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -339,8 +339,14 @@  int intel_uc_init_hw(struct drm_i915_private *dev_priv)
 	guc_disable_communication(guc);
 	gen9_reset_guc_interrupts(dev_priv);
 
-	/* We need to notify the guc whenever we change the GGTT */
-	i915_ggtt_enable_guc(dev_priv);
+	/*
+	 * We need to notify the guc whenever we change the GGTT; but if we
+	 * are reloading the firmware (after full gpu reset or suspend/resume),
+	 * we should skip this since ggtt->invalidate was already set (or we hit
+	 * an assert).
+	 */
+	if (!dev_priv->guc.execbuf_client)
+		i915_ggtt_enable_guc(dev_priv);
 
 	if (i915.enable_guc_submission) {
 		/*