[RFC] drm/i915: do not stop engines on sanitize if i915.reset=0
diff mbox

Message ID 20180207212440.13438-1-daniele.ceraolospurio@intel.com
State New
Headers show

Commit Message

Daniele Ceraolo Spurio Feb. 7, 2018, 9:24 p.m. UTC
Since commit 5896a5c8c9c0 (drm/i915: Always stop the rings before a
missing GPU reset) we attempt to stop the engines during gem_sanitize
even if reset=0 and nothing bad happened on the gpu.
The specs says that the STOP_RINGS bit needs to be cleared to resume
normal operation, but for some reason the value of the bit seems to be
changing without us writing to it (maybe rc6 entry/exit?), so normal
operation resumes correctly. However, it still feels incorrect to stop
the engines if there hasn't been any issue so skip the whole reset
call in gem_sanitize if i915.reset=0

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

Comments

Chris Wilson Feb. 7, 2018, 9:39 p.m. UTC | #1
Quoting Daniele Ceraolo Spurio (2018-02-07 21:24:40)
> Since commit 5896a5c8c9c0 (drm/i915: Always stop the rings before a
> missing GPU reset) we attempt to stop the engines during gem_sanitize
> even if reset=0 and nothing bad happened on the gpu.
> The specs says that the STOP_RINGS bit needs to be cleared to resume
> normal operation, but for some reason the value of the bit seems to be
> changing without us writing to it (maybe rc6 entry/exit?), so normal
> operation resumes correctly. However, it still feels incorrect to stop
> the engines if there hasn't been any issue so skip the whole reset
> call in gem_sanitize if i915.reset=0
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c1b80cd52f9e..beb351cb7a12 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4882,10 +4882,8 @@ void i915_gem_sanitize(struct drm_i915_private *i915)
>          * it may impact the display and we are uncertain about the stability
>          * of the reset, so this could be applied to even earlier gen.
>          */
> -       if (INTEL_GEN(i915) >= 5) {
> -               int reset = intel_gpu_reset(i915, ALL_ENGINES);
> -               WARN_ON(reset && reset != -ENODEV);
> -       }
> +       if (INTEL_GEN(i915) >= 5 && intel_has_gpu_reset(i915))
> +               WARN_ON(intel_gpu_reset(i915, ALL_ENGINES));

I'll buy just because it makes the code tidier and should make the
warning more understandable, otherwise I look at i915.reset=0 (or lack of
reset implementation) as something that should be fixed rather than
worked around.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
Chris Wilson Feb. 8, 2018, 7:35 a.m. UTC | #2
Quoting Patchwork (2018-02-08 05:13:08)
> == Series Details ==
> 
> Series: drm/i915: do not stop engines on sanitize if i915.reset=0
> URL   : https://patchwork.freedesktop.org/series/37846/
> State : failure
> 
> == Summary ==
> 
> Test pm_rpm:
>         Subgroup system-suspend-execbuf:
>                 pass       -> INCOMPLETE (shard-hsw) fdo#103375
> Test kms_flip:
>         Subgroup 2x-dpms-vs-vblank-race-interruptible:
>                 fail       -> PASS       (shard-hsw) fdo#103060 +2
>         Subgroup flip-vs-modeset-vs-hang-interruptible:
>                 dmesg-warn -> PASS       (shard-snb) fdo#104311
> Test kms_vblank:
>         Subgroup pipe-a-ts-continuation-dpms-suspend:
>                 skip       -> PASS       (shard-hsw)
> Test kms_plane_lowres:
>         Subgroup pipe-a-tiling-none:
>                 pass       -> FAIL       (shard-apl)
>         Subgroup pipe-b-tiling-y:
>                 fail       -> PASS       (shard-apl) fdo#103166

And clear. Pushed, thanks for the patch.
-Chris

Patch
diff mbox

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c1b80cd52f9e..beb351cb7a12 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4882,10 +4882,8 @@  void i915_gem_sanitize(struct drm_i915_private *i915)
 	 * it may impact the display and we are uncertain about the stability
 	 * of the reset, so this could be applied to even earlier gen.
 	 */
-	if (INTEL_GEN(i915) >= 5) {
-		int reset = intel_gpu_reset(i915, ALL_ENGINES);
-		WARN_ON(reset && reset != -ENODEV);
-	}
+	if (INTEL_GEN(i915) >= 5 && intel_has_gpu_reset(i915))
+		WARN_ON(intel_gpu_reset(i915, ALL_ENGINES));
 }
 
 int i915_gem_suspend(struct drm_i915_private *dev_priv)