drm/i915/selftests: Skip over live context testing when wedged

Message ID 20180705145845.24005-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson July 5, 2018, 2:58 p.m. UTC
If the GPU is terminally wedged we cannot submit any requests into a
context, completely unfulfilling our purpose of doing so. As this
expectedly fails, skip over the test.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/selftests/i915_gem_context.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Rodrigo Vivi July 5, 2018, 8:52 p.m. UTC | #1
On Thu, Jul 05, 2018 at 03:58:45PM +0100, Chris Wilson wrote:
> If the GPU is terminally wedged we cannot submit any requests into a
> context, completely unfulfilling our purpose of doing so. As this
> expectedly fails, skip over the test.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/selftests/i915_gem_context.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> index cc848ceeb3c3..0b36265a0f96 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> @@ -599,6 +599,9 @@ int i915_gem_context_live_selftests(struct drm_i915_private *dev_priv)
>  	bool fake_alias = false;
>  	int err;
>  
> +	if (i915_terminally_wedged(&dev_priv->gpu_error))
> +		return 0;
> +

I wonder if this could mask a real failure under the skips?

But if the actual error is already captured somewhere else, then it is probably better to avoid the noise
of an ENOTRECOVERABLE or whatever...

>  	/* Install a fake aliasing gtt for exercise */
>  	if (USES_PPGTT(dev_priv) && !dev_priv->mm.aliasing_ppgtt) {
>  		mutex_lock(&dev_priv->drm.struct_mutex);
> -- 
> 2.18.0
Chris Wilson July 6, 2018, 6:42 a.m. UTC | #2
Quoting Rodrigo Vivi (2018-07-05 21:52:10)
> On Thu, Jul 05, 2018 at 03:58:45PM +0100, Chris Wilson wrote:
> > If the GPU is terminally wedged we cannot submit any requests into a
> > context, completely unfulfilling our purpose of doing so. As this
> > expectedly fails, skip over the test.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/selftests/i915_gem_context.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> > index cc848ceeb3c3..0b36265a0f96 100644
> > --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> > @@ -599,6 +599,9 @@ int i915_gem_context_live_selftests(struct drm_i915_private *dev_priv)
> >       bool fake_alias = false;
> >       int err;
> >  
> > +     if (i915_terminally_wedged(&dev_priv->gpu_error))
> > +             return 0;
> > +
> 
> I wonder if this could mask a real failure under the skips?

The *test* can't be run, so what failure relevant to this *test* can be
shown?

As you notice, when we get to the reset test, we do proclaim failure as
we've already demonstrated reset is bust.
-Chris
Rodrigo Vivi July 6, 2018, 4:15 p.m. UTC | #3
On Fri, Jul 06, 2018 at 07:42:00AM +0100, Chris Wilson wrote:
> Quoting Rodrigo Vivi (2018-07-05 21:52:10)
> > On Thu, Jul 05, 2018 at 03:58:45PM +0100, Chris Wilson wrote:
> > > If the GPU is terminally wedged we cannot submit any requests into a
> > > context, completely unfulfilling our purpose of doing so. As this
> > > expectedly fails, skip over the test.
> > > 
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > ---
> > >  drivers/gpu/drm/i915/selftests/i915_gem_context.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> > > index cc848ceeb3c3..0b36265a0f96 100644
> > > --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> > > @@ -599,6 +599,9 @@ int i915_gem_context_live_selftests(struct drm_i915_private *dev_priv)
> > >       bool fake_alias = false;
> > >       int err;
> > >  
> > > +     if (i915_terminally_wedged(&dev_priv->gpu_error))
> > > +             return 0;
> > > +
> > 
> > I wonder if this could mask a real failure under the skips?
> 
> The *test* can't be run, so what failure relevant to this *test* can be
> shown?
> 
> As you notice, when we get to the reset test, we do proclaim failure as
> we've already demonstrated reset is bust.

Makes sense...

I was going to add rv-b here for this and the others, but
I saw you already got them and are already pushing ;)

Also, thanks for the explanation on the -ENOTRECOVERABLE one;
that also makes sense.

> -Chris

Patch

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index cc848ceeb3c3..0b36265a0f96 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -599,6 +599,9 @@  int i915_gem_context_live_selftests(struct drm_i915_private *dev_priv)
 	bool fake_alias = false;
 	int err;
 
+	if (i915_terminally_wedged(&dev_priv->gpu_error))
+		return 0;
+
 	/* Install a fake aliasing gtt for exercise */
 	if (USES_PPGTT(dev_priv) && !dev_priv->mm.aliasing_ppgtt) {
 		mutex_lock(&dev_priv->drm.struct_mutex);
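
For illustration only, the skip-versus-fail behaviour discussed in the thread can be sketched outside the kernel. This is a minimal userspace model, not the driver's code: struct mock_gpu and the mock_* functions are hypothetical stand-ins, and returning -EIO from the reset test is an assumption made for the sketch. It shows the early return from the patch leaving the context test silent on a wedged GPU, while the reset test still reports the failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's GPU error state. */
struct mock_gpu {
	bool terminally_wedged;
	int requests_submitted;
};

/*
 * Models the guard added by the patch: the live context test has to
 * submit requests, so on a wedged GPU it skips and reports success.
 */
static int mock_context_live_selftests(struct mock_gpu *gpu)
{
	assert(gpu != NULL);

	if (gpu->terminally_wedged)
		return 0; /* skipped: no request can be submitted */

	gpu->requests_submitted++; /* stands in for the real subtests */
	return 0;
}

/*
 * Models the reset test, which is where a broken reset is reported.
 * The -EIO value is an assumption for this sketch.
 */
static int mock_reset_live_selftests(struct mock_gpu *gpu)
{
	assert(gpu != NULL);

	return gpu->terminally_wedged ? -EIO : 0;
}

/* Runs both tests in order and returns the first error seen. */
static int mock_run_live_selftests(struct mock_gpu *gpu)
{
	int err;

	err = mock_context_live_selftests(gpu);
	if (err)
		return err;

	return mock_reset_live_selftests(gpu);
}
```

Under these assumptions, calling mock_run_live_selftests() on a GPU with terminally_wedged set returns -EIO from the reset test and submits no requests, so the skip in the context test does not hide the wedged state from the overall run.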