Message ID | 20150615135341.GA28462@nuc-i3427.alporthouse.com (mailing list archive) |
---|---|
State | New, archived |
On Mon, Jun 15, 2015 at 02:53:41PM +0100, Chris Wilson wrote:
> On Mon, Jun 15, 2015 at 03:45:38PM +0200, Daniel Vetter wrote:
> > On Mon, Jun 15, 2015 at 12:23:48PM +0100, Chris Wilson wrote:
> > > In igt, we want to test handling of GPU hangs, both for recovery
> > > purposes and for reporting. However, we don't want to inject a genuine
> > > GPU hang onto a machine that cannot recover and so be permanently
> > > wedged. Rather than embed heuristics into igt, have the kernel report
> > > exactly when it expects the GPU reset to work.
> > >
> > > This can also be usefully extended in future to indicate different
> > > levels of fine-grained resets.
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > Cc: Tim Gore <tim.gore@intel.com>
> > > Cc: Tomas Elf <tomas.elf@intel.com>
> >
> > Yeah makes sense. Will merge as soon as someone smashes a t-b with a few
> > igt patches using this on top.
>
> diff --git a/lib/igt_gt.c b/lib/igt_gt.c
> index deb5560..8a1ffb2 100644
> --- a/lib/igt_gt.c
> +++ b/lib/igt_gt.c
> @@ -26,6 +26,7 @@
>  #include <errno.h>
>  #include <sys/types.h>
>  #include <sys/stat.h>
> +#include <sys/ioctl.h>
>  #include <fcntl.h>
>
>  #include "drmtest.h"
> @@ -47,6 +48,21 @@
>   * engines.
>   */
>
> +static bool has_gpu_reset(int fd)
> +{
> +	struct drm_i915_getparam gp;
> +	int val = 0;
> +
> +	memset(&gp, 0, sizeof(gp));
> +	gp.param = 35; /* HAS_GPU_RESET */
> +	gp.value = &val;
> +
> +	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp, sizeof(gp)))
> +		return intel_gen(intel_get_drm_devid(fd)) >= 5;
> +
> +	return val > 0;
> +}
>
>  /**
>   * igt_require_hang_ring:
> @@ -60,7 +76,7 @@
>  void igt_require_hang_ring(int fd, int ring)
>  {
>  	gem_context_require_ban_period(fd);
> -	igt_require(intel_gen(intel_get_drm_devid(fd)) >= 5);
> +	igt_require(has_gpu_reset(fd));
>  }

Speaking of which, do we want

	igt_require(getenv("IGT_DISABLE_HANG") == NULL);

here?
-Chris
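For readers who want to poke at the decision logic of has_gpu_reset() without a DRM device, here is a minimal sketch of the same "ask the kernel, otherwise fall back to the gen >= 5 heuristic" pattern. It is not the igt code: the callback type param_query_fn, the function can_reset_gpu(), the three stub queries and the PARAM_HAS_GPU_RESET macro name are all hypothetical, and the real GETPARAM ioctl is replaced by an injected function. Only the parameter number 35 and the fallback rule come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback standing in for the GETPARAM ioctl: returns 0 and
 * fills *value on success, non-zero if the parameter is not supported. */
typedef int (*param_query_fn)(int fd, int param, int *value);

/* Parameter number used in the patch; the macro name is illustrative. */
#define PARAM_HAS_GPU_RESET 35

/* Decide whether injecting a hang is safe on this device. */
static bool can_reset_gpu(int fd, int gen, param_query_fn query)
{
	int val = 0;

	assert(query != NULL);

	/* A kernel that does not know the parameter fails the query, so
	 * fall back to the generation heuristic the patch replaces. */
	if (query(fd, PARAM_HAS_GPU_RESET, &val) != 0)
		return gen >= 5;

	/* Otherwise trust what the kernel reports. */
	return val > 0;
}

/* Stub: kernel without the parameter. */
static int query_unsupported(int fd, int param, int *value)
{
	(void)fd; (void)param; (void)value;
	return -1;
}

/* Stub: kernel reports that reset is not expected to work. */
static int query_reports_no_reset(int fd, int param, int *value)
{
	(void)fd; (void)param;
	*value = 0;
	return 0;
}

/* Stub: kernel reports that reset is expected to work. */
static int query_reports_reset(int fd, int param, int *value)
{
	(void)fd; (void)param;
	*value = 1;
	return 0;
}
```

With these stubs, a failed query on a gen 4 device yields false and on a gen 6 device yields true, while a successful query overrides the generation entirely.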
On Mon, Jun 15, 2015 at 02:58:17PM +0100, Chris Wilson wrote:
> On Mon, Jun 15, 2015 at 02:53:41PM +0100, Chris Wilson wrote:
> > On Mon, Jun 15, 2015 at 03:45:38PM +0200, Daniel Vetter wrote:
> > > On Mon, Jun 15, 2015 at 12:23:48PM +0100, Chris Wilson wrote:
> > > > In igt, we want to test handling of GPU hangs, both for recovery
> > > > purposes and for reporting. However, we don't want to inject a genuine
> > > > GPU hang onto a machine that cannot recover and so be permanently
> > > > wedged. Rather than embed heuristics into igt, have the kernel report
> > > > exactly when it expects the GPU reset to work.
> > > >
> > > > This can also be usefully extended in future to indicate different
> > > > levels of fine-grained resets.
> > > >
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > Cc: Tim Gore <tim.gore@intel.com>
> > > > Cc: Tomas Elf <tomas.elf@intel.com>
> > >
> > > Yeah makes sense. Will merge as soon as someone smashes a t-b with a few
> > > igt patches using this on top.
> >
> > diff --git a/lib/igt_gt.c b/lib/igt_gt.c
> > index deb5560..8a1ffb2 100644
> > --- a/lib/igt_gt.c
> > +++ b/lib/igt_gt.c
> > @@ -26,6 +26,7 @@
> >  #include <errno.h>
> >  #include <sys/types.h>
> >  #include <sys/stat.h>
> > +#include <sys/ioctl.h>
> >  #include <fcntl.h>
> >
> >  #include "drmtest.h"
> > @@ -47,6 +48,21 @@
> >   * engines.
> >   */
> >
> > +static bool has_gpu_reset(int fd)
> > +{
> > +	struct drm_i915_getparam gp;
> > +	int val = 0;
> > +
> > +	memset(&gp, 0, sizeof(gp));
> > +	gp.param = 35; /* HAS_GPU_RESET */
> > +	gp.value = &val;
> > +
> > +	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp, sizeof(gp)))
> > +		return intel_gen(intel_get_drm_devid(fd)) >= 5;
> > +
> > +	return val > 0;
> > +}
> >
> >  /**
> >   * igt_require_hang_ring:
> > @@ -60,7 +76,7 @@
> >  void igt_require_hang_ring(int fd, int ring)
> >  {
> >  	gem_context_require_ban_period(fd);
> > -	igt_require(intel_gen(intel_get_drm_devid(fd)) >= 5);
> > +	igt_require(has_gpu_reset(fd));
> >  }

Count me convinced, patch applied ;-)

> Speaking of which, do we want
> 	igt_require(getenv("IGT_DISABLE_HANG") == NULL);
> here?

Well

	igt_require(!igt_check_boolean_env_var(IGT_DISABLE_HANG, false));

but tbh I'm not sure of that. Filtering testcases with piglit using -x hang
should amount to the same really.
-Daniel
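To make the difference between the two suggestions concrete: a plain getenv() != NULL check treats IGT_DISABLE_HANG=0 as "disabled", whereas a boolean parse does not. The sketch below shows one way such a boolean opt-out could behave. The helper env_flag_set() and the wrapper hang_tests_allowed() are hypothetical names, and the parsing rule (unset gives the default, empty or "0" is false, anything else is true) is an assumption rather than the documented behaviour of igt_check_boolean_env_var().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not igt API: interpret an environment variable as a
 * boolean. Assumed rule: unset -> default_value, "" or "0" -> false,
 * any other value -> true. */
static bool env_flag_set(const char *name, bool default_value)
{
	const char *val;

	assert(name != NULL);

	val = getenv(name);
	if (val == NULL)
		return default_value;

	return val[0] != '\0' && strcmp(val, "0") != 0;
}

/* Hang injection is allowed unless the user explicitly opted out. */
static bool hang_tests_allowed(void)
{
	return !env_flag_set("IGT_DISABLE_HANG", false);
}
```

Under these assumptions, running with IGT_DISABLE_HANG=1 skips hang tests, while leaving the variable unset or setting it to 0 keeps them enabled.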
diff --git a/lib/igt_gt.c b/lib/igt_gt.c
index deb5560..8a1ffb2 100644
--- a/lib/igt_gt.c
+++ b/lib/igt_gt.c
@@ -26,6 +26,7 @@
 #include <errno.h>
 #include <sys/types.h>
 #include <sys/stat.h>
+#include <sys/ioctl.h>
 #include <fcntl.h>

 #include "drmtest.h"
@@ -47,6 +48,21 @@
  * engines.
  */

+static bool has_gpu_reset(int fd)
+{
+	struct drm_i915_getparam gp;
+	int val = 0;
+
+	memset(&gp, 0, sizeof(gp));
+	gp.param = 35; /* HAS_GPU_RESET */
+	gp.value = &val;
+
+	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp, sizeof(gp)))
+		return intel_gen(intel_get_drm_devid(fd)) >= 5;
+
+	return val > 0;
+}

 /**
  * igt_require_hang_ring:
@@ -60,7 +76,7 @@