[i-g-t,v2] tests/gem_ringfill: Add {render, blitter}-forked subtests.
diff mbox

Message ID 1435233884.3368.0.camel@jlahtine-mobl1
State New
Headers show

Commit Message

Joonas Lahtinen June 25, 2015, 12:04 p.m. UTC
Add forking subtests to gem_ringfill. Tests cause consistent GPU
hangs on SKL.

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89959
---
 tests/gem_ringfill.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

Comments

Chris Wilson June 25, 2015, 12:49 p.m. UTC | #1
On Thu, Jun 25, 2015 at 03:04:44PM +0300, Joonas Lahtinen wrote:
> Add forking subtests to gem_ringfill. Tests cause consistent GPU
> hangs on SKL.
> 
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89959
> ---
>  tests/gem_ringfill.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/tests/gem_ringfill.c b/tests/gem_ringfill.c
> index 85b01ea..1b93a03 100644
> --- a/tests/gem_ringfill.c
> +++ b/tests/gem_ringfill.c
> @@ -241,6 +241,28 @@ igt_main
>  	}
>  	igt_stop_signal_helper();
>  
> +	igt_subtest("blitter-forked") {
> +		igt_fork(child, 1) {

One child? Only a single child is required to provoke the bug? So what's
the difference between calling check_ring() in the child and the parent?

> +			check_ring(bufmgr, batch, "blt", blt_copy);
> +		}
> +		igt_waitchildren();
> +	}
> +
> +	/* Strictly only required on architectures with a separate BLT ring,
> +	 * but lets stress everybody.
> +	 */
> +	igt_subtest("render-forked") {
> +		igt_render_copyfunc_t copy;
> +
> +		copy = igt_get_render_copyfunc(batch->devid);
> +		igt_require(copy);
> +
> +		igt_fork(child, 1) {
> +			check_ring(bufmgr, batch, "render", copy);
> +		}
> +		igt_waitchildren();
> +	}

One more for forked:
igt_subtest("all-forked") {
	for_each_ring() {
		if (copy(ring)) == NULL)
			continue;
		igt_fork(child, N)
			check_ring(bufmgr, batch, name(ring), copy);
	}
	igt_waitchildren()

Other tasks required for running forked is to display the interactive
progress bar, bufmgr should be instantiated per child.
-Chris
Joonas Lahtinen June 25, 2015, 1:23 p.m. UTC | #2
On to, 2015-06-25 at 13:49 +0100, Chris Wilson wrote:
> On Thu, Jun 25, 2015 at 03:04:44PM +0300, Joonas Lahtinen wrote:
> > Add forking subtests to gem_ringfill. Tests cause consistent GPU
> > hangs on SKL.
> > 
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89959
> > ---
> >  tests/gem_ringfill.c | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> > 
> > diff --git a/tests/gem_ringfill.c b/tests/gem_ringfill.c
> > index 85b01ea..1b93a03 100644
> > --- a/tests/gem_ringfill.c
> > +++ b/tests/gem_ringfill.c
> > @@ -241,6 +241,28 @@ igt_main
> >  	}
> >  	igt_stop_signal_helper();
> >  
> > +	igt_subtest("blitter-forked") {
> > +		igt_fork(child, 1) {
> 
> One child? Only a single child is required to provoke the bug? So what's
> the difference between calling check_ring() in the child and the parent?
> 

Well, it is the difference between GPU hang and no GPU hang :P It is
currently under investigation exactly why. Changing the batchbuffer
payload to NOOP stops hanging, but the subtest execution time drops from
a minute to a couple of seconds.

> > +			check_ring(bufmgr, batch, "blt", blt_copy);
> > +		}
> > +		igt_waitchildren();
> > +	}
> > +
> > +	/* Strictly only required on architectures with a separate BLT ring,
> > +	 * but lets stress everybody.
> > +	 */
> > +	igt_subtest("render-forked") {
> > +		igt_render_copyfunc_t copy;
> > +
> > +		copy = igt_get_render_copyfunc(batch->devid);
> > +		igt_require(copy);
> > +
> > +		igt_fork(child, 1) {
> > +			check_ring(bufmgr, batch, "render", copy);
> > +		}
> > +		igt_waitchildren();
> > +	}
> 
> One more for forked:
> igt_subtest("all-forked") {
> 	for_each_ring() {
> 		if (copy(ring)) == NULL)
> 			continue;
> 		igt_fork(child, N)
> 			check_ring(bufmgr, batch, name(ring), copy);
> 	}
> 	igt_waitchildren()
> 

This is something that could be added, yep. But I do not see it relevant
to the current patch, which is needed to better track down the root
cause for the SKL hangs.

Regards, Joonas

> Other tasks required for running forked is to display the interactive
> progress bar, bufmgr should be instantiated per child.
> -Chris
>
Chris Wilson June 25, 2015, 1:34 p.m. UTC | #3
On Thu, Jun 25, 2015 at 04:23:43PM +0300, Joonas Lahtinen wrote:
> On to, 2015-06-25 at 13:49 +0100, Chris Wilson wrote:
> > On Thu, Jun 25, 2015 at 03:04:44PM +0300, Joonas Lahtinen wrote:
> > > Add forking subtests to gem_ringfill. Tests cause consistent GPU
> > > hangs on SKL.
> > > 
> > > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > > Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89959
> > > ---
> > >  tests/gem_ringfill.c | 22 ++++++++++++++++++++++
> > >  1 file changed, 22 insertions(+)
> > > 
> > > diff --git a/tests/gem_ringfill.c b/tests/gem_ringfill.c
> > > index 85b01ea..1b93a03 100644
> > > --- a/tests/gem_ringfill.c
> > > +++ b/tests/gem_ringfill.c
> > > @@ -241,6 +241,28 @@ igt_main
> > >  	}
> > >  	igt_stop_signal_helper();
> > >  
> > > +	igt_subtest("blitter-forked") {
> > > +		igt_fork(child, 1) {
> > 
> > One child? Only a single child is required to provoke the bug? So what's
> > the difference between calling check_ring() in the child and the parent?
> > 
> 
> Well, it is the difference between GPU hang and no GPU hang :P It is
> currently under investigation exactly why. Changing the batchbuffer
> payload to NOOP stops hanging, but the subtest execution time drops from
> a minute to a couple of seconds.
> 
> > > +			check_ring(bufmgr, batch, "blt", blt_copy);
> > > +		}
> > > +		igt_waitchildren();
> > > +	}
> > > +
> > > +	/* Strictly only required on architectures with a separate BLT ring,
> > > +	 * but lets stress everybody.
> > > +	 */
> > > +	igt_subtest("render-forked") {
> > > +		igt_render_copyfunc_t copy;
> > > +
> > > +		copy = igt_get_render_copyfunc(batch->devid);
> > > +		igt_require(copy);
> > > +
> > > +		igt_fork(child, 1) {
> > > +			check_ring(bufmgr, batch, "render", copy);
> > > +		}
> > > +		igt_waitchildren();
> > > +	}
> > 
> > One more for forked:
> > igt_subtest("all-forked") {
> > 	for_each_ring() {
> > 		if (copy(ring)) == NULL)
> > 			continue;
> > 		igt_fork(child, N)
> > 			check_ring(bufmgr, batch, name(ring), copy);
> > 	}
> > 	igt_waitchildren()
> > 
> 
> This is something that could be added, yep. But I do not see it relevant
> to the current patch, which is needed to better track down the root
> cause for the SKL hangs.

As it stands, this code is buggy if attempted to be used from multiple
processes at once. Each process will share the same fd and have an
existing set of shared handles that it will attempt to use without
synchronisation with the other processes, the result will be
indeterministic. So when adding the -forked variants, we should also
make it possible to run the test forked.
-Chris

Patch
diff mbox

diff --git a/tests/gem_ringfill.c b/tests/gem_ringfill.c
index 85b01ea..1b93a03 100644
--- a/tests/gem_ringfill.c
+++ b/tests/gem_ringfill.c
@@ -241,6 +241,28 @@  igt_main
 	}
 	igt_stop_signal_helper();
 
+	igt_subtest("blitter-forked") {
+		igt_fork(child, 1) {
+			check_ring(bufmgr, batch, "blt", blt_copy);
+		}
+		igt_waitchildren();
+	}
+
+	/* Strictly only required on architectures with a separate BLT ring,
+	 * but lets stress everybody.
+	 */
+	igt_subtest("render-forked") {
+		igt_render_copyfunc_t copy;
+
+		copy = igt_get_render_copyfunc(batch->devid);
+		igt_require(copy);
+
+		igt_fork(child, 1) {
+			check_ring(bufmgr, batch, "render", copy);
+		}
+		igt_waitchildren();
+	}
+
 	igt_fixture {
 		intel_batchbuffer_free(batch);
 		drm_intel_bufmgr_destroy(bufmgr);