diff mbox

[i-g-t] igt/perf_pmu: Flush to idle after hang

Message ID 20180525151439.12438-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Chris Wilson May 25, 2018, 3:14 p.m. UTC
We may not idle immediately after a hang, and indeed may send a pulse
down the pipeline periodically to become idle. Rather than make a flimsy
assumption about how long we need to sleep before the system idles,
wait for the system to declare itself idle; flushing it to idle in the
process!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
 tests/perf_pmu.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

Comments

Antonio Argenziano May 29, 2018, 8:05 p.m. UTC | #1
On 25/05/18 08:14, Chris Wilson wrote:
> We may not idle immediately after a hang, and indeed may send a pulse
> down the pipeline periodically to become idle. Rather than make a flimsy
> assumption about how long we need to sleep before the system idles,
> wait for the system to declare itself idle; flushing it to idle in the
> process!
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>

LGTM.
Reviewed-by: Antonio Argenziano <antonio.argenziano@intel.com>

> ---
>   tests/perf_pmu.c | 11 ++---------
>   1 file changed, 2 insertions(+), 9 deletions(-)
> 
> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> index 590e6526b..9af192dd8 100644
> --- a/tests/perf_pmu.c
> +++ b/tests/perf_pmu.c
> @@ -281,16 +281,9 @@ single(int gem_fd, const struct intel_execution_engine2 *e, unsigned int flags)
>   
>   	/* Check for idle after hang. */
>   	if (flags & FLAG_HANG) {
> -		/* Sleep for a bit for reset unwind to settle. */
> -		usleep(500e3);
> -		/*
> -		 * Ensure batch was executing before reset, meaning it must be
> -		 * idle by now. Unless it did not even manage to start before we
> -		 * triggered the reset, in which case the idleness check below
> -		 * might fail. The latter is very unlikely since there are two
> -		 * sleeps during which it had an opportunity to start.
> -		 */
> +		gem_quiescent_gpu(gem_fd);
>   		igt_assert(!gem_bo_busy(gem_fd, spin->handle));
> +
>   		val = pmu_read_single(fd);
>   		slept = measured_usleep(batch_duration_ns / 1000);
>   		val = pmu_read_single(fd) - val;
>
Tvrtko Ursulin May 30, 2018, 10:48 a.m. UTC | #2
On 25/05/2018 16:14, Chris Wilson wrote:
> We may not idle immediately after a hang, and indeed may send a pulse
> down the pipeline periodically to become idle. Rather than make a flimsy
> assumption about how long we need to sleep before the system idles,
> wait for the system to declare itself idle; flushing it to idle in the
> process!
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> ---
>   tests/perf_pmu.c | 11 ++---------
>   1 file changed, 2 insertions(+), 9 deletions(-)
> 
> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> index 590e6526b..9af192dd8 100644
> --- a/tests/perf_pmu.c
> +++ b/tests/perf_pmu.c
> @@ -281,16 +281,9 @@ single(int gem_fd, const struct intel_execution_engine2 *e, unsigned int flags)
>   
>   	/* Check for idle after hang. */
>   	if (flags & FLAG_HANG) {
> -		/* Sleep for a bit for reset unwind to settle. */
> -		usleep(500e3);
> -		/*
> -		 * Ensure batch was executing before reset, meaning it must be
> -		 * idle by now. Unless it did not even manage to start before we
> -		 * triggered the reset, in which case the idleness check below
> -		 * might fail. The latter is very unlikely since there are two
> -		 * sleeps during which it had an opportunity to start.
> -		 */
> +		gem_quiescent_gpu(gem_fd);
>   		igt_assert(!gem_bo_busy(gem_fd, spin->handle));
> +
>   		val = pmu_read_single(fd);
>   		slept = measured_usleep(batch_duration_ns / 1000);
>   		val = pmu_read_single(fd) - val;
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
Chris Wilson May 30, 2018, 10:57 a.m. UTC | #3
Quoting Tvrtko Ursulin (2018-05-30 11:48:16)
> 
> On 25/05/2018 16:14, Chris Wilson wrote:
> > We may not idle immediately after a hang, and indeed may send a pulse
> > down the pipeline periodically to become idle. Rather than make a flimsy
> > assumption about how long we need to sleep before the system idles,
> > wait for the system to declare itself idle; flushing it to idle in the
> > process!
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> > ---
> >   tests/perf_pmu.c | 11 ++---------
> >   1 file changed, 2 insertions(+), 9 deletions(-)
> > 
> > diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> > index 590e6526b..9af192dd8 100644
> > --- a/tests/perf_pmu.c
> > +++ b/tests/perf_pmu.c
> > @@ -281,16 +281,9 @@ single(int gem_fd, const struct intel_execution_engine2 *e, unsigned int flags)
> >   
> >       /* Check for idle after hang. */
> >       if (flags & FLAG_HANG) {
> > -             /* Sleep for a bit for reset unwind to settle. */
> > -             usleep(500e3);
> > -             /*
> > -              * Ensure batch was executing before reset, meaning it must be
> > -              * idle by now. Unless it did not even manage to start before we
> > -              * triggered the reset, in which case the idleness check below
> > -              * might fail. The latter is very unlikely since there are two
> > -              * sleeps during which it had an opportunity to start.
> > -              */
> > +             gem_quiescent_gpu(gem_fd);
> >               igt_assert(!gem_bo_busy(gem_fd, spin->handle));
> > +
> >               val = pmu_read_single(fd);
> >               slept = measured_usleep(batch_duration_ns / 1000);
> >               val = pmu_read_single(fd) - val;
> > 
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

That's a relief. I was fearing you might argue that we need some sort of
check that it does idle in a timely fashion without intervention. Off
the top of my head, we do have some gem_wait/gem_eio checks that should
cover that.
-Chris
diff mbox

Patch

diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index 590e6526b..9af192dd8 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -281,16 +281,9 @@  single(int gem_fd, const struct intel_execution_engine2 *e, unsigned int flags)
 
 	/* Check for idle after hang. */
 	if (flags & FLAG_HANG) {
-		/* Sleep for a bit for reset unwind to settle. */
-		usleep(500e3);
-		/*
-		 * Ensure batch was executing before reset, meaning it must be
-		 * idle by now. Unless it did not even manage to start before we
-		 * triggered the reset, in which case the idleness check below
-		 * might fail. The latter is very unlikely since there are two
-		 * sleeps during which it had an opportunity to start.
-		 */
+		gem_quiescent_gpu(gem_fd);
 		igt_assert(!gem_bo_busy(gem_fd, spin->handle));
+
 		val = pmu_read_single(fd);
 		slept = measured_usleep(batch_duration_ns / 1000);
 		val = pmu_read_single(fd) - val;