[i-g-t,2/3] tests/perf_pmu: Simplify interrupt testing

Message ID 20171219154543.10648-2-tvrtko.ursulin@linux.intel.com (mailing list archive)
State New, archived

Commit Message

Tvrtko Ursulin Dec. 19, 2017, 3:45 p.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Rather than calibrate and emit nop batches, use a manually signalled chain
of spinners to generate the desired interrupts.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/perf_pmu.c | 94 ++++++++------------------------------------------------
 1 file changed, 13 insertions(+), 81 deletions(-)

Comments

Chris Wilson Dec. 19, 2017, 9:43 p.m. UTC | #1
Quoting Tvrtko Ursulin (2017-12-19 15:45:42)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Rather than calibrate and emit nop batches, use a manually signalled chain
> of spinners to generate the desired interrupts.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>         } while (idle != busy);
>  
> -       /* Install the fences and enable signaling */
> -       igt_assert_eq(poll(&pfd, 1, 10), 0);
> +       /* Process the batch queue. */
> +       pfd.events = POLLIN;
> +       for (int i = 0; i < target; i++) {
> +               const unsigned int timeout_ms = test_duration_ms / target;

I think you want
	timeout_ms = (i + 1) * test_duration_ms / target;
? Otherwise all the batches have the same relative-to-now timeout (not
relative to the start of the batch).
-Chris
Chris Wilson Dec. 19, 2017, 9:45 p.m. UTC | #2
Quoting Tvrtko Ursulin (2017-12-19 15:45:42)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Rather than calibrate and emit nop batches, use a manually signalled chain
> of spinners to generate the desired interrupts.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> -       /* Unplug the calibrated queue and wait for all the fences */
> -       igt_spin_batch_free(gem_fd, spin);
> -       igt_assert_eq(poll(&pfd, 1, 2 * test_duration_ms), 1);
> -       close(pfd.fd);
> +               pfd.fd = spin[i]->out_fence;
> +               igt_spin_batch_set_timeout(spin[i], timeout_ms * 1e6);
> +               igt_assert_eq(poll(&pfd, 1, 2 * timeout_ms), 1);

Oh, still with the synchronous behaviour, bleurgh.
-Chris
Tvrtko Ursulin Dec. 20, 2017, 9:45 a.m. UTC | #3
On 19/12/2017 21:45, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-12-19 15:45:42)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Rather than calibrate and emit nop batches, use a manually signalled chain
>> of spinners to generate the desired interrupts.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>> -       /* Unplug the calibrated queue and wait for all the fences */
>> -       igt_spin_batch_free(gem_fd, spin);
>> -       igt_assert_eq(poll(&pfd, 1, 2 * test_duration_ms), 1);
>> -       close(pfd.fd);
>> +               pfd.fd = spin[i]->out_fence;
>> +               igt_spin_batch_set_timeout(spin[i], timeout_ms * 1e6);
>> +               igt_assert_eq(poll(&pfd, 1, 2 * timeout_ms), 1);
> 
> Oh, still with the synchronous behaviour, bleurgh.

I was attracted by the simplicity of this approach, but I can change to 
set incremental timeouts and keep the merged fence if you think that's 
better?

Regards,

Tvrtko
Chris Wilson Dec. 20, 2017, 10:49 a.m. UTC | #4
Quoting Tvrtko Ursulin (2017-12-20 09:45:41)
> 
> On 19/12/2017 21:45, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2017-12-19 15:45:42)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> Rather than calibrate and emit nop batches, use a manually signalled chain
> >> of spinners to generate the desired interrupts.
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> ---
> >> -       /* Unplug the calibrated queue and wait for all the fences */
> >> -       igt_spin_batch_free(gem_fd, spin);
> >> -       igt_assert_eq(poll(&pfd, 1, 2 * test_duration_ms), 1);
> >> -       close(pfd.fd);
> >> +               pfd.fd = spin[i]->out_fence;
> >> +               igt_spin_batch_set_timeout(spin[i], timeout_ms * 1e6);
> >> +               igt_assert_eq(poll(&pfd, 1, 2 * timeout_ms), 1);
> > 
> > Oh, still with the synchronous behaviour, bleurgh.
> 
> I was attracted by the simplicity of this approach, but I can change to 
> set incremental timeouts and keep the merged fence if you think that's 
> better?

It was mostly surprise as I just have a preference for setting up
everything and then letting it go; fire-and-forget style. So that was
what I was expecting to see. It should basically be the difference of
adding a single function to merge the fences (albeit you have to write
that function). Shall we say both patterns have merit / analogues to the
real world?
-Chris
Tvrtko Ursulin Dec. 20, 2017, 12:35 p.m. UTC | #5
On 20/12/2017 10:49, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-12-20 09:45:41)
>>
>> On 19/12/2017 21:45, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2017-12-19 15:45:42)
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> Rather than calibrate and emit nop batches, use a manually signalled chain
>>>> of spinners to generate the desired interrupts.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> ---
>>>> -       /* Unplug the calibrated queue and wait for all the fences */
>>>> -       igt_spin_batch_free(gem_fd, spin);
>>>> -       igt_assert_eq(poll(&pfd, 1, 2 * test_duration_ms), 1);
>>>> -       close(pfd.fd);
>>>> +               pfd.fd = spin[i]->out_fence;
>>>> +               igt_spin_batch_set_timeout(spin[i], timeout_ms * 1e6);
>>>> +               igt_assert_eq(poll(&pfd, 1, 2 * timeout_ms), 1);
>>>
>>> Oh, still with the synchronous behaviour, bleurgh.
>>
>> I was attracted by the simplicity of this approach, but I can change to
>> set incremental timeouts and keep the merged fence if you think that's
>> better?
> 
> It was mostly surprise as I just have a preference for setting up
> everything and then letting it go; fire-and-forget style. So that was
> what I was expecting to see. It should basically be the difference of
> adding a single function to merge the fences (albeit you have to write
> that function). Shall we say both patterns have merit / analogues to the
> real world?

It is using the fence merging approach, so it won't be a problem to keep 
that, just with the spin batches. I mostly wanted to remove the calibration 
since you were nudging me in that direction. I don't really mind the 
specific mechanics of emitting interrupts, so I am happy to go back a step.

Regards,

Tvrtko
Chris Wilson Dec. 20, 2017, 12:39 p.m. UTC | #6
Quoting Tvrtko Ursulin (2017-12-20 12:35:31)
> 
> On 20/12/2017 10:49, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2017-12-20 09:45:41)
> >>
> >> On 19/12/2017 21:45, Chris Wilson wrote:
> >>> Quoting Tvrtko Ursulin (2017-12-19 15:45:42)
> >>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>>
> >>>> Rather than calibrate and emit nop batches, use a manually signalled chain
> >>>> of spinners to generate the desired interrupts.
> >>>>
> >>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>> ---
> >>>> -       /* Unplug the calibrated queue and wait for all the fences */
> >>>> -       igt_spin_batch_free(gem_fd, spin);
> >>>> -       igt_assert_eq(poll(&pfd, 1, 2 * test_duration_ms), 1);
> >>>> -       close(pfd.fd);
> >>>> +               pfd.fd = spin[i]->out_fence;
> >>>> +               igt_spin_batch_set_timeout(spin[i], timeout_ms * 1e6);
> >>>> +               igt_assert_eq(poll(&pfd, 1, 2 * timeout_ms), 1);
> >>>
> >>> Oh, still with the synchronous behaviour, bleurgh.
> >>
> >> I was attracted by the simplicity of this approach, but I can change to
> >> set incremental timeouts and keep the merged fence if you think that's
> >> better?
> > 
> > It was mostly surprise as I just have a preference for setting up
> > everything and then letting it go; fire-and-forget style. So that was
> > what I was expecting to see. It should basically be the difference of
> > adding a single function to merge the fences (albeit you have to write
> > that function). Shall we say both patterns have merit / analogues to the
> > real world?
> 
> It is using the fence merging approach, so it won't be a problem to keep 
> that, just with the spin batches. I mostly wanted to remove the calibration 
> since you were nudging me in that direction. I don't really mind the 
> specific mechanics of emitting interrupts, so I am happy to go back a step.

I was wondering how difficult it would be to run twice; once batched and
once sync? We're more likely to catch discrepancies in our interrupt
generation that way, which may or may not help improve the test in
future.
-Chris
Patch

diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index db7696115a7b..935fee03b253 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -799,94 +799,23 @@  static void cpu_hotplug(int gem_fd)
 	assert_within_epsilon(val, ref, tolerance);
 }
 
-static unsigned long calibrate_nop(int fd, const uint64_t calibration_us)
-{
-	const uint64_t cal_min_us = calibration_us * 3;
-	const unsigned int tolerance_pct = 10;
-	const uint32_t bbe = MI_BATCH_BUFFER_END;
-	const unsigned int loops = 17;
-	struct drm_i915_gem_exec_object2 obj = {};
-	struct drm_i915_gem_execbuffer2 eb = {
-		.buffer_count = 1, .buffers_ptr = to_user_pointer(&obj),
-	};
-	struct timespec t_begin = { };
-	uint64_t size, last_size, ns;
-
-	igt_nsec_elapsed(&t_begin);
-
-	size = 256 * 1024;
-	do {
-		struct timespec t_start = { };
-
-		obj.handle = gem_create(fd, size);
-		gem_write(fd, obj.handle, size - sizeof(bbe), &bbe,
-			  sizeof(bbe));
-		gem_execbuf(fd, &eb);
-		gem_sync(fd, obj.handle);
-
-		igt_nsec_elapsed(&t_start);
-
-		for (int loop = 0; loop < loops; loop++)
-			gem_execbuf(fd, &eb);
-		gem_sync(fd, obj.handle);
-
-		ns = igt_nsec_elapsed(&t_start);
-
-		gem_close(fd, obj.handle);
-
-		last_size = size;
-		size = calibration_us * 1000 * size * loops / ns;
-		size = ALIGN(size, sizeof(uint32_t));
-	} while (igt_nsec_elapsed(&t_begin) / 1000 < cal_min_us ||
-		 abs(size - last_size) > (size * tolerance_pct / 100));
-
-	return size;
-}
-
 static void
 test_interrupts(int gem_fd)
 {
-	const uint32_t bbe = MI_BATCH_BUFFER_END;
 	const unsigned int test_duration_ms = 1000;
-	struct drm_i915_gem_exec_object2 obj = { };
-	struct drm_i915_gem_execbuffer2 eb = {
-		.buffers_ptr = to_user_pointer(&obj),
-		.buffer_count = 1,
-		.flags = I915_EXEC_FENCE_OUT,
-	};
-	unsigned long sz;
-	igt_spin_t *spin;
 	const int target = 30;
+	igt_spin_t *spin[target];
 	struct pollfd pfd;
 	uint64_t idle, busy;
 	int fd;
 
-	sz = calibrate_nop(gem_fd, test_duration_ms * 1000 / target);
 	gem_quiescent_gpu(gem_fd);
 
 	fd = open_pmu(I915_PMU_INTERRUPTS);
-	spin = igt_spin_batch_new(gem_fd, 0, 0, 0);
 
-	obj.handle = gem_create(gem_fd, sz);
-	gem_write(gem_fd, obj.handle, sz - sizeof(bbe), &bbe, sizeof(bbe));
-
-	pfd.events = POLLIN;
-	pfd.fd = -1;
-	for (int i = 0; i < target; i++) {
-		int new;
-
-		/* Merge all the fences together so we can wait on them all */
-		gem_execbuf_wr(gem_fd, &eb);
-		new = eb.rsvd2 >> 32;
-		if (pfd.fd == -1) {
-			pfd.fd = new;
-		} else {
-			int old = pfd.fd;
-			pfd.fd = sync_fence_merge(old, new);
-			close(old);
-			close(new);
-		}
-	}
+	/* Queue spinning batches. */
+	for (int i = 0; i < target; i++)
+		spin[i] = __igt_spin_batch_new_fence(gem_fd, 0, 0);
 
 	/* Wait for idle state. */
 	idle = pmu_read_single(fd);
@@ -896,13 +825,16 @@  test_interrupts(int gem_fd)
 		idle = pmu_read_single(fd);
 	} while (idle != busy);
 
-	/* Install the fences and enable signaling */
-	igt_assert_eq(poll(&pfd, 1, 10), 0);
+	/* Process the batch queue. */
+	pfd.events = POLLIN;
+	for (int i = 0; i < target; i++) {
+		const unsigned int timeout_ms = test_duration_ms / target;
 
-	/* Unplug the calibrated queue and wait for all the fences */
-	igt_spin_batch_free(gem_fd, spin);
-	igt_assert_eq(poll(&pfd, 1, 2 * test_duration_ms), 1);
-	close(pfd.fd);
+		pfd.fd = spin[i]->out_fence;
+		igt_spin_batch_set_timeout(spin[i], timeout_ms * 1e6);
+		igt_assert_eq(poll(&pfd, 1, 2 * timeout_ms), 1);
+		igt_spin_batch_free(gem_fd, spin[i]);
+	}
 
 	/* Check at least as many interrupts has been generated. */
 	busy = pmu_read_single(fd) - idle;