
drm/i915/selftests: Rearrange ktime_get to reduce latency against CS

Message ID: 20210108105608.18424-1-chris@chris-wilson.co.uk
State: New, archived

Commit Message

Chris Wilson Jan. 8, 2021, 10:56 a.m. UTC
In our tests where we measure the elapsed time on both the CPU and CS
using a udelay, our CS results match the udelay much more accurately
than the ktime (even when using ktime_get_raw_fast_ns). With preemption
disabled, we can go one step lower than ktime and use local_clock. Also
take the second CPU sample before the final write_semaphore(), so that
the store is not counted in the measured interval.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2919
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_engine_pm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Mika Kuoppala Jan. 12, 2021, 7:19 p.m. UTC | #1
Chris Wilson <chris@chris-wilson.co.uk> writes:

> In our tests where we measure the elapsed time on both the CPU and CS
> using a udelay, our CS results match the udelay much more accurately
> than the ktime (even when using ktime_get_fast_ns). With preemption
> disabled, we can go one step lower than ktime and use local_clock.
>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2919
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/gt/selftest_engine_pm.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> index ca080445695e..c3d965279fc3 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> @@ -112,11 +112,11 @@ static int __measure_timestamps(struct intel_context *ce,
>  
>  	/* Run the request for a 100us, sampling timestamps before/after */
>  	preempt_disable();

Do you need to promote this to local_irq_disable() ?
-Mika

> -	*dt = ktime_get_raw_fast_ns();
> +	*dt = local_clock();
>  	write_semaphore(&sema[2], 0);
>  	udelay(100);
> +	*dt = local_clock() - *dt;
>  	write_semaphore(&sema[2], 1);
> -	*dt = ktime_get_raw_fast_ns() - *dt;
>  	preempt_enable();
>  
>  	if (i915_request_wait(rq, 0, HZ / 2) < 0) {
> -- 
> 2.20.1
>
Chris Wilson Jan. 12, 2021, 8:39 p.m. UTC | #2
Quoting Mika Kuoppala (2021-01-12 19:19:34)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > In our tests where we measure the elapsed time on both the CPU and CS
> > using a udelay, our CS results match the udelay much more accurately
> > than the ktime (even when using ktime_get_fast_ns). With preemption
> > disabled, we can go one step lower than ktime and use local_clock.
> >
> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2919
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/gt/selftest_engine_pm.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> > index ca080445695e..c3d965279fc3 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> > @@ -112,11 +112,11 @@ static int __measure_timestamps(struct intel_context *ce,
> >  
> >       /* Run the request for a 100us, sampling timestamps before/after */
> >       preempt_disable();
> 
> Do you need to promote this to local_irq_disable() ?

Good suggestion. Will try to remember if we still see discrepancies...

Interrupt handlers are meant to be <5us, right???
-Chris
Mika Kuoppala Jan. 13, 2021, 10:52 a.m. UTC | #3
Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2021-01-12 19:19:34)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > In our tests where we measure the elapsed time on both the CPU and CS
>> > using a udelay, our CS results match the udelay much more accurately
>> > than the ktime (even when using ktime_get_fast_ns). With preemption
>> > disabled, we can go one step lower than ktime and use local_clock.
>> >
>> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2919
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > ---
>> >  drivers/gpu/drm/i915/gt/selftest_engine_pm.c | 4 ++--
>> >  1 file changed, 2 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
>> > index ca080445695e..c3d965279fc3 100644
>> > --- a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
>> > +++ b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
>> > @@ -112,11 +112,11 @@ static int __measure_timestamps(struct intel_context *ce,
>> >  
>> >       /* Run the request for a 100us, sampling timestamps before/after */
>> >       preempt_disable();
>> 
>> Do you need to promote this to local_irq_disable() ?
>
> Good suggestion. Will try to remember if we still see discrepancies...
>
> Interrupt handlers are meant to be <5us, right???

With both test types, we might sometimes find out what they are :)

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> -Chris

Patch

diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
index ca080445695e..c3d965279fc3 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
@@ -112,11 +112,11 @@  static int __measure_timestamps(struct intel_context *ce,
 
 	/* Run the request for a 100us, sampling timestamps before/after */
 	preempt_disable();
-	*dt = ktime_get_raw_fast_ns();
+	*dt = local_clock();
 	write_semaphore(&sema[2], 0);
 	udelay(100);
+	*dt = local_clock() - *dt;
 	write_semaphore(&sema[2], 1);
-	*dt = ktime_get_raw_fast_ns() - *dt;
 	preempt_enable();
 
 	if (i915_request_wait(rq, 0, HZ / 2) < 0) {
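
For readers who want to try the sampling order outside the kernel, here is a minimal userspace sketch of the pattern in the hunk above. It is an analogue under stated assumptions, not the selftest itself: now_ns() stands in for local_clock() using clock_gettime(CLOCK_MONOTONIC), spin_us() loosely mimics udelay(), and signal_store() is a hypothetical substitute for write_semaphore() that performs only an atomic store. None of these helper names come from the i915 sources, and userspace cannot disable preemption, so the numbers will be noisier than in the selftest.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Assumed stand-in for local_clock(): monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Busy-wait for at least @us microseconds, loosely mimicking udelay(). */
static void spin_us(unsigned int us)
{
	uint64_t end = now_ns() + (uint64_t)us * 1000;

	while (now_ns() < end)
		;
}

/* Hypothetical stand-in for write_semaphore(): a store another agent polls. */
static void signal_store(atomic_uint *sema, unsigned int value)
{
	atomic_store(sema, value);
}

/*
 * Time a delay window bracketed by two stores. As in the patch, the
 * closing sample is taken before the second store, so the cost of that
 * store is not folded into the returned interval.
 */
static uint64_t measure_window_ns(atomic_uint *sema, unsigned int us)
{
	uint64_t dt;

	dt = now_ns();
	signal_store(sema, 0);
	spin_us(us);
	dt = now_ns() - dt;	/* sample first ... */
	signal_store(sema, 1);	/* ... then signal */

	return dt;
}
```

Calling measure_window_ns(&sema, 100) returns the elapsed nanoseconds for a nominal 100us window. The result is never below the requested delay, and any excess reflects clock-read overhead, the opening store, and scheduling noise.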