Message ID | 20210113125939.10205-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915/selftests: Bump the scheduling threshold for fast heartbeats |
Chris Wilson <chris@chris-wilson.co.uk> writes:

> Since we are system_highpri_wq, we expected the heartbeat to be
> scheduled promptly. However, we see delays of over 10ms upsetting our
> assertions. Accept this as inevitable and bump the error threshold to
> 20ms (from 6ms).
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
> index b88aa35ad75b..e88a01390dc5 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
> @@ -197,6 +197,7 @@ static int cmp_u32(const void *_a, const void *_b)
>
>  static int __live_heartbeat_fast(struct intel_engine_cs *engine)
>  {
> +	const int error_threshold = max(20000, jffies_to_usecs(6));

s/jffies/jiffies

Also for the commit message, 6 jiffies are not 6ms so it needs
some mending.

-Mika

>  	struct intel_context *ce;
>  	struct i915_request *rq;
>  	ktime_t t0, t1;
> @@ -254,12 +255,18 @@ static int __live_heartbeat_fast(struct intel_engine_cs *engine)
>  		times[0],
>  		times[ARRAY_SIZE(times) - 1]);
>
> -	/* Min work delay is 2 * 2 (worst), +1 for scheduling, +1 for slack */
> -	if (times[ARRAY_SIZE(times) / 2] > jiffies_to_usecs(6)) {
> +	/*
> +	 * Ideally, the upper bound on min work delay would be something like
> +	 * 2 * 2 (worst), +1 for scheduling, +1 for slack. In practice, we
> +	 * are, even with system_wq_highpri, at the mercy of the CPU scheduler
> +	 * and may be stuck behind some slow work for many millisecond. Such
> +	 * as our very own display workers.
> +	 */
> +	if (times[ARRAY_SIZE(times) / 2] > error_threshold) {
>  		pr_err("%s: Heartbeat delay was %uus, expected less than %dus\n",
>  		       engine->name,
>  		       times[ARRAY_SIZE(times) / 2],
> -		       jiffies_to_usecs(6));
> +		       error_threshold);
>  		err = -EINVAL;
>  	}
>
> --
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Quoting Mika Kuoppala (2021-01-13 14:13:57)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > Since we are system_highpri_wq, we expected the heartbeat to be
> > scheduled promptly. However, we see delays of over 10ms upsetting our
> > assertions. Accept this as inevitable and bump the error threshold to
> > 20ms (from 6ms).
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c | 13 ++++++++++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
> > index b88aa35ad75b..e88a01390dc5 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
> > @@ -197,6 +197,7 @@ static int cmp_u32(const void *_a, const void *_b)
> >
> >  static int __live_heartbeat_fast(struct intel_engine_cs *engine)
> >  {
> > +	const int error_threshold = max(20000, jffies_to_usecs(6));
>
> s/jffies/jiffies
>
> Also for the commit message, 6 jiffies are not 6ms so it needs
> some mending.

Ok, might as well pull the failure messages from CI as well for a bit
more information.
-Chris
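
The point about 6 jiffies not being 6ms can be checked with a little arithmetic: the length of one jiffy depends on the kernel's configured HZ. The sketch below is a userspace approximation only. `ticks_to_usecs()` and `error_threshold_usecs()` are hypothetical helpers, not kernel functions, and the conversion assumes HZ divides 1000000 evenly (true for the common settings 100, 250 and 1000).

```c
/*
 * Hypothetical userspace stand-in for the kernel's jiffies_to_usecs().
 * Assumes hz divides 1000000 evenly; the kernel's real conversion also
 * handles other HZ values.
 */
static unsigned int ticks_to_usecs(unsigned int ticks, unsigned int hz)
{
	return ticks * (1000000u / hz);
}

/*
 * Mirrors the patch's max(20000, jiffies_to_usecs(6)) with the typo
 * fixed: the threshold is 20ms or 6 ticks, whichever is larger.
 */
static unsigned int error_threshold_usecs(unsigned int hz)
{
	unsigned int six_ticks = ticks_to_usecs(6, hz);

	return six_ticks > 20000u ? six_ticks : 20000u;
}
```

With these helpers, 6 ticks come to 6000us at HZ=1000, 24000us at HZ=250 and 60000us at HZ=100, so "from 6ms" in the commit message only holds for HZ=1000. The new 20ms floor only takes effect at HZ=1000; at the lower HZ values the 6-tick term is still the larger one.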
diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
index b88aa35ad75b..e88a01390dc5 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
@@ -197,6 +197,7 @@ static int cmp_u32(const void *_a, const void *_b)
 
 static int __live_heartbeat_fast(struct intel_engine_cs *engine)
 {
+	const int error_threshold = max(20000, jffies_to_usecs(6));
 	struct intel_context *ce;
 	struct i915_request *rq;
 	ktime_t t0, t1;
@@ -254,12 +255,18 @@ static int __live_heartbeat_fast(struct intel_engine_cs *engine)
 		times[0],
 		times[ARRAY_SIZE(times) - 1]);
 
-	/* Min work delay is 2 * 2 (worst), +1 for scheduling, +1 for slack */
-	if (times[ARRAY_SIZE(times) / 2] > jiffies_to_usecs(6)) {
+	/*
+	 * Ideally, the upper bound on min work delay would be something like
+	 * 2 * 2 (worst), +1 for scheduling, +1 for slack. In practice, we
+	 * are, even with system_wq_highpri, at the mercy of the CPU scheduler
+	 * and may be stuck behind some slow work for many millisecond. Such
+	 * as our very own display workers.
+	 */
+	if (times[ARRAY_SIZE(times) / 2] > error_threshold) {
 		pr_err("%s: Heartbeat delay was %uus, expected less than %dus\n",
 		       engine->name,
 		       times[ARRAY_SIZE(times) / 2],
-		       jiffies_to_usecs(6));
+		       error_threshold);
 		err = -EINVAL;
 	}
Since we are system_highpri_wq, we expected the heartbeat to be
scheduled promptly. However, we see delays of over 10ms upsetting our
assertions. Accept this as inevitable and bump the error threshold to
20ms (from 6ms).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)
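
For readers unfamiliar with the selftest, the check being relaxed compares the median of the measured delays against the threshold, so a single slow outlier does not fail the test. A minimal userspace sketch of that idea follows. It assumes the samples are sorted before `times[ARRAY_SIZE(times) / 2]` is read (the sort itself is not visible in the hunks above). `check_median_delay()` is a hypothetical name, and the body of `cmp_u32()` is a guess at a typical comparator, not the kernel's code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Three-way comparator for qsort(). Comparing rather than subtracting
 * avoids wrap-around when the u32 difference does not fit in an int.
 */
static int cmp_u32(const void *_a, const void *_b)
{
	const uint32_t *a = _a, *b = _b;

	return (*a > *b) - (*a < *b);
}

/*
 * Hypothetical helper: sorts the samples in place and returns 0 if the
 * median delay is within threshold_us, or -1 if it exceeds it (or if
 * there are no samples at all).
 */
static int check_median_delay(uint32_t *samples, size_t count,
			      uint32_t threshold_us)
{
	if (count == 0)
		return -1;

	qsort(samples, count, sizeof(*samples), cmp_u32);

	return samples[count / 2] > threshold_us ? -1 : 0;
}
```

With made-up samples of 4000, 52000, 9000, 12000 and 3000us, the median is 9000us: the check passes against a 20000us threshold despite the 52000us outlier, but would fail against 6000us.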