Message ID | 20210716201724.54804-50-matthew.brost@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | GuC submission support |
On Fri, Jul 16, 2021 at 01:17:22PM -0700, Matthew Brost wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> Some testing environments and some heavier tests are slower than
> previous limits allowed for. For example, it can take multiple seconds
> for the 'context has been reset' notification handler to reach the
> 'kill the requests' code in the 'active' version of the 'reset
> engines' test. During which time the selftest gets bored, gives up
> waiting and fails the test.
>
> There is also an async thread that the selftest uses to pump work
> through the hardware in parallel to the context that is marked for
> reset. That also could get bored waiting for completions and kill the
> test off.
>
> Lastly, the flush at the end of various test sections can also see
> timeouts due to the large amount of work backed up. This is also true
> of the live_hwsp_read test.
>
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> ---
>  drivers/gpu/drm/i915/gt/selftest_hangcheck.c             | 2 +-
>  drivers/gpu/drm/i915/selftests/igt_flush_test.c          | 2 +-
>  drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> index 971c0c249eb0..a93a9b0d258e 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> @@ -876,7 +876,7 @@ static int active_request_put(struct i915_request *rq)
>  	if (!rq)
>  		return 0;
>
> -	if (i915_request_wait(rq, 0, 5 * HZ) < 0) {
> +	if (i915_request_wait(rq, 0, 10 * HZ) < 0) {
>  		GEM_TRACE("%s timed out waiting for completion of fence %llx:%lld\n",
>  			  rq->engine->name,
>  			  rq->fence.context,
> diff --git a/drivers/gpu/drm/i915/selftests/igt_flush_test.c b/drivers/gpu/drm/i915/selftests/igt_flush_test.c
> index 7b0939e3f007..a6c71fca61aa 100644
> --- a/drivers/gpu/drm/i915/selftests/igt_flush_test.c
> +++ b/drivers/gpu/drm/i915/selftests/igt_flush_test.c
> @@ -19,7 +19,7 @@ int igt_flush_test(struct drm_i915_private *i915)
>
>  	cond_resched();
>
> -	if (intel_gt_wait_for_idle(gt, HZ / 5) == -ETIME) {
> +	if (intel_gt_wait_for_idle(gt, HZ) == -ETIME) {
>  		pr_err("%pS timed out, cancelling all further testing.\n",
>  		       __builtin_return_address(0));
>
> diff --git a/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c b/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c
> index 69db139f9e0d..ebd6d69b3315 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c
> @@ -13,7 +13,7 @@
>
>  #define REDUCED_TIMESLICE 5
>  #define REDUCED_PREEMPT 10
> -#define WAIT_FOR_RESET_TIME 1000
> +#define WAIT_FOR_RESET_TIME 10000
>
>  int intel_selftest_modify_policy(struct intel_engine_cs *engine,
>  				 struct intel_selftest_saved_policy *saved,
> --
> 2.28.0
>
On 16/07/2021 21:17, Matthew Brost wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> Some testing environments and some heavier tests are slower than

What testing environments are they? It's not a simulation patch which
"escaped" by accident I am wondering. If not then it's just GuC which is
so slow, like that other patch two steps previous in the series?

Regards,

Tvrtko

> previous limits allowed for. For example, it can take multiple seconds
> for the 'context has been reset' notification handler to reach the
> 'kill the requests' code in the 'active' version of the 'reset
> engines' test. During which time the selftest gets bored, gives up
> waiting and fails the test.
>
> There is also an async thread that the selftest uses to pump work
> through the hardware in parallel to the context that is marked for
> reset. That also could get bored waiting for completions and kill the
> test off.
>
> Lastly, the flush at the end of various test sections can also see
> timeouts due to the large amount of work backed up. This is also true
> of the live_hwsp_read test.
>
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/selftest_hangcheck.c             | 2 +-
>  drivers/gpu/drm/i915/selftests/igt_flush_test.c          | 2 +-
>  drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> index 971c0c249eb0..a93a9b0d258e 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> @@ -876,7 +876,7 @@ static int active_request_put(struct i915_request *rq)
>  	if (!rq)
>  		return 0;
>
> -	if (i915_request_wait(rq, 0, 5 * HZ) < 0) {
> +	if (i915_request_wait(rq, 0, 10 * HZ) < 0) {
>  		GEM_TRACE("%s timed out waiting for completion of fence %llx:%lld\n",
>  			  rq->engine->name,
>  			  rq->fence.context,
> diff --git a/drivers/gpu/drm/i915/selftests/igt_flush_test.c b/drivers/gpu/drm/i915/selftests/igt_flush_test.c
> index 7b0939e3f007..a6c71fca61aa 100644
> --- a/drivers/gpu/drm/i915/selftests/igt_flush_test.c
> +++ b/drivers/gpu/drm/i915/selftests/igt_flush_test.c
> @@ -19,7 +19,7 @@ int igt_flush_test(struct drm_i915_private *i915)
>
>  	cond_resched();
>
> -	if (intel_gt_wait_for_idle(gt, HZ / 5) == -ETIME) {
> +	if (intel_gt_wait_for_idle(gt, HZ) == -ETIME) {
>  		pr_err("%pS timed out, cancelling all further testing.\n",
>  		       __builtin_return_address(0));
>
> diff --git a/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c b/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c
> index 69db139f9e0d..ebd6d69b3315 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c
> @@ -13,7 +13,7 @@
>
>  #define REDUCED_TIMESLICE 5
>  #define REDUCED_PREEMPT 10
> -#define WAIT_FOR_RESET_TIME 1000
> +#define WAIT_FOR_RESET_TIME 10000
>
>  int intel_selftest_modify_policy(struct intel_engine_cs *engine,
>  				 struct intel_selftest_saved_policy *saved,
>
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 971c0c249eb0..a93a9b0d258e 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -876,7 +876,7 @@ static int active_request_put(struct i915_request *rq)
 	if (!rq)
 		return 0;
 
-	if (i915_request_wait(rq, 0, 5 * HZ) < 0) {
+	if (i915_request_wait(rq, 0, 10 * HZ) < 0) {
 		GEM_TRACE("%s timed out waiting for completion of fence %llx:%lld\n",
 			  rq->engine->name,
 			  rq->fence.context,
diff --git a/drivers/gpu/drm/i915/selftests/igt_flush_test.c b/drivers/gpu/drm/i915/selftests/igt_flush_test.c
index 7b0939e3f007..a6c71fca61aa 100644
--- a/drivers/gpu/drm/i915/selftests/igt_flush_test.c
+++ b/drivers/gpu/drm/i915/selftests/igt_flush_test.c
@@ -19,7 +19,7 @@ int igt_flush_test(struct drm_i915_private *i915)
 
 	cond_resched();
 
-	if (intel_gt_wait_for_idle(gt, HZ / 5) == -ETIME) {
+	if (intel_gt_wait_for_idle(gt, HZ) == -ETIME) {
 		pr_err("%pS timed out, cancelling all further testing.\n",
 		       __builtin_return_address(0));
 
diff --git a/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c b/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c
index 69db139f9e0d..ebd6d69b3315 100644
--- a/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c
+++ b/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c
@@ -13,7 +13,7 @@
 
 #define REDUCED_TIMESLICE 5
 #define REDUCED_PREEMPT 10
-#define WAIT_FOR_RESET_TIME 1000
+#define WAIT_FOR_RESET_TIME 10000
 
 int intel_selftest_modify_policy(struct intel_engine_cs *engine,
 				 struct intel_selftest_saved_policy *saved,
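For readers comparing the old and new limits: the two waits touched in selftest_hangcheck.c and igt_flush_test.c are expressed in jiffies (multiples of HZ), so their wall-clock length is easiest to see after converting to milliseconds. The following is only a userspace sketch of that arithmetic, not kernel code. The names `ticks_to_ms`, `timeout_change`, `request_wait_change` and `flush_idle_change` are hypothetical helpers introduced here for illustration, and the tick rate is passed in as a parameter because HZ is a build-time kernel setting.

```c
#include <assert.h>

/*
 * Convert a timeout expressed in scheduler ticks (jiffies) to milliseconds.
 * `hz` stands in for the kernel's build-time HZ constant.
 * Hypothetical userspace helper, not a kernel API.
 */
static unsigned long ticks_to_ms(unsigned long ticks, unsigned long hz)
{
	assert(hz > 0);
	return ticks * 1000UL / hz;
}

/* Old and new timeout of one changed call site, in milliseconds. */
struct timeout_change {
	unsigned long old_ms;
	unsigned long new_ms;
};

/* i915_request_wait() in active_request_put(): 5 * HZ becomes 10 * HZ. */
static struct timeout_change request_wait_change(unsigned long hz)
{
	struct timeout_change c = {
		ticks_to_ms(5 * hz, hz),
		ticks_to_ms(10 * hz, hz),
	};
	return c;
}

/* intel_gt_wait_for_idle() in igt_flush_test(): HZ / 5 becomes HZ. */
static struct timeout_change flush_idle_change(unsigned long hz)
{
	struct timeout_change c = {
		ticks_to_ms(hz / 5, hz),	/* integer division, as in the kernel expression */
		ticks_to_ms(hz, hz),
	};
	return c;
}
```

For a tick rate that is a multiple of 5 (for example 250 or 1000), this works out to 5 s becoming 10 s for the request wait and 200 ms becoming 1 s for the idle flush. The third change, WAIT_FOR_RESET_TIME going from 1000 to 10000, would be 1 s becoming 10 s if the constant is in milliseconds; the diff itself does not show its unit, so treat that as an assumption.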