From patchwork Mon Jul 16 13:21:54 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10526781 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id BA8486020A for ; Mon, 16 Jul 2018 13:22:10 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A93EA283C9 for ; Mon, 16 Jul 2018 13:22:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9D17F28464; Mon, 16 Jul 2018 13:22:10 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00, MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 264A8283C9 for ; Mon, 16 Jul 2018 13:22:10 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7E18F6E1B7; Mon, 16 Jul 2018 13:22:09 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6D0306E1B7 for ; Mon, 16 Jul 2018 13:22:08 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 12366656-1500050 for multiple; Mon, 16 Jul 2018 14:21:55 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Mon, 16 Jul 2018 14:21:54 +0100 Message-Id: <20180716132154.12539-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180716124829.27639-2-chris@chris-wilson.co.uk> References: <20180716124829.27639-2-chris@chris-wilson.co.uk> Subject: [Intel-gfx] [PATCH] drm/i915/selftests: Force a preemption hang X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Inject a failure into preemption completion to pretend as if the HW didn't successfully handle preemption and we are forced to do a reset in the middle. v2: Wait for preemption, to force testing with the missed preemption. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Reviewed-by: Tvrtko Ursulin --- complete() not completion_done() to complete a completion. -Chris --- drivers/gpu/drm/i915/intel_guc_submission.c | 3 + drivers/gpu/drm/i915/intel_lrc.c | 3 + drivers/gpu/drm/i915/intel_ringbuffer.h | 27 +++++ drivers/gpu/drm/i915/selftests/intel_lrc.c | 115 ++++++++++++++++++++ 4 files changed, 148 insertions(+) diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c index cc444dc5f3ad..6bd1c3ee14a3 100644 --- a/drivers/gpu/drm/i915/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/intel_guc_submission.c @@ -628,6 +628,9 @@ static void complete_preempt_context(struct intel_engine_cs *engine) GEM_BUG_ON(!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT)); + if (inject_preempt_hang(execlists)) + return; + execlists_cancel_port_requests(execlists); execlists_unwind_incomplete_requests(execlists); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index f9fb32efb97f..c44072a35fb6 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -559,6 +559,9 @@ static void complete_preempt_context(struct intel_engine_execlists *execlists) { GEM_BUG_ON(!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT)); + if (inject_preempt_hang(execlists)) + return; + execlists_cancel_port_requests(execlists); __unwind_incomplete_requests(container_of(execlists, struct intel_engine_cs, diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 665b59ba1f45..f5ffa6d31e82 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -193,6 +193,11 @@ struct i915_priolist { int priority; }; +struct st_preempt_hang { + struct completion completion; + bool inject_hang; +}; + /** * struct intel_engine_execlists - execlist submission queue and port state * @@ -333,6 +338,8 @@ struct intel_engine_execlists { * @csb_head: context status buffer head */ u8 csb_head; + + I915_SELFTEST_DECLARE(struct st_preempt_hang preempt_hang;) }; #define INTEL_ENGINE_CS_MAX_NAME 8 @@ -1155,4 +1162,24 @@ void intel_disable_engine_stats(struct intel_engine_cs *engine); ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine); +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) + +static inline bool inject_preempt_hang(struct intel_engine_execlists *execlists) +{ + if (!execlists->preempt_hang.inject_hang) + return false; + + complete(&execlists->preempt_hang.completion); + return true; +} + +#else + +static inline bool inject_preempt_hang(struct intel_engine_execlists *execlists) +{ + return false; +} + +#endif + #endif /* _INTEL_RINGBUFFER_H_ */ diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c index 636cb68191e3..1843d331f4f1 100644 --- a/drivers/gpu/drm/i915/selftests/intel_lrc.c +++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c @@ -451,12 +451,127 @@ static int live_late_preempt(void *arg) goto err_ctx_lo; } +static int live_preempt_hang(void *arg) +{ + struct drm_i915_private *i915 = arg; + struct i915_gem_context *ctx_hi, *ctx_lo; + struct spinner spin_hi, spin_lo; + struct intel_engine_cs *engine; + enum intel_engine_id id; + int err = -ENOMEM; + + if (!HAS_LOGICAL_RING_PREEMPTION(i915)) + return 0; + + if (!intel_has_reset_engine(i915)) + return 0; + + mutex_lock(&i915->drm.struct_mutex); + + if (spinner_init(&spin_hi, i915)) + goto err_unlock; + + if (spinner_init(&spin_lo, i915)) + goto err_spin_hi; + + ctx_hi = kernel_context(i915); + if (!ctx_hi) + goto err_spin_lo; + ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY; + + ctx_lo = kernel_context(i915); + if (!ctx_lo) + goto err_ctx_hi; + ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY; + + for_each_engine(engine, i915, id) { + struct i915_request *rq; + + if (!intel_engine_has_preemption(engine)) + continue; + + rq = spinner_create_request(&spin_lo, ctx_lo, engine, + MI_ARB_CHECK); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto err_ctx_lo; + } + + i915_request_add(rq); + if (!wait_for_spinner(&spin_lo, rq)) { + GEM_TRACE("lo spinner failed to start\n"); + GEM_TRACE_DUMP(); + i915_gem_set_wedged(i915); + err = -EIO; + goto err_ctx_lo; + } + + rq = spinner_create_request(&spin_hi, ctx_hi, engine, + MI_ARB_CHECK); + if (IS_ERR(rq)) { + spinner_end(&spin_lo); + err = PTR_ERR(rq); + goto err_ctx_lo; + } + + init_completion(&engine->execlists.preempt_hang.completion); + engine->execlists.preempt_hang.inject_hang = true; + + i915_request_add(rq); + + if (!wait_for_completion_timeout(&engine->execlists.preempt_hang.completion, + HZ / 10)) { + pr_err("Preemption did not occur within timeout!"); + GEM_TRACE_DUMP(); + i915_gem_set_wedged(i915); + err = -EIO; + goto err_ctx_lo; + } + + set_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags); + i915_reset_engine(engine, NULL); + clear_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags); + + engine->execlists.preempt_hang.inject_hang = false; + + if (!wait_for_spinner(&spin_hi, rq)) { + GEM_TRACE("hi spinner failed to start\n"); + GEM_TRACE_DUMP(); + i915_gem_set_wedged(i915); + err = -EIO; + goto err_ctx_lo; + } + + spinner_end(&spin_hi); + spinner_end(&spin_lo); + if (igt_flush_test(i915, I915_WAIT_LOCKED)) { + err = -EIO; + goto err_ctx_lo; + } + } + + err = 0; +err_ctx_lo: + kernel_context_close(ctx_lo); +err_ctx_hi: + kernel_context_close(ctx_hi); +err_spin_lo: + spinner_fini(&spin_lo); +err_spin_hi: + spinner_fini(&spin_hi); +err_unlock: + igt_flush_test(i915, I915_WAIT_LOCKED); + mutex_unlock(&i915->drm.struct_mutex); + return err; +} + int intel_execlists_live_selftests(struct drm_i915_private *i915) { static const struct i915_subtest tests[] = { SUBTEST(live_sanitycheck), SUBTEST(live_preempt), SUBTEST(live_late_preempt), + SUBTEST(live_preempt_hang), }; if (!HAS_EXECLISTS(i915))