From patchwork Fri Jul 6 15:52:00 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Lis, Tomasz" X-Patchwork-Id: 10512019 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 4643660325 for ; Fri, 6 Jul 2018 15:52:07 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 348CA286BA for ; Fri, 6 Jul 2018 15:52:07 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 286C8286D4; Fri, 6 Jul 2018 15:52:07 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00, MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 14689286BA for ; Fri, 6 Jul 2018 15:52:06 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 46CE76E26B; Fri, 6 Jul 2018 15:52:05 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id C0AF66E26B for ; Fri, 6 Jul 2018 15:52:04 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 06 Jul 2018 08:52:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,316,1526367600"; d="scan'208";a="70189476" Received: from szara.igk.intel.com ([172.28.178.192]) by fmsmga001.fm.intel.com with ESMTP; 06 Jul 2018 08:52:01 -0700 From: Tomasz Lis To: intel-gfx@lists.freedesktop.org Date: Fri, 6 Jul 2018 17:52:00 +0200 Message-Id: <1530892320-8412-1-git-send-email-tomasz.lis@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1522163879-10837-1-git-send-email-tomasz.lis@intel.com> References: <1522163879-10837-1-git-send-email-tomasz.lis@intel.com> Subject: [Intel-gfx] [PATCH v5] drm/i915/gen11: Preempt-to-idle support in execlists. X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mika Kuoppala MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP The patch adds support of preempt-to-idle requesting by setting a proper bit within Execlist Control Register, and receiving preemption result from Context Status Buffer. Preemption in previous gens required a special batch buffer to be executed, so the Command Streamer never preempted to idle directly. In Icelake it is possible, as there is a hardware mechanism to inform the kernel about status of the preemption request. This patch does not cover using the new preemption mechanism when GuC is active. v2: Added needs_preempt_context() change so that it is not created when preempt-to-idle is supported. (Chris) Updated setting HWACK flag so that it is cleared after preempt-to-dle. (Chris, Daniele) Updated to use I915_ENGINE_HAS_PREEMPTION flag. (Chris) v3: Fixed needs_preempt_context() change. (Chris) Merged preemption trigger functions to one. (Chris) Fixed conyext state tonot assume COMPLETED_MASK after preemption, since idle-to-idle case will not have it set. v4: Simplified needs_preempt_context() change. (Daniele) Removed clearing HWACK flag in idle-to-idle preempt. (Daniele) v5: Renamed inject_preempt_context(). (Daniele) Removed duplicated GEM_BUG_ON() on HWACK (Daniele) Bspec: 18922 Cc: Joonas Lahtinen Cc: Chris Wilson Cc: Daniele Ceraolo Spurio Cc: Michal Winiarski Cc: Mika Kuoppala Reviewed-by: Daniele Ceraolo Spurio Signed-off-by: Tomasz Lis --- drivers/gpu/drm/i915/i915_drv.h | 2 + drivers/gpu/drm/i915/i915_gem_context.c | 3 +- drivers/gpu/drm/i915/i915_pci.c | 3 +- drivers/gpu/drm/i915/intel_device_info.h | 1 + drivers/gpu/drm/i915/intel_lrc.c | 114 ++++++++++++++++++++----------- drivers/gpu/drm/i915/intel_lrc.h | 1 + 6 files changed, 84 insertions(+), 40 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 91a7e4f..c84a66a 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2533,6 +2533,8 @@ intel_info(const struct drm_i915_private *dev_priv) ((dev_priv)->info.has_logical_ring_elsq) #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \ ((dev_priv)->info.has_logical_ring_preemption) +#define HAS_HW_PREEMPT_TO_IDLE(dev_priv) \ + ((dev_priv)->info.has_hw_preempt_to_idle) #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index b10770c..bf7faa7 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -464,7 +464,8 @@ destroy_kernel_context(struct i915_gem_context **ctxp) static bool needs_preempt_context(struct drm_i915_private *i915) { - return HAS_LOGICAL_RING_PREEMPTION(i915); + return HAS_LOGICAL_RING_PREEMPTION(i915) && + !HAS_HW_PREEMPT_TO_IDLE(i915); } int i915_gem_contexts_init(struct drm_i915_private *dev_priv) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 55543f1..2da7e77 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -593,7 +593,8 @@ static const struct intel_device_info intel_cannonlake_info = { GEN(11), \ .ddb_size = 2048, \ .has_csr = 0, \ - .has_logical_ring_elsq = 1 + .has_logical_ring_elsq = 1, \ + .has_hw_preempt_to_idle = 1 static const struct intel_device_info intel_icelake_11_info = { GEN11_FEATURES, diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index 633f9fb..0be7e03 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -98,6 +98,7 @@ enum intel_platform { func(has_logical_ring_contexts); \ func(has_logical_ring_elsq); \ func(has_logical_ring_preemption); \ + func(has_hw_preempt_to_idle); \ func(has_overlay); \ func(has_pooled_eu); \ func(has_psr); \ diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index ab89dab..aed4aeb 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -155,6 +155,7 @@ #define GEN8_CTX_STATUS_ACTIVE_IDLE (1 << 3) #define GEN8_CTX_STATUS_COMPLETE (1 << 4) #define GEN8_CTX_STATUS_LITE_RESTORE (1 << 15) +#define GEN11_CTX_STATUS_PREEMPT_IDLE (1 << 29) #define GEN8_CTX_STATUS_COMPLETED_MASK \ (GEN8_CTX_STATUS_COMPLETE | GEN8_CTX_STATUS_PREEMPTED) @@ -525,34 +526,49 @@ static void port_assign(struct execlist_port *port, struct i915_request *rq) port_set(port, port_pack(i915_request_get(rq), port_count(port))); } -static void inject_preempt_context(struct intel_engine_cs *engine) +static void execlist_send_preempt_to_idle(struct intel_engine_cs *engine) { struct intel_engine_execlists *execlists = &engine->execlists; - struct intel_context *ce = - to_intel_context(engine->i915->preempt_context, engine); - unsigned int n; - - GEM_BUG_ON(execlists->preempt_complete_status != - upper_32_bits(ce->lrc_desc)); - GEM_BUG_ON((ce->lrc_reg_state[CTX_CONTEXT_CONTROL + 1] & - _MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT | - CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT)) != - _MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT | - CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT)); - /* - * Switch to our empty preempt context so - * the state of the GPU is known (idle). - */ GEM_TRACE("%s\n", engine->name); - for (n = execlists_num_ports(execlists); --n; ) - write_desc(execlists, 0, n); + if (HAS_HW_PREEMPT_TO_IDLE(engine->i915)) { + /* + * hardware which HAS_HW_PREEMPT_TO_IDLE(), always also + * HAS_LOGICAL_RING_ELSQ(), so we can assume ctrl_reg is set + */ + GEM_BUG_ON(execlists->ctrl_reg == NULL); + + /* + * If we have hardware preempt-to-idle, we do not need to + * inject any job to the hardware. We only set a flag. + */ + writel(EL_CTRL_PREEMPT_TO_IDLE, execlists->ctrl_reg); + } else { + struct intel_context *ce = + to_intel_context(engine->i915->preempt_context, engine); + unsigned int n; - write_desc(execlists, ce->lrc_desc, n); + GEM_BUG_ON(execlists->preempt_complete_status != + upper_32_bits(ce->lrc_desc)); + GEM_BUG_ON((ce->lrc_reg_state[CTX_CONTEXT_CONTROL + 1] & + _MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT | + CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT)) != + _MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT | + CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT)); - /* we need to manually load the submit queue */ - if (execlists->ctrl_reg) - writel(EL_CTRL_LOAD, execlists->ctrl_reg); + /* + * Switch to our empty preempt context so + * the state of the GPU is known (idle). + */ + for (n = execlists_num_ports(execlists); --n; ) + write_desc(execlists, 0, n); + + write_desc(execlists, ce->lrc_desc, n); + + /* we need to manually load the submit queue */ + if (execlists->ctrl_reg) + writel(EL_CTRL_LOAD, execlists->ctrl_reg); + } execlists_clear_active(execlists, EXECLISTS_ACTIVE_HWACK); execlists_set_active(execlists, EXECLISTS_ACTIVE_PREEMPT); @@ -627,7 +643,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) return; if (need_preempt(engine, last, execlists->queue_priority)) { - inject_preempt_context(engine); + execlist_send_preempt_to_idle(engine); return; } @@ -1020,22 +1036,43 @@ static void process_csb(struct intel_engine_cs *engine) execlists->active); status = buf[2 * head]; - if (status & (GEN8_CTX_STATUS_IDLE_ACTIVE | - GEN8_CTX_STATUS_PREEMPTED)) - execlists_set_active(execlists, - EXECLISTS_ACTIVE_HWACK); - if (status & GEN8_CTX_STATUS_ACTIVE_IDLE) - execlists_clear_active(execlists, - EXECLISTS_ACTIVE_HWACK); - - if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK)) - continue; + /* + * Check if preempted from idle to idle directly. + * The STATUS_IDLE_ACTIVE flag is used to mark + * such transition. + */ + if ((status & GEN8_CTX_STATUS_IDLE_ACTIVE) && + (status & GEN11_CTX_STATUS_PREEMPT_IDLE)) { + + /* + * We could not have COMPLETED anything + * if we were idle before preemption. + */ + GEM_BUG_ON(status & GEN8_CTX_STATUS_COMPLETED_MASK); + } else { + if (status & (GEN8_CTX_STATUS_IDLE_ACTIVE | + GEN8_CTX_STATUS_PREEMPTED)) + execlists_set_active(execlists, + EXECLISTS_ACTIVE_HWACK); + + if (status & GEN8_CTX_STATUS_ACTIVE_IDLE) + execlists_clear_active(execlists, + EXECLISTS_ACTIVE_HWACK); + + if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK)) + continue; - /* We should never get a COMPLETED | IDLE_ACTIVE! */ - GEM_BUG_ON(status & GEN8_CTX_STATUS_IDLE_ACTIVE); + /* We should never get a COMPLETED | IDLE_ACTIVE! */ + GEM_BUG_ON(status & GEN8_CTX_STATUS_IDLE_ACTIVE); + } - if (status & GEN8_CTX_STATUS_COMPLETE && - buf[2*head + 1] == execlists->preempt_complete_status) { + /* + * Check if preempted to real idle, either directly or + * the preemptive context already finished executing + */ + if ((status & GEN11_CTX_STATUS_PREEMPT_IDLE) || + (status & GEN8_CTX_STATUS_COMPLETE && + buf[2*head + 1] == execlists->preempt_complete_status)) { GEM_TRACE("%s preempt-idle\n", engine->name); complete_preempt_context(execlists); continue; @@ -2377,7 +2414,8 @@ static void execlists_set_default_submission(struct intel_engine_cs *engine) engine->unpark = NULL; engine->flags |= I915_ENGINE_SUPPORTS_STATS; - if (engine->i915->preempt_context) + if (engine->i915->preempt_context || + HAS_HW_PREEMPT_TO_IDLE(engine->i915)) engine->flags |= I915_ENGINE_HAS_PREEMPTION; engine->i915->caps.scheduler = diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h index 1593194..3249e9b 100644 --- a/drivers/gpu/drm/i915/intel_lrc.h +++ b/drivers/gpu/drm/i915/intel_lrc.h @@ -45,6 +45,7 @@ #define RING_EXECLIST_SQ_CONTENTS(engine) _MMIO((engine)->mmio_base + 0x510) #define RING_EXECLIST_CONTROL(engine) _MMIO((engine)->mmio_base + 0x550) #define EL_CTRL_LOAD (1 << 0) +#define EL_CTRL_PREEMPT_TO_IDLE (1 << 1) /* The docs specify that the write pointer wraps around after 5h, "After status * is written out to the last available status QW at offset 5h, this pointer