From patchwork Fri Sep 6 12:23:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 11135197 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CE7C113BD for ; Fri, 6 Sep 2019 12:23:27 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B6329206CD for ; Fri, 6 Sep 2019 12:23:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B6329206CD Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5A9286E284; Fri, 6 Sep 2019 12:23:27 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4F8236E288 for ; Fri, 6 Sep 2019 12:23:24 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 06 Sep 2019 05:23:20 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.64,473,1559545200"; d="scan'208";a="184523372" Received: from rosetta.fi.intel.com ([10.237.72.194]) by fmsmga007.fm.intel.com with ESMTP; 06 Sep 2019 05:23:18 -0700 Received: by rosetta.fi.intel.com (Postfix, from userid 1000) id 8301B84077D; Fri, 6 Sep 2019 15:23:17 +0300 (EEST) From: Mika Kuoppala To: intel-gfx@lists.freedesktop.org Date: Fri, 6 Sep 2019 15:23:13 +0300 Message-Id: <20190906122314.2146-1-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.17.1 Subject: [Intel-gfx] [PATCH 1/2] drm/i915: Use engine relative LRIs on context setup X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lucas De Marchi MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Daniele pointed out that relative mmio works differently in on context restore. Instead of adding the engine mmio base to offset, it masks out the base and adds bits [12:2] to current engine base. This should allow us to construct context register state to be applicable to all instances, including virtual. And avoid the trouble of updating the registers on virtual instances when submitting work. Bspec: 20206 Cc: Chris Wilson Cc: Tvrtko Ursulin Cc: Michal Wajdeczko Cc: Joonas Lahtinen Cc: Lucas De Marchi Cc: John Harrison Suggested-by: Daniele Ceraolo Spurio Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/i915/gt/intel_engine_types.h | 7 +++++ drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 2 ++ drivers/gpu/drm/i915/gt/intel_lrc.c | 27 ++++++++++++++------ 3 files changed, 28 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 15e02cb58a67..943f0663837e 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -481,6 +481,7 @@ struct intel_engine_cs { #define I915_ENGINE_HAS_SEMAPHORES BIT(3) #define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(4) #define I915_ENGINE_IS_VIRTUAL BIT(5) +#define I915_ENGINE_HAS_RELATIVE_MMIO BIT(6) unsigned int flags; /* @@ -576,6 +577,12 @@ intel_engine_is_virtual(const struct intel_engine_cs *engine) return engine->flags & I915_ENGINE_IS_VIRTUAL; } +static inline bool +intel_engine_has_relative_mmio(const struct intel_engine_cs * const engine) +{ + return engine->flags & I915_ENGINE_HAS_RELATIVE_MMIO; +} + #define instdone_has_slice(dev_priv___, sseu___, slice___) \ ((IS_GEN(dev_priv___, 7) ? 1 : ((sseu___)->slice_mask)) & BIT(slice___)) diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h index 86e00a2db8a4..e1b87a516ef8 100644 --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h @@ -132,6 +132,8 @@ * address/value pairs. Don't overdue it, though, x <= 2^4 must hold! */ #define MI_LOAD_REGISTER_IMM(x) MI_INSTR(0x22, 2*(x)-1) +/* Gen11+. addr = base + (ctx_restore ? offset & GENMASK(12:2) : offset) */ +#define MI_LRI_CS_MMIO (1<<19) #define MI_LRI_FORCE_POSTED (1<<12) #define MI_STORE_REGISTER_MEM MI_INSTR(0x24, 1) #define MI_STORE_REGISTER_MEM_GEN8 MI_INSTR(0x24, 2) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 87b7473a6dfb..6c68ed2bf3d2 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1214,7 +1214,10 @@ static void execlists_dequeue(struct intel_engine_cs *engine) unsigned int n; GEM_BUG_ON(READ_ONCE(ve->context.inflight)); - virtual_update_register_offsets(regs, engine); + + if (!intel_engine_has_relative_mmio(engine)) + virtual_update_register_offsets(regs, + engine); if (!list_empty(&ve->context.signals)) virtual_xfer_breadcrumbs(ve, engine); @@ -2939,6 +2942,10 @@ void intel_execlists_set_default_submission(struct intel_engine_cs *engine) if (HAS_LOGICAL_RING_PREEMPTION(engine->i915)) engine->flags |= I915_ENGINE_HAS_PREEMPTION; } + + engine->flags |= (engine->class != COPY_ENGINE_CLASS && + INTEL_GEN(engine->i915) >= 11) ? + I915_ENGINE_HAS_RELATIVE_MMIO : 0; } static void execlists_destroy(struct intel_engine_cs *engine) @@ -3130,8 +3137,10 @@ static void execlists_init_reg_state(u32 *regs, struct intel_ring *ring) { struct i915_ppgtt *ppgtt = vm_alias(ce->vm); - bool rcs = engine->class == RENDER_CLASS; - u32 base = engine->mmio_base; + const bool rcs = engine->class == RENDER_CLASS; + const u32 base = engine->mmio_base; + const u32 lri_base = intel_engine_has_relative_mmio(engine) ? + MI_LRI_CS_MMIO : 0; /* * A context is actually a big batch buffer with several @@ -3144,7 +3153,7 @@ static void execlists_init_reg_state(u32 *regs, * Must keep consistent with virtual_update_register_offsets(). */ regs[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(rcs ? 14 : 11) | - MI_LRI_FORCE_POSTED; + MI_LRI_FORCE_POSTED | lri_base; CTX_REG(regs, CTX_CONTEXT_CONTROL, RING_CONTEXT_CONTROL(base), _MASKED_BIT_DISABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT) | @@ -3191,7 +3200,8 @@ static void execlists_init_reg_state(u32 *regs, } } - regs[CTX_LRI_HEADER_1] = MI_LOAD_REGISTER_IMM(9) | MI_LRI_FORCE_POSTED; + regs[CTX_LRI_HEADER_1] = + MI_LOAD_REGISTER_IMM(9) | MI_LRI_FORCE_POSTED | lri_base; CTX_REG(regs, CTX_CTX_TIMESTAMP, RING_CTX_TIMESTAMP(base), 0); /* PDP values well be assigned later if needed */ @@ -3218,7 +3228,7 @@ static void execlists_init_reg_state(u32 *regs, } if (rcs) { - regs[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1); + regs[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1) | lri_base; CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE, 0); } @@ -3411,8 +3421,9 @@ static void virtual_engine_initial_hint(struct virtual_engine *ve) return; swap(ve->siblings[swp], ve->siblings[0]); - virtual_update_register_offsets(ve->context.lrc_reg_state, - ve->siblings[0]); + if (!intel_engine_has_relative_mmio(ve->siblings[0])) + virtual_update_register_offsets(ve->context.lrc_reg_state, + ve->siblings[0]); } static int virtual_context_pin(struct intel_context *ce) From patchwork Fri Sep 6 12:23:14 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Mika Kuoppala X-Patchwork-Id: 11135195 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8245315E6 for ; Fri, 6 Sep 2019 12:23:24 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 67903206CD for ; Fri, 6 Sep 2019 12:23:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 67903206CD Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 927C36E283; Fri, 6 Sep 2019 12:23:23 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7CF1E6E283 for ; Fri, 6 Sep 2019 12:23:21 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 06 Sep 2019 05:23:21 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.64,473,1559545200"; d="scan'208";a="174265161" Received: from rosetta.fi.intel.com ([10.237.72.194]) by orsmga007.jf.intel.com with ESMTP; 06 Sep 2019 05:23:18 -0700 Received: by rosetta.fi.intel.com (Postfix, from userid 1000) id 85F31840783; Fri, 6 Sep 2019 15:23:17 +0300 (EEST) From: Mika Kuoppala To: intel-gfx@lists.freedesktop.org Date: Fri, 6 Sep 2019 15:23:14 +0300 Message-Id: <20190906122314.2146-2-mika.kuoppala@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190906122314.2146-1-mika.kuoppala@linux.intel.com> References: <20190906122314.2146-1-mika.kuoppala@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 2/2] drm/i915/tgl: Register state context definition for Gen12 X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michel Thierry , Lucas De Marchi Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Michel Thierry Gen12 has subtle changes in the reg state context offsets (some fields are gone, some are in a different location), compared to previous Gens. The simplest approach seems to be keeping Gen12 (and future platform) changes apart from the previous gens, while keeping the registers that are contiguous in functions we can reuse. v2: alias, virtual engine, rpcs, prune unused regs v3: use engine base (Daniele), take ctx_bb for all Bspec: 46255 Cc: Michal Wajdeczko Cc: Daniele Ceraolo Spurio Cc: Chris Wilson Cc: Joonas Lahtinen Cc: José Roberto de Souza Signed-off-by: Michel Thierry Signed-off-by: Lucas De Marchi Signed-off-by: Mika Kuoppala Reviewed-by: Daniele Ceraolo Spurio Tested-by: Daniele Ceraolo Spurio --- drivers/gpu/drm/i915/gt/intel_lrc.c | 196 +++++++++++++++++------- drivers/gpu/drm/i915/gt/intel_lrc_reg.h | 6 +- 2 files changed, 147 insertions(+), 55 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 6c68ed2bf3d2..e9c873877253 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -808,8 +808,11 @@ static void virtual_update_register_offsets(u32 *regs, { u32 base = engine->mmio_base; + GEM_WARN_ON(engine->class == COPY_ENGINE_CLASS); + /* Must match execlists_init_reg_state()! */ + /* Common part */ regs[CTX_CONTEXT_CONTROL] = i915_mmio_reg_offset(RING_CONTEXT_CONTROL(base)); regs[CTX_RING_HEAD] = i915_mmio_reg_offset(RING_HEAD(base)); @@ -820,13 +823,16 @@ static void virtual_update_register_offsets(u32 *regs, regs[CTX_BB_HEAD_U] = i915_mmio_reg_offset(RING_BBADDR_UDW(base)); regs[CTX_BB_HEAD_L] = i915_mmio_reg_offset(RING_BBADDR(base)); regs[CTX_BB_STATE] = i915_mmio_reg_offset(RING_BBSTATE(base)); + regs[CTX_SECOND_BB_HEAD_U] = i915_mmio_reg_offset(RING_SBBADDR_UDW(base)); regs[CTX_SECOND_BB_HEAD_L] = i915_mmio_reg_offset(RING_SBBADDR(base)); regs[CTX_SECOND_BB_STATE] = i915_mmio_reg_offset(RING_SBBSTATE(base)); + /* PPGTT part */ regs[CTX_CTX_TIMESTAMP] = i915_mmio_reg_offset(RING_CTX_TIMESTAMP(base)); + regs[CTX_PDP3_UDW] = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(base, 3)); regs[CTX_PDP3_LDW] = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(base, 3)); regs[CTX_PDP2_UDW] = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(base, 2)); @@ -3123,37 +3129,13 @@ static u32 intel_lr_indirect_ctx_offset(struct intel_engine_cs *engine) return indirect_ctx_offset; } -static struct i915_ppgtt *vm_alias(struct i915_address_space *vm) -{ - if (i915_is_ggtt(vm)) - return i915_vm_to_ggtt(vm)->alias; - else - return i915_vm_to_ppgtt(vm); -} -static void execlists_init_reg_state(u32 *regs, - struct intel_context *ce, - struct intel_engine_cs *engine, - struct intel_ring *ring) +static void init_common_reg_state(u32 * const regs, + struct i915_ppgtt * const ppgtt, + struct intel_engine_cs *engine, + struct intel_ring *ring) { - struct i915_ppgtt *ppgtt = vm_alias(ce->vm); - const bool rcs = engine->class == RENDER_CLASS; const u32 base = engine->mmio_base; - const u32 lri_base = intel_engine_has_relative_mmio(engine) ? - MI_LRI_CS_MMIO : 0; - - /* - * A context is actually a big batch buffer with several - * MI_LOAD_REGISTER_IMM commands followed by (reg, value) pairs. The - * values we are setting here are only for the first context restore: - * on a subsequent save, the GPU will recreate this batchbuffer with new - * values (including all the missing MI_LOAD_REGISTER_IMM commands that - * we are not initializing here). - * - * Must keep consistent with virtual_update_register_offsets(). - */ - regs[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(rcs ? 14 : 11) | - MI_LRI_FORCE_POSTED | lri_base; CTX_REG(regs, CTX_CONTEXT_CONTROL, RING_CONTEXT_CONTROL(base), _MASKED_BIT_DISABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT) | @@ -3171,39 +3153,43 @@ static void execlists_init_reg_state(u32 *regs, CTX_REG(regs, CTX_BB_HEAD_U, RING_BBADDR_UDW(base), 0); CTX_REG(regs, CTX_BB_HEAD_L, RING_BBADDR(base), 0); CTX_REG(regs, CTX_BB_STATE, RING_BBSTATE(base), RING_BB_PPGTT); - CTX_REG(regs, CTX_SECOND_BB_HEAD_U, RING_SBBADDR_UDW(base), 0); - CTX_REG(regs, CTX_SECOND_BB_HEAD_L, RING_SBBADDR(base), 0); - CTX_REG(regs, CTX_SECOND_BB_STATE, RING_SBBSTATE(base), 0); - if (rcs) { - struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx; - - CTX_REG(regs, CTX_RCS_INDIRECT_CTX, RING_INDIRECT_CTX(base), 0); - CTX_REG(regs, CTX_RCS_INDIRECT_CTX_OFFSET, - RING_INDIRECT_CTX_OFFSET(base), 0); - if (wa_ctx->indirect_ctx.size) { - u32 ggtt_offset = i915_ggtt_offset(wa_ctx->vma); +} - regs[CTX_RCS_INDIRECT_CTX + 1] = - (ggtt_offset + wa_ctx->indirect_ctx.offset) | - (wa_ctx->indirect_ctx.size / CACHELINE_BYTES); +static void init_wa_bb_reg_state(u32 * const regs, + struct intel_engine_cs *engine, + u32 pos_bb_per_ctx) +{ + struct i915_ctx_workarounds * const wa_ctx = &engine->wa_ctx; + const u32 base = engine->mmio_base; + const u32 pos_indirect_ctx = pos_bb_per_ctx + 2; + const u32 pos_indirect_ctx_offset = pos_indirect_ctx + 2; - regs[CTX_RCS_INDIRECT_CTX_OFFSET + 1] = - intel_lr_indirect_ctx_offset(engine) << 6; - } + CTX_REG(regs, pos_indirect_ctx, RING_INDIRECT_CTX(base), 0); + CTX_REG(regs, pos_indirect_ctx_offset, + RING_INDIRECT_CTX_OFFSET(base), 0); + if (wa_ctx->indirect_ctx.size) { + const u32 ggtt_offset = i915_ggtt_offset(wa_ctx->vma); - CTX_REG(regs, CTX_BB_PER_CTX_PTR, RING_BB_PER_CTX_PTR(base), 0); - if (wa_ctx->per_ctx.size) { - u32 ggtt_offset = i915_ggtt_offset(wa_ctx->vma); + regs[pos_indirect_ctx + 1] = + (ggtt_offset + wa_ctx->indirect_ctx.offset) | + (wa_ctx->indirect_ctx.size / CACHELINE_BYTES); - regs[CTX_BB_PER_CTX_PTR + 1] = - (ggtt_offset + wa_ctx->per_ctx.offset) | 0x01; - } + regs[pos_indirect_ctx_offset + 1] = + intel_lr_indirect_ctx_offset(engine) << 6; } - regs[CTX_LRI_HEADER_1] = - MI_LOAD_REGISTER_IMM(9) | MI_LRI_FORCE_POSTED | lri_base; + CTX_REG(regs, pos_bb_per_ctx, RING_BB_PER_CTX_PTR(base), 0); + if (wa_ctx->per_ctx.size) { + const u32 ggtt_offset = i915_ggtt_offset(wa_ctx->vma); - CTX_REG(regs, CTX_CTX_TIMESTAMP, RING_CTX_TIMESTAMP(base), 0); + regs[pos_bb_per_ctx + 1] = + (ggtt_offset + wa_ctx->per_ctx.offset) | 0x01; + } +} + +static void init_ppgtt_reg_state(u32 *regs, u32 base, + struct i915_ppgtt *ppgtt) +{ /* PDP values well be assigned later if needed */ CTX_REG(regs, CTX_PDP3_UDW, GEN8_RING_PDP_UDW(base, 3), 0); CTX_REG(regs, CTX_PDP3_LDW, GEN8_RING_PDP_LDW(base, 3), 0); @@ -3226,6 +3212,53 @@ static void execlists_init_reg_state(u32 *regs, ASSIGN_CTX_PDP(ppgtt, regs, 1); ASSIGN_CTX_PDP(ppgtt, regs, 0); } +} + +static struct i915_ppgtt *vm_alias(struct i915_address_space *vm) +{ + if (i915_is_ggtt(vm)) + return i915_vm_to_ggtt(vm)->alias; + else + return i915_vm_to_ppgtt(vm); +} + +static void gen8_init_reg_state(u32 * const regs, + struct intel_context *ce, + struct intel_engine_cs *engine, + struct intel_ring *ring) +{ + struct i915_ppgtt * const ppgtt = vm_alias(ce->vm); + const bool rcs = engine->class == RENDER_CLASS; + const u32 base = engine->mmio_base; + const u32 lri_base = intel_engine_has_relative_mmio(engine) ? + MI_LRI_CS_MMIO : 0; + + /* + * A context is actually a big batch buffer with several + * MI_LOAD_REGISTER_IMM commands followed by (reg, value) pairs. The + * values we are setting here are only for the first context restore: + * on a subsequent save, the GPU will recreate this batchbuffer with new + * values (including all the missing MI_LOAD_REGISTER_IMM commands that + * we are not initializing here). + * + * Must keep consistent with virtual_update_register_offsets(). + */ + regs[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(rcs ? 14 : 11) | + MI_LRI_FORCE_POSTED | lri_base; + + init_common_reg_state(regs, ppgtt, engine, ring); + CTX_REG(regs, CTX_SECOND_BB_HEAD_U, RING_SBBADDR_UDW(base), 0); + CTX_REG(regs, CTX_SECOND_BB_HEAD_L, RING_SBBADDR(base), 0); + CTX_REG(regs, CTX_SECOND_BB_STATE, RING_SBBSTATE(base), 0); + if (rcs) + init_wa_bb_reg_state(regs, engine, CTX_BB_PER_CTX_PTR); + + regs[CTX_LRI_HEADER_1] = + MI_LOAD_REGISTER_IMM(9) | MI_LRI_FORCE_POSTED | lri_base; + + CTX_REG(regs, CTX_CTX_TIMESTAMP, RING_CTX_TIMESTAMP(base), 0); + + init_ppgtt_reg_state(regs, base, ppgtt); if (rcs) { regs[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1) | lri_base; @@ -3237,6 +3270,61 @@ static void execlists_init_reg_state(u32 *regs, regs[CTX_END] |= BIT(0); } +static void gen12_init_reg_state(u32 * const regs, + struct intel_context *ce, + struct intel_engine_cs *engine, + struct intel_ring *ring) +{ + struct i915_ppgtt * const ppgtt = i915_vm_to_ppgtt(ce->vm); + const bool rcs = engine->class == RENDER_CLASS; + const u32 base = engine->mmio_base; + const u32 lri_base = intel_engine_has_relative_mmio(engine) ? + MI_LRI_CS_MMIO : 0; + + GEM_DEBUG_EXEC(DRM_INFO_ONCE("Using GEN12 Register State Context\n")); + + regs[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(rcs ? 11 : 9) | + MI_LRI_FORCE_POSTED | lri_base; + + init_common_reg_state(regs, ppgtt, engine, ring); + + /* We want ctx_ptr for all engines to be set */ + init_wa_bb_reg_state(regs, engine, GEN12_CTX_BB_PER_CTX_PTR); + + regs[CTX_LRI_HEADER_1] = MI_LOAD_REGISTER_IMM(9) | + MI_LRI_FORCE_POSTED | lri_base; + + CTX_REG(regs, CTX_CTX_TIMESTAMP, RING_CTX_TIMESTAMP(base), 0); + + init_ppgtt_reg_state(regs, base, ppgtt); + + if (rcs) { + regs[GEN12_CTX_LRI_HEADER_3] = MI_LOAD_REGISTER_IMM(1) | + lri_base; + CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE, 0); + + /* TODO: oa_init_reg_state ? */ + } +} + +static void execlists_init_reg_state(u32 *regs, + struct intel_context *ce, + struct intel_engine_cs *engine, + struct intel_ring *ring) +{ + /* A context is actually a big batch buffer with several + * MI_LOAD_REGISTER_IMM commands followed by (reg, value) pairs. The + * values we are setting here are only for the first context restore: + * on a subsequent save, the GPU will recreate this batchbuffer with new + * values (including all the missing MI_LOAD_REGISTER_IMM commands that + * we are not initializing here). + */ + if (INTEL_GEN(engine->i915) >= 12) + gen12_init_reg_state(regs, ce, engine, ring); + else + gen8_init_reg_state(regs, ce, engine, ring); +} + static int populate_lr_context(struct intel_context *ce, struct drm_i915_gem_object *ctx_obj, diff --git a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h index b8f20ad71169..68caf8541866 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h +++ b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h @@ -9,7 +9,7 @@ #include -/* GEN8+ Reg State Context */ +/* GEN8 to GEN11 Reg State Context */ #define CTX_LRI_HEADER_0 0x01 #define CTX_CONTEXT_CONTROL 0x02 #define CTX_RING_HEAD 0x04 @@ -39,6 +39,10 @@ #define CTX_R_PWR_CLK_STATE 0x42 #define CTX_END 0x44 +/* GEN12+ Reg State Context */ +#define GEN12_CTX_BB_PER_CTX_PTR 0x12 +#define GEN12_CTX_LRI_HEADER_3 0x41 + #define CTX_REG(reg_state, pos, reg, val) do { \ u32 *reg_state__ = (reg_state); \ const u32 pos__ = (pos); \