From patchwork Mon Apr 25 15:23:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12825898 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 18F84C433EF for ; Mon, 25 Apr 2022 15:22:29 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D27BB10EFC9; Mon, 25 Apr 2022 15:22:27 +0000 (UTC) Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3141310EF6A; Mon, 25 Apr 2022 15:22:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1650900143; x=1682436143; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+gkSllZRfRS0AkEyN2TQaHqveCxY8bhyICvlmipDVlI=; b=nh5e5OydYTHz76hKGH87UZHomTACnKs27BgMoGe8Rlr8xZ0mTKdbgUhr UFjdvuzGDhRxkM9NrpMnQWFXiGxm98KkG4dKJOK4yUabP5VH1S8UcFiek Ta0Hdt8VmVNMgrGYP5JVfqb0aZmtsN4YfI3mkpcL4aKzWQZ5fxA7+EfuD m335vNWKNJTFFuhRm49+F34KBcOMQ/rXWcr1DeNo9EvYxUIf+4KcRxSNv zyQiul/7fIXZRUiu9jyzOvC1h3nn2QcdTTEi6/o8Zh01WeNM/tEvOSVWZ RG4DHMxE8ObpqRcyDJkdNgdMV0XVN1q1lojDTykJUd/iZeL0X3DD7QgzX w==; X-IronPort-AV: E=McAfee;i="6400,9594,10328"; a="262875132" X-IronPort-AV: E=Sophos;i="5.90,288,1643702400"; d="scan'208";a="262875132" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Apr 2022 08:22:22 -0700 X-IronPort-AV: E=Sophos;i="5.90,288,1643702400"; d="scan'208";a="512681577" Received: from ramaling-i9x.iind.intel.com ([10.203.144.108]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Apr 2022 08:22:20 -0700 From: Ramalingam C To: intel-gfx , dri-devel Subject: [PATCH 1/3] drm/i915/xehpsdv/dg1/tgl: Fix issue with LRI relative addressing Date: Mon, 25 Apr 2022 20:53:15 +0530 Message-Id: <20220425152317.4275-2-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20220425152317.4275-1-ramalingam.c@intel.com> References: <20220425152317.4275-1-ramalingam.c@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Akeem G Abodunrin , Hellstrom Thomas , Prathap Kumar Valsan Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Akeem G Abodunrin When bit 19 of MI_LOAD_REGISTER_IMM instruction opcode is set on tgl+ devices, HW does not care about certain register address offsets, but instead check the following for valid address ranges on specific engines: RCS && CCS: BITS(0 - 10) BCS: BITS(0 - 11) VECS && VCS: BITS(0 - 13) Also, tgl+ now support relative addressing for BCS engine - So, this patch fixes issue with live_gt_lrc selftest that is failing where there is mismatch between LRC register layout generated during init and HW default register offsets. Signed-off-by: Akeem G Abodunrin cc: Prathap Kumar Valsan Signed-off-by: Ramalingam C Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/gt/selftest_lrc.c | 36 +++++++++++++++++++++++++- 1 file changed, 35 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 6ba52ef1acb8..8dc7b88cdca0 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -128,6 +128,27 @@ static int context_flush(struct intel_context *ce, long timeout) return err; } +static int get_lri_mask(struct intel_engine_cs *engine, u32 lri) +{ + if ((lri & MI_LRI_LRM_CS_MMIO) == 0) + return ~0u; + + if (GRAPHICS_VER(engine->i915) < 12) + return 0xfff; + + switch (engine->class) { + default: + case RENDER_CLASS: + case COMPUTE_CLASS: + return 0x07ff; + case COPY_ENGINE_CLASS: + return 0x0fff; + case VIDEO_DECODE_CLASS: + case VIDEO_ENHANCEMENT_CLASS: + return 0x3fff; + } +} + static int live_lrc_layout(void *arg) { struct intel_gt *gt = arg; @@ -167,6 +188,7 @@ static int live_lrc_layout(void *arg) dw = 0; do { u32 lri = READ_ONCE(hw[dw]); + u32 lri_mask; if (lri == 0) { dw++; @@ -194,6 +216,18 @@ static int live_lrc_layout(void *arg) break; } + /* + * When bit 19 of MI_LOAD_REGISTER_IMM instruction + * opcode is set on Gen12+ devices, HW does not + * care about certain register address offsets, and + * instead check the following for valid address + * ranges on specific engines: + * RCS && CCS: BITS(0 - 10) + * BCS: BITS(0 - 11) + * VECS && VCS: BITS(0 - 13) + */ + lri_mask = get_lri_mask(engine, lri); + lri &= 0x7f; lri++; dw++; @@ -201,7 +235,7 @@ static int live_lrc_layout(void *arg) while (lri) { u32 offset = READ_ONCE(hw[dw]); - if (offset != lrc[dw]) { + if ((offset ^ lrc[dw]) & lri_mask) { pr_err("%s: Different registers found at dword %d, expected %x, found %x\n", engine->name, dw, offset, lrc[dw]); err = -EINVAL; From patchwork Mon Apr 25 15:23:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12825899 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3379CC433EF for ; Mon, 25 Apr 2022 15:22:31 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9597010EFE2; Mon, 25 Apr 2022 15:22:28 +0000 (UTC) Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8E9BB10EFC9; Mon, 25 Apr 2022 15:22:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1650900145; x=1682436145; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=s6WSnVVhmtHQd64IoTOJzWVy5YJSxb7gckdNJl86SQo=; b=aAWabwl7RgVnQ71E2tWJXOuZIRgR56Wwmo0Aogzr1YBoiALsVLyLb23G ANtVKGC32te3lUB9SQy7BDpZfT6MHjgOc+CfDk6Agk0KijHOI6k+TNdIQ XGy7tAIVTbYqu7EhYk1tGSuEfY7+uwvA0F3SVrxW61YegjfDvEB1x5zni Y1vJftvkB24+nFs5I64BofOJOrOs4g4J/H/7SKnNAHbSqekAlLelXsqOc Cn345mMBL6ykyQA/k1IGDYwlmoQ1sSEWU1cLLZYij+zsw6JDKEuPnt7A5 LEFT933VzS1yBq666yuwmoULsFtUH0TAKBFrnWn0kb7pe8Ul1mqHxYeh1 w==; X-IronPort-AV: E=McAfee;i="6400,9594,10328"; a="262875142" X-IronPort-AV: E=Sophos;i="5.90,288,1643702400"; d="scan'208";a="262875142" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Apr 2022 08:22:25 -0700 X-IronPort-AV: E=Sophos;i="5.90,288,1643702400"; d="scan'208";a="512681600" Received: from ramaling-i9x.iind.intel.com ([10.203.144.108]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Apr 2022 08:22:22 -0700 From: Ramalingam C To: intel-gfx , dri-devel Subject: [PATCH 2/3] drm/i915/selftests: Skip poisoning SET_PREDICATE_RESULT on dg2 Date: Mon, 25 Apr 2022 20:53:16 +0530 Message-Id: <20220425152317.4275-3-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20220425152317.4275-1-ramalingam.c@intel.com> References: <20220425152317.4275-1-ramalingam.c@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: CQ Tang , Hellstrom Thomas , Chris Wilson Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Chris Wilson When predication is enabled all commands baring a few (such as MI_BB_END) are nop'ed. If we accidentally enable predication while poisoning the context, not only is the rest of the poisoning skipped (thus disabling the test), but the closing instructions of the poison request are nop'ed. Not only do we then not signal the waiting context, but we even prevent re-enabling arbitration and the GPU will not perform a context switch at the end of the request. Cc: Joonas Lahtinen Suggested-by: CQ Tang Signed-off-by: Chris Wilson Signed-off-by: Ramalingam C Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/gt/intel_engine_regs.h | 1 + drivers/gpu/drm/i915/gt/selftest_lrc.c | 17 ++++++++++++++++- 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_regs.h b/drivers/gpu/drm/i915/gt/intel_engine_regs.h index 594a629cb28f..1dab554bf640 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_regs.h @@ -193,6 +193,7 @@ #define RING_TIMESTAMP_UDW(base) _MMIO((base) + 0x358 + 4) #define RING_CONTEXT_STATUS_PTR(base) _MMIO((base) + 0x3a0) #define RING_CTX_TIMESTAMP(base) _MMIO((base) + 0x3a8) /* gen8+ */ +#define RING_PREDICATE_RESULT(base) _MMIO((base) + 0x3b8) #define RING_FORCE_TO_NONPRIV(base, i) _MMIO(((base) + 0x4D0) + (i) * 4) #define RING_FORCE_TO_NONPRIV_ADDRESS_MASK REG_GENMASK(25, 2) #define RING_FORCE_TO_NONPRIV_ACCESS_RW (0 << 28) /* CFL+ & Gen11+ */ diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 8dc7b88cdca0..8b2c11dbe354 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -945,6 +945,19 @@ create_user_vma(struct i915_address_space *vm, unsigned long size) return vma; } +static u32 safe_poison(u32 offset, u32 poison) +{ + /* + * Do not enable predication as it will nop all subsequent commands, + * not only disabling the tests (by preventing all the other SRM) but + * also preventing the arbitration events at the end of the request. + */ + if (offset == i915_mmio_reg_offset(RING_PREDICATE_RESULT(0))) + poison &= ~REG_BIT(0); + + return poison; +} + static struct i915_vma * store_context(struct intel_context *ce, struct i915_vma *scratch) { @@ -1154,7 +1167,9 @@ static struct i915_vma *load_context(struct intel_context *ce, u32 poison) *cs++ = MI_LOAD_REGISTER_IMM(len); while (len--) { *cs++ = hw[dw]; - *cs++ = poison; + *cs++ = safe_poison(hw[dw] & get_lri_mask(ce->engine, + MI_LRI_LRM_CS_MMIO), + poison); dw += 2; } } while (dw < PAGE_SIZE / sizeof(u32) && From patchwork Mon Apr 25 15:23:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12825900 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E6FDFC433FE for ; Mon, 25 Apr 2022 15:22:35 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DBB4E10F03D; Mon, 25 Apr 2022 15:22:32 +0000 (UTC) Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3C26510EF99; Mon, 25 Apr 2022 15:22:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1650900148; x=1682436148; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aaf2x6K+fQUmMDf9tRLPoX2TAPBD6mly3gZDdWzDB+Q=; b=TLXkft9pfbpx6IOMZnxvpAwPyEJ5KCMEZkF1SCQgAeTGN9tdmDF7TSdI dcfFHfra7BCfa/VWgGwaUZN35Kjr1t5jWN/QwiuYiXgkp6kjWP2j3Zwq7 0/8sOTPVYLNrKq/BxLYUJIX2kTXkpEbPFlkgJrtkqAQfH0GYDC+zd8guW Hnne3qlQBkRhL/R1Ip2aFRsMe+GnkMQuBP50diHKFKosF0mxS3t2al3ok x1hj1giG2zBf8K8nVtDfApPlCqR9aNByWL0eJWn36IAo6Gcou4uJJ/jo4 6G39J/bsr0x/w/N/d2GkSwT94w+3LhxYYV46ajtiJIsxNDdEYqgPVRqAf g==; X-IronPort-AV: E=McAfee;i="6400,9594,10328"; a="262875160" X-IronPort-AV: E=Sophos;i="5.90,288,1643702400"; d="scan'208";a="262875160" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Apr 2022 08:22:27 -0700 X-IronPort-AV: E=Sophos;i="5.90,288,1643702400"; d="scan'208";a="512681616" Received: from ramaling-i9x.iind.intel.com ([10.203.144.108]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Apr 2022 08:22:25 -0700 From: Ramalingam C To: intel-gfx , dri-devel Subject: [PATCH 3/3] drm/i915/gt: Clear SET_PREDICATE_RESULT prior to executing the ring Date: Mon, 25 Apr 2022 20:53:17 +0530 Message-Id: <20220425152317.4275-4-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20220425152317.4275-1-ramalingam.c@intel.com> References: <20220425152317.4275-1-ramalingam.c@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zbigniew Kempczynski , Hellstrom Thomas , Chris Wilson Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Chris Wilson Userspace may leave predication enabled upon return from the batch buffer, which has the consequent of preventing all operation from the ring from being executed, including all the synchronisation, coherency control, arbitration and user signaling. This is more than just a local gpu hang in one client, as the user has the ability to prevent the kernel from applying critical workarounds and can cause a full GT reset. We could simply execute MI_SET_PREDICATE upon return from the user batch, but this has the repercussion of modifying the user's context state. Instead, we opt to execute a fixup batch which by mixing predicated operations can determine the state of the SET_PREDICATE_RESULT register and restore it prior to the next userspace batch. This allows us to protect the kernel's ring without changing the uABI. Suggested-by: Zbigniew Kempczynski Signed-off-by: Chris Wilson Cc: Zbigniew Kempczynski Cc: Thomas Hellstrom Signed-off-by: Ramalingam C Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 54 +++++++++++++ drivers/gpu/drm/i915/gt/gen8_engine_cs.h | 7 ++ drivers/gpu/drm/i915/gt/intel_engine_regs.h | 1 + .../drm/i915/gt/intel_execlists_submission.c | 15 +++- drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 2 + drivers/gpu/drm/i915/gt/intel_lrc.c | 75 ++++++++++++++----- drivers/gpu/drm/i915/gt/intel_lrc.h | 5 ++ .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 2 + 8 files changed, 137 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c index 9529c5455bc3..3e13960615bd 100644 --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c @@ -5,6 +5,7 @@ #include "gen8_engine_cs.h" #include "i915_drv.h" +#include "intel_engine_regs.h" #include "intel_gpu_commands.h" #include "intel_lrc.h" #include "intel_ring.h" @@ -385,6 +386,59 @@ int gen8_emit_init_breadcrumb(struct i915_request *rq) return 0; } +static int __gen125_emit_bb_start(struct i915_request *rq, + u64 offset, u32 len, + const unsigned int flags, + u32 arb) +{ + struct intel_context *ce = rq->context; + u32 wa_offset = lrc_indirect_bb(ce); + u32 *cs; + + cs = intel_ring_begin(rq, 12); + if (IS_ERR(cs)) + return PTR_ERR(cs); + + *cs++ = MI_ARB_ON_OFF | arb; + + *cs++ = MI_LOAD_REGISTER_MEM_GEN8 | + MI_SRM_LRM_GLOBAL_GTT | + MI_LRI_LRM_CS_MMIO; + *cs++ = i915_mmio_reg_offset(RING_PREDICATE_RESULT(0)); + *cs++ = wa_offset + DG2_PREDICATE_RESULT_WA; + *cs++ = 0; + + *cs++ = MI_BATCH_BUFFER_START_GEN8 | + (flags & I915_DISPATCH_SECURE ? 0 : BIT(8)); + *cs++ = lower_32_bits(offset); + *cs++ = upper_32_bits(offset); + + /* Fixup stray MI_SET_PREDICATE as it prevents us executing the ring */ + *cs++ = MI_BATCH_BUFFER_START_GEN8; + *cs++ = wa_offset + DG2_PREDICATE_RESULT_BB; + *cs++ = 0; + + *cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE; + + intel_ring_advance(rq, cs); + + return 0; +} + +int gen125_emit_bb_start_noarb(struct i915_request *rq, + u64 offset, u32 len, + const unsigned int flags) +{ + return __gen125_emit_bb_start(rq, offset, len, flags, MI_ARB_DISABLE); +} + +int gen125_emit_bb_start(struct i915_request *rq, + u64 offset, u32 len, + const unsigned int flags) +{ + return __gen125_emit_bb_start(rq, offset, len, flags, MI_ARB_ENABLE); +} + int gen8_emit_bb_start_noarb(struct i915_request *rq, u64 offset, u32 len, const unsigned int flags) diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h index 107ab42539ab..32e3d2b831bb 100644 --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h @@ -31,6 +31,13 @@ int gen8_emit_bb_start(struct i915_request *rq, u64 offset, u32 len, const unsigned int flags); +int gen125_emit_bb_start_noarb(struct i915_request *rq, + u64 offset, u32 len, + const unsigned int flags); +int gen125_emit_bb_start(struct i915_request *rq, + u64 offset, u32 len, + const unsigned int flags); + u32 *gen8_emit_fini_breadcrumb_xcs(struct i915_request *rq, u32 *cs); u32 *gen12_emit_fini_breadcrumb_xcs(struct i915_request *rq, u32 *cs); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_regs.h b/drivers/gpu/drm/i915/gt/intel_engine_regs.h index 1dab554bf640..75a0c55c5aa5 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_regs.h @@ -148,6 +148,7 @@ (REG_FIELD_PREP(CMD_CCTL_WRITE_OVERRIDE_MASK, (write) << 1) | \ REG_FIELD_PREP(CMD_CCTL_READ_OVERRIDE_MASK, (read) << 1)) +#define RING_PREDICATE_RESULT(base) _MMIO((base) + 0x3b8) /* gen12+ */ #define MI_PREDICATE_RESULT_2(base) _MMIO((base) + 0x3bc) #define LOWER_SLICE_ENABLED (1 << 0) #define LOWER_SLICE_DISABLED (0 << 0) diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index f8749c433b7c..86f7a9ac1c39 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -3433,10 +3433,17 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine) } } - if (intel_engine_has_preemption(engine)) - engine->emit_bb_start = gen8_emit_bb_start; - else - engine->emit_bb_start = gen8_emit_bb_start_noarb; + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) { + if (intel_engine_has_preemption(engine)) + engine->emit_bb_start = gen125_emit_bb_start; + else + engine->emit_bb_start = gen125_emit_bb_start_noarb; + } else { + if (intel_engine_has_preemption(engine)) + engine->emit_bb_start = gen8_emit_bb_start; + else + engine->emit_bb_start = gen8_emit_bb_start_noarb; + } engine->busyness = execlists_engine_busyness; } diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h index e52718a87f14..556bca3be804 100644 --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h @@ -39,6 +39,8 @@ #define MI_GLOBAL_GTT (1<<22) #define MI_NOOP MI_INSTR(0, 0) +#define MI_SET_PREDICATE MI_INSTR(0x01, 0) +#define MI_SET_PREDICATE_DISABLE (0 << 0) #define MI_USER_INTERRUPT MI_INSTR(0x02, 0) #define MI_WAIT_FOR_EVENT MI_INSTR(0x03, 0) #define MI_WAIT_FOR_OVERLAY_FLIP (1<<16) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 3f83a9038e13..eec73c66406c 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -904,6 +904,24 @@ check_redzone(const void *vaddr, const struct intel_engine_cs *engine) engine->name); } +static u32 context_wa_bb_offset(const struct intel_context *ce) +{ + return PAGE_SIZE * ce->wa_bb_page; +} + +static u32 *context_indirect_bb(const struct intel_context *ce) +{ + void *ptr; + + GEM_BUG_ON(!ce->wa_bb_page); + + ptr = ce->lrc_reg_state; + ptr -= LRC_STATE_OFFSET; /* back to start of context image */ + ptr += context_wa_bb_offset(ce); + + return ptr; +} + void lrc_init_state(struct intel_context *ce, struct intel_engine_cs *engine, void *state) @@ -922,6 +940,10 @@ void lrc_init_state(struct intel_context *ce, /* Clear the ppHWSP (inc. per-context counters) */ memset(state, 0, PAGE_SIZE); + /* Clear the indirect wa and storage */ + if (ce->wa_bb_page) + memset(state + context_wa_bb_offset(ce), 0, PAGE_SIZE); + /* * The second page of the context object contains some registers which * must be set up prior to the first execution. @@ -929,6 +951,35 @@ void lrc_init_state(struct intel_context *ce, __lrc_init_regs(state + LRC_STATE_OFFSET, ce, engine, inhibit); } +u32 lrc_indirect_bb(const struct intel_context *ce) +{ + return i915_ggtt_offset(ce->state) + context_wa_bb_offset(ce); +} + +static u32 *setup_predicate_disable_wa(const struct intel_context *ce, u32 *cs) +{ + /* If predication is active, this will be noop'ed */ + *cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT | (4 - 2); + *cs++ = lrc_indirect_bb(ce) + DG2_PREDICATE_RESULT_WA; + *cs++ = 0; + *cs++ = 0; /* No predication */ + + /* predicated end, only terminates if SET_PREDICATE_RESULT:0 is clear */ + *cs++ = MI_BATCH_BUFFER_END | BIT(15); + *cs++ = MI_SET_PREDICATE | MI_SET_PREDICATE_DISABLE; + + /* Instructions are no longer predicated (disabled), we can proceed */ + *cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT | (4 - 2); + *cs++ = lrc_indirect_bb(ce) + DG2_PREDICATE_RESULT_WA; + *cs++ = 0; + *cs++ = 1; /* enable predication before the next BB */ + + *cs++ = MI_BATCH_BUFFER_END; + GEM_BUG_ON(offset_in_page(cs) > DG2_PREDICATE_RESULT_WA); + + return cs; +} + static struct i915_vma * __lrc_alloc_state(struct intel_context *ce, struct intel_engine_cs *engine) { @@ -1240,24 +1291,6 @@ gen12_emit_indirect_ctx_xcs(const struct intel_context *ce, u32 *cs) return cs; } -static u32 context_wa_bb_offset(const struct intel_context *ce) -{ - return PAGE_SIZE * ce->wa_bb_page; -} - -static u32 *context_indirect_bb(const struct intel_context *ce) -{ - void *ptr; - - GEM_BUG_ON(!ce->wa_bb_page); - - ptr = ce->lrc_reg_state; - ptr -= LRC_STATE_OFFSET; /* back to start of context image */ - ptr += context_wa_bb_offset(ce); - - return ptr; -} - static void setup_indirect_ctx_bb(const struct intel_context *ce, const struct intel_engine_cs *engine, @@ -1271,9 +1304,11 @@ setup_indirect_ctx_bb(const struct intel_context *ce, while ((unsigned long)cs % CACHELINE_BYTES) *cs++ = MI_NOOP; + GEM_BUG_ON(cs - start > DG2_PREDICATE_RESULT_BB / sizeof(*start)); + setup_predicate_disable_wa(ce, start + DG2_PREDICATE_RESULT_BB / sizeof(*start)); + lrc_setup_indirect_ctx(ce->lrc_reg_state, engine, - i915_ggtt_offset(ce->state) + - context_wa_bb_offset(ce), + lrc_indirect_bb(ce), (cs - start) * sizeof(*cs)); } diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h index 7371bb5c8129..31be734010db 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.h +++ b/drivers/gpu/drm/i915/gt/intel_lrc.h @@ -145,4 +145,9 @@ static inline void lrc_runtime_stop(struct intel_context *ce) WRITE_ONCE(stats->active, 0); } +#define DG2_PREDICATE_RESULT_WA (PAGE_SIZE - sizeof(u64)) +#define DG2_PREDICATE_RESULT_BB (2048) + +u32 lrc_indirect_bb(const struct intel_context *ce); + #endif /* __INTEL_LRC_H__ */ diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 61a6f2424e24..addfdc2c2642 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -3910,6 +3910,8 @@ static void guc_default_vfuncs(struct intel_engine_cs *engine) */ engine->emit_bb_start = gen8_emit_bb_start; + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) + engine->emit_bb_start = gen125_emit_bb_start; } static void rcs_submission_override(struct intel_engine_cs *engine)