From patchwork Fri Oct 14 23:02:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 13007439 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 43410C4332F for ; Fri, 14 Oct 2022 23:03:57 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9965910E13A; Fri, 14 Oct 2022 23:03:12 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7C01410E122; Fri, 14 Oct 2022 23:03:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665788583; x=1697324583; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3GWCVyifb/MCOhuWXHC6DxBo6Gqy6nK92+IwDd0rkO8=; b=cMf/K3TllaeWdJ/IYt/lBG/p2Cc4x1g+zxxkMy+yqfJJjXP+Lw99I2cH 8NyiXMKTWPtpaqRpyEeDsVQVecNtEJ9Ee/E49/ujTnx71JX8qmnh3lEFH wLg75rpPR+hom7aw2XeiDgChZRTjLX8GxkLNQKE+vlV8xfGL4wa+65lN9 oTBb0CrPpvwA/RRvStAiF6IeWCvM/oEZn/2JE096SQslda+JRY9hJJtie ShenqrEfjgnqnK8yKjvQcxdHD1VZbEd9CdaMKhMpryw9Vqz+ohL5Pe5Wh ju92a3opHfy1RuuYnmrH5bWwXdDGckVMOgbIlNiwQKFrQQzH12O03o2v/ w==; X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="285216964" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="285216964" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="696471690" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="696471690" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v3 01/14] drm/i915/gen8: Create separate reg definitions for new MCR registers Date: Fri, 14 Oct 2022 16:02:26 -0700 Message-Id: <20221014230239.1023689-2-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221014230239.1023689-1-matthew.d.roper@intel.com> References: <20221014230239.1023689-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Balasubramani Vivekanandan , dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Gen8 was the first time our hardware had multicast registers (or at least the first time the multicast nature was exposed and MMIO accesses could be steered). There are some registers that transitioned from singleton behavior to multicast during the gen7 -> gen8 transition; let's duplicate the register definitions for those registers in preparation for upcoming patches that will handle MCR registers in a special manner. The registers adjusted are: * MISCCPCTL * SAMPLER_INSTDONE * ROW_INSTDONE * ROW_CHICKEN2 * HALF_SLICE_CHICKEN1 * HALF_SLICE_CHICKEN3 v2: - Use the gen8 version of HALF_SLICE_CHICKEN3 in GVT's gen9 engine MMIO list. (Bala) - Update to the gen8 version of MISCCPCTL in a couple new workarounds that were recently added for DG2/PVC. (Bala) Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 4 +-- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 11 +++++++- drivers/gpu/drm/i915/gt/intel_workarounds.c | 26 +++++++++---------- .../gpu/drm/i915/gt/uc/intel_guc_capture.c | 4 +-- drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 2 +- drivers/gpu/drm/i915/gvt/handlers.c | 2 +- drivers/gpu/drm/i915/gvt/mmio_context.c | 2 +- drivers/gpu/drm/i915/intel_gvt_mmio_table.c | 2 +- drivers/gpu/drm/i915/intel_pm.c | 9 ++++--- 9 files changed, 36 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 2ddcad497fa3..c408bac3c533 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -1559,11 +1559,11 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine, for_each_ss_steering(iter, engine->gt, slice, subslice) { instdone->sampler[slice][subslice] = intel_gt_mcr_read(engine->gt, - GEN7_SAMPLER_INSTDONE, + GEN8_SAMPLER_INSTDONE, slice, subslice); instdone->row[slice][subslice] = intel_gt_mcr_read(engine->gt, - GEN7_ROW_INSTDONE, + GEN8_ROW_INSTDONE, slice, subslice); } diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 7f79bbf97828..ba4ce668042c 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -651,6 +651,9 @@ #define GEN7_MISCCPCTL _MMIO(0x9424) #define GEN7_DOP_CLOCK_GATE_ENABLE (1 << 0) + +#define GEN8_MISCCPCTL _MMIO(0x9424) +#define GEN8_DOP_CLOCK_GATE_ENABLE REG_BIT(0) #define GEN12_DOP_CLOCK_GATE_RENDER_ENABLE REG_BIT(1) #define GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE (1 << 2) #define GEN8_DOP_CLOCK_GATE_GUC_ENABLE (1 << 4) @@ -1072,18 +1075,22 @@ #define GEN12_GAM_DONE _MMIO(0xcf68) #define GEN7_HALF_SLICE_CHICKEN1 _MMIO(0xe100) /* IVB GT1 + VLV */ +#define GEN8_HALF_SLICE_CHICKEN1 _MMIO(0xe100) #define GEN7_MAX_PS_THREAD_DEP (8 << 12) #define GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE (1 << 10) #define GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE (1 << 4) #define GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE (1 << 3) #define GEN7_SAMPLER_INSTDONE _MMIO(0xe160) +#define GEN8_SAMPLER_INSTDONE _MMIO(0xe160) #define GEN7_ROW_INSTDONE _MMIO(0xe164) +#define GEN8_ROW_INSTDONE _MMIO(0xe164) #define HALF_SLICE_CHICKEN2 _MMIO(0xe180) #define GEN8_ST_PO_DISABLE (1 << 13) -#define HALF_SLICE_CHICKEN3 _MMIO(0xe184) +#define HSW_HALF_SLICE_CHICKEN3 _MMIO(0xe184) +#define GEN8_HALF_SLICE_CHICKEN3 _MMIO(0xe184) #define HSW_SAMPLE_C_PERFORMANCE (1 << 9) #define GEN8_CENTROID_PIXEL_OPT_DIS (1 << 8) #define GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC (1 << 5) @@ -1136,6 +1143,8 @@ #define DISABLE_EARLY_EOT REG_BIT(1) #define GEN7_ROW_CHICKEN2 _MMIO(0xe4f4) + +#define GEN8_ROW_CHICKEN2 _MMIO(0xe4f4) #define GEN12_DISABLE_READ_SUPPRESSION REG_BIT(15) #define GEN12_DISABLE_EARLY_READ REG_BIT(14) #define GEN12_ENABLE_LARGE_GRF_MODE REG_BIT(12) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index b8eb20a155f0..47a683dcc8a5 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -295,10 +295,10 @@ static void bdw_ctx_workarounds_init(struct intel_engine_cs *engine, * Also see the related UCGTCL1 write in bdw_init_clock_gating() * to disable EUTC clock gating. */ - wa_masked_en(wal, GEN7_ROW_CHICKEN2, + wa_masked_en(wal, GEN8_ROW_CHICKEN2, DOP_CLOCK_GATING_DISABLE); - wa_masked_en(wal, HALF_SLICE_CHICKEN3, + wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3, GEN8_SAMPLER_POWER_BYPASS_DIS); wa_masked_en(wal, HDC_CHICKEN0, @@ -386,7 +386,7 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine, IS_KABYLAKE(i915) || IS_COFFEELAKE(i915) || IS_COMETLAKE(i915)) - wa_masked_en(wal, HALF_SLICE_CHICKEN3, + wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3, GEN8_SAMPLER_POWER_BYPASS_DIS); /* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */ @@ -490,7 +490,7 @@ static void kbl_ctx_workarounds_init(struct intel_engine_cs *engine, GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION); /* WaDisableSbeCacheDispatchPortSharing:kbl */ - wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1, + wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE); } @@ -514,7 +514,7 @@ static void cfl_ctx_workarounds_init(struct intel_engine_cs *engine, GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION); /* WaDisableSbeCacheDispatchPortSharing:cfl */ - wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1, + wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE); } @@ -1517,7 +1517,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) wa_write_or(wal, GEN12_SQCM, EN_32B_ACCESS); /* Wa_14015795083 */ - wa_write_clr(wal, GEN7_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); + wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); } static void @@ -1526,7 +1526,7 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) pvc_init_mcr(gt, wal); /* Wa_14015795083 */ - wa_write_clr(wal, GEN7_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); + wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); } static void @@ -2117,7 +2117,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) { /* Wa_14013392000:dg2_g11 */ - wa_masked_en(wal, GEN7_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE); + wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE); } if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_FOREVER) || @@ -2163,7 +2163,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) DISABLE_128B_EVICTION_COMMAND_UDW); /* Wa_22012856258:dg2 */ - wa_masked_en(wal, GEN7_ROW_CHICKEN2, + wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_READ_SUPPRESSION); /* @@ -2260,7 +2260,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) || IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) { /* Wa_1606931601:tgl,rkl,dg1,adl-s,adl-p */ - wa_masked_en(wal, GEN7_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ); + wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ); /* * Wa_1407928979:tgl A* @@ -2289,7 +2289,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) { /* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */ - wa_masked_en(wal, GEN7_ROW_CHICKEN2, + wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_PUSH_CONST_DEREF_HOLD_DIS); /* @@ -2508,7 +2508,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_HASWELL(i915)) { /* WaSampleCChickenBitEnable:hsw */ wa_masked_en(wal, - HALF_SLICE_CHICKEN3, HSW_SAMPLE_C_PERFORMANCE); + HSW_HALF_SLICE_CHICKEN3, HSW_SAMPLE_C_PERFORMANCE); wa_masked_dis(wal, CACHE_MODE_0_GEN7, @@ -2806,7 +2806,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE); /* Wa_14010449647:xehpsdv */ - wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1, + wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE); /* Wa_18011725039:xehpsdv */ diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c index 8f1165146013..9495a7928bc8 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c @@ -244,8 +244,8 @@ struct __ext_steer_reg { }; static const struct __ext_steer_reg xe_extregs[] = { - {"GEN7_SAMPLER_INSTDONE", GEN7_SAMPLER_INSTDONE}, - {"GEN7_ROW_INSTDONE", GEN7_ROW_INSTDONE} + {"GEN8_SAMPLER_INSTDONE", GEN8_SAMPLER_INSTDONE}, + {"GEN8_ROW_INSTDONE", GEN8_ROW_INSTDONE} }; static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext, diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c index a0372735cddb..9229243992c2 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c @@ -35,7 +35,7 @@ static void guc_prepare_xfer(struct intel_uncore *uncore) if (GRAPHICS_VER(uncore->i915) == 9) { /* DOP Clock Gating Enable for GuC clocks */ - intel_uncore_rmw(uncore, GEN7_MISCCPCTL, + intel_uncore_rmw(uncore, GEN8_MISCCPCTL, 0, GEN8_DOP_CLOCK_GATE_GUC_ENABLE); /* allows for 5us (in 10ns units) before GT can go to RC6 */ diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c index daac2050d77d..700cc9688f47 100644 --- a/drivers/gpu/drm/i915/gvt/handlers.c +++ b/drivers/gpu/drm/i915/gvt/handlers.c @@ -2257,7 +2257,7 @@ static int init_generic_mmio_info(struct intel_gvt *gvt) MMIO_DFH(_MMIO(0x2438), D_ALL, F_CMD_ACCESS, NULL, NULL); MMIO_DFH(_MMIO(0x243c), D_ALL, F_CMD_ACCESS, NULL, NULL); MMIO_DFH(_MMIO(0x7018), D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); - MMIO_DFH(HALF_SLICE_CHICKEN3, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); + MMIO_DFH(HSW_HALF_SLICE_CHICKEN3, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); MMIO_DFH(GEN7_HALF_SLICE_CHICKEN1, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); /* display */ diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c index 1c6e941c9666..d177884d8f7d 100644 --- a/drivers/gpu/drm/i915/gvt/mmio_context.c +++ b/drivers/gpu/drm/i915/gvt/mmio_context.c @@ -111,7 +111,7 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = { {RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */ {RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */ {RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */ - {RCS0, HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */ + {RCS0, GEN8_HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */ {RCS0, GEN9_HALF_SLICE_CHICKEN5, 0xffff, true}, /* 0xe188 */ {RCS0, GEN9_HALF_SLICE_CHICKEN7, 0xffff, true}, /* 0xe194 */ {RCS0, GEN8_ROW_CHICKEN, 0xffff, true}, /* 0xe4f0 */ diff --git a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c index 8279dc580a3e..638b77d64bf4 100644 --- a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c +++ b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c @@ -102,7 +102,7 @@ static int iterate_generic_mmio(struct intel_gvt_mmio_table_iter *iter) MMIO_D(_MMIO(0x2438)); MMIO_D(_MMIO(0x243c)); MMIO_D(_MMIO(0x7018)); - MMIO_D(HALF_SLICE_CHICKEN3); + MMIO_D(HSW_HALF_SLICE_CHICKEN3); MMIO_D(GEN7_HALF_SLICE_CHICKEN1); /* display */ MMIO_F(_MMIO(0x60220), 0x20); diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 9f6c58ad8bdb..390802245514 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4321,7 +4321,7 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv, u32 val; /* WaTempDisableDOPClkGating:bdw */ - misccpctl = intel_uncore_rmw(&dev_priv->uncore, GEN7_MISCCPCTL, ~GEN7_DOP_CLOCK_GATE_ENABLE, + misccpctl = intel_uncore_rmw(&dev_priv->uncore, GEN8_MISCCPCTL, ~GEN8_DOP_CLOCK_GATE_ENABLE, 0); val = intel_uncore_read(&dev_priv->uncore, GEN8_L3SQCREG1); @@ -4336,7 +4336,7 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv, */ intel_uncore_posting_read(&dev_priv->uncore, GEN8_L3SQCREG1); udelay(1); - intel_uncore_write(&dev_priv->uncore, GEN7_MISCCPCTL, misccpctl); + intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl); } static void icl_init_clock_gating(struct drm_i915_private *dev_priv) @@ -4496,8 +4496,9 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv) gen9_init_clock_gating(dev_priv); /* WaDisableDopClockGating:skl */ - intel_uncore_write(&dev_priv->uncore, GEN7_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN7_MISCCPCTL) & - ~GEN7_DOP_CLOCK_GATE_ENABLE); + intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, + intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL) & + ~GEN8_DOP_CLOCK_GATE_ENABLE); /* WAC6entrylatency:skl */ intel_uncore_write(&dev_priv->uncore, FBC_LLC_READ_CTRL, intel_uncore_read(&dev_priv->uncore, FBC_LLC_READ_CTRL) | From patchwork Fri Oct 14 23:02:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 13007438 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 873C0C4332F for ; Fri, 14 Oct 2022 23:03:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 153C310E130; Fri, 14 Oct 2022 23:03:11 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9C94B10E125; Fri, 14 Oct 2022 23:03:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665788583; x=1697324583; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=bf2hstcvpVQ5ZJhAJxxrnuNtDeobrywAMfOf4bLcsCA=; b=h4yT+SE3GpmyE/SqNUuKM+1igc3IJOFHDLkK1GpAoEjtdyTtpFMaWojA xkH5w6oqlJXG7z7g+TgCrd7K2Zr5J8ZjCRrxWFQYwmRS1sqf/CNJUFQ11 Q2NVLnIxHC42e3A379eLNXYJLWdSC4tyaM2/Ccmap9BukjETNI7vStwgh Z/IZZPh7gsO1ryOEHycKRzeKb8KD4fVXenlomXF9YEejgexJieTku+Vtu fYu5L1wPJxqoKBzFbP6RZyfCmM8YknZu/Ld0fxz8kHP8CVlJFOMBQDzuG OCwzqT8le4a0mpJ4klsLjHk+5wlOAJ+N078ygYMvDMvELUKK5rpa2tPgl A==; X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="285216965" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="285216965" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="696471694" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="696471694" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v3 02/14] drm/i915/xehp: Create separate reg definitions for new MCR registers Date: Fri, 14 Oct 2022 16:02:27 -0700 Message-Id: <20221014230239.1023689-3-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221014230239.1023689-1-matthew.d.roper@intel.com> References: <20221014230239.1023689-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Balasubramani Vivekanandan , dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Starting in Xe_HP, several registers our driver works with have been converted from singleton registers into replicated registers with multicast behavior. Although the registers are still located at the same MMIO offsets as on previous platforms, let's duplicate the register definitions in preparation for upcoming patches that will handle multicast registers in a special manner. The registers that are now replicated on Xe_HP are: * PAT_INDEX (mslice replication) * FF_MODE2 (gslice replication) * COMMON_SLICE_CHICKEN3 (gslice replication) * SLICE_COMMON_ECO_CHICKEN1 (gslice replication) * SLICE_UNIT_LEVEL_CLKGATE (gslice replication) * LNCFCMOCS (lncf replication) Note that there are a couple places in selftest_mocs.c where the gen9 version of LNCFCMOCS is still used without regards for which platform we're on. Those cases are just doing an offset lookup and not issuing any CPU reads/writes of the register, so the potentially multicast nature of the register doesn't come into play. v2: - Add commit message note about the unconditional GEN9_LNCFCMOCS usage in selftest_mocs. (Bala) - Include some additional TLB registers. Bspec: 66534 Cc: Balasubramani Vivekanandan Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/intel_ggtt.c | 4 ++-- drivers/gpu/drm/i915/gt/intel_gt.c | 18 ++++++++++++-- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 26 +++++++++++++++------ drivers/gpu/drm/i915/gt/intel_gtt.c | 22 ++++++++++++++--- drivers/gpu/drm/i915/gt/intel_gtt.h | 2 +- drivers/gpu/drm/i915/gt/intel_mocs.c | 5 +++- drivers/gpu/drm/i915/gt/intel_workarounds.c | 24 +++++++++---------- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 7 ++++-- 8 files changed, 78 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index 5c67e49aacf6..6b58c95ad6a0 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -986,7 +986,7 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt) ggtt->vm.pte_encode = gen8_ggtt_pte_encode; - setup_private_pat(ggtt->vm.gt->uncore); + setup_private_pat(ggtt->vm.gt); return ggtt_probe_common(ggtt, size); } @@ -1308,7 +1308,7 @@ void i915_ggtt_resume(struct i915_ggtt *ggtt) wbinvd_on_all_cpus(); if (GRAPHICS_VER(ggtt->vm.i915) >= 8) - setup_private_pat(ggtt->vm.gt->uncore); + setup_private_pat(ggtt->vm.gt); intel_ggtt_restore_fences(ggtt); } diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index b367cfff48d5..445e171940fa 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -314,7 +314,11 @@ static void gen8_check_faults(struct intel_gt *gt) i915_reg_t fault_reg, fault_data0_reg, fault_data1_reg; u32 fault; - if (GRAPHICS_VER(gt->i915) >= 12) { + if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) { + fault_reg = XEHP_RING_FAULT_REG; + fault_data0_reg = XEHP_FAULT_TLB_DATA0; + fault_data1_reg = XEHP_FAULT_TLB_DATA1; + } else if (GRAPHICS_VER(gt->i915) >= 12) { fault_reg = GEN12_RING_FAULT_REG; fault_data0_reg = GEN12_FAULT_TLB_DATA0; fault_data1_reg = GEN12_FAULT_TLB_DATA1; @@ -990,6 +994,13 @@ static void mmio_invalidate_full(struct intel_gt *gt) [COPY_ENGINE_CLASS] = GEN12_BLT_TLB_INV_CR, [COMPUTE_CLASS] = GEN12_COMPCTX_TLB_INV_CR, }; + static const i915_reg_t xehp_regs[] = { + [RENDER_CLASS] = XEHP_GFX_TLB_INV_CR, + [VIDEO_DECODE_CLASS] = XEHP_VD_TLB_INV_CR, + [VIDEO_ENHANCEMENT_CLASS] = XEHP_VE_TLB_INV_CR, + [COPY_ENGINE_CLASS] = XEHP_BLT_TLB_INV_CR, + [COMPUTE_CLASS] = XEHP_COMPCTX_TLB_INV_CR, + }; struct drm_i915_private *i915 = gt->i915; struct intel_uncore *uncore = gt->uncore; struct intel_engine_cs *engine; @@ -998,7 +1009,10 @@ static void mmio_invalidate_full(struct intel_gt *gt) const i915_reg_t *regs; unsigned int num = 0; - if (GRAPHICS_VER(i915) == 12) { + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + regs = xehp_regs; + num = ARRAY_SIZE(xehp_regs); + } else if (GRAPHICS_VER(i915) == 12) { regs = gen12_regs; num = ARRAY_SIZE(gen12_regs); } else if (GRAPHICS_VER(i915) >= 8 && GRAPHICS_VER(i915) <= 11) { diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index ba4ce668042c..0aa16caa33e4 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -333,6 +333,7 @@ #define GEN7_TLB_RD_ADDR _MMIO(0x4700) #define GEN12_PAT_INDEX(index) _MMIO(0x4800 + (index) * 4) +#define XEHP_PAT_INDEX(index) _MMIO(0x4800 + (index) * 4) #define XEHP_TILE0_ADDR_RANGE _MMIO(0x4900) #define XEHP_TILE_LMEM_RANGE_SHIFT 8 @@ -391,7 +392,8 @@ #define DIS_OVER_FETCH_CACHE REG_BIT(1) #define DIS_MULT_MISS_RD_SQUASH REG_BIT(0) -#define FF_MODE2 _MMIO(0x6604) +#define GEN12_FF_MODE2 _MMIO(0x6604) +#define XEHP_FF_MODE2 _MMIO(0x6604) #define FF_MODE2_GS_TIMER_MASK REG_GENMASK(31, 24) #define FF_MODE2_GS_TIMER_224 REG_FIELD_PREP(FF_MODE2_GS_TIMER_MASK, 224) #define FF_MODE2_TDS_TIMER_MASK REG_GENMASK(23, 16) @@ -446,6 +448,7 @@ #define GEN8_HDC_CHICKEN1 _MMIO(0x7304) #define GEN11_COMMON_SLICE_CHICKEN3 _MMIO(0x7304) +#define XEHP_COMMON_SLICE_CHICKEN3 _MMIO(0x7304) #define DG1_FLOAT_POINT_BLEND_OPT_STRICT_MODE_EN REG_BIT(12) #define XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE REG_BIT(12) #define GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC REG_BIT(11) @@ -459,10 +462,9 @@ #define DISABLE_PIXEL_MASK_CAMMING (1 << 14) #define GEN9_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) -#define GEN11_STATE_CACHE_REDIRECT_TO_CS (1 << 11) - -#define SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) +#define XEHP_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) #define MSC_MSAA_REODER_BUF_BYPASS_DISABLE REG_BIT(14) +#define GEN11_STATE_CACHE_REDIRECT_TO_CS (1 << 11) #define GEN9_SLICE_PGCTL_ACK(slice) _MMIO(0x804c + (slice) * 0x4) #define GEN10_SLICE_PGCTL_ACK(slice) _MMIO(0x804c + ((slice) / 3) * 0x34 + \ @@ -707,7 +709,8 @@ #define GAMTLBVEBOX0_CLKGATE_DIS REG_BIT(16) #define LTCDD_CLKGATE_DIS REG_BIT(10) -#define SLICE_UNIT_LEVEL_CLKGATE _MMIO(0x94d4) +#define GEN11_SLICE_UNIT_LEVEL_CLKGATE _MMIO(0x94d4) +#define XEHP_SLICE_UNIT_LEVEL_CLKGATE _MMIO(0x94d4) #define SARBUNIT_CLKGATE_DIS (1 << 5) #define RCCUNIT_CLKGATE_DIS (1 << 7) #define MSCUNIT_CLKGATE_DIS (1 << 10) @@ -722,7 +725,7 @@ #define VSUNIT_CLKGATE_DIS_TGL REG_BIT(19) #define PSDUNIT_CLKGATE_DIS REG_BIT(5) -#define SUBSLICE_UNIT_LEVEL_CLKGATE _MMIO(0x9524) +#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE _MMIO(0x9524) #define DSS_ROUTER_CLKGATE_DIS REG_BIT(28) #define GWUNIT_CLKGATE_DIS REG_BIT(16) @@ -947,7 +950,8 @@ /* MOCS (Memory Object Control State) registers */ #define GEN9_LNCFCMOCS(i) _MMIO(0xb020 + (i) * 4) /* L3 Cache Control */ -#define GEN9_LNCFCMOCS_REG_COUNT 32 +#define XEHP_LNCFCMOCS(i) _MMIO(0xb020 + (i) * 4) +#define LNCFCMOCS_REG_COUNT 32 #define GEN7_L3CNTLREG3 _MMIO(0xb024) @@ -1039,11 +1043,14 @@ #define GEN9_BLT_MOCS(i) _MMIO(__GEN9_BCS0_MOCS0 + (i) * 4) #define GEN12_FAULT_TLB_DATA0 _MMIO(0xceb8) +#define XEHP_FAULT_TLB_DATA0 _MMIO(0xceb8) #define GEN12_FAULT_TLB_DATA1 _MMIO(0xcebc) +#define XEHP_FAULT_TLB_DATA1 _MMIO(0xcebc) #define FAULT_VA_HIGH_BITS (0xf << 0) #define FAULT_GTT_SEL (1 << 4) #define GEN12_RING_FAULT_REG _MMIO(0xcec4) +#define XEHP_RING_FAULT_REG _MMIO(0xcec4) #define GEN8_RING_FAULT_ENGINE_ID(x) (((x) >> 12) & 0x7) #define RING_FAULT_GTTSEL_MASK (1 << 11) #define RING_FAULT_SRCID(x) (((x) >> 3) & 0xff) @@ -1051,10 +1058,15 @@ #define RING_FAULT_VALID (1 << 0) #define GEN12_GFX_TLB_INV_CR _MMIO(0xced8) +#define XEHP_GFX_TLB_INV_CR _MMIO(0xced8) #define GEN12_VD_TLB_INV_CR _MMIO(0xcedc) +#define XEHP_VD_TLB_INV_CR _MMIO(0xcedc) #define GEN12_VE_TLB_INV_CR _MMIO(0xcee0) +#define XEHP_VE_TLB_INV_CR _MMIO(0xcee0) #define GEN12_BLT_TLB_INV_CR _MMIO(0xcee4) +#define XEHP_BLT_TLB_INV_CR _MMIO(0xcee4) #define GEN12_COMPCTX_TLB_INV_CR _MMIO(0xcf04) +#define XEHP_COMPCTX_TLB_INV_CR _MMIO(0xcf04) #define GEN12_MERT_MOD_CTRL _MMIO(0xcf28) #define RENDER_MOD_CTRL _MMIO(0xcf2c) diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index 13e411187fd5..e82a9d763e57 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -15,6 +15,7 @@ #include "i915_trace.h" #include "i915_utils.h" #include "intel_gt.h" +#include "intel_gt_mcr.h" #include "intel_gt_regs.h" #include "intel_gtt.h" @@ -478,6 +479,18 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore) intel_uncore_write(uncore, GEN12_PAT_INDEX(7), GEN8_PPAT_WB); } +static void xehp_setup_private_ppat(struct intel_gt *gt) +{ + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB); + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC); + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT); + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC); + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB); + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB); + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB); + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB); +} + static void icl_setup_private_ppat(struct intel_uncore *uncore) { intel_uncore_write(uncore, @@ -570,13 +583,16 @@ static void chv_setup_private_ppat(struct intel_uncore *uncore) intel_uncore_write(uncore, GEN8_PRIVATE_PAT_HI, upper_32_bits(pat)); } -void setup_private_pat(struct intel_uncore *uncore) +void setup_private_pat(struct intel_gt *gt) { - struct drm_i915_private *i915 = uncore->i915; + struct intel_uncore *uncore = gt->uncore; + struct drm_i915_private *i915 = gt->i915; GEM_BUG_ON(GRAPHICS_VER(i915) < 8); - if (GRAPHICS_VER(i915) >= 12) + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) + xehp_setup_private_ppat(gt); + else if (GRAPHICS_VER(i915) >= 12) tgl_setup_private_ppat(uncore); else if (GRAPHICS_VER(i915) >= 11) icl_setup_private_ppat(uncore); diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index 062b78333fb2..4d75ba4bb41d 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -668,7 +668,7 @@ void ppgtt_unbind_vma(struct i915_address_space *vm, void gtt_write_workarounds(struct intel_gt *gt); -void setup_private_pat(struct intel_uncore *uncore); +void setup_private_pat(struct intel_gt *gt); int i915_vm_alloc_pt_stash(struct i915_address_space *vm, struct i915_vm_pt_stash *stash, diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c index 152244d7f62a..ecfa5baa5e3f 100644 --- a/drivers/gpu/drm/i915/gt/intel_mocs.c +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c @@ -616,7 +616,10 @@ static void init_l3cc_table(struct intel_uncore *uncore, u32 l3cc; for_each_l3cc(l3cc, table, i) - intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc); + if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50)) + intel_uncore_write_fw(uncore, XEHP_LNCFCMOCS(i), l3cc); + else + intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc); } void intel_mocs_init_engine(struct intel_engine_cs *engine) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 47a683dcc8a5..3056b099dd17 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -572,7 +572,7 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine, wa_write_clr_set(wal, GEN11_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK, REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f)); wa_add(wal, - FF_MODE2, + XEHP_FF_MODE2, FF_MODE2_TDS_TIMER_MASK, FF_MODE2_TDS_TIMER_128, 0, false); @@ -599,7 +599,7 @@ static void gen12_ctx_gt_tuning_init(struct intel_engine_cs *engine, * verification is ignored. */ wa_add(wal, - FF_MODE2, + GEN12_FF_MODE2, FF_MODE2_TDS_TIMER_MASK, FF_MODE2_TDS_TIMER_128, 0, false); @@ -637,7 +637,7 @@ static void gen12_ctx_workarounds_init(struct intel_engine_cs *engine, * to Wa_1608008084. */ wa_add(wal, - FF_MODE2, + GEN12_FF_MODE2, FF_MODE2_GS_TIMER_MASK, FF_MODE2_GS_TIMER_224, 0, false); @@ -670,7 +670,7 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine, if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) { /* Wa_14010469329:dg2_g10 */ - wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3, + wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3, XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE); /* @@ -678,12 +678,12 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine, * Wa_22010613112:dg2_g10 * Wa_14010698770:dg2_g10 */ - wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3, + wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3, GEN12_DISABLE_CPS_AWARE_COLOR_PIPE); } /* Wa_16013271637:dg2 */ - wa_masked_en(wal, SLICE_COMMON_ECO_CHICKEN1, + wa_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1, MSC_MSAA_REODER_BUF_BYPASS_DISABLE); /* Wa_14014947963:dg2 */ @@ -1265,14 +1265,14 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) /* Wa_1406680159:icl,ehl */ wa_write_or(wal, - SUBSLICE_UNIT_LEVEL_CLKGATE, + GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE, GWUNIT_CLKGATE_DIS); /* Wa_1607087056:icl,ehl,jsl */ if (IS_ICELAKE(i915) || IS_JSL_EHL_GRAPHICS_STEP(i915, STEP_A0, STEP_B0)) wa_write_or(wal, - SLICE_UNIT_LEVEL_CLKGATE, + GEN11_SLICE_UNIT_LEVEL_CLKGATE, L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS); /* @@ -1332,7 +1332,7 @@ tgl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) /* Wa_1607087056:tgl also know as BUG:1409180338 */ if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0)) wa_write_or(wal, - SLICE_UNIT_LEVEL_CLKGATE, + GEN11_SLICE_UNIT_LEVEL_CLKGATE, L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS); /* Wa_1408615072:tgl[a0] */ @@ -1351,7 +1351,7 @@ dg1_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) /* Wa_1607087056:dg1 */ if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0)) wa_write_or(wal, - SLICE_UNIT_LEVEL_CLKGATE, + GEN11_SLICE_UNIT_LEVEL_CLKGATE, L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS); /* Wa_1409420604:dg1 */ @@ -1455,7 +1455,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) CG3DDISCFEG_CLKGATE_DIS); /* Wa_14011006942:dg2 */ - wa_write_or(wal, SUBSLICE_UNIT_LEVEL_CLKGATE, + wa_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE, DSS_ROUTER_CLKGATE_DIS); } @@ -1467,7 +1467,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) wa_write_or(wal, UNSLCGCTL9444, LTCDD_CLKGATE_DIS); /* Wa_14011371254:dg2_g10 */ - wa_write_or(wal, SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS); + wa_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS); /* Wa_14011431319:dg2_g10 */ wa_write_or(wal, UNSLCGCTL9440, GAMTLBOACS_CLKGATE_DIS | diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 657f0beb8e06..cc357fa0c270 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -373,8 +373,11 @@ static int guc_mmio_regset_init(struct temp_regset *regset, false); /* add in local MOCS registers */ - for (i = 0; i < GEN9_LNCFCMOCS_REG_COUNT; i++) - ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false); + for (i = 0; i < LNCFCMOCS_REG_COUNT; i++) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) + ret |= GUC_MMIO_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false); + else + ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false); return ret ? -1 : 0; } From patchwork Fri Oct 14 23:02:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 13007444 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9B10CC43217 for ; Fri, 14 Oct 2022 23:04:30 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D69D010E15B; Fri, 14 Oct 2022 23:03:25 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id BD78910E130; Fri, 14 Oct 2022 23:03:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665788583; x=1697324583; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=G/o3wp8/GyOVBSbVxexs1JiQl3bxZ/vju/I+1X1zpNM=; b=GLu/roHxJ7SSrH3mGzoCTUmbO9jrijTeD2jnshazWKjUXnoBFEACPKtD 7Hj9+DWDrUMSldZBO/C9V0hgGGBHDUw8kl2M34Oh1ebDRfphqtpi72WBp Mk3YTCtEGrZOhNybfZaB5MqWmyue29aYBGBdmfTId8dB+w2sQQ17dRo+U zAkiXaPBKWUdbuUKWa+Eby5KvsNqJhN9IQ0998kDgZ/JjVcNEaZluykGx 5BFdDcq6xxqK6VlI9K8aH+KbeAbG/77huVqj4Tv9+A9cS6gbin/uR9ggP c/t3eCP+EcZ7gdi7mp3m2ImcWwwWIr0ohjYAu/az8dZTzxOxb6fqmrkr+ g==; X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="285216966" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="285216966" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="696471696" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="696471696" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v3 03/14] drm/i915/gt: Drop a few unused register definitions Date: Fri, 14 Oct 2022 16:02:28 -0700 Message-Id: <20221014230239.1023689-4-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221014230239.1023689-1-matthew.d.roper@intel.com> References: <20221014230239.1023689-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Let's drop a few register definitions that are unused anywhere in the driver today. Since the referenced offsets are part of what is now considered a multicast register region, the current definitions would not be correct for use on any future platform. Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 17 ----------------- 1 file changed, 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 0aa16caa33e4..71d8787230c1 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -454,13 +454,6 @@ #define GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC REG_BIT(11) #define GEN12_DISABLE_CPS_AWARE_COLOR_PIPE REG_BIT(9) -/* GEN9 chicken */ -#define SLICE_ECO_CHICKEN0 _MMIO(0x7308) -#define PIXEL_MASK_CAMMING_DISABLE (1 << 14) - -#define GEN9_SLICE_COMMON_ECO_CHICKEN0 _MMIO(0x7308) -#define DISABLE_PIXEL_MASK_CAMMING (1 << 14) - #define GEN9_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) #define XEHP_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) #define MSC_MSAA_REODER_BUF_BYPASS_DISABLE REG_BIT(14) @@ -967,11 +960,6 @@ #define GEN7_L3LOG(slice, i) _MMIO(0xb070 + (slice) * 0x200 + (i) * 4) #define GEN7_L3LOG_SIZE 0x80 -#define GEN10_SCRATCH_LNCF2 _MMIO(0xb0a0) -#define PMFLUSHDONE_LNICRSDROP (1 << 20) -#define PMFLUSH_GAPL3UNBLOCK (1 << 21) -#define PMFLUSHDONE_LNEBLK (1 << 22) - #define XEHP_L3NODEARBCFG _MMIO(0xb0b4) #define XEHP_LNESPARE REG_BIT(19) @@ -986,9 +974,6 @@ #define L3_HIGH_PRIO_CREDITS(x) (((x) >> 1) << 14) #define L3_PRIO_CREDITS_MASK ((0x1f << 19) | (0x1f << 14)) -#define GEN10_L3_CHICKEN_MODE_REGISTER _MMIO(0xb114) -#define GEN11_I2M_WRITE_DISABLE (1 << 28) - #define GEN8_L3SQCREG4 _MMIO(0xb118) #define GEN11_LQSC_CLEAN_EVICT_DISABLE (1 << 6) #define GEN8_LQSC_RO_PERF_DIS (1 << 27) @@ -1191,8 +1176,6 @@ #define SARB_CHICKEN1 _MMIO(0xe90c) #define COMP_CKN_IN REG_GENMASK(30, 29) -#define GEN7_HALF_SLICE_CHICKEN1_GT2 _MMIO(0xf100) - #define GEN7_ROW_CHICKEN2_GT2 _MMIO(0xf4f4) #define DOP_CLOCK_GATING_DISABLE (1 << 0) #define PUSH_CONSTANT_DEREF_DISABLE (1 << 8) From patchwork Fri Oct 14 23:02:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 13007436 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D5D21C4332F for ; Fri, 14 Oct 2022 23:03:21 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D241110E122; Fri, 14 Oct 2022 23:03:09 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id DE32C10E136; Fri, 14 Oct 2022 23:03:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665788583; x=1697324583; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=XmbNYnfA3aAzzRQ7bRwMZp7i1FbtZp2mm86a0/zPMjo=; b=F98Hu9zbHFR5yBfHRCYepIpqEt6g3jo8yD80L+mc9jIf/E/fcD2RP6M9 Df4x1p+SOt17+DPqLFcasWT3Z80vB4v6LdCm5lG/rzpPGZrgf8OsUr/yE uaN8yP0g5Rt1xxctuq2PKFsmAJxjE60BWxI0fV1hEcnGhCT/p6NHEL3Ow Btu2VZFqRQDW4XpYw4bq6AEPBhKXgMPU+fOj3nSsv0suS3IDBd7JyMphj H9eAwRsyVvHmkdnE9hy5lNSDvdeqWrWS9L7v0s8LFnaqsBT9iEy/dCD5p SJ6+zI99Uty5klzrhzfxMZeKXtdp9tvzHyS/6BS4iNeGJL4GGqjn7x2oN w==; X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="285216967" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="285216967" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="696471700" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="696471700" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v3 04/14] drm/i915/gt: Correct prefix on a few registers Date: Fri, 14 Oct 2022 16:02:29 -0700 Message-Id: <20221014230239.1023689-5-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221014230239.1023689-1-matthew.d.roper@intel.com> References: <20221014230239.1023689-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" We have a few registers that have existed for several hardware generations, but are only used by the driver on Xe_HP and beyond. In cases where the Xe_HP version of the register is now replicated and uses multicast behavior, but earlier generations were singleton, let's change the register prefix to "XEHP_" to help clarify that we're using the newer multicast form of the register. Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 8 ++++---- drivers/gpu/drm/i915/gt/intel_workarounds.c | 10 +++++----- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 71d8787230c1..890960b56b9e 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -486,7 +486,7 @@ #define GEN8_RC6_CTX_INFO _MMIO(0x8504) -#define GEN12_SQCM _MMIO(0x8724) +#define XEHP_SQCM _MMIO(0x8724) #define EN_32B_ACCESS REG_BIT(30) #define HSW_IDICR _MMIO(0x9008) @@ -989,7 +989,7 @@ #define GEN11_SCRATCH2 _MMIO(0xb140) #define GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE (1 << 19) -#define GEN11_L3SQCREG5 _MMIO(0xb158) +#define XEHP_L3SQCREG5 _MMIO(0xb158) #define L3_PWM_TIMER_INIT_VAL_MASK REG_GENMASK(9, 0) #define MLTICTXCTL _MMIO(0xb170) @@ -1053,7 +1053,7 @@ #define GEN12_COMPCTX_TLB_INV_CR _MMIO(0xcf04) #define XEHP_COMPCTX_TLB_INV_CR _MMIO(0xcf04) -#define GEN12_MERT_MOD_CTRL _MMIO(0xcf28) +#define XEHP_MERT_MOD_CTRL _MMIO(0xcf28) #define RENDER_MOD_CTRL _MMIO(0xcf2c) #define COMP_MOD_CTRL _MMIO(0xcf30) #define VDBX_MOD_CTRL _MMIO(0xcf34) @@ -1155,7 +1155,7 @@ #define EU_PERF_CNTL1 _MMIO(0xe558) #define EU_PERF_CNTL5 _MMIO(0xe55c) -#define GEN12_HDC_CHICKEN0 _MMIO(0xe5f0) +#define XEHP_HDC_CHICKEN0 _MMIO(0xe5f0) #define LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK REG_GENMASK(13, 11) #define ICL_HDC_MODE _MMIO(0xe5f4) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 3056b099dd17..96b9f02a2284 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -569,7 +569,7 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) { wa_masked_en(wal, CHICKEN_RASTER_2, TBIMR_FAST_CLIP); - wa_write_clr_set(wal, GEN11_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK, + wa_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK, REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f)); wa_add(wal, XEHP_FF_MODE2, @@ -1514,7 +1514,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) * recommended tuning settings documented in the bspec's * performance guide section. */ - wa_write_or(wal, GEN12_SQCM, EN_32B_ACCESS); + wa_write_or(wal, XEHP_SQCM, EN_32B_ACCESS); /* Wa_14015795083 */ wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); @@ -2170,7 +2170,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) * Wa_22010960976:dg2 * Wa_14013347512:dg2 */ - wa_masked_dis(wal, GEN12_HDC_CHICKEN0, + wa_masked_dis(wal, XEHP_HDC_CHICKEN0, LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK); } @@ -2223,7 +2223,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0) || IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) { /* Wa_14012362059:dg2 */ - wa_write_or(wal, GEN12_MERT_MOD_CTRL, FORCE_MISS_FTLB); + wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB); } if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_B0, STEP_FOREVER) || @@ -2816,7 +2816,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li } /* Wa_14012362059:xehpsdv */ - wa_write_or(wal, GEN12_MERT_MOD_CTRL, FORCE_MISS_FTLB); + wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB); /* Wa_14014368820:xehpsdv */ wa_write_or(wal, GEN12_GAMCNTRL_CTRL, INVALIDATION_BROADCAST_MODE_DIS | From patchwork Fri Oct 14 23:02:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 13007443 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6CA7DC433FE for ; Fri, 14 Oct 2022 23:04:22 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C955F10E152; Fri, 14 Oct 2022 23:03:20 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0D1D810E137; Fri, 14 Oct 2022 23:03:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665788584; x=1697324584; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=BLGNCU2YhyDAiodKVPKZH0H19KcfV50OrKRtiN5pkgk=; b=VL0LrLpNOObPwBaHvtuFB4/C22ZKh3DCMni9NJOhZ9lRTN7w0u29d6ce NiZc/ICkWgikEu3zG6VNZNzf30rqUMLwH9Wju8a0g87CzE+6tJjA6OVQl WuKLw4SCx0V8kqCJarDoDyhv8iTIc0goNDuWHE4SPKRiY2CDt8fzmp7TQ G3WdPblrR1X54kzYJC7QL10A1rBLrf8bG4VN0zg0sahyui3i7ORsYYzPB 2QGfW8IM1bYAYtOTGz6lSePZAfPxBBWAGnq4kwi3y/EXGflacwUVeHdwt UsOYncquKSPrElL4yvoJX1cCeKOFSuXlKqoV9PS/Ja2Z3CHQS5NfV2Wtq A==; X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="285216968" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="285216968" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="696471703" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="696471703" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v3 05/14] drm/i915/gt: Add intel_gt_mcr_multicast_rmw() operation Date: Fri, 14 Oct 2022 16:02:30 -0700 Message-Id: <20221014230239.1023689-6-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221014230239.1023689-1-matthew.d.roper@intel.com> References: <20221014230239.1023689-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" There are cases where we wish to read from any non-terminated MCR register instance (or the primary instance in the case of GAM ranges), clear/set some bits, and then write the value back out to the register in a multicast manner. Adding a "multicast RMW" will avoid the need to open-code this. v2: - Return a u32 to align with the recent change to intel_uncore_rmw. Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 28 ++++++++++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_gt_mcr.h | 3 +++ 2 files changed, 31 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c index a2047a68ea7a..4dc360f4e344 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c @@ -302,6 +302,34 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 va intel_uncore_write_fw(gt->uncore, reg, value); } +/** + * intel_gt_mcr_multicast_rmw - Performs a multicast RMW operations + * @gt: GT structure + * @reg: the MCR register to read and write + * @clear: bits to clear during RMW + * @set: bits to set during RMW + * + * Performs a read-modify-write on an MCR register in a multicast manner. + * This operation only makes sense on MCR registers where all instances are + * expected to have the same value. The read will target any non-terminated + * instance and the write will be applied to all instances. + * + * This function assumes the caller is already holding any necessary forcewake + * domains; use intel_gt_mcr_multicast_rmw() in cases where forcewake should + * be obtained automatically. + * + * Returns the old (unmodified) value read. + */ +u32 intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg, + u32 clear, u32 set) +{ + u32 val = intel_gt_mcr_read_any(gt, reg); + + intel_gt_mcr_multicast_write(gt, reg, (val & ~clear) | set); + + return val; +} + /* * reg_needs_read_steering - determine whether a register read requires * explicit steering diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h index 77a8b11c287d..781b267478db 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h @@ -24,6 +24,9 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt, void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 value); +u32 intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg, + u32 clear, u32 set); + void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt, i915_reg_t reg, u8 *group, u8 *instance); From patchwork Fri Oct 14 23:02:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 13007446 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4C370C4332F for ; Fri, 14 Oct 2022 23:04:43 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5A58D10E163; Fri, 14 Oct 2022 23:03:27 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2CB9E10E13A; Fri, 14 Oct 2022 23:03:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665788584; x=1697324584; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=gpqg4Emucd1s8DEcq/kPKxViC8Z0cO+7BgptjAPMr8Y=; b=n5/JMUFs/kE0JUDCRllCkTYLxUoexQH+zdC8YqBIa31vKiYXwiFYGrz5 uV1hCYFPoVizvccGhQByWVieBqwHRtMUIn3IHcCf1fb1t4xK4ZNm1eEcA jZe5VVFDFLtT1dkcdBVdWg3IgH64Zbpg8fnKTvKIdT4MHWG581SUCE2nO n7rhTidT4DOdWp5oU2zAfQNGD3276cDt/oWsKLDDkjCK0AcXbgHpgFPB/ BKgurqWAk9idEYyeWwDIeCy7ugj81YxcWdOpgZ7sxs9kRSbpueiCmgDTw xwesRFqO7QVADeOdH9YVr2xg3RJqDdXo579BqvIhAbF1bGx3suJcYFarS A==; X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="285216969" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="285216969" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="696471706" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="696471706" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v3 06/14] drm/i915/xehp: Check for faults on primary GAM Date: Fri, 14 Oct 2022 16:02:31 -0700 Message-Id: <20221014230239.1023689-7-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221014230239.1023689-1-matthew.d.roper@intel.com> References: <20221014230239.1023689-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" On Xe_HP the fault registers are now in a multicast register range. However as part of the GAM these registers follow special rules and we need only read from the "primary" GAM's instance to get the information we need. So a single intel_gt_mcr_read_any() (which will automatically steer to the primary GAM) is sufficient; we don't need to loop over each instance of the MCR register. v2: - Update more instances of fault registers. (Bala) Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/intel_gt.c | 52 +++++++++++++++++++++++---- drivers/gpu/drm/i915/i915_gpu_error.c | 12 +++++-- 2 files changed, 55 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 445e171940fa..e14f159ad9fc 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -270,7 +270,11 @@ intel_gt_clear_error_registers(struct intel_gt *gt, I915_MASTER_ERROR_INTERRUPT); } - if (GRAPHICS_VER(i915) >= 12) { + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + intel_gt_mcr_multicast_rmw(gt, XEHP_RING_FAULT_REG, + RING_FAULT_VALID, 0); + intel_gt_mcr_read_any(gt, XEHP_RING_FAULT_REG); + } else if (GRAPHICS_VER(i915) >= 12) { rmw_clear(uncore, GEN12_RING_FAULT_REG, RING_FAULT_VALID); intel_uncore_posting_read(uncore, GEN12_RING_FAULT_REG); } else if (GRAPHICS_VER(i915) >= 8) { @@ -308,17 +312,49 @@ static void gen6_check_faults(struct intel_gt *gt) } } +static void xehp_check_faults(struct intel_gt *gt) +{ + u32 fault; + + /* + * Although the fault register now lives in an MCR register range, + * the GAM registers are special and we only truly need to read + * the "primary" GAM instance rather than handling each instance + * individually. intel_gt_mcr_read_any() will automatically steer + * toward the primary instance. + */ + fault = intel_gt_mcr_read_any(gt, XEHP_RING_FAULT_REG); + if (fault & RING_FAULT_VALID) { + u32 fault_data0, fault_data1; + u64 fault_addr; + + fault_data0 = intel_gt_mcr_read_any(gt, XEHP_FAULT_TLB_DATA0); + fault_data1 = intel_gt_mcr_read_any(gt, XEHP_FAULT_TLB_DATA1); + + fault_addr = ((u64)(fault_data1 & FAULT_VA_HIGH_BITS) << 44) | + ((u64)fault_data0 << 12); + + drm_dbg(>->i915->drm, "Unexpected fault\n" + "\tAddr: 0x%08x_%08x\n" + "\tAddress space: %s\n" + "\tEngine ID: %d\n" + "\tSource ID: %d\n" + "\tType: %d\n", + upper_32_bits(fault_addr), lower_32_bits(fault_addr), + fault_data1 & FAULT_GTT_SEL ? "GGTT" : "PPGTT", + GEN8_RING_FAULT_ENGINE_ID(fault), + RING_FAULT_SRCID(fault), + RING_FAULT_FAULT_TYPE(fault)); + } +} + static void gen8_check_faults(struct intel_gt *gt) { struct intel_uncore *uncore = gt->uncore; i915_reg_t fault_reg, fault_data0_reg, fault_data1_reg; u32 fault; - if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) { - fault_reg = XEHP_RING_FAULT_REG; - fault_data0_reg = XEHP_FAULT_TLB_DATA0; - fault_data1_reg = XEHP_FAULT_TLB_DATA1; - } else if (GRAPHICS_VER(gt->i915) >= 12) { + if (GRAPHICS_VER(gt->i915) >= 12) { fault_reg = GEN12_RING_FAULT_REG; fault_data0_reg = GEN12_FAULT_TLB_DATA0; fault_data1_reg = GEN12_FAULT_TLB_DATA1; @@ -358,7 +394,9 @@ void intel_gt_check_and_clear_faults(struct intel_gt *gt) struct drm_i915_private *i915 = gt->i915; /* From GEN8 onwards we only have one 'All Engine Fault Register' */ - if (GRAPHICS_VER(i915) >= 8) + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) + xehp_check_faults(gt); + else if (GRAPHICS_VER(i915) >= 8) gen8_check_faults(gt); else if (GRAPHICS_VER(i915) >= 6) gen6_check_faults(gt); diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 9ea2fe34e7d3..f2d53edcd2ee 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1221,7 +1221,10 @@ static void engine_record_registers(struct intel_engine_coredump *ee) if (GRAPHICS_VER(i915) >= 6) { ee->rc_psmi = ENGINE_READ(engine, RING_PSMI_CTL); - if (GRAPHICS_VER(i915) >= 12) + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) + ee->fault_reg = intel_gt_mcr_read_any(engine->gt, + XEHP_RING_FAULT_REG); + else if (GRAPHICS_VER(i915) >= 12) ee->fault_reg = intel_uncore_read(engine->uncore, GEN12_RING_FAULT_REG); else if (GRAPHICS_VER(i915) >= 8) @@ -1820,7 +1823,12 @@ static void gt_record_global_regs(struct intel_gt_coredump *gt) if (GRAPHICS_VER(i915) == 7) gt->err_int = intel_uncore_read(uncore, GEN7_ERR_INT); - if (GRAPHICS_VER(i915) >= 12) { + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + gt->fault_data0 = intel_gt_mcr_read_any((struct intel_gt *)gt->_gt, + XEHP_FAULT_TLB_DATA0); + gt->fault_data1 = intel_gt_mcr_read_any((struct intel_gt *)gt->_gt, + XEHP_FAULT_TLB_DATA1); + } else if (GRAPHICS_VER(i915) >= 12) { gt->fault_data0 = intel_uncore_read(uncore, GEN12_FAULT_TLB_DATA0); gt->fault_data1 = intel_uncore_read(uncore, From patchwork Fri Oct 14 23:02:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 13007445 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7A838C43217 for ; Fri, 14 Oct 2022 23:04:38 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4B82010E17A; Fri, 14 Oct 2022 23:03:29 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4CD7C10E13C; Fri, 14 Oct 2022 23:03:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665788584; x=1697324584; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Ci8RhjTKG3nJRDRpoGXJalJfRr2zjpDsNci2msxWB5M=; b=AUJ6Wo7u9V6tXpnoVn7UJX5gKakMbTXVt0p3xHxH5G+FCE5UP7CCwCtt xWrXxplP6U0WZs+8PhCgr61ey+LPpTs9Ypolp5BjY3Qg6nIPgUTL5uJC1 IvsEUDb7/P+eQ2te1ymGbKLsUVVNO0dGdEHwZKXKsm27psGAUMn5zuNKD lW6liB6aT2TRRtmf0pox3fnP9YqWVJ6THMzLVt0XEUnl0+GCIYx1+kzU4 lLDkZvAMCiae7XaBpTiwNEfTMn7ZjYY02h2PSoDc4MhkFr549UM3HDSc3 DgrjlZSSECTmTuDCylH8GRNDcDVFmSKuf3ll5Sy0O04iRx0WWncxA0UPS w==; X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="285216970" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="285216970" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:03 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="696471708" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="696471708" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v3 07/14] drm/i915/gt: Add intel_gt_mcr_wait_for_reg_fw() Date: Fri, 14 Oct 2022 16:02:32 -0700 Message-Id: <20221014230239.1023689-8-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221014230239.1023689-1-matthew.d.roper@intel.com> References: <20221014230239.1023689-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Xe_HP has some MCR registers that need to be polled for completion of operations like TLB invalidation. Those registers are in the GAM range, which rolls up the status from each unit into the 'primary' instance's value. This makes it useful to have a dedicated 'wait for register' function that handles this on MCR registers, similar to the __intel_wait_for_register_fw() function we already have for regular registers. Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 55 ++++++++++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_gt_mcr.h | 7 ++++ 2 files changed, 62 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c index 4dc360f4e344..1ed9bc4dccfd 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c @@ -568,3 +568,58 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss, return; } } + +/** + * intel_gt_mcr_wait_for_reg_fw - wait until MCR register matches expected state + * @gt: GT structure + * @reg: the register to read + * @mask: mask to apply to register value + * @value: value to wait for + * @fast_timeout_us: fast timeout in microsecond for atomic/tight wait + * @slow_timeout_ms: slow timeout in millisecond + * + * This routine waits until the target register @reg contains the expected + * @value after applying the @mask, i.e. it waits until :: + * + * (intel_gt_mcr_read_any_fw(gt, reg) & mask) == value + * + * Otherwise, the wait will timeout after @slow_timeout_ms milliseconds. + * For atomic context @slow_timeout_ms must be zero and @fast_timeout_us + * must be not larger than 20,0000 microseconds. + * + * This function is basically an MCR-friendly version of + * __intel_wait_for_register_fw(). Generally this function will only be used + * on GAM registers which are a bit special --- although they're MCR registers, + * reads (e.g., waiting for status updates) are always directed to the primary + * instance. + * + * Note that this routine assumes the caller holds forcewake asserted, it is + * not suitable for very long waits. + * + * Return: 0 if the register matches the desired condition, or -ETIMEDOUT. + */ +int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt, + i915_reg_t reg, + u32 mask, + u32 value, + unsigned int fast_timeout_us, + unsigned int slow_timeout_ms) +{ + u32 reg_value = 0; +#define done (((reg_value = intel_gt_mcr_read_any_fw(gt, reg)) & mask) == value) + int ret; + + /* Catch any overuse of this function */ + might_sleep_if(slow_timeout_ms); + GEM_BUG_ON(fast_timeout_us > 20000); + GEM_BUG_ON(!fast_timeout_us && !slow_timeout_ms); + + ret = -ETIMEDOUT; + if (fast_timeout_us && fast_timeout_us <= 20000) + ret = _wait_for_atomic(done, fast_timeout_us, 0); + if (ret && slow_timeout_ms) + ret = wait_for(done, slow_timeout_ms); + + return ret; +#undef done +} diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h index 781b267478db..548f922cd9fa 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h @@ -37,6 +37,13 @@ void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt, void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss, unsigned int *group, unsigned int *instance); +int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt, + i915_reg_t reg, + u32 mask, + u32 value, + unsigned int fast_timeout_us, + unsigned int slow_timeout_ms); + /* * Helper for for_each_ss_steering loop. On pre-Xe_HP platforms, subslice * presence is determined by using the group/instance as direct lookups in the From patchwork Fri Oct 14 23:02:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 13007442 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 56DE4C4332F for ; Fri, 14 Oct 2022 23:04:16 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 24FA410E142; Fri, 14 Oct 2022 23:03:18 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6C7F010E00C; Fri, 14 Oct 2022 23:03:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665788584; x=1697324584; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=EJ7WFlKFgcJUgdrd8KWJvVXrqljEjoCpLwSNSkxv4w0=; b=WotTJ2DjjiU0IShj0ylGIssIu0anUYiUfve3XijXRRy/F3iN9okbkUTD eaO1Yyh9jnGxRMyVukUR0FEnAqNXmCVdVC94joJAYh2vllMGNsGTOUdRm 4cD2y2pJHGIiUm9NFnA6nQv0CylXS/aAW/gkEcXBFliSEzWfq9wcCP2+A mZmQkPUPAXNQDJ7GgJVmwKkDGA2R61kylq1w36wosVBJPmm1AT4VV33rc kXWSsWi1BwivCEusBgYHhzfDVJ+rwIg0tKVVp4e18fote22zUsXIJq3DK yZj2KJpak5GfT0tWLzsF673zQnV/VTlACqOeMp7hd2tShc044Gqb/NBzP A==; X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="285216971" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="285216971" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:03 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="696471711" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="696471711" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v3 08/14] drm/i915: Define MCR registers explicitly Date: Fri, 14 Oct 2022 16:02:33 -0700 Message-Id: <20221014230239.1023689-9-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221014230239.1023689-1-matthew.d.roper@intel.com> References: <20221014230239.1023689-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Rather than using the same _MMIO() macro to define MCR registers as singleton registers, let's use a new MCR_REG() macro to make it clear that these registers are special and should be handled accordingly. For now MCR_REG() will still generate an i915_reg_t with the given offset, but we'll change that in future patches. Bspec: 66673, 66696, 66534, 67609 Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 134 ++++++++++++------------ 1 file changed, 68 insertions(+), 66 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 890960b56b9e..ad9985015b0e 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -8,6 +8,8 @@ #include "i915_reg_defs.h" +#define MCR_REG(offset) _MMIO(offset) + /* RPM unit config (Gen8+) */ #define RPM_CONFIG0 _MMIO(0xd00) #define GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT 3 @@ -333,12 +335,12 @@ #define GEN7_TLB_RD_ADDR _MMIO(0x4700) #define GEN12_PAT_INDEX(index) _MMIO(0x4800 + (index) * 4) -#define XEHP_PAT_INDEX(index) _MMIO(0x4800 + (index) * 4) +#define XEHP_PAT_INDEX(index) MCR_REG(0x4800 + (index) * 4) -#define XEHP_TILE0_ADDR_RANGE _MMIO(0x4900) +#define XEHP_TILE0_ADDR_RANGE MCR_REG(0x4900) #define XEHP_TILE_LMEM_RANGE_SHIFT 8 -#define XEHP_FLAT_CCS_BASE_ADDR _MMIO(0x4910) +#define XEHP_FLAT_CCS_BASE_ADDR MCR_REG(0x4910) #define XEHP_CCS_BASE_SHIFT 8 #define GAMTARBMODE _MMIO(0x4a08) @@ -388,18 +390,18 @@ #define CHICKEN_RASTER_2 _MMIO(0x6208) #define TBIMR_FAST_CLIP REG_BIT(5) -#define VFLSKPD _MMIO(0x62a8) +#define VFLSKPD MCR_REG(0x62a8) #define DIS_OVER_FETCH_CACHE REG_BIT(1) #define DIS_MULT_MISS_RD_SQUASH REG_BIT(0) #define GEN12_FF_MODE2 _MMIO(0x6604) -#define XEHP_FF_MODE2 _MMIO(0x6604) +#define XEHP_FF_MODE2 MCR_REG(0x6604) #define FF_MODE2_GS_TIMER_MASK REG_GENMASK(31, 24) #define FF_MODE2_GS_TIMER_224 REG_FIELD_PREP(FF_MODE2_GS_TIMER_MASK, 224) #define FF_MODE2_TDS_TIMER_MASK REG_GENMASK(23, 16) #define FF_MODE2_TDS_TIMER_128 REG_FIELD_PREP(FF_MODE2_TDS_TIMER_MASK, 4) -#define XEHPG_INSTDONE_GEOM_SVG _MMIO(0x666c) +#define XEHPG_INSTDONE_GEOM_SVG MCR_REG(0x666c) #define CACHE_MODE_0_GEN7 _MMIO(0x7000) /* IVB+ */ #define RC_OP_FLUSH_ENABLE (1 << 0) @@ -448,14 +450,14 @@ #define GEN8_HDC_CHICKEN1 _MMIO(0x7304) #define GEN11_COMMON_SLICE_CHICKEN3 _MMIO(0x7304) -#define XEHP_COMMON_SLICE_CHICKEN3 _MMIO(0x7304) +#define XEHP_COMMON_SLICE_CHICKEN3 MCR_REG(0x7304) #define DG1_FLOAT_POINT_BLEND_OPT_STRICT_MODE_EN REG_BIT(12) #define XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE REG_BIT(12) #define GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC REG_BIT(11) #define GEN12_DISABLE_CPS_AWARE_COLOR_PIPE REG_BIT(9) #define GEN9_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) -#define XEHP_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) +#define XEHP_SLICE_COMMON_ECO_CHICKEN1 MCR_REG(0x731c) #define MSC_MSAA_REODER_BUF_BYPASS_DISABLE REG_BIT(14) #define GEN11_STATE_CACHE_REDIRECT_TO_CS (1 << 11) @@ -486,7 +488,7 @@ #define GEN8_RC6_CTX_INFO _MMIO(0x8504) -#define XEHP_SQCM _MMIO(0x8724) +#define XEHP_SQCM MCR_REG(0x8724) #define EN_32B_ACCESS REG_BIT(30) #define HSW_IDICR _MMIO(0x9008) @@ -647,7 +649,7 @@ #define GEN7_MISCCPCTL _MMIO(0x9424) #define GEN7_DOP_CLOCK_GATE_ENABLE (1 << 0) -#define GEN8_MISCCPCTL _MMIO(0x9424) +#define GEN8_MISCCPCTL MCR_REG(0x9424) #define GEN8_DOP_CLOCK_GATE_ENABLE REG_BIT(0) #define GEN12_DOP_CLOCK_GATE_RENDER_ENABLE REG_BIT(1) #define GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE (1 << 2) @@ -703,7 +705,7 @@ #define LTCDD_CLKGATE_DIS REG_BIT(10) #define GEN11_SLICE_UNIT_LEVEL_CLKGATE _MMIO(0x94d4) -#define XEHP_SLICE_UNIT_LEVEL_CLKGATE _MMIO(0x94d4) +#define XEHP_SLICE_UNIT_LEVEL_CLKGATE MCR_REG(0x94d4) #define SARBUNIT_CLKGATE_DIS (1 << 5) #define RCCUNIT_CLKGATE_DIS (1 << 7) #define MSCUNIT_CLKGATE_DIS (1 << 10) @@ -711,27 +713,27 @@ #define L3_CLKGATE_DIS REG_BIT(16) #define L3_CR2X_CLKGATE_DIS REG_BIT(17) -#define SCCGCTL94DC _MMIO(0x94dc) +#define SCCGCTL94DC MCR_REG(0x94dc) #define CG3DDISURB REG_BIT(14) #define UNSLICE_UNIT_LEVEL_CLKGATE2 _MMIO(0x94e4) #define VSUNIT_CLKGATE_DIS_TGL REG_BIT(19) #define PSDUNIT_CLKGATE_DIS REG_BIT(5) -#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE _MMIO(0x9524) +#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE MCR_REG(0x9524) #define DSS_ROUTER_CLKGATE_DIS REG_BIT(28) #define GWUNIT_CLKGATE_DIS REG_BIT(16) -#define SUBSLICE_UNIT_LEVEL_CLKGATE2 _MMIO(0x9528) +#define SUBSLICE_UNIT_LEVEL_CLKGATE2 MCR_REG(0x9528) #define CPSSUNIT_CLKGATE_DIS REG_BIT(9) -#define SSMCGCTL9530 _MMIO(0x9530) +#define SSMCGCTL9530 MCR_REG(0x9530) #define RTFUNIT_CLKGATE_DIS REG_BIT(18) -#define GEN10_DFR_RATIO_EN_AND_CHICKEN _MMIO(0x9550) +#define GEN10_DFR_RATIO_EN_AND_CHICKEN MCR_REG(0x9550) #define DFR_DISABLE (1 << 9) -#define INF_UNIT_LEVEL_CLKGATE _MMIO(0x9560) +#define INF_UNIT_LEVEL_CLKGATE MCR_REG(0x9560) #define CGPSF_CLKGATE_DIS (1 << 3) #define MICRO_BP0_0 _MMIO(0x9800) @@ -943,7 +945,7 @@ /* MOCS (Memory Object Control State) registers */ #define GEN9_LNCFCMOCS(i) _MMIO(0xb020 + (i) * 4) /* L3 Cache Control */ -#define XEHP_LNCFCMOCS(i) _MMIO(0xb020 + (i) * 4) +#define XEHP_LNCFCMOCS(i) MCR_REG(0xb020 + (i) * 4) #define LNCFCMOCS_REG_COUNT 32 #define GEN7_L3CNTLREG3 _MMIO(0xb024) @@ -960,10 +962,10 @@ #define GEN7_L3LOG(slice, i) _MMIO(0xb070 + (slice) * 0x200 + (i) * 4) #define GEN7_L3LOG_SIZE 0x80 -#define XEHP_L3NODEARBCFG _MMIO(0xb0b4) +#define XEHP_L3NODEARBCFG MCR_REG(0xb0b4) #define XEHP_LNESPARE REG_BIT(19) -#define GEN8_L3SQCREG1 _MMIO(0xb100) +#define GEN8_L3SQCREG1 MCR_REG(0xb100) /* * Note that on CHV the following has an off-by-one error wrt. to BSpec. * Using the formula in BSpec leads to a hang, while the formula here works @@ -974,28 +976,28 @@ #define L3_HIGH_PRIO_CREDITS(x) (((x) >> 1) << 14) #define L3_PRIO_CREDITS_MASK ((0x1f << 19) | (0x1f << 14)) -#define GEN8_L3SQCREG4 _MMIO(0xb118) +#define GEN8_L3SQCREG4 MCR_REG(0xb118) #define GEN11_LQSC_CLEAN_EVICT_DISABLE (1 << 6) #define GEN8_LQSC_RO_PERF_DIS (1 << 27) #define GEN8_LQSC_FLUSH_COHERENT_LINES (1 << 21) #define GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE REG_BIT(22) -#define GEN9_SCRATCH1 _MMIO(0xb11c) +#define GEN9_SCRATCH1 MCR_REG(0xb11c) #define EVICTION_PERF_FIX_ENABLE REG_BIT(8) -#define BDW_SCRATCH1 _MMIO(0xb11c) +#define BDW_SCRATCH1 MCR_REG(0xb11c) #define GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE (1 << 2) -#define GEN11_SCRATCH2 _MMIO(0xb140) +#define GEN11_SCRATCH2 MCR_REG(0xb140) #define GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE (1 << 19) -#define XEHP_L3SQCREG5 _MMIO(0xb158) +#define XEHP_L3SQCREG5 MCR_REG(0xb158) #define L3_PWM_TIMER_INIT_VAL_MASK REG_GENMASK(9, 0) -#define MLTICTXCTL _MMIO(0xb170) +#define MLTICTXCTL MCR_REG(0xb170) #define TDONRENDER REG_BIT(2) -#define XEHP_L3SCQREG7 _MMIO(0xb188) +#define XEHP_L3SCQREG7 MCR_REG(0xb188) #define BLEND_FILL_CACHING_OPT_DIS REG_BIT(3) #define XEHPC_L3SCRUB _MMIO(0xb18c) @@ -1003,7 +1005,7 @@ #define SCRUB_RATE_PER_BANK_MASK REG_GENMASK(2, 0) #define SCRUB_RATE_4B_PER_CLK REG_FIELD_PREP(SCRUB_RATE_PER_BANK_MASK, 0x6) -#define L3SQCREG1_CCS0 _MMIO(0xb200) +#define L3SQCREG1_CCS0 MCR_REG(0xb200) #define FLUSHALLNONCOH REG_BIT(5) #define GEN11_GLBLINVL _MMIO(0xb404) @@ -1028,14 +1030,14 @@ #define GEN9_BLT_MOCS(i) _MMIO(__GEN9_BCS0_MOCS0 + (i) * 4) #define GEN12_FAULT_TLB_DATA0 _MMIO(0xceb8) -#define XEHP_FAULT_TLB_DATA0 _MMIO(0xceb8) +#define XEHP_FAULT_TLB_DATA0 MCR_REG(0xceb8) #define GEN12_FAULT_TLB_DATA1 _MMIO(0xcebc) -#define XEHP_FAULT_TLB_DATA1 _MMIO(0xcebc) +#define XEHP_FAULT_TLB_DATA1 MCR_REG(0xcebc) #define FAULT_VA_HIGH_BITS (0xf << 0) #define FAULT_GTT_SEL (1 << 4) #define GEN12_RING_FAULT_REG _MMIO(0xcec4) -#define XEHP_RING_FAULT_REG _MMIO(0xcec4) +#define XEHP_RING_FAULT_REG MCR_REG(0xcec4) #define GEN8_RING_FAULT_ENGINE_ID(x) (((x) >> 12) & 0x7) #define RING_FAULT_GTTSEL_MASK (1 << 11) #define RING_FAULT_SRCID(x) (((x) >> 3) & 0xff) @@ -1043,21 +1045,21 @@ #define RING_FAULT_VALID (1 << 0) #define GEN12_GFX_TLB_INV_CR _MMIO(0xced8) -#define XEHP_GFX_TLB_INV_CR _MMIO(0xced8) +#define XEHP_GFX_TLB_INV_CR MCR_REG(0xced8) #define GEN12_VD_TLB_INV_CR _MMIO(0xcedc) -#define XEHP_VD_TLB_INV_CR _MMIO(0xcedc) +#define XEHP_VD_TLB_INV_CR MCR_REG(0xcedc) #define GEN12_VE_TLB_INV_CR _MMIO(0xcee0) -#define XEHP_VE_TLB_INV_CR _MMIO(0xcee0) +#define XEHP_VE_TLB_INV_CR MCR_REG(0xcee0) #define GEN12_BLT_TLB_INV_CR _MMIO(0xcee4) -#define XEHP_BLT_TLB_INV_CR _MMIO(0xcee4) +#define XEHP_BLT_TLB_INV_CR MCR_REG(0xcee4) #define GEN12_COMPCTX_TLB_INV_CR _MMIO(0xcf04) -#define XEHP_COMPCTX_TLB_INV_CR _MMIO(0xcf04) +#define XEHP_COMPCTX_TLB_INV_CR MCR_REG(0xcf04) -#define XEHP_MERT_MOD_CTRL _MMIO(0xcf28) -#define RENDER_MOD_CTRL _MMIO(0xcf2c) -#define COMP_MOD_CTRL _MMIO(0xcf30) -#define VDBX_MOD_CTRL _MMIO(0xcf34) -#define VEBX_MOD_CTRL _MMIO(0xcf38) +#define XEHP_MERT_MOD_CTRL MCR_REG(0xcf28) +#define RENDER_MOD_CTRL MCR_REG(0xcf2c) +#define COMP_MOD_CTRL MCR_REG(0xcf30) +#define VDBX_MOD_CTRL MCR_REG(0xcf34) +#define VEBX_MOD_CTRL MCR_REG(0xcf38) #define FORCE_MISS_FTLB REG_BIT(3) #define GEN12_GAMSTLB_CTRL _MMIO(0xcf4c) @@ -1072,52 +1074,52 @@ #define GEN12_GAM_DONE _MMIO(0xcf68) #define GEN7_HALF_SLICE_CHICKEN1 _MMIO(0xe100) /* IVB GT1 + VLV */ -#define GEN8_HALF_SLICE_CHICKEN1 _MMIO(0xe100) +#define GEN8_HALF_SLICE_CHICKEN1 MCR_REG(0xe100) #define GEN7_MAX_PS_THREAD_DEP (8 << 12) #define GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE (1 << 10) #define GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE (1 << 4) #define GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE (1 << 3) #define GEN7_SAMPLER_INSTDONE _MMIO(0xe160) -#define GEN8_SAMPLER_INSTDONE _MMIO(0xe160) +#define GEN8_SAMPLER_INSTDONE MCR_REG(0xe160) #define GEN7_ROW_INSTDONE _MMIO(0xe164) -#define GEN8_ROW_INSTDONE _MMIO(0xe164) +#define GEN8_ROW_INSTDONE MCR_REG(0xe164) -#define HALF_SLICE_CHICKEN2 _MMIO(0xe180) +#define HALF_SLICE_CHICKEN2 MCR_REG(0xe180) #define GEN8_ST_PO_DISABLE (1 << 13) #define HSW_HALF_SLICE_CHICKEN3 _MMIO(0xe184) -#define GEN8_HALF_SLICE_CHICKEN3 _MMIO(0xe184) +#define GEN8_HALF_SLICE_CHICKEN3 MCR_REG(0xe184) #define HSW_SAMPLE_C_PERFORMANCE (1 << 9) #define GEN8_CENTROID_PIXEL_OPT_DIS (1 << 8) #define GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC (1 << 5) #define GEN8_SAMPLER_POWER_BYPASS_DIS (1 << 1) -#define GEN9_HALF_SLICE_CHICKEN5 _MMIO(0xe188) +#define GEN9_HALF_SLICE_CHICKEN5 MCR_REG(0xe188) #define GEN9_DG_MIRROR_FIX_ENABLE (1 << 5) #define GEN9_CCS_TLB_PREFETCH_ENABLE (1 << 3) -#define GEN10_SAMPLER_MODE _MMIO(0xe18c) +#define GEN10_SAMPLER_MODE MCR_REG(0xe18c) #define ENABLE_SMALLPL REG_BIT(15) #define SC_DISABLE_POWER_OPTIMIZATION_EBB REG_BIT(9) #define GEN11_SAMPLER_ENABLE_HEADLESS_MSG REG_BIT(5) -#define GEN9_HALF_SLICE_CHICKEN7 _MMIO(0xe194) +#define GEN9_HALF_SLICE_CHICKEN7 MCR_REG(0xe194) #define DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA REG_BIT(15) #define GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR REG_BIT(8) #define GEN9_ENABLE_YV12_BUGFIX REG_BIT(4) #define GEN9_ENABLE_GPGPU_PREEMPTION REG_BIT(2) -#define GEN10_CACHE_MODE_SS _MMIO(0xe420) +#define GEN10_CACHE_MODE_SS MCR_REG(0xe420) #define ENABLE_EU_COUNT_FOR_TDL_FLUSH REG_BIT(10) #define DISABLE_ECC REG_BIT(5) #define FLOAT_BLEND_OPTIMIZATION_ENABLE REG_BIT(4) #define ENABLE_PREFETCH_INTO_IC REG_BIT(3) -#define EU_PERF_CNTL0 _MMIO(0xe458) -#define EU_PERF_CNTL4 _MMIO(0xe45c) +#define EU_PERF_CNTL0 MCR_REG(0xe458) +#define EU_PERF_CNTL4 MCR_REG(0xe45c) -#define GEN9_ROW_CHICKEN4 _MMIO(0xe48c) +#define GEN9_ROW_CHICKEN4 MCR_REG(0xe48c) #define GEN12_DISABLE_GRF_CLEAR REG_BIT(13) #define XEHP_DIS_BBL_SYSPIPE REG_BIT(11) #define GEN12_DISABLE_TDL_PUSH REG_BIT(9) @@ -1129,7 +1131,7 @@ #define HSW_ROW_CHICKEN3 _MMIO(0xe49c) #define HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE (1 << 6) -#define GEN8_ROW_CHICKEN _MMIO(0xe4f0) +#define GEN8_ROW_CHICKEN MCR_REG(0xe4f0) #define FLOW_CONTROL_ENABLE REG_BIT(15) #define UGM_BACKUP_MODE REG_BIT(13) #define MDQ_ARBITRATION_MODE REG_BIT(12) @@ -1141,39 +1143,39 @@ #define GEN7_ROW_CHICKEN2 _MMIO(0xe4f4) -#define GEN8_ROW_CHICKEN2 _MMIO(0xe4f4) +#define GEN8_ROW_CHICKEN2 MCR_REG(0xe4f4) #define GEN12_DISABLE_READ_SUPPRESSION REG_BIT(15) #define GEN12_DISABLE_EARLY_READ REG_BIT(14) #define GEN12_ENABLE_LARGE_GRF_MODE REG_BIT(12) #define GEN12_PUSH_CONST_DEREF_HOLD_DIS REG_BIT(8) -#define RT_CTRL _MMIO(0xe530) +#define RT_CTRL MCR_REG(0xe530) #define DIS_NULL_QUERY REG_BIT(10) #define STACKID_CTRL REG_GENMASK(6, 5) #define STACKID_CTRL_512 REG_FIELD_PREP(STACKID_CTRL, 0x2) -#define EU_PERF_CNTL1 _MMIO(0xe558) -#define EU_PERF_CNTL5 _MMIO(0xe55c) +#define EU_PERF_CNTL1 MCR_REG(0xe558) +#define EU_PERF_CNTL5 MCR_REG(0xe55c) -#define XEHP_HDC_CHICKEN0 _MMIO(0xe5f0) +#define XEHP_HDC_CHICKEN0 MCR_REG(0xe5f0) #define LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK REG_GENMASK(13, 11) -#define ICL_HDC_MODE _MMIO(0xe5f4) +#define ICL_HDC_MODE MCR_REG(0xe5f4) -#define EU_PERF_CNTL2 _MMIO(0xe658) -#define EU_PERF_CNTL6 _MMIO(0xe65c) -#define EU_PERF_CNTL3 _MMIO(0xe758) +#define EU_PERF_CNTL2 MCR_REG(0xe658) +#define EU_PERF_CNTL6 MCR_REG(0xe65c) +#define EU_PERF_CNTL3 MCR_REG(0xe758) -#define LSC_CHICKEN_BIT_0 _MMIO(0xe7c8) +#define LSC_CHICKEN_BIT_0 MCR_REG(0xe7c8) #define DISABLE_D8_D16_COASLESCE REG_BIT(30) #define FORCE_1_SUB_MESSAGE_PER_FRAGMENT REG_BIT(15) -#define LSC_CHICKEN_BIT_0_UDW _MMIO(0xe7c8 + 4) +#define LSC_CHICKEN_BIT_0_UDW MCR_REG(0xe7c8 + 4) #define DIS_CHAIN_2XSIMD8 REG_BIT(55 - 32) #define FORCE_SLM_FENCE_SCOPE_TO_TILE REG_BIT(42 - 32) #define FORCE_UGM_FENCE_SCOPE_TO_TILE REG_BIT(41 - 32) #define MAXREQS_PER_BANK REG_GENMASK(39 - 32, 37 - 32) #define DISABLE_128B_EVICTION_COMMAND_UDW REG_BIT(36 - 32) -#define SARB_CHICKEN1 _MMIO(0xe90c) +#define SARB_CHICKEN1 MCR_REG(0xe90c) #define COMP_CKN_IN REG_GENMASK(30, 29) #define GEN7_ROW_CHICKEN2_GT2 _MMIO(0xf4f4) From patchwork Fri Oct 14 23:02:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 13007441 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5D03DC4332F for ; Fri, 14 Oct 2022 23:04:11 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C691810E13F; Fri, 14 Oct 2022 23:03:16 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8E5C110E122; Fri, 14 Oct 2022 23:03:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665788584; x=1697324584; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=z7EleniTuFJ71WdlMSQBDeHBZE0YXti1a06FLp6ShX8=; b=aqVcLOWF8VTHvBIYppvJuJkMPxEx4/hW57IOM2P0CypeyrputGV8dfrT U+7nHCAc0Zlgac0QLCcaw9eTFFX+pKpVbOy/xUzB/Cbd2gIndpUtptFT4 ppNoMMbu/idbLAOk+t0S2MXOYudYBL+ItUS70QJHh1gDcoUOVdstv3NVK HN54JxIp5f7Fld/OzJy7/6mF86q2DYVQo11UToi/rRGSMKzduNWQKFBhN 0BnVLYC88uPVbx3/LaVF24+ZlEc8EqCyj6mj2LVd+8zCVClycJmm6oNwg fCUL82Zw0MYHqX4xO7xODHkaDr9hnRsreSouwOU9+lvIDw+DSARx9UArV g==; X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="285216972" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="285216972" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:03 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="696471714" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="696471714" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v3 09/14] drm/i915/gt: Always use MCR functions on multicast registers Date: Fri, 14 Oct 2022 16:02:34 -0700 Message-Id: <20221014230239.1023689-10-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221014230239.1023689-1-matthew.d.roper@intel.com> References: <20221014230239.1023689-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Balasubramani Vivekanandan , dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Rather than relying on the implicit behavior of intel_uncore_*() functions, let's always use the intel_gt_mcr_*() functions to operate on multicast/replicated registers. v2: - Add TLB invalidation registers v3: - Switch more uncore operations in mmio_invalidate_full() to MCR operations for Xe_HP. (Bala) Cc: Balasubramani Vivekanandan Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/intel_gt.c | 58 ++++++++++++++++------- drivers/gpu/drm/i915/gt/intel_mocs.c | 13 ++--- drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 12 +++-- drivers/gpu/drm/i915/intel_pm.c | 19 ++++---- 4 files changed, 65 insertions(+), 37 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index e14f159ad9fc..3df0d0336dbc 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -1017,6 +1017,32 @@ get_reg_and_bit(const struct intel_engine_cs *engine, const bool gen8, return rb; } +/* + * HW architecture suggest typical invalidation time at 40us, + * with pessimistic cases up to 100us and a recommendation to + * cap at 1ms. We go a bit higher just in case. + */ +#define TLB_INVAL_TIMEOUT_US 100 +#define TLB_INVAL_TIMEOUT_MS 4 + +/* + * On Xe_HP the TLB invalidation registers are located at the same MMIO offsets + * but are now considered MCR registers. Since they exist within a GAM range, + * the primary instance of the register rolls up the status from each unit. + */ +static int wait_for_invalidate(struct intel_gt *gt, struct reg_and_bit rb) +{ + if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) + return intel_gt_mcr_wait_for_reg_fw(gt, rb.reg, rb.bit, 0, + TLB_INVAL_TIMEOUT_US, + TLB_INVAL_TIMEOUT_MS); + else + return __intel_wait_for_register_fw(gt->uncore, rb.reg, rb.bit, 0, + TLB_INVAL_TIMEOUT_US, + TLB_INVAL_TIMEOUT_MS, + NULL); +} + static void mmio_invalidate_full(struct intel_gt *gt) { static const i915_reg_t gen8_regs[] = { @@ -1048,7 +1074,7 @@ static void mmio_invalidate_full(struct intel_gt *gt) unsigned int num = 0; if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { - regs = xehp_regs; + regs = NULL; num = ARRAY_SIZE(xehp_regs); } else if (GRAPHICS_VER(i915) == 12) { regs = gen12_regs; @@ -1075,11 +1101,17 @@ static void mmio_invalidate_full(struct intel_gt *gt) if (!intel_engine_pm_is_awake(engine)) continue; - rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); - if (!i915_mmio_reg_offset(rb.reg)) - continue; + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + intel_gt_mcr_multicast_write_fw(gt, + xehp_regs[engine->class], + BIT(engine->instance)); + } else { + rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); + if (!i915_mmio_reg_offset(rb.reg)) + continue; - intel_uncore_write_fw(uncore, rb.reg, rb.bit); + intel_uncore_write_fw(uncore, rb.reg, rb.bit); + } awake |= engine->mask; } @@ -1099,22 +1131,12 @@ static void mmio_invalidate_full(struct intel_gt *gt) for_each_engine_masked(engine, gt, awake, tmp) { struct reg_and_bit rb; - /* - * HW architecture suggest typical invalidation time at 40us, - * with pessimistic cases up to 100us and a recommendation to - * cap at 1ms. We go a bit higher just in case. - */ - const unsigned int timeout_us = 100; - const unsigned int timeout_ms = 4; - rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); - if (__intel_wait_for_register_fw(uncore, - rb.reg, rb.bit, 0, - timeout_us, timeout_ms, - NULL)) + + if (wait_for_invalidate(gt, rb)) drm_err_ratelimited(>->i915->drm, "%s TLB invalidation did not complete in %ums!\n", - engine->name, timeout_ms); + engine->name, TLB_INVAL_TIMEOUT_MS); } /* diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c index ecfa5baa5e3f..49fdd509527a 100644 --- a/drivers/gpu/drm/i915/gt/intel_mocs.c +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c @@ -7,6 +7,7 @@ #include "intel_engine.h" #include "intel_gt.h" +#include "intel_gt_mcr.h" #include "intel_gt_regs.h" #include "intel_mocs.h" #include "intel_ring.h" @@ -609,17 +610,17 @@ static u32 l3cc_combine(u16 low, u16 high) 0; \ i++) -static void init_l3cc_table(struct intel_uncore *uncore, +static void init_l3cc_table(struct intel_gt *gt, const struct drm_i915_mocs_table *table) { unsigned int i; u32 l3cc; for_each_l3cc(l3cc, table, i) - if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50)) - intel_uncore_write_fw(uncore, XEHP_LNCFCMOCS(i), l3cc); + if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) + intel_gt_mcr_multicast_write_fw(gt, XEHP_LNCFCMOCS(i), l3cc); else - intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc); + intel_uncore_write_fw(gt->uncore, GEN9_LNCFCMOCS(i), l3cc); } void intel_mocs_init_engine(struct intel_engine_cs *engine) @@ -639,7 +640,7 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine) init_mocs_table(engine, &table); if (flags & HAS_RENDER_L3CC && engine->class == RENDER_CLASS) - init_l3cc_table(engine->uncore, &table); + init_l3cc_table(engine->gt, &table); } static u32 global_mocs_offset(void) @@ -675,7 +676,7 @@ void intel_mocs_init(struct intel_gt *gt) * memory transactions including guc transactions */ if (flags & HAS_RENDER_L3CC) - init_l3cc_table(gt->uncore, &table); + init_l3cc_table(gt, &table); } #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c index 9229243992c2..5b86b2e286e0 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c @@ -10,12 +10,15 @@ */ #include "gt/intel_gt.h" +#include "gt/intel_gt_mcr.h" #include "gt/intel_gt_regs.h" #include "intel_guc_fw.h" #include "i915_drv.h" -static void guc_prepare_xfer(struct intel_uncore *uncore) +static void guc_prepare_xfer(struct intel_gt *gt) { + struct intel_uncore *uncore = gt->uncore; + u32 shim_flags = GUC_ENABLE_READ_CACHE_LOGIC | GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA | GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA | @@ -35,8 +38,9 @@ static void guc_prepare_xfer(struct intel_uncore *uncore) if (GRAPHICS_VER(uncore->i915) == 9) { /* DOP Clock Gating Enable for GuC clocks */ - intel_uncore_rmw(uncore, GEN8_MISCCPCTL, - 0, GEN8_DOP_CLOCK_GATE_GUC_ENABLE); + intel_gt_mcr_multicast_write(gt, GEN8_MISCCPCTL, + GEN8_DOP_CLOCK_GATE_GUC_ENABLE | + intel_gt_mcr_read_any(gt, GEN8_MISCCPCTL)); /* allows for 5us (in 10ns units) before GT can go to RC6 */ intel_uncore_write(uncore, GUC_ARAT_C6DIS, 0x1FF); @@ -168,7 +172,7 @@ int intel_guc_fw_upload(struct intel_guc *guc) struct intel_uncore *uncore = gt->uncore; int ret; - guc_prepare_xfer(uncore); + guc_prepare_xfer(gt); /* * Note that GuC needs the CSS header plus uKernel code to be copied diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 390802245514..cb18e45f6adf 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -30,6 +30,8 @@ #include "display/skl_watermark.h" #include "gt/intel_engine_regs.h" +#include "gt/intel_gt.h" +#include "gt/intel_gt_mcr.h" #include "gt/intel_gt_regs.h" #include "i915_drv.h" @@ -4321,22 +4323,22 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv, u32 val; /* WaTempDisableDOPClkGating:bdw */ - misccpctl = intel_uncore_rmw(&dev_priv->uncore, GEN8_MISCCPCTL, ~GEN8_DOP_CLOCK_GATE_ENABLE, - 0); + misccpctl = intel_gt_mcr_multicast_rmw(to_gt(dev_priv), GEN8_MISCCPCTL, + ~GEN8_DOP_CLOCK_GATE_ENABLE, 0); - val = intel_uncore_read(&dev_priv->uncore, GEN8_L3SQCREG1); + val = intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_L3SQCREG1); val &= ~L3_PRIO_CREDITS_MASK; val |= L3_GENERAL_PRIO_CREDITS(general_prio_credits); val |= L3_HIGH_PRIO_CREDITS(high_prio_credits); - intel_uncore_write(&dev_priv->uncore, GEN8_L3SQCREG1, val); + intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_L3SQCREG1, val); /* * Wait at least 100 clocks before re-enabling clock gating. * See the definition of L3SQCREG1 in BSpec. */ - intel_uncore_posting_read(&dev_priv->uncore, GEN8_L3SQCREG1); + intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_L3SQCREG1); udelay(1); - intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl); + intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL, misccpctl); } static void icl_init_clock_gating(struct drm_i915_private *dev_priv) @@ -4496,9 +4498,8 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv) gen9_init_clock_gating(dev_priv); /* WaDisableDopClockGating:skl */ - intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, - intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL) & - ~GEN8_DOP_CLOCK_GATE_ENABLE); + intel_gt_mcr_multicast_rmw(to_gt(dev_priv), GEN8_MISCCPCTL, + GEN8_DOP_CLOCK_GATE_ENABLE, 0); /* WAC6entrylatency:skl */ intel_uncore_write(&dev_priv->uncore, FBC_LLC_READ_CTRL, intel_uncore_read(&dev_priv->uncore, FBC_LLC_READ_CTRL) | From patchwork Fri Oct 14 23:02:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 13007440 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8023BC4332F for ; Fri, 14 Oct 2022 23:04:05 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2E30C10E13C; Fri, 14 Oct 2022 23:03:15 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id ADFBB10E13D; Fri, 14 Oct 2022 23:03:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665788584; x=1697324584; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zW3e7StAu0gehtFTpQqfKu++AyXoea1TtiGzkFdY0VA=; b=am7mB2fJHY2sYC252XSKdvhHnfM9O3eJqG7RBkTJfLp4HYROPDyu4x26 YMcNx/okA5n+u0Fb/oQVsQn3jC2ZY4VIiII1Wv/rtfmQ7Js1QeB6OVic0 MZ5LZ4Evmdbu2XnxUZ3CDX9uHKnWCGL4IubphO3+l1nHSGdpTzn1cDwcG 5xv8rDm7AnDWzbldVtLAdzmb8b12FPXaE8tM7A38EhojBLGdhA0WZn058 3fw6nt+fhwUcuoZum9MpCSpHTPIKpwLraDdbvIQv3sE7z5FNnCwKwl2GS FeEro37NpdIKSXq7vZ8pTQl6BVOLO/Rv2lgGciT4d3rPGMPczoHO76zfO w==; X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="285216973" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="285216973" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:03 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="696471717" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="696471717" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v3 10/14] drm/i915/guc: Handle save/restore of MCR registers explicitly Date: Fri, 14 Oct 2022 16:02:35 -0700 Message-Id: <20221014230239.1023689-11-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221014230239.1023689-1-matthew.d.roper@intel.com> References: <20221014230239.1023689-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" MCR registers can be placed on the GuC's save/restore list, but at the moment they are always handled in a multicast manner (i.e., the GuC reads one instance to save the value and then does a multicast write to restore that single value to all instances). In the future the GuC will probably give us an alternate interface to do unicast per-instance save/restore operations, so we should be very clear about which registers on the list are MCR registers (and in the future which save/restore behavior we want for them). Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 55 +++++++++++++--------- 1 file changed, 34 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index cc357fa0c270..de923fb82301 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -278,24 +278,16 @@ __mmio_reg_add(struct temp_regset *regset, struct guc_mmio_reg *reg) return slot; } -#define GUC_REGSET_STEERING(group, instance) ( \ - FIELD_PREP(GUC_REGSET_STEERING_GROUP, (group)) | \ - FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, (instance)) | \ - GUC_REGSET_NEEDS_STEERING \ -) - static long __must_check guc_mmio_reg_add(struct intel_gt *gt, struct temp_regset *regset, - i915_reg_t reg, u32 flags) + u32 offset, u32 flags) { u32 count = regset->storage_used - (regset->registers - regset->storage); - u32 offset = i915_mmio_reg_offset(reg); struct guc_mmio_reg entry = { .offset = offset, .flags = flags, }; struct guc_mmio_reg *slot; - u8 group, inst; /* * The mmio list is built using separate lists within the driver. @@ -307,17 +299,6 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt, sizeof(entry), guc_mmio_reg_cmp)) return 0; - /* - * The GuC doesn't have a default steering, so we need to explicitly - * steer all registers that need steering. However, we do not keep track - * of all the steering ranges, only of those that have a chance of using - * a non-default steering from the i915 pov. Instead of adding such - * tracking, it is easier to just program the default steering for all - * regs that don't need a non-default one. - */ - intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst); - entry.flags |= GUC_REGSET_STEERING(group, inst); - slot = __mmio_reg_add(regset, &entry); if (IS_ERR(slot)) return PTR_ERR(slot); @@ -335,6 +316,38 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt, #define GUC_MMIO_REG_ADD(gt, regset, reg, masked) \ guc_mmio_reg_add(gt, \ + regset, \ + i915_mmio_reg_offset(reg), \ + (masked) ? GUC_REGSET_MASKED : 0) + +#define GUC_REGSET_STEERING(group, instance) ( \ + FIELD_PREP(GUC_REGSET_STEERING_GROUP, (group)) | \ + FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, (instance)) | \ + GUC_REGSET_NEEDS_STEERING \ +) + +static long __must_check guc_mcr_reg_add(struct intel_gt *gt, + struct temp_regset *regset, + i915_reg_t reg, u32 flags) +{ + u8 group, inst; + + /* + * The GuC doesn't have a default steering, so we need to explicitly + * steer all registers that need steering. However, we do not keep track + * of all the steering ranges, only of those that have a chance of using + * a non-default steering from the i915 pov. Instead of adding such + * tracking, it is easier to just program the default steering for all + * regs that don't need a non-default one. + */ + intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst); + flags |= GUC_REGSET_STEERING(group, inst); + + return guc_mmio_reg_add(gt, regset, i915_mmio_reg_offset(reg), flags); +} + +#define GUC_MCR_REG_ADD(gt, regset, reg, masked) \ + guc_mcr_reg_add(gt, \ regset, \ (reg), \ (masked) ? GUC_REGSET_MASKED : 0) @@ -375,7 +388,7 @@ static int guc_mmio_regset_init(struct temp_regset *regset, /* add in local MOCS registers */ for (i = 0; i < LNCFCMOCS_REG_COUNT; i++) if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) - ret |= GUC_MMIO_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false); + ret |= GUC_MCR_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false); else ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false); From patchwork Fri Oct 14 23:02:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 13007447 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2E992C433FE for ; Fri, 14 Oct 2022 23:04:55 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9AD2610E14B; Fri, 14 Oct 2022 23:03:48 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id CEA7910E13F; Fri, 14 Oct 2022 23:03:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665788584; x=1697324584; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=FD2jnQjfI6Sl9rfEr7qs0HwCbiBLF231KFHNNSStBEk=; b=l0s5TmZzThi3WncL6NyAPlJ3UebFgsVwiRFf9mKiWQq3v1VzIpvQHvnD CWb+iDQNyBZPkx7gZ208gJnjEK62K1f5GH1WnFFW7vim57V9QcDJ9mQ+O A7J9+q2hVgKBO/mPIuBjmEOnflzcTTBMO81AhfK8mkTZo1GH4bhrnEyHG 9qUqEjdF0N1UtGIJgl6CNIytmTt1l1qNqhEVed+5DwJtTrrYesyBHaVr5 inyzVIXf9VMcc7UM+ADTQQKIWq7GMlvVRxuK0UEZ4x1u1oK8wRO0YsB/B xiOptk6bv1/l5BtEMD+KZukdLN13cJw/G171WL/KVM2/U/cwUph11PTK+ g==; X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="285216974" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="285216974" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:03 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="696471722" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="696471722" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:02 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v3 11/14] drm/i915/gt: Add MCR-specific workaround initializers Date: Fri, 14 Oct 2022 16:02:36 -0700 Message-Id: <20221014230239.1023689-12-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221014230239.1023689-1-matthew.d.roper@intel.com> References: <20221014230239.1023689-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Let's be more explicit about which of our workarounds are updating MCR registers. Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 433 +++++++++++------- .../gpu/drm/i915/gt/intel_workarounds_types.h | 4 +- 2 files changed, 263 insertions(+), 174 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 96b9f02a2284..7671994d5b7a 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -166,12 +166,33 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg, _wa_add(wal, &wa); } +static void wa_mcr_add(struct i915_wa_list *wal, i915_reg_t reg, + u32 clear, u32 set, u32 read_mask, bool masked_reg) +{ + struct i915_wa wa = { + .reg = reg, + .clr = clear, + .set = set, + .read = read_mask, + .masked_reg = masked_reg, + .is_mcr = 1, + }; + + _wa_add(wal, &wa); +} + static void wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set) { wa_add(wal, reg, clear, set, clear, false); } +static void +wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set) +{ + wa_mcr_add(wal, reg, clear, set, clear, false); +} + static void wa_write(struct i915_wa_list *wal, i915_reg_t reg, u32 set) { @@ -184,12 +205,24 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set) wa_write_clr_set(wal, reg, set, set); } +static void +wa_mcr_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set) +{ + wa_mcr_write_clr_set(wal, reg, set, set); +} + static void wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr) { wa_write_clr_set(wal, reg, clr, 0); } +static void +wa_mcr_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr) +{ + wa_mcr_write_clr_set(wal, reg, clr, 0); +} + /* * WA operations on "masked register". A masked register has the upper 16 bits * documented as "masked" in b-spec. Its purpose is to allow writing to just a @@ -207,12 +240,24 @@ wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val) wa_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true); } +static void +wa_mcr_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val) +{ + wa_mcr_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true); +} + static void wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val) { wa_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true); } +static void +wa_mcr_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val) +{ + wa_mcr_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true); +} + static void wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg, u32 mask, u32 val) @@ -220,6 +265,13 @@ wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg, wa_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true); } +static void +wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg, + u32 mask, u32 val) +{ + wa_mcr_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true); +} + static void gen6_ctx_workarounds_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) { @@ -241,8 +293,8 @@ static void gen8_ctx_workarounds_init(struct intel_engine_cs *engine, wa_masked_en(wal, RING_MI_MODE(RENDER_RING_BASE), ASYNC_FLIP_PERF_DISABLE); /* WaDisablePartialInstShootdown:bdw,chv */ - wa_masked_en(wal, GEN8_ROW_CHICKEN, - PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, + PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE); /* Use Force Non-Coherent whenever executing a 3D context. This is a * workaround for a possible hang in the unlikely event a TLB @@ -288,18 +340,18 @@ static void bdw_ctx_workarounds_init(struct intel_engine_cs *engine, gen8_ctx_workarounds_init(engine, wal); /* WaDisableThreadStallDopClockGating:bdw (pre-production) */ - wa_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE); /* WaDisableDopClockGating:bdw * * Also see the related UCGTCL1 write in bdw_init_clock_gating() * to disable EUTC clock gating. */ - wa_masked_en(wal, GEN8_ROW_CHICKEN2, - DOP_CLOCK_GATING_DISABLE); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, + DOP_CLOCK_GATING_DISABLE); - wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3, - GEN8_SAMPLER_POWER_BYPASS_DIS); + wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3, + GEN8_SAMPLER_POWER_BYPASS_DIS); wa_masked_en(wal, HDC_CHICKEN0, /* WaForceContextSaveRestoreNonCoherent:bdw */ @@ -314,7 +366,7 @@ static void chv_ctx_workarounds_init(struct intel_engine_cs *engine, gen8_ctx_workarounds_init(engine, wal); /* WaDisableThreadStallDopClockGating:chv */ - wa_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE); /* Improve HiZ throughput on CHV. */ wa_masked_en(wal, HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X); @@ -333,21 +385,21 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine, */ wa_masked_en(wal, COMMON_SLICE_CHICKEN2, GEN9_PBE_COMPRESSED_HASH_SELECTION); - wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7, - GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR); + wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7, + GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR); } /* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */ /* WaDisablePartialInstShootdown:skl,bxt,kbl,glk,cfl */ - wa_masked_en(wal, GEN8_ROW_CHICKEN, - FLOW_CONTROL_ENABLE | - PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, + FLOW_CONTROL_ENABLE | + PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE); /* WaEnableYV12BugFixInHalfSliceChicken7:skl,bxt,kbl,glk,cfl */ /* WaEnableSamplerGPGPUPreemptionSupport:skl,bxt,kbl,cfl */ - wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7, - GEN9_ENABLE_YV12_BUGFIX | - GEN9_ENABLE_GPGPU_PREEMPTION); + wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7, + GEN9_ENABLE_YV12_BUGFIX | + GEN9_ENABLE_GPGPU_PREEMPTION); /* Wa4x4STCOptimizationDisable:skl,bxt,kbl,glk,cfl */ /* WaDisablePartialResolveInVc:skl,bxt,kbl,cfl */ @@ -356,8 +408,8 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine, GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE); /* WaCcsTlbPrefetchDisable:skl,bxt,kbl,glk,cfl */ - wa_masked_dis(wal, GEN9_HALF_SLICE_CHICKEN5, - GEN9_CCS_TLB_PREFETCH_ENABLE); + wa_mcr_masked_dis(wal, GEN9_HALF_SLICE_CHICKEN5, + GEN9_CCS_TLB_PREFETCH_ENABLE); /* WaForceContextSaveRestoreNonCoherent:skl,bxt,kbl,cfl */ wa_masked_en(wal, HDC_CHICKEN0, @@ -386,11 +438,11 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine, IS_KABYLAKE(i915) || IS_COFFEELAKE(i915) || IS_COMETLAKE(i915)) - wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3, - GEN8_SAMPLER_POWER_BYPASS_DIS); + wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3, + GEN8_SAMPLER_POWER_BYPASS_DIS); /* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */ - wa_masked_en(wal, HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE); + wa_mcr_masked_en(wal, HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE); /* * Supporting preemption with fine-granularity requires changes in the @@ -469,8 +521,8 @@ static void bxt_ctx_workarounds_init(struct intel_engine_cs *engine, gen9_ctx_workarounds_init(engine, wal); /* WaDisableThreadStallDopClockGating:bxt */ - wa_masked_en(wal, GEN8_ROW_CHICKEN, - STALL_DOP_GATING_DISABLE); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, + STALL_DOP_GATING_DISABLE); /* WaToEnableHwFixForPushConstHWBug:bxt */ wa_masked_en(wal, COMMON_SLICE_CHICKEN2, @@ -490,8 +542,8 @@ static void kbl_ctx_workarounds_init(struct intel_engine_cs *engine, GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION); /* WaDisableSbeCacheDispatchPortSharing:kbl */ - wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, - GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE); + wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, + GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE); } static void glk_ctx_workarounds_init(struct intel_engine_cs *engine, @@ -514,8 +566,8 @@ static void cfl_ctx_workarounds_init(struct intel_engine_cs *engine, GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION); /* WaDisableSbeCacheDispatchPortSharing:cfl */ - wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, - GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE); + wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, + GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE); } static void icl_ctx_workarounds_init(struct intel_engine_cs *engine, @@ -534,13 +586,13 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine, * (the register is whitelisted in hardware now, so UMDs can opt in * for coherency if they have a good reason). */ - wa_masked_en(wal, ICL_HDC_MODE, HDC_FORCE_NON_COHERENT); + wa_mcr_masked_en(wal, ICL_HDC_MODE, HDC_FORCE_NON_COHERENT); /* WaEnableFloatBlendOptimization:icl */ - wa_add(wal, GEN10_CACHE_MODE_SS, 0, - _MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE), - 0 /* write-only, so skip validation */, - true); + wa_mcr_add(wal, GEN10_CACHE_MODE_SS, 0, + _MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE), + 0 /* write-only, so skip validation */, + true); /* WaDisableGPGPUMidThreadPreemption:icl */ wa_masked_field_set(wal, GEN8_CS_CHICKEN1, @@ -548,8 +600,8 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine, GEN9_PREEMPT_GPGPU_THREAD_GROUP_LEVEL); /* allow headerless messages for preemptible GPGPU context */ - wa_masked_en(wal, GEN10_SAMPLER_MODE, - GEN11_SAMPLER_ENABLE_HEADLESS_MSG); + wa_mcr_masked_en(wal, GEN10_SAMPLER_MODE, + GEN11_SAMPLER_ENABLE_HEADLESS_MSG); /* Wa_1604278689:icl,ehl */ wa_write(wal, IVB_FBC_RT_BASE, 0xFFFFFFFF & ~ILK_FBC_RT_VALID); @@ -558,7 +610,7 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine, 0xFFFFFFFF); /* Wa_1406306137:icl,ehl */ - wa_masked_en(wal, GEN9_ROW_CHICKEN4, GEN11_DIS_PICK_2ND_EU); + wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, GEN11_DIS_PICK_2ND_EU); } /* @@ -569,13 +621,13 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) { wa_masked_en(wal, CHICKEN_RASTER_2, TBIMR_FAST_CLIP); - wa_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK, - REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f)); - wa_add(wal, - XEHP_FF_MODE2, - FF_MODE2_TDS_TIMER_MASK, - FF_MODE2_TDS_TIMER_128, - 0, false); + wa_mcr_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK, + REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f)); + wa_mcr_add(wal, + XEHP_FF_MODE2, + FF_MODE2_TDS_TIMER_MASK, + FF_MODE2_TDS_TIMER_128, + 0, false); } /* @@ -664,27 +716,27 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine, /* Wa_16011186671:dg2_g11 */ if (IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) { - wa_masked_dis(wal, VFLSKPD, DIS_MULT_MISS_RD_SQUASH); - wa_masked_en(wal, VFLSKPD, DIS_OVER_FETCH_CACHE); + wa_mcr_masked_dis(wal, VFLSKPD, DIS_MULT_MISS_RD_SQUASH); + wa_mcr_masked_en(wal, VFLSKPD, DIS_OVER_FETCH_CACHE); } if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) { /* Wa_14010469329:dg2_g10 */ - wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3, - XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE); + wa_mcr_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3, + XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE); /* * Wa_22010465075:dg2_g10 * Wa_22010613112:dg2_g10 * Wa_14010698770:dg2_g10 */ - wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3, - GEN12_DISABLE_CPS_AWARE_COLOR_PIPE); + wa_mcr_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3, + GEN12_DISABLE_CPS_AWARE_COLOR_PIPE); } /* Wa_16013271637:dg2 */ - wa_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1, - MSC_MSAA_REODER_BUF_BYPASS_DISABLE); + wa_mcr_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1, + MSC_MSAA_REODER_BUF_BYPASS_DISABLE); /* Wa_14014947963:dg2 */ if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_B0, STEP_FOREVER) || @@ -1264,9 +1316,9 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) PSDUNIT_CLKGATE_DIS); /* Wa_1406680159:icl,ehl */ - wa_write_or(wal, - GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE, - GWUNIT_CLKGATE_DIS); + wa_mcr_write_or(wal, + GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE, + GWUNIT_CLKGATE_DIS); /* Wa_1607087056:icl,ehl,jsl */ if (IS_ICELAKE(i915) || @@ -1279,7 +1331,7 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) * This is not a documented workaround, but rather an optimization * to reduce sampler power. */ - wa_write_clr(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE); + wa_mcr_write_clr(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE); } /* @@ -1313,7 +1365,7 @@ gen12_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) wa_14011060649(gt, wal); /* Wa_14011059788:tgl,rkl,adl-s,dg1,adl-p */ - wa_write_or(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE); + wa_mcr_write_or(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE); } static void @@ -1325,9 +1377,9 @@ tgl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) /* Wa_1409420604:tgl */ if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0)) - wa_write_or(wal, - SUBSLICE_UNIT_LEVEL_CLKGATE2, - CPSSUNIT_CLKGATE_DIS); + wa_mcr_write_or(wal, + SUBSLICE_UNIT_LEVEL_CLKGATE2, + CPSSUNIT_CLKGATE_DIS); /* Wa_1607087056:tgl also know as BUG:1409180338 */ if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0)) @@ -1356,9 +1408,9 @@ dg1_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) /* Wa_1409420604:dg1 */ if (IS_DG1(i915)) - wa_write_or(wal, - SUBSLICE_UNIT_LEVEL_CLKGATE2, - CPSSUNIT_CLKGATE_DIS); + wa_mcr_write_or(wal, + SUBSLICE_UNIT_LEVEL_CLKGATE2, + CPSSUNIT_CLKGATE_DIS); /* Wa_1408615072:dg1 */ /* Empirical testing shows this register is unaffected by engine reset. */ @@ -1375,7 +1427,7 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) xehp_init_mcr(gt, wal); /* Wa_1409757795:xehpsdv */ - wa_write_or(wal, SCCGCTL94DC, CG3DDISURB); + wa_mcr_write_or(wal, SCCGCTL94DC, CG3DDISURB); /* Wa_16011155590:xehpsdv */ if (IS_XEHPSDV_GRAPHICS_STEP(i915, STEP_A0, STEP_B0)) @@ -1455,8 +1507,8 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) CG3DDISCFEG_CLKGATE_DIS); /* Wa_14011006942:dg2 */ - wa_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE, - DSS_ROUTER_CLKGATE_DIS); + wa_mcr_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE, + DSS_ROUTER_CLKGATE_DIS); } if (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_B0)) { @@ -1467,7 +1519,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) wa_write_or(wal, UNSLCGCTL9444, LTCDD_CLKGATE_DIS); /* Wa_14011371254:dg2_g10 */ - wa_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS); + wa_mcr_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS); /* Wa_14011431319:dg2_g10 */ wa_write_or(wal, UNSLCGCTL9440, GAMTLBOACS_CLKGATE_DIS | @@ -1503,21 +1555,21 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) GAMEDIA_CLKGATE_DIS); /* Wa_14011028019:dg2_g10 */ - wa_write_or(wal, SSMCGCTL9530, RTFUNIT_CLKGATE_DIS); + wa_mcr_write_or(wal, SSMCGCTL9530, RTFUNIT_CLKGATE_DIS); } /* Wa_14014830051:dg2 */ - wa_write_clr(wal, SARB_CHICKEN1, COMP_CKN_IN); + wa_mcr_write_clr(wal, SARB_CHICKEN1, COMP_CKN_IN); /* * The following are not actually "workarounds" but rather * recommended tuning settings documented in the bspec's * performance guide section. */ - wa_write_or(wal, XEHP_SQCM, EN_32B_ACCESS); + wa_mcr_write_or(wal, XEHP_SQCM, EN_32B_ACCESS); /* Wa_14015795083 */ - wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); + wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); } static void @@ -1526,7 +1578,7 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) pvc_init_mcr(gt, wal); /* Wa_14015795083 */ - wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); + wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); } static void @@ -1638,14 +1690,25 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal) u32 val, old = 0; /* open-coded rmw due to steering */ - old = wa->clr ? intel_gt_mcr_read_any_fw(gt, wa->reg) : 0; + if (wa->clr) + old = wa->is_mcr ? + intel_gt_mcr_read_any_fw(gt, wa->reg) : + intel_uncore_read_fw(uncore, wa->reg); val = (old & ~wa->clr) | wa->set; - if (val != old || !wa->clr) - intel_uncore_write_fw(uncore, wa->reg, val); + if (val != old || !wa->clr) { + if (wa->is_mcr) + intel_gt_mcr_multicast_write_fw(gt, wa->reg, val); + else + intel_uncore_write_fw(uncore, wa->reg, val); + } + + if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) { + u32 val = wa->is_mcr ? + intel_gt_mcr_read_any_fw(gt, wa->reg) : + intel_uncore_read_fw(uncore, wa->reg); - if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) - wa_verify(wa, intel_gt_mcr_read_any_fw(gt, wa->reg), - wal->name, "application"); + wa_verify(wa, val, wal->name, "application"); + } } intel_uncore_forcewake_put__locked(uncore, fw); @@ -1674,8 +1737,9 @@ static bool wa_list_verify(struct intel_gt *gt, intel_uncore_forcewake_get__locked(uncore, fw); for (i = 0, wa = wal->list; i < wal->count; i++, wa++) - ok &= wa_verify(wa, - intel_gt_mcr_read_any_fw(gt, wa->reg), + ok &= wa_verify(wa, wa->is_mcr ? + intel_gt_mcr_read_any_fw(gt, wa->reg) : + intel_uncore_read_fw(uncore, wa->reg), wal->name, from); intel_uncore_forcewake_put__locked(uncore, fw); @@ -1721,12 +1785,36 @@ whitelist_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags) _wa_add(wal, &wa); } +static void +whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags) +{ + struct i915_wa wa = { + .reg = reg, + .is_mcr = 1, + }; + + if (GEM_DEBUG_WARN_ON(wal->count >= RING_MAX_NONPRIV_SLOTS)) + return; + + if (GEM_DEBUG_WARN_ON(!is_nonpriv_flags_valid(flags))) + return; + + wa.reg.reg |= flags; + _wa_add(wal, &wa); +} + static void whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg) { whitelist_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW); } +static void +whitelist_mcr_reg(struct i915_wa_list *wal, i915_reg_t reg) +{ + whitelist_mcr_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW); +} + static void gen9_whitelist_build(struct i915_wa_list *w) { /* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */ @@ -1752,7 +1840,7 @@ static void skl_whitelist_build(struct intel_engine_cs *engine) gen9_whitelist_build(w); /* WaDisableLSQCROPERFforOCL:skl */ - whitelist_reg(w, GEN8_L3SQCREG4); + whitelist_mcr_reg(w, GEN8_L3SQCREG4); } static void bxt_whitelist_build(struct intel_engine_cs *engine) @@ -1773,7 +1861,7 @@ static void kbl_whitelist_build(struct intel_engine_cs *engine) gen9_whitelist_build(w); /* WaDisableLSQCROPERFforOCL:kbl */ - whitelist_reg(w, GEN8_L3SQCREG4); + whitelist_mcr_reg(w, GEN8_L3SQCREG4); } static void glk_whitelist_build(struct intel_engine_cs *engine) @@ -1838,10 +1926,10 @@ static void icl_whitelist_build(struct intel_engine_cs *engine) switch (engine->class) { case RENDER_CLASS: /* WaAllowUMDToModifyHalfSliceChicken7:icl */ - whitelist_reg(w, GEN9_HALF_SLICE_CHICKEN7); + whitelist_mcr_reg(w, GEN9_HALF_SLICE_CHICKEN7); /* WaAllowUMDToModifySamplerMode:icl */ - whitelist_reg(w, GEN10_SAMPLER_MODE); + whitelist_mcr_reg(w, GEN10_SAMPLER_MODE); /* WaEnableStateCacheRedirectToCS:icl */ whitelist_reg(w, GEN9_SLICE_COMMON_ECO_CHICKEN1); @@ -2117,21 +2205,21 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) { /* Wa_14013392000:dg2_g11 */ - wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE); } if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_FOREVER) || IS_DG2_G11(i915) || IS_DG2_G12(i915)) { /* Wa_1509727124:dg2 */ - wa_masked_en(wal, GEN10_SAMPLER_MODE, - SC_DISABLE_POWER_OPTIMIZATION_EBB); + wa_mcr_masked_en(wal, GEN10_SAMPLER_MODE, + SC_DISABLE_POWER_OPTIMIZATION_EBB); } if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0) || IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) { /* Wa_14012419201:dg2 */ - wa_masked_en(wal, GEN9_ROW_CHICKEN4, - GEN12_DISABLE_HDR_PAST_PAYLOAD_HOLD_FIX); + wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, + GEN12_DISABLE_HDR_PAST_PAYLOAD_HOLD_FIX); } if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_C0) || @@ -2140,13 +2228,13 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) * Wa_22012826095:dg2 * Wa_22013059131:dg2 */ - wa_write_clr_set(wal, LSC_CHICKEN_BIT_0_UDW, - MAXREQS_PER_BANK, - REG_FIELD_PREP(MAXREQS_PER_BANK, 2)); + wa_mcr_write_clr_set(wal, LSC_CHICKEN_BIT_0_UDW, + MAXREQS_PER_BANK, + REG_FIELD_PREP(MAXREQS_PER_BANK, 2)); /* Wa_22013059131:dg2 */ - wa_write_or(wal, LSC_CHICKEN_BIT_0, - FORCE_1_SUB_MESSAGE_PER_FRAGMENT); + wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0, + FORCE_1_SUB_MESSAGE_PER_FRAGMENT); } /* Wa_1308578152:dg2_g10 when first gslice is fused off */ @@ -2159,19 +2247,19 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_FOREVER) || IS_DG2_G11(i915) || IS_DG2_G12(i915)) { /* Wa_22013037850:dg2 */ - wa_write_or(wal, LSC_CHICKEN_BIT_0_UDW, - DISABLE_128B_EVICTION_COMMAND_UDW); + wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0_UDW, + DISABLE_128B_EVICTION_COMMAND_UDW); /* Wa_22012856258:dg2 */ - wa_masked_en(wal, GEN8_ROW_CHICKEN2, - GEN12_DISABLE_READ_SUPPRESSION); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, + GEN12_DISABLE_READ_SUPPRESSION); /* * Wa_22010960976:dg2 * Wa_14013347512:dg2 */ - wa_masked_dis(wal, XEHP_HDC_CHICKEN0, - LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK); + wa_mcr_masked_dis(wal, XEHP_HDC_CHICKEN0, + LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK); } if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0)) { @@ -2179,8 +2267,8 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) * Wa_1608949956:dg2_g10 * Wa_14010198302:dg2_g10 */ - wa_masked_en(wal, GEN8_ROW_CHICKEN, - MDQ_ARBITRATION_MODE | UGM_BACKUP_MODE); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, + MDQ_ARBITRATION_MODE | UGM_BACKUP_MODE); /* * Wa_14010918519:dg2_g10 @@ -2188,31 +2276,31 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) * LSC_CHICKEN_BIT_0 always reads back as 0 is this stepping, * so ignoring verification. */ - wa_add(wal, LSC_CHICKEN_BIT_0_UDW, 0, - FORCE_SLM_FENCE_SCOPE_TO_TILE | FORCE_UGM_FENCE_SCOPE_TO_TILE, - 0, false); + wa_mcr_add(wal, LSC_CHICKEN_BIT_0_UDW, 0, + FORCE_SLM_FENCE_SCOPE_TO_TILE | FORCE_UGM_FENCE_SCOPE_TO_TILE, + 0, false); } if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0)) { /* Wa_22010430635:dg2 */ - wa_masked_en(wal, - GEN9_ROW_CHICKEN4, - GEN12_DISABLE_GRF_CLEAR); + wa_mcr_masked_en(wal, + GEN9_ROW_CHICKEN4, + GEN12_DISABLE_GRF_CLEAR); /* Wa_14010648519:dg2 */ - wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE); + wa_mcr_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE); } /* Wa_14013202645:dg2 */ if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_C0) || IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) - wa_write_or(wal, RT_CTRL, DIS_NULL_QUERY); + wa_mcr_write_or(wal, RT_CTRL, DIS_NULL_QUERY); /* Wa_22012532006:dg2 */ if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_C0) || IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) - wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7, - DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA); + wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7, + DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA); if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) { /* Wa_14010680813:dg2_g10 */ @@ -2223,17 +2311,16 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0) || IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) { /* Wa_14012362059:dg2 */ - wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB); + wa_mcr_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB); } if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_B0, STEP_FOREVER) || IS_DG2_G10(i915)) { /* Wa_22014600077:dg2 */ - wa_add(wal, GEN10_CACHE_MODE_SS, 0, - _MASKED_BIT_ENABLE(ENABLE_EU_COUNT_FOR_TDL_FLUSH), - 0 /* Wa_14012342262 :write-only reg, so skip - verification */, - true); + wa_mcr_add(wal, GEN10_CACHE_MODE_SS, 0, + _MASKED_BIT_ENABLE(ENABLE_EU_COUNT_FOR_TDL_FLUSH), + 0 /* Wa_14012342262 write-only reg, so skip verification */, + true); } if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) || @@ -2260,7 +2347,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) || IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) { /* Wa_1606931601:tgl,rkl,dg1,adl-s,adl-p */ - wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ); /* * Wa_1407928979:tgl A* @@ -2289,14 +2376,14 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) { /* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */ - wa_masked_en(wal, GEN8_ROW_CHICKEN2, - GEN12_PUSH_CONST_DEREF_HOLD_DIS); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, + GEN12_PUSH_CONST_DEREF_HOLD_DIS); /* * Wa_1409085225:tgl * Wa_14010229206:tgl,rkl,dg1[a0],adl-s,adl-p */ - wa_masked_en(wal, GEN9_ROW_CHICKEN4, GEN12_DISABLE_TDL_PUSH); + wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, GEN12_DISABLE_TDL_PUSH); } if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) || @@ -2320,9 +2407,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915) || IS_ALDERLAKE_S(i915) || IS_ALDERLAKE_P(i915)) { /* Wa_1406941453:tgl,rkl,dg1,adl-s,adl-p */ - wa_masked_en(wal, - GEN10_SAMPLER_MODE, - ENABLE_SMALLPL); + wa_mcr_masked_en(wal, + GEN10_SAMPLER_MODE, + ENABLE_SMALLPL); } if (GRAPHICS_VER(i915) == 11) { @@ -2356,9 +2443,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) * Wa_1405733216:icl * Formerly known as WaDisableCleanEvicts */ - wa_write_or(wal, - GEN8_L3SQCREG4, - GEN11_LQSC_CLEAN_EVICT_DISABLE); + wa_mcr_write_or(wal, + GEN8_L3SQCREG4, + GEN11_LQSC_CLEAN_EVICT_DISABLE); /* Wa_1606682166:icl */ wa_write_or(wal, @@ -2366,10 +2453,10 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) GEN7_DISABLE_SAMPLER_PREFETCH); /* Wa_1409178092:icl */ - wa_write_clr_set(wal, - GEN11_SCRATCH2, - GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE, - 0); + wa_mcr_write_clr_set(wal, + GEN11_SCRATCH2, + GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE, + 0); /* WaEnable32PlaneMode:icl */ wa_masked_en(wal, GEN9_CSFE_CHICKEN1_RCS, @@ -2479,30 +2566,30 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE); /* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */ - wa_write_or(wal, - BDW_SCRATCH1, - GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE); + wa_mcr_write_or(wal, + BDW_SCRATCH1, + GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE); /* WaProgramL3SqcReg1DefaultForPerf:bxt,glk */ if (IS_GEN9_LP(i915)) - wa_write_clr_set(wal, - GEN8_L3SQCREG1, - L3_PRIO_CREDITS_MASK, - L3_GENERAL_PRIO_CREDITS(62) | - L3_HIGH_PRIO_CREDITS(2)); + wa_mcr_write_clr_set(wal, + GEN8_L3SQCREG1, + L3_PRIO_CREDITS_MASK, + L3_GENERAL_PRIO_CREDITS(62) | + L3_HIGH_PRIO_CREDITS(2)); /* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */ - wa_write_or(wal, - GEN8_L3SQCREG4, - GEN8_LQSC_FLUSH_COHERENT_LINES); + wa_mcr_write_or(wal, + GEN8_L3SQCREG4, + GEN8_LQSC_FLUSH_COHERENT_LINES); /* Disable atomics in L3 to prevent unrecoverable hangs */ wa_write_clr_set(wal, GEN9_SCRATCH_LNCF1, GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE, 0); - wa_write_clr_set(wal, GEN8_L3SQCREG4, - GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0); - wa_write_clr_set(wal, GEN9_SCRATCH1, - EVICTION_PERF_FIX_ENABLE, 0); + wa_mcr_write_clr_set(wal, GEN8_L3SQCREG4, + GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0); + wa_mcr_write_clr_set(wal, GEN9_SCRATCH1, + EVICTION_PERF_FIX_ENABLE, 0); } if (IS_HASWELL(i915)) { @@ -2716,7 +2803,7 @@ ccs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) { if (IS_PVC_CT_STEP(engine->i915, STEP_A0, STEP_C0)) { /* Wa_14014999345:pvc */ - wa_masked_en(wal, GEN10_CACHE_MODE_SS, DISABLE_ECC); + wa_mcr_masked_en(wal, GEN10_CACHE_MODE_SS, DISABLE_ECC); } } @@ -2742,8 +2829,8 @@ add_render_compute_tuning_settings(struct drm_i915_private *i915, } if (IS_DG2(i915)) { - wa_write_or(wal, XEHP_L3SCQREG7, BLEND_FILL_CACHING_OPT_DIS); - wa_write_clr_set(wal, RT_CTRL, STACKID_CTRL, STACKID_CTRL_512); + wa_mcr_write_or(wal, XEHP_L3SCQREG7, BLEND_FILL_CACHING_OPT_DIS); + wa_mcr_write_clr_set(wal, RT_CTRL, STACKID_CTRL, STACKID_CTRL_512); /* * This is also listed as Wa_22012654132 for certain DG2 @@ -2754,10 +2841,10 @@ add_render_compute_tuning_settings(struct drm_i915_private *i915, * back for verification on DG2 (due to Wa_14012342262), so * we need to explicitly skip the readback. */ - wa_add(wal, GEN10_CACHE_MODE_SS, 0, - _MASKED_BIT_ENABLE(ENABLE_PREFETCH_INTO_IC), - 0 /* write-only, so skip validation */, - true); + wa_mcr_add(wal, GEN10_CACHE_MODE_SS, 0, + _MASKED_BIT_ENABLE(ENABLE_PREFETCH_INTO_IC), + 0 /* write-only, so skip validation */, + true); } /* @@ -2766,8 +2853,8 @@ add_render_compute_tuning_settings(struct drm_i915_private *i915, * platforms. */ if (INTEL_INFO(i915)->tuning_thread_rr_after_dep) - wa_masked_field_set(wal, GEN9_ROW_CHICKEN4, THREAD_EX_ARB_MODE, - THREAD_EX_ARB_MODE_RR_AFTER_DEP); + wa_mcr_masked_field_set(wal, GEN9_ROW_CHICKEN4, THREAD_EX_ARB_MODE, + THREAD_EX_ARB_MODE_RR_AFTER_DEP); } /* @@ -2793,30 +2880,30 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li if (IS_XEHPSDV(i915)) { /* Wa_1409954639 */ - wa_masked_en(wal, - GEN8_ROW_CHICKEN, - SYSTOLIC_DOP_CLOCK_GATING_DIS); + wa_mcr_masked_en(wal, + GEN8_ROW_CHICKEN, + SYSTOLIC_DOP_CLOCK_GATING_DIS); /* Wa_1607196519 */ - wa_masked_en(wal, - GEN9_ROW_CHICKEN4, - GEN12_DISABLE_GRF_CLEAR); + wa_mcr_masked_en(wal, + GEN9_ROW_CHICKEN4, + GEN12_DISABLE_GRF_CLEAR); /* Wa_14010670810:xehpsdv */ - wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE); + wa_mcr_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE); /* Wa_14010449647:xehpsdv */ - wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, - GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE); + wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, + GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE); /* Wa_18011725039:xehpsdv */ if (IS_XEHPSDV_GRAPHICS_STEP(i915, STEP_A1, STEP_B0)) { - wa_masked_dis(wal, MLTICTXCTL, TDONRENDER); - wa_write_or(wal, L3SQCREG1_CCS0, FLUSHALLNONCOH); + wa_mcr_masked_dis(wal, MLTICTXCTL, TDONRENDER); + wa_mcr_write_or(wal, L3SQCREG1_CCS0, FLUSHALLNONCOH); } /* Wa_14012362059:xehpsdv */ - wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB); + wa_mcr_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB); /* Wa_14014368820:xehpsdv */ wa_write_or(wal, GEN12_GAMCNTRL_CTRL, INVALIDATION_BROADCAST_MODE_DIS | @@ -2825,19 +2912,19 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li if (IS_DG2(i915) || IS_PONTEVECCHIO(i915)) { /* Wa_14015227452:dg2,pvc */ - wa_masked_en(wal, GEN9_ROW_CHICKEN4, XEHP_DIS_BBL_SYSPIPE); + wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, XEHP_DIS_BBL_SYSPIPE); /* Wa_22014226127:dg2,pvc */ - wa_write_or(wal, LSC_CHICKEN_BIT_0, DISABLE_D8_D16_COASLESCE); + wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0, DISABLE_D8_D16_COASLESCE); /* Wa_16015675438:dg2,pvc */ wa_masked_en(wal, FF_SLICE_CS_CHICKEN2, GEN12_PERF_FIX_BALANCING_CFE_DISABLE); /* Wa_18018781329:dg2,pvc */ - wa_write_or(wal, RENDER_MOD_CTRL, FORCE_MISS_FTLB); - wa_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB); - wa_write_or(wal, VDBX_MOD_CTRL, FORCE_MISS_FTLB); - wa_write_or(wal, VEBX_MOD_CTRL, FORCE_MISS_FTLB); + wa_mcr_write_or(wal, RENDER_MOD_CTRL, FORCE_MISS_FTLB); + wa_mcr_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB); + wa_mcr_write_or(wal, VDBX_MOD_CTRL, FORCE_MISS_FTLB); + wa_mcr_write_or(wal, VEBX_MOD_CTRL, FORCE_MISS_FTLB); } if (IS_DG2(i915)) { @@ -2845,7 +2932,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li * Wa_16011620976:dg2_g11 * Wa_22015475538:dg2 */ - wa_write_or(wal, LSC_CHICKEN_BIT_0_UDW, DIS_CHAIN_2XSIMD8); + wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0_UDW, DIS_CHAIN_2XSIMD8); } } diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h index 8a4b6de4e754..f05b37e56fa9 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h +++ b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h @@ -15,7 +15,9 @@ struct i915_wa { u32 clr; u32 set; u32 read; - bool masked_reg; + + u32 masked_reg:1; + u32 is_mcr:1; }; struct i915_wa_list { From patchwork Fri Oct 14 23:02:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 13007449 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 25079C4332F for ; Fri, 14 Oct 2022 23:04:58 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C4E5110E157; Fri, 14 Oct 2022 23:03:51 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id D926210E140; Fri, 14 Oct 2022 23:03:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665788584; x=1697324584; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zOoLGbM3WKxvMW6RCMi/wh0fKZtXArLhraI63edvbNA=; b=Db1DMIS84eEKiGs7lycC7EVkvETPLmc0+iZwIu5PggNrCzThJgdMbJiP fAdH5CbfK0iEvTjp1DtL5Znfm0ilQqG+qeADubUi+1XdOCboCck65Mkud 4Dd8PkH5b+eA2HC227bRXgMzRkaYam2H2hZitJrBACPJAz5PqjcJzjAGh hulmLS5jhjec/B9rLxbo4FJkWGWXuwao512PzhnT3GXEAagibSCQXDHfX rDmHELYJJ3TOWIpMwVzp9r+ZtF75euZu784wV6Vny34b511dkULhAxI3y htbwTqj44+8K3acuiCQKM45qUGh4oGuQtR82n6zFvUFfTDFHYtASNcDaQ Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="285216975" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="285216975" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:03 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="696471724" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="696471724" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:03 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v3 12/14] drm/i915: Define multicast registers as a new type Date: Fri, 14 Oct 2022 16:02:37 -0700 Message-Id: <20221014230239.1023689-13-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221014230239.1023689-1-matthew.d.roper@intel.com> References: <20221014230239.1023689-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Rather than treating multicast registers as 'i915_reg_t' let's define them as a completely new type. This will allow the compiler to help us make sure we're using multicast-aware functions to operate on multicast registers. This plan does break down a bit in places where we're just maintaining heterogeneous lists of registers (e.g., various MMIO whitelists used by perf, GVT, etc.) rather than performing reads/writes. We only really care about the offset in those cases, so for now we can "cast" the registers as non-MCR, leaving us with a list of i915_reg_t's, but we may want to look for better ways to store mixed collections of i915_reg_t and i915_mcr_reg_t in the future. v2: - Add TLB invalidation registers v3: - Make type checking of i915_mmio_reg_offset() stricter. It will accept either i915_reg_t or i915_mcr_reg_t, but will now raise a compile error if any other type is passed, even if that type contains a 'reg' field. (Jani) - Drop a ton of GVT changes; allowing i915_mmio_reg_offset() to take either an i915_reg_t or an i915_mcr_reg_t means that the huge lists of MMIO_D*() macros used in GVT will continue to work without modification. We need only make changes to structures that have an explicit i915_reg_t in them now. Cc: Jani Nikula Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/intel_gt.c | 16 ++++-- drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 51 ++++++++++++------- drivers/gpu/drm/i915/gt/intel_gt_mcr.h | 18 +++---- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 27 +++++++--- drivers/gpu/drm/i915/gt/intel_workarounds.c | 32 ++++++------ .../gpu/drm/i915/gt/intel_workarounds_types.h | 5 +- .../gpu/drm/i915/gt/selftest_workarounds.c | 2 +- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 2 +- .../gpu/drm/i915/gt/uc/intel_guc_capture.c | 4 +- drivers/gpu/drm/i915/gvt/handlers.c | 2 +- drivers/gpu/drm/i915/gvt/mmio_context.c | 14 ++--- drivers/gpu/drm/i915/i915_reg_defs.h | 27 +++++----- 12 files changed, 117 insertions(+), 83 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 3df0d0336dbc..27dbb9e4bd6c 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -991,7 +991,10 @@ void intel_gt_info_print(const struct intel_gt_info *info, } struct reg_and_bit { - i915_reg_t reg; + union { + i915_reg_t reg; + i915_mcr_reg_t mcr_reg; + }; u32 bit; }; @@ -1033,7 +1036,7 @@ get_reg_and_bit(const struct intel_engine_cs *engine, const bool gen8, static int wait_for_invalidate(struct intel_gt *gt, struct reg_and_bit rb) { if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) - return intel_gt_mcr_wait_for_reg_fw(gt, rb.reg, rb.bit, 0, + return intel_gt_mcr_wait_for_reg_fw(gt, rb.mcr_reg, rb.bit, 0, TLB_INVAL_TIMEOUT_US, TLB_INVAL_TIMEOUT_MS); else @@ -1058,7 +1061,7 @@ static void mmio_invalidate_full(struct intel_gt *gt) [COPY_ENGINE_CLASS] = GEN12_BLT_TLB_INV_CR, [COMPUTE_CLASS] = GEN12_COMPCTX_TLB_INV_CR, }; - static const i915_reg_t xehp_regs[] = { + static const i915_mcr_reg_t xehp_regs[] = { [RENDER_CLASS] = XEHP_GFX_TLB_INV_CR, [VIDEO_DECODE_CLASS] = XEHP_VD_TLB_INV_CR, [VIDEO_ENHANCEMENT_CLASS] = XEHP_VE_TLB_INV_CR, @@ -1131,7 +1134,12 @@ static void mmio_invalidate_full(struct intel_gt *gt) for_each_engine_masked(engine, gt, awake, tmp) { struct reg_and_bit rb; - rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + rb.mcr_reg = xehp_regs[engine->class]; + rb.bit = BIT(engine->instance); + } else { + rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); + } if (wait_for_invalidate(gt, rb)) drm_err_ratelimited(>->i915->drm, diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c index 1ed9bc4dccfd..349074bf365f 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c @@ -150,6 +150,19 @@ void intel_gt_mcr_init(struct intel_gt *gt) } } +/* + * Although the rest of the driver should use MCR-specific functions to + * read/write MCR registers, we still use the regular intel_uncore_* functions + * internally to implement those, so we need a way for the functions in this + * file to "cast" an i915_mcr_reg_t into an i915_reg_t. + */ +static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr) +{ + i915_reg_t r = { .reg = mcr.reg }; + + return r; +} + /* * rw_with_mcr_steering_fw - Access a register with specific MCR steering * @uncore: pointer to struct intel_uncore @@ -164,7 +177,7 @@ void intel_gt_mcr_init(struct intel_gt *gt) * Caller needs to make sure the relevant forcewake wells are up. */ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore, - i915_reg_t reg, u8 rw_flag, + i915_mcr_reg_t reg, u8 rw_flag, int group, int instance, u32 value) { u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0; @@ -201,9 +214,9 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore, intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr); if (rw_flag == FW_REG_READ) - val = intel_uncore_read_fw(uncore, reg); + val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg)); else - intel_uncore_write_fw(uncore, reg, value); + intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value); mcr &= ~mcr_mask; mcr |= old_mcr & mcr_mask; @@ -214,14 +227,14 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore, } static u32 rw_with_mcr_steering(struct intel_uncore *uncore, - i915_reg_t reg, u8 rw_flag, + i915_mcr_reg_t reg, u8 rw_flag, int group, int instance, u32 value) { enum forcewake_domains fw_domains; u32 val; - fw_domains = intel_uncore_forcewake_for_reg(uncore, reg, + fw_domains = intel_uncore_forcewake_for_reg(uncore, mcr_reg_cast(reg), rw_flag); fw_domains |= intel_uncore_forcewake_for_reg(uncore, GEN8_MCR_SELECTOR, @@ -249,7 +262,7 @@ static u32 rw_with_mcr_steering(struct intel_uncore *uncore, * group/instance. */ u32 intel_gt_mcr_read(struct intel_gt *gt, - i915_reg_t reg, + i915_mcr_reg_t reg, int group, int instance) { return rw_with_mcr_steering(gt->uncore, reg, FW_REG_READ, group, instance, 0); @@ -266,7 +279,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt, * Write an MCR register in unicast mode after steering toward a specific * group/instance. */ -void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_reg_t reg, u32 value, +void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value, int group, int instance) { rw_with_mcr_steering(gt->uncore, reg, FW_REG_WRITE, group, instance, value); @@ -281,9 +294,9 @@ void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_reg_t reg, u32 value, * Write an MCR register in multicast mode to update all instances. */ void intel_gt_mcr_multicast_write(struct intel_gt *gt, - i915_reg_t reg, u32 value) + i915_mcr_reg_t reg, u32 value) { - intel_uncore_write(gt->uncore, reg, value); + intel_uncore_write(gt->uncore, mcr_reg_cast(reg), value); } /** @@ -297,9 +310,9 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt, * domains; use intel_gt_mcr_multicast_write() in cases where forcewake should * be obtained automatically. */ -void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 value) +void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value) { - intel_uncore_write_fw(gt->uncore, reg, value); + intel_uncore_write_fw(gt->uncore, mcr_reg_cast(reg), value); } /** @@ -320,7 +333,7 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 va * * Returns the old (unmodified) value read. */ -u32 intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg, +u32 intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_mcr_reg_t reg, u32 clear, u32 set) { u32 val = intel_gt_mcr_read_any(gt, reg); @@ -345,7 +358,7 @@ u32 intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg, * for @type steering too. */ static bool reg_needs_read_steering(struct intel_gt *gt, - i915_reg_t reg, + i915_mcr_reg_t reg, enum intel_steering_type type) { const u32 offset = i915_mmio_reg_offset(reg); @@ -428,7 +441,7 @@ static void get_nonterminated_steering(struct intel_gt *gt, * steering. */ void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt, - i915_reg_t reg, + i915_mcr_reg_t reg, u8 *group, u8 *instance) { int type; @@ -457,7 +470,7 @@ void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt, * * Returns the value from a non-terminated instance of @reg. */ -u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg) +u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg) { int type; u8 group, instance; @@ -471,7 +484,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg) } } - return intel_uncore_read_fw(gt->uncore, reg); + return intel_uncore_read_fw(gt->uncore, mcr_reg_cast(reg)); } /** @@ -484,7 +497,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg) * * Returns the value from a non-terminated instance of @reg. */ -u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg) +u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg) { int type; u8 group, instance; @@ -498,7 +511,7 @@ u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg) } } - return intel_uncore_read(gt->uncore, reg); + return intel_uncore_read(gt->uncore, mcr_reg_cast(reg)); } static void report_steering_type(struct drm_printer *p, @@ -599,7 +612,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss, * Return: 0 if the register matches the desired condition, or -ETIMEDOUT. */ int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt, - i915_reg_t reg, + i915_mcr_reg_t reg, u32 mask, u32 value, unsigned int fast_timeout_us, diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h index 548f922cd9fa..3fb0502bff22 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h @@ -11,24 +11,24 @@ void intel_gt_mcr_init(struct intel_gt *gt); u32 intel_gt_mcr_read(struct intel_gt *gt, - i915_reg_t reg, + i915_mcr_reg_t reg, int group, int instance); -u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg); -u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg); +u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg); +u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg); void intel_gt_mcr_unicast_write(struct intel_gt *gt, - i915_reg_t reg, u32 value, + i915_mcr_reg_t reg, u32 value, int group, int instance); void intel_gt_mcr_multicast_write(struct intel_gt *gt, - i915_reg_t reg, u32 value); + i915_mcr_reg_t reg, u32 value); void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, - i915_reg_t reg, u32 value); + i915_mcr_reg_t reg, u32 value); -u32 intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg, +u32 intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_mcr_reg_t reg, u32 clear, u32 set); void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt, - i915_reg_t reg, + i915_mcr_reg_t reg, u8 *group, u8 *instance); void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt, @@ -38,7 +38,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss, unsigned int *group, unsigned int *instance); int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt, - i915_reg_t reg, + i915_mcr_reg_t reg, u32 mask, u32 value, unsigned int fast_timeout_us, diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index ad9985015b0e..754c27dd1e82 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -8,7 +8,18 @@ #include "i915_reg_defs.h" -#define MCR_REG(offset) _MMIO(offset) +#define MCR_REG(offset) ((const i915_mcr_reg_t){ .reg = (offset) }) + +/* + * The perf control registers are technically multicast registers, but the + * driver never needs to read/write them directly; we only use them to build + * lists of registers (where they're mixed in with other non-MCR registers) + * and then operate on the offset directly. For now we'll just define them + * as non-multicast so we can place them on the same list, but we may want + * to try to come up with a better way to handle heterogeneous lists of + * registers in the future. + */ +#define PERF_REG(offset) _MMIO(offset) /* RPM unit config (Gen8+) */ #define RPM_CONFIG0 _MMIO(0xd00) @@ -1116,8 +1127,8 @@ #define FLOAT_BLEND_OPTIMIZATION_ENABLE REG_BIT(4) #define ENABLE_PREFETCH_INTO_IC REG_BIT(3) -#define EU_PERF_CNTL0 MCR_REG(0xe458) -#define EU_PERF_CNTL4 MCR_REG(0xe45c) +#define EU_PERF_CNTL0 PERF_REG(0xe458) +#define EU_PERF_CNTL4 PERF_REG(0xe45c) #define GEN9_ROW_CHICKEN4 MCR_REG(0xe48c) #define GEN12_DISABLE_GRF_CLEAR REG_BIT(13) @@ -1154,16 +1165,16 @@ #define STACKID_CTRL REG_GENMASK(6, 5) #define STACKID_CTRL_512 REG_FIELD_PREP(STACKID_CTRL, 0x2) -#define EU_PERF_CNTL1 MCR_REG(0xe558) -#define EU_PERF_CNTL5 MCR_REG(0xe55c) +#define EU_PERF_CNTL1 PERF_REG(0xe558) +#define EU_PERF_CNTL5 PERF_REG(0xe55c) #define XEHP_HDC_CHICKEN0 MCR_REG(0xe5f0) #define LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK REG_GENMASK(13, 11) #define ICL_HDC_MODE MCR_REG(0xe5f4) -#define EU_PERF_CNTL2 MCR_REG(0xe658) -#define EU_PERF_CNTL6 MCR_REG(0xe65c) -#define EU_PERF_CNTL3 MCR_REG(0xe758) +#define EU_PERF_CNTL2 PERF_REG(0xe658) +#define EU_PERF_CNTL6 PERF_REG(0xe65c) +#define EU_PERF_CNTL3 PERF_REG(0xe758) #define LSC_CHICKEN_BIT_0 MCR_REG(0xe7c8) #define DISABLE_D8_D16_COASLESCE REG_BIT(30) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 7671994d5b7a..dadb60e6a58f 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -166,11 +166,11 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg, _wa_add(wal, &wa); } -static void wa_mcr_add(struct i915_wa_list *wal, i915_reg_t reg, +static void wa_mcr_add(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clear, u32 set, u32 read_mask, bool masked_reg) { struct i915_wa wa = { - .reg = reg, + .mcr_reg = reg, .clr = clear, .set = set, .read = read_mask, @@ -188,7 +188,7 @@ wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set) } static void -wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set) +wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clear, u32 set) { wa_mcr_add(wal, reg, clear, set, clear, false); } @@ -206,7 +206,7 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set) } static void -wa_mcr_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set) +wa_mcr_write_or(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 set) { wa_mcr_write_clr_set(wal, reg, set, set); } @@ -218,7 +218,7 @@ wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr) } static void -wa_mcr_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr) +wa_mcr_write_clr(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clr) { wa_mcr_write_clr_set(wal, reg, clr, 0); } @@ -241,7 +241,7 @@ wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val) } static void -wa_mcr_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val) +wa_mcr_masked_en(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val) { wa_mcr_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true); } @@ -253,7 +253,7 @@ wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val) } static void -wa_mcr_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val) +wa_mcr_masked_dis(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val) { wa_mcr_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true); } @@ -266,7 +266,7 @@ wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg, } static void -wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg, +wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 mask, u32 val) { wa_mcr_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true); @@ -1692,19 +1692,19 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal) /* open-coded rmw due to steering */ if (wa->clr) old = wa->is_mcr ? - intel_gt_mcr_read_any_fw(gt, wa->reg) : + intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) : intel_uncore_read_fw(uncore, wa->reg); val = (old & ~wa->clr) | wa->set; if (val != old || !wa->clr) { if (wa->is_mcr) - intel_gt_mcr_multicast_write_fw(gt, wa->reg, val); + intel_gt_mcr_multicast_write_fw(gt, wa->mcr_reg, val); else intel_uncore_write_fw(uncore, wa->reg, val); } if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) { u32 val = wa->is_mcr ? - intel_gt_mcr_read_any_fw(gt, wa->reg) : + intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) : intel_uncore_read_fw(uncore, wa->reg); wa_verify(wa, val, wal->name, "application"); @@ -1738,7 +1738,7 @@ static bool wa_list_verify(struct intel_gt *gt, for (i = 0, wa = wal->list; i < wal->count; i++, wa++) ok &= wa_verify(wa, wa->is_mcr ? - intel_gt_mcr_read_any_fw(gt, wa->reg) : + intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) : intel_uncore_read_fw(uncore, wa->reg), wal->name, from); @@ -1786,10 +1786,10 @@ whitelist_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags) } static void -whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags) +whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 flags) { struct i915_wa wa = { - .reg = reg, + .mcr_reg = reg, .is_mcr = 1, }; @@ -1799,7 +1799,7 @@ whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags) if (GEM_DEBUG_WARN_ON(!is_nonpriv_flags_valid(flags))) return; - wa.reg.reg |= flags; + wa.mcr_reg.reg |= flags; _wa_add(wal, &wa); } @@ -1810,7 +1810,7 @@ whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg) } static void -whitelist_mcr_reg(struct i915_wa_list *wal, i915_reg_t reg) +whitelist_mcr_reg(struct i915_wa_list *wal, i915_mcr_reg_t reg) { whitelist_mcr_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW); } diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h index f05b37e56fa9..7c8b01d00043 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h +++ b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h @@ -11,7 +11,10 @@ #include "i915_reg_defs.h" struct i915_wa { - i915_reg_t reg; + union { + i915_reg_t reg; + i915_mcr_reg_t mcr_reg; + }; u32 clr; u32 set; u32 read; diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c index 67a9aab801dd..21b1edc052f8 100644 --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c @@ -991,7 +991,7 @@ static bool pardon_reg(struct drm_i915_private *i915, i915_reg_t reg) /* Alas, we must pardon some whitelists. Mistakes already made */ static const struct regmask pardon[] = { { GEN9_CTX_PREEMPT_REG, 9 }, - { GEN8_L3SQCREG4, 9 }, + { _MMIO(0xb118), 9 }, /* GEN8_L3SQCREG4 */ }; return find_reg(i915, reg, pardon, ARRAY_SIZE(pardon)); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index de923fb82301..34ef4f36e660 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -328,7 +328,7 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt, static long __must_check guc_mcr_reg_add(struct intel_gt *gt, struct temp_regset *regset, - i915_reg_t reg, u32 flags) + i915_mcr_reg_t reg, u32 flags) { u8 group, inst; diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c index 9495a7928bc8..d5c03e7a7843 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c @@ -240,7 +240,7 @@ static void guc_capture_free_extlists(struct __guc_mmio_reg_descr_group *reglist struct __ext_steer_reg { const char *name; - i915_reg_t reg; + i915_mcr_reg_t reg; }; static const struct __ext_steer_reg xe_extregs[] = { @@ -252,7 +252,7 @@ static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext, const struct __ext_steer_reg *extlist, int slice_id, int subslice_id) { - ext->reg = extlist->reg; + ext->reg = _MMIO(i915_mmio_reg_offset(extlist->reg)); ext->flags = FIELD_PREP(GUC_REGSET_STEERING_GROUP, slice_id); ext->flags |= FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, subslice_id); ext->regname = extlist->name; diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c index 700cc9688f47..1cb388484bf0 100644 --- a/drivers/gpu/drm/i915/gvt/handlers.c +++ b/drivers/gpu/drm/i915/gvt/handlers.c @@ -734,7 +734,7 @@ static i915_reg_t force_nonpriv_white_list[] = { _MMIO(0x770c), _MMIO(0x83a8), _MMIO(0xb110), - GEN8_L3SQCREG4,//_MMIO(0xb118) + _MMIO(0xb118), _MMIO(0xe100), _MMIO(0xe18c), _MMIO(0xe48c), diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c index d177884d8f7d..200c1162daa3 100644 --- a/drivers/gpu/drm/i915/gvt/mmio_context.c +++ b/drivers/gpu/drm/i915/gvt/mmio_context.c @@ -106,15 +106,15 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = { {RCS0, GEN8_CS_CHICKEN1, 0xffff, true}, /* 0x2580 */ {RCS0, COMMON_SLICE_CHICKEN2, 0xffff, true}, /* 0x7014 */ {RCS0, GEN9_CS_DEBUG_MODE1, 0xffff, false}, /* 0x20ec */ - {RCS0, GEN8_L3SQCREG4, 0, false}, /* 0xb118 */ - {RCS0, GEN9_SCRATCH1, 0, false}, /* 0xb11c */ + {RCS0, _MMIO(0xb118), 0, false}, /* GEN8_L3SQCREG4 */ + {RCS0, _MMIO(0xb11c), 0, false}, /* GEN9_SCRATCH1 */ {RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */ {RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */ - {RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */ - {RCS0, GEN8_HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */ - {RCS0, GEN9_HALF_SLICE_CHICKEN5, 0xffff, true}, /* 0xe188 */ - {RCS0, GEN9_HALF_SLICE_CHICKEN7, 0xffff, true}, /* 0xe194 */ - {RCS0, GEN8_ROW_CHICKEN, 0xffff, true}, /* 0xe4f0 */ + {RCS0, _MMIO(0xe180), 0xffff, true}, /* HALF_SLICE_CHICKEN2 */ + {RCS0, _MMIO(0xe184), 0xffff, true}, /* GEN8_HALF_SLICE_CHICKEN3 */ + {RCS0, _MMIO(0xe188), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN5 */ + {RCS0, _MMIO(0xe194), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN7 */ + {RCS0, _MMIO(0xe4f0), 0xffff, true}, /* GEN8_ROW_CHICKEN */ {RCS0, TRVATTL3PTRDW(0), 0, true}, /* 0x4de0 */ {RCS0, TRVATTL3PTRDW(1), 0, true}, /* 0x4de4 */ {RCS0, TRNULLDETCT, 0, true}, /* 0x4de8 */ diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h index 8f486f77609f..f1859046a9c4 100644 --- a/drivers/gpu/drm/i915/i915_reg_defs.h +++ b/drivers/gpu/drm/i915/i915_reg_defs.h @@ -104,22 +104,21 @@ typedef struct { #define _MMIO(r) ((const i915_reg_t){ .reg = (r) }) -#define INVALID_MMIO_REG _MMIO(0) - -static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg) -{ - return reg.reg; -} +typedef struct { + u32 reg; +} i915_mcr_reg_t; -static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b) -{ - return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b); -} +#define INVALID_MMIO_REG _MMIO(0) -static inline bool i915_mmio_reg_valid(i915_reg_t reg) -{ - return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG); -} +/* + * These macros can be used on either i915_reg_t or i915_mcr_reg_t since they're + * simply operations on the register's offset and don't care about the MCR vs + * non-MCR nature of the register. + */ +#define i915_mmio_reg_offset(r) \ + _Generic((r), i915_reg_t: (r).reg, i915_mcr_reg_t: (r).reg) +#define i915_mmio_reg_equal(a, b) (i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b)) +#define i915_mmio_reg_valid(r) (!i915_mmio_reg_equal(r, INVALID_MMIO_REG)) #define VLV_DISPLAY_BASE 0x180000 From patchwork Fri Oct 14 23:02:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 13007450 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2DE4EC433FE for ; Fri, 14 Oct 2022 23:05:04 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5497310E18E; Fri, 14 Oct 2022 23:03:54 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id EE6E110E142; Fri, 14 Oct 2022 23:03:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665788585; x=1697324585; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TOKctJ5xKQXmSFbPctbOCzxXyQO98zLZebD6Nw0YJ2Q=; b=R1JjnQwuKR08PITCmTglMv9vhs0UOoO0uT29KGVttbI4dCuWrNa+iMzj wx1lSUD340xsYyPXAYblxoFtvk+86YMzTa6ZSQX7W02cJbtJFzy/5qXII IfbgDG6jJSGx7k37w37xHGlqDSJeOqiHjzqTcc7SeHEfXiDxeDPeXQfno MHsoI98pFLscAD6G+o330xV6juwnuy+YLMlFOytaQrejkUwcu+DfHM2TY eY7IU45vWe/4jSfpWulekbLVpNowdSYTEDV2IjMP85CDIwQt6OlFMNsN9 0P6Ljt3G4vzmXYl/BlB6mLEV1mgTdAvWLzjfiY/aQpPS5C8Lj4+Asc+88 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="285216976" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="285216976" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:03 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="696471728" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="696471728" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:03 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v3 13/14] drm/i915/xelpg: Add multicast steering Date: Fri, 14 Oct 2022 16:02:38 -0700 Message-Id: <20221014230239.1023689-14-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221014230239.1023689-1-matthew.d.roper@intel.com> References: <20221014230239.1023689-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Balasubramani Vivekanandan , dri-devel@lists.freedesktop.org, Radhakrishna Sripada Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" MTL's graphics IP (Xe_LPG) once again changes the multicast register types and steering details. Key changes from past platforms: * The number of instances of some MCR types (NODE, OAAL2, and GAM) vary according to the MTL subplatform and cannot be read from fuse registers. However steering to instance #0 will always provided a non-terminated value, so we can lump these all into a single "instance0" table. * The MCR steering register (and its bitfields) has changed. Unlike past platforms, we will be explicitly steering all types of MCR accesses, including those for "SLICE" and "DSS" ranges; we no longer rely on implicit steering. On previous platforms, various hardware/firmware agents that needed to access registers typically had their own steering control registers, allowing them to perform multicast steering without clobbering the CPU/kernel steering. Starting with MTL, more of these agents now share a single steering register (0xFD4) and it is no longer safe for us to assume that the value will remain unchanged from how we initialized it during startup. There is also a slight chance of race conditions between the driver and a hardware/firmware agent, so the hardware provides a semaphore register that can be used to coordinate access to the steering register. Support for the semaphore register will be introduced in a future patch. v2: - Use Xe_LPG terminology instead of "MTL 3D" since it's the IP version we're matching on now rather than the platform. - Don't combine l3bank and mslice masks into a union. It's not related to the other changes here and we might still need both of them on some future platform. - Separate debug dumping of steering settings to a separate helper function. (Tvrtko) - Update debug dumping to include DSS ranges (and future-proof it so that any new ranges added on future platforms will also be dumped). - Restore MULTICAST bit at the end of rw_with_mcr_steering_fw() if we cleared it. Also force the MULTICAST bit to true at the beginning of multicast writes just to be safe. (Bala) Bspec: 67788, 67112 Cc: Radhakrishna Sripada Cc: Balasubramani Vivekanandan Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 135 +++++++++++++++++--- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 5 + drivers/gpu/drm/i915/gt/intel_gt_types.h | 1 + drivers/gpu/drm/i915/gt/intel_workarounds.c | 33 ++++- drivers/gpu/drm/i915/i915_pci.c | 1 + 5 files changed, 154 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c index 349074bf365f..23a1ef9659bf 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c @@ -41,6 +41,7 @@ static const char * const intel_steering_types[] = { "MSLICE", "LNCF", "GAM", + "DSS", "INSTANCE 0", }; @@ -99,9 +100,40 @@ static const struct intel_mmio_range pvc_instance0_steering_table[] = { {}, }; +static const struct intel_mmio_range xelpg_instance0_steering_table[] = { + { 0x000B00, 0x000BFF }, /* SQIDI */ + { 0x001000, 0x001FFF }, /* SQIDI */ + { 0x004000, 0x0048FF }, /* GAM */ + { 0x008700, 0x0087FF }, /* SQIDI */ + { 0x00B000, 0x00B0FF }, /* NODE */ + { 0x00C800, 0x00CFFF }, /* GAM */ + { 0x00D880, 0x00D8FF }, /* NODE */ + { 0x00DD00, 0x00DDFF }, /* OAAL2 */ + {}, +}; + +static const struct intel_mmio_range xelpg_l3bank_steering_table[] = { + { 0x00B100, 0x00B3FF }, + {}, +}; + +/* DSS steering is used for SLICE ranges as well */ +static const struct intel_mmio_range xelpg_dss_steering_table[] = { + { 0x005200, 0x0052FF }, /* SLICE */ + { 0x005500, 0x007FFF }, /* SLICE */ + { 0x008140, 0x00815F }, /* SLICE (0x8140-0x814F), DSS (0x8150-0x815F) */ + { 0x0094D0, 0x00955F }, /* SLICE (0x94D0-0x951F), DSS (0x9520-0x955F) */ + { 0x009680, 0x0096FF }, /* DSS */ + { 0x00D800, 0x00D87F }, /* SLICE */ + { 0x00DC00, 0x00DCFF }, /* SLICE */ + { 0x00DE80, 0x00E8FF }, /* DSS (0xE000-0xE0FF reserved) */ +}; + void intel_gt_mcr_init(struct intel_gt *gt) { struct drm_i915_private *i915 = gt->i915; + unsigned long fuse; + int i; /* * An mslice is unavailable only if both the meml3 for the slice is @@ -119,7 +151,22 @@ void intel_gt_mcr_init(struct intel_gt *gt) drm_warn(&i915->drm, "mslice mask all zero!\n"); } - if (IS_PONTEVECCHIO(i915)) { + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70) && + gt->type == GT_PRIMARY) { + fuse = REG_FIELD_GET(GT_L3_EXC_MASK, + intel_uncore_read(gt->uncore, XEHP_FUSE4)); + + /* + * Despite the register field being named "exclude mask" the + * bits actually represent enabled banks (two banks per bit). + */ + for_each_set_bit(i, &fuse, 3) + gt->info.l3bank_mask |= 0x3 << 2 * i; + + gt->steering_table[INSTANCE0] = xelpg_instance0_steering_table; + gt->steering_table[L3BANK] = xelpg_l3bank_steering_table; + gt->steering_table[DSS] = xelpg_dss_steering_table; + } else if (IS_PONTEVECCHIO(i915)) { gt->steering_table[INSTANCE0] = pvc_instance0_steering_table; } else if (IS_DG2(i915)) { gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table; @@ -184,7 +231,19 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore, lockdep_assert_held(&uncore->lock); - if (GRAPHICS_VER(uncore->i915) >= 11) { + if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 70)) { + /* + * Always leave the hardware in multicast mode when doing reads + * (see comment about Wa_22013088509 below) and only change it + * to unicast mode when doing writes of a specific instance. + * + * No need to save old steering reg value. + */ + intel_uncore_write_fw(uncore, MTL_MCR_SELECTOR, + REG_FIELD_PREP(MTL_MCR_GROUPID, group) | + REG_FIELD_PREP(MTL_MCR_INSTANCEID, instance) | + (rw_flag == FW_REG_READ) ? GEN11_MCR_MULTICAST : 0); + } else if (GRAPHICS_VER(uncore->i915) >= 11) { mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK; mcr_ss = GEN11_MCR_SLICE(group) | GEN11_MCR_SUBSLICE(instance); @@ -202,26 +261,40 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore, */ if (rw_flag == FW_REG_WRITE) mcr_mask |= GEN11_MCR_MULTICAST; + + mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR); + old_mcr = mcr; + + mcr &= ~mcr_mask; + mcr |= mcr_ss; + intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr); } else { mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK; mcr_ss = GEN8_MCR_SLICE(group) | GEN8_MCR_SUBSLICE(instance); - } - old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR); + mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR); + old_mcr = mcr; - mcr &= ~mcr_mask; - mcr |= mcr_ss; - intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr); + mcr &= ~mcr_mask; + mcr |= mcr_ss; + intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr); + } if (rw_flag == FW_REG_READ) val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg)); else intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value); - mcr &= ~mcr_mask; - mcr |= old_mcr & mcr_mask; - - intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr); + /* + * For pre-MTL platforms, we need to restore the old value of the + * steering control register to ensure that implicit steering continues + * to behave as expected. For MTL and beyond, we need only reinstate + * the 'multicast' bit (and only if we did a write that cleared it). + */ + if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 70) && rw_flag == FW_REG_WRITE) + intel_uncore_write_fw(uncore, MTL_MCR_SELECTOR, GEN11_MCR_MULTICAST); + else if (GRAPHICS_VER_FULL(uncore->i915) < IP_VER(12, 70)) + intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, old_mcr); return val; } @@ -296,6 +369,13 @@ void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 val void intel_gt_mcr_multicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value) { + /* + * Ensure we have multicast behavior, just in case some non-i915 agent + * left the hardware in unicast mode. + */ + if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70)) + intel_uncore_write_fw(gt->uncore, MTL_MCR_SELECTOR, GEN11_MCR_MULTICAST); + intel_uncore_write(gt->uncore, mcr_reg_cast(reg), value); } @@ -312,6 +392,13 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt, */ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value) { + /* + * Ensure we have multicast behavior, just in case some non-i915 agent + * left the hardware in unicast mode. + */ + if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70)) + intel_uncore_write_fw(gt->uncore, MTL_MCR_SELECTOR, GEN11_MCR_MULTICAST); + intel_uncore_write_fw(gt->uncore, mcr_reg_cast(reg), value); } @@ -389,6 +476,8 @@ static void get_nonterminated_steering(struct intel_gt *gt, enum intel_steering_type type, u8 *group, u8 *instance) { + u32 dss; + switch (type) { case L3BANK: *group = 0; /* unused */ @@ -412,6 +501,11 @@ static void get_nonterminated_steering(struct intel_gt *gt, *group = IS_DG2(gt->i915) ? 1 : 0; *instance = 0; break; + case DSS: + dss = intel_sseu_find_first_xehp_dss(>->info.sseu, 0, 0); + *group = dss / GEN_DSS_PER_GSLICE; + *instance = dss % GEN_DSS_PER_GSLICE; + break; case INSTANCE0: /* * There are a lot of MCR types for which instance (0, 0) @@ -544,11 +638,20 @@ static void report_steering_type(struct drm_printer *p, void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt, bool dump_table) { - drm_printf(p, "Default steering: group=0x%x, instance=0x%x\n", - gt->default_steering.groupid, - gt->default_steering.instanceid); - - if (IS_PONTEVECCHIO(gt->i915)) { + /* + * Starting with MTL we no longer have default steering; + * all ranges are explicitly steered. + */ + if (GRAPHICS_VER_FULL(gt->i915) < IP_VER(12, 70)) + drm_printf(p, "Default steering: group=0x%x, instance=0x%x\n", + gt->default_steering.groupid, + gt->default_steering.instanceid); + + if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70)) { + for (int i = 0; i < NUM_STEERING_TYPES; i++) + if (gt->steering_table[i]) + report_steering_type(p, gt, i, dump_table); + } else if (IS_PONTEVECCHIO(gt->i915)) { report_steering_type(p, gt, INSTANCE0, dump_table); } else if (HAS_MSLICE_STEERING(gt->i915)) { report_steering_type(p, gt, MSLICE, dump_table); diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 754c27dd1e82..810283131f0a 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -59,6 +59,7 @@ #define GMD_ID_MEDIA _MMIO(MTL_MEDIA_GSI_BASE + 0xd8c) #define MCFG_MCR_SELECTOR _MMIO(0xfd0) +#define MTL_MCR_SELECTOR _MMIO(0xfd4) #define SF_MCR_SELECTOR _MMIO(0xfd8) #define GEN8_MCR_SELECTOR _MMIO(0xfdc) #define GAM_MCR_SELECTOR _MMIO(0xfe0) @@ -71,6 +72,8 @@ #define GEN11_MCR_SLICE_MASK GEN11_MCR_SLICE(0xf) #define GEN11_MCR_SUBSLICE(subslice) (((subslice) & 0x7) << 24) #define GEN11_MCR_SUBSLICE_MASK GEN11_MCR_SUBSLICE(0x7) +#define MTL_MCR_GROUPID REG_GENMASK(11, 8) +#define MTL_MCR_INSTANCEID REG_GENMASK(3, 0) #define IPEIR_I965 _MMIO(0x2064) #define IPEHR_I965 _MMIO(0x2068) @@ -531,6 +534,8 @@ #define GEN6_MBCTL_BOOT_FETCH_MECH (1 << 0) /* Fuse readout registers for GT */ +#define XEHP_FUSE4 _MMIO(0x9114) +#define GT_L3_EXC_MASK REG_GENMASK(6, 4) #define GEN10_MIRROR_FUSE3 _MMIO(0x9118) #define GEN10_L3BANK_PAIR_COUNT 4 #define GEN10_L3BANK_MASK 0x0F diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index 30003d68fd51..0bb73d110a84 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -60,6 +60,7 @@ enum intel_steering_type { MSLICE, LNCF, GAM, + DSS, /* * On some platforms there are multiple types of MCR registers that diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index dadb60e6a58f..711a31935857 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -1128,18 +1128,23 @@ static void __set_mcr_steering(struct i915_wa_list *wal, wa_write_clr_set(wal, steering_reg, mcr_mask, mcr); } -static void __add_mcr_wa(struct intel_gt *gt, struct i915_wa_list *wal, - unsigned int slice, unsigned int subslice) +static void debug_dump_steering(struct intel_gt *gt) { struct drm_printer p = drm_debug_printer("MCR Steering:"); + if (drm_debug_enabled(DRM_UT_DRIVER)) + intel_gt_mcr_report_steering(&p, gt, false); +} + +static void __add_mcr_wa(struct intel_gt *gt, struct i915_wa_list *wal, + unsigned int slice, unsigned int subslice) +{ __set_mcr_steering(wal, GEN8_MCR_SELECTOR, slice, subslice); gt->default_steering.groupid = slice; gt->default_steering.instanceid = subslice; - if (drm_debug_enabled(DRM_UT_DRIVER)) - intel_gt_mcr_report_steering(&p, gt, false); + debug_dump_steering(gt); } static void @@ -1581,12 +1586,30 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); } +static void +xelpg_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) +{ + /* FIXME: Actual workarounds will be added in future patch(es) */ + + /* + * Unlike older platforms, we no longer setup implicit steering here; + * all MCR accesses are explicitly steered. + */ + debug_dump_steering(gt); +} + static void gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal) { struct drm_i915_private *i915 = gt->i915; - if (IS_PONTEVECCHIO(i915)) + /* FIXME: Media GT handling will be added in an upcoming patch */ + if (gt->type == GT_MEDIA) + return; + + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)) + xelpg_gt_workarounds_init(gt, wal); + else if (IS_PONTEVECCHIO(i915)) pvc_gt_workarounds_init(gt, wal); else if (IS_DG2(i915)) dg2_gt_workarounds_init(gt, wal); diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 40bb06c5cdc0..496df0f547f4 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1143,6 +1143,7 @@ static const struct intel_device_info mtl_info = { .extra_gt_list = xelpmp_extra_gt, .has_flat_ccs = 0, .has_gmd_id = 1, + .has_mslice_steering = 0, .has_snoop = 1, .__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM, .__runtime.platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(CCS0), From patchwork Fri Oct 14 23:02:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 13007448 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 37D5CC433FE for ; Fri, 14 Oct 2022 23:05:01 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id ED3B510E173; Fri, 14 Oct 2022 23:03:51 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 07FB610E14B; Fri, 14 Oct 2022 23:03:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1665788585; x=1697324585; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ojhGefCBWKgraiPsza53LL8lmh8xKnD2xnX5HsGjb/I=; b=U4/CDuGfGo0Y0dnTbZ6VNrYyTdcKWt/jRJ++mQs/brLAOk5AHsr/qQCl rBHmasKdVvidackA7OaMDmXHjQ+BX2ykZpBIgTr9ZMECX0YJeTFgpgPb9 Axf1X3opS0GgieVzsMZ5dNnEwj1wXoJYQjMQCfsIOWpEks5ge6J6MFhzL KeFJvvhHMm7WvnQ8rnmWDHUl9fj2o9XtiZ5hGruHMnxLszcC/5XLDI5qM WPoPTFh7PnuboPjUl+lEqVFL/d93PrLxx6yaAXtoATrISGCgqg9K9Z7r7 lBnrB6d9hpdnWNnc1Lr/T3eR4CLwna552ILPxd0/CARMxaSIk55f+yAAp w==; X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="285216977" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="285216977" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:03 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10500"; a="696471730" X-IronPort-AV: E=Sophos;i="5.95,185,1661842800"; d="scan'208";a="696471730" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 16:03:03 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v3 14/14] drm/i915/xelpmp: Add multicast steering for media GT Date: Fri, 14 Oct 2022 16:02:39 -0700 Message-Id: <20221014230239.1023689-15-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221014230239.1023689-1-matthew.d.roper@intel.com> References: <20221014230239.1023689-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" MTL's media IP (Xe_LPM+) only has a single type of steering ("OAADDRM") which selects between media slice 0 and media slice 1. We'll always steer to media slice 0 unless it is fused off (which is the case when VD0, VE0, and SFC0 are all reported as unavailable). Bspec: 67789 Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 18 ++++++++++++++++-- drivers/gpu/drm/i915/gt/intel_gt_types.h | 1 + drivers/gpu/drm/i915/gt/intel_workarounds.c | 17 +++++++++++++++-- 3 files changed, 32 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c index 23a1ef9659bf..0d2811724b00 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c @@ -42,6 +42,7 @@ static const char * const intel_steering_types[] = { "LNCF", "GAM", "DSS", + "OADDRM", "INSTANCE 0", }; @@ -129,6 +130,11 @@ static const struct intel_mmio_range xelpg_dss_steering_table[] = { { 0x00DE80, 0x00E8FF }, /* DSS (0xE000-0xE0FF reserved) */ }; +static const struct intel_mmio_range xelpmp_oaddrm_steering_table[] = { + { 0x393200, 0x39323F }, + { 0x393400, 0x3934FF }, +}; + void intel_gt_mcr_init(struct intel_gt *gt) { struct drm_i915_private *i915 = gt->i915; @@ -151,8 +157,9 @@ void intel_gt_mcr_init(struct intel_gt *gt) drm_warn(&i915->drm, "mslice mask all zero!\n"); } - if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70) && - gt->type == GT_PRIMARY) { + if (MEDIA_VER(i915) >= 13 && gt->type == GT_MEDIA) { + gt->steering_table[OADDRM] = xelpmp_oaddrm_steering_table; + } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)) { fuse = REG_FIELD_GET(GT_L3_EXC_MASK, intel_uncore_read(gt->uncore, XEHP_FUSE4)); @@ -514,6 +521,13 @@ static void get_nonterminated_steering(struct intel_gt *gt, *group = 0; *instance = 0; break; + case OADDRM: + if ((VDBOX_MASK(gt) | VEBOX_MASK(gt) | gt->info.sfc_mask) & BIT(0)) + *group = 0; + else + *group = 1; + *instance = 0; + break; default: MISSING_CASE(type); *group = 0; diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index 0bb73d110a84..64aa2ba624fc 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -61,6 +61,7 @@ enum intel_steering_type { LNCF, GAM, DSS, + OADDRM, /* * On some platforms there are multiple types of MCR registers that diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 711a31935857..bae960486872 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -1598,14 +1598,27 @@ xelpg_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) debug_dump_steering(gt); } +static void +xelpmp_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) +{ + /* FIXME: Actual workarounds will be added in future patch(es) */ + + debug_dump_steering(gt); +} + static void gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal) { struct drm_i915_private *i915 = gt->i915; - /* FIXME: Media GT handling will be added in an upcoming patch */ - if (gt->type == GT_MEDIA) + if (gt->type == GT_MEDIA) { + if (MEDIA_VER(i915) >= 13) + xelpmp_gt_workarounds_init(gt, wal); + else + MISSING_CASE(MEDIA_VER(i915)); + return; + } if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)) xelpg_gt_workarounds_init(gt, wal);