From patchwork Sat Oct 1 00:45:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12996156 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1B148C433F5 for ; Sat, 1 Oct 2022 00:46:46 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 33D2910E497; Sat, 1 Oct 2022 00:46:28 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id C20BF10E497; Sat, 1 Oct 2022 00:46:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664585183; x=1696121183; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=cebqAxZzMpx1eKWdiFhn4UJE9Feg9uytXFlfnkwNmV4=; b=XcZ0k3V9dVHXnZrOIVfAooa/drvlGOF3jhLWvDnU7inIzIiTcxLpNugE WkR+sW3c8n3g2XAz7uPTVe9kWCMSqCPQaf2rASfEZcmeMMU+/yiIp3JM0 YlQAPtA8Wd8+/qaijlLpThH2vAr2mxQEU5R/+LZjWxBWG/pogOKM3Ab1B NpCG/hBvvxgY9H5G4jSxlwBMeJ63qCb1z0jPpt2HFoRT7hQ02sNGg9m1g mjqUpkQr9Q2qL7b0Fa+EiMAMnSEMUEOeBIJfgcjp50+oAy02tVMp/iTD1 TQ/G2b6rceLqLtOAfSWukA1iQs67KUOruA3bkUR1QMcWO8h9fflAGnzdz A==; X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="388607926" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="388607926" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="685477611" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="685477611" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:09 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v2 01/14] drm/i915/gen8: Create separate reg definitions for new MCR registers Date: Fri, 30 Sep 2022 17:45:37 -0700 Message-Id: <20221001004550.3031431-2-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221001004550.3031431-1-matthew.d.roper@intel.com> References: <20221001004550.3031431-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ravi.kumar.vodapalli@intel.com, balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Gen8 was the first time our hardware had multicast registers (or at least the first time the multicast nature was exposed and MMIO accesses could be steered). There are some registers that transitioned from singleton behavior to multicast during the gen7 -> gen8 transition; let's duplicate the register definitions for those registers in preparation for upcoming patches that will handle MCR registers in a special manner. The registers adjusted are: * MISCCPCTL * SAMPLER_INSTDONE * ROW_INSTDONE * ROW_CHICKEN2 * HALF_SLICE_CHICKEN1 * HALF_SLICE_CHICKEN3 v2: - Use the gen8 version of HALF_SLICE_CHICKEN3 in GVT's gen9 engine MMIO list. (Bala) - Update to the gen8 version of MISCCPCTL in a couple new workarounds that were recently added for DG2/PVC. (Bala) Cc: Balasubramani Vivekanandan Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 4 +-- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 11 +++++++- drivers/gpu/drm/i915/gt/intel_workarounds.c | 26 +++++++++---------- .../gpu/drm/i915/gt/uc/intel_guc_capture.c | 4 +-- drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 2 +- drivers/gpu/drm/i915/gvt/handlers.c | 2 +- drivers/gpu/drm/i915/gvt/mmio_context.c | 2 +- drivers/gpu/drm/i915/intel_gvt_mmio_table.c | 2 +- drivers/gpu/drm/i915/intel_pm.c | 10 +++---- 9 files changed, 36 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 2ddcad497fa3..c408bac3c533 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -1559,11 +1559,11 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine, for_each_ss_steering(iter, engine->gt, slice, subslice) { instdone->sampler[slice][subslice] = intel_gt_mcr_read(engine->gt, - GEN7_SAMPLER_INSTDONE, + GEN8_SAMPLER_INSTDONE, slice, subslice); instdone->row[slice][subslice] = intel_gt_mcr_read(engine->gt, - GEN7_ROW_INSTDONE, + GEN8_ROW_INSTDONE, slice, subslice); } diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 7f79bbf97828..ba4ce668042c 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -651,6 +651,9 @@ #define GEN7_MISCCPCTL _MMIO(0x9424) #define GEN7_DOP_CLOCK_GATE_ENABLE (1 << 0) + +#define GEN8_MISCCPCTL _MMIO(0x9424) +#define GEN8_DOP_CLOCK_GATE_ENABLE REG_BIT(0) #define GEN12_DOP_CLOCK_GATE_RENDER_ENABLE REG_BIT(1) #define GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE (1 << 2) #define GEN8_DOP_CLOCK_GATE_GUC_ENABLE (1 << 4) @@ -1072,18 +1075,22 @@ #define GEN12_GAM_DONE _MMIO(0xcf68) #define GEN7_HALF_SLICE_CHICKEN1 _MMIO(0xe100) /* IVB GT1 + VLV */ +#define GEN8_HALF_SLICE_CHICKEN1 _MMIO(0xe100) #define GEN7_MAX_PS_THREAD_DEP (8 << 12) #define GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE (1 << 10) #define GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE (1 << 4) #define GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE (1 << 3) #define GEN7_SAMPLER_INSTDONE _MMIO(0xe160) +#define GEN8_SAMPLER_INSTDONE _MMIO(0xe160) #define GEN7_ROW_INSTDONE _MMIO(0xe164) +#define GEN8_ROW_INSTDONE _MMIO(0xe164) #define HALF_SLICE_CHICKEN2 _MMIO(0xe180) #define GEN8_ST_PO_DISABLE (1 << 13) -#define HALF_SLICE_CHICKEN3 _MMIO(0xe184) +#define HSW_HALF_SLICE_CHICKEN3 _MMIO(0xe184) +#define GEN8_HALF_SLICE_CHICKEN3 _MMIO(0xe184) #define HSW_SAMPLE_C_PERFORMANCE (1 << 9) #define GEN8_CENTROID_PIXEL_OPT_DIS (1 << 8) #define GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC (1 << 5) @@ -1136,6 +1143,8 @@ #define DISABLE_EARLY_EOT REG_BIT(1) #define GEN7_ROW_CHICKEN2 _MMIO(0xe4f4) + +#define GEN8_ROW_CHICKEN2 _MMIO(0xe4f4) #define GEN12_DISABLE_READ_SUPPRESSION REG_BIT(15) #define GEN12_DISABLE_EARLY_READ REG_BIT(14) #define GEN12_ENABLE_LARGE_GRF_MODE REG_BIT(12) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index ff5220694f46..7a0f0146ee01 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -295,10 +295,10 @@ static void bdw_ctx_workarounds_init(struct intel_engine_cs *engine, * Also see the related UCGTCL1 write in bdw_init_clock_gating() * to disable EUTC clock gating. */ - wa_masked_en(wal, GEN7_ROW_CHICKEN2, + wa_masked_en(wal, GEN8_ROW_CHICKEN2, DOP_CLOCK_GATING_DISABLE); - wa_masked_en(wal, HALF_SLICE_CHICKEN3, + wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3, GEN8_SAMPLER_POWER_BYPASS_DIS); wa_masked_en(wal, HDC_CHICKEN0, @@ -386,7 +386,7 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine, IS_KABYLAKE(i915) || IS_COFFEELAKE(i915) || IS_COMETLAKE(i915)) - wa_masked_en(wal, HALF_SLICE_CHICKEN3, + wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3, GEN8_SAMPLER_POWER_BYPASS_DIS); /* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */ @@ -490,7 +490,7 @@ static void kbl_ctx_workarounds_init(struct intel_engine_cs *engine, GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION); /* WaDisableSbeCacheDispatchPortSharing:kbl */ - wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1, + wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE); } @@ -514,7 +514,7 @@ static void cfl_ctx_workarounds_init(struct intel_engine_cs *engine, GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION); /* WaDisableSbeCacheDispatchPortSharing:cfl */ - wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1, + wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE); } @@ -1517,7 +1517,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) wa_write_or(wal, GEN12_SQCM, EN_32B_ACCESS); /* Wa_14015795083 */ - wa_write_clr(wal, GEN7_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); + wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); } static void @@ -1526,7 +1526,7 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) pvc_init_mcr(gt, wal); /* Wa_14015795083 */ - wa_write_clr(wal, GEN7_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); + wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); } static void @@ -2117,7 +2117,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) { /* Wa_14013392000:dg2_g11 */ - wa_masked_en(wal, GEN7_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE); + wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE); } if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_FOREVER) || @@ -2163,7 +2163,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) DISABLE_128B_EVICTION_COMMAND_UDW); /* Wa_22012856258:dg2 */ - wa_masked_en(wal, GEN7_ROW_CHICKEN2, + wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_READ_SUPPRESSION); /* @@ -2260,7 +2260,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) || IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) { /* Wa_1606931601:tgl,rkl,dg1,adl-s,adl-p */ - wa_masked_en(wal, GEN7_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ); + wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ); /* * Wa_1407928979:tgl A* @@ -2289,7 +2289,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) { /* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */ - wa_masked_en(wal, GEN7_ROW_CHICKEN2, + wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_PUSH_CONST_DEREF_HOLD_DIS); /* @@ -2456,7 +2456,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_HASWELL(i915)) { /* WaSampleCChickenBitEnable:hsw */ wa_masked_en(wal, - HALF_SLICE_CHICKEN3, HSW_SAMPLE_C_PERFORMANCE); + HSW_HALF_SLICE_CHICKEN3, HSW_SAMPLE_C_PERFORMANCE); wa_masked_dis(wal, CACHE_MODE_0_GEN7, @@ -2754,7 +2754,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE); /* Wa_14010449647:xehpsdv */ - wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1, + wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE); /* Wa_18011725039:xehpsdv */ diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c index 8f1165146013..9495a7928bc8 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c @@ -244,8 +244,8 @@ struct __ext_steer_reg { }; static const struct __ext_steer_reg xe_extregs[] = { - {"GEN7_SAMPLER_INSTDONE", GEN7_SAMPLER_INSTDONE}, - {"GEN7_ROW_INSTDONE", GEN7_ROW_INSTDONE} + {"GEN8_SAMPLER_INSTDONE", GEN8_SAMPLER_INSTDONE}, + {"GEN8_ROW_INSTDONE", GEN8_ROW_INSTDONE} }; static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext, diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c index a0372735cddb..9229243992c2 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c @@ -35,7 +35,7 @@ static void guc_prepare_xfer(struct intel_uncore *uncore) if (GRAPHICS_VER(uncore->i915) == 9) { /* DOP Clock Gating Enable for GuC clocks */ - intel_uncore_rmw(uncore, GEN7_MISCCPCTL, + intel_uncore_rmw(uncore, GEN8_MISCCPCTL, 0, GEN8_DOP_CLOCK_GATE_GUC_ENABLE); /* allows for 5us (in 10ns units) before GT can go to RC6 */ diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c index daac2050d77d..700cc9688f47 100644 --- a/drivers/gpu/drm/i915/gvt/handlers.c +++ b/drivers/gpu/drm/i915/gvt/handlers.c @@ -2257,7 +2257,7 @@ static int init_generic_mmio_info(struct intel_gvt *gvt) MMIO_DFH(_MMIO(0x2438), D_ALL, F_CMD_ACCESS, NULL, NULL); MMIO_DFH(_MMIO(0x243c), D_ALL, F_CMD_ACCESS, NULL, NULL); MMIO_DFH(_MMIO(0x7018), D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); - MMIO_DFH(HALF_SLICE_CHICKEN3, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); + MMIO_DFH(HSW_HALF_SLICE_CHICKEN3, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); MMIO_DFH(GEN7_HALF_SLICE_CHICKEN1, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); /* display */ diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c index 1c6e941c9666..d177884d8f7d 100644 --- a/drivers/gpu/drm/i915/gvt/mmio_context.c +++ b/drivers/gpu/drm/i915/gvt/mmio_context.c @@ -111,7 +111,7 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = { {RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */ {RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */ {RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */ - {RCS0, HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */ + {RCS0, GEN8_HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */ {RCS0, GEN9_HALF_SLICE_CHICKEN5, 0xffff, true}, /* 0xe188 */ {RCS0, GEN9_HALF_SLICE_CHICKEN7, 0xffff, true}, /* 0xe194 */ {RCS0, GEN8_ROW_CHICKEN, 0xffff, true}, /* 0xe4f0 */ diff --git a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c index 8279dc580a3e..638b77d64bf4 100644 --- a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c +++ b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c @@ -102,7 +102,7 @@ static int iterate_generic_mmio(struct intel_gvt_mmio_table_iter *iter) MMIO_D(_MMIO(0x2438)); MMIO_D(_MMIO(0x243c)); MMIO_D(_MMIO(0x7018)); - MMIO_D(HALF_SLICE_CHICKEN3); + MMIO_D(HSW_HALF_SLICE_CHICKEN3); MMIO_D(GEN7_HALF_SLICE_CHICKEN1); /* display */ MMIO_F(_MMIO(0x60220), 0x20); diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 2595ec5aeb77..c694433a7da5 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4339,8 +4339,8 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv, u32 val; /* WaTempDisableDOPClkGating:bdw */ - misccpctl = intel_uncore_read(&dev_priv->uncore, GEN7_MISCCPCTL); - intel_uncore_write(&dev_priv->uncore, GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE); + misccpctl = intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL); + intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE); val = intel_uncore_read(&dev_priv->uncore, GEN8_L3SQCREG1); val &= ~L3_PRIO_CREDITS_MASK; @@ -4354,7 +4354,7 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv, */ intel_uncore_posting_read(&dev_priv->uncore, GEN8_L3SQCREG1); udelay(1); - intel_uncore_write(&dev_priv->uncore, GEN7_MISCCPCTL, misccpctl); + intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl); } static void icl_init_clock_gating(struct drm_i915_private *dev_priv) @@ -4514,8 +4514,8 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv) gen9_init_clock_gating(dev_priv); /* WaDisableDopClockGating:skl */ - intel_uncore_write(&dev_priv->uncore, GEN7_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN7_MISCCPCTL) & - ~GEN7_DOP_CLOCK_GATE_ENABLE); + intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL) & + ~GEN8_DOP_CLOCK_GATE_ENABLE); /* WAC6entrylatency:skl */ intel_uncore_write(&dev_priv->uncore, FBC_LLC_READ_CTRL, intel_uncore_read(&dev_priv->uncore, FBC_LLC_READ_CTRL) | From patchwork Sat Oct 1 00:45:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12996155 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0D3D8C433FE for ; Sat, 1 Oct 2022 00:46:29 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2BBFE10E495; Sat, 1 Oct 2022 00:46:27 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 72B4E10E495; Sat, 1 Oct 2022 00:46:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664585182; x=1696121182; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=E88gVA9hqYe5JDzwGF6tdG2Vo+AMQtpf/zeW1Gk2FoM=; b=JJZAm9mOeqli6O3+J7DSwXjEGs3yicLjGsBDOD7acFeJkVPAzmz4THU7 MlaSsFv2DVdHLWSmYLll5rYEuDLxlpdktzaJwhRSg3mXIEx8dohbQqrl0 qGAN9BALjsbK1OtCTIV3OP964fAyxzbESvwDPkwPnQjZtLvGPuLONuHKg eE7FTioq+SRr39v2Lc39BzqkJSAVTg/Od2TLygm/LbtQeAD9Jgm8P/kXz rcKsWg3eRXJfcy9+csKdTXPNxSb4llXME141zJNpiN9LLHRy3rqSVZyiz JiurYoQ7GFr810T8w9PsF9/wzYWuiXppv8TVv6cu0DhfguNmo57HNvXdg A==; X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="388607925" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="388607925" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="685477616" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="685477616" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:09 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v2 02/14] drm/i915/xehp: Create separate reg definitions for new MCR registers Date: Fri, 30 Sep 2022 17:45:38 -0700 Message-Id: <20221001004550.3031431-3-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221001004550.3031431-1-matthew.d.roper@intel.com> References: <20221001004550.3031431-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ravi.kumar.vodapalli@intel.com, balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Starting in Xe_HP, several registers our driver works with have been converted from singleton registers into replicated registers with multicast behavior. Although the registers are still located at the same MMIO offsets as on previous platforms, let's duplicate the register definitions in preparation for upcoming patches that will handle multicast registers in a special manner. The registers that are now replicated on Xe_HP are: * PAT_INDEX (mslice replication) * FF_MODE2 (gslice replication) * COMMON_SLICE_CHICKEN3 (gslice replication) * SLICE_COMMON_ECO_CHICKEN1 (gslice replication) * SLICE_UNIT_LEVEL_CLKGATE (gslice replication) * LNCFCMOCS (lncf replication) Note that there are a couple places in selftest_mocs.c where the gen9 version of LNCFCMOCS is still used without regards for which platform we're on. Those cases are just doing an offset lookup and not issuing any CPU reads/writes of the register, so the potentially multicast nature of the register doesn't come into play. v2: - Add commit message note about the unconditional GEN9_LNCFCMOCS usage in selftest_mocs. (Bala) - Include some additional TLB registers. Bspec: 66534 Cc: Balasubramani Vivekanandan Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_ggtt.c | 4 ++-- drivers/gpu/drm/i915/gt/intel_gt.c | 18 ++++++++++++-- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 26 +++++++++++++++------ drivers/gpu/drm/i915/gt/intel_gtt.c | 22 ++++++++++++++--- drivers/gpu/drm/i915/gt/intel_gtt.h | 2 +- drivers/gpu/drm/i915/gt/intel_mocs.c | 5 +++- drivers/gpu/drm/i915/gt/intel_workarounds.c | 24 +++++++++---------- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 7 ++++-- 8 files changed, 78 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index b31fe0fb013f..af5b98b56a49 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -986,7 +986,7 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt) ggtt->vm.pte_encode = gen8_ggtt_pte_encode; - setup_private_pat(ggtt->vm.gt->uncore); + setup_private_pat(ggtt->vm.gt); return ggtt_probe_common(ggtt, size); } @@ -1302,7 +1302,7 @@ void i915_ggtt_resume(struct i915_ggtt *ggtt) wbinvd_on_all_cpus(); if (GRAPHICS_VER(ggtt->vm.i915) >= 8) - setup_private_pat(ggtt->vm.gt->uncore); + setup_private_pat(ggtt->vm.gt); intel_ggtt_restore_fences(ggtt); } diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index b367cfff48d5..445e171940fa 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -314,7 +314,11 @@ static void gen8_check_faults(struct intel_gt *gt) i915_reg_t fault_reg, fault_data0_reg, fault_data1_reg; u32 fault; - if (GRAPHICS_VER(gt->i915) >= 12) { + if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) { + fault_reg = XEHP_RING_FAULT_REG; + fault_data0_reg = XEHP_FAULT_TLB_DATA0; + fault_data1_reg = XEHP_FAULT_TLB_DATA1; + } else if (GRAPHICS_VER(gt->i915) >= 12) { fault_reg = GEN12_RING_FAULT_REG; fault_data0_reg = GEN12_FAULT_TLB_DATA0; fault_data1_reg = GEN12_FAULT_TLB_DATA1; @@ -990,6 +994,13 @@ static void mmio_invalidate_full(struct intel_gt *gt) [COPY_ENGINE_CLASS] = GEN12_BLT_TLB_INV_CR, [COMPUTE_CLASS] = GEN12_COMPCTX_TLB_INV_CR, }; + static const i915_reg_t xehp_regs[] = { + [RENDER_CLASS] = XEHP_GFX_TLB_INV_CR, + [VIDEO_DECODE_CLASS] = XEHP_VD_TLB_INV_CR, + [VIDEO_ENHANCEMENT_CLASS] = XEHP_VE_TLB_INV_CR, + [COPY_ENGINE_CLASS] = XEHP_BLT_TLB_INV_CR, + [COMPUTE_CLASS] = XEHP_COMPCTX_TLB_INV_CR, + }; struct drm_i915_private *i915 = gt->i915; struct intel_uncore *uncore = gt->uncore; struct intel_engine_cs *engine; @@ -998,7 +1009,10 @@ static void mmio_invalidate_full(struct intel_gt *gt) const i915_reg_t *regs; unsigned int num = 0; - if (GRAPHICS_VER(i915) == 12) { + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + regs = xehp_regs; + num = ARRAY_SIZE(xehp_regs); + } else if (GRAPHICS_VER(i915) == 12) { regs = gen12_regs; num = ARRAY_SIZE(gen12_regs); } else if (GRAPHICS_VER(i915) >= 8 && GRAPHICS_VER(i915) <= 11) { diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index ba4ce668042c..68f45a748712 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -333,6 +333,7 @@ #define GEN7_TLB_RD_ADDR _MMIO(0x4700) #define GEN12_PAT_INDEX(index) _MMIO(0x4800 + (index) * 4) +#define XEHP_PAT_INDEX(index) _MMIO(0x4800 + (index) * 4) #define XEHP_TILE0_ADDR_RANGE _MMIO(0x4900) #define XEHP_TILE_LMEM_RANGE_SHIFT 8 @@ -391,7 +392,8 @@ #define DIS_OVER_FETCH_CACHE REG_BIT(1) #define DIS_MULT_MISS_RD_SQUASH REG_BIT(0) -#define FF_MODE2 _MMIO(0x6604) +#define GEN12_FF_MODE2 _MMIO(0x6604) +#define XEHP_FF_MODE2 _MMIO(0x6604) #define FF_MODE2_GS_TIMER_MASK REG_GENMASK(31, 24) #define FF_MODE2_GS_TIMER_224 REG_FIELD_PREP(FF_MODE2_GS_TIMER_MASK, 224) #define FF_MODE2_TDS_TIMER_MASK REG_GENMASK(23, 16) @@ -446,6 +448,7 @@ #define GEN8_HDC_CHICKEN1 _MMIO(0x7304) #define GEN11_COMMON_SLICE_CHICKEN3 _MMIO(0x7304) +#define XEHP_COMMON_SLICE_CHICKEN3 _MMIO(0x7304) #define DG1_FLOAT_POINT_BLEND_OPT_STRICT_MODE_EN REG_BIT(12) #define XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE REG_BIT(12) #define GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC REG_BIT(11) @@ -459,10 +462,9 @@ #define DISABLE_PIXEL_MASK_CAMMING (1 << 14) #define GEN9_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) -#define GEN11_STATE_CACHE_REDIRECT_TO_CS (1 << 11) - -#define SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) +#define XEHP_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) #define MSC_MSAA_REODER_BUF_BYPASS_DISABLE REG_BIT(14) +#define GEN11_STATE_CACHE_REDIRECT_TO_CS (1 << 11) #define GEN9_SLICE_PGCTL_ACK(slice) _MMIO(0x804c + (slice) * 0x4) #define GEN10_SLICE_PGCTL_ACK(slice) _MMIO(0x804c + ((slice) / 3) * 0x34 + \ @@ -707,7 +709,8 @@ #define GAMTLBVEBOX0_CLKGATE_DIS REG_BIT(16) #define LTCDD_CLKGATE_DIS REG_BIT(10) -#define SLICE_UNIT_LEVEL_CLKGATE _MMIO(0x94d4) +#define GEN11_SLICE_UNIT_LEVEL_CLKGATE _MMIO(0x94d4) +#define XEHP_SLICE_UNIT_LEVEL_CLKGATE _MMIO(0x94d4) #define SARBUNIT_CLKGATE_DIS (1 << 5) #define RCCUNIT_CLKGATE_DIS (1 << 7) #define MSCUNIT_CLKGATE_DIS (1 << 10) @@ -722,7 +725,7 @@ #define VSUNIT_CLKGATE_DIS_TGL REG_BIT(19) #define PSDUNIT_CLKGATE_DIS REG_BIT(5) -#define SUBSLICE_UNIT_LEVEL_CLKGATE _MMIO(0x9524) +#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE _MMIO(0x9524) #define DSS_ROUTER_CLKGATE_DIS REG_BIT(28) #define GWUNIT_CLKGATE_DIS REG_BIT(16) @@ -947,7 +950,8 @@ /* MOCS (Memory Object Control State) registers */ #define GEN9_LNCFCMOCS(i) _MMIO(0xb020 + (i) * 4) /* L3 Cache Control */ -#define GEN9_LNCFCMOCS_REG_COUNT 32 +#define XEHP_LNCFCMOCS(i) _MMIO(0xb020 + (i) * 4) /* L3 Cache Control */ +#define LNCFCMOCS_REG_COUNT 32 #define GEN7_L3CNTLREG3 _MMIO(0xb024) @@ -1039,11 +1043,14 @@ #define GEN9_BLT_MOCS(i) _MMIO(__GEN9_BCS0_MOCS0 + (i) * 4) #define GEN12_FAULT_TLB_DATA0 _MMIO(0xceb8) +#define XEHP_FAULT_TLB_DATA0 _MMIO(0xceb8) #define GEN12_FAULT_TLB_DATA1 _MMIO(0xcebc) +#define XEHP_FAULT_TLB_DATA1 _MMIO(0xcebc) #define FAULT_VA_HIGH_BITS (0xf << 0) #define FAULT_GTT_SEL (1 << 4) #define GEN12_RING_FAULT_REG _MMIO(0xcec4) +#define XEHP_RING_FAULT_REG _MMIO(0xcec4) #define GEN8_RING_FAULT_ENGINE_ID(x) (((x) >> 12) & 0x7) #define RING_FAULT_GTTSEL_MASK (1 << 11) #define RING_FAULT_SRCID(x) (((x) >> 3) & 0xff) @@ -1051,10 +1058,15 @@ #define RING_FAULT_VALID (1 << 0) #define GEN12_GFX_TLB_INV_CR _MMIO(0xced8) +#define XEHP_GFX_TLB_INV_CR _MMIO(0xced8) #define GEN12_VD_TLB_INV_CR _MMIO(0xcedc) +#define XEHP_VD_TLB_INV_CR _MMIO(0xcedc) #define GEN12_VE_TLB_INV_CR _MMIO(0xcee0) +#define XEHP_VE_TLB_INV_CR _MMIO(0xcee0) #define GEN12_BLT_TLB_INV_CR _MMIO(0xcee4) +#define XEHP_BLT_TLB_INV_CR _MMIO(0xcee4) #define GEN12_COMPCTX_TLB_INV_CR _MMIO(0xcf04) +#define XEHP_COMPCTX_TLB_INV_CR _MMIO(0xcf04) #define GEN12_MERT_MOD_CTRL _MMIO(0xcf28) #define RENDER_MOD_CTRL _MMIO(0xcf2c) diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index 2eaeba14319e..5ce98c30728a 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -15,6 +15,7 @@ #include "i915_trace.h" #include "i915_utils.h" #include "intel_gt.h" +#include "intel_gt_mcr.h" #include "intel_gt_regs.h" #include "intel_gtt.h" @@ -493,6 +494,18 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore) intel_uncore_write(uncore, GEN12_PAT_INDEX(7), GEN8_PPAT_WB); } +static void xehp_setup_private_ppat(struct intel_gt *gt) +{ + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB); + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC); + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT); + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC); + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB); + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB); + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB); + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB); +} + static void icl_setup_private_ppat(struct intel_uncore *uncore) { intel_uncore_write(uncore, @@ -585,13 +598,16 @@ static void chv_setup_private_ppat(struct intel_uncore *uncore) intel_uncore_write(uncore, GEN8_PRIVATE_PAT_HI, upper_32_bits(pat)); } -void setup_private_pat(struct intel_uncore *uncore) +void setup_private_pat(struct intel_gt *gt) { - struct drm_i915_private *i915 = uncore->i915; + struct intel_uncore *uncore = gt->uncore; + struct drm_i915_private *i915 = gt->i915; GEM_BUG_ON(GRAPHICS_VER(i915) < 8); - if (GRAPHICS_VER(i915) >= 12) + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) + xehp_setup_private_ppat(gt); + else if (GRAPHICS_VER(i915) >= 12) tgl_setup_private_ppat(uncore); else if (GRAPHICS_VER(i915) >= 11) icl_setup_private_ppat(uncore); diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index c0ca53cba9f0..5236c60f2a68 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -667,7 +667,7 @@ void ppgtt_unbind_vma(struct i915_address_space *vm, void gtt_write_workarounds(struct intel_gt *gt); -void setup_private_pat(struct intel_uncore *uncore); +void setup_private_pat(struct intel_gt *gt); int i915_vm_alloc_pt_stash(struct i915_address_space *vm, struct i915_vm_pt_stash *stash, diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c index c6ebe2781076..06643701bf24 100644 --- a/drivers/gpu/drm/i915/gt/intel_mocs.c +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c @@ -608,7 +608,10 @@ static void init_l3cc_table(struct intel_uncore *uncore, u32 l3cc; for_each_l3cc(l3cc, table, i) - intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc); + if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50)) + intel_uncore_write_fw(uncore, XEHP_LNCFCMOCS(i), l3cc); + else + intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc); } void intel_mocs_init_engine(struct intel_engine_cs *engine) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 7a0f0146ee01..7c237eeeac85 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -572,7 +572,7 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine, wa_write_clr_set(wal, GEN11_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK, REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f)); wa_add(wal, - FF_MODE2, + XEHP_FF_MODE2, FF_MODE2_TDS_TIMER_MASK, FF_MODE2_TDS_TIMER_128, 0, false); @@ -599,7 +599,7 @@ static void gen12_ctx_gt_tuning_init(struct intel_engine_cs *engine, * verification is ignored. */ wa_add(wal, - FF_MODE2, + GEN12_FF_MODE2, FF_MODE2_TDS_TIMER_MASK, FF_MODE2_TDS_TIMER_128, 0, false); @@ -637,7 +637,7 @@ static void gen12_ctx_workarounds_init(struct intel_engine_cs *engine, * to Wa_1608008084. */ wa_add(wal, - FF_MODE2, + GEN12_FF_MODE2, FF_MODE2_GS_TIMER_MASK, FF_MODE2_GS_TIMER_224, 0, false); @@ -670,7 +670,7 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine, if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) { /* Wa_14010469329:dg2_g10 */ - wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3, + wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3, XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE); /* @@ -678,12 +678,12 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine, * Wa_22010613112:dg2_g10 * Wa_14010698770:dg2_g10 */ - wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3, + wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3, GEN12_DISABLE_CPS_AWARE_COLOR_PIPE); } /* Wa_16013271637:dg2 */ - wa_masked_en(wal, SLICE_COMMON_ECO_CHICKEN1, + wa_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1, MSC_MSAA_REODER_BUF_BYPASS_DISABLE); /* Wa_14014947963:dg2 */ @@ -1265,14 +1265,14 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) /* Wa_1406680159:icl,ehl */ wa_write_or(wal, - SUBSLICE_UNIT_LEVEL_CLKGATE, + GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE, GWUNIT_CLKGATE_DIS); /* Wa_1607087056:icl,ehl,jsl */ if (IS_ICELAKE(i915) || IS_JSL_EHL_GRAPHICS_STEP(i915, STEP_A0, STEP_B0)) wa_write_or(wal, - SLICE_UNIT_LEVEL_CLKGATE, + GEN11_SLICE_UNIT_LEVEL_CLKGATE, L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS); /* @@ -1332,7 +1332,7 @@ tgl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) /* Wa_1607087056:tgl also know as BUG:1409180338 */ if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0)) wa_write_or(wal, - SLICE_UNIT_LEVEL_CLKGATE, + GEN11_SLICE_UNIT_LEVEL_CLKGATE, L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS); /* Wa_1408615072:tgl[a0] */ @@ -1351,7 +1351,7 @@ dg1_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) /* Wa_1607087056:dg1 */ if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0)) wa_write_or(wal, - SLICE_UNIT_LEVEL_CLKGATE, + GEN11_SLICE_UNIT_LEVEL_CLKGATE, L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS); /* Wa_1409420604:dg1 */ @@ -1455,7 +1455,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) CG3DDISCFEG_CLKGATE_DIS); /* Wa_14011006942:dg2 */ - wa_write_or(wal, SUBSLICE_UNIT_LEVEL_CLKGATE, + wa_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE, DSS_ROUTER_CLKGATE_DIS); } @@ -1467,7 +1467,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) wa_write_or(wal, UNSLCGCTL9444, LTCDD_CLKGATE_DIS); /* Wa_14011371254:dg2_g10 */ - wa_write_or(wal, SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS); + wa_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS); /* Wa_14011431319:dg2_g10 */ wa_write_or(wal, UNSLCGCTL9440, GAMTLBOACS_CLKGATE_DIS | diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 657f0beb8e06..cc357fa0c270 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -373,8 +373,11 @@ static int guc_mmio_regset_init(struct temp_regset *regset, false); /* add in local MOCS registers */ - for (i = 0; i < GEN9_LNCFCMOCS_REG_COUNT; i++) - ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false); + for (i = 0; i < LNCFCMOCS_REG_COUNT; i++) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) + ret |= GUC_MMIO_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false); + else + ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false); return ret ? -1 : 0; } From patchwork Sat Oct 1 00:45:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12996157 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 14272C433FE for ; Sat, 1 Oct 2022 00:46:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6B96610E49A; Sat, 1 Oct 2022 00:46:30 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 666BE10E49A; Sat, 1 Oct 2022 00:46:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664585183; x=1696121183; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ETFBzxia7PYQiQmClNGw3g/LnwleYkSloKnVrQYIFGo=; b=RGLiTsP7XHyHy51yHHBBGTuNp9v/09Bfo2NFFHSnu39jCVsYv7u4iWGc 0qquuKkSIu3FT1ipeMv9yqKa8DhMV3HfCW8jrQ4QmzNGryxIZEqMT2XUD cc47cTFpltHmQJhENraywx+1iqbQsT3i95G5Vf6Yy1H+zhK7GkusssaNP BxIgxtzpkUfFGdviRmhjbzDYPDUezVDp07OSI9v1wvZpTNmkY8qATPLx4 i3bUxMv1oHnTN7DRLAgin7LMyjWTYxMWCAU2cNQDynBpEawMjmLbWbVhl w0nJEF3B4nQK9cBouweTp86V1XK2DvfpdYMPLl1/jDaMSsBKSM0cM2W7c Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="388607929" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="388607929" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="685477619" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="685477619" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:09 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v2 03/14] drm/i915/gt: Drop a few unused register definitions Date: Fri, 30 Sep 2022 17:45:39 -0700 Message-Id: <20221001004550.3031431-4-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221001004550.3031431-1-matthew.d.roper@intel.com> References: <20221001004550.3031431-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ravi.kumar.vodapalli@intel.com, balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Let's drop a few register definitions that are unused anywhere in the driver today. Since the referenced offsets are part of what is now considered a multicast register region, the current definitions would not be correct for use on any future platform. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 17 ----------------- 1 file changed, 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 68f45a748712..3a50e8e966aa 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -454,13 +454,6 @@ #define GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC REG_BIT(11) #define GEN12_DISABLE_CPS_AWARE_COLOR_PIPE REG_BIT(9) -/* GEN9 chicken */ -#define SLICE_ECO_CHICKEN0 _MMIO(0x7308) -#define PIXEL_MASK_CAMMING_DISABLE (1 << 14) - -#define GEN9_SLICE_COMMON_ECO_CHICKEN0 _MMIO(0x7308) -#define DISABLE_PIXEL_MASK_CAMMING (1 << 14) - #define GEN9_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) #define XEHP_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) #define MSC_MSAA_REODER_BUF_BYPASS_DISABLE REG_BIT(14) @@ -967,11 +960,6 @@ #define GEN7_L3LOG(slice, i) _MMIO(0xb070 + (slice) * 0x200 + (i) * 4) #define GEN7_L3LOG_SIZE 0x80 -#define GEN10_SCRATCH_LNCF2 _MMIO(0xb0a0) -#define PMFLUSHDONE_LNICRSDROP (1 << 20) -#define PMFLUSH_GAPL3UNBLOCK (1 << 21) -#define PMFLUSHDONE_LNEBLK (1 << 22) - #define XEHP_L3NODEARBCFG _MMIO(0xb0b4) #define XEHP_LNESPARE REG_BIT(19) @@ -986,9 +974,6 @@ #define L3_HIGH_PRIO_CREDITS(x) (((x) >> 1) << 14) #define L3_PRIO_CREDITS_MASK ((0x1f << 19) | (0x1f << 14)) -#define GEN10_L3_CHICKEN_MODE_REGISTER _MMIO(0xb114) -#define GEN11_I2M_WRITE_DISABLE (1 << 28) - #define GEN8_L3SQCREG4 _MMIO(0xb118) #define GEN11_LQSC_CLEAN_EVICT_DISABLE (1 << 6) #define GEN8_LQSC_RO_PERF_DIS (1 << 27) @@ -1191,8 +1176,6 @@ #define SARB_CHICKEN1 _MMIO(0xe90c) #define COMP_CKN_IN REG_GENMASK(30, 29) -#define GEN7_HALF_SLICE_CHICKEN1_GT2 _MMIO(0xf100) - #define GEN7_ROW_CHICKEN2_GT2 _MMIO(0xf4f4) #define DOP_CLOCK_GATING_DISABLE (1 << 0) #define PUSH_CONSTANT_DEREF_DISABLE (1 << 8) From patchwork Sat Oct 1 00:45:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12996176 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BE058C433F5 for ; Sat, 1 Oct 2022 00:47:25 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8ED1210EDFE; Sat, 1 Oct 2022 00:46:35 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id ABB5510EDEF; Sat, 1 Oct 2022 00:46:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664585183; x=1696121183; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=b+aD2yjtf2KCOHCVbis/LUePOyrBmUrSGGajN7t0jOc=; b=aZSKxyI8Adm/ay0Q8ED9qRVXHmrQAVpbZ302UywzzmY1anGVDTR4fDdU Rbh5no4nraeN+nhr10DS0nbAgTdWjZLsKXZybWzM5azYJdi25zGIPaPj/ DRu8lgqo+lEULA4HXSF1Fn9VnvJMIcHCLbGGVDpY5hlUhwmaYIrfIiXHc JzYynPunG6PmMt3U71A/FWBASeQa9boKa8FfbkyHUxlMFEKYcaNzZ2CVD pbm6cOf5cpwVoLDbqZ99RymyD4l9OUukXodwS4dJvKzW8YzqEOzZmZoRx 72sc/zc3eo3Em826y39KK3bSVu9MZYruybYqasRzb2QRlF7fptMpqbWuV w==; X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="388607935" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="388607935" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="685477622" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="685477622" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:09 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v2 04/14] drm/i915/gt: Correct prefix on a few registers Date: Fri, 30 Sep 2022 17:45:40 -0700 Message-Id: <20221001004550.3031431-5-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221001004550.3031431-1-matthew.d.roper@intel.com> References: <20221001004550.3031431-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ravi.kumar.vodapalli@intel.com, balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" We have a few registers that have existed for several hardware generations, but are only used by the driver on Xe_HP and beyond. In cases where the Xe_HP version of the register is now replicated and uses multicast behavior, but earlier generations were singleton, let's change the register prefix to "XEHP_" to help clarify that we're using the newer multicast form of the register. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 8 ++++---- drivers/gpu/drm/i915/gt/intel_workarounds.c | 10 +++++----- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 3a50e8e966aa..86757dd22c13 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -486,7 +486,7 @@ #define GEN8_RC6_CTX_INFO _MMIO(0x8504) -#define GEN12_SQCM _MMIO(0x8724) +#define XEHP_SQCM _MMIO(0x8724) #define EN_32B_ACCESS REG_BIT(30) #define HSW_IDICR _MMIO(0x9008) @@ -989,7 +989,7 @@ #define GEN11_SCRATCH2 _MMIO(0xb140) #define GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE (1 << 19) -#define GEN11_L3SQCREG5 _MMIO(0xb158) +#define XEHP_L3SQCREG5 _MMIO(0xb158) #define L3_PWM_TIMER_INIT_VAL_MASK REG_GENMASK(9, 0) #define MLTICTXCTL _MMIO(0xb170) @@ -1053,7 +1053,7 @@ #define GEN12_COMPCTX_TLB_INV_CR _MMIO(0xcf04) #define XEHP_COMPCTX_TLB_INV_CR _MMIO(0xcf04) -#define GEN12_MERT_MOD_CTRL _MMIO(0xcf28) +#define XEHP_MERT_MOD_CTRL _MMIO(0xcf28) #define RENDER_MOD_CTRL _MMIO(0xcf2c) #define COMP_MOD_CTRL _MMIO(0xcf30) #define VDBX_MOD_CTRL _MMIO(0xcf34) @@ -1155,7 +1155,7 @@ #define EU_PERF_CNTL1 _MMIO(0xe558) #define EU_PERF_CNTL5 _MMIO(0xe55c) -#define GEN12_HDC_CHICKEN0 _MMIO(0xe5f0) +#define XEHP_HDC_CHICKEN0 _MMIO(0xe5f0) #define LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK REG_GENMASK(13, 11) #define ICL_HDC_MODE _MMIO(0xe5f4) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 7c237eeeac85..7bb878581d00 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -569,7 +569,7 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) { wa_masked_en(wal, CHICKEN_RASTER_2, TBIMR_FAST_CLIP); - wa_write_clr_set(wal, GEN11_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK, + wa_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK, REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f)); wa_add(wal, XEHP_FF_MODE2, @@ -1514,7 +1514,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) * recommended tuning settings documented in the bspec's * performance guide section. */ - wa_write_or(wal, GEN12_SQCM, EN_32B_ACCESS); + wa_write_or(wal, XEHP_SQCM, EN_32B_ACCESS); /* Wa_14015795083 */ wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); @@ -2170,7 +2170,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) * Wa_22010960976:dg2 * Wa_14013347512:dg2 */ - wa_masked_dis(wal, GEN12_HDC_CHICKEN0, + wa_masked_dis(wal, XEHP_HDC_CHICKEN0, LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK); } @@ -2223,7 +2223,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0) || IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) { /* Wa_14012362059:dg2 */ - wa_write_or(wal, GEN12_MERT_MOD_CTRL, FORCE_MISS_FTLB); + wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB); } if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_B0, STEP_FOREVER) || @@ -2764,7 +2764,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li } /* Wa_14012362059:xehpsdv */ - wa_write_or(wal, GEN12_MERT_MOD_CTRL, FORCE_MISS_FTLB); + wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB); /* Wa_14014368820:xehpsdv */ wa_write_or(wal, GEN12_GAMCNTRL_CTRL, INVALIDATION_BROADCAST_MODE_DIS | From patchwork Sat Oct 1 00:45:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12996177 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1F37EC433FE for ; Sat, 1 Oct 2022 00:47:28 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0116110EDF5; Sat, 1 Oct 2022 00:46:37 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9B32E10E4A7; Sat, 1 Oct 2022 00:46:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664585183; x=1696121183; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zxCyASFbj71ZpPe16+aNvvynmFM0KTbbX96HLla1SlI=; b=c6bFHeLYmR0cYDoFqFLJLvSAChxPDIrcdO/8MJ/0ZkacHltP5XvvCZnz ujXbsnWSkCUiHRRwRFtMiETMWbrBMtG1GRg2s3YuGeuKLnST85ZnvDOkK lr1GIinU857bTHb6qR7YWLiy7nGDvpE9gC0brKvK7I02ff6jGIJngtt8s brDSsKpsXw1N0ZOnpOq1GMXr/laG/WvvaQFBun8pV/1C/FBla55xUwqRx 63IpUycTGUx6UWph+xjfV38ehedGqaYSGM7IU5l+oFYCeKdQ+aFvfr++a j5E/il6y6OqcXOxbHRflzNm9//m+6t7iO3fB3Y45rk5msn3Fsi1ZjZOfS w==; X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="388607927" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="388607927" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="685477625" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="685477625" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:09 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v2 05/14] drm/i915/gt: Add intel_gt_mcr_multicast_rmw() operation Date: Fri, 30 Sep 2022 17:45:41 -0700 Message-Id: <20221001004550.3031431-6-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221001004550.3031431-1-matthew.d.roper@intel.com> References: <20221001004550.3031431-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ravi.kumar.vodapalli@intel.com, balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" There are cases where we wish to read from any non-terminated MCR register instance (or the primary instance in the case of GAM ranges), clear/set some bits, and then write the value back out to the register in a multicast manner. Adding a "multicast RMW" will avoid the need to open-code this. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 24 ++++++++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_gt_mcr.h | 3 +++ 2 files changed, 27 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c index a2047a68ea7a..962d3e974384 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c @@ -302,6 +302,30 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 va intel_uncore_write_fw(gt->uncore, reg, value); } +/** + * intel_gt_mcr_multicast_rmw - Performs a multicast RMW operations + * @gt: GT structure + * @reg: the MCR register to read and write + * @clear: bits to clear during RMW + * @set: bits to set during RMW + * + * Performs a read-modify-write on an MCR register in a multicast manner. + * This operation only makes sense on MCR registers where all instances are + * expected to have the same value. The read will target any non-terminated + * instance and the write will be applied to all instances. + * + * This function assumes the caller is already holding any necessary forcewake + * domains; use intel_gt_mcr_multicast_rmw() in cases where forcewake should + * be obtained automatically. + */ +void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg, + u32 clear, u32 set) +{ + u32 val = intel_gt_mcr_read_any(gt, reg); + + intel_gt_mcr_multicast_write(gt, reg, (val & ~clear) | set); +} + /* * reg_needs_read_steering - determine whether a register read requires * explicit steering diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h index 77a8b11c287d..e47e75e66835 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h @@ -24,6 +24,9 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt, void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 value); +void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg, + u32 clear, u32 set); + void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt, i915_reg_t reg, u8 *group, u8 *instance); From patchwork Sat Oct 1 00:45:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12996181 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 98427C433FE for ; Sat, 1 Oct 2022 00:47:42 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 97BD010EE01; Sat, 1 Oct 2022 00:46:43 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id D161310EDF3; Sat, 1 Oct 2022 00:46:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664585183; x=1696121183; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=MWmYQejfGrh/T2TU2UgZu/SEYh2twhb2uhi56qcGliQ=; b=FRAEus8YPgwcdGbXo9/SU5k6BeEudxZHsZNCJAFpZ7YppUdhTHR9B5Ak g2V9qTtNmSFc7qf9Exw+25xKDdUWLfD5799yKhkbCphLOz5CKfexnCqQ0 IOjyfZZq5ZrDYYXDgfUtJmGiOSMJ6wLhIWmv3d0ZFXITdPFuY9MgUTl9I G38OcQ1QJkH9Ib5shH0d3NEF9h2FfqnWlglsZDjt7z044MEckuUpXLCBi wCOqtTJWhb2odd7fgt7n3yJ/txyVPZT78QV4AyPdHHswxI5ehOMIolDOi 4TBMs6mVA3dfdNiwLIjCCxhy/x8CabMA6+O46usWshpsWIzMpSCHfcZgn A==; X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="388607936" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="388607936" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="685477629" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="685477629" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:09 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v2 06/14] drm/i915/xehp: Check for faults on primary GAM Date: Fri, 30 Sep 2022 17:45:42 -0700 Message-Id: <20221001004550.3031431-7-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221001004550.3031431-1-matthew.d.roper@intel.com> References: <20221001004550.3031431-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ravi.kumar.vodapalli@intel.com, balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" On Xe_HP the fault registers are now in a multicast register range. However as part of the GAM these registers follow special rules and we need only read from the "primary" GAM's instance to get the information we need. So a single intel_gt_mcr_read_any() (which will automatically steer to the primary GAM) is sufficient; we don't need to loop over each instance of the MCR register. v2: - Update more instances of fault registers. (Bala) Cc: Balasubramani Vivekanandan Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt.c | 52 +++++++++++++++++++++++---- drivers/gpu/drm/i915/i915_gpu_error.c | 12 +++++-- 2 files changed, 55 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 445e171940fa..e14f159ad9fc 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -270,7 +270,11 @@ intel_gt_clear_error_registers(struct intel_gt *gt, I915_MASTER_ERROR_INTERRUPT); } - if (GRAPHICS_VER(i915) >= 12) { + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + intel_gt_mcr_multicast_rmw(gt, XEHP_RING_FAULT_REG, + RING_FAULT_VALID, 0); + intel_gt_mcr_read_any(gt, XEHP_RING_FAULT_REG); + } else if (GRAPHICS_VER(i915) >= 12) { rmw_clear(uncore, GEN12_RING_FAULT_REG, RING_FAULT_VALID); intel_uncore_posting_read(uncore, GEN12_RING_FAULT_REG); } else if (GRAPHICS_VER(i915) >= 8) { @@ -308,17 +312,49 @@ static void gen6_check_faults(struct intel_gt *gt) } } +static void xehp_check_faults(struct intel_gt *gt) +{ + u32 fault; + + /* + * Although the fault register now lives in an MCR register range, + * the GAM registers are special and we only truly need to read + * the "primary" GAM instance rather than handling each instance + * individually. intel_gt_mcr_read_any() will automatically steer + * toward the primary instance. + */ + fault = intel_gt_mcr_read_any(gt, XEHP_RING_FAULT_REG); + if (fault & RING_FAULT_VALID) { + u32 fault_data0, fault_data1; + u64 fault_addr; + + fault_data0 = intel_gt_mcr_read_any(gt, XEHP_FAULT_TLB_DATA0); + fault_data1 = intel_gt_mcr_read_any(gt, XEHP_FAULT_TLB_DATA1); + + fault_addr = ((u64)(fault_data1 & FAULT_VA_HIGH_BITS) << 44) | + ((u64)fault_data0 << 12); + + drm_dbg(>->i915->drm, "Unexpected fault\n" + "\tAddr: 0x%08x_%08x\n" + "\tAddress space: %s\n" + "\tEngine ID: %d\n" + "\tSource ID: %d\n" + "\tType: %d\n", + upper_32_bits(fault_addr), lower_32_bits(fault_addr), + fault_data1 & FAULT_GTT_SEL ? "GGTT" : "PPGTT", + GEN8_RING_FAULT_ENGINE_ID(fault), + RING_FAULT_SRCID(fault), + RING_FAULT_FAULT_TYPE(fault)); + } +} + static void gen8_check_faults(struct intel_gt *gt) { struct intel_uncore *uncore = gt->uncore; i915_reg_t fault_reg, fault_data0_reg, fault_data1_reg; u32 fault; - if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) { - fault_reg = XEHP_RING_FAULT_REG; - fault_data0_reg = XEHP_FAULT_TLB_DATA0; - fault_data1_reg = XEHP_FAULT_TLB_DATA1; - } else if (GRAPHICS_VER(gt->i915) >= 12) { + if (GRAPHICS_VER(gt->i915) >= 12) { fault_reg = GEN12_RING_FAULT_REG; fault_data0_reg = GEN12_FAULT_TLB_DATA0; fault_data1_reg = GEN12_FAULT_TLB_DATA1; @@ -358,7 +394,9 @@ void intel_gt_check_and_clear_faults(struct intel_gt *gt) struct drm_i915_private *i915 = gt->i915; /* From GEN8 onwards we only have one 'All Engine Fault Register' */ - if (GRAPHICS_VER(i915) >= 8) + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) + xehp_check_faults(gt); + else if (GRAPHICS_VER(i915) >= 8) gen8_check_faults(gt); else if (GRAPHICS_VER(i915) >= 6) gen6_check_faults(gt); diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 9ea2fe34e7d3..f2d53edcd2ee 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1221,7 +1221,10 @@ static void engine_record_registers(struct intel_engine_coredump *ee) if (GRAPHICS_VER(i915) >= 6) { ee->rc_psmi = ENGINE_READ(engine, RING_PSMI_CTL); - if (GRAPHICS_VER(i915) >= 12) + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) + ee->fault_reg = intel_gt_mcr_read_any(engine->gt, + XEHP_RING_FAULT_REG); + else if (GRAPHICS_VER(i915) >= 12) ee->fault_reg = intel_uncore_read(engine->uncore, GEN12_RING_FAULT_REG); else if (GRAPHICS_VER(i915) >= 8) @@ -1820,7 +1823,12 @@ static void gt_record_global_regs(struct intel_gt_coredump *gt) if (GRAPHICS_VER(i915) == 7) gt->err_int = intel_uncore_read(uncore, GEN7_ERR_INT); - if (GRAPHICS_VER(i915) >= 12) { + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + gt->fault_data0 = intel_gt_mcr_read_any((struct intel_gt *)gt->_gt, + XEHP_FAULT_TLB_DATA0); + gt->fault_data1 = intel_gt_mcr_read_any((struct intel_gt *)gt->_gt, + XEHP_FAULT_TLB_DATA1); + } else if (GRAPHICS_VER(i915) >= 12) { gt->fault_data0 = intel_uncore_read(uncore, GEN12_FAULT_TLB_DATA0); gt->fault_data1 = intel_uncore_read(uncore, From patchwork Sat Oct 1 00:45:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12996172 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 13A2FC4332F for ; Sat, 1 Oct 2022 00:47:10 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D203F10EDF6; Sat, 1 Oct 2022 00:46:33 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 88FCB10E4A4; Sat, 1 Oct 2022 00:46:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664585183; x=1696121183; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ZJij893k4g5E12XSKRc6qNWUUh0OXDrRnK9GCA9ePwc=; b=VLxCrpab5pZB2waJlr3vMyoUAXZjjiTTYk6Y53/n8QH+RA8Ppw+1QDPl W5oTBnRf+pQfr5izr/kXp1aIK4LXvIDqzXtHaNPbABJtkP7WRFM6lvk/i uTx9XUh56M6TYStJlaqjnI0gZUXL8mio0XwKxTUgciRqmk8pv60moBuM3 iQw410MjJMni9ep2QlAgiuoDNJ/FJgRrwE2wFXKs5zZ4E97/XKgMHGsoa PS4Ir5R1hVwY3TXfpa4963xmR3/tZYIxUvMdPKwpqAQiI9kmJDucR1VAZ LwQdgtcEXGwkRaPt6s6ns6eFfB5u54dJ7yfK3YhZq6KxGMgChZRPdZgv3 w==; X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="388607934" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="388607934" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="685477631" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="685477631" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:09 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v2 07/14] drm/i915/gt: Add intel_gt_mcr_wait_for_reg_fw() Date: Fri, 30 Sep 2022 17:45:43 -0700 Message-Id: <20221001004550.3031431-8-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221001004550.3031431-1-matthew.d.roper@intel.com> References: <20221001004550.3031431-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ravi.kumar.vodapalli@intel.com, balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Xe_HP has some MCR registers that need to be polled for completion of operations like TLB invalidation. Those registers are in the GAM range, which rolls up the status from each unit into the 'primary' instance's value. This makes it useful to have a dedicated 'wait for register' function that handles this on MCR registers, similar to the __intel_wait_for_register_fw() function we already have for regular registers. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 55 ++++++++++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_gt_mcr.h | 7 ++++ 2 files changed, 62 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c index 962d3e974384..a2bc77c33835 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c @@ -564,3 +564,58 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss, return; } } + +/** + * intel_gt_mcr_wait_for_reg_fw - wait until MCR register matches expected state + * @gt: GT structure + * @reg: the register to read + * @mask: mask to apply to register value + * @value: value to wait for + * @fast_timeout_us: fast timeout in microsecond for atomic/tight wait + * @slow_timeout_ms: slow timeout in millisecond + * + * This routine waits until the target register @reg contains the expected + * @value after applying the @mask, i.e. it waits until :: + * + * (intel_gt_mcr_read_any_fw(gt, reg) & mask) == value + * + * Otherwise, the wait will timeout after @slow_timeout_ms milliseconds. + * For atomic context @slow_timeout_ms must be zero and @fast_timeout_us + * must be not larger than 20,0000 microseconds. + * + * This function is basically an MCR-friendly version of + * __intel_wait_for_register_fw(). Generally this function will only be used + * on GAM registers which are a bit special --- although they're MCR registers, + * reads (e.g., waiting for status updates) are always directed to the primary + * instance. + * + * Note that this routine assumes the caller holds forcewake asserted, it is + * not suitable for very long waits. + * + * Return: 0 if the register matches the desired condition, or -ETIMEDOUT. + */ +int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt, + i915_reg_t reg, + u32 mask, + u32 value, + unsigned int fast_timeout_us, + unsigned int slow_timeout_ms) +{ + u32 reg_value = 0; +#define done (((reg_value = intel_gt_mcr_read_any_fw(gt, reg)) & mask) == value) + int ret; + + /* Catch any overuse of this function */ + might_sleep_if(slow_timeout_ms); + GEM_BUG_ON(fast_timeout_us > 20000); + GEM_BUG_ON(!fast_timeout_us && !slow_timeout_ms); + + ret = -ETIMEDOUT; + if (fast_timeout_us && fast_timeout_us <= 20000) + ret = _wait_for_atomic(done, fast_timeout_us, 0); + if (ret && slow_timeout_ms) + ret = wait_for(done, slow_timeout_ms); + + return ret; +#undef done +} diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h index e47e75e66835..96979af01f2b 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h @@ -37,6 +37,13 @@ void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt, void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss, unsigned int *group, unsigned int *instance); +int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt, + i915_reg_t reg, + u32 mask, + u32 value, + unsigned int fast_timeout_us, + unsigned int slow_timeout_ms); + /* * Helper for for_each_ss_steering loop. On pre-Xe_HP platforms, subslice * presence is determined by using the group/instance as direct lookups in the From patchwork Sat Oct 1 00:45:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12996180 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 39B41C43219 for ; Sat, 1 Oct 2022 00:47:40 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 64D1810EDFA; Sat, 1 Oct 2022 00:46:43 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 16B3F10EDF7; Sat, 1 Oct 2022 00:46:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664585184; x=1696121184; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dPe3eYLKlv2aTe/am/IVGyVhyybNB4NZdmKtw380WgU=; b=hkiAucOKA8g6D2Xiq/0bieN6jixxAs3m1AkVfUyPZ53xCjBCJi0SnNAR QS8xKK6hCzW17G38uNtn8HZdKA5iDO7HkEzTgnemT4ZpOq/7ZXSsZwYLt ZA884vgFouhJRsnzAV5iJRV//eCb/LzkepKMK0lT3tlDsG38NIBMaFqHw SaEE+EOQm3kEZADXAfhgd8vfdKhBRmchpbAS4B6i2TeiEYuj7fAbyvfi/ sckUVxpm7VB4Hw0NdbL4Uulv7eDAzPGPxp0u5ByKBtybr2+owY2Vc6hwX Kylg1jZ95XlTuvLHa45mdjm7HNLPCgbt4xgGR+1jKPz7qZiqwwY+ThtTs w==; X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="388607932" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="388607932" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="685477634" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="685477634" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:09 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v2 08/14] drm/i915: Define MCR registers explicitly Date: Fri, 30 Sep 2022 17:45:44 -0700 Message-Id: <20221001004550.3031431-9-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221001004550.3031431-1-matthew.d.roper@intel.com> References: <20221001004550.3031431-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ravi.kumar.vodapalli@intel.com, balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Rather than using the same _MMIO() macro to define MCR registers as singleton registers, let's use a new MCR_REG() macro to make it clear that these registers are special and should be handled accordingly. For now MCR_REG() will still generate an i915_reg_t with the given offset, but we'll change that in future patches. Bspec: 66673, 66696, 66534, 67609 Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 134 ++++++++++++------------ 1 file changed, 68 insertions(+), 66 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 86757dd22c13..901ea03b9fd8 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -8,6 +8,8 @@ #include "i915_reg_defs.h" +#define MCR_REG(offset) _MMIO(offset) + /* RPM unit config (Gen8+) */ #define RPM_CONFIG0 _MMIO(0xd00) #define GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT 3 @@ -333,12 +335,12 @@ #define GEN7_TLB_RD_ADDR _MMIO(0x4700) #define GEN12_PAT_INDEX(index) _MMIO(0x4800 + (index) * 4) -#define XEHP_PAT_INDEX(index) _MMIO(0x4800 + (index) * 4) +#define XEHP_PAT_INDEX(index) MCR_REG(0x4800 + (index) * 4) -#define XEHP_TILE0_ADDR_RANGE _MMIO(0x4900) +#define XEHP_TILE0_ADDR_RANGE MCR_REG(0x4900) #define XEHP_TILE_LMEM_RANGE_SHIFT 8 -#define XEHP_FLAT_CCS_BASE_ADDR _MMIO(0x4910) +#define XEHP_FLAT_CCS_BASE_ADDR MCR_REG(0x4910) #define XEHP_CCS_BASE_SHIFT 8 #define GAMTARBMODE _MMIO(0x4a08) @@ -388,18 +390,18 @@ #define CHICKEN_RASTER_2 _MMIO(0x6208) #define TBIMR_FAST_CLIP REG_BIT(5) -#define VFLSKPD _MMIO(0x62a8) +#define VFLSKPD MCR_REG(0x62a8) #define DIS_OVER_FETCH_CACHE REG_BIT(1) #define DIS_MULT_MISS_RD_SQUASH REG_BIT(0) #define GEN12_FF_MODE2 _MMIO(0x6604) -#define XEHP_FF_MODE2 _MMIO(0x6604) +#define XEHP_FF_MODE2 MCR_REG(0x6604) #define FF_MODE2_GS_TIMER_MASK REG_GENMASK(31, 24) #define FF_MODE2_GS_TIMER_224 REG_FIELD_PREP(FF_MODE2_GS_TIMER_MASK, 224) #define FF_MODE2_TDS_TIMER_MASK REG_GENMASK(23, 16) #define FF_MODE2_TDS_TIMER_128 REG_FIELD_PREP(FF_MODE2_TDS_TIMER_MASK, 4) -#define XEHPG_INSTDONE_GEOM_SVG _MMIO(0x666c) +#define XEHPG_INSTDONE_GEOM_SVG MCR_REG(0x666c) #define CACHE_MODE_0_GEN7 _MMIO(0x7000) /* IVB+ */ #define RC_OP_FLUSH_ENABLE (1 << 0) @@ -448,14 +450,14 @@ #define GEN8_HDC_CHICKEN1 _MMIO(0x7304) #define GEN11_COMMON_SLICE_CHICKEN3 _MMIO(0x7304) -#define XEHP_COMMON_SLICE_CHICKEN3 _MMIO(0x7304) +#define XEHP_COMMON_SLICE_CHICKEN3 MCR_REG(0x7304) #define DG1_FLOAT_POINT_BLEND_OPT_STRICT_MODE_EN REG_BIT(12) #define XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE REG_BIT(12) #define GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC REG_BIT(11) #define GEN12_DISABLE_CPS_AWARE_COLOR_PIPE REG_BIT(9) #define GEN9_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) -#define XEHP_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) +#define XEHP_SLICE_COMMON_ECO_CHICKEN1 MCR_REG(0x731c) #define MSC_MSAA_REODER_BUF_BYPASS_DISABLE REG_BIT(14) #define GEN11_STATE_CACHE_REDIRECT_TO_CS (1 << 11) @@ -486,7 +488,7 @@ #define GEN8_RC6_CTX_INFO _MMIO(0x8504) -#define XEHP_SQCM _MMIO(0x8724) +#define XEHP_SQCM MCR_REG(0x8724) #define EN_32B_ACCESS REG_BIT(30) #define HSW_IDICR _MMIO(0x9008) @@ -647,7 +649,7 @@ #define GEN7_MISCCPCTL _MMIO(0x9424) #define GEN7_DOP_CLOCK_GATE_ENABLE (1 << 0) -#define GEN8_MISCCPCTL _MMIO(0x9424) +#define GEN8_MISCCPCTL MCR_REG(0x9424) #define GEN8_DOP_CLOCK_GATE_ENABLE REG_BIT(0) #define GEN12_DOP_CLOCK_GATE_RENDER_ENABLE REG_BIT(1) #define GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE (1 << 2) @@ -703,7 +705,7 @@ #define LTCDD_CLKGATE_DIS REG_BIT(10) #define GEN11_SLICE_UNIT_LEVEL_CLKGATE _MMIO(0x94d4) -#define XEHP_SLICE_UNIT_LEVEL_CLKGATE _MMIO(0x94d4) +#define XEHP_SLICE_UNIT_LEVEL_CLKGATE MCR_REG(0x94d4) #define SARBUNIT_CLKGATE_DIS (1 << 5) #define RCCUNIT_CLKGATE_DIS (1 << 7) #define MSCUNIT_CLKGATE_DIS (1 << 10) @@ -711,27 +713,27 @@ #define L3_CLKGATE_DIS REG_BIT(16) #define L3_CR2X_CLKGATE_DIS REG_BIT(17) -#define SCCGCTL94DC _MMIO(0x94dc) +#define SCCGCTL94DC MCR_REG(0x94dc) #define CG3DDISURB REG_BIT(14) #define UNSLICE_UNIT_LEVEL_CLKGATE2 _MMIO(0x94e4) #define VSUNIT_CLKGATE_DIS_TGL REG_BIT(19) #define PSDUNIT_CLKGATE_DIS REG_BIT(5) -#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE _MMIO(0x9524) +#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE MCR_REG(0x9524) #define DSS_ROUTER_CLKGATE_DIS REG_BIT(28) #define GWUNIT_CLKGATE_DIS REG_BIT(16) -#define SUBSLICE_UNIT_LEVEL_CLKGATE2 _MMIO(0x9528) +#define SUBSLICE_UNIT_LEVEL_CLKGATE2 MCR_REG(0x9528) #define CPSSUNIT_CLKGATE_DIS REG_BIT(9) -#define SSMCGCTL9530 _MMIO(0x9530) +#define SSMCGCTL9530 MCR_REG(0x9530) #define RTFUNIT_CLKGATE_DIS REG_BIT(18) -#define GEN10_DFR_RATIO_EN_AND_CHICKEN _MMIO(0x9550) +#define GEN10_DFR_RATIO_EN_AND_CHICKEN MCR_REG(0x9550) #define DFR_DISABLE (1 << 9) -#define INF_UNIT_LEVEL_CLKGATE _MMIO(0x9560) +#define INF_UNIT_LEVEL_CLKGATE MCR_REG(0x9560) #define CGPSF_CLKGATE_DIS (1 << 3) #define MICRO_BP0_0 _MMIO(0x9800) @@ -943,7 +945,7 @@ /* MOCS (Memory Object Control State) registers */ #define GEN9_LNCFCMOCS(i) _MMIO(0xb020 + (i) * 4) /* L3 Cache Control */ -#define XEHP_LNCFCMOCS(i) _MMIO(0xb020 + (i) * 4) /* L3 Cache Control */ +#define XEHP_LNCFCMOCS(i) MCR_REG(0xb020 + (i) * 4) /* L3 Cache Control */ #define LNCFCMOCS_REG_COUNT 32 #define GEN7_L3CNTLREG3 _MMIO(0xb024) @@ -960,10 +962,10 @@ #define GEN7_L3LOG(slice, i) _MMIO(0xb070 + (slice) * 0x200 + (i) * 4) #define GEN7_L3LOG_SIZE 0x80 -#define XEHP_L3NODEARBCFG _MMIO(0xb0b4) +#define XEHP_L3NODEARBCFG MCR_REG(0xb0b4) #define XEHP_LNESPARE REG_BIT(19) -#define GEN8_L3SQCREG1 _MMIO(0xb100) +#define GEN8_L3SQCREG1 MCR_REG(0xb100) /* * Note that on CHV the following has an off-by-one error wrt. to BSpec. * Using the formula in BSpec leads to a hang, while the formula here works @@ -974,28 +976,28 @@ #define L3_HIGH_PRIO_CREDITS(x) (((x) >> 1) << 14) #define L3_PRIO_CREDITS_MASK ((0x1f << 19) | (0x1f << 14)) -#define GEN8_L3SQCREG4 _MMIO(0xb118) +#define GEN8_L3SQCREG4 MCR_REG(0xb118) #define GEN11_LQSC_CLEAN_EVICT_DISABLE (1 << 6) #define GEN8_LQSC_RO_PERF_DIS (1 << 27) #define GEN8_LQSC_FLUSH_COHERENT_LINES (1 << 21) #define GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE REG_BIT(22) -#define GEN9_SCRATCH1 _MMIO(0xb11c) +#define GEN9_SCRATCH1 MCR_REG(0xb11c) #define EVICTION_PERF_FIX_ENABLE REG_BIT(8) -#define BDW_SCRATCH1 _MMIO(0xb11c) +#define BDW_SCRATCH1 MCR_REG(0xb11c) #define GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE (1 << 2) -#define GEN11_SCRATCH2 _MMIO(0xb140) +#define GEN11_SCRATCH2 MCR_REG(0xb140) #define GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE (1 << 19) -#define XEHP_L3SQCREG5 _MMIO(0xb158) +#define XEHP_L3SQCREG5 MCR_REG(0xb158) #define L3_PWM_TIMER_INIT_VAL_MASK REG_GENMASK(9, 0) -#define MLTICTXCTL _MMIO(0xb170) +#define MLTICTXCTL MCR_REG(0xb170) #define TDONRENDER REG_BIT(2) -#define XEHP_L3SCQREG7 _MMIO(0xb188) +#define XEHP_L3SCQREG7 MCR_REG(0xb188) #define BLEND_FILL_CACHING_OPT_DIS REG_BIT(3) #define XEHPC_L3SCRUB _MMIO(0xb18c) @@ -1003,7 +1005,7 @@ #define SCRUB_RATE_PER_BANK_MASK REG_GENMASK(2, 0) #define SCRUB_RATE_4B_PER_CLK REG_FIELD_PREP(SCRUB_RATE_PER_BANK_MASK, 0x6) -#define L3SQCREG1_CCS0 _MMIO(0xb200) +#define L3SQCREG1_CCS0 MCR_REG(0xb200) #define FLUSHALLNONCOH REG_BIT(5) #define GEN11_GLBLINVL _MMIO(0xb404) @@ -1028,14 +1030,14 @@ #define GEN9_BLT_MOCS(i) _MMIO(__GEN9_BCS0_MOCS0 + (i) * 4) #define GEN12_FAULT_TLB_DATA0 _MMIO(0xceb8) -#define XEHP_FAULT_TLB_DATA0 _MMIO(0xceb8) +#define XEHP_FAULT_TLB_DATA0 MCR_REG(0xceb8) #define GEN12_FAULT_TLB_DATA1 _MMIO(0xcebc) -#define XEHP_FAULT_TLB_DATA1 _MMIO(0xcebc) +#define XEHP_FAULT_TLB_DATA1 MCR_REG(0xcebc) #define FAULT_VA_HIGH_BITS (0xf << 0) #define FAULT_GTT_SEL (1 << 4) #define GEN12_RING_FAULT_REG _MMIO(0xcec4) -#define XEHP_RING_FAULT_REG _MMIO(0xcec4) +#define XEHP_RING_FAULT_REG MCR_REG(0xcec4) #define GEN8_RING_FAULT_ENGINE_ID(x) (((x) >> 12) & 0x7) #define RING_FAULT_GTTSEL_MASK (1 << 11) #define RING_FAULT_SRCID(x) (((x) >> 3) & 0xff) @@ -1043,21 +1045,21 @@ #define RING_FAULT_VALID (1 << 0) #define GEN12_GFX_TLB_INV_CR _MMIO(0xced8) -#define XEHP_GFX_TLB_INV_CR _MMIO(0xced8) +#define XEHP_GFX_TLB_INV_CR MCR_REG(0xced8) #define GEN12_VD_TLB_INV_CR _MMIO(0xcedc) -#define XEHP_VD_TLB_INV_CR _MMIO(0xcedc) +#define XEHP_VD_TLB_INV_CR MCR_REG(0xcedc) #define GEN12_VE_TLB_INV_CR _MMIO(0xcee0) -#define XEHP_VE_TLB_INV_CR _MMIO(0xcee0) +#define XEHP_VE_TLB_INV_CR MCR_REG(0xcee0) #define GEN12_BLT_TLB_INV_CR _MMIO(0xcee4) -#define XEHP_BLT_TLB_INV_CR _MMIO(0xcee4) +#define XEHP_BLT_TLB_INV_CR MCR_REG(0xcee4) #define GEN12_COMPCTX_TLB_INV_CR _MMIO(0xcf04) -#define XEHP_COMPCTX_TLB_INV_CR _MMIO(0xcf04) +#define XEHP_COMPCTX_TLB_INV_CR MCR_REG(0xcf04) -#define XEHP_MERT_MOD_CTRL _MMIO(0xcf28) -#define RENDER_MOD_CTRL _MMIO(0xcf2c) -#define COMP_MOD_CTRL _MMIO(0xcf30) -#define VDBX_MOD_CTRL _MMIO(0xcf34) -#define VEBX_MOD_CTRL _MMIO(0xcf38) +#define XEHP_MERT_MOD_CTRL MCR_REG(0xcf28) +#define RENDER_MOD_CTRL MCR_REG(0xcf2c) +#define COMP_MOD_CTRL MCR_REG(0xcf30) +#define VDBX_MOD_CTRL MCR_REG(0xcf34) +#define VEBX_MOD_CTRL MCR_REG(0xcf38) #define FORCE_MISS_FTLB REG_BIT(3) #define GEN12_GAMSTLB_CTRL _MMIO(0xcf4c) @@ -1072,52 +1074,52 @@ #define GEN12_GAM_DONE _MMIO(0xcf68) #define GEN7_HALF_SLICE_CHICKEN1 _MMIO(0xe100) /* IVB GT1 + VLV */ -#define GEN8_HALF_SLICE_CHICKEN1 _MMIO(0xe100) +#define GEN8_HALF_SLICE_CHICKEN1 MCR_REG(0xe100) #define GEN7_MAX_PS_THREAD_DEP (8 << 12) #define GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE (1 << 10) #define GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE (1 << 4) #define GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE (1 << 3) #define GEN7_SAMPLER_INSTDONE _MMIO(0xe160) -#define GEN8_SAMPLER_INSTDONE _MMIO(0xe160) +#define GEN8_SAMPLER_INSTDONE MCR_REG(0xe160) #define GEN7_ROW_INSTDONE _MMIO(0xe164) -#define GEN8_ROW_INSTDONE _MMIO(0xe164) +#define GEN8_ROW_INSTDONE MCR_REG(0xe164) -#define HALF_SLICE_CHICKEN2 _MMIO(0xe180) +#define HALF_SLICE_CHICKEN2 MCR_REG(0xe180) #define GEN8_ST_PO_DISABLE (1 << 13) #define HSW_HALF_SLICE_CHICKEN3 _MMIO(0xe184) -#define GEN8_HALF_SLICE_CHICKEN3 _MMIO(0xe184) +#define GEN8_HALF_SLICE_CHICKEN3 MCR_REG(0xe184) #define HSW_SAMPLE_C_PERFORMANCE (1 << 9) #define GEN8_CENTROID_PIXEL_OPT_DIS (1 << 8) #define GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC (1 << 5) #define GEN8_SAMPLER_POWER_BYPASS_DIS (1 << 1) -#define GEN9_HALF_SLICE_CHICKEN5 _MMIO(0xe188) +#define GEN9_HALF_SLICE_CHICKEN5 MCR_REG(0xe188) #define GEN9_DG_MIRROR_FIX_ENABLE (1 << 5) #define GEN9_CCS_TLB_PREFETCH_ENABLE (1 << 3) -#define GEN10_SAMPLER_MODE _MMIO(0xe18c) +#define GEN10_SAMPLER_MODE MCR_REG(0xe18c) #define ENABLE_SMALLPL REG_BIT(15) #define SC_DISABLE_POWER_OPTIMIZATION_EBB REG_BIT(9) #define GEN11_SAMPLER_ENABLE_HEADLESS_MSG REG_BIT(5) -#define GEN9_HALF_SLICE_CHICKEN7 _MMIO(0xe194) +#define GEN9_HALF_SLICE_CHICKEN7 MCR_REG(0xe194) #define DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA REG_BIT(15) #define GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR REG_BIT(8) #define GEN9_ENABLE_YV12_BUGFIX REG_BIT(4) #define GEN9_ENABLE_GPGPU_PREEMPTION REG_BIT(2) -#define GEN10_CACHE_MODE_SS _MMIO(0xe420) +#define GEN10_CACHE_MODE_SS MCR_REG(0xe420) #define ENABLE_EU_COUNT_FOR_TDL_FLUSH REG_BIT(10) #define DISABLE_ECC REG_BIT(5) #define FLOAT_BLEND_OPTIMIZATION_ENABLE REG_BIT(4) #define ENABLE_PREFETCH_INTO_IC REG_BIT(3) -#define EU_PERF_CNTL0 _MMIO(0xe458) -#define EU_PERF_CNTL4 _MMIO(0xe45c) +#define EU_PERF_CNTL0 MCR_REG(0xe458) +#define EU_PERF_CNTL4 MCR_REG(0xe45c) -#define GEN9_ROW_CHICKEN4 _MMIO(0xe48c) +#define GEN9_ROW_CHICKEN4 MCR_REG(0xe48c) #define GEN12_DISABLE_GRF_CLEAR REG_BIT(13) #define XEHP_DIS_BBL_SYSPIPE REG_BIT(11) #define GEN12_DISABLE_TDL_PUSH REG_BIT(9) @@ -1129,7 +1131,7 @@ #define HSW_ROW_CHICKEN3 _MMIO(0xe49c) #define HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE (1 << 6) -#define GEN8_ROW_CHICKEN _MMIO(0xe4f0) +#define GEN8_ROW_CHICKEN MCR_REG(0xe4f0) #define FLOW_CONTROL_ENABLE REG_BIT(15) #define UGM_BACKUP_MODE REG_BIT(13) #define MDQ_ARBITRATION_MODE REG_BIT(12) @@ -1141,39 +1143,39 @@ #define GEN7_ROW_CHICKEN2 _MMIO(0xe4f4) -#define GEN8_ROW_CHICKEN2 _MMIO(0xe4f4) +#define GEN8_ROW_CHICKEN2 MCR_REG(0xe4f4) #define GEN12_DISABLE_READ_SUPPRESSION REG_BIT(15) #define GEN12_DISABLE_EARLY_READ REG_BIT(14) #define GEN12_ENABLE_LARGE_GRF_MODE REG_BIT(12) #define GEN12_PUSH_CONST_DEREF_HOLD_DIS REG_BIT(8) -#define RT_CTRL _MMIO(0xe530) +#define RT_CTRL MCR_REG(0xe530) #define DIS_NULL_QUERY REG_BIT(10) #define STACKID_CTRL REG_GENMASK(6, 5) #define STACKID_CTRL_512 REG_FIELD_PREP(STACKID_CTRL, 0x2) -#define EU_PERF_CNTL1 _MMIO(0xe558) -#define EU_PERF_CNTL5 _MMIO(0xe55c) +#define EU_PERF_CNTL1 MCR_REG(0xe558) +#define EU_PERF_CNTL5 MCR_REG(0xe55c) -#define XEHP_HDC_CHICKEN0 _MMIO(0xe5f0) +#define XEHP_HDC_CHICKEN0 MCR_REG(0xe5f0) #define LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK REG_GENMASK(13, 11) -#define ICL_HDC_MODE _MMIO(0xe5f4) +#define ICL_HDC_MODE MCR_REG(0xe5f4) -#define EU_PERF_CNTL2 _MMIO(0xe658) -#define EU_PERF_CNTL6 _MMIO(0xe65c) -#define EU_PERF_CNTL3 _MMIO(0xe758) +#define EU_PERF_CNTL2 MCR_REG(0xe658) +#define EU_PERF_CNTL6 MCR_REG(0xe65c) +#define EU_PERF_CNTL3 MCR_REG(0xe758) -#define LSC_CHICKEN_BIT_0 _MMIO(0xe7c8) +#define LSC_CHICKEN_BIT_0 MCR_REG(0xe7c8) #define DISABLE_D8_D16_COASLESCE REG_BIT(30) #define FORCE_1_SUB_MESSAGE_PER_FRAGMENT REG_BIT(15) -#define LSC_CHICKEN_BIT_0_UDW _MMIO(0xe7c8 + 4) +#define LSC_CHICKEN_BIT_0_UDW MCR_REG(0xe7c8 + 4) #define DIS_CHAIN_2XSIMD8 REG_BIT(55 - 32) #define FORCE_SLM_FENCE_SCOPE_TO_TILE REG_BIT(42 - 32) #define FORCE_UGM_FENCE_SCOPE_TO_TILE REG_BIT(41 - 32) #define MAXREQS_PER_BANK REG_GENMASK(39 - 32, 37 - 32) #define DISABLE_128B_EVICTION_COMMAND_UDW REG_BIT(36 - 32) -#define SARB_CHICKEN1 _MMIO(0xe90c) +#define SARB_CHICKEN1 MCR_REG(0xe90c) #define COMP_CKN_IN REG_GENMASK(30, 29) #define GEN7_ROW_CHICKEN2_GT2 _MMIO(0xf4f4) From patchwork Sat Oct 1 00:45:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12996179 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6111BC433FE for ; Sat, 1 Oct 2022 00:47:36 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9E90A10EE0F; Sat, 1 Oct 2022 00:46:41 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id DFE4810EDF5; Sat, 1 Oct 2022 00:46:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664585183; x=1696121183; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vguiR+jg2pUs0BfJJSFKOHSBSzlLpraa1sqFvKJtA/8=; b=BekcBIOAS/faeMJyVC6adu6aL7XpX8qOLEcEzyAIDLvEupg2aottkycS 9HM6Mxdx5RQuTaTighK/XYuS3qFu3zLX9PFC42Cp0sid0RRM6FFusLg2K zhYtBYmEF4fVD2wiYcAabII8UKSbxLLXu83udBcLyuiUOX7q8notuKXEd OSPNbN9YbWkHOoWh/lLON1KxgSFzHPB/MV7EVVlmQC+RmYfhBK7ykK300 jrudsRkkR8WhaMFZK4KwMxHp619X8NIw23nnLLbaZk2VbGo3v0yAb6Svg RkcAUaa0H4Z/c4BOPsiOpbv7gkUBCK1ZgwuDJKjVZ/PWX/hDmi4rIkfME w==; X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="388607928" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="388607928" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="685477637" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="685477637" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:09 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v2 09/14] drm/i915/gt: Always use MCR functions on multicast registers Date: Fri, 30 Sep 2022 17:45:45 -0700 Message-Id: <20221001004550.3031431-10-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221001004550.3031431-1-matthew.d.roper@intel.com> References: <20221001004550.3031431-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ravi.kumar.vodapalli@intel.com, balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Rather than relying on the implicit behavior of intel_uncore_*() functions, let's always use the intel_gt_mcr_*() functions to operate on multicast/replicated registers. v2: - Add TLB invalidation registers Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt.c | 42 ++++++++++++++++------- drivers/gpu/drm/i915/gt/intel_mocs.c | 13 +++---- drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 12 ++++--- drivers/gpu/drm/i915/intel_pm.c | 20 ++++++----- 4 files changed, 56 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index e14f159ad9fc..e763dc719d3a 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -1017,6 +1017,32 @@ get_reg_and_bit(const struct intel_engine_cs *engine, const bool gen8, return rb; } +/* + * HW architecture suggest typical invalidation time at 40us, + * with pessimistic cases up to 100us and a recommendation to + * cap at 1ms. We go a bit higher just in case. + */ +#define TLB_INVAL_TIMEOUT_US 100 +#define TLB_INVAL_TIMEOUT_MS 4 + +/* + * On Xe_HP the TLB invalidation registers are located at the same MMIO offsets + * but are now considered MCR registers. Since they exist within a GAM range, + * the primary instance of the register rolls up the status from each unit. + */ +static int wait_for_invalidate(struct intel_gt *gt, struct reg_and_bit rb) +{ + if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) + return intel_gt_mcr_wait_for_reg_fw(gt, rb.reg, rb.bit, 0, + TLB_INVAL_TIMEOUT_US, + TLB_INVAL_TIMEOUT_MS); + else + return __intel_wait_for_register_fw(gt->uncore, rb.reg, rb.bit, 0, + TLB_INVAL_TIMEOUT_US, + TLB_INVAL_TIMEOUT_MS, + NULL); +} + static void mmio_invalidate_full(struct intel_gt *gt) { static const i915_reg_t gen8_regs[] = { @@ -1099,22 +1125,12 @@ static void mmio_invalidate_full(struct intel_gt *gt) for_each_engine_masked(engine, gt, awake, tmp) { struct reg_and_bit rb; - /* - * HW architecture suggest typical invalidation time at 40us, - * with pessimistic cases up to 100us and a recommendation to - * cap at 1ms. We go a bit higher just in case. - */ - const unsigned int timeout_us = 100; - const unsigned int timeout_ms = 4; - rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); - if (__intel_wait_for_register_fw(uncore, - rb.reg, rb.bit, 0, - timeout_us, timeout_ms, - NULL)) + + if (wait_for_invalidate(gt, rb)) drm_err_ratelimited(>->i915->drm, "%s TLB invalidation did not complete in %ums!\n", - engine->name, timeout_ms); + engine->name, TLB_INVAL_TIMEOUT_MS); } /* diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c index 06643701bf24..89ef1e06bf1d 100644 --- a/drivers/gpu/drm/i915/gt/intel_mocs.c +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c @@ -7,6 +7,7 @@ #include "intel_engine.h" #include "intel_gt.h" +#include "intel_gt_mcr.h" #include "intel_gt_regs.h" #include "intel_mocs.h" #include "intel_ring.h" @@ -601,17 +602,17 @@ static u32 l3cc_combine(u16 low, u16 high) 0; \ i++) -static void init_l3cc_table(struct intel_uncore *uncore, +static void init_l3cc_table(struct intel_gt *gt, const struct drm_i915_mocs_table *table) { unsigned int i; u32 l3cc; for_each_l3cc(l3cc, table, i) - if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50)) - intel_uncore_write_fw(uncore, XEHP_LNCFCMOCS(i), l3cc); + if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) + intel_gt_mcr_multicast_write_fw(gt, XEHP_LNCFCMOCS(i), l3cc); else - intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc); + intel_uncore_write_fw(gt->uncore, GEN9_LNCFCMOCS(i), l3cc); } void intel_mocs_init_engine(struct intel_engine_cs *engine) @@ -631,7 +632,7 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine) init_mocs_table(engine, &table); if (flags & HAS_RENDER_L3CC && engine->class == RENDER_CLASS) - init_l3cc_table(engine->uncore, &table); + init_l3cc_table(engine->gt, &table); } static u32 global_mocs_offset(void) @@ -667,7 +668,7 @@ void intel_mocs_init(struct intel_gt *gt) * memory transactions including guc transactions */ if (flags & HAS_RENDER_L3CC) - init_l3cc_table(gt->uncore, &table); + init_l3cc_table(gt, &table); } #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c index 9229243992c2..5b86b2e286e0 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c @@ -10,12 +10,15 @@ */ #include "gt/intel_gt.h" +#include "gt/intel_gt_mcr.h" #include "gt/intel_gt_regs.h" #include "intel_guc_fw.h" #include "i915_drv.h" -static void guc_prepare_xfer(struct intel_uncore *uncore) +static void guc_prepare_xfer(struct intel_gt *gt) { + struct intel_uncore *uncore = gt->uncore; + u32 shim_flags = GUC_ENABLE_READ_CACHE_LOGIC | GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA | GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA | @@ -35,8 +38,9 @@ static void guc_prepare_xfer(struct intel_uncore *uncore) if (GRAPHICS_VER(uncore->i915) == 9) { /* DOP Clock Gating Enable for GuC clocks */ - intel_uncore_rmw(uncore, GEN8_MISCCPCTL, - 0, GEN8_DOP_CLOCK_GATE_GUC_ENABLE); + intel_gt_mcr_multicast_write(gt, GEN8_MISCCPCTL, + GEN8_DOP_CLOCK_GATE_GUC_ENABLE | + intel_gt_mcr_read_any(gt, GEN8_MISCCPCTL)); /* allows for 5us (in 10ns units) before GT can go to RC6 */ intel_uncore_write(uncore, GUC_ARAT_C6DIS, 0x1FF); @@ -168,7 +172,7 @@ int intel_guc_fw_upload(struct intel_guc *guc) struct intel_uncore *uncore = gt->uncore; int ret; - guc_prepare_xfer(uncore); + guc_prepare_xfer(gt); /* * Note that GuC needs the CSS header plus uKernel code to be copied diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index c694433a7da5..381f3d7ef8a7 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -30,6 +30,8 @@ #include "display/skl_watermark.h" #include "gt/intel_engine_regs.h" +#include "gt/intel_gt.h" +#include "gt/intel_gt_mcr.h" #include "gt/intel_gt_regs.h" #include "i915_drv.h" @@ -4339,22 +4341,23 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv, u32 val; /* WaTempDisableDOPClkGating:bdw */ - misccpctl = intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL); - intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE); + misccpctl = intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_MISCCPCTL); + intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL, + misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE); - val = intel_uncore_read(&dev_priv->uncore, GEN8_L3SQCREG1); + val = intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_L3SQCREG1); val &= ~L3_PRIO_CREDITS_MASK; val |= L3_GENERAL_PRIO_CREDITS(general_prio_credits); val |= L3_HIGH_PRIO_CREDITS(high_prio_credits); - intel_uncore_write(&dev_priv->uncore, GEN8_L3SQCREG1, val); + intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_L3SQCREG1, val); /* * Wait at least 100 clocks before re-enabling clock gating. * See the definition of L3SQCREG1 in BSpec. */ - intel_uncore_posting_read(&dev_priv->uncore, GEN8_L3SQCREG1); + intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_L3SQCREG1); udelay(1); - intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl); + intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL, misccpctl); } static void icl_init_clock_gating(struct drm_i915_private *dev_priv) @@ -4514,8 +4517,9 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv) gen9_init_clock_gating(dev_priv); /* WaDisableDopClockGating:skl */ - intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL) & - ~GEN8_DOP_CLOCK_GATE_ENABLE); + intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL, + intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_MISCCPCTL) & + ~GEN8_DOP_CLOCK_GATE_ENABLE); /* WAC6entrylatency:skl */ intel_uncore_write(&dev_priv->uncore, FBC_LLC_READ_CTRL, intel_uncore_read(&dev_priv->uncore, FBC_LLC_READ_CTRL) | From patchwork Sat Oct 1 00:45:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12996178 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B6450C433FE for ; Sat, 1 Oct 2022 00:47:33 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6B02310EE0C; Sat, 1 Oct 2022 00:46:41 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id F1CEF10EDF6; Sat, 1 Oct 2022 00:46:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664585184; x=1696121184; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zW3e7StAu0gehtFTpQqfKu++AyXoea1TtiGzkFdY0VA=; b=Q7Xs6OEocH1dMrekX9dsYABh7nLVF53KCRYHp+Z6PbxqDhoXIMJPzomu /kfeUzT8hOAZw4k6FBJztxFD8YWYhJKyRJoBNLsqaebtS4HsjuR34fzAa YH/76TvUfUXXGoREGA8esndmRRHY2sB2lPiSME7TMk2x++6k69gkOWKyw vbFprhatc55bD3q+Uzq+hKzaAYmXcTS8mvlFsvZ21FIBlsU2xD2L6NlhQ qNC/GJ3SzdJCiQVFXxs1G12QLe/9/azbLcILRawOz98vBpb9A9079Tat/ J5pcW8YpisIFMv9FRAmZH8kKfgCLceRF3I+1MOttlFmhGg2c7RCDX/ybp w==; X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="388607938" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="388607938" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="685477640" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="685477640" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:09 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v2 10/14] drm/i915/guc: Handle save/restore of MCR registers explicitly Date: Fri, 30 Sep 2022 17:45:46 -0700 Message-Id: <20221001004550.3031431-11-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221001004550.3031431-1-matthew.d.roper@intel.com> References: <20221001004550.3031431-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ravi.kumar.vodapalli@intel.com, balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" MCR registers can be placed on the GuC's save/restore list, but at the moment they are always handled in a multicast manner (i.e., the GuC reads one instance to save the value and then does a multicast write to restore that single value to all instances). In the future the GuC will probably give us an alternate interface to do unicast per-instance save/restore operations, so we should be very clear about which registers on the list are MCR registers (and in the future which save/restore behavior we want for them). Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 55 +++++++++++++--------- 1 file changed, 34 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index cc357fa0c270..de923fb82301 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -278,24 +278,16 @@ __mmio_reg_add(struct temp_regset *regset, struct guc_mmio_reg *reg) return slot; } -#define GUC_REGSET_STEERING(group, instance) ( \ - FIELD_PREP(GUC_REGSET_STEERING_GROUP, (group)) | \ - FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, (instance)) | \ - GUC_REGSET_NEEDS_STEERING \ -) - static long __must_check guc_mmio_reg_add(struct intel_gt *gt, struct temp_regset *regset, - i915_reg_t reg, u32 flags) + u32 offset, u32 flags) { u32 count = regset->storage_used - (regset->registers - regset->storage); - u32 offset = i915_mmio_reg_offset(reg); struct guc_mmio_reg entry = { .offset = offset, .flags = flags, }; struct guc_mmio_reg *slot; - u8 group, inst; /* * The mmio list is built using separate lists within the driver. @@ -307,17 +299,6 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt, sizeof(entry), guc_mmio_reg_cmp)) return 0; - /* - * The GuC doesn't have a default steering, so we need to explicitly - * steer all registers that need steering. However, we do not keep track - * of all the steering ranges, only of those that have a chance of using - * a non-default steering from the i915 pov. Instead of adding such - * tracking, it is easier to just program the default steering for all - * regs that don't need a non-default one. - */ - intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst); - entry.flags |= GUC_REGSET_STEERING(group, inst); - slot = __mmio_reg_add(regset, &entry); if (IS_ERR(slot)) return PTR_ERR(slot); @@ -335,6 +316,38 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt, #define GUC_MMIO_REG_ADD(gt, regset, reg, masked) \ guc_mmio_reg_add(gt, \ + regset, \ + i915_mmio_reg_offset(reg), \ + (masked) ? GUC_REGSET_MASKED : 0) + +#define GUC_REGSET_STEERING(group, instance) ( \ + FIELD_PREP(GUC_REGSET_STEERING_GROUP, (group)) | \ + FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, (instance)) | \ + GUC_REGSET_NEEDS_STEERING \ +) + +static long __must_check guc_mcr_reg_add(struct intel_gt *gt, + struct temp_regset *regset, + i915_reg_t reg, u32 flags) +{ + u8 group, inst; + + /* + * The GuC doesn't have a default steering, so we need to explicitly + * steer all registers that need steering. However, we do not keep track + * of all the steering ranges, only of those that have a chance of using + * a non-default steering from the i915 pov. Instead of adding such + * tracking, it is easier to just program the default steering for all + * regs that don't need a non-default one. + */ + intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst); + flags |= GUC_REGSET_STEERING(group, inst); + + return guc_mmio_reg_add(gt, regset, i915_mmio_reg_offset(reg), flags); +} + +#define GUC_MCR_REG_ADD(gt, regset, reg, masked) \ + guc_mcr_reg_add(gt, \ regset, \ (reg), \ (masked) ? GUC_REGSET_MASKED : 0) @@ -375,7 +388,7 @@ static int guc_mmio_regset_init(struct temp_regset *regset, /* add in local MOCS registers */ for (i = 0; i < LNCFCMOCS_REG_COUNT; i++) if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) - ret |= GUC_MMIO_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false); + ret |= GUC_MCR_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false); else ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false); From patchwork Sat Oct 1 00:45:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12996173 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0DD36C433FE for ; Sat, 1 Oct 2022 00:47:15 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 189E110EDF0; Sat, 1 Oct 2022 00:46:33 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6F2EB10E49D; Sat, 1 Oct 2022 00:46:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664585183; x=1696121183; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=FoXvi3befdkRPuXaOi7JB/TbCdKx6ficAzVM5L25DJI=; b=AOu9GbnM03bwQ3FUFBmfBXkSXMPRty2VHC+cwRcluUvoMLHe6SDmJCr9 jD0gbwcfGX7R2QVLGZ+hnuQkIRRAFWoxP/rteo0ljz9VD1ytg8LxzlzR5 cH50+Xm+gEYuiYqpaKsQba4KvSMBpqPOB6j1PkgVjJO+tb/nQxuiHd3xn vF6mJkwqCptC+WUNkYr1Q4kbH4nHi52duhUve6Wh58gZBqKgwCLN86Mz3 VjabGQVx98Gh6pMz9rTGRz1rimRG4PBByfJY61xen0yeNw+5X6rMJVAVA 9I98a+B6M04ZXgNKR7HnMX+QR3piW+/w2HjQhQO2zpAiuxk9QWgKvqR0X Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="388607931" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="388607931" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="685477644" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="685477644" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:09 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v2 11/14] drm/i915/gt: Add MCR-specific workaround initializers Date: Fri, 30 Sep 2022 17:45:47 -0700 Message-Id: <20221001004550.3031431-12-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221001004550.3031431-1-matthew.d.roper@intel.com> References: <20221001004550.3031431-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ravi.kumar.vodapalli@intel.com, balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Let's be more explicit about which of our workarounds are updating MCR registers. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 428 +++++++++++------- .../gpu/drm/i915/gt/intel_workarounds_types.h | 4 +- 2 files changed, 261 insertions(+), 171 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 7bb878581d00..c5cd89aa13c9 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -166,12 +166,33 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg, _wa_add(wal, &wa); } +static void wa_mcr_add(struct i915_wa_list *wal, i915_reg_t reg, + u32 clear, u32 set, u32 read_mask, bool masked_reg) +{ + struct i915_wa wa = { + .reg = reg, + .clr = clear, + .set = set, + .read = read_mask, + .masked_reg = masked_reg, + .is_mcr = 1, + }; + + _wa_add(wal, &wa); +} + static void wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set) { wa_add(wal, reg, clear, set, clear, false); } +static void +wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set) +{ + wa_mcr_add(wal, reg, clear, set, clear, false); +} + static void wa_write(struct i915_wa_list *wal, i915_reg_t reg, u32 set) { @@ -184,12 +205,24 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set) wa_write_clr_set(wal, reg, set, set); } +static void +wa_mcr_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set) +{ + wa_mcr_write_clr_set(wal, reg, set, set); +} + static void wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr) { wa_write_clr_set(wal, reg, clr, 0); } +static void +wa_mcr_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr) +{ + wa_mcr_write_clr_set(wal, reg, clr, 0); +} + /* * WA operations on "masked register". A masked register has the upper 16 bits * documented as "masked" in b-spec. Its purpose is to allow writing to just a @@ -207,12 +240,24 @@ wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val) wa_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true); } +static void +wa_mcr_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val) +{ + wa_mcr_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true); +} + static void wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val) { wa_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true); } +static void +wa_mcr_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val) +{ + wa_mcr_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true); +} + static void wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg, u32 mask, u32 val) @@ -220,6 +265,13 @@ wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg, wa_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true); } +static void +wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg, + u32 mask, u32 val) +{ + wa_mcr_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true); +} + static void gen6_ctx_workarounds_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) { @@ -241,7 +293,7 @@ static void gen8_ctx_workarounds_init(struct intel_engine_cs *engine, wa_masked_en(wal, RING_MI_MODE(RENDER_RING_BASE), ASYNC_FLIP_PERF_DISABLE); /* WaDisablePartialInstShootdown:bdw,chv */ - wa_masked_en(wal, GEN8_ROW_CHICKEN, + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE); /* Use Force Non-Coherent whenever executing a 3D context. This is a @@ -288,18 +340,18 @@ static void bdw_ctx_workarounds_init(struct intel_engine_cs *engine, gen8_ctx_workarounds_init(engine, wal); /* WaDisableThreadStallDopClockGating:bdw (pre-production) */ - wa_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE); /* WaDisableDopClockGating:bdw * * Also see the related UCGTCL1 write in bdw_init_clock_gating() * to disable EUTC clock gating. */ - wa_masked_en(wal, GEN8_ROW_CHICKEN2, - DOP_CLOCK_GATING_DISABLE); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, + DOP_CLOCK_GATING_DISABLE); - wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3, - GEN8_SAMPLER_POWER_BYPASS_DIS); + wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3, + GEN8_SAMPLER_POWER_BYPASS_DIS); wa_masked_en(wal, HDC_CHICKEN0, /* WaForceContextSaveRestoreNonCoherent:bdw */ @@ -314,7 +366,7 @@ static void chv_ctx_workarounds_init(struct intel_engine_cs *engine, gen8_ctx_workarounds_init(engine, wal); /* WaDisableThreadStallDopClockGating:chv */ - wa_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE); /* Improve HiZ throughput on CHV. */ wa_masked_en(wal, HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X); @@ -333,21 +385,21 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine, */ wa_masked_en(wal, COMMON_SLICE_CHICKEN2, GEN9_PBE_COMPRESSED_HASH_SELECTION); - wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7, - GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR); + wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7, + GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR); } /* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */ /* WaDisablePartialInstShootdown:skl,bxt,kbl,glk,cfl */ - wa_masked_en(wal, GEN8_ROW_CHICKEN, - FLOW_CONTROL_ENABLE | - PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, + FLOW_CONTROL_ENABLE | + PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE); /* WaEnableYV12BugFixInHalfSliceChicken7:skl,bxt,kbl,glk,cfl */ /* WaEnableSamplerGPGPUPreemptionSupport:skl,bxt,kbl,cfl */ - wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7, - GEN9_ENABLE_YV12_BUGFIX | - GEN9_ENABLE_GPGPU_PREEMPTION); + wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7, + GEN9_ENABLE_YV12_BUGFIX | + GEN9_ENABLE_GPGPU_PREEMPTION); /* Wa4x4STCOptimizationDisable:skl,bxt,kbl,glk,cfl */ /* WaDisablePartialResolveInVc:skl,bxt,kbl,cfl */ @@ -356,8 +408,8 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine, GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE); /* WaCcsTlbPrefetchDisable:skl,bxt,kbl,glk,cfl */ - wa_masked_dis(wal, GEN9_HALF_SLICE_CHICKEN5, - GEN9_CCS_TLB_PREFETCH_ENABLE); + wa_mcr_masked_dis(wal, GEN9_HALF_SLICE_CHICKEN5, + GEN9_CCS_TLB_PREFETCH_ENABLE); /* WaForceContextSaveRestoreNonCoherent:skl,bxt,kbl,cfl */ wa_masked_en(wal, HDC_CHICKEN0, @@ -386,11 +438,11 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine, IS_KABYLAKE(i915) || IS_COFFEELAKE(i915) || IS_COMETLAKE(i915)) - wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3, - GEN8_SAMPLER_POWER_BYPASS_DIS); + wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3, + GEN8_SAMPLER_POWER_BYPASS_DIS); /* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */ - wa_masked_en(wal, HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE); + wa_mcr_masked_en(wal, HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE); /* * Supporting preemption with fine-granularity requires changes in the @@ -469,8 +521,8 @@ static void bxt_ctx_workarounds_init(struct intel_engine_cs *engine, gen9_ctx_workarounds_init(engine, wal); /* WaDisableThreadStallDopClockGating:bxt */ - wa_masked_en(wal, GEN8_ROW_CHICKEN, - STALL_DOP_GATING_DISABLE); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, + STALL_DOP_GATING_DISABLE); /* WaToEnableHwFixForPushConstHWBug:bxt */ wa_masked_en(wal, COMMON_SLICE_CHICKEN2, @@ -490,8 +542,8 @@ static void kbl_ctx_workarounds_init(struct intel_engine_cs *engine, GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION); /* WaDisableSbeCacheDispatchPortSharing:kbl */ - wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, - GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE); + wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, + GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE); } static void glk_ctx_workarounds_init(struct intel_engine_cs *engine, @@ -514,8 +566,8 @@ static void cfl_ctx_workarounds_init(struct intel_engine_cs *engine, GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION); /* WaDisableSbeCacheDispatchPortSharing:cfl */ - wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, - GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE); + wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, + GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE); } static void icl_ctx_workarounds_init(struct intel_engine_cs *engine, @@ -534,13 +586,13 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine, * (the register is whitelisted in hardware now, so UMDs can opt in * for coherency if they have a good reason). */ - wa_masked_en(wal, ICL_HDC_MODE, HDC_FORCE_NON_COHERENT); + wa_mcr_masked_en(wal, ICL_HDC_MODE, HDC_FORCE_NON_COHERENT); /* WaEnableFloatBlendOptimization:icl */ - wa_add(wal, GEN10_CACHE_MODE_SS, 0, - _MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE), - 0 /* write-only, so skip validation */, - true); + wa_mcr_add(wal, GEN10_CACHE_MODE_SS, 0, + _MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE), + 0 /* write-only, so skip validation */, + true); /* WaDisableGPGPUMidThreadPreemption:icl */ wa_masked_field_set(wal, GEN8_CS_CHICKEN1, @@ -548,8 +600,8 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine, GEN9_PREEMPT_GPGPU_THREAD_GROUP_LEVEL); /* allow headerless messages for preemptible GPGPU context */ - wa_masked_en(wal, GEN10_SAMPLER_MODE, - GEN11_SAMPLER_ENABLE_HEADLESS_MSG); + wa_mcr_masked_en(wal, GEN10_SAMPLER_MODE, + GEN11_SAMPLER_ENABLE_HEADLESS_MSG); /* Wa_1604278689:icl,ehl */ wa_write(wal, IVB_FBC_RT_BASE, 0xFFFFFFFF & ~ILK_FBC_RT_VALID); @@ -558,7 +610,7 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine, 0xFFFFFFFF); /* Wa_1406306137:icl,ehl */ - wa_masked_en(wal, GEN9_ROW_CHICKEN4, GEN11_DIS_PICK_2ND_EU); + wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, GEN11_DIS_PICK_2ND_EU); } /* @@ -569,13 +621,13 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) { wa_masked_en(wal, CHICKEN_RASTER_2, TBIMR_FAST_CLIP); - wa_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK, - REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f)); - wa_add(wal, - XEHP_FF_MODE2, - FF_MODE2_TDS_TIMER_MASK, - FF_MODE2_TDS_TIMER_128, - 0, false); + wa_mcr_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK, + REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f)); + wa_mcr_add(wal, + XEHP_FF_MODE2, + FF_MODE2_TDS_TIMER_MASK, + FF_MODE2_TDS_TIMER_128, + 0, false); } /* @@ -664,27 +716,27 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine, /* Wa_16011186671:dg2_g11 */ if (IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) { - wa_masked_dis(wal, VFLSKPD, DIS_MULT_MISS_RD_SQUASH); - wa_masked_en(wal, VFLSKPD, DIS_OVER_FETCH_CACHE); + wa_mcr_masked_dis(wal, VFLSKPD, DIS_MULT_MISS_RD_SQUASH); + wa_mcr_masked_en(wal, VFLSKPD, DIS_OVER_FETCH_CACHE); } if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) { /* Wa_14010469329:dg2_g10 */ - wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3, - XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE); + wa_mcr_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3, + XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE); /* * Wa_22010465075:dg2_g10 * Wa_22010613112:dg2_g10 * Wa_14010698770:dg2_g10 */ - wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3, - GEN12_DISABLE_CPS_AWARE_COLOR_PIPE); + wa_mcr_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3, + GEN12_DISABLE_CPS_AWARE_COLOR_PIPE); } /* Wa_16013271637:dg2 */ - wa_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1, - MSC_MSAA_REODER_BUF_BYPASS_DISABLE); + wa_mcr_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1, + MSC_MSAA_REODER_BUF_BYPASS_DISABLE); /* Wa_14014947963:dg2 */ if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_B0, STEP_FOREVER) || @@ -1264,9 +1316,9 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) PSDUNIT_CLKGATE_DIS); /* Wa_1406680159:icl,ehl */ - wa_write_or(wal, - GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE, - GWUNIT_CLKGATE_DIS); + wa_mcr_write_or(wal, + GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE, + GWUNIT_CLKGATE_DIS); /* Wa_1607087056:icl,ehl,jsl */ if (IS_ICELAKE(i915) || @@ -1279,7 +1331,7 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) * This is not a documented workaround, but rather an optimization * to reduce sampler power. */ - wa_write_clr(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE); + wa_mcr_write_clr(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE); } /* @@ -1313,7 +1365,7 @@ gen12_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) wa_14011060649(gt, wal); /* Wa_14011059788:tgl,rkl,adl-s,dg1,adl-p */ - wa_write_or(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE); + wa_mcr_write_or(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE); } static void @@ -1325,9 +1377,9 @@ tgl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) /* Wa_1409420604:tgl */ if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0)) - wa_write_or(wal, - SUBSLICE_UNIT_LEVEL_CLKGATE2, - CPSSUNIT_CLKGATE_DIS); + wa_mcr_write_or(wal, + SUBSLICE_UNIT_LEVEL_CLKGATE2, + CPSSUNIT_CLKGATE_DIS); /* Wa_1607087056:tgl also know as BUG:1409180338 */ if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0)) @@ -1356,9 +1408,9 @@ dg1_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) /* Wa_1409420604:dg1 */ if (IS_DG1(i915)) - wa_write_or(wal, - SUBSLICE_UNIT_LEVEL_CLKGATE2, - CPSSUNIT_CLKGATE_DIS); + wa_mcr_write_or(wal, + SUBSLICE_UNIT_LEVEL_CLKGATE2, + CPSSUNIT_CLKGATE_DIS); /* Wa_1408615072:dg1 */ /* Empirical testing shows this register is unaffected by engine reset. */ @@ -1375,7 +1427,7 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) xehp_init_mcr(gt, wal); /* Wa_1409757795:xehpsdv */ - wa_write_or(wal, SCCGCTL94DC, CG3DDISURB); + wa_mcr_write_or(wal, SCCGCTL94DC, CG3DDISURB); /* Wa_16011155590:xehpsdv */ if (IS_XEHPSDV_GRAPHICS_STEP(i915, STEP_A0, STEP_B0)) @@ -1455,8 +1507,8 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) CG3DDISCFEG_CLKGATE_DIS); /* Wa_14011006942:dg2 */ - wa_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE, - DSS_ROUTER_CLKGATE_DIS); + wa_mcr_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE, + DSS_ROUTER_CLKGATE_DIS); } if (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_B0)) { @@ -1467,7 +1519,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) wa_write_or(wal, UNSLCGCTL9444, LTCDD_CLKGATE_DIS); /* Wa_14011371254:dg2_g10 */ - wa_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS); + wa_mcr_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS); /* Wa_14011431319:dg2_g10 */ wa_write_or(wal, UNSLCGCTL9440, GAMTLBOACS_CLKGATE_DIS | @@ -1503,21 +1555,21 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) GAMEDIA_CLKGATE_DIS); /* Wa_14011028019:dg2_g10 */ - wa_write_or(wal, SSMCGCTL9530, RTFUNIT_CLKGATE_DIS); + wa_mcr_write_or(wal, SSMCGCTL9530, RTFUNIT_CLKGATE_DIS); } /* Wa_14014830051:dg2 */ - wa_write_clr(wal, SARB_CHICKEN1, COMP_CKN_IN); + wa_mcr_write_clr(wal, SARB_CHICKEN1, COMP_CKN_IN); /* * The following are not actually "workarounds" but rather * recommended tuning settings documented in the bspec's * performance guide section. */ - wa_write_or(wal, XEHP_SQCM, EN_32B_ACCESS); + wa_mcr_write_or(wal, XEHP_SQCM, EN_32B_ACCESS); /* Wa_14015795083 */ - wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); + wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); } static void @@ -1526,7 +1578,7 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) pvc_init_mcr(gt, wal); /* Wa_14015795083 */ - wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); + wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); } static void @@ -1638,14 +1690,25 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal) u32 val, old = 0; /* open-coded rmw due to steering */ - old = wa->clr ? intel_gt_mcr_read_any_fw(gt, wa->reg) : 0; + if (wa->clr) + old = wa->is_mcr ? + intel_gt_mcr_read_any_fw(gt, wa->reg) : + intel_uncore_read_fw(uncore, wa->reg); val = (old & ~wa->clr) | wa->set; - if (val != old || !wa->clr) - intel_uncore_write_fw(uncore, wa->reg, val); + if (val != old || !wa->clr) { + if (wa->is_mcr) + intel_gt_mcr_multicast_write_fw(gt, wa->reg, val); + else + intel_uncore_write_fw(uncore, wa->reg, val); + } + + if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) { + u32 val = wa->is_mcr ? + intel_gt_mcr_read_any_fw(gt, wa->reg) : + intel_uncore_read_fw(uncore, wa->reg); - if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) - wa_verify(wa, intel_gt_mcr_read_any_fw(gt, wa->reg), - wal->name, "application"); + wa_verify(wa, val, wal->name, "application"); + } } intel_uncore_forcewake_put__locked(uncore, fw); @@ -1674,8 +1737,9 @@ static bool wa_list_verify(struct intel_gt *gt, intel_uncore_forcewake_get__locked(uncore, fw); for (i = 0, wa = wal->list; i < wal->count; i++, wa++) - ok &= wa_verify(wa, - intel_gt_mcr_read_any_fw(gt, wa->reg), + ok &= wa_verify(wa, wa->is_mcr ? + intel_gt_mcr_read_any_fw(gt, wa->reg) : + intel_uncore_read_fw(uncore, wa->reg), wal->name, from); intel_uncore_forcewake_put__locked(uncore, fw); @@ -1721,12 +1785,36 @@ whitelist_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags) _wa_add(wal, &wa); } +static void +whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags) +{ + struct i915_wa wa = { + .reg = reg, + .is_mcr = 1, + }; + + if (GEM_DEBUG_WARN_ON(wal->count >= RING_MAX_NONPRIV_SLOTS)) + return; + + if (GEM_DEBUG_WARN_ON(!is_nonpriv_flags_valid(flags))) + return; + + wa.reg.reg |= flags; + _wa_add(wal, &wa); +} + static void whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg) { whitelist_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW); } +static void +whitelist_mcr_reg(struct i915_wa_list *wal, i915_reg_t reg) +{ + whitelist_mcr_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW); +} + static void gen9_whitelist_build(struct i915_wa_list *w) { /* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */ @@ -1752,7 +1840,7 @@ static void skl_whitelist_build(struct intel_engine_cs *engine) gen9_whitelist_build(w); /* WaDisableLSQCROPERFforOCL:skl */ - whitelist_reg(w, GEN8_L3SQCREG4); + whitelist_mcr_reg(w, GEN8_L3SQCREG4); } static void bxt_whitelist_build(struct intel_engine_cs *engine) @@ -1773,7 +1861,7 @@ static void kbl_whitelist_build(struct intel_engine_cs *engine) gen9_whitelist_build(w); /* WaDisableLSQCROPERFforOCL:kbl */ - whitelist_reg(w, GEN8_L3SQCREG4); + whitelist_mcr_reg(w, GEN8_L3SQCREG4); } static void glk_whitelist_build(struct intel_engine_cs *engine) @@ -1838,10 +1926,10 @@ static void icl_whitelist_build(struct intel_engine_cs *engine) switch (engine->class) { case RENDER_CLASS: /* WaAllowUMDToModifyHalfSliceChicken7:icl */ - whitelist_reg(w, GEN9_HALF_SLICE_CHICKEN7); + whitelist_mcr_reg(w, GEN9_HALF_SLICE_CHICKEN7); /* WaAllowUMDToModifySamplerMode:icl */ - whitelist_reg(w, GEN10_SAMPLER_MODE); + whitelist_mcr_reg(w, GEN10_SAMPLER_MODE); /* WaEnableStateCacheRedirectToCS:icl */ whitelist_reg(w, GEN9_SLICE_COMMON_ECO_CHICKEN1); @@ -2117,21 +2205,21 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) { /* Wa_14013392000:dg2_g11 */ - wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE); } if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_FOREVER) || IS_DG2_G11(i915) || IS_DG2_G12(i915)) { /* Wa_1509727124:dg2 */ - wa_masked_en(wal, GEN10_SAMPLER_MODE, - SC_DISABLE_POWER_OPTIMIZATION_EBB); + wa_mcr_masked_en(wal, GEN10_SAMPLER_MODE, + SC_DISABLE_POWER_OPTIMIZATION_EBB); } if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0) || IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) { /* Wa_14012419201:dg2 */ - wa_masked_en(wal, GEN9_ROW_CHICKEN4, - GEN12_DISABLE_HDR_PAST_PAYLOAD_HOLD_FIX); + wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, + GEN12_DISABLE_HDR_PAST_PAYLOAD_HOLD_FIX); } if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_C0) || @@ -2140,13 +2228,13 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) * Wa_22012826095:dg2 * Wa_22013059131:dg2 */ - wa_write_clr_set(wal, LSC_CHICKEN_BIT_0_UDW, - MAXREQS_PER_BANK, - REG_FIELD_PREP(MAXREQS_PER_BANK, 2)); + wa_mcr_write_clr_set(wal, LSC_CHICKEN_BIT_0_UDW, + MAXREQS_PER_BANK, + REG_FIELD_PREP(MAXREQS_PER_BANK, 2)); /* Wa_22013059131:dg2 */ - wa_write_or(wal, LSC_CHICKEN_BIT_0, - FORCE_1_SUB_MESSAGE_PER_FRAGMENT); + wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0, + FORCE_1_SUB_MESSAGE_PER_FRAGMENT); } /* Wa_1308578152:dg2_g10 when first gslice is fused off */ @@ -2159,19 +2247,19 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_FOREVER) || IS_DG2_G11(i915) || IS_DG2_G12(i915)) { /* Wa_22013037850:dg2 */ - wa_write_or(wal, LSC_CHICKEN_BIT_0_UDW, - DISABLE_128B_EVICTION_COMMAND_UDW); + wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0_UDW, + DISABLE_128B_EVICTION_COMMAND_UDW); /* Wa_22012856258:dg2 */ - wa_masked_en(wal, GEN8_ROW_CHICKEN2, + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_READ_SUPPRESSION); /* * Wa_22010960976:dg2 * Wa_14013347512:dg2 */ - wa_masked_dis(wal, XEHP_HDC_CHICKEN0, - LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK); + wa_mcr_masked_dis(wal, XEHP_HDC_CHICKEN0, + LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK); } if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0)) { @@ -2179,7 +2267,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) * Wa_1608949956:dg2_g10 * Wa_14010198302:dg2_g10 */ - wa_masked_en(wal, GEN8_ROW_CHICKEN, + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, MDQ_ARBITRATION_MODE | UGM_BACKUP_MODE); /* @@ -2188,31 +2276,31 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) * LSC_CHICKEN_BIT_0 always reads back as 0 is this stepping, * so ignoring verification. */ - wa_add(wal, LSC_CHICKEN_BIT_0_UDW, 0, - FORCE_SLM_FENCE_SCOPE_TO_TILE | FORCE_UGM_FENCE_SCOPE_TO_TILE, - 0, false); + wa_mcr_add(wal, LSC_CHICKEN_BIT_0_UDW, 0, + FORCE_SLM_FENCE_SCOPE_TO_TILE | FORCE_UGM_FENCE_SCOPE_TO_TILE, + 0, false); } if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0)) { /* Wa_22010430635:dg2 */ - wa_masked_en(wal, - GEN9_ROW_CHICKEN4, - GEN12_DISABLE_GRF_CLEAR); + wa_mcr_masked_en(wal, + GEN9_ROW_CHICKEN4, + GEN12_DISABLE_GRF_CLEAR); /* Wa_14010648519:dg2 */ - wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE); + wa_mcr_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE); } /* Wa_14013202645:dg2 */ if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_C0) || IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) - wa_write_or(wal, RT_CTRL, DIS_NULL_QUERY); + wa_mcr_write_or(wal, RT_CTRL, DIS_NULL_QUERY); /* Wa_22012532006:dg2 */ if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_C0) || IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) - wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7, - DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA); + wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7, + DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA); if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) { /* Wa_14010680813:dg2_g10 */ @@ -2223,17 +2311,17 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0) || IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) { /* Wa_14012362059:dg2 */ - wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB); + wa_mcr_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB); } if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_B0, STEP_FOREVER) || IS_DG2_G10(i915)) { /* Wa_22014600077:dg2 */ - wa_add(wal, GEN10_CACHE_MODE_SS, 0, - _MASKED_BIT_ENABLE(ENABLE_EU_COUNT_FOR_TDL_FLUSH), - 0 /* Wa_14012342262 :write-only reg, so skip - verification */, - true); + wa_mcr_add(wal, GEN10_CACHE_MODE_SS, 0, + _MASKED_BIT_ENABLE(ENABLE_EU_COUNT_FOR_TDL_FLUSH), + 0 /* Wa_14012342262 :write-only reg, so skip + verification */, + true); } if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) || @@ -2260,7 +2348,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) || IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) { /* Wa_1606931601:tgl,rkl,dg1,adl-s,adl-p */ - wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ); /* * Wa_1407928979:tgl A* @@ -2289,14 +2377,14 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) { /* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */ - wa_masked_en(wal, GEN8_ROW_CHICKEN2, - GEN12_PUSH_CONST_DEREF_HOLD_DIS); + wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, + GEN12_PUSH_CONST_DEREF_HOLD_DIS); /* * Wa_1409085225:tgl * Wa_14010229206:tgl,rkl,dg1[a0],adl-s,adl-p */ - wa_masked_en(wal, GEN9_ROW_CHICKEN4, GEN12_DISABLE_TDL_PUSH); + wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, GEN12_DISABLE_TDL_PUSH); } if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) || @@ -2320,9 +2408,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915) || IS_ALDERLAKE_S(i915) || IS_ALDERLAKE_P(i915)) { /* Wa_1406941453:tgl,rkl,dg1,adl-s,adl-p */ - wa_masked_en(wal, - GEN10_SAMPLER_MODE, - ENABLE_SMALLPL); + wa_mcr_masked_en(wal, + GEN10_SAMPLER_MODE, + ENABLE_SMALLPL); } if (GRAPHICS_VER(i915) == 11) { @@ -2356,9 +2444,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) * Wa_1405733216:icl * Formerly known as WaDisableCleanEvicts */ - wa_write_or(wal, - GEN8_L3SQCREG4, - GEN11_LQSC_CLEAN_EVICT_DISABLE); + wa_mcr_write_or(wal, + GEN8_L3SQCREG4, + GEN11_LQSC_CLEAN_EVICT_DISABLE); /* Wa_1606682166:icl */ wa_write_or(wal, @@ -2366,10 +2454,10 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) GEN7_DISABLE_SAMPLER_PREFETCH); /* Wa_1409178092:icl */ - wa_write_clr_set(wal, - GEN11_SCRATCH2, - GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE, - 0); + wa_mcr_write_clr_set(wal, + GEN11_SCRATCH2, + GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE, + 0); /* WaEnable32PlaneMode:icl */ wa_masked_en(wal, GEN9_CSFE_CHICKEN1_RCS, @@ -2427,30 +2515,30 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE); /* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */ - wa_write_or(wal, - BDW_SCRATCH1, - GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE); + wa_mcr_write_or(wal, + BDW_SCRATCH1, + GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE); /* WaProgramL3SqcReg1DefaultForPerf:bxt,glk */ if (IS_GEN9_LP(i915)) - wa_write_clr_set(wal, - GEN8_L3SQCREG1, - L3_PRIO_CREDITS_MASK, - L3_GENERAL_PRIO_CREDITS(62) | - L3_HIGH_PRIO_CREDITS(2)); + wa_mcr_write_clr_set(wal, + GEN8_L3SQCREG1, + L3_PRIO_CREDITS_MASK, + L3_GENERAL_PRIO_CREDITS(62) | + L3_HIGH_PRIO_CREDITS(2)); /* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */ - wa_write_or(wal, - GEN8_L3SQCREG4, - GEN8_LQSC_FLUSH_COHERENT_LINES); + wa_mcr_write_or(wal, + GEN8_L3SQCREG4, + GEN8_LQSC_FLUSH_COHERENT_LINES); /* Disable atomics in L3 to prevent unrecoverable hangs */ wa_write_clr_set(wal, GEN9_SCRATCH_LNCF1, GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE, 0); - wa_write_clr_set(wal, GEN8_L3SQCREG4, - GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0); - wa_write_clr_set(wal, GEN9_SCRATCH1, - EVICTION_PERF_FIX_ENABLE, 0); + wa_mcr_write_clr_set(wal, GEN8_L3SQCREG4, + GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0); + wa_mcr_write_clr_set(wal, GEN9_SCRATCH1, + EVICTION_PERF_FIX_ENABLE, 0); } if (IS_HASWELL(i915)) { @@ -2664,7 +2752,7 @@ ccs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) { if (IS_PVC_CT_STEP(engine->i915, STEP_A0, STEP_C0)) { /* Wa_14014999345:pvc */ - wa_masked_en(wal, GEN10_CACHE_MODE_SS, DISABLE_ECC); + wa_mcr_masked_en(wal, GEN10_CACHE_MODE_SS, DISABLE_ECC); } } @@ -2690,8 +2778,8 @@ add_render_compute_tuning_settings(struct drm_i915_private *i915, } if (IS_DG2(i915)) { - wa_write_or(wal, XEHP_L3SCQREG7, BLEND_FILL_CACHING_OPT_DIS); - wa_write_clr_set(wal, RT_CTRL, STACKID_CTRL, STACKID_CTRL_512); + wa_mcr_write_or(wal, XEHP_L3SCQREG7, BLEND_FILL_CACHING_OPT_DIS); + wa_mcr_write_clr_set(wal, RT_CTRL, STACKID_CTRL, STACKID_CTRL_512); /* * This is also listed as Wa_22012654132 for certain DG2 @@ -2702,10 +2790,10 @@ add_render_compute_tuning_settings(struct drm_i915_private *i915, * back for verification on DG2 (due to Wa_14012342262), so * we need to explicitly skip the readback. */ - wa_add(wal, GEN10_CACHE_MODE_SS, 0, - _MASKED_BIT_ENABLE(ENABLE_PREFETCH_INTO_IC), - 0 /* write-only, so skip validation */, - true); + wa_mcr_add(wal, GEN10_CACHE_MODE_SS, 0, + _MASKED_BIT_ENABLE(ENABLE_PREFETCH_INTO_IC), + 0 /* write-only, so skip validation */, + true); } /* @@ -2714,8 +2802,8 @@ add_render_compute_tuning_settings(struct drm_i915_private *i915, * platforms. */ if (INTEL_INFO(i915)->tuning_thread_rr_after_dep) - wa_masked_field_set(wal, GEN9_ROW_CHICKEN4, THREAD_EX_ARB_MODE, - THREAD_EX_ARB_MODE_RR_AFTER_DEP); + wa_mcr_masked_field_set(wal, GEN9_ROW_CHICKEN4, THREAD_EX_ARB_MODE, + THREAD_EX_ARB_MODE_RR_AFTER_DEP); } /* @@ -2741,30 +2829,30 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li if (IS_XEHPSDV(i915)) { /* Wa_1409954639 */ - wa_masked_en(wal, - GEN8_ROW_CHICKEN, - SYSTOLIC_DOP_CLOCK_GATING_DIS); + wa_mcr_masked_en(wal, + GEN8_ROW_CHICKEN, + SYSTOLIC_DOP_CLOCK_GATING_DIS); /* Wa_1607196519 */ - wa_masked_en(wal, - GEN9_ROW_CHICKEN4, - GEN12_DISABLE_GRF_CLEAR); + wa_mcr_masked_en(wal, + GEN9_ROW_CHICKEN4, + GEN12_DISABLE_GRF_CLEAR); /* Wa_14010670810:xehpsdv */ - wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE); + wa_mcr_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE); /* Wa_14010449647:xehpsdv */ - wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, - GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE); + wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, + GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE); /* Wa_18011725039:xehpsdv */ if (IS_XEHPSDV_GRAPHICS_STEP(i915, STEP_A1, STEP_B0)) { - wa_masked_dis(wal, MLTICTXCTL, TDONRENDER); - wa_write_or(wal, L3SQCREG1_CCS0, FLUSHALLNONCOH); + wa_mcr_masked_dis(wal, MLTICTXCTL, TDONRENDER); + wa_mcr_write_or(wal, L3SQCREG1_CCS0, FLUSHALLNONCOH); } /* Wa_14012362059:xehpsdv */ - wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB); + wa_mcr_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB); /* Wa_14014368820:xehpsdv */ wa_write_or(wal, GEN12_GAMCNTRL_CTRL, INVALIDATION_BROADCAST_MODE_DIS | @@ -2773,19 +2861,19 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li if (IS_DG2(i915) || IS_PONTEVECCHIO(i915)) { /* Wa_14015227452:dg2,pvc */ - wa_masked_en(wal, GEN9_ROW_CHICKEN4, XEHP_DIS_BBL_SYSPIPE); + wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, XEHP_DIS_BBL_SYSPIPE); /* Wa_22014226127:dg2,pvc */ - wa_write_or(wal, LSC_CHICKEN_BIT_0, DISABLE_D8_D16_COASLESCE); + wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0, DISABLE_D8_D16_COASLESCE); /* Wa_16015675438:dg2,pvc */ wa_masked_en(wal, FF_SLICE_CS_CHICKEN2, GEN12_PERF_FIX_BALANCING_CFE_DISABLE); /* Wa_18018781329:dg2,pvc */ - wa_write_or(wal, RENDER_MOD_CTRL, FORCE_MISS_FTLB); - wa_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB); - wa_write_or(wal, VDBX_MOD_CTRL, FORCE_MISS_FTLB); - wa_write_or(wal, VEBX_MOD_CTRL, FORCE_MISS_FTLB); + wa_mcr_write_or(wal, RENDER_MOD_CTRL, FORCE_MISS_FTLB); + wa_mcr_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB); + wa_mcr_write_or(wal, VDBX_MOD_CTRL, FORCE_MISS_FTLB); + wa_mcr_write_or(wal, VEBX_MOD_CTRL, FORCE_MISS_FTLB); } if (IS_DG2(i915)) { @@ -2793,7 +2881,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li * Wa_16011620976:dg2_g11 * Wa_22015475538:dg2 */ - wa_write_or(wal, LSC_CHICKEN_BIT_0_UDW, DIS_CHAIN_2XSIMD8); + wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0_UDW, DIS_CHAIN_2XSIMD8); } } diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h index 8a4b6de4e754..f05b37e56fa9 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h +++ b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h @@ -15,7 +15,9 @@ struct i915_wa { u32 clr; u32 set; u32 read; - bool masked_reg; + + u32 masked_reg:1; + u32 is_mcr:1; }; struct i915_wa_list { From patchwork Sat Oct 1 00:45:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12996175 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7C1D5C433FE for ; Sat, 1 Oct 2022 00:47:23 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D996110EE02; Sat, 1 Oct 2022 00:46:36 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0BE3510E497; Sat, 1 Oct 2022 00:46:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664585184; x=1696121184; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dCis5aGLI0tzRDzKGFc9Xn/MeP1n3SGQuk8GMS7cMzU=; b=Z0AHS9esefw+VY+33MX5WQcam6xnLrRNHeSPRguViymI8VsSeCsxP9YV PnGBtjlqDrPrW3ZQzpbtBaCuT+cCMVNCEta/mhJAMDMlCq2NTK+8CJMaG 52aeDqTzlUV8ZCghYyruGBE1L6xvHT6664RGTP7XuRHNE1nfE8Uz4gw2E CmHhbCN2mZcxmgC2PptcDfexyzYyisfK2jAOCsg9fhNXUYbrl9zi4CBF2 nu3YxE30ZQ58Db1Y1+hwbMMosJReM7OwEdl4Ku3yyJk1lk9AvYi9mKe1d wjugD3lta7cbuthvBv1tvyyIwpcTpVlPa2bBCXzGCM9qjQFcKmZRxeTux Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="388607939" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="388607939" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="685477647" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="685477647" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:10 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v2 12/14] drm/i915: Define multicast registers as a new type Date: Fri, 30 Sep 2022 17:45:48 -0700 Message-Id: <20221001004550.3031431-13-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221001004550.3031431-1-matthew.d.roper@intel.com> References: <20221001004550.3031431-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ravi.kumar.vodapalli@intel.com, balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Rather than treating multicast registers as 'i915_reg_t' let's define them as a completely new type. This will allow the compiler to help us make sure we're using multicast-aware functions to operate on multicast registers. This plan does break down a bit in places where we're just maintaining heterogeneous lists of registers (e.g., various MMIO whitelists used by perf, GVT, etc.) rather than performing reads/writes. We only really care about the offset in those cases, so for now we can "cast" the registers as non-MCR, leaving us with a list of i915_reg_t's, but we may want to look for better ways to store mixed collections of i915_reg_t and i915_mcr_reg_t in the future. v2: - Add TLB invalidation registers Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt.c | 32 +++++++---- drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 53 ++++++++++++------- drivers/gpu/drm/i915/gt/intel_gt_mcr.h | 18 +++---- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 27 +++++++--- drivers/gpu/drm/i915/gt/intel_workarounds.c | 32 +++++------ .../gpu/drm/i915/gt/intel_workarounds_types.h | 5 +- .../gpu/drm/i915/gt/selftest_workarounds.c | 2 +- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 2 +- .../gpu/drm/i915/gt/uc/intel_guc_capture.c | 4 +- drivers/gpu/drm/i915/gvt/handlers.c | 17 +++--- drivers/gpu/drm/i915/gvt/mmio_context.c | 16 +++--- drivers/gpu/drm/i915/i915_reg_defs.h | 22 +++----- drivers/gpu/drm/i915/intel_gvt_mmio_table.c | 10 ++-- 13 files changed, 140 insertions(+), 100 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index e763dc719d3a..27dbb9e4bd6c 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -991,7 +991,10 @@ void intel_gt_info_print(const struct intel_gt_info *info, } struct reg_and_bit { - i915_reg_t reg; + union { + i915_reg_t reg; + i915_mcr_reg_t mcr_reg; + }; u32 bit; }; @@ -1033,7 +1036,7 @@ get_reg_and_bit(const struct intel_engine_cs *engine, const bool gen8, static int wait_for_invalidate(struct intel_gt *gt, struct reg_and_bit rb) { if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) - return intel_gt_mcr_wait_for_reg_fw(gt, rb.reg, rb.bit, 0, + return intel_gt_mcr_wait_for_reg_fw(gt, rb.mcr_reg, rb.bit, 0, TLB_INVAL_TIMEOUT_US, TLB_INVAL_TIMEOUT_MS); else @@ -1058,7 +1061,7 @@ static void mmio_invalidate_full(struct intel_gt *gt) [COPY_ENGINE_CLASS] = GEN12_BLT_TLB_INV_CR, [COMPUTE_CLASS] = GEN12_COMPCTX_TLB_INV_CR, }; - static const i915_reg_t xehp_regs[] = { + static const i915_mcr_reg_t xehp_regs[] = { [RENDER_CLASS] = XEHP_GFX_TLB_INV_CR, [VIDEO_DECODE_CLASS] = XEHP_VD_TLB_INV_CR, [VIDEO_ENHANCEMENT_CLASS] = XEHP_VE_TLB_INV_CR, @@ -1074,7 +1077,7 @@ static void mmio_invalidate_full(struct intel_gt *gt) unsigned int num = 0; if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { - regs = xehp_regs; + regs = NULL; num = ARRAY_SIZE(xehp_regs); } else if (GRAPHICS_VER(i915) == 12) { regs = gen12_regs; @@ -1101,11 +1104,17 @@ static void mmio_invalidate_full(struct intel_gt *gt) if (!intel_engine_pm_is_awake(engine)) continue; - rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); - if (!i915_mmio_reg_offset(rb.reg)) - continue; + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + intel_gt_mcr_multicast_write_fw(gt, + xehp_regs[engine->class], + BIT(engine->instance)); + } else { + rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); + if (!i915_mmio_reg_offset(rb.reg)) + continue; - intel_uncore_write_fw(uncore, rb.reg, rb.bit); + intel_uncore_write_fw(uncore, rb.reg, rb.bit); + } awake |= engine->mask; } @@ -1125,7 +1134,12 @@ static void mmio_invalidate_full(struct intel_gt *gt) for_each_engine_masked(engine, gt, awake, tmp) { struct reg_and_bit rb; - rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + rb.mcr_reg = xehp_regs[engine->class]; + rb.bit = BIT(engine->instance); + } else { + rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); + } if (wait_for_invalidate(gt, rb)) drm_err_ratelimited(>->i915->drm, diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c index a2bc77c33835..9e2caba64f19 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c @@ -150,6 +150,19 @@ void intel_gt_mcr_init(struct intel_gt *gt) } } +/* + * Although the rest of the driver should use MCR-specific functions to + * read/write MCR registers, we still use the regular intel_uncore_* functions + * internally to implement those, so we need a way for the functions in this + * file to "cast" an i915_mcr_reg_t into an i915_reg_t. + */ +static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr) +{ + i915_reg_t r = { .reg = mcr.reg }; + + return r; +} + /* * rw_with_mcr_steering_fw - Access a register with specific MCR steering * @uncore: pointer to struct intel_uncore @@ -164,7 +177,7 @@ void intel_gt_mcr_init(struct intel_gt *gt) * Caller needs to make sure the relevant forcewake wells are up. */ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore, - i915_reg_t reg, u8 rw_flag, + i915_mcr_reg_t reg, u8 rw_flag, int group, int instance, u32 value) { u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0; @@ -201,9 +214,9 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore, intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr); if (rw_flag == FW_REG_READ) - val = intel_uncore_read_fw(uncore, reg); + val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg)); else - intel_uncore_write_fw(uncore, reg, value); + intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value); mcr &= ~mcr_mask; mcr |= old_mcr & mcr_mask; @@ -214,14 +227,14 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore, } static u32 rw_with_mcr_steering(struct intel_uncore *uncore, - i915_reg_t reg, u8 rw_flag, + i915_mcr_reg_t reg, u8 rw_flag, int group, int instance, u32 value) { enum forcewake_domains fw_domains; u32 val; - fw_domains = intel_uncore_forcewake_for_reg(uncore, reg, + fw_domains = intel_uncore_forcewake_for_reg(uncore, mcr_reg_cast(reg), rw_flag); fw_domains |= intel_uncore_forcewake_for_reg(uncore, GEN8_MCR_SELECTOR, @@ -249,7 +262,7 @@ static u32 rw_with_mcr_steering(struct intel_uncore *uncore, * group/instance. */ u32 intel_gt_mcr_read(struct intel_gt *gt, - i915_reg_t reg, + i915_mcr_reg_t reg, int group, int instance) { return rw_with_mcr_steering(gt->uncore, reg, FW_REG_READ, group, instance, 0); @@ -266,7 +279,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt, * Write an MCR register in unicast mode after steering toward a specific * group/instance. */ -void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_reg_t reg, u32 value, +void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value, int group, int instance) { rw_with_mcr_steering(gt->uncore, reg, FW_REG_WRITE, group, instance, value); @@ -281,9 +294,9 @@ void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_reg_t reg, u32 value, * Write an MCR register in multicast mode to update all instances. */ void intel_gt_mcr_multicast_write(struct intel_gt *gt, - i915_reg_t reg, u32 value) + i915_mcr_reg_t reg, u32 value) { - intel_uncore_write(gt->uncore, reg, value); + intel_uncore_write(gt->uncore, mcr_reg_cast(reg), value); } /** @@ -297,9 +310,9 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt, * domains; use intel_gt_mcr_multicast_write() in cases where forcewake should * be obtained automatically. */ -void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 value) +void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value) { - intel_uncore_write_fw(gt->uncore, reg, value); + intel_uncore_write_fw(gt->uncore, mcr_reg_cast(reg), value); } /** @@ -318,7 +331,7 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 va * domains; use intel_gt_mcr_multicast_rmw() in cases where forcewake should * be obtained automatically. */ -void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg, +void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_mcr_reg_t reg, u32 clear, u32 set) { u32 val = intel_gt_mcr_read_any(gt, reg); @@ -341,10 +354,10 @@ void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg, * for @type steering too. */ static bool reg_needs_read_steering(struct intel_gt *gt, - i915_reg_t reg, + i915_mcr_reg_t reg, enum intel_steering_type type) { - const u32 offset = i915_mmio_reg_offset(reg); + const u32 offset = i915_mmio_reg_offset(mcr_reg_cast(reg)); const struct intel_mmio_range *entry; if (likely(!gt->steering_table[type])) @@ -424,7 +437,7 @@ static void get_nonterminated_steering(struct intel_gt *gt, * steering. */ void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt, - i915_reg_t reg, + i915_mcr_reg_t reg, u8 *group, u8 *instance) { int type; @@ -453,7 +466,7 @@ void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt, * * Returns the value from a non-terminated instance of @reg. */ -u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg) +u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg) { int type; u8 group, instance; @@ -467,7 +480,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg) } } - return intel_uncore_read_fw(gt->uncore, reg); + return intel_uncore_read_fw(gt->uncore, mcr_reg_cast(reg)); } /** @@ -480,7 +493,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg) * * Returns the value from a non-terminated instance of @reg. */ -u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg) +u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg) { int type; u8 group, instance; @@ -494,7 +507,7 @@ u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg) } } - return intel_uncore_read(gt->uncore, reg); + return intel_uncore_read(gt->uncore, mcr_reg_cast(reg)); } static void report_steering_type(struct drm_printer *p, @@ -595,7 +608,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss, * Return: 0 if the register matches the desired condition, or -ETIMEDOUT. */ int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt, - i915_reg_t reg, + i915_mcr_reg_t reg, u32 mask, u32 value, unsigned int fast_timeout_us, diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h index 96979af01f2b..7fb574d31d73 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h @@ -11,24 +11,24 @@ void intel_gt_mcr_init(struct intel_gt *gt); u32 intel_gt_mcr_read(struct intel_gt *gt, - i915_reg_t reg, + i915_mcr_reg_t reg, int group, int instance); -u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg); -u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg); +u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg); +u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg); void intel_gt_mcr_unicast_write(struct intel_gt *gt, - i915_reg_t reg, u32 value, + i915_mcr_reg_t reg, u32 value, int group, int instance); void intel_gt_mcr_multicast_write(struct intel_gt *gt, - i915_reg_t reg, u32 value); + i915_mcr_reg_t reg, u32 value); void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, - i915_reg_t reg, u32 value); + i915_mcr_reg_t reg, u32 value); -void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg, +void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_mcr_reg_t reg, u32 clear, u32 set); void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt, - i915_reg_t reg, + i915_mcr_reg_t reg, u8 *group, u8 *instance); void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt, @@ -38,7 +38,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss, unsigned int *group, unsigned int *instance); int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt, - i915_reg_t reg, + i915_mcr_reg_t reg, u32 mask, u32 value, unsigned int fast_timeout_us, diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 901ea03b9fd8..b7341f01ec9f 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -8,7 +8,18 @@ #include "i915_reg_defs.h" -#define MCR_REG(offset) _MMIO(offset) +#define MCR_REG(offset) ((const i915_mcr_reg_t){ .reg = (offset) }) + +/* + * The perf control registers are technically multicast registers, but the + * driver never needs to read/write them directly; we only use them to build + * lists of registers (where they're mixed in with other non-MCR registers) + * and then operate on the offset directly. For now we'll just define them + * as non-multicast so we can place them on the same list, but we may want + * to try to come up with a better way to handle heterogeneous lists of + * registers in the future. + */ +#define PERF_REG(offset) _MMIO(offset) /* RPM unit config (Gen8+) */ #define RPM_CONFIG0 _MMIO(0xd00) @@ -1116,8 +1127,8 @@ #define FLOAT_BLEND_OPTIMIZATION_ENABLE REG_BIT(4) #define ENABLE_PREFETCH_INTO_IC REG_BIT(3) -#define EU_PERF_CNTL0 MCR_REG(0xe458) -#define EU_PERF_CNTL4 MCR_REG(0xe45c) +#define EU_PERF_CNTL0 PERF_REG(0xe458) +#define EU_PERF_CNTL4 PERF_REG(0xe45c) #define GEN9_ROW_CHICKEN4 MCR_REG(0xe48c) #define GEN12_DISABLE_GRF_CLEAR REG_BIT(13) @@ -1154,16 +1165,16 @@ #define STACKID_CTRL REG_GENMASK(6, 5) #define STACKID_CTRL_512 REG_FIELD_PREP(STACKID_CTRL, 0x2) -#define EU_PERF_CNTL1 MCR_REG(0xe558) -#define EU_PERF_CNTL5 MCR_REG(0xe55c) +#define EU_PERF_CNTL1 PERF_REG(0xe558) +#define EU_PERF_CNTL5 PERF_REG(0xe55c) #define XEHP_HDC_CHICKEN0 MCR_REG(0xe5f0) #define LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK REG_GENMASK(13, 11) #define ICL_HDC_MODE MCR_REG(0xe5f4) -#define EU_PERF_CNTL2 MCR_REG(0xe658) -#define EU_PERF_CNTL6 MCR_REG(0xe65c) -#define EU_PERF_CNTL3 MCR_REG(0xe758) +#define EU_PERF_CNTL2 PERF_REG(0xe658) +#define EU_PERF_CNTL6 PERF_REG(0xe65c) +#define EU_PERF_CNTL3 PERF_REG(0xe758) #define LSC_CHICKEN_BIT_0 MCR_REG(0xe7c8) #define DISABLE_D8_D16_COASLESCE REG_BIT(30) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index c5cd89aa13c9..174c74dcda99 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -166,11 +166,11 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg, _wa_add(wal, &wa); } -static void wa_mcr_add(struct i915_wa_list *wal, i915_reg_t reg, +static void wa_mcr_add(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clear, u32 set, u32 read_mask, bool masked_reg) { struct i915_wa wa = { - .reg = reg, + .mcr_reg = reg, .clr = clear, .set = set, .read = read_mask, @@ -188,7 +188,7 @@ wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set) } static void -wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set) +wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clear, u32 set) { wa_mcr_add(wal, reg, clear, set, clear, false); } @@ -206,7 +206,7 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set) } static void -wa_mcr_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set) +wa_mcr_write_or(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 set) { wa_mcr_write_clr_set(wal, reg, set, set); } @@ -218,7 +218,7 @@ wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr) } static void -wa_mcr_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr) +wa_mcr_write_clr(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clr) { wa_mcr_write_clr_set(wal, reg, clr, 0); } @@ -241,7 +241,7 @@ wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val) } static void -wa_mcr_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val) +wa_mcr_masked_en(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val) { wa_mcr_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true); } @@ -253,7 +253,7 @@ wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val) } static void -wa_mcr_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val) +wa_mcr_masked_dis(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val) { wa_mcr_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true); } @@ -266,7 +266,7 @@ wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg, } static void -wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg, +wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 mask, u32 val) { wa_mcr_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true); @@ -1692,19 +1692,19 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal) /* open-coded rmw due to steering */ if (wa->clr) old = wa->is_mcr ? - intel_gt_mcr_read_any_fw(gt, wa->reg) : + intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) : intel_uncore_read_fw(uncore, wa->reg); val = (old & ~wa->clr) | wa->set; if (val != old || !wa->clr) { if (wa->is_mcr) - intel_gt_mcr_multicast_write_fw(gt, wa->reg, val); + intel_gt_mcr_multicast_write_fw(gt, wa->mcr_reg, val); else intel_uncore_write_fw(uncore, wa->reg, val); } if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) { u32 val = wa->is_mcr ? - intel_gt_mcr_read_any_fw(gt, wa->reg) : + intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) : intel_uncore_read_fw(uncore, wa->reg); wa_verify(wa, val, wal->name, "application"); @@ -1738,7 +1738,7 @@ static bool wa_list_verify(struct intel_gt *gt, for (i = 0, wa = wal->list; i < wal->count; i++, wa++) ok &= wa_verify(wa, wa->is_mcr ? - intel_gt_mcr_read_any_fw(gt, wa->reg) : + intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) : intel_uncore_read_fw(uncore, wa->reg), wal->name, from); @@ -1786,10 +1786,10 @@ whitelist_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags) } static void -whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags) +whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 flags) { struct i915_wa wa = { - .reg = reg, + .mcr_reg = reg, .is_mcr = 1, }; @@ -1799,7 +1799,7 @@ whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags) if (GEM_DEBUG_WARN_ON(!is_nonpriv_flags_valid(flags))) return; - wa.reg.reg |= flags; + wa.mcr_reg.reg |= flags; _wa_add(wal, &wa); } @@ -1810,7 +1810,7 @@ whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg) } static void -whitelist_mcr_reg(struct i915_wa_list *wal, i915_reg_t reg) +whitelist_mcr_reg(struct i915_wa_list *wal, i915_mcr_reg_t reg) { whitelist_mcr_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW); } diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h index f05b37e56fa9..7c8b01d00043 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h +++ b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h @@ -11,7 +11,10 @@ #include "i915_reg_defs.h" struct i915_wa { - i915_reg_t reg; + union { + i915_reg_t reg; + i915_mcr_reg_t mcr_reg; + }; u32 clr; u32 set; u32 read; diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c index 67a9aab801dd..21b1edc052f8 100644 --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c @@ -991,7 +991,7 @@ static bool pardon_reg(struct drm_i915_private *i915, i915_reg_t reg) /* Alas, we must pardon some whitelists. Mistakes already made */ static const struct regmask pardon[] = { { GEN9_CTX_PREEMPT_REG, 9 }, - { GEN8_L3SQCREG4, 9 }, + { _MMIO(0xb118), 9 }, /* GEN8_L3SQCREG4 */ }; return find_reg(i915, reg, pardon, ARRAY_SIZE(pardon)); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index de923fb82301..34ef4f36e660 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -328,7 +328,7 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt, static long __must_check guc_mcr_reg_add(struct intel_gt *gt, struct temp_regset *regset, - i915_reg_t reg, u32 flags) + i915_mcr_reg_t reg, u32 flags) { u8 group, inst; diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c index 9495a7928bc8..d5c03e7a7843 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c @@ -240,7 +240,7 @@ static void guc_capture_free_extlists(struct __guc_mmio_reg_descr_group *reglist struct __ext_steer_reg { const char *name; - i915_reg_t reg; + i915_mcr_reg_t reg; }; static const struct __ext_steer_reg xe_extregs[] = { @@ -252,7 +252,7 @@ static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext, const struct __ext_steer_reg *extlist, int slice_id, int subslice_id) { - ext->reg = extlist->reg; + ext->reg = _MMIO(i915_mmio_reg_offset(extlist->reg)); ext->flags = FIELD_PREP(GUC_REGSET_STEERING_GROUP, slice_id); ext->flags |= FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, subslice_id); ext->regname = extlist->name; diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c index 700cc9688f47..a52c4645cdbe 100644 --- a/drivers/gpu/drm/i915/gvt/handlers.c +++ b/drivers/gpu/drm/i915/gvt/handlers.c @@ -734,7 +734,7 @@ static i915_reg_t force_nonpriv_white_list[] = { _MMIO(0x770c), _MMIO(0x83a8), _MMIO(0xb110), - GEN8_L3SQCREG4,//_MMIO(0xb118) + _MMIO(0xb118), _MMIO(0xe100), _MMIO(0xe18c), _MMIO(0xe48c), @@ -2140,6 +2140,9 @@ static int csfe_chicken1_mmio_write(struct intel_vgpu *vgpu, #define MMIO_DFH(reg, d, f, r, w) \ MMIO_F(reg, 4, f, 0, 0, d, r, w) +#define MMIO_DFH_MCR(reg, d, f, r, w) \ + MMIO_F(_MMIO(i915_mmio_reg_offset(reg)), 4, f, 0, 0, d, r, w) + #define MMIO_GM(reg, d, r, w) \ MMIO_F(reg, 4, F_GMADR, 0xFFFFF000, 0, d, r, w) @@ -2529,15 +2532,15 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt) MMIO_DFH(HDC_CHICKEN0, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); - MMIO_DFH(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, - NULL, NULL); + MMIO_DFH_MCR(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, + NULL, NULL); MMIO_DFH(GEN7_ROW_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); MMIO_DFH(GEN8_UCGCTL6, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL); MMIO_DFH(_MMIO(0xb1f0), D_BDW, F_CMD_ACCESS, NULL, NULL); MMIO_DFH(_MMIO(0xb1c0), D_BDW, F_CMD_ACCESS, NULL, NULL); - MMIO_DFH(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL); + MMIO_DFH_MCR(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL); MMIO_DFH(_MMIO(0xb100), D_BDW, F_CMD_ACCESS, NULL, NULL); MMIO_DFH(_MMIO(0xb10c), D_BDW, F_CMD_ACCESS, NULL, NULL); @@ -2550,7 +2553,7 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt) MMIO_DFH(_MMIO(0xe194), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); MMIO_DFH(_MMIO(0xe188), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); - MMIO_DFH(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); + MMIO_DFH_MCR(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); MMIO_DFH(_MMIO(0x2580), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL); MMIO_DFH(_MMIO(0x2248), D_BDW, F_CMD_ACCESS, NULL, NULL); @@ -2699,7 +2702,7 @@ static int init_skl_mmio_info(struct intel_gvt *gvt) MMIO_DH(_MMIO(_REG_701C4(PIPE_C, 3)), D_SKL_PLUS, NULL, NULL); MMIO_DH(_MMIO(_REG_701C4(PIPE_C, 4)), D_SKL_PLUS, NULL, NULL); - MMIO_DFH(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL); + MMIO_DFH_MCR(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL); MMIO_F(GEN9_GFX_MOCS(0), 0x7f8, F_CMD_ACCESS, 0, 0, D_SKL_PLUS, NULL, NULL); @@ -2769,7 +2772,7 @@ static int init_bxt_mmio_info(struct intel_gvt *gvt) MMIO_DH(BXT_PORT_TX_DW3_LN0(DPIO_PHY1, DPIO_CH0), D_BXT, bxt_port_tx_dw3_read, NULL); MMIO_DH(BXT_DE_PLL_ENABLE, D_BXT, NULL, bxt_de_pll_enable_write); - MMIO_DFH(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL); + MMIO_DFH_MCR(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL); MMIO_DFH(GEN8_L3CNTLREG, D_BXT, F_CMD_ACCESS, NULL, NULL); MMIO_DFH(_MMIO(0x20D8), D_BXT, F_CMD_ACCESS, NULL, NULL); MMIO_F(GEN8_RING_CS_GPR(RENDER_RING_BASE, 0), 0x40, F_CMD_ACCESS, diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c index d177884d8f7d..7f40055b0474 100644 --- a/drivers/gpu/drm/i915/gvt/mmio_context.c +++ b/drivers/gpu/drm/i915/gvt/mmio_context.c @@ -44,6 +44,8 @@ #define GEN9_MOCS_SIZE 64 +#define MCR_CAST(mcr) _MMIO(i915_mcr_reg_offset(mcr)) + /* Raw offset is appened to each line for convenience. */ static struct engine_mmio gen8_engine_mmio_list[] __cacheline_aligned = { {RCS0, RING_MODE_GEN7(RENDER_RING_BASE), 0xffff, false}, /* 0x229c */ @@ -106,15 +108,15 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = { {RCS0, GEN8_CS_CHICKEN1, 0xffff, true}, /* 0x2580 */ {RCS0, COMMON_SLICE_CHICKEN2, 0xffff, true}, /* 0x7014 */ {RCS0, GEN9_CS_DEBUG_MODE1, 0xffff, false}, /* 0x20ec */ - {RCS0, GEN8_L3SQCREG4, 0, false}, /* 0xb118 */ - {RCS0, GEN9_SCRATCH1, 0, false}, /* 0xb11c */ + {RCS0, _MMIO(0xb118), 0, false}, /* GEN8_L3SQCREG4 */ + {RCS0, _MMIO(0xb11c), 0, false}, /* GEN9_SCRATCH1 */ {RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */ {RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */ - {RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */ - {RCS0, GEN8_HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */ - {RCS0, GEN9_HALF_SLICE_CHICKEN5, 0xffff, true}, /* 0xe188 */ - {RCS0, GEN9_HALF_SLICE_CHICKEN7, 0xffff, true}, /* 0xe194 */ - {RCS0, GEN8_ROW_CHICKEN, 0xffff, true}, /* 0xe4f0 */ + {RCS0, _MMIO(0xe180), 0xffff, true}, /* HALF_SLICE_CHICKEN2 */ + {RCS0, _MMIO(0xe184), 0xffff, true}, /* 0xe184 */ + {RCS0, _MMIO(0xe188), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN5 */ + {RCS0, _MMIO(0xe194), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN7 */ + {RCS0, _MMIO(0xe4f0), 0xffff, true}, /* GEN8_ROW_CHICKEN */ {RCS0, TRVATTL3PTRDW(0), 0, true}, /* 0x4de0 */ {RCS0, TRVATTL3PTRDW(1), 0, true}, /* 0x4de4 */ {RCS0, TRNULLDETCT, 0, true}, /* 0x4de8 */ diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h index 8f486f77609f..e823869b9afd 100644 --- a/drivers/gpu/drm/i915/i915_reg_defs.h +++ b/drivers/gpu/drm/i915/i915_reg_defs.h @@ -104,22 +104,16 @@ typedef struct { #define _MMIO(r) ((const i915_reg_t){ .reg = (r) }) -#define INVALID_MMIO_REG _MMIO(0) - -static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg) -{ - return reg.reg; -} +typedef struct { + u32 reg; +} i915_mcr_reg_t; -static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b) -{ - return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b); -} +#define INVALID_MMIO_REG _MMIO(0) -static inline bool i915_mmio_reg_valid(i915_reg_t reg) -{ - return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG); -} +/* These macros can be used on either i915_reg_t or i915_mcr_reg_t */ +#define i915_mmio_reg_offset(r) (r.reg) +#define i915_mmio_reg_equal(a, b) (i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b)) +#define i915_mmio_reg_valid(r) (!i915_mmio_reg_equal(r, INVALID_MMIO_REG)) #define VLV_DISPLAY_BASE 0x180000 diff --git a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c index 638b77d64bf4..8c1515f5c04c 100644 --- a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c +++ b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c @@ -816,10 +816,10 @@ static int iterate_bdw_plus_mmio(struct intel_gvt_mmio_table_iter *iter) MMIO_D(GEN8_EU_DISABLE1); MMIO_D(GEN8_EU_DISABLE2); MMIO_D(_MMIO(0xfdc)); - MMIO_D(GEN8_ROW_CHICKEN); + MMIO_D(_MMIO(0xe4f0)); /* GEN8_ROW_CHICKEN */ MMIO_D(GEN7_ROW_CHICKEN2); MMIO_D(GEN8_UCGCTL6); - MMIO_D(GEN8_L3SQCREG4); + MMIO_D(_MMIO(0xb118)); /* GEN8_L3SQCREG4 */ MMIO_D(GEN9_SCRATCH_LNCF1); MMIO_F(_MMIO(0x24d0), 48); MMIO_D(_MMIO(0x44484)); @@ -831,7 +831,7 @@ static int iterate_bdw_plus_mmio(struct intel_gvt_mmio_table_iter *iter) MMIO_D(_MMIO(0x65f10)); MMIO_D(_MMIO(0xe194)); MMIO_D(_MMIO(0xe188)); - MMIO_D(HALF_SLICE_CHICKEN2); + MMIO_D(_MMIO(0xe180)); /* HALF_SLICE_CHICKEN2 */ MMIO_D(_MMIO(0x2580)); MMIO_D(_MMIO(0xe220)); MMIO_D(_MMIO(0xe230)); @@ -1025,7 +1025,7 @@ static int iterate_skl_plus_mmio(struct intel_gvt_mmio_table_iter *iter) MMIO_D(DMC_SSP_BASE); MMIO_D(DMC_HTP_SKL); MMIO_D(DMC_LAST_WRITE); - MMIO_D(BDW_SCRATCH1); + MMIO_D(_MMIO(0xb11c)); /* BDW_SCRATCH1 */ MMIO_D(SKL_DFSM); MMIO_D(DISPIO_CR_TX_BMU_CR0); MMIO_F(GEN9_GFX_MOCS(0), 0x7f8); @@ -1228,7 +1228,7 @@ static int iterate_bxt_mmio(struct intel_gvt_mmio_table_iter *iter) MMIO_D(GEN8_PUSHBUS_ENABLE); MMIO_D(GEN8_PUSHBUS_SHIFT); MMIO_D(GEN6_GFXPAUSE); - MMIO_D(GEN8_L3SQCREG1); + MMIO_D(_MMIO(0xb100)); /* GEN8_L3SQCREG1 */ MMIO_D(GEN8_L3CNTLREG); MMIO_D(_MMIO(0x20D8)); MMIO_F(GEN8_RING_CS_GPR(RENDER_RING_BASE, 0), 0x40); From patchwork Sat Oct 1 00:45:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12996174 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CCD96C433F5 for ; Sat, 1 Oct 2022 00:47:20 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1CF1A10EDFB; Sat, 1 Oct 2022 00:46:35 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id BFC8410EDF0; Sat, 1 Oct 2022 00:46:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664585183; x=1696121183; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GPdL2exlDBty4p4AMkfnE0rWzlUtuRhJ4DZcKwA7nBM=; b=Ruht+TXPuQWHniAoS3X0jg/9obk5/65+8a4y9PW8zNTnpBgMf+8VLKel bMqXwim/R6NEM3Fj+DPhJM9SyRnnxYXsqB1Jf7QgMFSzfo+abXOhe9Tqp c0rqNl4etZaQX4b6MS0DLcR3bvRDWU2pVnRCWB467b+CTpfTpflpHEB/B 8iDz1rt563ErmnhhVt5avq3mwaTP0SynMRi45578ngmyqXriZUIQa3uWQ 7fIdTsGbjAYoOh1QISZ8mPm3UNFbzPMS3Wr1Co8yc+cFVy1+LK3UHKqns juotYUZdUb4elkb5QGoVSjwUpuyn8JsYj21/7GKrs4EKqeM5kM+p4rBiw w==; X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="388607937" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="388607937" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="685477649" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="685477649" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:10 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v2 13/14] drm/i915/mtl: Add multicast steering for render GT Date: Fri, 30 Sep 2022 17:45:49 -0700 Message-Id: <20221001004550.3031431-14-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221001004550.3031431-1-matthew.d.roper@intel.com> References: <20221001004550.3031431-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ravi.kumar.vodapalli@intel.com, balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org, Radhakrishna Sripada Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" MTL once again changes the multicast register types and steering details. Key changes from past platforms: * The number of instances of some MCR types (NODE, OAAL2, and GAM) vary according to the MTL subplatform and cannot be read from fuse registers. * The MCR steering register (and its bitfields) has changed. Unlike past platforms, we will be explicitly steering all types of MCR accesses, including those for "SLICE" and "DSS" ranges; we no longer rely on implicit steering. On previous platforms, various hardware/firmware agents that needed to access registers typically had their own steering control registers, allowing them to perform multicast steering without clobbering the CPU/kernel steering. Starting with MTL, more of these agents now share a single steering register (0xFD4) and it is no longer safe for us to assume that the value will remain unchanged from how we initialized it during startup. There is also a slight chance of race conditions between the driver and a hardware/firmware agent, so the hardware provides a semaphore register that can be used to coordinate access to the steering register. Support for the semaphore register will be introduced in a future patch. Bspec: 67788, 67112 Cc: Radhakrishna Sripada Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 85 ++++++++++++++++++--- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 5 ++ drivers/gpu/drm/i915/gt/intel_gt_types.h | 8 +- drivers/gpu/drm/i915/gt/intel_workarounds.c | 18 ++++- drivers/gpu/drm/i915/i915_pci.c | 1 + 5 files changed, 102 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c index 9e2caba64f19..a1fa71b47831 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c @@ -41,6 +41,7 @@ static const char * const intel_steering_types[] = { "MSLICE", "LNCF", "GAM", + "DSS", "INSTANCE 0", }; @@ -99,9 +100,40 @@ static const struct intel_mmio_range pvc_instance0_steering_table[] = { {}, }; +static const struct intel_mmio_range mtl3d_instance0_steering_table[] = { + { 0x000B00, 0x000BFF }, /* SQIDI */ + { 0x001000, 0x001FFF }, /* SQIDI */ + { 0x004000, 0x0048FF }, /* GAM */ + { 0x008700, 0x0087FF }, /* SQIDI */ + { 0x00B000, 0x00B0FF }, /* NODE */ + { 0x00C800, 0x00CFFF }, /* GAM */ + { 0x00D880, 0x00D8FF }, /* NODE */ + { 0x00DD00, 0x00DDFF }, /* OAAL2 */ + {}, +}; + +static const struct intel_mmio_range mtl3d_l3bank_steering_table[] = { + { 0x00B100, 0x00B3FF }, + {}, +}; + +/* DSS steering is used for SLICE ranges as well */ +static const struct intel_mmio_range mtl3d_dss_steering_table[] = { + { 0x005200, 0x0052FF }, /* SLICE */ + { 0x005500, 0x007FFF }, /* SLICE */ + { 0x008140, 0x00815F }, /* SLICE (0x8140-0x814F), DSS (0x8150-0x815F) */ + { 0x0094D0, 0x00955F }, /* SLICE (0x94D0-0x951F), DSS (0x9520-0x955F) */ + { 0x009680, 0x0096FF }, /* DSS */ + { 0x00D800, 0x00D87F }, /* SLICE */ + { 0x00DC00, 0x00DCFF }, /* SLICE */ + { 0x00DE80, 0x00E8FF }, /* DSS (0xE000-0xE0FF reserved) */ +}; + void intel_gt_mcr_init(struct intel_gt *gt) { struct drm_i915_private *i915 = gt->i915; + unsigned long fuse; + int i; /* * An mslice is unavailable only if both the meml3 for the slice is @@ -119,7 +151,22 @@ void intel_gt_mcr_init(struct intel_gt *gt) drm_warn(&i915->drm, "mslice mask all zero!\n"); } - if (IS_PONTEVECCHIO(i915)) { + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70) && + gt->type == GT_PRIMARY) { + fuse = REG_FIELD_GET(GT_L3_EXC_MASK, + intel_uncore_read(gt->uncore, XEHP_FUSE4)); + + /* + * Despite the register field being named "exclude mask" the + * bits actually represent enabled banks (two banks per bit). + */ + for_each_set_bit(i, &fuse, 3) + gt->info.l3bank_mask |= (0x3 << 2*i); + + gt->steering_table[INSTANCE0] = mtl3d_instance0_steering_table; + gt->steering_table[L3BANK] = mtl3d_l3bank_steering_table; + gt->steering_table[DSS] = mtl3d_dss_steering_table; + } else if (IS_PONTEVECCHIO(i915)) { gt->steering_table[INSTANCE0] = pvc_instance0_steering_table; } else if (IS_DG2(i915)) { gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table; @@ -184,7 +231,12 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore, lockdep_assert_held(&uncore->lock); - if (GRAPHICS_VER(uncore->i915) >= 11) { + if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 70)) { + /* No need to save old steering reg value */ + intel_uncore_write_fw(uncore, MTL_MCR_SELECTOR, + REG_FIELD_PREP(MTL_MCR_GROUPID, group) | + REG_FIELD_PREP(MTL_MCR_INSTANCEID, instance)); + } else if (GRAPHICS_VER(uncore->i915) >= 11) { mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK; mcr_ss = GEN11_MCR_SLICE(group) | GEN11_MCR_SUBSLICE(instance); @@ -202,26 +254,30 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore, */ if (rw_flag == FW_REG_WRITE) mcr_mask |= GEN11_MCR_MULTICAST; + + old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR); + + mcr &= ~mcr_mask; + mcr |= mcr_ss; + intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr); } else { mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK; mcr_ss = GEN8_MCR_SLICE(group) | GEN8_MCR_SUBSLICE(instance); - } - old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR); + old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR); - mcr &= ~mcr_mask; - mcr |= mcr_ss; - intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr); + mcr &= ~mcr_mask; + mcr |= mcr_ss; + intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr); + } if (rw_flag == FW_REG_READ) val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg)); else intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value); - mcr &= ~mcr_mask; - mcr |= old_mcr & mcr_mask; - - intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr); + if (GRAPHICS_VER_FULL(uncore->i915) < IP_VER(12, 70)) + intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr); return val; } @@ -385,6 +441,8 @@ static void get_nonterminated_steering(struct intel_gt *gt, enum intel_steering_type type, u8 *group, u8 *instance) { + u32 dss; + switch (type) { case L3BANK: *group = 0; /* unused */ @@ -408,6 +466,11 @@ static void get_nonterminated_steering(struct intel_gt *gt, *group = IS_DG2(gt->i915) ? 1 : 0; *instance = 0; break; + case DSS: + dss = intel_sseu_find_first_xehp_dss(>->info.sseu, 0, 0); + *group = dss / GEN_DSS_PER_GSLICE; + *instance = dss % GEN_DSS_PER_GSLICE; + break; case INSTANCE0: /* * There are a lot of MCR types for which instance (0, 0) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index b7341f01ec9f..c5b9671097e3 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -59,6 +59,7 @@ #define GMD_ID_MEDIA _MMIO(MTL_MEDIA_GSI_BASE + 0xd8c) #define MCFG_MCR_SELECTOR _MMIO(0xfd0) +#define MTL_MCR_SELECTOR _MMIO(0xfd4) #define SF_MCR_SELECTOR _MMIO(0xfd8) #define GEN8_MCR_SELECTOR _MMIO(0xfdc) #define GAM_MCR_SELECTOR _MMIO(0xfe0) @@ -71,6 +72,8 @@ #define GEN11_MCR_SLICE_MASK GEN11_MCR_SLICE(0xf) #define GEN11_MCR_SUBSLICE(subslice) (((subslice) & 0x7) << 24) #define GEN11_MCR_SUBSLICE_MASK GEN11_MCR_SUBSLICE(0x7) +#define MTL_MCR_GROUPID REG_GENMASK(11, 8) +#define MTL_MCR_INSTANCEID REG_GENMASK(3, 0) #define IPEIR_I965 _MMIO(0x2064) #define IPEHR_I965 _MMIO(0x2068) @@ -531,6 +534,8 @@ #define GEN6_MBCTL_BOOT_FETCH_MECH (1 << 0) /* Fuse readout registers for GT */ +#define XEHP_FUSE4 _MMIO(0x9114) +#define GT_L3_EXC_MASK REG_GENMASK(6, 4) #define GEN10_MIRROR_FUSE3 _MMIO(0x9118) #define GEN10_L3BANK_PAIR_COUNT 4 #define GEN10_L3BANK_MASK 0x0F diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index 30003d68fd51..77ecbd6be331 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -60,6 +60,7 @@ enum intel_steering_type { MSLICE, LNCF, GAM, + DSS, /* * On some platforms there are multiple types of MCR registers that @@ -255,8 +256,6 @@ struct intel_gt { intel_engine_mask_t engine_mask; - u32 l3bank_mask; - u8 num_engines; /* General presence of SFC units */ @@ -268,7 +267,10 @@ struct intel_gt { /* Slice/subslice/EU info */ struct sseu_dev_info sseu; - unsigned long mslice_mask; + union { + unsigned long mslice_mask; + unsigned long l3bank_mask; + }; /** @hwconfig: hardware configuration data */ struct intel_hwconfig hwconfig; diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 174c74dcda99..9be048da7fb3 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -1581,12 +1581,28 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); } +static void +mtl_3d_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) +{ + /* + * Unlike older platforms, we no longer setup implicit steering here; + * all MCR accesses are explicitly steered. + */ + if (drm_debug_enabled(DRM_UT_DRIVER)) { + struct drm_printer p = drm_debug_printer("MCR Steering:"); + + intel_gt_mcr_report_steering(&p, gt, false); + } +} + static void gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal) { struct drm_i915_private *i915 = gt->i915; - if (IS_PONTEVECCHIO(i915)) + if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY) + mtl_3d_gt_workarounds_init(gt, wal); + else if (IS_PONTEVECCHIO(i915)) pvc_gt_workarounds_init(gt, wal); else if (IS_DG2(i915)) dg2_gt_workarounds_init(gt, wal); diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 38460a0bd7cb..bd1d8e0205a6 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1145,6 +1145,7 @@ static const struct intel_device_info mtl_info = { .extra_gt_list = xelpmp_extra_gt, .has_flat_ccs = 0, .has_gmd_id = 1, + .has_mslice_steering = 0, .has_snoop = 1, .__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM, .__runtime.platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(CCS0), From patchwork Sat Oct 1 00:45:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12996171 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 208BEC4332F for ; Sat, 1 Oct 2022 00:47:04 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3F9DE10E49D; Sat, 1 Oct 2022 00:46:31 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 392C210E498; Sat, 1 Oct 2022 00:46:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664585183; x=1696121183; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=wG9M37JMzdOyGt6f7SIFaWSQwjRiBvFNFXo8NFbhogw=; b=V+C1joKzuIyWidBcRFAgoYxhmnx6qkEIUwM/7ti2dKcUGT+GYrMIiini IUmSNX9dAzygsJT/3TY+S4UKIgI0gJg5sYrkyfnKKQQcl57CCd0xG0NNL 9BUyJnU7VWlWrLkGSEFMheFojBBse5wSJE2kofNnO1MwctDKR8ASS/LJz ANO1tO2V8VCbQDPM9dAJqnq/m/XCHJJxU2IEsid87htT3lnbi2qEohwxU hrdfYbn96GpcDwJjA35uU9yX7To1ThS/geOdEWMN11DwNqGgA0HERAAFy BHOw6G7r8+SubOcyoDf+z0+erbIh1Gkgcir0cLK2ntK6Tk6sIpvm6e4U3 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="388607930" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="388607930" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10486"; a="685477662" X-IronPort-AV: E=Sophos;i="5.93,359,1654585200"; d="scan'208";a="685477662" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 17:46:10 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Subject: [PATCH v2 14/14] drm/i915/mtl: Add multicast steering for media GT Date: Fri, 30 Sep 2022 17:45:50 -0700 Message-Id: <20221001004550.3031431-15-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221001004550.3031431-1-matthew.d.roper@intel.com> References: <20221001004550.3031431-1-matthew.d.roper@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ravi.kumar.vodapalli@intel.com, balasubramani.vivekanandan@intel.com, dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" MTL's media GT only has a single type of steering ("OAADDRM") which selects between media slice 0 and media slice 1. We'll always steer to media slice 0 unless it is fused off (which is the case when VD0, VE0, and SFC0 are all reported as unavailable). Bspec: 67789 Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 19 +++++++++++++++++-- drivers/gpu/drm/i915/gt/intel_gt_types.h | 1 + drivers/gpu/drm/i915/gt/intel_workarounds.c | 18 +++++++++++++++++- 3 files changed, 35 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c index a1fa71b47831..a580723bdd04 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c @@ -42,6 +42,7 @@ static const char * const intel_steering_types[] = { "LNCF", "GAM", "DSS", + "OADDRM", "INSTANCE 0", }; @@ -129,6 +130,12 @@ static const struct intel_mmio_range mtl3d_dss_steering_table[] = { { 0x00DE80, 0x00E8FF }, /* DSS (0xE000-0xE0FF reserved) */ }; +static const struct intel_mmio_range xelpmp_oaddrm_steering_table[] = { + { 0x393200, 0x39323F }, + { 0x393400, 0x3934FF }, +}; + + void intel_gt_mcr_init(struct intel_gt *gt) { struct drm_i915_private *i915 = gt->i915; @@ -151,8 +158,9 @@ void intel_gt_mcr_init(struct intel_gt *gt) drm_warn(&i915->drm, "mslice mask all zero!\n"); } - if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70) && - gt->type == GT_PRIMARY) { + if (MEDIA_VER(i915) >= 13 && gt->type == GT_MEDIA) { + gt->steering_table[OADDRM] = xelpmp_oaddrm_steering_table; + } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)) { fuse = REG_FIELD_GET(GT_L3_EXC_MASK, intel_uncore_read(gt->uncore, XEHP_FUSE4)); @@ -479,6 +487,13 @@ static void get_nonterminated_steering(struct intel_gt *gt, *group = 0; *instance = 0; break; + case OADDRM: + if ((VDBOX_MASK(gt) | VEBOX_MASK(gt) | gt->info.sfc_mask) & BIT(0)) + *group = 0; + else + *group = 1; + *instance = 0; + break; default: MISSING_CASE(type); *group = 0; diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index 77ecbd6be331..5d049e0c66f5 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -61,6 +61,7 @@ enum intel_steering_type { LNCF, GAM, DSS, + OADDRM, /* * On some platforms there are multiple types of MCR registers that diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 9be048da7fb3..744d83e5bb1f 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -1595,12 +1595,28 @@ mtl_3d_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) } } +static void +mtl_media_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) +{ + /* + * Unlike older platforms, we no longer setup implicit steering here; + * all MCR accesses are explicitly steered. + */ + if (drm_debug_enabled(DRM_UT_DRIVER)) { + struct drm_printer p = drm_debug_printer("MCR Steering:"); + + intel_gt_mcr_report_steering(&p, gt, false); + } +} + static void gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal) { struct drm_i915_private *i915 = gt->i915; - if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY) + if (IS_METEORLAKE(i915) && gt->type == GT_MEDIA) + mtl_media_gt_workarounds_init(gt, wal); + else if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY) mtl_3d_gt_workarounds_init(gt, wal); else if (IS_PONTEVECCHIO(i915)) pvc_gt_workarounds_init(gt, wal);