Message ID | 1526683197-24656-1-git-send-email-yunwei.zhang@intel.com (mailing list archive) |
---|---|
State | New, archived |
On 5/18/2018 3:39 PM, Yunwei Zhang wrote:
> [...]
> Signed-off-by: Yunwei Zhang <yunwei.zhang@intel.com>

Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e33c380..3b8a047 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2744,6 +2744,8 @@ int vlv_force_gfx_clock(struct drm_i915_private *dev_priv, bool on);
 int intel_engines_init_mmio(struct drm_i915_private *dev_priv);
 int intel_engines_init(struct drm_i915_private *dev_priv);
 
+u32 intel_calculate_mcr_s_ss_select(struct drm_i915_private *dev_priv);
+
 /* intel_hotplug.c */
 void intel_hpd_irq_handler(struct drm_i915_private *dev_priv,
 			   u32 pin_mask, u32 long_mask);
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 26f9f8a..832419e 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -819,12 +819,29 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
 	}
 }
 
+u32 intel_calculate_mcr_s_ss_select(struct drm_i915_private *dev_priv)
+{
+	const struct sseu_dev_info *sseu = &(INTEL_INFO(dev_priv)->sseu);
+	u32 mcr_s_ss_select;
+	u32 slice = fls(sseu->slice_mask);
+	u32 subslice = fls(sseu->subslice_mask[slice]);
+
+	if (INTEL_GEN(dev_priv) == 10)
+		mcr_s_ss_select = GEN8_MCR_SLICE(slice) |
+				  GEN8_MCR_SUBSLICE(subslice);
+	else
+		mcr_s_ss_select = 0;
+
+	return mcr_s_ss_select;
+}
+
 static inline uint32_t
 read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
 		  int subslice, i915_reg_t reg)
 {
 	uint32_t mcr_slice_subslice_mask;
 	uint32_t mcr_slice_subslice_select;
+	uint32_t default_mcr_s_ss_select;
 	uint32_t mcr;
 	uint32_t ret;
 	enum forcewake_domains fw_domains;
@@ -841,6 +858,8 @@ read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
 					    GEN8_MCR_SUBSLICE(subslice);
 	}
 
+	default_mcr_s_ss_select = intel_calculate_mcr_s_ss_select(dev_priv);
+
 	fw_domains = intel_uncore_forcewake_for_reg(dev_priv, reg,
 						    FW_REG_READ);
 	fw_domains |= intel_uncore_forcewake_for_reg(dev_priv,
@@ -851,11 +870,10 @@ read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
 	intel_uncore_forcewake_get__locked(dev_priv, fw_domains);
 
 	mcr = I915_READ_FW(GEN8_MCR_SELECTOR);
-	/*
-	 * The HW expects the slice and sublice selectors to be reset to 0
-	 * after reading out the registers.
-	 */
-	WARN_ON_ONCE(mcr & mcr_slice_subslice_mask);
+
+	WARN_ON_ONCE((mcr & mcr_slice_subslice_mask) !=
+		     default_mcr_s_ss_select);
+
 	mcr &= ~mcr_slice_subslice_mask;
 	mcr |= mcr_slice_subslice_select;
 	I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
@@ -863,6 +881,8 @@ read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
 	ret = I915_READ_FW(reg);
 
 	mcr &= ~mcr_slice_subslice_mask;
+	mcr |= default_mcr_s_ss_select;
+
 	I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
 
 	intel_uncore_forcewake_put__locked(dev_priv, fw_domains);
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 2df3538..720d863 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -672,8 +672,35 @@ static void cfl_gt_workarounds_apply(struct drm_i915_private *dev_priv)
 		   GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
 }
 
+static void wa_init_mcr(struct drm_i915_private *dev_priv)
+{
+	u32 mcr;
+	u32 mcr_slice_subslice_mask;
+
+	mcr = I915_READ(GEN8_MCR_SELECTOR);
+
+	mcr_slice_subslice_mask = GEN8_MCR_SLICE_MASK |
+				  GEN8_MCR_SUBSLICE_MASK;
+	/*
+	 * WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl
+	 * Before any MMIO read into slice/subslice specific registers, MCR
+	 * packet control register needs to be programmed to point to any
+	 * enabled s/ss pair. Otherwise, incorrect values will be returned.
+	 * This means each subsequent MMIO read will be forwarded to an
+	 * specific s/ss combination, but this is OK since these registers
+	 * are consistent across s/ss in almost all cases. In the rare
+	 * occasions, such as INSTDONE, where this value is dependent
+	 * on s/ss combo, the read should be done with read_subslice_reg.
+	 */
+	mcr &= ~mcr_slice_subslice_mask;
+	mcr |= intel_calculate_mcr_s_ss_select(dev_priv);
+	I915_WRITE(GEN8_MCR_SELECTOR, mcr);
+}
+
 static void cnl_gt_workarounds_apply(struct drm_i915_private *dev_priv)
 {
+	wa_init_mcr(dev_priv);
+
 	/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
 	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
 		I915_WRITE(GAMT_CHKN_BIT_REG,

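The steer, read, restore sequence that the read_subslice_reg hunks above implement can be sketched in plain userspace C. This is only a toy model, not driver code: `struct fake_gpu`, `fake_read_steered` and `fake_read_subslice_reg` are hypothetical names, the forcewake and locking steps are left out, and the bit positions of the slice/subslice fields are assumed to match the GEN8_MCR_* macros.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout: slice in bits 27:26, subslice in bits 25:24. */
#define MCR_SLICE(s)      (((uint32_t)(s) & 3u) << 26)
#define MCR_SUBSLICE(ss)  (((uint32_t)(ss) & 3u) << 24)
#define MCR_S_SS_MASK     (MCR_SLICE(3) | MCR_SUBSLICE(3))

/* Toy device: one steering register and one replicated register. */
struct fake_gpu {
	uint32_t mcr;            /* stands in for GEN8_MCR_SELECTOR (0xFDC) */
	uint32_t instdone[4][4]; /* one copy per slice/subslice pair */
};

/* A unicast read returns the copy the selector currently points at. */
static uint32_t fake_read_steered(const struct fake_gpu *gpu)
{
	unsigned int slice = (gpu->mcr >> 26) & 3u;
	unsigned int subslice = (gpu->mcr >> 24) & 3u;

	return gpu->instdone[slice][subslice];
}

/*
 * Steer to (slice, subslice), read, then restore the default steering
 * instead of clearing the selector fields to zero.
 */
static uint32_t fake_read_subslice_reg(struct fake_gpu *gpu,
				       unsigned int slice,
				       unsigned int subslice,
				       uint32_t default_select)
{
	uint32_t mcr = gpu->mcr;
	uint32_t ret;

	/* The default must only touch the slice/subslice fields. */
	assert((default_select & ~MCR_S_SS_MASK) == 0);

	mcr &= ~MCR_S_SS_MASK;
	mcr |= MCR_SLICE(slice) | MCR_SUBSLICE(subslice);
	gpu->mcr = mcr;

	ret = fake_read_steered(gpu);

	mcr &= ~MCR_S_SS_MASK;
	mcr |= default_select;
	gpu->mcr = mcr;

	return ret;
}
```

Bits outside the slice/subslice fields are carried through unchanged, so whatever else lives in the selector register survives the temporary re-steering.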
WaProgramMgsrForCorrectSliceSpecificMmioReads dictates that before any MMIO
read into slice/subslice specific registers, the MCR packet control
register (0xFDC) needs to be programmed to point to any enabled
slice/subslice pair. Otherwise, an incorrect value will be returned.

However, that means each subsequent MMIO read will be forwarded to a
specific slice/subslice combination, as the read is unicast. This is OK since
slice/subslice specific register values are consistent across
slice/subslice in almost all cases. On rare occasions, such as INSTDONE,
the value depends on the slice/subslice combo; in such cases, we need to
program 0xFDC and restore it afterwards. This is already covered by
read_subslice_reg.

Also, 0xFDC will lose its information after TDR/engine reset/power state
change.

References: HSD#1405586840, BSID#0575

v2:
 - use fls() instead of find_last_bit() (Chris)
 - added INTEL_SSEU to extract sseu from device info. (Chris)
v3:
 - rebase on latest tip
v5:
 - Added references (Mika)
 - Change the order of passing arguments, etc. (Ursulin)
v7:
 - Moved WA explanation comments (Oscar)
 - Rebased.
v8:
 - Renamed sanitize_mcr to calculate_s_ss_select. (Oscar)
 - calculate s/ss selector instead of whole mcr. (Oscar)
v9:
 - Updated function name (Oscar)
 - Remove redundant variables (Oscar)
v10:
 - Separate pre-GEN10 and GEN11 mask. (Oscar)

Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Yunwei Zhang <yunwei.zhang@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h          |  2 ++
 drivers/gpu/drm/i915/intel_engine_cs.c   | 30 +++++++++++++++++++++++++-----
 drivers/gpu/drm/i915/intel_workarounds.c | 27 +++++++++++++++++++++++++++
 3 files changed, 54 insertions(+), 5 deletions(-)
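
To make "point to any enabled slice/subslice pair" concrete, here is a minimal userspace sketch of deriving the selector bits from the enable masks. It is not the patch's code: `fls32`, `mcr_select_for` and `mcr_apply_select` are hypothetical names, and the field positions are assumed from the GEN8_MCR_* macros. One deliberate difference from the patch as posted: the kernel's fls() returns a 1-based bit position (0 for an empty mask), so this sketch subtracts 1 to get a zero-based index, on the assumption that the selector fields take zero-based slice/subslice numbers as the existing read_subslice_reg callers suggest.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout: slice in bits 27:26, subslice in bits 25:24. */
#define MCR_SLICE(s)      (((uint32_t)(s) & 3u) << 26)
#define MCR_SUBSLICE(ss)  (((uint32_t)(ss) & 3u) << 24)
#define MCR_SLICE_MASK    MCR_SLICE(3)
#define MCR_SUBSLICE_MASK MCR_SUBSLICE(3)

/*
 * Portable stand-in for the kernel's fls(): 1-based index of the highest
 * set bit, or 0 when no bit is set.
 */
static unsigned int fls32(uint32_t x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/*
 * Hypothetical helper: select the highest enabled slice and the highest
 * enabled subslice within it, both as zero-based indices.
 */
static uint32_t mcr_select_for(uint8_t slice_mask, const uint8_t *subslice_mask)
{
	unsigned int slice, subslice;

	assert(slice_mask != 0);
	slice = fls32(slice_mask) - 1;

	assert(subslice_mask[slice] != 0);
	subslice = fls32(subslice_mask[slice]) - 1;

	return MCR_SLICE(slice) | MCR_SUBSLICE(subslice);
}

/* Replace only the slice/subslice fields of an existing selector value. */
static uint32_t mcr_apply_select(uint32_t mcr, uint32_t select)
{
	mcr &= ~(MCR_SLICE_MASK | MCR_SUBSLICE_MASK);
	return mcr | select;
}
```

With one slice enabled (mask 0x1) and three subslices enabled (mask 0x7), the sketch selects slice 0, subslice 2. Because 0xFDC loses its contents after TDR, engine reset or a power state change, a caller would have to recompute and reapply this value after each of those events.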