Message ID | 20210302062700.6025-1-cooper.chiou@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9 | expand |
Hi Tvrtko, This WaProgramMgsrForCorrectSliceSpecificMmioReads info can be found on bspec WA#0575 and it's necessary for GT subslice fuse sku on PC7 exit case while running VP8 hw encode, it impacted CrOS projects since google disabled VP8 HW encode feature on Gen9 sku, so that's why we need this patch. Best Regards, Cooper > > WaProgramMgsrForCorrectSliceSpecificMmioReads applies for Gen9 to > resolve VP8 hardware encoding system hang up on GT1 sku for ChromiumOS > projects > > Slice specific MMIO read inaccurate so MGSR needs to be programmed > appropriately to get correct reads from these slicet-related MMIOs. > > It dictates that before any MMIO read into Slice/Subslice specific registers, > MCR packet control register(0xFDC) needs to be programmed to point to > any enabled slice/subslice pair, especially GT1 fused sku since this issue can > be reproduced on VP8 hardware encoding via ffmpeg on ChromiumOS > devices. > When exit PC7, MGSR will reset so that we have to skip fused subslice ID. > > Reference: HSD#1508045018,1405586840, BSID#0575 > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > Cc: Jani Nikula <jani.nikula@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> > Cc: William Tseng <william.tseng@intel.com> > Cc: Lee Shawn C <shawn.c.lee@intel.com> > > Signed-off-by: Cooper Chiou <cooper.chiou@intel.com> > --- > drivers/gpu/drm/i915/gt/intel_workarounds.c | 38 > +++++++++++++++++++++ > 1 file changed, 38 insertions(+) > > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c > b/drivers/gpu/drm/i915/gt/intel_workarounds.c > index 3b4a7da60f0b..4ad598a727a6 100644 > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c > @@ -878,9 +878,47 @@ hsw_gt_workarounds_init(struct drm_i915_private > *i915, struct i915_wa_list *wal) > wa_write_clr(wal, GEN7_FF_THREAD_MODE, > GEN7_FF_VS_REF_CNT_FFME); } > > +static void > +gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list > +*wal) { > + const struct sseu_dev_info *sseu = &i915->gt.info.sseu; > + unsigned int slice, subslice; > + u32 mcr, mcr_mask; > + > + GEM_BUG_ON(INTEL_GEN(i915) < 9); > + > + /* > + * WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml > + * Before any MMIO read into slice/subslice specific registers, MCR > + * packet control register needs to be programmed to point to any > + * enabled s/ss pair. Otherwise, incorrect values will be returned. > + * This means each subsequent MMIO read will be forwarded to an > + * specific s/ss combination, but this is OK since these registers > + * are consistent across s/ss in almost all cases. In the rare > + * occasions, such as INSTDONE, where this value is dependent > + * on s/ss combo, the read should be done with read_subslice_reg. > + */ > + slice = fls(sseu->slice_mask) - 1; > + GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask)); > + subslice = fls(intel_sseu_get_subslices(sseu, slice)); > + GEM_BUG_ON(!subslice); > + subslice--; > + > + mcr = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice); > + mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK; > + > + drm_dbg(&i915->drm, "MCR slice:%d/subslice:%d = %x\n", slice, > +subslice, mcr); > + > + wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr); } > + > static void > gen9_gt_workarounds_init(struct drm_i915_private *i915, struct > i915_wa_list *wal) { > + /* WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml */ > + if (!IS_COFFEELAKE(i915)) > + gen9_wa_init_mcr(i915, wal); > + > /* WaDisableKillLogic:bxt,skl,kbl */ > if (!IS_COFFEELAKE(i915) && !IS_COMETLAKE(i915)) > wa_write_or(wal, > -- > 2.17.1
On 02/03/2021 06:27, Cooper Chiou wrote: > WaProgramMgsrForCorrectSliceSpecificMmioReads applies for Gen9 to > resolve VP8 hardware encoding system hang up on GT1 sku for > ChromiumOS projects > > Slice specific MMIO read inaccurate so MGSR needs to be programmed > appropriately to get correct reads from these slicet-related MMIOs. > > It dictates that before any MMIO read into Slice/Subslice specific > registers, MCR packet control register(0xFDC) needs to be programmed > to point to any enabled slice/subslice pair, especially GT1 fused sku > since this issue can be reproduced on VP8 hardware encoding via ffmpeg > on ChromiumOS devices. > When exit PC7, MGSR will reset so that we have to skip fused subslice ID. > > Reference: HSD#1508045018,1405586840, BSID#0575 > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > Cc: Jani Nikula <jani.nikula@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> > Cc: William Tseng <william.tseng@intel.com> > Cc: Lee Shawn C <shawn.c.lee@intel.com> > > Signed-off-by: Cooper Chiou <cooper.chiou@intel.com> > --- > drivers/gpu/drm/i915/gt/intel_workarounds.c | 38 +++++++++++++++++++++ > 1 file changed, 38 insertions(+) > > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c > index 3b4a7da60f0b..4ad598a727a6 100644 > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c > @@ -878,9 +878,47 @@ hsw_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) > wa_write_clr(wal, GEN7_FF_THREAD_MODE, GEN7_FF_VS_REF_CNT_FFME); > } > > +static void > +gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal) > +{ > + const struct sseu_dev_info *sseu = &i915->gt.info.sseu; > + unsigned int slice, subslice; > + u32 mcr, mcr_mask; > + > + GEM_BUG_ON(INTEL_GEN(i915) < 9); > + > + /* > + * WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml > + * Before any MMIO read into slice/subslice specific registers, MCR > + * packet control register needs to be programmed to point to any > + * enabled s/ss pair. Otherwise, incorrect values will be returned. > + * This means each subsequent MMIO read will be forwarded to an > + * specific s/ss combination, but this is OK since these registers > + * are consistent across s/ss in almost all cases. In the rare > + * occasions, such as INSTDONE, where this value is dependent > + * on s/ss combo, the read should be done with read_subslice_reg. > + */ > + slice = fls(sseu->slice_mask) - 1; > + GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask)); > + subslice = fls(intel_sseu_get_subslices(sseu, slice)); > + GEM_BUG_ON(!subslice); > + subslice--; > + > + mcr = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice); > + mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK; > + > + drm_dbg(&i915->drm, "MCR slice:%d/subslice:%d = %x\n", slice, subslice, mcr); > + > + wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr); > +} Have you considered reusing existing wa_init_mcr? Just needs the top-level assert changed and otherwise it looks it would do the right thing for Gen9. Advantage being a smaller patch and less code to carry. Regards, Tvrtko > + > static void > gen9_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) > { > + /* WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml */ > + if (!IS_COFFEELAKE(i915)) > + gen9_wa_init_mcr(i915, wal); > + > /* WaDisableKillLogic:bxt,skl,kbl */ > if (!IS_COFFEELAKE(i915) && !IS_COMETLAKE(i915)) > wa_write_or(wal, >
Quoting Tvrtko Ursulin (2021-03-04 09:12:26) > > On 02/03/2021 06:27, Cooper Chiou wrote: > > WaProgramMgsrForCorrectSliceSpecificMmioReads applies for Gen9 to > > resolve VP8 hardware encoding system hang up on GT1 sku for > > ChromiumOS projects > > > > Slice specific MMIO read inaccurate so MGSR needs to be programmed > > appropriately to get correct reads from these slicet-related MMIOs. > > > > It dictates that before any MMIO read into Slice/Subslice specific > > registers, MCR packet control register(0xFDC) needs to be programmed > > to point to any enabled slice/subslice pair, especially GT1 fused sku > > since this issue can be reproduced on VP8 hardware encoding via ffmpeg > > on ChromiumOS devices. > > When exit PC7, MGSR will reset so that we have to skip fused subslice ID. > > > > Reference: HSD#1508045018,1405586840, BSID#0575 > > > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > > Cc: Jani Nikula <jani.nikula@intel.com> > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> > > Cc: William Tseng <william.tseng@intel.com> > > Cc: Lee Shawn C <shawn.c.lee@intel.com> > > > > Signed-off-by: Cooper Chiou <cooper.chiou@intel.com> > > --- > > drivers/gpu/drm/i915/gt/intel_workarounds.c | 38 +++++++++++++++++++++ > > 1 file changed, 38 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c > > index 3b4a7da60f0b..4ad598a727a6 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c > > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c > > @@ -878,9 +878,47 @@ hsw_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) > > wa_write_clr(wal, GEN7_FF_THREAD_MODE, GEN7_FF_VS_REF_CNT_FFME); > > } > > > > +static void > > +gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal) > > +{ > > + const struct sseu_dev_info *sseu = &i915->gt.info.sseu; > > + unsigned int slice, subslice; > > + u32 mcr, mcr_mask; > > + > > + GEM_BUG_ON(INTEL_GEN(i915) < 9); > > + > > + /* > > + * WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml > > + * Before any MMIO read into slice/subslice specific registers, MCR > > + * packet control register needs to be programmed to point to any > > + * enabled s/ss pair. Otherwise, incorrect values will be returned. > > + * This means each subsequent MMIO read will be forwarded to an > > + * specific s/ss combination, but this is OK since these registers > > + * are consistent across s/ss in almost all cases. In the rare > > + * occasions, such as INSTDONE, where this value is dependent > > + * on s/ss combo, the read should be done with read_subslice_reg. > > + */ > > + slice = fls(sseu->slice_mask) - 1; > > + GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask)); > > + subslice = fls(intel_sseu_get_subslices(sseu, slice)); > > + GEM_BUG_ON(!subslice); > > + subslice--; > > + > > + mcr = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice); > > + mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK; > > + > > + drm_dbg(&i915->drm, "MCR slice:%d/subslice:%d = %x\n", slice, subslice, mcr); > > + > > + wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr); > > +} > > Have you considered reusing existing wa_init_mcr? Just needs the > top-level assert changed and otherwise it looks it would do the right > thing for Gen9. Advantage being a smaller patch and less code to carry. That was the first patch, and fails for the same reason. The problem would appear to be in the mcr_mask for gt3. -Chris
Quoting Chris Wilson (2021-03-04 09:19:24) > Quoting Tvrtko Ursulin (2021-03-04 09:12:26) > > > > On 02/03/2021 06:27, Cooper Chiou wrote: > > > WaProgramMgsrForCorrectSliceSpecificMmioReads applies for Gen9 to > > > resolve VP8 hardware encoding system hang up on GT1 sku for > > > ChromiumOS projects > > > > > > Slice specific MMIO read inaccurate so MGSR needs to be programmed > > > appropriately to get correct reads from these slicet-related MMIOs. > > > > > > It dictates that before any MMIO read into Slice/Subslice specific > > > registers, MCR packet control register(0xFDC) needs to be programmed > > > to point to any enabled slice/subslice pair, especially GT1 fused sku > > > since this issue can be reproduced on VP8 hardware encoding via ffmpeg > > > on ChromiumOS devices. > > > When exit PC7, MGSR will reset so that we have to skip fused subslice ID. > > > > > > Reference: HSD#1508045018,1405586840, BSID#0575 > > > > > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > > > Cc: Jani Nikula <jani.nikula@intel.com> > > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> > > > Cc: William Tseng <william.tseng@intel.com> > > > Cc: Lee Shawn C <shawn.c.lee@intel.com> > > > > > > Signed-off-by: Cooper Chiou <cooper.chiou@intel.com> > > > --- > > > drivers/gpu/drm/i915/gt/intel_workarounds.c | 38 +++++++++++++++++++++ > > > 1 file changed, 38 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c > > > index 3b4a7da60f0b..4ad598a727a6 100644 > > > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c > > > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c > > > @@ -878,9 +878,47 @@ hsw_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) > > > wa_write_clr(wal, GEN7_FF_THREAD_MODE, GEN7_FF_VS_REF_CNT_FFME); > > > } > > > > > > +static void > > > +gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal) > > > +{ > > > + const struct sseu_dev_info *sseu = &i915->gt.info.sseu; > > > + unsigned int slice, subslice; > > > + u32 mcr, mcr_mask; > > > + > > > + GEM_BUG_ON(INTEL_GEN(i915) < 9); > > > + > > > + /* > > > + * WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml > > > + * Before any MMIO read into slice/subslice specific registers, MCR > > > + * packet control register needs to be programmed to point to any > > > + * enabled s/ss pair. Otherwise, incorrect values will be returned. > > > + * This means each subsequent MMIO read will be forwarded to an > > > + * specific s/ss combination, but this is OK since these registers > > > + * are consistent across s/ss in almost all cases. In the rare > > > + * occasions, such as INSTDONE, where this value is dependent > > > + * on s/ss combo, the read should be done with read_subslice_reg. > > > + */ > > > + slice = fls(sseu->slice_mask) - 1; > > > + GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask)); > > > + subslice = fls(intel_sseu_get_subslices(sseu, slice)); > > > + GEM_BUG_ON(!subslice); > > > + subslice--; > > > + > > > + mcr = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice); > > > + mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK; > > > + > > > + drm_dbg(&i915->drm, "MCR slice:%d/subslice:%d = %x\n", slice, subslice, mcr); > > > + > > > + wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr); > > > +} > > > > Have you considered reusing existing wa_init_mcr? Just needs the > > top-level assert changed and otherwise it looks it would do the right > > thing for Gen9. Advantage being a smaller patch and less code to carry. > > That was the first patch, and fails for the same reason. The problem > would appear to be in the mcr_mask for gt3. For the record, it appears to be an issue with fls vs ffs. Switching to ffs also removes the warnings for workaround failures on ehl/jsl. -Chris
Quoting Chris Wilson (2021-03-04 11:56:16) > Quoting Chris Wilson (2021-03-04 09:19:24) > > Quoting Tvrtko Ursulin (2021-03-04 09:12:26) > > > > > > On 02/03/2021 06:27, Cooper Chiou wrote: > > > > WaProgramMgsrForCorrectSliceSpecificMmioReads applies for Gen9 to > > > > resolve VP8 hardware encoding system hang up on GT1 sku for > > > > ChromiumOS projects > > > > > > > > Slice specific MMIO read inaccurate so MGSR needs to be programmed > > > > appropriately to get correct reads from these slicet-related MMIOs. > > > > > > > > It dictates that before any MMIO read into Slice/Subslice specific > > > > registers, MCR packet control register(0xFDC) needs to be programmed > > > > to point to any enabled slice/subslice pair, especially GT1 fused sku > > > > since this issue can be reproduced on VP8 hardware encoding via ffmpeg > > > > on ChromiumOS devices. > > > > When exit PC7, MGSR will reset so that we have to skip fused subslice ID. > > > > > > > > Reference: HSD#1508045018,1405586840, BSID#0575 > > > > > > > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > > > > Cc: Jani Nikula <jani.nikula@intel.com> > > > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > > > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> > > > > Cc: William Tseng <william.tseng@intel.com> > > > > Cc: Lee Shawn C <shawn.c.lee@intel.com> > > > > > > > > Signed-off-by: Cooper Chiou <cooper.chiou@intel.com> > > > > --- > > > > drivers/gpu/drm/i915/gt/intel_workarounds.c | 38 +++++++++++++++++++++ > > > > 1 file changed, 38 insertions(+) > > > > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c > > > > index 3b4a7da60f0b..4ad598a727a6 100644 > > > > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c > > > > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c > > > > @@ -878,9 +878,47 @@ hsw_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) > > > > wa_write_clr(wal, GEN7_FF_THREAD_MODE, GEN7_FF_VS_REF_CNT_FFME); > > > > } > > > > > > > > +static void > > > > +gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal) > > > > +{ > > > > + const struct sseu_dev_info *sseu = &i915->gt.info.sseu; > > > > + unsigned int slice, subslice; > > > > + u32 mcr, mcr_mask; > > > > + > > > > + GEM_BUG_ON(INTEL_GEN(i915) < 9); > > > > + > > > > + /* > > > > + * WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml > > > > + * Before any MMIO read into slice/subslice specific registers, MCR > > > > + * packet control register needs to be programmed to point to any > > > > + * enabled s/ss pair. Otherwise, incorrect values will be returned. > > > > + * This means each subsequent MMIO read will be forwarded to an > > > > + * specific s/ss combination, but this is OK since these registers > > > > + * are consistent across s/ss in almost all cases. In the rare > > > > + * occasions, such as INSTDONE, where this value is dependent > > > > + * on s/ss combo, the read should be done with read_subslice_reg. > > > > + */ > > > > + slice = fls(sseu->slice_mask) - 1; > > > > + GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask)); > > > > + subslice = fls(intel_sseu_get_subslices(sseu, slice)); > > > > + GEM_BUG_ON(!subslice); > > > > + subslice--; > > > > + > > > > + mcr = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice); > > > > + mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK; > > > > + > > > > + drm_dbg(&i915->drm, "MCR slice:%d/subslice:%d = %x\n", slice, subslice, mcr); > > > > + > > > > + wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr); > > > > +} > > > > > > Have you considered reusing existing wa_init_mcr? Just needs the > > > top-level assert changed and otherwise it looks it would do the right > > > thing for Gen9. Advantage being a smaller patch and less code to carry. > > > > That was the first patch, and fails for the same reason. The problem > > would appear to be in the mcr_mask for gt3. > > For the record, it appears to be an issue with fls vs ffs. Switching to > ffs also removes the warnings for workaround failures on ehl/jsl. Of course the icl in farm2 started spewing warnigns, but the other icl in farm1 were happy. -Chris
On 05/03/2021 00:53, Chris Wilson wrote: > Quoting Chris Wilson (2021-03-04 11:56:16) >> Quoting Chris Wilson (2021-03-04 09:19:24) >>> Quoting Tvrtko Ursulin (2021-03-04 09:12:26) >>>> >>>> On 02/03/2021 06:27, Cooper Chiou wrote: >>>>> WaProgramMgsrForCorrectSliceSpecificMmioReads applies for Gen9 to >>>>> resolve VP8 hardware encoding system hang up on GT1 sku for >>>>> ChromiumOS projects >>>>> >>>>> Slice specific MMIO read inaccurate so MGSR needs to be programmed >>>>> appropriately to get correct reads from these slicet-related MMIOs. >>>>> >>>>> It dictates that before any MMIO read into Slice/Subslice specific >>>>> registers, MCR packet control register(0xFDC) needs to be programmed >>>>> to point to any enabled slice/subslice pair, especially GT1 fused sku >>>>> since this issue can be reproduced on VP8 hardware encoding via ffmpeg >>>>> on ChromiumOS devices. >>>>> When exit PC7, MGSR will reset so that we have to skip fused subslice ID. >>>>> >>>>> Reference: HSD#1508045018,1405586840, BSID#0575 >>>>> >>>>> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> >>>>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> >>>>> Cc: Jani Nikula <jani.nikula@intel.com> >>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk> >>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> >>>>> Cc: William Tseng <william.tseng@intel.com> >>>>> Cc: Lee Shawn C <shawn.c.lee@intel.com> >>>>> >>>>> Signed-off-by: Cooper Chiou <cooper.chiou@intel.com> >>>>> --- >>>>> drivers/gpu/drm/i915/gt/intel_workarounds.c | 38 +++++++++++++++++++++ >>>>> 1 file changed, 38 insertions(+) >>>>> >>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c >>>>> index 3b4a7da60f0b..4ad598a727a6 100644 >>>>> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c >>>>> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c >>>>> @@ -878,9 +878,47 @@ hsw_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) >>>>> wa_write_clr(wal, GEN7_FF_THREAD_MODE, GEN7_FF_VS_REF_CNT_FFME); >>>>> } >>>>> >>>>> +static void >>>>> +gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal) >>>>> +{ >>>>> + const struct sseu_dev_info *sseu = &i915->gt.info.sseu; >>>>> + unsigned int slice, subslice; >>>>> + u32 mcr, mcr_mask; >>>>> + >>>>> + GEM_BUG_ON(INTEL_GEN(i915) < 9); >>>>> + >>>>> + /* >>>>> + * WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml >>>>> + * Before any MMIO read into slice/subslice specific registers, MCR >>>>> + * packet control register needs to be programmed to point to any >>>>> + * enabled s/ss pair. Otherwise, incorrect values will be returned. >>>>> + * This means each subsequent MMIO read will be forwarded to an >>>>> + * specific s/ss combination, but this is OK since these registers >>>>> + * are consistent across s/ss in almost all cases. In the rare >>>>> + * occasions, such as INSTDONE, where this value is dependent >>>>> + * on s/ss combo, the read should be done with read_subslice_reg. >>>>> + */ >>>>> + slice = fls(sseu->slice_mask) - 1; >>>>> + GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask)); >>>>> + subslice = fls(intel_sseu_get_subslices(sseu, slice)); >>>>> + GEM_BUG_ON(!subslice); >>>>> + subslice--; >>>>> + >>>>> + mcr = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice); >>>>> + mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK; >>>>> + >>>>> + drm_dbg(&i915->drm, "MCR slice:%d/subslice:%d = %x\n", slice, subslice, mcr); >>>>> + >>>>> + wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr); >>>>> +} >>>> >>>> Have you considered reusing existing wa_init_mcr? Just needs the >>>> top-level assert changed and otherwise it looks it would do the right >>>> thing for Gen9. Advantage being a smaller patch and less code to carry. >>> >>> That was the first patch, and fails for the same reason. The problem >>> would appear to be in the mcr_mask for gt3. >> >> For the record, it appears to be an issue with fls vs ffs. Switching to >> ffs also removes the warnings for workaround failures on ehl/jsl. > > Of course the icl in farm2 started spewing warnigns, but the other icl > in farm1 were happy. It figures yes, now I remember it was the shards which had a mix of ICL GT1/2 with different fusing which were exhibiting odd behaviour. I have old patches around which a) try to program each WA in unicast mode (for all present ss), b) do verification for each present ss. Idea being to see if there are any patterns as to what writes land and which get lost. I don't think the results were conclusive back then but maybe I can try again. I am not sure if PC8 and DMC could also be involved from what Cooper was saying in a different thread. Maybe another CI run without the DMC, both ffs and fls. Another for limiting cstates. Regards, Tvrtko
Quoting Tvrtko Ursulin (2021-03-05 09:23:02) > I am not sure if PC8 and DMC could also be involved from what Cooper was > saying in a different thread. Maybe another CI run without the DMC, both > ffs and fls. Another for limiting cstates. Disabling the dmc leaves the display code in an inconsistent state so we don't complete a BAT run; but since the warnings are thrown during boot we can say that disabling the dmc does clear up the workaround issues on ehl/jsl: https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7612/fi-ehl-2/boot0.txt https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7612/fi-jsl-1/boot0.txt -Chris
Quoting Chris Wilson (2021-03-05 12:20:45) > Quoting Tvrtko Ursulin (2021-03-05 09:23:02) > > I am not sure if PC8 and DMC could also be involved from what Cooper was > > saying in a different thread. Maybe another CI run without the DMC, both > > ffs and fls. Another for limiting cstates. > > Disabling the dmc leaves the display code in an inconsistent state so we > don't complete a BAT run; but since the warnings are thrown during boot > we can say that disabling the dmc does clear up the workaround issues on > ehl/jsl: > > https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7612/fi-ehl-2/boot0.txt > https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7612/fi-jsl-1/boot0.txt Fwiw, disabling the dmc and using fls for gen9 is not enough to avoid the warnings though: https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7614/fi-cfl-8109u/igt@i915_selftest@live@memory_region.html So far only ffs works for gen9 mcr. -Chris
After switched to ffs from fls in "patch v5"(https://patchwork.freedesktop.org/series/81764/#rev7), now CI result is PASS no regression in wa_verify warning. @Chen, Rong Could you please run “phoronix-test-suite.supertuxkart.1024x768.Fullscreen.Ultimate.1.GranParadisoIsland.frames_per_second” with this latest patch v5 on test box to see if any performance impact. Thanks, Best Regards, Cooper > Quoting Tvrtko Ursulin (2021-03-05 09:23:02) > > I am not sure if PC8 and DMC could also be involved from what Cooper > > was saying in a different thread. Maybe another CI run without the > > DMC, both ffs and fls. Another for limiting cstates. > > Disabling the dmc leaves the display code in an inconsistent state so we don't > complete a BAT run; but since the warnings are thrown during boot we can > say that disabling the dmc does clear up the workaround issues on > ehl/jsl: > > https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7612/fi-ehl-2/boot0.txt > https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7612/fi-jsl-1/boot0.txt > -Chris
On 05/03/2021 15:24, Chiou, Cooper wrote: > After switched to ffs from fls in "patch v5"(https://patchwork.freedesktop.org/series/81764/#rev7), now CI result is PASS no regression in wa_verify warning. > > @Chen, Rong > Could you please run “phoronix-test-suite.supertuxkart.1024x768.Fullscreen.Ultimate.1.GranParadisoIsland.frames_per_second” with this latest patch v5 on test box to see if any performance impact. We need testing on more that one box I'm afraid. Need to cover different fusing configs of Gen9 with and without the patch. I don't have any useful ideas on how to do it though. :( Regards, Tvrtko
Hi Rong, Please help to trigger 3D performance test on several Gen9 CI test boxes which different fusing sku with/without “patch v5”. Thanks, Best Regards, Cooper > On Mar 6, 2021, at 12:01 AM, Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote: > > >> On 05/03/2021 15:24, Chiou, Cooper wrote: >> After switched to ffs from fls in "patch v5"(https://patchwork.freedesktop.org/series/81764/#rev7), now CI result is PASS no regression in wa_verify warning. >> @Chen, Rong >> Could you please run “phoronix-test-suite.supertuxkart.1024x768.Fullscreen.Ultimate.1.GranParadisoIsland.frames_per_second” with this latest patch v5 on test box to see if any performance impact. > > We need testing on more that one box I'm afraid. Need to cover different fusing configs of Gen9 with and without the patch. I don't have any useful ideas on how to do it though. :( > > Regards, > > Tvrtko
I've tested on GLK, KBL, CFL Intel NUC devices and got the following performance results, there is no performance regression per my testing. Patch: [v5] drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9 Test suite: phoronix-test-suite.supertuxkart.1024x768.Fullscreen.Ultimate.1.GranParadisoIsland.frames_per_second Kernel version: 5.12.0-rc1 (drm-tip) a. Device: Intel NUC kit NUC7JY Gemini Lake Celeron J4005 @2.7GHz (2 Cores) Without patch, fps=57.45 With patch, fps=57.49 b. Device: Intel NUC kit NUC8BEH Coffee Lake Core i3-8109U @3.6GHz(4 Cores) Without patch, fps=117.23 With patch, fps=117.27 c. Device: Intel NUC kit NUC7i3BNH Kaby Lake Core i3-7100U @2.4GHz(4 Cores) Without patch, fps=114.05 With patch, fps=114.34 Meanwhile, Intel lkp team has validated performance on lkp-kbl-nuc1 and no regression. f69d02e37a85645a d912096c40cdc3bc9364966971 testcase/testparams/testbox ---------------- -------------------------- --------------------------- %stddev change %stddev \ | \ 29.79 29.67 phoronix-test-suite/performance-true-Fullscreen-Ultimate-1-Gran_Paradiso_Island__Approxima-supertuxkart-1.5.2-ucode=0xde/lkp-kbl-nuc1 29.79 29.67 GEO-MEAN phoronix-test-suite.supertuxkart.1280x1024.Fullscreen.Ultimate.1.GranParadisoIsland.frames_per_second Best Regards, Cooper > We need testing on more that one box I'm afraid. Need to cover different > fusing configs of Gen9 with and without the patch. I don't have any useful > ideas on how to do it though. :( > > Regards, > > Tvrtko
Hi, On 08/03/2021 17:32, Chiou, Cooper wrote: > I've tested on GLK, KBL, CFL Intel NUC devices and got the following performance results, there is no performance regression per my testing. > > Patch: [v5] drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9 > Test suite: phoronix-test-suite.supertuxkart.1024x768.Fullscreen.Ultimate.1.GranParadisoIsland.frames_per_second > Kernel version: 5.12.0-rc1 (drm-tip) > > a. Device: Intel NUC kit NUC7JY Gemini Lake Celeron J4005 @2.7GHz (2 Cores) > Without patch, fps=57.45 > With patch, fps=57.49 > b. Device: Intel NUC kit NUC8BEH Coffee Lake Core i3-8109U @3.6GHz(4 Cores) > Without patch, fps=117.23 > With patch, fps=117.27 > c. Device: Intel NUC kit NUC7i3BNH Kaby Lake Core i3-7100U @2.4GHz(4 Cores) > Without patch, fps=114.05 > With patch, fps=114.34 > > Meanwhile, Intel lkp team has validated performance on lkp-kbl-nuc1 and no regression. > f69d02e37a85645a d912096c40cdc3bc9364966971 testcase/testparams/testbox > ---------------- -------------------------- --------------------------- > %stddev change %stddev > \ | \ > 29.79 29.67 > phoronix-test-suite/performance-true-Fullscreen-Ultimate-1-Gran_Paradiso_Island__Approxima-supertuxkart-1.5.2-ucode=0xde/lkp-kbl-nuc1 > 29.79 29.67 GEO-MEAN phoronix-test-suite.supertuxkart.1280x1024.Fullscreen.Ultimate.1.GranParadisoIsland.frames_per_second > CI results are green so that is good. Do the machines used for performance testing include unusual fusing? Worrying thing is that we were never able to reproduce the reported regression in house due lack of identical machine, right? Although I guess avoiding hangs trumps performance. Regards, Tvrtko
Quoting Tvrtko Ursulin (2021-03-10 10:19:12) > > Hi, > > On 08/03/2021 17:32, Chiou, Cooper wrote: > > I've tested on GLK, KBL, CFL Intel NUC devices and got the following performance results, there is no performance regression per my testing. > > > > Patch: [v5] drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9 > > Test suite: phoronix-test-suite.supertuxkart.1024x768.Fullscreen.Ultimate.1.GranParadisoIsland.frames_per_second > > Kernel version: 5.12.0-rc1 (drm-tip) > > > > a. Device: Intel NUC kit NUC7JY Gemini Lake Celeron J4005 @2.7GHz (2 Cores) > > Without patch, fps=57.45 > > With patch, fps=57.49 > > b. Device: Intel NUC kit NUC8BEH Coffee Lake Core i3-8109U @3.6GHz(4 Cores) > > Without patch, fps=117.23 > > With patch, fps=117.27 > > c. Device: Intel NUC kit NUC7i3BNH Kaby Lake Core i3-7100U @2.4GHz(4 Cores) > > Without patch, fps=114.05 > > With patch, fps=114.34 > > > > Meanwhile, Intel lkp team has validated performance on lkp-kbl-nuc1 and no regression. > > f69d02e37a85645a d912096c40cdc3bc9364966971 testcase/testparams/testbox > > ---------------- -------------------------- --------------------------- > > %stddev change %stddev > > \ | \ > > 29.79 29.67 > > phoronix-test-suite/performance-true-Fullscreen-Ultimate-1-Gran_Paradiso_Island__Approxima-supertuxkart-1.5.2-ucode=0xde/lkp-kbl-nuc1 > > 29.79 29.67 GEO-MEAN phoronix-test-suite.supertuxkart.1280x1024.Fullscreen.Ultimate.1.GranParadisoIsland.frames_per_second > > > > CI results are green so that is good. > > Do the machines used for performance testing include unusual fusing? > Worrying thing is that we were never able to reproduce the reported > regression in house due lack of identical machine, right? Although I > guess avoiding hangs trumps performance. The issue is that if the regression is reproducible it means that the broadcast mask is no longer correct (or never was, one or the other ;) And another w/a is going astray because it depends on the previous undefined value of the mcr. Which raises the question as to whether the hang prevention seen here is also because some other w/a (or other mmio) is not being applied to the relevant units. Or vice versa. Either way there remains an underlying issue in that some register writes for gen9 require mcr being set that were are not handling correctly. Changing the mask here changing results elsewhere indicate that the issues are fully addressed, and the fear that undoing some other mmio is going to introduce other subtle hangs. And we are all blindly poking at the issue as no one has access to the affected skus. What would be useful is if we print the value before changing it so that we can see if we have any machines in CI where we are making significant changes to the broadcast mask. -Chris
> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> > Hi, > > On 08/03/2021 17:32, Chiou, Cooper wrote: > > I've tested on GLK, KBL, CFL Intel NUC devices and got the following > performance results, there is no performance regression per my testing. > > > > Patch: [v5] drm/i915: Enable > > WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9 Test suite: > > phoronix-test-suite.supertuxkart.1024x768.Fullscreen.Ultimate.1.GranPa > > radisoIsland.frames_per_second > > Kernel version: 5.12.0-rc1 (drm-tip) > > > > a. Device: Intel NUC kit NUC7JY Gemini Lake Celeron J4005 @2.7GHz (2 > Cores) > > Without patch, fps=57.45 > > With patch, fps=57.49 > > b. Device: Intel NUC kit NUC8BEH Coffee Lake Core i3-8109U @3.6GHz(4 > Cores) > > Without patch, fps=117.23 > > With patch, fps=117.27 > > c. Device: Intel NUC kit NUC7i3BNH Kaby Lake Core i3-7100U @2.4GHz(4 > Cores) > > Without patch, fps=114.05 > > With patch, fps=114.34 > > > > Meanwhile, Intel lkp team has validated performance on lkp-kbl-nuc1 and > no regression. > > f69d02e37a85645a d912096c40cdc3bc9364966971 > > testcase/testparams/testbox > > ---------------- -------------------------- --------------------------- > > %stddev change %stddev > > \ | \ > > 29.79 29.67 > > phoronix-test-suite/performance-true-Fullscreen-Ultimate-1- > Gran_Paradiso_Island__Approxima-supertuxkart-1.5.2-ucode=0xde/lkp-kbl- > nuc1 > > 29.79 29.67 GEO-MEAN phoronix-test- > suite.supertuxkart.1280x1024.Fullscreen.Ultimate.1.GranParadisoIsland.fram > es_per_second > > > > CI results are green so that is good. > > Do the machines used for performance testing include unusual fusing? [Chiou, Cooper] Yes, this performance test included fusing sku as following NUC GLK Celeron Linux device, Intel NUC kit NUC7JY Gemini Lake Celeron J4005 @2.7GHz (2 Cores) Without patch, fps=57.45 With patch, fps=57.49 > Worrying thing is that we were never able to reproduce the reported regression in house due lack of identical machine, right? [Chiou, Cooper] Yes, if device is core-i3/5/7 GT2, then hang issue isn’t able to reproduce due to there is no fused/disabled subslice0. But VP8 hw encoding GPU hang issue can be 100% reproduced on CML/KBL Pentium/Celeron GT1 sku, explain root cause as the following, on Chrome CML Pentium 6450u GT1 sku, we observed when system exit PC7 power state, MGSR(0xFDC) is reset to 0x10000000, read its reg(fuse2:0x9120)=0x02988480 bit 20-23=1001 means subslice0/3 are fused disabled, it's defined in bspec as well. on CML core-i5 sku, reg(fuse2)=0x22889140 bit 20-23=1000 only subslice3 is fused. so we have to skip this "fused/disabled" subslice 0/3 on GT1 sku in i915, then use sublice 1 or 2 and re-program 0xFDC=0x11000000 or 0x12000000 since reg[0xfdc]=0x10000000 to use subslice0, but sublice0 is fused/disabled, so once go this fused disabled subslice0 then GPU hang happened. In some OEM Linux projects, they only have core-i sku no Pentium/Celeron sku as I knew and vp8 hw encoding is rare to use on Ubuntu, as this two reasons then Linux didn't reproduce this fused subslice0 gpu hang issue before, but it does happen on Chrome projects. Cooper > Worrying thing is that we were never able to reproduce the reported > regression in house due lack of identical machine, right? Although I guess > avoiding hangs trumps performance. > > Regards, > > Tvrtko
> The issue is that if the regression is reproducible it means that the broadcast > mask is no longer correct (or never was, one or the other ;) And another w/a > is going astray because it depends on the previous undefined value of the > mcr. [Chiou, Cooper] I think we might focus on this w/a WaProgramMgsrForCorrectSliceSpecificMmioReads, this w/a is not only for CNL/ICL/TGL, but need to be applied in Gen9 as well. I'm ok to add wa_init_mcr() in gen9_gt_workarounds_init() as patch v1 to make it simple, and this patch can resolve Chrome VP8 hw encoding issue and no performance regression as well. > > Which raises the question as to whether the hang prevention seen here is > also because some other w/a (or other mmio) is not being applied to the > relevant units. Or vice versa. > > Either way there remains an underlying issue in that some register writes for > gen9 require mcr being set that were are not handling correctly. Changing the > mask here changing results elsewhere indicate that the issues are fully > addressed, and the fear that undoing some other mmio is going to introduce > other subtle hangs. And we are all blindly poking at the issue as no one has > access to the affected skus. > > What would be useful is if we print the value before changing it so that we > can see if we have any machines in CI where we are making significant > changes to the broadcast mask. > -Chris
On 11/03/2021 01:27, Chiou, Cooper wrote: >> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> >> Hi, >> >> On 08/03/2021 17:32, Chiou, Cooper wrote: >>> I've tested on GLK, KBL, CFL Intel NUC devices and got the following >> performance results, there is no performance regression per my testing. >>> >>> Patch: [v5] drm/i915: Enable >>> WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9 Test suite: >>> phoronix-test-suite.supertuxkart.1024x768.Fullscreen.Ultimate.1.GranPa >>> radisoIsland.frames_per_second >>> Kernel version: 5.12.0-rc1 (drm-tip) >>> >>> a. Device: Intel NUC kit NUC7JY Gemini Lake Celeron J4005 @2.7GHz (2 >> Cores) >>> Without patch, fps=57.45 >>> With patch, fps=57.49 >>> b. Device: Intel NUC kit NUC8BEH Coffee Lake Core i3-8109U @3.6GHz(4 >> Cores) >>> Without patch, fps=117.23 >>> With patch, fps=117.27 >>> c. Device: Intel NUC kit NUC7i3BNH Kaby Lake Core i3-7100U @2.4GHz(4 >> Cores) >>> Without patch, fps=114.05 >>> With patch, fps=114.34 >>> >>> Meanwhile, Intel lkp team has validated performance on lkp-kbl-nuc1 and >> no regression. >>> f69d02e37a85645a d912096c40cdc3bc9364966971 >>> testcase/testparams/testbox >>> ---------------- -------------------------- --------------------------- >>> %stddev change %stddev >>> \ | \ >>> 29.79 29.67 >>> phoronix-test-suite/performance-true-Fullscreen-Ultimate-1- >> Gran_Paradiso_Island__Approxima-supertuxkart-1.5.2-ucode=0xde/lkp-kbl- >> nuc1 >>> 29.79 29.67 GEO-MEAN phoronix-test- >> suite.supertuxkart.1280x1024.Fullscreen.Ultimate.1.GranParadisoIsland.fram >> es_per_second >>> >> >> CI results are green so that is good. >> >> Do the machines used for performance testing include unusual fusing? > [Chiou, Cooper] Yes, this performance test included fusing sku as following NUC GLK Celeron Linux device, > Intel NUC kit NUC7JY Gemini Lake Celeron J4005 @2.7GHz (2 Cores) > Without patch, fps=57.45 > With patch, fps=57.49 I was referring to the original performance regression report which came from LKP which was on "i7-7567U" so Kabylake. Can yo find such machine to test on? >> Worrying thing is that we were never able to reproduce the reported regression in house due lack of identical machine, right? > [Chiou, Cooper] Yes, if device is core-i3/5/7 GT2, then hang issue isn’t able to reproduce due to there is no fused/disabled subslice0. > > But VP8 hw encoding GPU hang issue can be 100% reproduced on CML/KBL Pentium/Celeron GT1 sku, > explain root cause as the following, > on Chrome CML Pentium 6450u GT1 sku, we observed when system exit PC7 power state, MGSR(0xFDC) is reset to 0x10000000, > read its reg(fuse2:0x9120)=0x02988480 bit 20-23=1001 means subslice0/3 are fused disabled, it's defined in bspec as well. > on CML core-i5 sku, reg(fuse2)=0x22889140 bit 20-23=1000 only subslice3 is fused. > > so we have to skip this "fused/disabled" subslice 0/3 on GT1 sku in i915, then use sublice 1 or 2 and re-program 0xFDC=0x11000000 or 0x12000000 > since reg[0xfdc]=0x10000000 to use subslice0, but sublice0 is fused/disabled, so once go this fused disabled subslice0 then GPU hang happened. > > In some OEM Linux projects, they only have core-i sku no Pentium/Celeron sku as I knew and vp8 hw encoding is rare to use on Ubuntu, > as this two reasons then Linux didn't reproduce this fused subslice0 gpu hang issue before, but it does happen on Chrome projects. Understood on the hang part. Regards, Tvrtko
> I was referring to the original performance regression report which came > from LKP which was on "i7-7567U" so Kabylake. Can yo find such machine to > test on? [Chiou, Cooper] Yes, lkp-tests team has tested this patch on lkp-kbl-nuc1 (KBL i7-7567U) as the following mail thread and there is no performance regression per their confirmed. From: Chen, Rong A <rong.a.chen@intel.com> Sent: Thursday, March 4, 2021 9:56 PM Hi Cooper, There's no regression with this patch: d912096c40cdc drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9 f69d02e37a856 Merge tag 'misc-5.12-2021-03-02' of git://git.kernel.dk/linux-block f69d02e37a85645a d912096c40cdc3bc9364966971 testcase/testparams/testbox ---------------- -------------------------- --------------------------- %stddev change %stddev \ | \ 29.79 29.67 phoronix-test-suite/performance-true-Fullscreen-Ultimate-1-Gran_Paradiso_Island__Approxima-supertuxkart-1.5.2-ucode=0xde/lkp-kbl-nuc1 29.79 29.67 GEO-MEAN phoronix-test-suite.supertuxkart.1280x1024.Fullscreen.Ultimate.1.GranParadisoIsland.frames_per_second
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 3b4a7da60f0b..4ad598a727a6 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -878,9 +878,47 @@ hsw_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) wa_write_clr(wal, GEN7_FF_THREAD_MODE, GEN7_FF_VS_REF_CNT_FFME); } +static void +gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal) +{ + const struct sseu_dev_info *sseu = &i915->gt.info.sseu; + unsigned int slice, subslice; + u32 mcr, mcr_mask; + + GEM_BUG_ON(INTEL_GEN(i915) < 9); + + /* + * WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml + * Before any MMIO read into slice/subslice specific registers, MCR + * packet control register needs to be programmed to point to any + * enabled s/ss pair. Otherwise, incorrect values will be returned. + * This means each subsequent MMIO read will be forwarded to an + * specific s/ss combination, but this is OK since these registers + * are consistent across s/ss in almost all cases. In the rare + * occasions, such as INSTDONE, where this value is dependent + * on s/ss combo, the read should be done with read_subslice_reg. + */ + slice = fls(sseu->slice_mask) - 1; + GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask)); + subslice = fls(intel_sseu_get_subslices(sseu, slice)); + GEM_BUG_ON(!subslice); + subslice--; + + mcr = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice); + mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK; + + drm_dbg(&i915->drm, "MCR slice:%d/subslice:%d = %x\n", slice, subslice, mcr); + + wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr); +} + static void gen9_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) { + /* WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml */ + if (!IS_COFFEELAKE(i915)) + gen9_wa_init_mcr(i915, wal); + /* WaDisableKillLogic:bxt,skl,kbl */ if (!IS_COFFEELAKE(i915) && !IS_COMETLAKE(i915)) wa_write_or(wal,
WaProgramMgsrForCorrectSliceSpecificMmioReads applies for Gen9 to resolve VP8 hardware encoding system hang up on GT1 sku for ChromiumOS projects Slice specific MMIO read inaccurate so MGSR needs to be programmed appropriately to get correct reads from these slicet-related MMIOs. It dictates that before any MMIO read into Slice/Subslice specific registers, MCR packet control register(0xFDC) needs to be programmed to point to any enabled slice/subslice pair, especially GT1 fused sku since this issue can be reproduced on VP8 hardware encoding via ffmpeg on ChromiumOS devices. When exit PC7, MGSR will reset so that we have to skip fused subslice ID. Reference: HSD#1508045018,1405586840, BSID#0575 Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: William Tseng <william.tseng@intel.com> Cc: Lee Shawn C <shawn.c.lee@intel.com> Signed-off-by: Cooper Chiou <cooper.chiou@intel.com> --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 38 +++++++++++++++++++++ 1 file changed, 38 insertions(+)