Message ID | 20180615190605.16238-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
On Friday, June 15, 2018 12:06:05 PM PDT Chris Wilson wrote:
> From: Kenneth Graunke <kenneth@whitecape.org>
>
> The SF and clipper units mishandle the provoking vertex in some cases,
> which can cause misrendering with shaders that use flat shaded inputs.
>
> There are chicken bits in 3D_CHICKEN3 (for SF) and FF_SLICE_CHICKEN
> (for the clipper) that work around the issue. These registers are
> unfortunately not part of the logical context (even the power context),
> and so we must reload them every time we start executing in a context.
>
> Bugzilla: https://bugs.freedesktop.org/103047
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>
> This is only being set for gen9 right now, do we need it for gen10+?

The 3D_CHICKEN3 bit is labeled as SKL and KBL. The FF_SLICE_CHICKEN
bit is labeled as SKL, with no mention of KBL...but the Windows driver
appears to set both on all Gen 9.

It doesn't look like it sets them on Gen 10, nor is it documented to be
necessary. (I haven't brought up a Cannonlake to test it, though...)

It looks like some of these registers are gone or reworked on Icelake,
so Gen9-only sounds good to me.

> Ken, I also need your s-o-b.

Oops, sorry...got out of the habit of adding that...

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>

Feel free to take authorship if you prefer - you've done much more work
on it than I have at this point. :) Your call.

Thanks for fixing it up and getting it landed, by the way! It's a
bummer that we have to do it here...but it seems like the only place
that it can realistically happen, with the power context issue.

It would be nice to Cc this to stable, but it seems like a lot of the
workaround code has changed in the meantime, so not sure how well
it'd apply...

> An open question is how to record such w/a for easy checking from
> userspace. Though something like this might be better as part of an
> independent verification tool.
> -Chris
> ---
>  drivers/gpu/drm/i915/i915_reg.h  |  5 +++++
>  drivers/gpu/drm/i915/intel_lrc.c | 12 +++++++++++-
>  2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index b8c0ebd50889..54ec7ab57ce8 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -2432,12 +2432,17 @@ enum i915_power_well_id {
>  #define _3D_CHICKEN _MMIO(0x2084)
>  #define _3D_CHICKEN_HIZ_PLANE_DISABLE_MSAA_4X_SNB (1 << 10)
>  #define _3D_CHICKEN2 _MMIO(0x208c)
> +
> +#define FF_SLICE_CHICKEN _MMIO(0x2088)
> +#define FF_SLICE_CHICKEN_CL_PROVOKING_VERTEX_FIX (1 << 1)
> +
>  /* Disables pipelining of read flushes past the SF-WIZ interface.
>   * Required on all Ironlake steppings according to the B-Spec, but the
>   * particular danger of not doing so is not specified.
>   */
>  # define _3D_CHICKEN2_WM_READ_PIPELINED (1 << 14)
>  #define _3D_CHICKEN3 _MMIO(0x2090)
> +#define _3D_CHICKEN_SF_PROVOKING_VERTEX_FIX (1 << 12)
>  #define _3D_CHICKEN_SF_DISABLE_OBJEND_CULL (1 << 10)
>  #define _3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE (1 << 5)
>  #define _3D_CHICKEN3_SF_DISABLE_FASTCLIP_CULL (1 << 5)
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 839cb1fc6a01..2b0ae552cc4e 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1575,11 +1575,21 @@ static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
>  	/* WaFlushCoherentL3CacheLinesAtContextSwitch:skl,bxt,glk */
>  	batch = gen8_emit_flush_coherentl3_wa(engine, batch);
>
> +	*batch++ = MI_LOAD_REGISTER_IMM(3);
> +
>  	/* WaDisableGatherAtSetShaderCommonSlice:skl,bxt,kbl,glk */
> -	*batch++ = MI_LOAD_REGISTER_IMM(1);
>  	*batch++ = i915_mmio_reg_offset(COMMON_SLICE_CHICKEN2);
>  	*batch++ = _MASKED_BIT_DISABLE(
>  			GEN9_DISABLE_GATHER_AT_SET_SHADER_COMMON_SLICE);
> +
> +	/* BSpec: 11391 */
> +	*batch++ = i915_mmio_reg_offset(FF_SLICE_CHICKEN);
> +	*batch++ = _MASKED_BIT_ENABLE(FF_SLICE_CHICKEN_CL_PROVOKING_VERTEX_FIX);
> +
> +	/* BSpec: 11299 */
> +	*batch++ = i915_mmio_reg_offset(_3D_CHICKEN3);
> +	*batch++ = _MASKED_BIT_ENABLE(_3D_CHICKEN_SF_PROVOKING_VERTEX_FIX);
> +
>  	*batch++ = MI_NOOP;
>
>  	/* WaClearSlmSpaceAtContextSwitch:kbl */
>
Quoting Chris Wilson (2018-06-15 22:06:05)
> From: Kenneth Graunke <kenneth@whitecape.org>
>
> The SF and clipper units mishandle the provoking vertex in some cases,
> which can cause misrendering with shaders that use flat shaded inputs.
>
> There are chicken bits in 3D_CHICKEN3 (for SF) and FF_SLICE_CHICKEN
> (for the clipper) that work around the issue. These registers are
> unfortunately not part of the logical context (even the power context),
> and so we must reload them every time we start executing in a context.
>
> Bugzilla: https://bugs.freedesktop.org/103047
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

One note below.

> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1575,11 +1575,21 @@ static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
>  	/* WaFlushCoherentL3CacheLinesAtContextSwitch:skl,bxt,glk */
>  	batch = gen8_emit_flush_coherentl3_wa(engine, batch);
>
> +	*batch++ = MI_LOAD_REGISTER_IMM(3);
> +
>  	/* WaDisableGatherAtSetShaderCommonSlice:skl,bxt,kbl,glk */
> -	*batch++ = MI_LOAD_REGISTER_IMM(1);
>  	*batch++ = i915_mmio_reg_offset(COMMON_SLICE_CHICKEN2);
>  	*batch++ = _MASKED_BIT_DISABLE(
>  			GEN9_DISABLE_GATHER_AT_SET_SHADER_COMMON_SLICE);
> +
> +	/* BSpec: 11391 */
> +	*batch++ = i915_mmio_reg_offset(FF_SLICE_CHICKEN);
> +	*batch++ = _MASKED_BIT_ENABLE(FF_SLICE_CHICKEN_CL_PROVOKING_VERTEX_FIX);
> +
> +	/* BSpec: 11299 */
> +	*batch++ = i915_mmio_reg_offset(_3D_CHICKEN3);
> +	*batch++ = _MASKED_BIT_ENABLE(_3D_CHICKEN_SF_PROVOKING_VERTEX_FIX);

I'm almost betting that somebody will take one of these 3 off without
noticing the distant LOAD_REGISTER_IMM(). To perfect this, const table
of pairs and use ARRAY_SIZE()? Then we're also one step closer to the
const W/A tables...

Regards, Joonas
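
For illustration only, the "const table of pairs and use ARRAY_SIZE()" idea from the review above could look roughly like the sketch below. It is a standalone approximation, not the follow-up that was actually posted: `struct lri_pair`, `gen9_ctx_wa` and `emit_lri_table()` are hypothetical names, the macros are simplified stand-ins for the i915 ones, and the `COMMON_SLICE_CHICKEN2` offset and gather bit are assumed values. Only the FF_SLICE_CHICKEN and _3D_CHICKEN3 offsets and bits come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the i915 macros; the encodings are assumptions. */
#define ARRAY_SIZE(a)            (sizeof(a) / sizeof((a)[0]))
#define MI_INSTR(opcode, flags)  (((uint32_t)(opcode) << 23) | (flags))
#define MI_NOOP                  MI_INSTR(0, 0)
#define MI_LOAD_REGISTER_IMM(n)  MI_INSTR(0x22, 2 * (n) - 1)
#define MASKED_BIT_ENABLE(b)     (((uint32_t)(b) << 16) | (b))
#define MASKED_BIT_DISABLE(b)    ((uint32_t)(b) << 16)

/* Hypothetical pair type: a register offset and the value to load into it. */
struct lri_pair {
	uint32_t reg;
	uint32_t value;
};

static const struct lri_pair gen9_ctx_wa[] = {
	/* WaDisableGatherAtSetShaderCommonSlice (offset and bit assumed) */
	{ 0x7014, MASKED_BIT_DISABLE(1u << 12) },
	/* FF_SLICE_CHICKEN: clipper provoking vertex fix (from the patch) */
	{ 0x2088, MASKED_BIT_ENABLE(1u << 1) },
	/* _3D_CHICKEN3: SF provoking vertex fix (from the patch) */
	{ 0x2090, MASKED_BIT_ENABLE(1u << 12) },
};

/*
 * Emit one LRI header whose count is taken from the table size, then every
 * (offset, value) pair, then a trailing NOOP as in the open-coded version.
 */
static uint32_t *emit_lri_table(uint32_t *batch,
				const struct lri_pair *tbl, size_t count)
{
	size_t i;

	/* Assumed bound on how many pairs one LRI header can describe. */
	assert(count > 0 && count < 64);

	*batch++ = MI_LOAD_REGISTER_IMM((uint32_t)count);
	for (i = 0; i < count; i++) {
		*batch++ = tbl[i].reg;
		*batch++ = tbl[i].value;
	}
	*batch++ = MI_NOOP;

	return batch;
}
```

A call such as `batch = emit_lri_table(batch, gen9_ctx_wa, ARRAY_SIZE(gen9_ctx_wa));` would then stand in for the open-coded sequence. Because the header count is derived from the table, adding or dropping a workaround only touches the table and cannot fall out of step with a distant LOAD_REGISTER_IMM().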
Quoting Joonas Lahtinen (2018-06-18 10:03:24)
> Quoting Chris Wilson (2018-06-15 22:06:05)
> > From: Kenneth Graunke <kenneth@whitecape.org>
> >
> > The SF and clipper units mishandle the provoking vertex in some cases,
> > which can cause misrendering with shaders that use flat shaded inputs.
> >
> > There are chicken bits in 3D_CHICKEN3 (for SF) and FF_SLICE_CHICKEN
> > (for the clipper) that work around the issue. These registers are
> > unfortunately not part of the logical context (even the power context),
> > and so we must reload them every time we start executing in a context.
> >
> > Bugzilla: https://bugs.freedesktop.org/103047
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Thanks, fingers crossed we don't discover a reason why we shouldn't do
this for all contexts... Pushed.

> One note below.
>
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -1575,11 +1575,21 @@ static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
> >  	/* WaFlushCoherentL3CacheLinesAtContextSwitch:skl,bxt,glk */
> >  	batch = gen8_emit_flush_coherentl3_wa(engine, batch);
> >
> > +	*batch++ = MI_LOAD_REGISTER_IMM(3);
> > +
> >  	/* WaDisableGatherAtSetShaderCommonSlice:skl,bxt,kbl,glk */
> > -	*batch++ = MI_LOAD_REGISTER_IMM(1);
> >  	*batch++ = i915_mmio_reg_offset(COMMON_SLICE_CHICKEN2);
> >  	*batch++ = _MASKED_BIT_DISABLE(
> >  			GEN9_DISABLE_GATHER_AT_SET_SHADER_COMMON_SLICE);
> > +
> > +	/* BSpec: 11391 */
> > +	*batch++ = i915_mmio_reg_offset(FF_SLICE_CHICKEN);
> > +	*batch++ = _MASKED_BIT_ENABLE(FF_SLICE_CHICKEN_CL_PROVOKING_VERTEX_FIX);
> > +
> > +	/* BSpec: 11299 */
> > +	*batch++ = i915_mmio_reg_offset(_3D_CHICKEN3);
> > +	*batch++ = _MASKED_BIT_ENABLE(_3D_CHICKEN_SF_PROVOKING_VERTEX_FIX);
>
> I'm almost betting that somebody will take one of these 3 off without
> noticing the distant LOAD_REGISTER_IMM(). To perfect this, const table
> of pairs and use ARRAY_SIZE()? Then we're also one step closer to the
> const W/A tables...

I'll be back later to tidy this up.
-Chris
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index b8c0ebd50889..54ec7ab57ce8 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2432,12 +2432,17 @@ enum i915_power_well_id {
 #define _3D_CHICKEN _MMIO(0x2084)
 #define _3D_CHICKEN_HIZ_PLANE_DISABLE_MSAA_4X_SNB (1 << 10)
 #define _3D_CHICKEN2 _MMIO(0x208c)
+
+#define FF_SLICE_CHICKEN _MMIO(0x2088)
+#define FF_SLICE_CHICKEN_CL_PROVOKING_VERTEX_FIX (1 << 1)
+
 /* Disables pipelining of read flushes past the SF-WIZ interface.
  * Required on all Ironlake steppings according to the B-Spec, but the
  * particular danger of not doing so is not specified.
  */
 # define _3D_CHICKEN2_WM_READ_PIPELINED (1 << 14)
 #define _3D_CHICKEN3 _MMIO(0x2090)
+#define _3D_CHICKEN_SF_PROVOKING_VERTEX_FIX (1 << 12)
 #define _3D_CHICKEN_SF_DISABLE_OBJEND_CULL (1 << 10)
 #define _3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE (1 << 5)
 #define _3D_CHICKEN3_SF_DISABLE_FASTCLIP_CULL (1 << 5)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 839cb1fc6a01..2b0ae552cc4e 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1575,11 +1575,21 @@ static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
 	/* WaFlushCoherentL3CacheLinesAtContextSwitch:skl,bxt,glk */
 	batch = gen8_emit_flush_coherentl3_wa(engine, batch);
 
+	*batch++ = MI_LOAD_REGISTER_IMM(3);
+
 	/* WaDisableGatherAtSetShaderCommonSlice:skl,bxt,kbl,glk */
-	*batch++ = MI_LOAD_REGISTER_IMM(1);
 	*batch++ = i915_mmio_reg_offset(COMMON_SLICE_CHICKEN2);
 	*batch++ = _MASKED_BIT_DISABLE(
 			GEN9_DISABLE_GATHER_AT_SET_SHADER_COMMON_SLICE);
+
+	/* BSpec: 11391 */
+	*batch++ = i915_mmio_reg_offset(FF_SLICE_CHICKEN);
+	*batch++ = _MASKED_BIT_ENABLE(FF_SLICE_CHICKEN_CL_PROVOKING_VERTEX_FIX);
+
+	/* BSpec: 11299 */
+	*batch++ = i915_mmio_reg_offset(_3D_CHICKEN3);
+	*batch++ = _MASKED_BIT_ENABLE(_3D_CHICKEN_SF_PROVOKING_VERTEX_FIX);
+
 	*batch++ = MI_NOOP;
 
 	/* WaClearSlmSpaceAtContextSwitch:kbl */