Message ID | 20191113151956.32242-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915/gt: Invalidate as we write the gen7 breadcrumb |
Chris Wilson <chris@chris-wilson.co.uk> writes:

> Still the saga of the hsw live_blt incoherency continues. While it did
> seem that the invalidate before the breadcrumb had improved the mtbf,
> nevertheless live_blt still failed. Mika's next idea was to pull the
> invalidate-stall into the breadcrumb write itself.
>
> References: 860afa086841 ("drm/i915/gt: Flush gen7 even harder")
> References: https://bugs.freedesktop.org/show_bug.cgi?id=112147
> Testcase: igt/i915_selftest/live_blt
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_ring_submission.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> index e8bee44add34..f25ceccb335e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> @@ -454,12 +454,8 @@ static u32 *gen7_xcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
>  	GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma);
>  	GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
>
> -	*cs++ = (MI_FLUSH_DW | MI_INVALIDATE_TLB |
> -		 MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW);
> -	*cs++ = I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT;
> -	*cs++ = 0;
> -
> -	*cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
> +	*cs++ = MI_FLUSH_DW | MI_INVALIDATE_TLB |
> +		MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
>  	*cs++ = I915_GEM_HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT;
>  	*cs++ = rq->fence.seqno;

In both would have been the shotgun approach. You favour sniper.

Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

>
> @@ -474,6 +470,7 @@ static u32 *gen7_xcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
>  	*cs++ = 0;
>
>  	*cs++ = MI_USER_INTERRUPT;
> +	*cs++ = MI_NOOP;
>
>  	rq->tail = intel_ring_offset(rq, cs);
>  	assert_ring_tail_valid(rq->ring, rq->tail);
> --
> 2.24.0

Quoting Mika Kuoppala (2019-11-13 15:59:53)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > Still the saga of the hsw live_blt incoherency continues. While it did
> > seem that the invalidate before the breadcrumb had improved the mtbf,
> > nevertheless live_blt still failed. Mika's next idea was to pull the
> > invalidate-stall into the breadcrumb write itself.
> >
> > References: 860afa086841 ("drm/i915/gt: Flush gen7 even harder")
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=112147
> > Testcase: igt/i915_selftest/live_blt
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_ring_submission.c | 9 +++------
> >  1 file changed, 3 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> > index e8bee44add34..f25ceccb335e 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> > @@ -454,12 +454,8 @@ static u32 *gen7_xcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
> >  	GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma);
> >  	GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
> >
> > -	*cs++ = (MI_FLUSH_DW | MI_INVALIDATE_TLB |
> > -		 MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW);
> > -	*cs++ = I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT;
> > -	*cs++ = 0;
> > -
> > -	*cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
> > +	*cs++ = MI_FLUSH_DW | MI_INVALIDATE_TLB |
> > +		MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
> >  	*cs++ = I915_GEM_HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT;
> >  	*cs++ = rq->fence.seqno;
>
> In both would have been the shotgun approach. You favour sniper.

At the end of the day, we stop when live_blt stops failing -- and we
hope that it's in the simplest form when we forget about it :)
-Chris

diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index e8bee44add34..f25ceccb335e 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -454,12 +454,8 @@ static u32 *gen7_xcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 	GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma);
 	GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
 
-	*cs++ = (MI_FLUSH_DW | MI_INVALIDATE_TLB |
-		 MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW);
-	*cs++ = I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT;
-	*cs++ = 0;
-
-	*cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
+	*cs++ = MI_FLUSH_DW | MI_INVALIDATE_TLB |
+		MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
 	*cs++ = I915_GEM_HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT;
 	*cs++ = rq->fence.seqno;
 
@@ -474,6 +470,7 @@ static u32 *gen7_xcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 	*cs++ = 0;
 
 	*cs++ = MI_USER_INTERRUPT;
+	*cs++ = MI_NOOP;
 
 	rq->tail = intel_ring_offset(rq, cs);
 	assert_ring_tail_valid(rq->ring, rq->tail);

Still the saga of the hsw live_blt incoherency continues. While it did
seem that the invalidate before the breadcrumb had improved the mtbf,
nevertheless live_blt still failed. Mika's next idea was to pull the
invalidate-stall into the breadcrumb write itself.

References: 860afa086841 ("drm/i915/gt: Flush gen7 even harder")
References: https://bugs.freedesktop.org/show_bug.cgi?id=112147
Testcase: igt/i915_selftest/live_blt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_ring_submission.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)
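
For readers following the dword accounting, here is a minimal standalone sketch of the two emission sequences touched by the patch. It is not the driver code. The macro values are placeholder encodings chosen for illustration, `HWS_SCRATCH_ADDR` and `HWS_SEQNO_ADDR` are hypothetical stand-ins for the `I915_GEM_HWS_*` offsets, and the writes that sit between the two hunks are left out because the diff does not show them. The reading that `MI_NOOP` is there to keep the emitted dword count even, so that the ring tail stays qword-aligned, is an assumption; the thread does not state it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Placeholder encodings for illustration only (assumption: these are not
 * taken from the thread; the real definitions live in the i915 headers).
 */
#define MI_NOOP			0x00000000u
#define MI_USER_INTERRUPT	(0x02u << 23)
#define MI_FLUSH_DW		((0x26u << 23) | 1u)
#define MI_INVALIDATE_TLB	(1u << 18)
#define MI_FLUSH_DW_STORE_INDEX	(1u << 21)
#define MI_FLUSH_DW_OP_STOREDW	(1u << 14)
#define MI_FLUSH_DW_USE_GTT	(1u << 2)
#define HWS_SCRATCH_ADDR	0x200u	/* hypothetical stand-in */
#define HWS_SEQNO_ADDR		0x100u	/* hypothetical stand-in */

/* Before the patch: invalidate via a scratch write, then write the seqno. */
static uint32_t *emit_old(uint32_t *cs, uint32_t seqno)
{
	*cs++ = MI_FLUSH_DW | MI_INVALIDATE_TLB |
		MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW;
	*cs++ = HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT;
	*cs++ = 0;

	*cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
	*cs++ = HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT;
	*cs++ = seqno;

	/* ...writes between the two hunks omitted from this sketch... */

	*cs++ = MI_USER_INTERRUPT;
	return cs;
}

/* After the patch: the invalidate rides on the seqno write itself. */
static uint32_t *emit_new(uint32_t *cs, uint32_t seqno, int pad)
{
	*cs++ = MI_FLUSH_DW | MI_INVALIDATE_TLB |
		MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
	*cs++ = HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT;
	*cs++ = seqno;

	/* ...writes between the two hunks omitted from this sketch... */

	*cs++ = MI_USER_INTERRUPT;
	if (pad)
		*cs++ = MI_NOOP;	/* the dword added by the second hunk */
	return cs;
}

/* True if two sequences differ by an even number of dwords. */
static int same_parity(ptrdiff_t a, ptrdiff_t b)
{
	return ((a - b) % 2) == 0;
}
```

With these helpers, the old sequence emits 7 dwords in the shown portion and the new one emits 4 without the pad or 5 with it. Dropping one three-dword packet alone would flip the parity of the total; the extra `MI_NOOP` restores it, which fits the `assert_ring_tail_valid()` call that follows in the patch context.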