Message ID | 20181031090243.31993-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | RFT drm/i915/execlists: Flush memory before signaling ELSQ | expand |
Chris Wilson <chris@chris-wilson.co.uk> writes: > We observe that the ordering of writes for a CS event is not as strong > from the GPU as we would like, and that on occasions we see the > ringbuffer tail updated before the event is written into the ringbuffer, > leading us to reuse the stale data. > > Through around a big hammer to try and batter ELSQ into submission with > the presumption that perhaps the UC mmio write is not flushing our > writes into the context images. > > References: https://bugs.freedesktop.org/show_bug.cgi?id=108315 > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > --- > drivers/gpu/drm/i915/intel_lrc.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > index 22b57b8926fc..ba61849fbb9b 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -454,8 +454,10 @@ static void execlists_submit_ports(struct intel_engine_cs *engine) > } > > /* we need to manually load the submit queue */ > - if (execlists->ctrl_reg) > + if (execlists->ctrl_reg) { > + wmb(); /* XXX Big hammer or paper? XXX */ > writel(EL_CTRL_LOAD, execlists->ctrl_reg); Will be difficult to know if it is the additional delay or not. The question is then why would it only be with ctrl_reg and not legacy submission mode. So lets try with legacy mode? -Mika > + } > > execlists_clear_active(execlists, EXECLISTS_ACTIVE_HWACK); > } > -- > 2.19.1 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Quoting Mika Kuoppala (2018-10-31 14:49:38) > Chris Wilson <chris@chris-wilson.co.uk> writes: > > > We observe that the ordering of writes for a CS event is not as strong > > from the GPU as we would like, and that on occasions we see the > > ringbuffer tail updated before the event is written into the ringbuffer, > > leading us to reuse the stale data. > > > > Through around a big hammer to try and batter ELSQ into submission with > > the presumption that perhaps the UC mmio write is not flushing our > > writes into the context images. > > > > References: https://bugs.freedesktop.org/show_bug.cgi?id=108315 > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > --- > > drivers/gpu/drm/i915/intel_lrc.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > > index 22b57b8926fc..ba61849fbb9b 100644 > > --- a/drivers/gpu/drm/i915/intel_lrc.c > > +++ b/drivers/gpu/drm/i915/intel_lrc.c > > @@ -454,8 +454,10 @@ static void execlists_submit_ports(struct intel_engine_cs *engine) > > } > > > > /* we need to manually load the submit queue */ > > - if (execlists->ctrl_reg) > > + if (execlists->ctrl_reg) { > > + wmb(); /* XXX Big hammer or paper? XXX */ > > writel(EL_CTRL_LOAD, execlists->ctrl_reg); > > Will be difficult to know if it is the additional delay or > not. Either way; its notourbug. > The question is then why would it only be with ctrl_reg and > not legacy submission mode. So lets try with legacy mode? Because it is definitely platform specific, and the only platform that shows sign of this is uses ELSQ. -Chris
Quoting Patchwork (2018-10-31 16:37:57) > == Series Details == > > Series: RFT drm/i915/execlists: Flush memory before signaling ELSQ > URL : https://patchwork.freedesktop.org/series/51796/ > State : success > > == Summary == > > = CI Bug Log - changes from CI_DRM_5062 -> Patchwork_10671 = > > == Summary - WARNING == > > Minor unknown changes coming with Patchwork_10671 need to be verified > manually. > > If you think the reported changes have nothing to do with the changes > introduced in Patchwork_10671, please notify your bug team to allow them > to document this new failure mode, which will reduce false positives in CI. > > External URL: https://patchwork.freedesktop.org/api/1.0/series/51796/revisions/1/mbox/ > > == Possible new issues == > > Here are the unknown changes that may have been introduced in Patchwork_10671: > > === IGT changes === > > ==== Warnings ==== > > igt@prime_vgem@basic-fence-flip: > fi-ivb-3520m: SKIP -> PASS > > > == Known issues == > > Here are the changes found in Patchwork_10671 that come from known issues: > > === IGT changes === > > ==== Issues hit ==== > > igt@drv_selftest@live_hangcheck: > fi-icl-u: PASS -> INCOMPLETE (fdo#108315) Hmm, that could be the failure we are looking for. Hard to tell as it might just have been the preceding error during reset. -Chris
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 22b57b8926fc..ba61849fbb9b 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -454,8 +454,10 @@ static void execlists_submit_ports(struct intel_engine_cs *engine) } /* we need to manually load the submit queue */ - if (execlists->ctrl_reg) + if (execlists->ctrl_reg) { + wmb(); /* XXX Big hammer or paper? XXX */ writel(EL_CTRL_LOAD, execlists->ctrl_reg); + } execlists_clear_active(execlists, EXECLISTS_ACTIVE_HWACK); }
We observe that the ordering of writes for a CS event is not as strong from the GPU as we would like, and that on occasions we see the ringbuffer tail updated before the event is written into the ringbuffer, leading us to reuse the stale data. Through around a big hammer to try and batter ELSQ into submission with the presumption that perhaps the UC mmio write is not flushing our writes into the context images. References: https://bugs.freedesktop.org/show_bug.cgi?id=108315 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/intel_lrc.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)