Message ID | 20180508163042.21825-2-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 05/08/2018 09:30 AM, Chris Wilson wrote: > We assume that the CSB is written using the normal ringbuffer > coherency protocols, as outlined in kernel/events/ring_buffer.c: > > * (HW) (DRIVER) > * > * if (LOAD ->data_tail) { LOAD ->data_head > * (A) smp_rmb() (C) > * STORE $data LOAD $data > * smp_wmb() (B) smp_mb() (D) > * STORE ->data_head STORE ->data_tail > * } > > So we assume that the HW fulfils its ordering requirements (B), and so > we should use a complimentary rmb (C) to ensure that our read of its > WRITE pointer is completed before we start accessing the data. > > The final mb (D) is implied by the uncached mmio we perform to inform > the HW of our READ pointer. > > References: https://bugs.freedesktop.org/show_bug.cgi?id=105064 > References: https://bugs.freedesktop.org/show_bug.cgi?id=105888 > References: https://bugs.freedesktop.org/show_bug.cgi?id=106185 > Fixes: 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD from the HWSP") > References: 61bf9719fa17 ("drm/i915/cnl: Use mmio access to context status buffer") > Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > Cc: Michał Winiarski <michal.winiarski@intel.com> > Cc: Rafael Antognolli <rafael.antognolli@intel.com> > Cc: Michel Thierry <michel.thierry@intel.com> > Cc: Timo Aaltonen <tjaalton@ubuntu.com> > Tested-by: Timo Aaltonen <tjaalton@ubuntu.com> Acked-by: Michel Thierry <michel.thierry@intel.com> > --- > drivers/gpu/drm/i915/intel_lrc.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > index 911f288f78aa..8977600f0d81 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -992,6 +992,7 @@ static void execlists_submission_tasklet(unsigned long data) > > head = execlists->csb_head; > tail = READ_ONCE(buf[write_idx]); > + rmb(); /* Hopefully paired with a wmb() in HW */ > } > GEM_TRACE("%s cs-irq head=%d [%d%s], tail=%d [%d%s]\n", > engine->name, >
Quoting Chris Wilson (2018-05-08 17:30:41) > We assume that the CSB is written using the normal ringbuffer > coherency protocols, as outlined in kernel/events/ring_buffer.c: > > * (HW) (DRIVER) > * > * if (LOAD ->data_tail) { LOAD ->data_head > * (A) smp_rmb() (C) > * STORE $data LOAD $data > * smp_wmb() (B) smp_mb() (D) > * STORE ->data_head STORE ->data_tail > * } > > So we assume that the HW fulfils its ordering requirements (B), and so > we should use a complimentary rmb (C) to ensure that our read of its > WRITE pointer is completed before we start accessing the data. > > The final mb (D) is implied by the uncached mmio we perform to inform > the HW of our READ pointer. > > References: https://bugs.freedesktop.org/show_bug.cgi?id=105064 -References: https://bugs.freedesktop.org/show_bug.cgi?id=105064 Resolved as ba74cb10c775 ("drm/i915/execlists: Delay writing to ELSP until HW has processed the previous write") So the other possibility is that this fixes up the issue with VT-d latency. -Chris
Chris Wilson <chris@chris-wilson.co.uk> writes: > We assume that the CSB is written using the normal ringbuffer > coherency protocols, as outlined in kernel/events/ring_buffer.c: > > * (HW) (DRIVER) > * > * if (LOAD ->data_tail) { LOAD ->data_head > * (A) smp_rmb() (C) > * STORE $data LOAD $data > * smp_wmb() (B) smp_mb() (D) > * STORE ->data_head STORE ->data_tail > * } > > So we assume that the HW fulfils its ordering requirements (B), and so > we should use a complimentary rmb (C) to ensure that our read of its > WRITE pointer is completed before we start accessing the data. > > The final mb (D) is implied by the uncached mmio we perform to inform > the HW of our READ pointer. > > References: https://bugs.freedesktop.org/show_bug.cgi?id=105064 > References: https://bugs.freedesktop.org/show_bug.cgi?id=105888 > References: https://bugs.freedesktop.org/show_bug.cgi?id=106185 > Fixes: 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD from the HWSP") > References: 61bf9719fa17 ("drm/i915/cnl: Use mmio access to context status buffer") > Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > Cc: Michał Winiarski <michal.winiarski@intel.com> > Cc: Rafael Antognolli <rafael.antognolli@intel.com> > Cc: Michel Thierry <michel.thierry@intel.com> > Cc: Timo Aaltonen <tjaalton@ubuntu.com> > Tested-by: Timo Aaltonen <tjaalton@ubuntu.com> Acked-by: Mika Kuoppala <mika.kuoppala@intel.com> This also makes the hwsp access work fine with iommu on (kbl). -Mika > --- > drivers/gpu/drm/i915/intel_lrc.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > index 911f288f78aa..8977600f0d81 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -992,6 +992,7 @@ static void execlists_submission_tasklet(unsigned long data) > > head = execlists->csb_head; > tail = READ_ONCE(buf[write_idx]); > + rmb(); /* Hopefully paired with a wmb() in HW */ > } > GEM_TRACE("%s cs-irq head=%d [%d%s], tail=%d [%d%s]\n", > engine->name, > -- > 2.17.0
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 911f288f78aa..8977600f0d81 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -992,6 +992,7 @@ static void execlists_submission_tasklet(unsigned long data) head = execlists->csb_head; tail = READ_ONCE(buf[write_idx]); + rmb(); /* Hopefully paired with a wmb() in HW */ } GEM_TRACE("%s cs-irq head=%d [%d%s], tail=%d [%d%s]\n", engine->name,