Message ID | 20180412135237.9393-1-mika.kuoppala@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Quoting Mika Kuoppala (2018-04-12 14:52:37)
> Read ordering on cpu can be out of order and speculative.
If it didn't include a barrier to prevent compiler mispeculation. rmb()
only applies to other CPU cores and not the GPU so this is entirely
hocus.
-Chris
Quoting Chris Wilson (2018-04-12 14:57:42) > Quoting Mika Kuoppala (2018-04-12 14:52:37) > > Read ordering on cpu can be out of order and speculative. > > If it didn't include a barrier to prevent compiler mispeculation. rmb() > only applies to other CPU cores and not the GPU so this is entirely > hocus. To put it another way this is just imposing the cost of an uncached mmio which the entire point of using HWSP in the first place is to avoid. -Chris
Quoting Chris Wilson (2018-04-12 14:57:42) > Quoting Mika Kuoppala (2018-04-12 14:52:37) > > Read ordering on cpu can be out of order and speculative. > > If it didn't include a barrier to prevent compiler mispeculation. rmb() > only applies to other CPU cores and not the GPU so this is entirely > hocus. Or think of it in terms checkpatch will ask you, where is the corresponding wmb()? -Chris
Chris Wilson <chris@chris-wilson.co.uk> writes: > Quoting Mika Kuoppala (2018-04-12 14:52:37) >> Read ordering on cpu can be out of order and speculative. > > If it didn't include a barrier to prevent compiler mispeculation. rmb() > only applies to other CPU cores and not the GPU so this is entirely > hocus. How I understand the Documentation/memory-barrier.txt/THE THINGS CPUS GET UP TO, the compiler memory barrier is not enough. > -Chris
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index c7c85134a84a..5378391e1e71 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -969,6 +969,8 @@ static void execlists_submission_tasklet(unsigned long data) head = execlists->csb_head; tail = READ_ONCE(buf[write_idx]); + /* Enforce ordering of reads into the HWSP */ + rmb(); } GEM_TRACE("%s cs-irq head=%d [%d%s], tail=%d [%d%s]\n", engine->name,
Read ordering on cpu can be out of order and speculative. In addition, the access to the hardware status page is snooped. This raises a concern that we might use the HWSP in an unexpected way as we may load the head of a cbs entry before we access the tail. Concerns like that the coherency protocol is tied somehow to the tail read snoop. To enforce that we really do read the tail before we fetch the csb entry pointed by head, insert a read memory barrier after we have read the tail. This fixes, or masks due to added latency, context status buffer incoherence on cnl, where we see an old context status entry still on that a csb slot. References: https://bugs.freedesktop.org/show_bug.cgi?id=105888 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Vivi, Rodrigo <rodrigo.vivi@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> --- drivers/gpu/drm/i915/intel_lrc.c | 2 ++ 1 file changed, 2 insertions(+)