diff mbox

drm/i915: Ensure consistent control flow __i915_gem_active_get_rcu

Message ID 1471856122-466-1-git-send-email-daniel.vetter@ffwll.ch (mailing list archive)
State New, archived
Headers show

Commit Message

Daniel Vetter Aug. 22, 2016, 8:55 a.m. UTC
This issue here is (I think) purely theoretical, since a compiler
would need to be especially foolish to recompute the value of
i915_gem_request_completed right after it was already used. Hence the
additional barrier() is also not really a restriction.

But I believe this to be at least permissible, and since our rcu
trickery is a beast it's worth to annotate all the corner cases.
Chris proposed to instead just wrap a READ_ONCE around
request->fence.seqno in i915_gem_request_completed. But that has a
measurable impact on code size, and everywhere we hold a full
reference to the underlying request it's also not needed. And
personally I'd like to have just enough barriers and locking needed
for correctness, but not more - it makes it much easier in the future
to understand what's going on.

Since the busy ioctl has now fully embraced it's races there's no
point annotating it there too. We really only need it in
active_get_rcu, since that function _must_ deliver a correct snapshot
of the active fences (and not chase something else).

v2: Polish the comment a bit more (Chris).

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_request.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

Chris Wilson Aug. 22, 2016, 10:15 a.m. UTC | #1
On Mon, Aug 22, 2016 at 10:55:22AM +0200, Daniel Vetter wrote:
> This issue here is (I think) purely theoretical, since a compiler
> would need to be especially foolish to recompute the value of
> i915_gem_request_completed right after it was already used. Hence the
> additional barrier() is also not really a restriction.
> 
> But I believe this to be at least permissible, and since our rcu
> trickery is a beast it's worth to annotate all the corner cases.
> Chris proposed to instead just wrap a READ_ONCE around
> request->fence.seqno in i915_gem_request_completed. But that has a
> measurable impact on code size, and everywhere we hold a full
> reference to the underlying request it's also not needed. And
> personally I'd like to have just enough barriers and locking needed
> for correctness, but not more - it makes it much easier in the future
> to understand what's going on.
> 
> Since the busy ioctl has now fully embraced it's races there's no
> point annotating it there too. We really only need it in
> active_get_rcu, since that function _must_ deliver a correct snapshot
> of the active fences (and not chase something else).
> 
> v2: Polish the comment a bit more (Chris).
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

Just the fun of the role reversal, I've pushed it.
-Chris
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_request.h b/drivers/gpu/drm/i915/i915_gem_request.h
index 6c72bd8d9423..91014de8bfbc 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.h
+++ b/drivers/gpu/drm/i915/i915_gem_request.h
@@ -472,6 +472,19 @@  __i915_gem_active_get_rcu(const struct i915_gem_active *active)
 		if (!request || i915_gem_request_completed(request))
 			return NULL;
 
+		/* An especially silly compiler could decide to recompute the
+		 * result of i915_gem_request_completed, more specifically
+		 * re-emit the load for request->fence.seqno. A race would catch
+		 * a later seqno value, which could flip the result from true to
+		 * false. Which means part of the instructions below might not
+		 * be executed, while later on instructions are executed. Due to
+		 * barriers within the refcounting the inconsistency can't reach
+		 * past the call to i915_gem_request_get_rcu, but not executing
+		 * that while still executing i915_gem_request_put() creates
+		 * havoc enough.  Prevent this with a compiler barrier.
+		 */
+		barrier();
+
 		request = i915_gem_request_get_rcu(request);
 
 		/* What stops the following rcu_access_pointer() from occurring