Message ID | 1471856122-466-1-git-send-email-daniel.vetter@ffwll.ch (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Aug 22, 2016 at 10:55:22AM +0200, Daniel Vetter wrote: > This issue here is (I think) purely theoretical, since a compiler > would need to be especially foolish to recompute the value of > i915_gem_request_completed right after it was already used. Hence the > additional barrier() is also not really a restriction. > > But I believe this to be at least permissible, and since our rcu > trickery is a beast it's worth to annotate all the corner cases. > Chris proposed to instead just wrap a READ_ONCE around > request->fence.seqno in i915_gem_request_completed. But that has a > measurable impact on code size, and everywhere we hold a full > reference to the underlying request it's also not needed. And > personally I'd like to have just enough barriers and locking needed > for correctness, but not more - it makes it much easier in the future > to understand what's going on. > > Since the busy ioctl has now fully embraced it's races there's no > point annotating it there too. We really only need it in > active_get_rcu, since that function _must_ deliver a correct snapshot > of the active fences (and not chase something else). > > v2: Polish the comment a bit more (Chris). > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Just the fun of the role reversal, I've pushed it. -Chris
diff --git a/drivers/gpu/drm/i915/i915_gem_request.h b/drivers/gpu/drm/i915/i915_gem_request.h index 6c72bd8d9423..91014de8bfbc 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.h +++ b/drivers/gpu/drm/i915/i915_gem_request.h @@ -472,6 +472,19 @@ __i915_gem_active_get_rcu(const struct i915_gem_active *active) if (!request || i915_gem_request_completed(request)) return NULL; + /* An especially silly compiler could decide to recompute the + * result of i915_gem_request_completed, more specifically + * re-emit the load for request->fence.seqno. A race would catch + * a later seqno value, which could flip the result from true to + * false. Which means part of the instructions below might not + * be executed, while later on instructions are executed. Due to + * barriers within the refcounting the inconsistency can't reach + * past the call to i915_gem_request_get_rcu, but not executing + * that while still executing i915_gem_request_put() creates + * havoc enough. Prevent this with a compiler barrier. + */ + barrier(); + request = i915_gem_request_get_rcu(request); /* What stops the following rcu_access_pointer() from occurring