Message ID | 1461172195-3959-4-git-send-email-John.C.Harrison@Intel.com (mailing list archive)
---|---
State | New, archived
On Wed, Apr 20, 2016 at 06:09:50PM +0100, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> The fence object used inside the request structure requires a sequence
> number. Although this is not used by the i915 driver itself, it could
> potentially be used by non-i915 code if the fence is passed outside of
> the driver. This is the intention as it allows external kernel drivers
> and user applications to wait on batch buffer completion
> asynchronously via the dma-buf fence API.
>
> To ensure that such external users are not confused by strange things
> happening with the seqno, this patch adds in a per context timeline
> that can provide a guaranteed in-order seqno value for the fence. This
> is safe because the scheduler will not re-order batch buffers within a
> context - they are considered to be mutually dependent.
>
> v2: New patch in series.
>
> v3: Renamed/retyped timeline structure fields after review comments by
> Tvrtko Ursulin.
>
> Added context information to the timeline's name string for better
> identification in debugfs output.
>
> v5: Line wrapping and other white space fixes to keep style checker
> happy.
>
> v7: Updated to newer nightly (lots of ring -> engine renaming).
>
> For: VIZ-5190
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         | 25 +++++++---
>  drivers/gpu/drm/i915/i915_gem.c         | 83 +++++++++++++++++++++++++++++----
>  drivers/gpu/drm/i915/i915_gem_context.c | 14 ++++++
>  drivers/gpu/drm/i915/intel_lrc.c        |  8 ++++
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 -
>  5 files changed, 114 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index bb18b89..1c3a1ca 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -813,6 +813,15 @@ struct i915_ctx_hang_stats {
>  	bool banned;
>  };
>
> +struct i915_fence_timeline {
> +	char name[32];
> +	unsigned fence_context;
> +	unsigned next;
> +
> +	struct intel_context *ctx;
> +	struct intel_engine_cs *engine;
> +};
> +
>  /* This must match up with the value previously used for execbuf2.rsvd1. */
>  #define DEFAULT_CONTEXT_HANDLE 0
>
> @@ -860,6 +869,7 @@ struct intel_context {
>  		struct i915_vma *lrc_vma;
>  		u64 lrc_desc;
>  		uint32_t *lrc_reg_state;
> +		struct i915_fence_timeline fence_timeline;

This is wrong. The timeline is actually a property of the vm, with
contexts for each engine.
-Chris
On 20/04/2016 18:44, Chris Wilson wrote:
> On Wed, Apr 20, 2016 at 06:09:50PM +0100, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> The fence object used inside the request structure requires a sequence
>> number. Although this is not used by the i915 driver itself, it could
>> potentially be used by non-i915 code if the fence is passed outside of
>> the driver. This is the intention as it allows external kernel drivers
>> and user applications to wait on batch buffer completion
>> asynchronously via the dma-buf fence API.
>>
>> To ensure that such external users are not confused by strange things
>> happening with the seqno, this patch adds in a per context timeline
>> that can provide a guaranteed in-order seqno value for the fence. This
>> is safe because the scheduler will not re-order batch buffers within a
>> context - they are considered to be mutually dependent.
>>
>> v2: New patch in series.
>>
>> v3: Renamed/retyped timeline structure fields after review comments by
>> Tvrtko Ursulin.
>>
>> Added context information to the timeline's name string for better
>> identification in debugfs output.
>>
>> v5: Line wrapping and other white space fixes to keep style checker
>> happy.
>>
>> v7: Updated to newer nightly (lots of ring -> engine renaming).
>>
>> For: VIZ-5190
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_drv.h         | 25 +++++++---
>>  drivers/gpu/drm/i915/i915_gem.c         | 83 +++++++++++++++++++++++++++++----
>>  drivers/gpu/drm/i915/i915_gem_context.c | 14 ++++++
>>  drivers/gpu/drm/i915/intel_lrc.c        |  8 ++++
>>  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 -
>>  5 files changed, 114 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index bb18b89..1c3a1ca 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -813,6 +813,15 @@ struct i915_ctx_hang_stats {
>>  	bool banned;
>>  };
>>
>> +struct i915_fence_timeline {
>> +	char name[32];
>> +	unsigned fence_context;
>> +	unsigned next;
>> +
>> +	struct intel_context *ctx;
>> +	struct intel_engine_cs *engine;
>> +};
>> +
>>  /* This must match up with the value previously used for execbuf2.rsvd1. */
>>  #define DEFAULT_CONTEXT_HANDLE 0
>>
>> @@ -860,6 +869,7 @@ struct intel_context {
>>  		struct i915_vma *lrc_vma;
>>  		u64 lrc_desc;
>>  		uint32_t *lrc_reg_state;
>> +		struct i915_fence_timeline fence_timeline;
> This is wrong. The timeline is actually a property of the vm, with
> contexts for each engine.
> -Chris
>
I don't get what you mean. The timeline does not have contexts. It is
just an incrementing number. It could possibly be shared between all
engines of a single software context rather than be specific to the
hardware context. Although I think it is safer and more future proof to
be the latter, i.e. per engine per s/w context. I don't see where the vm
comes into it.
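[As an aside for readers following the per-engine vs. per-context debate, the scheme the patch implements can be illustrated with a minimal userspace sketch. This is not the driver's code: `sw_context`, `create_fence_timeline`, `NUM_ENGINES` and the counter standing in for `fence_context_alloc()` are all illustrative. The point is that each (software context, engine) pair gets its own timeline with a distinct fence context id and a debugfs-friendly name.]

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NUM_ENGINES 4

/* Stand-in for the kernel's fence_context_alloc(): a bare counter. */
static unsigned int next_fence_context = 1;

struct timeline {
	char name[32];
	unsigned int fence_context;
	unsigned int next;
};

/* One timeline per (software context, engine) pair. */
struct sw_context {
	int user_handle;
	struct timeline engine[NUM_ENGINES];
};

static void create_fence_timeline(struct sw_context *ctx, int engine_id,
				  const char *engine_name)
{
	struct timeline *tl = &ctx->engine[engine_id];

	/* Each timeline gets a globally unique fence context id. */
	tl->fence_context = next_fence_context++;
	tl->next = 1; /* seqno 0 is reserved as the invalid sync point */
	snprintf(tl->name, sizeof(tl->name), "%u>%s:%d",
		 tl->fence_context, engine_name, ctx->user_handle);
}
```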
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bb18b89..1c3a1ca 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -813,6 +813,15 @@ struct i915_ctx_hang_stats {
 	bool banned;
 };
 
+struct i915_fence_timeline {
+	char name[32];
+	unsigned fence_context;
+	unsigned next;
+
+	struct intel_context *ctx;
+	struct intel_engine_cs *engine;
+};
+
 /* This must match up with the value previously used for execbuf2.rsvd1. */
 #define DEFAULT_CONTEXT_HANDLE 0
 
@@ -860,6 +869,7 @@ struct intel_context {
 		struct i915_vma *lrc_vma;
 		u64 lrc_desc;
 		uint32_t *lrc_reg_state;
+		struct i915_fence_timeline fence_timeline;
 	} engine[I915_NUM_ENGINES];
 
 	struct list_head link;
 
@@ -2245,13 +2255,10 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
 struct drm_i915_gem_request {
 	/**
 	 * Underlying object for implementing the signal/wait stuff.
-	 * NB: Never call fence_later() or return this fence object to user
-	 * land! Due to lazy allocation, scheduler re-ordering, pre-emption,
-	 * etc., there is no guarantee at all about the validity or
-	 * sequentiality of the fence's seqno! It is also unsafe to let
-	 * anything outside of the i915 driver get hold of the fence object
-	 * as the clean up when decrementing the reference count requires
-	 * holding the driver mutex lock.
+	 * NB: Never return this fence object to user land! It is unsafe to
+	 * let anything outside of the i915 driver get hold of the fence
+	 * object as the clean up when decrementing the reference count
+	 * requires holding the driver mutex lock.
 	 */
 	struct fence fence;
 
@@ -2340,6 +2347,10 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
 		       struct intel_context *ctx);
 void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 
+int i915_create_fence_timeline(struct drm_device *dev,
+			       struct intel_context *ctx,
+			       struct intel_engine_cs *ring);
+
 static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req)
 {
 	return fence_is_signaled(&req->fence);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4209e3b..99e9ddb 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2776,9 +2776,35 @@ static const char *i915_gem_request_get_driver_name(struct fence *req_fence)
 
 static const char *i915_gem_request_get_timeline_name(struct fence *req_fence)
 {
-	struct drm_i915_gem_request *req = container_of(req_fence,
-							typeof(*req), fence);
-	return req->engine->name;
+	struct drm_i915_gem_request *req;
+	struct i915_fence_timeline *timeline;
+
+	req = container_of(req_fence, typeof(*req), fence);
+	timeline = &req->ctx->engine[req->engine->id].fence_timeline;
+
+	return timeline->name;
+}
+
+static void i915_gem_request_timeline_value_str(struct fence *req_fence,
+						char *str, int size)
+{
+	struct drm_i915_gem_request *req;
+
+	req = container_of(req_fence, typeof(*req), fence);
+
+	/* Last signalled timeline value ??? */
+	snprintf(str, size, "? [%d]"/*, timeline->value*/,
+		 req->engine->get_seqno(req->engine));
+}
+
+static void i915_gem_request_fence_value_str(struct fence *req_fence,
+					     char *str, int size)
+{
+	struct drm_i915_gem_request *req;
+
+	req = container_of(req_fence, typeof(*req), fence);
+
+	snprintf(str, size, "%d [%d]", req->fence.seqno, req->seqno);
 }
 
 static const struct fence_ops i915_gem_request_fops = {
@@ -2788,8 +2814,50 @@ static const struct fence_ops i915_gem_request_fops = {
 	.release = i915_gem_request_free,
 	.get_driver_name = i915_gem_request_get_driver_name,
 	.get_timeline_name = i915_gem_request_get_timeline_name,
+	.fence_value_str = i915_gem_request_fence_value_str,
+	.timeline_value_str = i915_gem_request_timeline_value_str,
 };
 
+int i915_create_fence_timeline(struct drm_device *dev,
+			       struct intel_context *ctx,
+			       struct intel_engine_cs *engine)
+{
+	struct i915_fence_timeline *timeline;
+
+	timeline = &ctx->engine[engine->id].fence_timeline;
+
+	if (timeline->engine)
+		return 0;
+
+	timeline->fence_context = fence_context_alloc(1);
+
+	/*
+	 * Start the timeline from seqno 0 as this is a special value
+	 * that is reserved for invalid sync points.
+	 */
+	timeline->next = 1;
+	timeline->ctx = ctx;
+	timeline->engine = engine;
+
+	snprintf(timeline->name, sizeof(timeline->name), "%d>%s:%d",
+		 timeline->fence_context, engine->name, ctx->user_handle);
+
+	return 0;
+}
+
+static unsigned i915_fence_timeline_get_next_seqno(struct i915_fence_timeline *timeline)
+{
+	unsigned seqno;
+
+	seqno = timeline->next;
+
+	/* Reserve zero for invalid */
+	if (++timeline->next == 0)
+		timeline->next = 1;
+
+	return seqno;
+}
+
 static inline int
 __i915_gem_request_alloc(struct intel_engine_cs *engine,
 			 struct intel_context *ctx,
@@ -2827,7 +2895,8 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
 	}
 
 	fence_init(&req->fence, &i915_gem_request_fops, &engine->fence_lock,
-		   engine->fence_context, req->seqno);
+		   ctx->engine[engine->id].fence_timeline.fence_context,
+		   i915_fence_timeline_get_next_seqno(&ctx->engine[engine->id].fence_timeline));
 
 	/*
 	 * Reserve space in the ring buffer for all the commands required to
@@ -4939,7 +5008,7 @@ i915_gem_init_hw(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *engine;
-	int ret, j, fence_base;
+	int ret, j;
 
 	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
 		return -EIO;
@@ -5009,14 +5078,10 @@ i915_gem_init_hw(struct drm_device *dev)
 	if (ret)
 		goto out;
 
-	fence_base = fence_context_alloc(I915_NUM_ENGINES);
-
 	/* Now it is safe to go back round and do everything else: */
 	for_each_engine(engine, dev_priv) {
 		struct drm_i915_gem_request *req;
 
-		engine->fence_context = fence_base + engine->id;
-
 		req = i915_gem_request_alloc(engine, NULL);
 		if (IS_ERR(req)) {
 			ret = PTR_ERR(req);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 282b17c..6b30de9 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -279,6 +279,20 @@ i915_gem_create_context(struct drm_device *dev,
 	if (IS_ERR(ctx))
 		return ctx;
 
+	if (!i915.enable_execlists) {
+		struct intel_engine_cs *engine;
+
+		/* Create a per context timeline for fences */
+		for_each_engine(engine, to_i915(dev)) {
+			ret = i915_create_fence_timeline(dev, ctx, engine);
+			if (ret) {
+				DRM_ERROR("Fence timeline creation failed for legacy %s: %p\n",
+					  engine->name, ctx);
+				goto err_destroy;
+			}
+		}
+	}
+
 	if (is_global_default_ctx && ctx->legacy_hw_ctx.rcs_state) {
 		/* We may need to do things with the shrinker which
 		 * require us to immediately switch back to the default
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 268e024..db74cc14 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2694,6 +2694,14 @@ int intel_lr_context_deferred_alloc(struct intel_context *ctx,
 		goto error_ringbuf;
 	}
 
+	/* Create a per context timeline for fences */
+	ret = i915_create_fence_timeline(dev, ctx, engine);
+	if (ret) {
+		DRM_ERROR("Fence timeline creation failed for engine %s, ctx %p\n",
+			  engine->name, ctx);
+		goto error_ringbuf;
+	}
+
 	ctx->engine[engine->id].ringbuf = ringbuf;
 	ctx->engine[engine->id].state = ctx_obj;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 2cd9d2c..23e1d94 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -349,7 +349,6 @@ struct intel_engine_cs {
 	 */
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
 
-	unsigned fence_context;
 	spinlock_t fence_lock;
 };
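[Editor's note: the wrap-around rule in `i915_fence_timeline_get_next_seqno()` above, where seqno 0 is reserved for invalid sync points, can be exercised on its own. The following is a hypothetical userspace sketch, not driver code, relying only on standard unsigned wrap-around semantics.]

```c
#include <assert.h>
#include <limits.h>

/* Userspace sketch of the per-timeline seqno allocator: the counter
 * starts at 1 and skips 0 when the unsigned value wraps, because 0 is
 * reserved as the invalid sync point. */
struct fence_timeline {
	unsigned int next; /* next seqno to hand out; never 0 */
};

static void timeline_init(struct fence_timeline *tl)
{
	tl->next = 1; /* 0 is reserved for invalid sync points */
}

static unsigned int timeline_get_next_seqno(struct fence_timeline *tl)
{
	unsigned int seqno = tl->next;

	/* Reserve zero for invalid: skip it on wrap-around. */
	if (++tl->next == 0)
		tl->next = 1;

	return seqno;
}
```

Because unsigned overflow is well defined in C, incrementing past `UINT_MAX` lands on 0, which the check then bumps back to 1.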