Message ID: 1452701985-19706-1-git-send-email-nicholas.hoath@intel.com
State: New, archived
This version resolved the issue (kernel bug check in intel_lr_context_clean_ring) I reported on previous versions. Verified by igt drv_module_reload_basic, gem_close_race and -t basic tests.

Reviewed-by: Alex Dai <yu.dai@intel.com>

On 01/13/2016 08:19 AM, Nick Hoath wrote:
> Use the first retired request on a new context to unpin
> the old context. This ensures that the hw context remains
> bound until it has been written back to by the GPU.
> Now that the context is pinned until later in the request/context
> lifecycle, it no longer needs to be pinned from context_queue to
> retire_requests.
> This fixes an issue with GuC submission where the GPU might not
> have finished writing back the context before it is unpinned. This
> results in a GPU hang.
>
> v2: Moved the new pin to cover GuC submission (Alex Dai)
>     Moved the new unpin to request_retire to fix coverage leak
> v3: Added switch to default context if freeing a still pinned
>     context just in case the hw was actually still using it
> v4: Unwrapped context unpin to allow calling without a request
> v5: Only create a switch to idle context if the ring doesn't
>     already have a request pending on it (Alex Dai)
>     Rename unsaved to dirty to avoid double negatives (Dave Gordon)
>     Changed _no_req postfix to __ prefix for consistency (Dave Gordon)
>     Split out per engine cleanup from context_free as it
>     was getting unwieldy
>     Corrected locking (Dave Gordon)
> v6: Removed some bikeshedding (Mika Kuoppala)
>     Added explanation of the GuC hang that this fixes (Daniel Vetter)
> v7: Removed extra per request pinning from ring reset code (Alex Dai)
>     Added forced ring unpin/clean in error case in context free (Alex Dai)
> v8: Renamed lrc specific last_context to lrc_last_context as there
>     were some reset cases where the codepaths leaked (Mika Kuoppala)
>     NULL'd last_context in reset case - there was a pointer leak
>     if someone did reset->close context.
> v9: Rebase over "Fix context/engine cleanup order"
> v10: Rebase over nightly, remove WARN_ON which caused the
>      dependency on dev.
>
> Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
> Issue: VIZ-4277
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: David Gordon <david.s.gordon@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Alex Dai <yu.dai@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |   1 +
>  drivers/gpu/drm/i915/i915_gem.c         |   3 +
>  drivers/gpu/drm/i915/intel_lrc.c        | 138 ++++++++++++++++++++++++++------
>  drivers/gpu/drm/i915/intel_lrc.h        |   1 +
>  drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
>  5 files changed, 121 insertions(+), 23 deletions(-)
>
> [snip]
On 14/01/2016 07:20, Patchwork wrote:
> == Summary ==
>
> Built on 058740f8fced6851aeda34f366f5330322cd585f drm-intel-nightly: 2016y-01m-13d-17h-07m-44s UTC integration manifest
>
> Test gem_ctx_basic:
>     pass -> FAIL (bdw-ultra)

Test failed to load - not patch related

> Test gem_ctx_param_basic:
>     Subgroup non-root-set:
>         pass -> DMESG-WARN (bsw-nuc-2)

gem driver allocated a poisoned slab - not patch related

> Test kms_flip:
>     Subgroup basic-flip-vs-dpms:
>         pass -> SKIP (bsw-nuc-2)

test reqs not met - not patch related

>     dmesg-warn -> PASS (ilk-hp8440p)

warn to PASS

> bdw-nuci7    total:138  pass:128  dwarn:1  dfail:0  fail:0  skip:9
> bdw-ultra    total:138  pass:131  dwarn:0  dfail:0  fail:1  skip:6
> bsw-nuc-2    total:141  pass:113  dwarn:3  dfail:0  fail:0  skip:25
> hsw-brixbox  total:141  pass:134  dwarn:0  dfail:0  fail:0  skip:7
> hsw-gt2      total:141  pass:137  dwarn:0  dfail:0  fail:0  skip:4
> ilk-hp8440p  total:141  pass:101  dwarn:3  dfail:0  fail:0  skip:37
> ivb-t430s    total:135  pass:122  dwarn:3  dfail:4  fail:0  skip:6
> skl-i5k-2    total:141  pass:131  dwarn:2  dfail:0  fail:0  skip:8
> skl-i7k-2    total:141  pass:131  dwarn:2  dfail:0  fail:0  skip:8
> snb-dellxps  total:141  pass:122  dwarn:5  dfail:0  fail:0  skip:14
> snb-x220t    total:141  pass:122  dwarn:5  dfail:0  fail:1  skip:13
>
> Results at /archive/results/CI_IGT_test/Patchwork_1174/
On Wed, Jan 13, 2016 at 04:19:45PM +0000, Nick Hoath wrote:
> +	if (ctx->engine[ring->id].dirty) {
> +		struct drm_i915_gem_request *req = NULL;
> +
> +		/**
> +		 * If there is already a request pending on
> +		 * this ring, wait for that to complete,
> +		 * otherwise create a switch to idle request
> +		 */
> +		if (list_empty(&ring->request_list)) {
> +			int ret;
> +
> +			ret = i915_gem_request_alloc(
> +					ring,
> +					ring->default_context,
> +					&req);
> +			if (!ret)
> +				i915_add_request(req);
> +			else
> +				DRM_DEBUG("Failed to ensure context saved");
> +		} else {
> +			req = list_first_entry(
> +					&ring->request_list,
> +					typeof(*req), list);
> +		}
> +		if (req) {
> +			ret = i915_wait_request(req);
> +			if (ret != 0) {
> +				/**
> +				 * If we get here, there's probably been a ring
> +				 * reset, so we just clean up the dirty flag &
> +				 * pin count.
> +				 */
> +				ctx->engine[ring->id].dirty = false;
> +				__intel_lr_context_unpin(
> +						ring,
> +						ctx);
> +			}
> +		}

If you were to take a lr_context_pin on the last_context, and only
release that pin when you change to a new context, you do not need to
introduce a blocking context-close, nor do you need to introduce the
usage of default_context. (lr_context_pin should take a reference on
the ctx to prevent early freeing ofc.)

The code at that point starts to look v.v.similar to legacy, right down
to the need to use a GPU reset during shutdown to prevent writing back
the context image. (Which you still currently need to get rid of the
default context now.)
-Chris
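[For illustration only, the handover Chris describes might look like the sketch below. It reuses __intel_lr_context_unpin and lrc_last_context from the patch, and assumes i915_gem_context_reference()/i915_gem_context_unreference() can be called at this point; it is a sketch of the suggestion, not code from the series.]

/* Hypothetical sketch of the suggestion above, not part of the series:
 * the engine keeps a pin *and* a reference on lrc_last_context, so a
 * context-close never has to block waiting for the image to be saved.
 */
static void sketch_lr_context_complete_check(struct drm_i915_gem_request *req)
{
	struct intel_engine_cs *ring = req->ring;
	struct intel_context *last = ring->lrc_last_context;

	if (last == req->ctx)
		return;

	if (last) {
		if (last->engine[ring->id].dirty) {
			/* A request on a different context has retired,
			 * so the old image has been written back: drop
			 * the pin first, then the reference (which may
			 * free the context).
			 */
			__intel_lr_context_unpin(ring, last);
			last->engine[ring->id].dirty = false;
		}
		i915_gem_context_unreference(last);
	}

	/* Keep the new context alive until it too is switched away. */
	i915_gem_context_reference(req->ctx);
	ring->lrc_last_context = req->ctx;
}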
On 14/01/2016 11:36, Chris Wilson wrote:
> On Wed, Jan 13, 2016 at 04:19:45PM +0000, Nick Hoath wrote:
>> [snip]
>
> If you were to take a lr_context_pin on the last_context, and only
> release that pin when you change to a new context, you do not need to

That's what this patch does.

> introduce a blocking context-close, nor do you need to introduce the
> usage of default_context.

The use of default_context here is to stop a context hanging around
after it is no longer needed.

> (lr_context_pin should take a reference on the ctx to prevent early
> freeing ofc).

You can't clear the reference on the ctx in an interrupt context.

> The code at that point starts to look v.v.similar to legacy, right down
> to the need to use a GPU reset during shutdown to prevent writing back
> the context image. (Which you still currently need to get rid of the
> default context now.)
> -Chris
On Thu, Jan 14, 2016 at 11:56:07AM +0000, Nick Hoath wrote:
> On 14/01/2016 11:36, Chris Wilson wrote:
>> On Wed, Jan 13, 2016 at 04:19:45PM +0000, Nick Hoath wrote:
>>> [snip]
>>
>> If you were to take a lr_context_pin on the last_context, and only
>> release that pin when you change to a new context, you do not need to
>
> That's what this patch does.
>
>> introduce a blocking context-close, nor do you need to introduce the
>> usage of default_context.
>
> The use of default_context here is to stop a context hanging around
> after it is no longer needed.

By blocking, which is not acceptable. Also we can eliminate the
default_context, and so pinning that as opposed to the last_context
serves no purpose other than by chance having a more preferable
position when it comes to defragmentation. But you don't enable that
anyway, and we have alternative strategies now that avoid the issue
with fragmentation of the mappable aperture.

>> (lr_context_pin should take a reference on the ctx to prevent early
>> freeing ofc).
>
> You can't clear the reference on the ctx in an interrupt context.

The execlists submission should be moved out of the interrupt context,
for the very simple reason that it is causing machine panics. Userspace
submits a workload, machine lockups....
-Chris
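[As a rough illustration of moving that work out of the interrupt handler, one option is to punt retirement to process context. This is purely a sketch under stated assumptions: the retire_work field and its wiring are hypothetical and do not exist in the posted series.]

#include <linux/workqueue.h>

/* Hypothetical sketch: defer execlist retirement from the interrupt
 * handler to process context, where struct_mutex can be taken and the
 * final context unreference is safe.
 */
static void sketch_execlists_retire_work(struct work_struct *work)
{
	struct intel_engine_cs *ring =
		container_of(work, struct intel_engine_cs, retire_work);

	mutex_lock(&ring->dev->struct_mutex);
	intel_execlists_retire_requests(ring);
	mutex_unlock(&ring->dev->struct_mutex);
}

/* The hard IRQ handler would then only acknowledge the context-switch
 * interrupt and schedule the work:
 *
 *	queue_work(system_wq, &ring->retire_work);
 */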
On 14/01/2016 12:31, Chris Wilson wrote:
> On Thu, Jan 14, 2016 at 11:56:07AM +0000, Nick Hoath wrote:
>> [snip]
>
> By blocking, which is not acceptable. Also we can eliminate the
> default_context, and so pinning that as opposed to the last_context
> serves no purpose other than by chance having a more preferable
> position when it comes to defragmentation. But you don't enable that
> anyway, and we

Enabling the shrinker on execlists is something I'm working on which is
predicated on this patch. Also, why is blocking on closing a context not
acceptable?

> have alternative strategies now that avoid the issue with fragmentation
> of the mappable aperture.
>
>> You can't clear the reference on the ctx in an interrupt context.
>
> The execlists submission should be moved out of the interrupt context,
> for the very simple reason that it is causing machine panics. Userspace
> submits a workload, machine lockups....

Create a jira, and I'm sure we'll look at making that change.

> -Chris
On 14/01/2016 12:37, Nick Hoath wrote:
> On 14/01/2016 12:31, Chris Wilson wrote:
>> [snip]
>>
>> By blocking, which is not acceptable. Also we can eliminate the
>> default_context, and so pinning that as opposed to the last_context
>> serves no purpose other than by chance having a more preferable
>> position when it comes to defragmentation. But you don't enable that
>> anyway, and we have alternative strategies now that avoid the issue
>> with fragmentation of the mappable aperture.
>
> Enabling the shrinker on execlists is something I'm working on which is
> predicated on this patch. Also, why is blocking on closing a context not
> acceptable?

As a clarification: without rewriting the execlist code to not submit or
clean up from an interrupt handler, we can't use refcounting to allow
non-blocking closing.

>> The execlists submission should be moved out of the interrupt context,
>> for the very simple reason that it is causing machine panics. Userspace
>> submits a workload, machine lockups....
>
> Create a jira, and I'm sure we'll look at making that change.
>
>> -Chris
On Thu, Jan 14, 2016 at 11:31:07AM +0000, Nick Hoath wrote:
> On 14/01/2016 07:20, Patchwork wrote:
>> == Summary ==

Our BKM is to link to bugzilla entries to make sure these are all real
failures which are tracked already. Otherwise stuff falls through the
cracks.

>> Built on 058740f8fced6851aeda34f366f5330322cd585f drm-intel-nightly: 2016y-01m-13d-17h-07m-44s UTC integration manifest
>>
>> Test gem_ctx_basic:
>>     pass -> FAIL (bdw-ultra)
>
> Test failed to load - not patch related
>
>> Test gem_ctx_param_basic:
>>     Subgroup non-root-set:
>>         pass -> DMESG-WARN (bsw-nuc-2)
>
> gem driver allocated a poisoned slab - not patch related
>
>> Test kms_flip:
>>     Subgroup basic-flip-vs-dpms:
>>         pass -> SKIP (bsw-nuc-2)
>
> test reqs not met - not patch related

basic-flip-vs-dpms MUST always work. Well, except if you have a chip
with GT only where the display is fused off. I expect more serious
analysis instead of casually shrugging issues away as "not my problem".
Same sloppy analysis with the others imo.
-Daniel

> [remaining per-machine results snipped]
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 104bd18..d28e10a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -882,6 +882,7 @@ struct intel_context {
 	struct {
 		struct drm_i915_gem_object *state;
 		struct intel_ringbuffer *ringbuf;
+		bool dirty;
 		int pin_count;
 	} engine[I915_NUM_RINGS];
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ddc21d4..7b79405 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1413,6 +1413,9 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
 {
 	trace_i915_gem_request_retire(request);
 
+	if (i915.enable_execlists)
+		intel_lr_context_complete_check(request);
+
 	/* We know the GPU must have read the request to have
 	 * sent us the seqno + interrupt, so use the position
 	 * of tail of the request to update the last known position
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 5027699..b661058 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -585,9 +585,6 @@ static int execlists_context_queue(struct drm_i915_gem_request *request)
 	struct drm_i915_gem_request *cursor;
 	int num_elements = 0;
 
-	if (request->ctx != ring->default_context)
-		intel_lr_context_pin(request);
-
 	i915_gem_request_reference(request);
 
 	spin_lock_irq(&ring->execlist_lock);
@@ -763,6 +760,13 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
 	if (intel_ring_stopped(ring))
 		return;
 
+	if (request->ctx != ring->default_context) {
+		if (!request->ctx->engine[ring->id].dirty) {
+			intel_lr_context_pin(request);
+			request->ctx->engine[ring->id].dirty = true;
+		}
+	}
+
 	if (dev_priv->guc.execbuf_client)
 		i915_guc_submit(dev_priv->guc.execbuf_client, request);
 	else
@@ -989,12 +993,6 @@ void intel_execlists_retire_requests(struct intel_engine_cs *ring)
 	spin_unlock_irq(&ring->execlist_lock);
 
 	list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {
-		struct intel_context *ctx = req->ctx;
-		struct drm_i915_gem_object *ctx_obj =
-				ctx->engine[ring->id].state;
-
-		if (ctx_obj && (ctx != ring->default_context))
-			intel_lr_context_unpin(req);
 		list_del(&req->execlist_link);
 		i915_gem_request_unreference(req);
 	}
@@ -1089,21 +1087,39 @@ reset_pin_count:
 	return ret;
 }
 
-void intel_lr_context_unpin(struct drm_i915_gem_request *rq)
+static void __intel_lr_context_unpin(struct intel_engine_cs *ring,
+		struct intel_context *ctx)
 {
-	struct intel_engine_cs *ring = rq->ring;
-	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
-	struct intel_ringbuffer *ringbuf = rq->ringbuf;
-
+	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
+	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
 	if (ctx_obj) {
 		WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
-		if (--rq->ctx->engine[ring->id].pin_count == 0) {
+		if (--ctx->engine[ring->id].pin_count == 0) {
 			intel_unpin_ringbuffer_obj(ringbuf);
 			i915_gem_object_ggtt_unpin(ctx_obj);
 		}
 	}
 }
 
+void intel_lr_context_unpin(struct drm_i915_gem_request *rq)
+{
+	__intel_lr_context_unpin(rq->ring, rq->ctx);
+}
+
+void intel_lr_context_complete_check(struct drm_i915_gem_request *req)
+{
+	struct intel_engine_cs *ring = req->ring;
+
+	if (ring->lrc_last_context && ring->lrc_last_context != req->ctx &&
+			ring->lrc_last_context->engine[ring->id].dirty) {
+		__intel_lr_context_unpin(
+				ring,
+				ring->lrc_last_context);
+		ring->lrc_last_context->engine[ring->id].dirty = false;
+	}
+	ring->lrc_last_context = req->ctx;
+}
+
 static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req)
 {
 	int ret, i;
@@ -2347,6 +2363,74 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o
 }
 
 /**
+ * intel_lr_context_clean_ring() - clean the ring specific parts of an LRC
+ * @ctx: the LR context being freed.
+ * @ring: the engine being cleaned
+ * @ctx_obj: the hw context being unreferenced
+ * @ringbuf: the ringbuf being freed
+ *
+ * Take care of cleaning up the per-engine backing
+ * objects and the logical ringbuffer.
+ */
+static void
+intel_lr_context_clean_ring(struct intel_context *ctx,
+			    struct intel_engine_cs *ring,
+			    struct drm_i915_gem_object *ctx_obj,
+			    struct intel_ringbuffer *ringbuf)
+{
+	int ret;
+
+	if (ctx == ring->default_context) {
+		intel_unpin_ringbuffer_obj(ringbuf);
+		i915_gem_object_ggtt_unpin(ctx_obj);
+	}
+
+	if (ctx->engine[ring->id].dirty) {
+		struct drm_i915_gem_request *req = NULL;
+
+		/**
+		 * If there is already a request pending on
+		 * this ring, wait for that to complete,
+		 * otherwise create a switch to idle request
+		 */
+		if (list_empty(&ring->request_list)) {
+			int ret;
+
+			ret = i915_gem_request_alloc(
+					ring,
+					ring->default_context,
+					&req);
+			if (!ret)
+				i915_add_request(req);
+			else
+				DRM_DEBUG("Failed to ensure context saved");
+		} else {
+			req = list_first_entry(
+					&ring->request_list,
+					typeof(*req), list);
+		}
+		if (req) {
+			ret = i915_wait_request(req);
+			if (ret != 0) {
+				/**
+				 * If we get here, there's probably been a ring
+				 * reset, so we just clean up the dirty flag &
+				 * pin count.
+				 */
+				ctx->engine[ring->id].dirty = false;
+				__intel_lr_context_unpin(
+						ring,
+						ctx);
+			}
+		}
+	}
+
+	WARN_ON(ctx->engine[ring->id].pin_count);
+	intel_ringbuffer_free(ringbuf);
+	drm_gem_object_unreference(&ctx_obj->base);
+}
+
+/**
  * intel_lr_context_free() - free the LRC specific bits of a context
  * @ctx: the LR context to free.
  *
@@ -2358,7 +2444,7 @@ void intel_lr_context_free(struct intel_context *ctx)
 {
 	int i;
 
-	for (i = 0; i < I915_NUM_RINGS; i++) {
+	for (i = 0; i < I915_NUM_RINGS; ++i) {
 		struct drm_i915_gem_object *ctx_obj = ctx->engine[i].state;
 
 		if (ctx_obj) {
@@ -2366,13 +2452,10 @@ void intel_lr_context_free(struct intel_context *ctx)
 					ctx->engine[i].ringbuf;
 			struct intel_engine_cs *ring = ringbuf->ring;
 
-			if (ctx == ring->default_context) {
-				intel_unpin_ringbuffer_obj(ringbuf);
-				i915_gem_object_ggtt_unpin(ctx_obj);
-			}
-			WARN_ON(ctx->engine[ring->id].pin_count);
-			intel_ringbuffer_free(ringbuf);
-			drm_gem_object_unreference(&ctx_obj->base);
+			intel_lr_context_clean_ring(ctx,
+						    ring,
+						    ctx_obj,
+						    ringbuf);
 		}
 	}
 }
@@ -2548,5 +2631,14 @@ void intel_lr_context_reset(struct drm_device *dev,
 
 		ringbuf->head = 0;
 		ringbuf->tail = 0;
+
+		if (ctx->engine[ring->id].dirty) {
+			__intel_lr_context_unpin(
+					ring,
+					ctx);
+			ctx->engine[ring->id].dirty = false;
+			if (ring->lrc_last_context == ctx)
+				ring->lrc_last_context = NULL;
+		}
 	}
 }
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index de41ad6..48690f79 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -106,6 +106,7 @@ void intel_lr_context_reset(struct drm_device *dev,
 			struct intel_context *ctx);
 uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
 				     struct intel_engine_cs *ring);
+void intel_lr_context_complete_check(struct drm_i915_gem_request *req);
 
 /* Execlists */
 int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 7349d92..fb1df83 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -276,6 +276,7 @@ struct intel_engine_cs {
 				  u32 flush_domains);
 	int (*emit_bb_start)(struct drm_i915_gem_request *req,
 			     u64 offset, unsigned dispatch_flags);
+	struct intel_context *lrc_last_context;
 
 	/**
 	 * List of objects currently involved in rendering from the
Use the first retired request on a new context to unpin
the old context. This ensures that the hw context remains
bound until it has been written back to by the GPU.
Now that the context is pinned until later in the request/context
lifecycle, it no longer needs to be pinned from context_queue to
retire_requests.
This fixes an issue with GuC submission where the GPU might not
have finished writing back the context before it is unpinned. This
results in a GPU hang.

v2: Moved the new pin to cover GuC submission (Alex Dai)
    Moved the new unpin to request_retire to fix coverage leak
v3: Added switch to default context if freeing a still pinned
    context just in case the hw was actually still using it
v4: Unwrapped context unpin to allow calling without a request
v5: Only create a switch to idle context if the ring doesn't
    already have a request pending on it (Alex Dai)
    Rename unsaved to dirty to avoid double negatives (Dave Gordon)
    Changed _no_req postfix to __ prefix for consistency (Dave Gordon)
    Split out per engine cleanup from context_free as it
    was getting unwieldy
    Corrected locking (Dave Gordon)
v6: Removed some bikeshedding (Mika Kuoppala)
    Added explanation of the GuC hang that this fixes (Daniel Vetter)
v7: Removed extra per request pinning from ring reset code (Alex Dai)
    Added forced ring unpin/clean in error case in context free (Alex Dai)
v8: Renamed lrc specific last_context to lrc_last_context as there
    were some reset cases where the codepaths leaked (Mika Kuoppala)
    NULL'd last_context in reset case - there was a pointer leak
    if someone did reset->close context.
v9: Rebase over "Fix context/engine cleanup order"
v10: Rebase over nightly, remove WARN_ON which caused the
     dependency on dev.

Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Issue: VIZ-4277
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Alex Dai <yu.dai@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |   1 +
 drivers/gpu/drm/i915/i915_gem.c         |   3 +
 drivers/gpu/drm/i915/intel_lrc.c        | 138 ++++++++++++++++++++++++++------
 drivers/gpu/drm/i915/intel_lrc.h        |   1 +
 drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
 5 files changed, 121 insertions(+), 23 deletions(-)
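[Condensed from the diff above, the lifecycle the patch implements can be read as the following simplified sketch; error paths, locking and the GuC/ELSP submission split are omitted, and the sketch_* names are stand-ins for the real functions.]

/* Simplified sketch of the patch's pin/unpin lifecycle. */

/* At submission: pin the context the first time it becomes dirty. */
static void sketch_advance_and_submit(struct drm_i915_gem_request *request)
{
	struct intel_engine_cs *ring = request->ring;

	if (request->ctx != ring->default_context &&
	    !request->ctx->engine[ring->id].dirty) {
		intel_lr_context_pin(request);	/* held until written back */
		request->ctx->engine[ring->id].dirty = true;
	}
}

/* At retirement: the first retired request on a *new* context proves
 * the hw has switched away, so the previous context image is saved and
 * its pin can finally be dropped.
 */
static void sketch_request_retire(struct drm_i915_gem_request *req)
{
	struct intel_engine_cs *ring = req->ring;
	struct intel_context *last = ring->lrc_last_context;

	if (last && last != req->ctx && last->engine[ring->id].dirty) {
		__intel_lr_context_unpin(ring, last);
		last->engine[ring->id].dirty = false;
	}
	ring->lrc_last_context = req->ctx;
}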