From patchwork Wed Apr 20 17:13:21 2016
X-Patchwork-Submitter: John Harrison
X-Patchwork-Id: 8893271
From: John.C.Harrison@Intel.com
To: Intel-GFX@Lists.FreeDesktop.Org
Date: Wed, 20 Apr 2016 18:13:21 +0100
Message-Id: <1461172435-4256-4-git-send-email-John.C.Harrison@Intel.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1461172435-4256-1-git-send-email-John.C.Harrison@Intel.com>
References: <1461172435-4256-1-git-send-email-John.C.Harrison@Intel.com>
Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ
Subject: [Intel-gfx] [PATCH v6 03/34] drm/i915: Split i915_gem_do_execbuffer() in half

From: John Harrison

Split the execbuffer() function in half. The first half collects and
validates all the information required to process the batch buffer. It
also does all the object pinning, relocations, active list management,
etc. - basically anything that must be done up front before the IOCTL
returns and allows the userland side to start changing/freeing things.
The second half does the actual ring submission.

This change implements the split but leaves the back half being called
directly from the end of the front half.

v2: Updated due to changes in underlying tree - addition of sync fence
support and removal of cliprects.

v3: Moved local 'ringbuf' variable to make later patches in the series
a bit neater.

v4: Corrected a typo in the commit message and downgraded a BUG_ON to a
WARN_ON as the latter is preferred.
Also removed all the external sync/fence support as that will now be a
separate patch series.

v5: Updated for runtime PM changes.

v6: Updated to newer nightly (lots of ring -> engine renaming).

Renamed 'i915_gem_execbuff_release_batch_obj' to
'i915_gem_execbuf_release_batch_obj' and updated to use 'to_i915()'
instead of dev_private. [review feedback from Joonas Lahtinen]

Added an explicit remove_from_client() call to the failure path to fix
a race condition with invalid requests on the client list.

For: VIZ-1587
Signed-off-by: John Harrison
Reviewed-by: Jesse Barnes
---
 drivers/gpu/drm/i915/i915_drv.h            |  11 +++
 drivers/gpu/drm/i915/i915_gem.c            |   2 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 113 +++++++++++++++++++++--------
 drivers/gpu/drm/i915/intel_lrc.c           |  57 ++++++++++-----
 drivers/gpu/drm/i915/intel_lrc.h           |   1 +
 5 files changed, 137 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9519b11..f978539 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1701,10 +1701,18 @@ struct i915_execbuffer_params {
         struct drm_device *dev;
         struct drm_file *file;
         uint32_t dispatch_flags;
+        uint32_t args_flags;
         uint32_t args_batch_start_offset;
+        uint32_t args_batch_len;
+        uint32_t args_num_cliprects;
+        uint32_t args_DR1;
+        uint32_t args_DR4;
         uint64_t batch_obj_vm_offset;
         struct intel_engine_cs *engine;
         struct drm_i915_gem_object *batch_obj;
+        struct drm_clip_rect *cliprects;
+        uint32_t instp_mask;
+        int instp_mode;
         struct intel_context *ctx;
         struct drm_i915_gem_request *request;
 };
@@ -1991,6 +1999,7 @@ struct drm_i915_private {
         int (*execbuf_submit)(struct i915_execbuffer_params *params,
                               struct drm_i915_gem_execbuffer2 *args,
                               struct list_head *vmas);
+        int (*execbuf_final)(struct i915_execbuffer_params *params);
         int (*init_engines)(struct drm_device *dev);
         void (*cleanup_engine)(struct intel_engine_cs *engine);
         void (*stop_engine)(struct intel_engine_cs *engine);
@@ -2896,9 +2905,11 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
 void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
                                         struct drm_i915_gem_request *req);
 void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params);
+void i915_gem_execbuf_release_batch_obj(struct drm_i915_gem_object *batch_obj);
 int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
                                    struct drm_i915_gem_execbuffer2 *args,
                                    struct list_head *vmas);
+int i915_gem_ringbuffer_submission_final(struct i915_execbuffer_params *params);
 int i915_gem_execbuffer(struct drm_device *dev, void *data,
                         struct drm_file *file_priv);
 int i915_gem_execbuffer2(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d1f8c47..21fce67 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5383,11 +5383,13 @@ int i915_gem_init(struct drm_device *dev)
 
         if (!i915.enable_execlists) {
                 dev_priv->gt.execbuf_submit = i915_gem_ringbuffer_submission;
+                dev_priv->gt.execbuf_final = i915_gem_ringbuffer_submission_final;
                 dev_priv->gt.init_engines = i915_gem_init_engines;
                 dev_priv->gt.cleanup_engine = intel_cleanup_engine;
                 dev_priv->gt.stop_engine = intel_stop_engine;
         } else {
                 dev_priv->gt.execbuf_submit = intel_execlists_submission;
+                dev_priv->gt.execbuf_final = intel_execlists_submission_final;
                 dev_priv->gt.init_engines = intel_logical_rings_init;
                 dev_priv->gt.cleanup_engine = intel_logical_ring_cleanup;
                 dev_priv->gt.stop_engine = intel_logical_ring_stop;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index f25fd2c..6e500bf 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1230,41 +1230,38 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
         struct drm_device *dev = params->dev;
         struct intel_engine_cs *engine = params->engine;
         struct drm_i915_private *dev_priv = dev->dev_private;
-        u64 exec_start, exec_len;
-        int instp_mode;
-        u32 instp_mask;
         int ret;
 
-        instp_mode = args->flags & I915_EXEC_CONSTANTS_MASK;
-        instp_mask = I915_EXEC_CONSTANTS_MASK;
-        switch (instp_mode) {
+        params->instp_mode = args->flags & I915_EXEC_CONSTANTS_MASK;
+        params->instp_mask = I915_EXEC_CONSTANTS_MASK;
+        switch (params->instp_mode) {
         case I915_EXEC_CONSTANTS_REL_GENERAL:
         case I915_EXEC_CONSTANTS_ABSOLUTE:
         case I915_EXEC_CONSTANTS_REL_SURFACE:
-                if (instp_mode != 0 && engine != &dev_priv->engine[RCS]) {
+                if (params->instp_mode != 0 && engine != &dev_priv->engine[RCS]) {
                         DRM_DEBUG("non-0 rel constants mode on non-RCS\n");
                         return -EINVAL;
                 }
 
-                if (instp_mode != dev_priv->relative_constants_mode) {
+                if (params->instp_mode != dev_priv->relative_constants_mode) {
                         if (INTEL_INFO(dev)->gen < 4) {
                                 DRM_DEBUG("no rel constants on pre-gen4\n");
                                 return -EINVAL;
                         }
 
                         if (INTEL_INFO(dev)->gen > 5 &&
-                            instp_mode == I915_EXEC_CONSTANTS_REL_SURFACE) {
+                            params->instp_mode == I915_EXEC_CONSTANTS_REL_SURFACE) {
                                 DRM_DEBUG("rel surface constants mode invalid on gen5+\n");
                                 return -EINVAL;
                         }
 
                         /* The HW changed the meaning on this bit on gen6 */
                         if (INTEL_INFO(dev)->gen >= 6)
-                                instp_mask &= ~I915_EXEC_CONSTANTS_REL_SURFACE;
+                                params->instp_mask &= ~I915_EXEC_CONSTANTS_REL_SURFACE;
                 }
                 break;
         default:
-                DRM_DEBUG("execbuf with unknown constants: %d\n", instp_mode);
+                DRM_DEBUG("execbuf with unknown constants: %d\n", params->instp_mode);
                 return -EINVAL;
         }
 
@@ -1274,7 +1271,33 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 
         i915_gem_execbuffer_move_to_active(vmas, params->request);
 
-        /* To be split into two functions here... */
+        ret = dev_priv->gt.execbuf_final(params);
+        if (ret)
+                return ret;
+
+        /*
+         * Free everything that was stored in the QE structure (until the
+         * scheduler arrives and does it instead):
+         */
+        if (params->dispatch_flags & I915_DISPATCH_SECURE)
+                i915_gem_execbuf_release_batch_obj(params->batch_obj);
+
+        return 0;
+}
+
+/*
+ * This is the main function for sending a batch to the engine.
+ * It is called from the scheduler, with the struct_mutex already held.
+ */
+int i915_gem_ringbuffer_submission_final(struct i915_execbuffer_params *params)
+{
+        struct drm_i915_private *dev_priv = to_i915(params->dev);
+        struct intel_engine_cs *engine = params->engine;
+        u64 exec_start, exec_len;
+        int ret;
+
+        /* The mutex must be acquired before calling this function */
+        WARN_ON(!mutex_is_locked(&params->dev->struct_mutex));
 
         /*
          * Unconditionally invalidate gpu caches and ensure that we do flush
@@ -1293,7 +1316,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
              "%s didn't clear reload\n", engine->name);
 
         if (engine == &dev_priv->engine[RCS] &&
-            instp_mode != dev_priv->relative_constants_mode) {
+            params->instp_mode != dev_priv->relative_constants_mode) {
                 ret = intel_ring_begin(params->request, 4);
                 if (ret)
                         return ret;
@@ -1301,19 +1324,19 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
                 intel_ring_emit(engine, MI_NOOP);
                 intel_ring_emit(engine, MI_LOAD_REGISTER_IMM(1));
                 intel_ring_emit_reg(engine, INSTPM);
-                intel_ring_emit(engine, instp_mask << 16 | instp_mode);
+                intel_ring_emit(engine, params->instp_mask << 16 | params->instp_mode);
                 intel_ring_advance(engine);
 
-                dev_priv->relative_constants_mode = instp_mode;
+                dev_priv->relative_constants_mode = params->instp_mode;
         }
 
-        if (args->flags & I915_EXEC_GEN7_SOL_RESET) {
-                ret = i915_reset_gen7_sol_offsets(dev, params->request);
+        if (params->args_flags & I915_EXEC_GEN7_SOL_RESET) {
+                ret = i915_reset_gen7_sol_offsets(params->dev, params->request);
                 if (ret)
                         return ret;
         }
 
-        exec_len   = args->batch_len;
+        exec_len   = params->args_batch_len;
         exec_start = params->batch_obj_vm_offset +
                      params->args_batch_start_offset;
 
@@ -1642,24 +1665,47 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
         params->file = file;
         params->engine = engine;
         params->dispatch_flags = dispatch_flags;
+        params->args_flags = args->flags;
+        params->args_batch_len = args->batch_len;
+        params->args_num_cliprects = args->num_cliprects;
+        params->args_DR1 = args->DR1;
+        params->args_DR4 = args->DR4;
         params->batch_obj = batch_obj;
         params->ctx = ctx;
         params->request = req;
 
         ret = dev_priv->gt.execbuf_submit(params, args, &eb->vmas);
+        if (ret)
+                goto err_client;
+
+        /* the request owns the ref now */
+        i915_gem_context_unreference(ctx);
 
-err_batch_unpin:
         /*
-         * FIXME: We crucially rely upon the active tracking for the (ppgtt)
-         * batch vma for correctness. For less ugly and less fragility this
-         * needs to be adjusted to also track the ggtt batch vma properly as
-         * active.
+         * The eb list is no longer required. The scheduler has extracted all
+         * the information that needs to persist.
         */
+        eb_destroy(eb);
+
+        /*
+         * Don't clean up everything that is now saved away in the queue.
+         * Just unlock and return immediately.
+         */
+        mutex_unlock(&dev->struct_mutex);
+
+        intel_runtime_pm_put(dev_priv);
+
+        return 0;
+
+err_client:
+        if (req->file_priv)
+                i915_gem_request_remove_from_client(req);
+
+err_batch_unpin:
         if (dispatch_flags & I915_DISPATCH_SECURE)
-                i915_gem_object_ggtt_unpin(batch_obj);
+                i915_gem_execbuf_release_batch_obj(batch_obj);
 
 err:
-        /* the request owns the ref now */
         i915_gem_context_unreference(ctx);
         eb_destroy(eb);
@@ -1668,12 +1714,8 @@ err:
          * must be freed again. If it was submitted then it is being tracked
          * on the active request list and no clean up is required here.
          */
-        if (ret && !IS_ERR_OR_NULL(req)) {
-                if (req->file_priv)
-                        i915_gem_request_remove_from_client(req);
-
+        if (ret && !IS_ERR_OR_NULL(req))
                 i915_gem_request_cancel(req);
-        }
 
         mutex_unlock(&dev->struct_mutex);
 
@@ -1684,6 +1726,17 @@ pre_mutex_err:
         return ret;
 }
 
+void i915_gem_execbuf_release_batch_obj(struct drm_i915_gem_object *batch_obj)
+{
+        /*
+         * FIXME: We crucially rely upon the active tracking for the (ppgtt)
+         * batch vma for correctness. For less ugly and less fragility this
+         * needs to be adjusted to also track the ggtt batch vma properly as
+         * active.
+         */
+        i915_gem_object_ggtt_unpin(batch_obj);
+}
+
 /*
  * Legacy execbuffer just creates an exec2 list from the original exec object
  * list array and passes it to the real function.
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 5866342..6ac063d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -945,35 +945,31 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
         struct drm_device *dev = params->dev;
         struct intel_engine_cs *engine = params->engine;
         struct drm_i915_private *dev_priv = dev->dev_private;
-        struct intel_ringbuffer *ringbuf = params->ctx->engine[engine->id].ringbuf;
-        u64 exec_start;
-        int instp_mode;
-        u32 instp_mask;
         int ret;
 
-        instp_mode = args->flags & I915_EXEC_CONSTANTS_MASK;
-        instp_mask = I915_EXEC_CONSTANTS_MASK;
-        switch (instp_mode) {
+        params->instp_mode = args->flags & I915_EXEC_CONSTANTS_MASK;
+        params->instp_mask = I915_EXEC_CONSTANTS_MASK;
+        switch (params->instp_mode) {
         case I915_EXEC_CONSTANTS_REL_GENERAL:
         case I915_EXEC_CONSTANTS_ABSOLUTE:
         case I915_EXEC_CONSTANTS_REL_SURFACE:
-                if (instp_mode != 0 && engine != &dev_priv->engine[RCS]) {
+                if (params->instp_mode != 0 && engine != &dev_priv->engine[RCS]) {
                         DRM_DEBUG("non-0 rel constants mode on non-RCS\n");
                         return -EINVAL;
                 }
 
-                if (instp_mode != dev_priv->relative_constants_mode) {
-                        if (instp_mode == I915_EXEC_CONSTANTS_REL_SURFACE) {
+                if (params->instp_mode != dev_priv->relative_constants_mode) {
+                        if (params->instp_mode == I915_EXEC_CONSTANTS_REL_SURFACE) {
                                 DRM_DEBUG("rel surface constants mode invalid on gen5+\n");
                                 return -EINVAL;
                         }
 
                         /* The HW changed the meaning on this bit on gen6 */
-                        instp_mask &= ~I915_EXEC_CONSTANTS_REL_SURFACE;
+                        params->instp_mask &= ~I915_EXEC_CONSTANTS_REL_SURFACE;
                 }
                 break;
         default:
-                DRM_DEBUG("execbuf with unknown constants: %d\n", instp_mode);
+                DRM_DEBUG("execbuf with unknown constants: %d\n", params->instp_mode);
                 return -EINVAL;
         }
 
@@ -988,7 +984,34 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 
         i915_gem_execbuffer_move_to_active(vmas, params->request);
 
-        /* To be split into two functions here... */
+        ret = dev_priv->gt.execbuf_final(params);
+        if (ret)
+                return ret;
+
+        /*
+         * Free everything that was stored in the QE structure (until the
+         * scheduler arrives and does it instead):
+         */
+        if (params->dispatch_flags & I915_DISPATCH_SECURE)
+                i915_gem_execbuf_release_batch_obj(params->batch_obj);
+
+        return 0;
+}
+
+/*
+ * This is the main function for sending a batch to the engine.
+ * It is called from the scheduler, with the struct_mutex already held.
+ */
+int intel_execlists_submission_final(struct i915_execbuffer_params *params)
+{
+        struct drm_i915_private *dev_priv = to_i915(params->dev);
+        struct intel_ringbuffer *ringbuf = params->request->ringbuf;
+        struct intel_engine_cs *engine = params->engine;
+        u64 exec_start;
+        int ret;
+
+        /* The mutex must be acquired before calling this function */
+        WARN_ON(!mutex_is_locked(&params->dev->struct_mutex));
 
         /*
          * Unconditionally invalidate gpu caches and ensure that we do flush
@@ -999,7 +1022,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
                 return ret;
 
         if (engine == &dev_priv->engine[RCS] &&
-            instp_mode != dev_priv->relative_constants_mode) {
+            params->instp_mode != dev_priv->relative_constants_mode) {
                 ret = intel_logical_ring_begin(params->request, 4);
                 if (ret)
                         return ret;
@@ -1007,14 +1030,14 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
                 intel_logical_ring_emit(ringbuf, MI_NOOP);
                 intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
                 intel_logical_ring_emit_reg(ringbuf, INSTPM);
-                intel_logical_ring_emit(ringbuf, instp_mask << 16 | instp_mode);
+                intel_logical_ring_emit(ringbuf, params->instp_mask << 16 | params->instp_mode);
                 intel_logical_ring_advance(ringbuf);
 
-                dev_priv->relative_constants_mode = instp_mode;
+                dev_priv->relative_constants_mode = params->instp_mode;
         }
 
         exec_start = params->batch_obj_vm_offset +
-                     args->batch_start_offset;
+                     params->args_batch_start_offset;
 
         ret = engine->emit_bb_start(params->request, exec_start, params->dispatch_flags);
         if (ret)
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 9affda2..3d0a4c9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -121,6 +121,7 @@ struct i915_execbuffer_params;
 int intel_execlists_submission(struct i915_execbuffer_params *params,
                                struct drm_i915_gem_execbuffer2 *args,
                                struct list_head *vmas);
+int intel_execlists_submission_final(struct i915_execbuffer_params *params);
 
 void intel_execlists_retire_requests(struct intel_engine_cs *engine);
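[Editor's illustration, not part of the patch] For readers skimming the diff, the shape of the change is: the front half (execbuf_submit) captures everything it needs into the params block, and the new back half (execbuf_final) consumes only that block, so it can later be deferred to a scheduler. Below is a minimal, hedged userspace sketch of that control flow; all types, names, and constants here are simplified stand-ins, not the driver's real API.

/*
 * Standalone sketch of the submit/final split (stand-in types only).
 * Build: cc -o split split.c && ./split
 */
#include <stdio.h>

struct execbuffer_params {              /* stand-in for i915_execbuffer_params */
        unsigned int args_flags;
        unsigned int args_batch_len;
        int instp_mode;
        unsigned int instp_mask;
};

struct gt_vfuncs {                      /* stand-in for the gt.execbuf_* hooks */
        int (*execbuf_submit)(struct execbuffer_params *params);
        int (*execbuf_final)(struct execbuffer_params *params);
};

static struct gt_vfuncs gt;

/* Back half: reads only the params block, as the real ring emission would. */
static int ringbuffer_submission_final(struct execbuffer_params *params)
{
        printf("final: flags=%#x len=%u mode=%d\n",
               params->args_flags, params->args_batch_len, params->instp_mode);
        return 0;
}

/* Front half: validates and stashes state, then (for now) calls the back half directly. */
static int ringbuffer_submission(struct execbuffer_params *params)
{
        params->instp_mode = (int)(params->args_flags & 0x7); /* hypothetical mask */
        params->instp_mask = 0x7;

        return gt.execbuf_final(params);
}

int main(void)
{
        struct execbuffer_params params = { .args_flags = 0x1, .args_batch_len = 4096 };

        gt.execbuf_submit = ringbuffer_submission;
        gt.execbuf_final = ringbuffer_submission_final;

        return gt.execbuf_submit(&params);
}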