From patchwork Wed Jun 5 13:38:49 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lionel Landwerlin X-Patchwork-Id: 10977005 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6FE0815E6 for ; Wed, 5 Jun 2019 13:39:12 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 60CA6286FE for ; Wed, 5 Jun 2019 13:39:12 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 54D1F28928; Wed, 5 Jun 2019 13:39:12 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 6363C286D4 for ; Wed, 5 Jun 2019 13:39:11 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DD9D089998; Wed, 5 Jun 2019 13:39:10 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 373BE89996 for ; Wed, 5 Jun 2019 13:39:10 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Jun 2019 06:39:10 -0700 X-ExtLoop1: 1 Received: from mkopansk-mobl1.ger.corp.intel.com (HELO delly.ger.corp.intel.com) ([10.249.139.46]) by fmsmga008.fm.intel.com with ESMTP; 05 Jun 2019 06:39:08 -0700 From: Lionel Landwerlin To: intel-gfx@lists.freedesktop.org Date: Wed, 5 Jun 2019 16:38:49 +0300 Message-Id: <20190605133852.4493-6-lionel.g.landwerlin@intel.com> X-Mailer: git-send-email 2.21.0.392.gf8f6787159e In-Reply-To: <20190605133852.4493-1-lionel.g.landwerlin@intel.com> References: <20190605133852.4493-1-lionel.g.landwerlin@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v4 5/7] drm/i915: add a new perf configuration execbuf parameter X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP We want the ability to dispatch a set of command buffer to the hardware, each with a different OA configuration. To achieve this, we reuse a couple of fields from the execbuf2 struct (I CAN HAZ execbuf3?) to notify what OA configuration should be used for a batch buffer. This requires the process making the execbuf with this flag to also own the perf fd at the time of execbuf. v2: Add a emit_oa_config() vfunc in the intel_engine_cs (Chris) Move oa_config vma to active (Chris) v3: Don't drop the lock for engine lookup (Chris) Move OA config vma to active before writing the ringbuffer (Chris) v4: Reuse i915_user_extension_fn Serialize requests with OA config updates Signed-off-by: Lionel Landwerlin --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 130 +++++++++++++++++- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 + drivers/gpu/drm/i915/gt/intel_engine_types.h | 9 ++ drivers/gpu/drm/i915/gt/intel_lrc.c | 1 + drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 4 +- drivers/gpu/drm/i915/i915_drv.c | 4 + drivers/gpu/drm/i915/i915_drv.h | 10 +- drivers/gpu/drm/i915/i915_perf.c | 26 ++-- include/uapi/drm/i915_drm.h | 37 +++++ 9 files changed, 208 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 86864ec76a8b..dc7e765edf4b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -281,7 +281,11 @@ struct i915_execbuffer { struct { u64 flags; /** Available extensions parameters */ struct drm_i915_gem_execbuffer_ext_timeline_fences timeline_fences; + struct drm_i915_gem_execbuffer_ext_perf perf_config; } extensions; + + struct i915_oa_config *oa_config; /** HW configuration for OA, NULL is not needed. */ + struct drm_i915_gem_object *oa_bo; }; #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags]) @@ -1199,6 +1203,34 @@ static int reloc_move_to_gpu(struct i915_request *rq, struct i915_vma *vma) return err; } + +static int +get_execbuf_oa_config(struct drm_i915_private *dev_priv, + s32 perf_fd, u64 oa_config_id, + struct i915_oa_config **out_oa_config, + struct drm_i915_gem_object **out_oa_obj) +{ + struct file *perf_file; + int ret; + + if (!dev_priv->perf.oa.exclusive_stream) + return -EINVAL; + + perf_file = fget(perf_fd); + if (!perf_file) + return -EINVAL; + + if (perf_file->private_data != dev_priv->perf.oa.exclusive_stream) + return -EINVAL; + + fput(perf_file); + + ret = i915_perf_get_oa_config(dev_priv, oa_config_id, + out_oa_config, out_oa_obj); + + return ret; +} + static int __reloc_gpu_alloc(struct i915_execbuffer *eb, struct i915_vma *vma, unsigned int len) @@ -2061,6 +2093,64 @@ add_to_client(struct i915_request *rq, struct drm_file *file) list_add_tail(&rq->client_link, &rq->file_priv->mm.request_list); } +static int eb_oa_config(struct i915_execbuffer *eb) +{ + struct i915_vma *oa_vma; + int err; + + if (!eb->oa_config) + return 0; + + err = i915_active_request_set(&eb->engine->last_oa_config, + eb->request); + if (err) + return err; + + /* + * If the perf stream was opened with hold preemption, flag the + * request properly so that the priority of the request is bumped once + * it reaches the execlist ports. + */ + if (eb->i915->perf.oa.exclusive_stream->hold_preemption) + eb->request->has_perf = true; + + /* + * If the config hasn't changed, skip reconfiguring the HW (this is + * subject to a delay we want to avoid has much as possible). + */ + if (eb->oa_config == eb->i915->perf.oa.exclusive_stream->oa_config) + return 0; + + oa_vma = i915_vma_instance(eb->oa_bo, + &eb->engine->i915->ggtt.vm, NULL); + if (unlikely(IS_ERR(oa_vma))) + return PTR_ERR(oa_vma); + + err = i915_vma_pin(oa_vma, 0, 0, PIN_GLOBAL); + if (err) + return err; + + err = i915_vma_move_to_active(oa_vma, eb->request, 0); + if (err) { + i915_vma_unpin(oa_vma); + return err; + } + + err = eb->engine->emit_bb_start(eb->request, + oa_vma->node.start, + 0, I915_DISPATCH_SECURE); + if (err) { + i915_vma_unpin(oa_vma); + return err; + } + + i915_vma_unpin(oa_vma); + + swap(eb->oa_config, eb->i915->perf.oa.exclusive_stream->oa_config); + + return 0; +} + static int eb_submit(struct i915_execbuffer *eb) { int err; @@ -2087,6 +2177,10 @@ static int eb_submit(struct i915_execbuffer *eb) return err; } + err = eb_oa_config(eb); + if (err) + return err; + err = eb->engine->emit_bb_start(eb->request, eb->batch->node.start + eb->batch_start_offset, @@ -2508,8 +2602,22 @@ static int parse_timeline_fences(struct i915_user_extension __user *ext, void *d return 0; } +static int parse_perf_config(struct i915_user_extension __user *ext, void *data) +{ + struct i915_execbuffer *eb = data; + + if (copy_from_user(&eb->extensions.perf_config, ext, + sizeof(eb->extensions.perf_config))) + return -EFAULT; + + eb->extensions.flags |= BIT(DRM_I915_GEM_EXECBUFFER_EXT_PERF); + + return 0; +} + static const i915_user_extension_fn execbuf_extensions[] = { [DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES] = parse_timeline_fences, + [DRM_I915_GEM_EXECBUFFER_EXT_PERF] = parse_perf_config, }; static int @@ -2561,6 +2669,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, eb.buffer_count = args->buffer_count; eb.batch_start_offset = args->batch_start_offset; eb.batch_len = args->batch_len; + eb.oa_config = NULL; eb.batch_flags = 0; if (args->flags & I915_EXEC_SECURE) { @@ -2639,9 +2748,23 @@ i915_gem_do_execbuffer(struct drm_device *dev, if (unlikely(err)) goto err_unlock; + if (eb.extensions.flags & BIT(DRM_I915_GEM_EXECBUFFER_EXT_PERF)) { + if (!intel_engine_has_oa(eb.engine)) { + err = -ENODEV; + goto err_engine; + } + + err = get_execbuf_oa_config(eb.i915, + eb.extensions.perf_config.perf_fd, + eb.extensions.perf_config.oa_config, + &eb.oa_config, &eb.oa_bo); + if (err) + goto err_engine; + } + err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */ if (unlikely(err)) - goto err_engine; + goto err_oa; err = eb_relocate(&eb); if (err) { @@ -2794,6 +2917,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, err_vma: if (eb.exec) eb_release_vmas(&eb); +err_oa: + if (eb.oa_config) { + i915_gem_object_put(eb.oa_bo); + i915_oa_config_put(eb.oa_config); + } err_engine: eb_unpin_context(&eb); err_unlock: diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 6b838948ba24..7b2e81c7930d 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -840,6 +840,8 @@ int intel_engine_init_common(struct intel_engine_cs *engine) engine->set_default_submission(engine); + INIT_ACTIVE_REQUEST(&engine->last_oa_config); + return 0; err_unpin: diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 01223864237a..0210a905e636 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -363,6 +363,8 @@ struct intel_engine_cs { struct i915_wa_list wa_list; struct i915_wa_list whitelist; + struct i915_active_request last_oa_config; + u32 irq_keep_mask; /* always keep these interrupts */ u32 irq_enable_mask; /* bitmask to enable ring interrupt */ void (*irq_enable)(struct intel_engine_cs *engine); @@ -457,6 +459,7 @@ struct intel_engine_cs { #define I915_ENGINE_HAS_SEMAPHORES BIT(3) #define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(4) #define I915_ENGINE_IS_VIRTUAL BIT(5) +#define I915_ENGINE_HAS_OA BIT(6) unsigned int flags; /* @@ -552,6 +555,12 @@ intel_engine_is_virtual(const struct intel_engine_cs *engine) return engine->flags & I915_ENGINE_IS_VIRTUAL; } +static inline bool +intel_engine_has_oa(const struct intel_engine_cs *engine) +{ + return engine->flags & I915_ENGINE_HAS_OA; +} + #define instdone_slice_mask(dev_priv__) \ (IS_GEN(dev_priv__, 7) ? \ 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index fed704802c57..ed19f4e53d31 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -2732,6 +2732,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine) engine->init_context = gen8_init_rcs_context; engine->emit_flush = gen8_emit_flush_render; engine->emit_fini_breadcrumb = gen8_emit_fini_breadcrumb_rcs; + engine->flags |= I915_ENGINE_HAS_OA; } return 0; diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c index ff58d658e3e2..972193cfbf41 100644 --- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c @@ -2212,8 +2212,10 @@ static void setup_rcs(struct intel_engine_cs *engine) engine->irq_enable_mask = I915_USER_INTERRUPT; } - if (IS_HASWELL(i915)) + if (IS_HASWELL(i915)) { engine->emit_bb_start = hsw_emit_bb_start; + engine->flags |= I915_ENGINE_HAS_OA; + } engine->resume = rcs_resume; } diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index a872791f98bd..75029d1a3802 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -478,6 +478,10 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data, case I915_PARAM_PERF_REVISION: value = 1; break; + case I915_PARAM_HAS_EXEC_PERF_CONFIG: + /* Obviously requires perf support. */ + value = dev_priv->perf.initialized; + break; default: DRM_DEBUG("Unknown parameter %d\n", param->param); return -EINVAL; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 83a32e9d86b5..7b82bb85e45d 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1156,6 +1156,8 @@ struct i915_oa_reg { }; struct i915_oa_config { + struct drm_i915_private *i915; + char uuid[UUID_STRING_LEN + 1]; int id; @@ -1174,7 +1176,7 @@ struct i915_oa_config { struct list_head vma_link; - atomic_t ref_count; + struct kref ref; }; struct i915_perf_stream; @@ -2727,6 +2729,12 @@ int i915_perf_get_oa_config(struct drm_i915_private *i915, int metrics_set, struct i915_oa_config **out_config, struct drm_i915_gem_object **out_obj); +void i915_oa_config_release(struct kref *ref); + +static inline void i915_oa_config_put(struct i915_oa_config *oa_config) +{ + kref_put(&oa_config->ref, i915_oa_config_release); +} /* i915_gem_evict.c */ int __must_check i915_gem_evict_something(struct i915_address_space *vm, diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index e0071e44de3d..4e69266bba61 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -366,10 +366,11 @@ struct perf_open_properties { int oa_period_exponent; }; -static void put_oa_config(struct i915_oa_config *oa_config) +void i915_oa_config_release(struct kref *ref) { - if (!atomic_dec_and_test(&oa_config->ref_count)) - return; + struct i915_oa_config *oa_config = container_of(ref, typeof(*oa_config), ref); + + lockdep_assert_held(&oa_config->i915->perf.metrics_lock); if (oa_config->obj) { list_del(&oa_config->vma_link); @@ -478,7 +479,7 @@ int i915_perf_get_oa_config(struct drm_i915_private *i915, } if (out_config) { - atomic_inc(&oa_config->ref_count); + kref_get(&oa_config->ref); *out_config = oa_config; } @@ -500,7 +501,7 @@ int i915_perf_get_oa_config(struct drm_i915_private *i915, err_buf_alloc: if (out_config) { - put_oa_config(oa_config); + i915_oa_config_put(oa_config); *out_config = NULL; } unlock: @@ -1475,7 +1476,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream) if (stream->ctx) oa_put_render_ctx_id(stream); - put_oa_config(stream->oa_config); + i915_oa_config_put(stream->oa_config); if (dev_priv->perf.oa.spurious_report_rs.missed) { DRM_NOTE("%d spurious OA report notices suppressed due to ratelimiting\n", @@ -2243,7 +2244,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, free_oa_buffer(dev_priv); err_oa_buf_alloc: - put_oa_config(stream->oa_config); + i915_oa_config_put(stream->oa_config); intel_uncore_forcewake_put(&dev_priv->uncore, FORCEWAKE_ALL); intel_runtime_pm_put(dev_priv, stream->wakeref); @@ -3052,7 +3053,7 @@ void i915_perf_register(struct drm_i915_private *dev_priv) if (ret) goto sysfs_error; - atomic_set(&dev_priv->perf.oa.test_config.ref_count, 1); + dev_priv->perf.oa.test_config.i915 = dev_priv; goto exit; @@ -3307,7 +3308,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data, return -ENOMEM; } - atomic_set(&oa_config->ref_count, 1); + oa_config->i915 = dev_priv; + kref_init(&oa_config->ref); if (!uuid_is_valid(args->uuid)) { DRM_DEBUG("Invalid uuid format for OA config\n"); @@ -3406,7 +3408,7 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data, sysfs_err: mutex_unlock(&dev_priv->perf.metrics_lock); reg_err: - put_oa_config(oa_config); + i915_oa_config_put(oa_config); DRM_DEBUG("Failed to add new OA config\n"); return err; } @@ -3460,7 +3462,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data, DRM_DEBUG("Removed config %s id=%i\n", oa_config->uuid, oa_config->id); - put_oa_config(oa_config); + i915_oa_config_put(oa_config); config_err: mutex_unlock(&dev_priv->perf.metrics_lock); @@ -3622,7 +3624,7 @@ static int destroy_config(int id, void *p, void *data) { struct i915_oa_config *oa_config = p; - put_oa_config(oa_config); + i915_oa_config_put(oa_config); return 0; } diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index b7fe1915e34d..5938a73963fe 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -623,6 +623,16 @@ typedef struct drm_i915_irq_wait { */ #define I915_PARAM_HAS_EXEC_TIMELINE_FENCES 55 +/* + * Request an i915/perf performance configuration change before running the + * commands given in an execbuf. + * + * Performance configuration ID and the file descriptor of the i915 perf + * stream are given through drm_i915_gem_execbuffer_ext_perf. See + * I915_EXEC_EXT. + */ +#define I915_PARAM_HAS_EXEC_PERF_CONFIG 56 + /* Must be kept compact -- no holes and well documented */ typedef struct drm_i915_getparam { @@ -1026,6 +1036,12 @@ enum drm_i915_gem_execbuffer_ext { */ DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES, + /** + * This identifier is associated with + * drm_i915_gem_execbuffer_perf_ext. + */ + DRM_I915_GEM_EXECBUFFER_EXT_PERF, + DRM_I915_GEM_EXECBUFFER_EXT_MAX /* non-ABI */ }; @@ -1054,6 +1070,27 @@ struct drm_i915_gem_execbuffer_ext_timeline_fences { __u64 values_ptr; }; +struct drm_i915_gem_execbuffer_ext_perf { + struct i915_user_extension base; + + /** + * Performance file descriptor returned by DRM_IOCTL_I915_PERF_OPEN. + * This is used to identify that the application + */ + __s32 perf_fd; + + /** + * Unused for now. Must be cleared to zero. + */ + __u32 pad; + + /** + * OA configuration ID to switch to before executing the commands + * associated to the execbuf. + */ + __u64 oa_config; +}; + struct drm_i915_gem_execbuffer2 { /** * List of gem_exec_object2 structs