From patchwork Mon Nov 7 10:30:56 2016
X-Patchwork-Submitter: sourab.gupta@intel.com
X-Patchwork-Id: 9414677
From: sourab.gupta@intel.com
To: intel-gfx@lists.freedesktop.org
Date: Mon, 7 Nov 2016 16:00:56 +0530
Message-Id: <1478514656-31035-1-git-send-email-sourab.gupta@intel.com>
In-Reply-To: <20161104100450.GJ15981@nuc-i3427.alporthouse.com>
References: <20161104100450.GJ15981@nuc-i3427.alporthouse.com>
Cc: Daniel Vetter, Sourab Gupta, Matthew Auld
Subject: [Intel-gfx] [PATCH v2 08/15] drm/i915: Add support for emitting execbuffer tags through OA counter reports

From: Sourab Gupta

This patch lets userspace specify a per-workload tag through the execbuffer
ioctl. The tag can be added to OA reports to help associate each report with
the workload that produced it.

From a userspace perspective, a single context may contain multiple stages.
The OA reports need to be individually associated with their corresponding
workloads (execbuffers), which may not be possible with ctx_id or pid
information alone. This patch provides such a mechanism.

The tag is passed in the upper 32 bits of the rsvd1 field, which were
previously unused.

v2: Corrected the tag extraction macro (Chris)

Signed-off-by: Sourab Gupta
---
 drivers/gpu/drm/i915/i915_drv.h            |  6 +++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  6 +++--
 drivers/gpu/drm/i915/i915_perf.c           | 38 ++++++++++++++++++++++++++----
 include/uapi/drm/i915_drm.h                | 12 ++++++++++
 4 files changed, 53 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f250e7b..0f171f8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1814,7 +1814,7 @@ struct i915_perf_stream_ops {
 	 * Routine to emit the commands in the command streamer associated
 	 * with the corresponding gpu engine.
 	 */
-	void (*command_stream_hook)(struct drm_i915_gem_request *req);
+	void (*command_stream_hook)(struct drm_i915_gem_request *req, u32 tag);
 };
 
 enum i915_perf_stream_state {
@@ -1873,6 +1873,7 @@ struct i915_perf_cs_data_node {
 	u32 offset;
 	u32 ctx_id;
 	u32 pid;
+	u32 tag;
 };
 
 struct drm_i915_private {
@@ -2244,6 +2245,7 @@ struct drm_i915_private {
 
 		u32 last_ctx_id;
 		u32 last_pid;
+		u32 last_tag;
 		struct list_head node_list;
 		spinlock_t node_list_lock;
 	} perf;
@@ -3666,7 +3668,7 @@ void i915_oa_legacy_ctx_switch_notify(struct drm_i915_gem_request *req);
 void i915_oa_update_reg_state(struct intel_engine_cs *engine,
 			      struct i915_gem_context *ctx,
 			      uint32_t *reg_state);
-void i915_perf_command_stream_hook(struct drm_i915_gem_request *req);
+void i915_perf_command_stream_hook(struct drm_i915_gem_request *req, u32 tag);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index da502c7..d89787b 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -58,6 +58,7 @@ struct i915_execbuffer_params {
 	struct intel_engine_cs *engine;
 	struct i915_gem_context *ctx;
 	struct drm_i915_gem_request *request;
+	uint32_t tag;
 };
 
 struct eb_vmas {
@@ -1523,7 +1524,7 @@ execbuf_submit(struct i915_execbuffer_params *params,
 	if (exec_len == 0)
 		exec_len = params->batch->size - params->args_batch_start_offset;
 
-	i915_perf_command_stream_hook(params->request);
+	i915_perf_command_stream_hook(params->request, params->tag);
 
 	ret = params->engine->emit_bb_start(params->request,
 					    exec_start, exec_len,
@@ -1531,7 +1532,7 @@ execbuf_submit(struct i915_execbuffer_params *params,
 	if (ret)
 		return ret;
 
-	i915_perf_command_stream_hook(params->request);
+	i915_perf_command_stream_hook(params->request, params->tag);
 
 	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
 
@@ -1843,6 +1844,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	params->engine = engine;
 	params->dispatch_flags = dispatch_flags;
 	params->ctx = ctx;
+	params->tag = i915_execbuffer2_get_tag(*args);
 
 	ret = execbuf_submit(params, args, &eb->vmas);
 err_request:
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 0a13672..18489c2 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -255,6 +255,7 @@ struct oa_sample_data {
 	u32 source;
 	u32 ctx_id;
 	u32 pid;
+	u32 tag;
 	const u8 *report;
 };
 
@@ -311,6 +312,7 @@ static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
 #define SAMPLE_OA_SOURCE_INFO	(1<<1)
 #define SAMPLE_CTX_ID		(1<<2)
 #define SAMPLE_PID		(1<<3)
+#define SAMPLE_TAG		(1<<4)
 
 struct perf_open_properties {
 	u32 sample_flags;
@@ -335,7 +337,8 @@ struct perf_open_properties {
  * perf mutex lock.
  */
 
-void i915_perf_command_stream_hook(struct drm_i915_gem_request *request)
+void i915_perf_command_stream_hook(struct drm_i915_gem_request *request,
+					u32 tag)
 {
 	struct intel_engine_cs *engine = request->engine;
 	struct drm_i915_private *dev_priv = engine->i915;
@@ -348,7 +351,7 @@ void i915_perf_command_stream_hook(struct drm_i915_gem_request *request)
 	list_for_each_entry(stream, &dev_priv->perf.streams, link) {
 		if ((stream->state == I915_PERF_STREAM_ENABLED) &&
 					stream->cs_mode)
-			stream->ops->command_stream_hook(request);
+			stream->ops->command_stream_hook(request, tag);
 	}
 	mutex_unlock(&dev_priv->perf.streams_lock);
 }
@@ -462,7 +465,8 @@ out_unlock:
 	return ret;
 }
 
-static void i915_perf_command_stream_hook_oa(struct drm_i915_gem_request *req)
+static void i915_perf_command_stream_hook_oa(struct drm_i915_gem_request *req,
+						u32 tag)
 {
 	struct drm_i915_private *dev_priv = req->i915;
 	struct intel_ring *ring = req->ring;
@@ -487,6 +491,7 @@ static void i915_perf_command_stream_hook_oa(struct drm_i915_gem_request *req)
 
 	entry->ctx_id = ctx->hw_id;
 	entry->pid = current->pid;
+	entry->tag = tag;
 	i915_gem_request_assign(&entry->request, req);
 
 	addr = dev_priv->perf.command_stream_buf.vma->node.start +
@@ -744,6 +749,12 @@ static int append_oa_sample(struct i915_perf_stream *stream,
 		buf += 4;
 	}
 
+	if (sample_flags & SAMPLE_TAG) {
+		if (copy_to_user(buf, &data->tag, 4))
+			return -EFAULT;
+		buf += 4;
+	}
+
 	if (sample_flags & SAMPLE_OA_REPORT) {
 		if (copy_to_user(buf, data->report, report_size))
 			return -EFAULT;
@@ -789,6 +800,9 @@ static int append_oa_buffer_sample(struct i915_perf_stream *stream,
 	if (sample_flags & SAMPLE_PID)
 		data.pid = dev_priv->perf.last_pid;
 
+	if (sample_flags & SAMPLE_TAG)
+		data.tag = dev_priv->perf.last_tag;
+
 	if (sample_flags & SAMPLE_OA_REPORT)
 		data.report = report;
 
@@ -1310,6 +1324,11 @@ static int append_oa_rcs_sample(struct i915_perf_stream *stream,
 		dev_priv->perf.last_pid = node->pid;
 	}
 
+	if (sample_flags & SAMPLE_TAG) {
+		data.tag = node->tag;
+		dev_priv->perf.last_tag = node->tag;
+	}
+
 	if (sample_flags & SAMPLE_OA_REPORT)
 		data.report = report;
 
@@ -2144,7 +2163,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	struct drm_i915_private *dev_priv = stream->dev_priv;
 	bool require_oa_unit = props->sample_flags & (SAMPLE_OA_REPORT |
 						      SAMPLE_OA_SOURCE_INFO);
-	bool require_cs_mode = props->sample_flags & SAMPLE_PID;
+	bool require_cs_mode = props->sample_flags & (SAMPLE_PID |
+						      SAMPLE_TAG);
 	bool cs_sample_data = props->sample_flags & SAMPLE_OA_REPORT;
 	int ret;
 
@@ -2297,7 +2317,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	}
 
 	if (require_cs_mode && !props->cs_mode) {
-		DRM_ERROR("PID sampling requires a ring to be specified");
+		DRM_ERROR("PID or TAG sampling require a ring to be specified");
 		ret = -EINVAL;
 		goto cs_error;
 	}
@@ -2330,6 +2350,11 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 			stream->sample_size += 4;
 		}
 
+		if (props->sample_flags & SAMPLE_TAG) {
+			stream->sample_flags |= SAMPLE_TAG;
+			stream->sample_size += 4;
+		}
+
 		ret = alloc_command_stream_buf(dev_priv);
 		if (ret)
 			goto cs_error;
@@ -3005,6 +3030,9 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 		case DRM_I915_PERF_PROP_SAMPLE_PID:
 			props->sample_flags |= SAMPLE_PID;
 			break;
+		case DRM_I915_PERF_PROP_SAMPLE_TAG:
+			props->sample_flags |= SAMPLE_TAG;
+			break;
 		case DRM_I915_PERF_PROP_MAX:
 			BUG();
 		}
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index ead97b7f4..452c497 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -832,6 +832,11 @@ struct drm_i915_gem_execbuffer2 {
 #define i915_execbuffer2_get_context_id(eb2) \
 	((eb2).rsvd1 & I915_EXEC_CONTEXT_ID_MASK)
 
+/* upper 32 bits of rsvd1 field contain tag */
+#define I915_EXEC_TAG_MASK		(0xffffffff00000000UL)
+#define i915_execbuffer2_get_tag(eb2) \
+	(((eb2).rsvd1 & I915_EXEC_TAG_MASK) >> 32)
+
 struct drm_i915_gem_pin {
 	/** Handle of the buffer to be pinned. */
 	__u32 handle;
@@ -1313,6 +1318,12 @@ enum drm_i915_perf_property_id {
 	 */
 	DRM_I915_PERF_PROP_SAMPLE_PID,
 
+	/**
+	 * The value of this property set to 1 requests inclusion of tag in the
+	 * perf sample data.
+	 */
+	DRM_I915_PERF_PROP_SAMPLE_TAG,
+
 	DRM_I915_PERF_PROP_MAX /* non-ABI */
 };
 
@@ -1380,6 +1391,7 @@ enum drm_i915_perf_record_type {
 	 *     { u32 source_info; } && DRM_I915_PERF_PROP_SAMPLE_OA_SOURCE
 	 *     { u32 ctx_id; } && DRM_I915_PERF_PROP_SAMPLE_CTX_ID
 	 *     { u32 pid; } && DRM_I915_PERF_PROP_SAMPLE_PID
+	 *     { u32 tag; } && DRM_I915_PERF_PROP_SAMPLE_TAG
 	 *     { u32 oa_report[]; } && DRM_I915_PERF_PROP_SAMPLE_OA
 	 * };
 	 */
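
Illustrative note, not part of the patch: the rsvd1 encoding and the optional sample fields described above can be sketched in standalone C roughly as follows. This is only a sketch under stated assumptions. The two masks are local copies of the uapi definitions (the 0xffffffff context-ID mask is assumed from the existing i915_drm.h), and pack_rsvd1(), the WANT_* flags, struct sample_fields and parse_sample_fields() are hypothetical names that exist in neither the kernel nor any userspace library. The parser assumes the buffer starts just past the record header and that the caller knows which sample properties were requested when the stream was opened.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the uapi masks so the sketch builds without kernel headers. */
#define EXEC_CONTEXT_ID_MASK	0xffffffffULL		/* assumed from existing i915_drm.h */
#define EXEC_TAG_MASK		0xffffffff00000000ULL	/* upper 32 bits, as in this patch */

/* Hypothetical helper: build rsvd1 from a context ID and a per-workload tag. */
uint64_t pack_rsvd1(uint32_t ctx_id, uint32_t tag)
{
	return ((uint64_t)tag << 32) | ctx_id;
}

/* Same extraction that the patch's i915_execbuffer2_get_tag() macro performs. */
uint32_t get_tag(uint64_t rsvd1)
{
	return (uint32_t)((rsvd1 & EXEC_TAG_MASK) >> 32);
}

uint32_t get_context_id(uint64_t rsvd1)
{
	return (uint32_t)(rsvd1 & EXEC_CONTEXT_ID_MASK);
}

/* Hypothetical flags: which optional u32 fields the stream was opened with. */
enum {
	WANT_SOURCE = 1 << 0,
	WANT_CTX_ID = 1 << 1,
	WANT_PID    = 1 << 2,
	WANT_TAG    = 1 << 3,
};

struct sample_fields {
	uint32_t source, ctx_id, pid, tag;
};

/*
 * Read the optional fields in the order they appear in a sample record
 * (source, ctx_id, pid, tag). buf points just past the record header.
 * Returns the number of bytes consumed, i.e. the offset at which the
 * OA report would start if one was requested.
 */
size_t parse_sample_fields(const uint8_t *buf, unsigned int want,
			   struct sample_fields *out)
{
	const uint8_t *p = buf;

	assert(buf && out);
	memset(out, 0, sizeof(*out));
	if (want & WANT_SOURCE) { memcpy(&out->source, p, 4); p += 4; }
	if (want & WANT_CTX_ID) { memcpy(&out->ctx_id, p, 4); p += 4; }
	if (want & WANT_PID)    { memcpy(&out->pid, p, 4);    p += 4; }
	if (want & WANT_TAG)    { memcpy(&out->tag, p, 4);    p += 4; }
	return (size_t)(p - buf);
}
```

With these helpers, pack_rsvd1(7, 0xdeadbeef) yields 0xdeadbeef00000007, from which get_tag() returns 0xdeadbeef and get_context_id() returns 7. Because the tag sits entirely in the upper half, callers that leave those bits zero keep their context ID unchanged.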