
[v2] drm/i915/trace: add hw_id to gem requests trace points

Message ID: 20171215180603.17600-1-lionel.g.landwerlin@intel.com (mailing list archive)
State: New, archived

Commit Message

Lionel Landwerlin Dec. 15, 2017, 6:06 p.m. UTC
When monitoring the GPU with i915 perf, reports are tagged with a hw
id. If we also include the hw_id from i915_gem_context in the request
tracepoints, we can correlate a process with the hw id fields in the
OA reports.

v2: Place hw_id at the end of the tracepoint to not disrupt too much
    existing tools (Chris)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_trace.h | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)
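
For reference, with this patch applied a queued request would render in
the trace along these lines (illustrative values only):

    i915_gem_request_queue: dev=0, ring=0, ctx=27, seqno=1, flags=0x0, hw_id=5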

Comments

Chris Wilson Dec. 17, 2017, 6:45 p.m. UTC | #1
Quoting Lionel Landwerlin (2017-12-15 18:06:03)
> When monitoring the GPU with i915 perf, reports are tagged with a hw
> id. If we also include the hw_id from i915_gem_context in the request
> tracepoints, we can correlate a process with the hw id fields in the
> OA reports.
> 
> v2: Place hw_id at the end of the tracepoint to not disrupt too much
>     existing tools (Chris)
> 
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

This is a good excuse to reopen an old thread:
20170511130045.32271-1-tvrtko.ursulin@linux.intel.com
│     On 11/05/2017 14:16, Tvrtko Ursulin wrote:
│     >
│     > On 11/05/2017 14:07, Chris Wilson wrote:
│     >> On Thu, May 11, 2017 at 02:00:45PM +0100, Tvrtko Ursulin wrote:
│     >>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
│     >>>
│     >>> For userspace receiving binary data it is easier if all related
│     >>> request tracepoints emit the binary data in the same order of
│     >>> dev, ring, ctx, seqno, ...
│     >>
│     >> We decided that dev, ctx, ring, seqno was the right hierarchy last time.
│     >> After much debate :)
│     >> ctx is the logical view of the device for a user
│     >> ctx + ring = timeline
│     >
│     > I couldn't remember, so I thought it must have been what is documented
│     > in TP_printk. Now I am confused. On one hand it's true, but on the
│     > other hand ctx/seqno is also a pair, as opposed to engine being stuck
│     > in the middle. So ring-ctx-seqno is easier for humans, I think. Even
│     > more so since the move to a per ctx/engine seqno space.
│     
│     [thread bump!]
│     
│     Can we agree what to do here? The printks are all engine-ctx-seqno,
│     which makes it sound like that was the order we agreed upon. But the
│     binary blobs are inconsistent, as you have noticed - the question is
│     whether we care enough to go and fiddle with that. Is overlay the only
│     userspace you know of that accesses the binary blobs?
│     
│     Regards,
│     
│     Tvrtko

Given that we are adding info here, we can also go through and make the
order consistent. dev, hw_id, engine, ctx, seqno?
-Chris
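
For illustration, the consistent ordering suggested above could look like
this in the i915_gem_request event class (a sketch only, not part of this
patch; field names are taken from the diff below, with ring standing in
for engine):

    TP_printk("dev=%u, hw_id=%u, ring=%u, ctx=%u, seqno=%u, global=%u",
              __entry->dev, __entry->hw_id, __entry->ring,
              __entry->ctx, __entry->seqno, __entry->global)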

Patch

diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 4e76768ffa95..aa842596996d 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -620,6 +620,7 @@ TRACE_EVENT(i915_gem_request_queue,
 			     __field(u32, ctx)
 			     __field(u32, seqno)
 			     __field(u32, flags)
+			     __field(u32, hw_id)
 			     ),
 
 	    TP_fast_assign(
@@ -628,11 +629,12 @@ TRACE_EVENT(i915_gem_request_queue,
 			   __entry->ctx = req->fence.context;
 			   __entry->seqno = req->fence.seqno;
 			   __entry->flags = flags;
+			   __entry->hw_id = req->ctx->hw_id;
 			   ),
 
-	    TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, flags=0x%x",
+	    TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, flags=0x%x, hw_id=%u",
 		      __entry->dev, __entry->ring, __entry->ctx, __entry->seqno,
-		      __entry->flags)
+		      __entry->flags, __entry->hw_id)
 );
 
 DECLARE_EVENT_CLASS(i915_gem_request,
@@ -645,6 +647,7 @@ DECLARE_EVENT_CLASS(i915_gem_request,
 			     __field(u32, ring)
 			     __field(u32, seqno)
 			     __field(u32, global)
+			     __field(u32, hw_id)
 			     ),
 
 	    TP_fast_assign(
@@ -653,11 +656,12 @@ DECLARE_EVENT_CLASS(i915_gem_request,
 			   __entry->ctx = req->fence.context;
 			   __entry->seqno = req->fence.seqno;
 			   __entry->global = req->global_seqno;
+			   __entry->hw_id = req->ctx->hw_id;
 			   ),
 
-	    TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, global=%u",
+	    TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, global=%u, hw_id=%u",
 		      __entry->dev, __entry->ring, __entry->ctx, __entry->seqno,
-		      __entry->global)
+		      __entry->global, __entry->hw_id)
 );
 
 DEFINE_EVENT(i915_gem_request, i915_gem_request_add,
@@ -688,6 +692,7 @@ DECLARE_EVENT_CLASS(i915_gem_request_hw,
 				     __field(u32, global_seqno)
 				     __field(u32, ctx)
 				     __field(u32, port)
+				     __field(u32, hw_id)
 				    ),
 
 		    TP_fast_assign(
@@ -697,12 +702,13 @@ DECLARE_EVENT_CLASS(i915_gem_request_hw,
 			           __entry->seqno = req->fence.seqno;
 			           __entry->global_seqno = req->global_seqno;
 			           __entry->port = port;
+			           __entry->hw_id = req->ctx->hw_id;
 			          ),
 
-		    TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, global=%u, port=%u",
+		    TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, global=%u, port=%u, hw_id=%u",
 			      __entry->dev, __entry->ring, __entry->ctx,
 			      __entry->seqno, __entry->global_seqno,
-			      __entry->port)
+			      __entry->port, __entry->hw_id)
 );
 
 DEFINE_EVENT(i915_gem_request_hw, i915_gem_request_in,