drm/i915/trace: add hw_id to gem requests trace points

Message ID 20171215155136.8739-1-lionel.g.landwerlin@intel.com (mailing list archive)
State New, archived

Commit Message

Lionel Landwerlin Dec. 15, 2017, 3:51 p.m. UTC
When monitoring the GPU with i915 perf, reports are tagged with a hw
id. When also tracking the requests using the kernel tracepoints, if
we include the hw_id from i915_gem_context, this allows us to
correlate a process with hw id fields in the OA reports.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_trace.h | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

Comments

Chris Wilson Dec. 15, 2017, 4:08 p.m. UTC | #1
Quoting Lionel Landwerlin (2017-12-15 15:51:36)
> When monitoring the GPU with i915 perf, reports are tagged with a hw
> id. When also tracking the requests using the kernel tracepoints, if
> we include the hw_id from i915_gem_context, this allows us to
> correlate a process with hw id fields in the OA reports.

Maybe, but don't split the fence.context and fence.seqno. You should
also update the igt tools using the tracepoints.
-Chris
Lionel Landwerlin Dec. 15, 2017, 4:18 p.m. UTC | #2
On 15/12/17 16:08, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2017-12-15 15:51:36)
>> When monitoring the GPU with i915 perf, reports are tagged with a hw
>> id. When also tracking the requests using the kernel tracepoints, if
>> we include the hw_id from i915_gem_context, this allows us to
>> correlate a process with hw id fields in the OA reports.
> Maybe, but don't split the fence.context and fence.seqno. You should
> also update the igt tools using the tracepoints.
> -Chris
>
I would have thought the tools could deal with a different ordering :(
Tvrtko Ursulin Dec. 15, 2017, 5:22 p.m. UTC | #3
On 15/12/2017 16:18, Lionel Landwerlin wrote:
> On 15/12/17 16:08, Chris Wilson wrote:
>> Quoting Lionel Landwerlin (2017-12-15 15:51:36)
>>> When monitoring the GPU with i915 perf, reports are tagged with a hw
>>> id. When also tracking the requests using the kernel tracepoints, if
>>> we include the hw_id from i915_gem_context, this allows us to
>>> correlate a process with hw id fields in the OA reports.
>> Maybe, but don't split the fence.context and fence.seqno. You should
>> also update the igt tools using the tracepoints.
>> -Chris
>>
> I would have thought the tools could deal with a different ordering :(

At least as a benefit you get to refresh your Perl skills. ;)

trace.pl and intel-gpu-overlay I think are the only two.

I guess it would be possible to smarten both up to detect where the 
fields they need are located, but that's maybe too much work.

Regards,

Tvrtko
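(One way a consumer could cope with field reordering, as discussed above, is to parse the rendered key=value pairs by name instead of by position. A hypothetical sketch, not what trace.pl or intel-gpu-overlay actually do:)

```python
import re

# Match "name=value" pairs in a rendered i915 tracepoint line, e.g. the
# TP_printk output "dev=0, ring=1, ctx=27, hw_id=5, seqno=3, flags=0x0".
# Values may be decimal or 0x-prefixed hex.
PAIR_RE = re.compile(r"(\w+)=(0x[0-9a-fA-F]+|\d+)")

def parse_trace_line(line):
    """Return {field_name: int_value}; insensitive to field order."""
    # int(v, 0) auto-detects the base, so "0x0" and "27" both work.
    return {k: int(v, 0) for k, v in PAIR_RE.findall(line)}
```

With this, a tool looks fields up by name, so inserting hw_id anywhere in the printk string is harmless.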
Chris Wilson Dec. 15, 2017, 9:15 p.m. UTC | #4
Quoting Tvrtko Ursulin (2017-12-15 17:22:57)
> 
> On 15/12/2017 16:18, Lionel Landwerlin wrote:
> > On 15/12/17 16:08, Chris Wilson wrote:
> >> Quoting Lionel Landwerlin (2017-12-15 15:51:36)
> >>> When monitoring the GPU with i915 perf, reports are tagged with a hw
> >>> id. When also tracking the requests using the kernel tracepoints, if
> >>> we include the hw_id from i915_gem_context, this allows us to
> >>> correlate a process with hw id fields in the OA reports.
> >> Maybe, but don't split the fence.context and fence.seqno. You should
> >> also update the igt tools using the tracepoints.
> >> -Chris
> >>
> > I would have thought the tools could deal with a different ordering :(
> 
> At least as a benefit you get to refresh your Perl skills. ;)
> 
> trace.pl and intel-gpu-overlay I think are the only two.
> 
> I guess it would be possible to smarten both up to detect where the 
> fields they need are located but maybe too much work.

That's the thinking that got us into this mess to begin with ;)

But yes, it may remain much easier to keep igt in line with the kernel
and just say no to backwards-compatibility for these devtools.
-Chris
Lionel Landwerlin Dec. 16, 2017, 12:42 a.m. UTC | #5
On 15/12/17 21:15, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-12-15 17:22:57)
>> On 15/12/2017 16:18, Lionel Landwerlin wrote:
>>> On 15/12/17 16:08, Chris Wilson wrote:
>>>> Quoting Lionel Landwerlin (2017-12-15 15:51:36)
>>>>> When monitoring the GPU with i915 perf, reports are tagged with a hw
>>>>> id. When also tracking the requests using the kernel tracepoints, if
>>>>> we include the hw_id from i915_gem_context, this allows us to
>>>>> correlate a process with hw id fields in the OA reports.
>>>> Maybe, but don't split the fence.context and fence.seqno. You should
>>>> also update the igt tools using the tracepoints.
>>>> -Chris
>>>>
>>> I would have thought the tools could deal with a different ordering :(
>> At least as a benefit you get to refresh your Perl skills. ;)
>>
>> trace.pl and intel-gpu-overlay I think are the only two.
>>
>> I guess it would be possible to smarten both up to detect where the
>> fields they need are located but maybe too much work.
> That's the thinking that got us into this mess to begin with ;)
>
> But yes, it may remain much easier to keep igt in line with the kernel
> and just say no to backwards-compatibility for these devtools.
> -Chris
>

Like I said on IRC, I wrote a small PEG parser: 
https://github.com/rib/gputop/blob/imgui/gputop-ui/tracepoint_format.leg
That's enough to get the fields' locations/sizes, but not to tell 
whether something is a pointer rather than a u32/u64.
Using a PEG parser limits the pain quite a bit. PEG libraries should be 
available in Python as well (I don't know about Perl).

-
Lionel
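(The field descriptors such a parser extracts come from the tracefs per-event `format` files. A minimal regex-based sketch of the same idea, rather than a PEG grammar — `parse_format` is a hypothetical helper, not part of any of the tools above:)

```python
import re

# Parse the field descriptor lines of a tracefs "format" file, e.g.
# /sys/kernel/debug/tracing/events/i915/<event>/format.
# Each field line looks like:
#   field:u32 hw_id;  offset:16;  size:4;  signed:0;
FIELD_RE = re.compile(
    r"field:(?P<type>[^;]+?)\s+(?P<name>\w+(?:\[\d+\])?);\s*"
    r"offset:(?P<offset>\d+);\s*size:(?P<size>\d+);\s*signed:(?P<signed>[01]);"
)

def parse_format(text):
    """Return {field_name: (offset, size)} for every field in the file."""
    fields = {}
    for m in FIELD_RE.finditer(text):
        name = m.group("name").split("[")[0]  # drop any array suffix
        fields[name] = (int(m.group("offset")), int(m.group("size")))
    return fields
```

This recovers locations and sizes only; as noted above, it cannot tell a pointer from a plain u32/u64, since the format file does not distinguish them beyond the C type string.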
Patch

diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 4e76768ffa95..e4c7b6a368bd 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -618,6 +618,7 @@  TRACE_EVENT(i915_gem_request_queue,
 			     __field(u32, dev)
 			     __field(u32, ring)
 			     __field(u32, ctx)
+			     __field(u32, hw_id)
 			     __field(u32, seqno)
 			     __field(u32, flags)
 			     ),
@@ -626,13 +627,14 @@  TRACE_EVENT(i915_gem_request_queue,
 			   __entry->dev = req->i915->drm.primary->index;
 			   __entry->ring = req->engine->id;
 			   __entry->ctx = req->fence.context;
+			   __entry->hw_id = req->ctx->hw_id;
 			   __entry->seqno = req->fence.seqno;
 			   __entry->flags = flags;
 			   ),
 
-	    TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, flags=0x%x",
-		      __entry->dev, __entry->ring, __entry->ctx, __entry->seqno,
-		      __entry->flags)
+	    TP_printk("dev=%u, ring=%u, ctx=%u, hw_id=%u, seqno=%u, flags=0x%x",
+		      __entry->dev, __entry->ring, __entry->ctx, __entry->hw_id,
+		      __entry->seqno, __entry->flags)
 );
 
 DECLARE_EVENT_CLASS(i915_gem_request,
@@ -642,6 +644,7 @@  DECLARE_EVENT_CLASS(i915_gem_request,
 	    TP_STRUCT__entry(
 			     __field(u32, dev)
 			     __field(u32, ctx)
+			     __field(u32, hw_id)
 			     __field(u32, ring)
 			     __field(u32, seqno)
 			     __field(u32, global)
@@ -651,13 +654,14 @@  DECLARE_EVENT_CLASS(i915_gem_request,
 			   __entry->dev = req->i915->drm.primary->index;
 			   __entry->ring = req->engine->id;
 			   __entry->ctx = req->fence.context;
+			   __entry->hw_id = req->ctx->hw_id;
 			   __entry->seqno = req->fence.seqno;
 			   __entry->global = req->global_seqno;
 			   ),
 
-	    TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, global=%u",
-		      __entry->dev, __entry->ring, __entry->ctx, __entry->seqno,
-		      __entry->global)
+	    TP_printk("dev=%u, ring=%u, ctx=%u, hw_id=%u, seqno=%u, global=%u",
+		      __entry->dev, __entry->ring, __entry->ctx, __entry->hw_id,
+		      __entry->seqno, __entry->global)
 );
 
 DEFINE_EVENT(i915_gem_request, i915_gem_request_add,
@@ -687,6 +691,7 @@  DECLARE_EVENT_CLASS(i915_gem_request_hw,
 				     __field(u32, seqno)
 				     __field(u32, global_seqno)
 				     __field(u32, ctx)
+				     __field(u32, hw_id)
 				     __field(u32, port)
 				    ),
 
@@ -694,15 +699,16 @@  DECLARE_EVENT_CLASS(i915_gem_request_hw,
 			           __entry->dev = req->i915->drm.primary->index;
 			           __entry->ring = req->engine->id;
 			           __entry->ctx = req->fence.context;
+			           __entry->hw_id = req->ctx->hw_id;
 			           __entry->seqno = req->fence.seqno;
 			           __entry->global_seqno = req->global_seqno;
 			           __entry->port = port;
 			          ),
 
-		    TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, global=%u, port=%u",
+		    TP_printk("dev=%u, ring=%u, ctx=%u, hw_id=%u, seqno=%u, global=%u, port=%u",
 			      __entry->dev, __entry->ring, __entry->ctx,
-			      __entry->seqno, __entry->global_seqno,
-			      __entry->port)
+			      __entry->hw_id, __entry->seqno,
+			      __entry->global_seqno, __entry->port)
 );
 
 DEFINE_EVENT(i915_gem_request_hw, i915_gem_request_in,