Message ID | 20171215180603.17600-1-lionel.g.landwerlin@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Quoting Lionel Landwerlin (2017-12-15 18:06:03) > When monitoring the GPU with i915 perf, reports are tagged with a hw > id. When also tracking the requests using the kernel tracepoints, if > we include the hw_id from i915_gem_context, this allows us to > correlate a process with hw id fields in the OA reports. > > v2: Place hw_id at the end of the tracepoint to not disrupt too much > existing tools (Chris) > > Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> This is a good excuse to reopen an old thread: 20170511130045.32271-1-tvrtko.ursulin@linux.intel.com │ On 11/05/2017 14:16, Tvrtko Ursulin wrote: │ > │ > On 11/05/2017 14:07, Chris Wilson wrote: │ >> On Thu, May 11, 2017 at 02:00:45PM +0100, Tvrtko Ursulin wrote: │ >>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com> │ >>> │ >>> For userspace receiving binary data it is easier if all related │ >>> request tracepoints emit the binary data in the same order of │ >>> dev, ring, ctx, seqno, ... │ >> │ >> We decided that dev, ctx, ring, seqno was the right heirachy last time. │ >> After much debate :) │ >> ctx is the logical view of the device for a user │ >> ctx + ring = timeline │ > │ > I couldn't remember so I thought it must have been what is documented in │ > TP_printk. Now I am confused. On one hand it's true, but on the other │ > ctx/seqno is also a pair as opposed to engine being stuck in the middle. │ > So ring-ctx-seqno is easier for humans I think. Even since per │ > ctx/engine seqno space. │ │ [thread bump!] │ │ Can we agree what to do here? Printk's are all engine-ctx-seqno which │ makes it sound like that was the order we agreed upon. But binary blobs │ are inconsistent as you have noticed - do we care enough to go and │ fiddle with that is the question? Is overlay the only userspace to your │ knowledge that accesses the binary blobs? │ │ Regards, │ │ Tvrtko Given that we are adding info here, we can also go through and make the order consistent. dev, hw_id, engine, ctx, seqno? -Chris
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index 4e76768ffa95..aa842596996d 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -620,6 +620,7 @@ TRACE_EVENT(i915_gem_request_queue, __field(u32, ctx) __field(u32, seqno) __field(u32, flags) + __field(u32, hw_id) ), TP_fast_assign( @@ -628,11 +629,12 @@ TRACE_EVENT(i915_gem_request_queue, __entry->ctx = req->fence.context; __entry->seqno = req->fence.seqno; __entry->flags = flags; + __entry->hw_id = req->ctx->hw_id; ), - TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, flags=0x%x", + TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, flags=0x%x, hw_id=%u", __entry->dev, __entry->ring, __entry->ctx, __entry->seqno, - __entry->flags) + __entry->flags, __entry->hw_id) ); DECLARE_EVENT_CLASS(i915_gem_request, @@ -645,6 +647,7 @@ DECLARE_EVENT_CLASS(i915_gem_request, __field(u32, ring) __field(u32, seqno) __field(u32, global) + __field(u32, hw_id) ), TP_fast_assign( @@ -653,11 +656,12 @@ DECLARE_EVENT_CLASS(i915_gem_request, __entry->ctx = req->fence.context; __entry->seqno = req->fence.seqno; __entry->global = req->global_seqno; + __entry->hw_id = req->ctx->hw_id; ), - TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, global=%u", + TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, global=%u, hw_id=%u", __entry->dev, __entry->ring, __entry->ctx, __entry->seqno, - __entry->global) + __entry->global, __entry->hw_id) ); DEFINE_EVENT(i915_gem_request, i915_gem_request_add, @@ -688,6 +692,7 @@ DECLARE_EVENT_CLASS(i915_gem_request_hw, __field(u32, global_seqno) __field(u32, ctx) __field(u32, port) + __field(u32, hw_id) ), TP_fast_assign( @@ -697,12 +702,13 @@ DECLARE_EVENT_CLASS(i915_gem_request_hw, __entry->seqno = req->fence.seqno; __entry->global_seqno = req->global_seqno; __entry->port = port; + __entry->hw_id = req->ctx->hw_id; ), - TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, global=%u, port=%u", + TP_printk("dev=%u, ring=%u, ctx=%u, seqno=%u, global=%u, port=%u, hw_id=%u", __entry->dev, __entry->ring, __entry->ctx, __entry->seqno, __entry->global_seqno, - __entry->port) + __entry->port, __entry->hw_id) ); DEFINE_EVENT(i915_gem_request_hw, i915_gem_request_in,
When monitoring the GPU with i915 perf, reports are tagged with a hw id. When also tracking the requests using the kernel tracepoints, if we include the hw_id from i915_gem_context, this allows us to correlate a process with hw id fields in the OA reports. v2: Place hw_id at the end of the tracepoint to not disrupt too much existing tools (Chris) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> --- drivers/gpu/drm/i915/i915_trace.h | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)