diff mbox series

drm/vc4: add tracepoints for CL submissions

Message ID 20220201212651.zhltjmaokisffq3x@mail.igalia.com (mailing list archive)
State New, archived
Headers show
Series drm/vc4: add tracepoints for CL submissions | expand

Commit Message

Melissa Wen Feb. 1, 2022, 9:26 p.m. UTC
Trace submit_cl_ioctl and related IRQs for CL submission and bin/render
jobs execution. It might be helpful to get a rendering timeline and
track job throttling.

Signed-off-by: Melissa Wen <mwen@igalia.com>
---
 drivers/gpu/drm/vc4/vc4_gem.c   |  7 +++
 drivers/gpu/drm/vc4/vc4_irq.c   |  5 ++
 drivers/gpu/drm/vc4/vc4_trace.h | 95 +++++++++++++++++++++++++++++++++
 3 files changed, 107 insertions(+)

Comments

Maxime Ripard Feb. 25, 2022, 4:11 p.m. UTC | #1
Hi Melissa,

On Tue, Feb 01, 2022 at 08:26:51PM -0100, Melissa Wen wrote:
> Trace submit_cl_ioctl and related IRQs for CL submission and bin/render
> jobs execution. It might be helpful to get a rendering timeline and
> track job throttling.
> 
> Signed-off-by: Melissa Wen <mwen@igalia.com>

I'm not really sure what to do about this patch to be honest.

My understanding is that tracepoints are considered as userspace ABI,
but I can't really judge whether your traces are good enough or if it's
something that will bit us some time down the road.

Emma, Daniel, David, any insight?

Thanks,
Maxime
Melissa Wen March 1, 2022, 2:58 p.m. UTC | #2
On 02/25, Maxime Ripard wrote:
> Hi Melissa,
> 
> On Tue, Feb 01, 2022 at 08:26:51PM -0100, Melissa Wen wrote:
> > Trace submit_cl_ioctl and related IRQs for CL submission and bin/render
> > jobs execution. It might be helpful to get a rendering timeline and
> > track job throttling.
> > 
> > Signed-off-by: Melissa Wen <mwen@igalia.com>
> 
> I'm not really sure what to do about this patch to be honest.
> 
> My understanding is that tracepoints are considered as userspace ABI,
> but I can't really judge whether your traces are good enough or if it's
> something that will bit us some time down the road.

Hi Maxime,

Thanks for taking a look at this patch.

So, I followed the same path of tracing CL submissions on v3d. I mean,
tracking submit_cl ioctl, points when a job (bin/render) starts it
execution, and irqs of completion (bin/render job). We used it to
examine job throttling when running Chromium and, therefore, in addition
to have the timeline of jobs execution, I show some data submitted in
the ioctl to make connections. I think these tracers might be useful for
some investigation in the future, but I'm also not sure if all data are
necessary to be displayed.

Melissa

> 
> Emma, Daniel, David, any insight?
> 
> Thanks,
> Maxime
Maxime Ripard March 10, 2022, 11:12 a.m. UTC | #3
On Tue, Mar 01, 2022 at 01:58:26PM -0100, Melissa Wen wrote:
> On 02/25, Maxime Ripard wrote:
> > Hi Melissa,
> > 
> > On Tue, Feb 01, 2022 at 08:26:51PM -0100, Melissa Wen wrote:
> > > Trace submit_cl_ioctl and related IRQs for CL submission and bin/render
> > > jobs execution. It might be helpful to get a rendering timeline and
> > > track job throttling.
> > > 
> > > Signed-off-by: Melissa Wen <mwen@igalia.com>
> > 
> > I'm not really sure what to do about this patch to be honest.
> > 
> > My understanding is that tracepoints are considered as userspace ABI,
> > but I can't really judge whether your traces are good enough or if it's
> > something that will bit us some time down the road.
>
> Thanks for taking a look at this patch.
> 
> So, I followed the same path of tracing CL submissions on v3d. I mean,
> tracking submit_cl ioctl, points when a job (bin/render) starts it
> execution, and irqs of completion (bin/render job). We used it to
> examine job throttling when running Chromium and, therefore, in addition
> to have the timeline of jobs execution, I show some data submitted in
> the ioctl to make connections. I think these tracers might be useful for
> some investigation in the future, but I'm also not sure if all data are
> necessary to be displayed.

Yeah, I'm sure that it's useful :)

I don't see anything wrong with that patch, really. What I meant is that
I don't really have the experience to judge if there's anything wrong in
the first place :)

If you can get someone with more experience with the v3d driver (Emma,
Iago maybe?) I'll be definitely be ok merging that patch

Maxime
Chema Casanova March 10, 2022, 11:54 a.m. UTC | #4
El 10/3/22 a las 12:12, Maxime Ripard escribió:
> On Tue, Mar 01, 2022 at 01:58:26PM -0100, Melissa Wen wrote:
>> On 02/25, Maxime Ripard wrote:
>>> Hi Melissa,
>>>
>>> On Tue, Feb 01, 2022 at 08:26:51PM -0100, Melissa Wen wrote:
>>>> Trace submit_cl_ioctl and related IRQs for CL submission and bin/render
>>>> jobs execution. It might be helpful to get a rendering timeline and
>>>> track job throttling.
>>>>
>>>> Signed-off-by: Melissa Wen <mwen@igalia.com>
>>> I'm not really sure what to do about this patch to be honest.
>>>
>>> My understanding is that tracepoints are considered as userspace ABI,
>>> but I can't really judge whether your traces are good enough or if it's
>>> something that will bit us some time down the road.
>> Thanks for taking a look at this patch.
>>
>> So, I followed the same path of tracing CL submissions on v3d. I mean,
>> tracking submit_cl ioctl, points when a job (bin/render) starts it
>> execution, and irqs of completion (bin/render job). We used it to
>> examine job throttling when running Chromium and, therefore, in addition
>> to have the timeline of jobs execution, I show some data submitted in
>> the ioctl to make connections. I think these tracers might be useful for
>> some investigation in the future, but I'm also not sure if all data are
>> necessary to be displayed.
> Yeah, I'm sure that it's useful :)
>
> I don't see anything wrong with that patch, really. What I meant is that
> I don't really have the experience to judge if there's anything wrong in
> the first place :)
>
> If you can get someone with more experience with the v3d driver (Emma,
> Iago maybe?) I'll be definitely be ok merging that patch

I've checked this patch and I've been using these tracepoints.
They have been working properly.

Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>

Regards,

Chema Casanova
Maxime Ripard March 17, 2022, 2:12 p.m. UTC | #5
On Tue, 1 Feb 2022 20:26:51 -0100, Melissa Wen wrote:
> Trace submit_cl_ioctl and related IRQs for CL submission and bin/render
> jobs execution. It might be helpful to get a rendering timeline and
> track job throttling.
> 
> 

Applied to drm/drm-misc (drm-misc-next).

Thanks!
Maxime
Maxime Ripard March 17, 2022, 2:13 p.m. UTC | #6
On Thu, Mar 10, 2022 at 12:54:32PM +0100, Chema Casanova wrote:
> El 10/3/22 a las 12:12, Maxime Ripard escribió:
> > On Tue, Mar 01, 2022 at 01:58:26PM -0100, Melissa Wen wrote:
> > > On 02/25, Maxime Ripard wrote:
> > > > Hi Melissa,
> > > > 
> > > > On Tue, Feb 01, 2022 at 08:26:51PM -0100, Melissa Wen wrote:
> > > > > Trace submit_cl_ioctl and related IRQs for CL submission and bin/render
> > > > > jobs execution. It might be helpful to get a rendering timeline and
> > > > > track job throttling.
> > > > > 
> > > > > Signed-off-by: Melissa Wen <mwen@igalia.com>
> > > > I'm not really sure what to do about this patch to be honest.
> > > > 
> > > > My understanding is that tracepoints are considered as userspace ABI,
> > > > but I can't really judge whether your traces are good enough or if it's
> > > > something that will bit us some time down the road.
> > > Thanks for taking a look at this patch.
> > > 
> > > So, I followed the same path of tracing CL submissions on v3d. I mean,
> > > tracking submit_cl ioctl, points when a job (bin/render) starts it
> > > execution, and irqs of completion (bin/render job). We used it to
> > > examine job throttling when running Chromium and, therefore, in addition
> > > to have the timeline of jobs execution, I show some data submitted in
> > > the ioctl to make connections. I think these tracers might be useful for
> > > some investigation in the future, but I'm also not sure if all data are
> > > necessary to be displayed.
> > Yeah, I'm sure that it's useful :)
> > 
> > I don't see anything wrong with that patch, really. What I meant is that
> > I don't really have the experience to judge if there's anything wrong in
> > the first place :)
> > 
> > If you can get someone with more experience with the v3d driver (Emma,
> > Iago maybe?) I'll be definitely be ok merging that patch
> 
> I've checked this patch and I've been using these tracepoints.
> They have been working properly.
> 
> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>

Thanks for your feedback, I just merged the patch

Maxime
diff mbox series

Patch

diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
index 445d3bab89e0..4abf10b66fe8 100644
--- a/drivers/gpu/drm/vc4/vc4_gem.c
+++ b/drivers/gpu/drm/vc4/vc4_gem.c
@@ -485,6 +485,8 @@  vc4_submit_next_bin_job(struct drm_device *dev)
 	 * immediately move it to the to-be-rendered queue.
 	 */
 	if (exec->ct0ca != exec->ct0ea) {
+		trace_vc4_submit_cl(dev, false, exec->seqno, exec->ct0ca,
+				    exec->ct0ea);
 		submit_cl(dev, 0, exec->ct0ca, exec->ct0ea);
 	} else {
 		struct vc4_exec_info *next;
@@ -519,6 +521,7 @@  vc4_submit_next_render_job(struct drm_device *dev)
 	 */
 	vc4_flush_texture_caches(dev);
 
+	trace_vc4_submit_cl(dev, true, exec->seqno, exec->ct1ca, exec->ct1ea);
 	submit_cl(dev, 1, exec->ct1ca, exec->ct1ea);
 }
 
@@ -1135,6 +1138,10 @@  vc4_submit_cl_ioctl(struct drm_device *dev, void *data,
 	struct dma_fence *in_fence;
 	int ret = 0;
 
+	trace_vc4_submit_cl_ioctl(dev, args->bin_cl_size,
+				  args->shader_rec_size,
+				  args->bo_handle_count);
+
 	if (!vc4->v3d) {
 		DRM_DEBUG("VC4_SUBMIT_CL with no VC4 V3D probed\n");
 		return -ENODEV;
diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
index 20fa8e34c20b..4342fb43e8c1 100644
--- a/drivers/gpu/drm/vc4/vc4_irq.c
+++ b/drivers/gpu/drm/vc4/vc4_irq.c
@@ -51,6 +51,7 @@ 
 
 #include "vc4_drv.h"
 #include "vc4_regs.h"
+#include "vc4_trace.h"
 
 #define V3D_DRIVER_IRQS (V3D_INT_OUTOMEM | \
 			 V3D_INT_FLDONE | \
@@ -123,6 +124,8 @@  vc4_irq_finish_bin_job(struct drm_device *dev)
 	if (!exec)
 		return;
 
+	trace_vc4_bcl_end_irq(dev, exec->seqno);
+
 	vc4_move_job_to_render(dev, exec);
 	next = vc4_first_bin_job(vc4);
 
@@ -161,6 +164,8 @@  vc4_irq_finish_render_job(struct drm_device *dev)
 	if (!exec)
 		return;
 
+	trace_vc4_rcl_end_irq(dev, exec->seqno);
+
 	vc4->finished_seqno++;
 	list_move_tail(&exec->head, &vc4->job_done_list);
 
diff --git a/drivers/gpu/drm/vc4/vc4_trace.h b/drivers/gpu/drm/vc4/vc4_trace.h
index 1cccde0b09a7..7f4c49e7e011 100644
--- a/drivers/gpu/drm/vc4/vc4_trace.h
+++ b/drivers/gpu/drm/vc4/vc4_trace.h
@@ -52,6 +52,101 @@  TRACE_EVENT(vc4_wait_for_seqno_end,
 		      __entry->dev, __entry->seqno)
 );
 
+TRACE_EVENT(vc4_submit_cl_ioctl,
+	    TP_PROTO(struct drm_device *dev, u32 bin_cl_size, u32 shader_rec_size, u32 bo_count),
+	    TP_ARGS(dev, bin_cl_size, shader_rec_size, bo_count),
+
+	    TP_STRUCT__entry(
+			     __field(u32, dev)
+			     __field(u32, bin_cl_size)
+			     __field(u32, shader_rec_size)
+			     __field(u32, bo_count)
+			     ),
+
+	    TP_fast_assign(
+			   __entry->dev = dev->primary->index;
+			   __entry->bin_cl_size = bin_cl_size;
+			   __entry->shader_rec_size = shader_rec_size;
+			   __entry->bo_count = bo_count;
+			   ),
+
+	    TP_printk("dev=%u, bin_cl_size=%u, shader_rec_size=%u, bo_count=%u",
+		      __entry->dev,
+		      __entry->bin_cl_size,
+		      __entry->shader_rec_size,
+		      __entry->bo_count)
+);
+
+TRACE_EVENT(vc4_submit_cl,
+	    TP_PROTO(struct drm_device *dev, bool is_render,
+		     uint64_t seqno,
+		     u32 ctnqba, u32 ctnqea),
+	    TP_ARGS(dev, is_render, seqno, ctnqba, ctnqea),
+
+	    TP_STRUCT__entry(
+			     __field(u32, dev)
+			     __field(bool, is_render)
+			     __field(u64, seqno)
+			     __field(u32, ctnqba)
+			     __field(u32, ctnqea)
+			     ),
+
+	    TP_fast_assign(
+			   __entry->dev = dev->primary->index;
+			   __entry->is_render = is_render;
+			   __entry->seqno = seqno;
+			   __entry->ctnqba = ctnqba;
+			   __entry->ctnqea = ctnqea;
+			   ),
+
+	    TP_printk("dev=%u, %s, seqno=%llu, 0x%08x..0x%08x",
+		      __entry->dev,
+		      __entry->is_render ? "RCL" : "BCL",
+		      __entry->seqno,
+		      __entry->ctnqba,
+		      __entry->ctnqea)
+);
+
+TRACE_EVENT(vc4_bcl_end_irq,
+	    TP_PROTO(struct drm_device *dev,
+		     uint64_t seqno),
+	    TP_ARGS(dev, seqno),
+
+	    TP_STRUCT__entry(
+			     __field(u32, dev)
+			     __field(u64, seqno)
+			     ),
+
+	    TP_fast_assign(
+			   __entry->dev = dev->primary->index;
+			   __entry->seqno = seqno;
+			   ),
+
+	    TP_printk("dev=%u, seqno=%llu",
+		      __entry->dev,
+		      __entry->seqno)
+);
+
+TRACE_EVENT(vc4_rcl_end_irq,
+	    TP_PROTO(struct drm_device *dev,
+		     uint64_t seqno),
+	    TP_ARGS(dev, seqno),
+
+	    TP_STRUCT__entry(
+			     __field(u32, dev)
+			     __field(u64, seqno)
+			     ),
+
+	    TP_fast_assign(
+			   __entry->dev = dev->primary->index;
+			   __entry->seqno = seqno;
+			   ),
+
+	    TP_printk("dev=%u, seqno=%llu",
+		      __entry->dev,
+		      __entry->seqno)
+);
+
 #endif /* _VC4_TRACE_H_ */
 
 /* This part must be outside protection */