drm/sched: Always trace the dependencies we wait on, to fix a race.
diff mbox series

Message ID 20181207191653.30118-1-eric@anholt.net
State New
Headers show
Series
  • drm/sched: Always trace the dependencies we wait on, to fix a race.
Related show

Commit Message

Eric Anholt Dec. 7, 2018, 7:16 p.m. UTC
The entity->dependency can go away completely once we've called
drm_sched_entity_add_dependency_cb() (if the cb is called before we
get around to tracing).  The tracepoint is more useful if we trace
every dependency instead of just ones that get callbacks installed,
anyway, so just do that.

Fixes any easy-to-produce OOPS when tracing the scheduler on V3D with
"perf record -a -e gpu_scheduler:.\* glxgears" and DEBUG_SLAB enabled.

Signed-off-by: Eric Anholt <eric@anholt.net>
---
 drivers/gpu/drm/scheduler/sched_entity.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

Comments

Koenig, Christian Dec. 7, 2018, 7:46 p.m. UTC | #1
Am 07.12.18 um 20:16 schrieb Eric Anholt:
> The entity->dependency can go away completely once we've called
> drm_sched_entity_add_dependency_cb() (if the cb is called before we
> get around to tracing).  The tracepoint is more useful if we trace
> every dependency instead of just ones that get callbacks installed,
> anyway, so just do that.
>
> Fixes any easy-to-produce OOPS when tracing the scheduler on V3D with
> "perf record -a -e gpu_scheduler:.\* glxgears" and DEBUG_SLAB enabled.
>
> Signed-off-by: Eric Anholt <eric@anholt.net>

Reviewed-by: Christian König <christian.koenig@amd.com>

Going to pick that up for upstream and will add with a CC: stable.

Thanks,
Christian.

> ---
>   drivers/gpu/drm/scheduler/sched_entity.c | 7 ++-----
>   1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 4463d3826ecb..e2942c9a11a7 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -440,13 +440,10 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>   
>   	while ((entity->dependency =
>   			sched->ops->dependency(sched_job, entity))) {
> +		trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
>   
> -		if (drm_sched_entity_add_dependency_cb(entity)) {
> -
> -			trace_drm_sched_job_wait_dep(sched_job,
> -						     entity->dependency);
> +		if (drm_sched_entity_add_dependency_cb(entity))
>   			return NULL;
> -		}
>   	}
>   
>   	/* skip jobs from entity that marked guilty */
Eric Anholt Feb. 7, 2019, 11:10 p.m. UTC | #2
"Koenig, Christian" <Christian.Koenig@amd.com> writes:

> Am 07.12.18 um 20:16 schrieb Eric Anholt:
>> The entity->dependency can go away completely once we've called
>> drm_sched_entity_add_dependency_cb() (if the cb is called before we
>> get around to tracing).  The tracepoint is more useful if we trace
>> every dependency instead of just ones that get callbacks installed,
>> anyway, so just do that.
>>
>> Fixes any easy-to-produce OOPS when tracing the scheduler on V3D with
>> "perf record -a -e gpu_scheduler:.\* glxgears" and DEBUG_SLAB enabled.
>>
>> Signed-off-by: Eric Anholt <eric@anholt.net>
>
> Reviewed-by: Christian König <christian.koenig@amd.com>
>
> Going to pick that up for upstream and will add with a CC: stable.

Looks like this got misplaced.
Christian König Feb. 8, 2019, 8:43 a.m. UTC | #3
Am 08.02.19 um 00:10 schrieb Eric Anholt:
> "Koenig, Christian" <Christian.Koenig@amd.com> writes:
>
>> Am 07.12.18 um 20:16 schrieb Eric Anholt:
>>> The entity->dependency can go away completely once we've called
>>> drm_sched_entity_add_dependency_cb() (if the cb is called before we
>>> get around to tracing).  The tracepoint is more useful if we trace
>>> every dependency instead of just ones that get callbacks installed,
>>> anyway, so just do that.
>>>
>>> Fixes any easy-to-produce OOPS when tracing the scheduler on V3D with
>>> "perf record -a -e gpu_scheduler:.\* glxgears" and DEBUG_SLAB enabled.
>>>
>>> Signed-off-by: Eric Anholt <eric@anholt.net>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>
>> Going to pick that up for upstream and will add with a CC: stable.
> Looks like this got misplaced.

My fault, pushed to our internal branch now with a CC: stable tag on it.

Christian.

>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Am 08.02.19 um 00:10 schrieb Eric
      Anholt:<br>
    </div>
    <blockquote type="cite" cite="mid:87tvhf2o0l.fsf@anholt.net">
      <pre class="moz-quote-pre" wrap="">"Koenig, Christian" <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com">&lt;Christian.Koenig@amd.com&gt;</a> writes:

</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">Am 07.12.18 um 20:16 schrieb Eric Anholt:
</pre>
        <blockquote type="cite">
          <pre class="moz-quote-pre" wrap="">The entity-&gt;dependency can go away completely once we've called
drm_sched_entity_add_dependency_cb() (if the cb is called before we
get around to tracing).  The tracepoint is more useful if we trace
every dependency instead of just ones that get callbacks installed,
anyway, so just do that.

Fixes any easy-to-produce OOPS when tracing the scheduler on V3D with
"perf record -a -e gpu_scheduler:.\* glxgears" and DEBUG_SLAB enabled.

Signed-off-by: Eric Anholt <a class="moz-txt-link-rfc2396E" href="mailto:eric@anholt.net">&lt;eric@anholt.net&gt;</a>
</pre>
        </blockquote>
        <pre class="moz-quote-pre" wrap="">
Reviewed-by: Christian König <a class="moz-txt-link-rfc2396E" href="mailto:christian.koenig@amd.com">&lt;christian.koenig@amd.com&gt;</a>

Going to pick that up for upstream and will add with a CC: stable.
</pre>
      </blockquote>
      <pre class="moz-quote-pre" wrap="">
Looks like this got misplaced.
</pre>
    </blockquote>
    <br>
    My fault, pushed to our internal branch now with a CC: stable tag on
    it.<br>
    <br>
    Christian.<br>
    <br>
    <blockquote type="cite" cite="mid:87tvhf2o0l.fsf@anholt.net"><br>
      <fieldset class="mimeAttachmentHeader"></fieldset>
      <pre class="moz-quote-pre" wrap="">_______________________________________________
amd-gfx mailing list
<a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a>
</pre>
    </blockquote>
    <br>
  </body>
</html>

Patch
diff mbox series

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 4463d3826ecb..e2942c9a11a7 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -440,13 +440,10 @@  struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
 
 	while ((entity->dependency =
 			sched->ops->dependency(sched_job, entity))) {
+		trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
 
-		if (drm_sched_entity_add_dependency_cb(entity)) {
-
-			trace_drm_sched_job_wait_dep(sched_job,
-						     entity->dependency);
+		if (drm_sched_entity_add_dependency_cb(entity))
 			return NULL;
-		}
 	}
 
 	/* skip jobs from entity that marked guilty */