diff mbox series

[v2,01/11] drm/sched: Split drm_sched_job_init

Message ID 20210702213815.2249499-2-daniel.vetter@ffwll.ch (mailing list archive)
State New, archived
Headers show
Series [v2,01/11] drm/sched: Split drm_sched_job_init | expand

Commit Message

Daniel Vetter July 2, 2021, 9:38 p.m. UTC
This is a very confusingly named function, because not just does it
init an object, it arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that's a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

Acked-by: Steven Price <steven.price@arm.com> (v2)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Tian Tao <tiantao6@hisilicon.com>
Cc: Jack Zhang <Jack.Zhang1@amd.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Emma Anholt <emma@anholt.net>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
 drivers/gpu/drm/lima/lima_sched.c        |  2 ++
 drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
 drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
 drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
 drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
 drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
 include/drm/gpu_scheduler.h              |  7 +++-
 10 files changed, 74 insertions(+), 14 deletions(-)

Comments

Christian König July 7, 2021, 9:29 a.m. UTC | #1
Am 02.07.21 um 23:38 schrieb Daniel Vetter:
> This is a very confusingly named function, because not just does it
> init an object, it arms it and provides a point of no return for
> pushing a job into the scheduler. It would be nice if that's a bit
> clearer in the interface.
>
> But the real reason is that I want to push the dependency tracking
> helpers into the scheduler code, and that means drm_sched_job_init
> must be called a lot earlier, without arming the job.
>
> v2:
> - don't change .gitignore (Steven)
> - don't forget v3d (Emma)
>
> v3: Emma noticed that I leak the memory allocated in
> drm_sched_job_init if we bail out before the point of no return in
> subsequent driver patches. To be able to fix this change
> drm_sched_job_cleanup() so it can handle being called both before and
> after drm_sched_job_arm().

Thinking more about this, I'm not sure if this really works.

See drm_sched_job_init() was also calling drm_sched_entity_select_rq() 
to update the entity->rq association.

And that can only be done later on when we arm the fence as well.

Christian.

>
> Also improve the kerneldoc for this.
>
> Acked-by: Steven Price <steven.price@arm.com> (v2)
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> Cc: Qiang Yu <yuq825@gmail.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Masahiro Yamada <masahiroy@kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Adam Borowski <kilobyte@angband.pl>
> Cc: Nick Terrell <terrelln@fb.com>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> Cc: Sami Tolvanen <samitolvanen@google.com>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Nirmoy Das <nirmoy.das@amd.com>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Lee Jones <lee.jones@linaro.org>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: "Marek Olšák" <marek.olsak@amd.com>
> Cc: Dennis Li <Dennis.Li@amd.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Cc: Sonny Jiang <sonny.jiang@amd.com>
> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Tian Tao <tiantao6@hisilicon.com>
> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> Cc: etnaviv@lists.freedesktop.org
> Cc: lima@lists.freedesktop.org
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> Cc: Emma Anholt <emma@anholt.net>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
>   drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
>   drivers/gpu/drm/lima/lima_sched.c        |  2 ++
>   drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
>   drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
>   drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
>   drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
>   drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
>   include/drm/gpu_scheduler.h              |  7 +++-
>   10 files changed, 74 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index c5386d13eb4a..a4ec092af9a7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>   	if (r)
>   		goto error_unlock;
>   
> +	drm_sched_job_arm(&job->base);
> +
>   	/* No memory allocation is allowed while holding the notifier lock.
>   	 * The lock is held until amdgpu_cs_submit is finished and fence is
>   	 * added to BOs.
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index d33e6d97cc89..5ddb955d2315 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
>   	if (r)
>   		return r;
>   
> +	drm_sched_job_arm(&job->base);
> +
>   	*f = dma_fence_get(&job->base.s_fence->finished);
>   	amdgpu_job_free_resources(job);
>   	drm_sched_entity_push_job(&job->base, entity);
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index feb6da1b6ceb..05f412204118 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
>   	if (ret)
>   		goto out_unlock;
>   
> +	drm_sched_job_arm(&submit->sched_job);
> +
>   	submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
>   	submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
>   						submit->out_fence, 0,
> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> index dba8329937a3..38f755580507 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
>   		return err;
>   	}
>   
> +	drm_sched_job_arm(&task->base);
> +
>   	task->num_bos = num_bos;
>   	task->vm = lima_vm_get(vm);
>   
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 71a72fb50e6b..2992dc85325f 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
>   		goto unlock;
>   	}
>   
> +	drm_sched_job_arm(&job->base);
> +
>   	job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>   
>   	ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 79554aa4dbb1..f7347c284886 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>    * @sched_job: job to submit
>    * @entity: scheduler entity
>    *
> - * Note: To guarantee that the order of insertion to queue matches
> - * the job's fence sequence number this function should be
> - * called with drm_sched_job_init under common lock.
> + * Note: To guarantee that the order of insertion to queue matches the job's
> + * fence sequence number this function should be called with drm_sched_job_arm()
> + * under common lock.
>    *
>    * Returns 0 for success, negative error code otherwise.
>    */
> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> index 69de2c76731f..c451ee9a30d7 100644
> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> @@ -90,7 +90,7 @@ static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
>    *
>    * Free up the fence memory after the RCU grace period.
>    */
> -static void drm_sched_fence_free(struct rcu_head *rcu)
> +void drm_sched_fence_free(struct rcu_head *rcu)
>   {
>   	struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
>   	struct drm_sched_fence *fence = to_drm_sched_fence(f);
> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>   }
>   EXPORT_SYMBOL(to_drm_sched_fence);
>   
> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> -					       void *owner)
> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> +					      void *owner)
>   {
>   	struct drm_sched_fence *fence = NULL;
> -	unsigned seq;
>   
>   	fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
>   	if (fence == NULL)
> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>   	fence->sched = entity->rq->sched;
>   	spin_lock_init(&fence->lock);
>   
> +	return fence;
> +}
> +
> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> +			  struct drm_sched_entity *entity)
> +{
> +	unsigned seq;
> +
>   	seq = atomic_inc_return(&entity->fence_seq);
>   	dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
>   		       &fence->lock, entity->fence_context, seq);
>   	dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
>   		       &fence->lock, entity->fence_context + 1, seq);
> -
> -	return fence;
>   }
>   
>   module_init(drm_sched_fence_slab_init);
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 33c414d55fab..5e84e1500c32 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -48,9 +48,11 @@
>   #include <linux/wait.h>
>   #include <linux/sched.h>
>   #include <linux/completion.h>
> +#include <linux/dma-resv.h>
>   #include <uapi/linux/sched/types.h>
>   
>   #include <drm/drm_print.h>
> +#include <drm/drm_gem.h>
>   #include <drm/gpu_scheduler.h>
>   #include <drm/spsc_queue.h>
>   
> @@ -569,7 +571,6 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
>   
>   /**
>    * drm_sched_job_init - init a scheduler job
> - *
>    * @job: scheduler job to init
>    * @entity: scheduler entity to use
>    * @owner: job owner for debugging
> @@ -577,6 +578,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
>    * Refer to drm_sched_entity_push_job() documentation
>    * for locking considerations.
>    *
> + * Drivers must make sure drm_sched_job_cleanup() if this function returns
> + * successfully, even when @job is aborted before drm_sched_job_arm() is called.
> + *
>    * Returns 0 for success, negative error code otherwise.
>    */
>   int drm_sched_job_init(struct drm_sched_job *job,
> @@ -594,7 +598,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>   	job->sched = sched;
>   	job->entity = entity;
>   	job->s_priority = entity->rq - sched->sched_rq;
> -	job->s_fence = drm_sched_fence_create(entity, owner);
> +	job->s_fence = drm_sched_fence_alloc(entity, owner);
>   	if (!job->s_fence)
>   		return -ENOMEM;
>   	job->id = atomic64_inc_return(&sched->job_id_count);
> @@ -606,13 +610,47 @@ int drm_sched_job_init(struct drm_sched_job *job,
>   EXPORT_SYMBOL(drm_sched_job_init);
>   
>   /**
> - * drm_sched_job_cleanup - clean up scheduler job resources
> + * drm_sched_job_arm - arm a scheduler job for execution
> + * @job: scheduler job to arm
> + *
> + * This arms a scheduler job for execution. Specifically it initializes the
> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
> + * or other places that need to track the completion of this job.
> + *
> + * Refer to drm_sched_entity_push_job() documentation for locking
> + * considerations.
>    *
> + * This can only be called if drm_sched_job_init() succeeded.
> + */
> +void drm_sched_job_arm(struct drm_sched_job *job)
> +{
> +	drm_sched_fence_init(job->s_fence, job->entity);
> +}
> +EXPORT_SYMBOL(drm_sched_job_arm);
> +
> +/**
> + * drm_sched_job_cleanup - clean up scheduler job resources
>    * @job: scheduler job to clean up
> + *
> + * Cleans up the resources allocated with drm_sched_job_init().
> + *
> + * Drivers should call this from their error unwind code if @job is aborted
> + * before drm_sched_job_arm() is called.
> + *
> + * After that point of no return @job is committed to be executed by the
> + * scheduler, and this function should be called from the
> + * &drm_sched_backend_ops.free_job callback.
>    */
>   void drm_sched_job_cleanup(struct drm_sched_job *job)
>   {
> -	dma_fence_put(&job->s_fence->finished);
> +	if (!kref_read(&job->s_fence->finished.refcount)) {
> +		/* drm_sched_job_arm() has been called */
> +		dma_fence_put(&job->s_fence->finished);
> +	} else {
> +		/* aborted job before committing to run it */
> +		drm_sched_fence_free(&job->s_fence->finished.rcu);
> +	}
> +
>   	job->s_fence = NULL;
>   }
>   EXPORT_SYMBOL(drm_sched_job_cleanup);
> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> index 4eb354226972..5c3a99027ecd 100644
> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
>   	if (ret)
>   		return ret;
>   
> +	drm_sched_job_arm(&job->base);
> +
>   	job->done_fence = dma_fence_get(&job->base.s_fence->finished);
>   
>   	/* put by scheduler job completion */
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 88ae7f331bb1..83afc3aa8e2f 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -348,6 +348,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
>   int drm_sched_job_init(struct drm_sched_job *job,
>   		       struct drm_sched_entity *entity,
>   		       void *owner);
> +void drm_sched_job_arm(struct drm_sched_job *job);
>   void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>   				    struct drm_gpu_scheduler **sched_list,
>                                      unsigned int num_sched_list);
> @@ -387,8 +388,12 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>   				   enum drm_sched_priority priority);
>   bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>   
> -struct drm_sched_fence *drm_sched_fence_create(
> +struct drm_sched_fence *drm_sched_fence_alloc(
>   	struct drm_sched_entity *s_entity, void *owner);
> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> +			  struct drm_sched_entity *entity);
> +void drm_sched_fence_free(struct rcu_head *rcu);
> +
>   void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
>   void drm_sched_fence_finished(struct drm_sched_fence *fence);
>
Daniel Vetter July 7, 2021, 11:14 a.m. UTC | #2
On Wed, Jul 7, 2021 at 11:30 AM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 02.07.21 um 23:38 schrieb Daniel Vetter:
> > This is a very confusingly named function, because not just does it
> > init an object, it arms it and provides a point of no return for
> > pushing a job into the scheduler. It would be nice if that's a bit
> > clearer in the interface.
> >
> > But the real reason is that I want to push the dependency tracking
> > helpers into the scheduler code, and that means drm_sched_job_init
> > must be called a lot earlier, without arming the job.
> >
> > v2:
> > - don't change .gitignore (Steven)
> > - don't forget v3d (Emma)
> >
> > v3: Emma noticed that I leak the memory allocated in
> > drm_sched_job_init if we bail out before the point of no return in
> > subsequent driver patches. To be able to fix this change
> > drm_sched_job_cleanup() so it can handle being called both before and
> > after drm_sched_job_arm().
>
> Thinking more about this, I'm not sure if this really works.
>
> See drm_sched_job_init() was also calling drm_sched_entity_select_rq()
> to update the entity->rq association.
>
> And that can only be done later on when we arm the fence as well.

Hm yeah, but that's a bug in the existing code I think: We already
fail to clean up if we fail to allocate the fences. So I think the
right thing to do here is to split the checks into job_init, and do
the actual arming/rq selection in job_arm? I'm not entirely sure
what's all going on there, the first check looks a bit like trying to
schedule before the entity is set up, which is a driver bug and should
have a WARN_ON?

The 2nd check around last_scheduled I have honeslty no idea what it's
even trying to do.
-Daniel

>
> Christian.
>
> >
> > Also improve the kerneldoc for this.
> >
> > Acked-by: Steven Price <steven.price@arm.com> (v2)
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Lucas Stach <l.stach@pengutronix.de>
> > Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> > Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> > Cc: Qiang Yu <yuq825@gmail.com>
> > Cc: Rob Herring <robh@kernel.org>
> > Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> > Cc: Steven Price <steven.price@arm.com>
> > Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: Masahiro Yamada <masahiroy@kernel.org>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Adam Borowski <kilobyte@angband.pl>
> > Cc: Nick Terrell <terrelln@fb.com>
> > Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> > Cc: Sami Tolvanen <samitolvanen@google.com>
> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: Dave Airlie <airlied@redhat.com>
> > Cc: Nirmoy Das <nirmoy.das@amd.com>
> > Cc: Deepak R Varma <mh12gx2825@gmail.com>
> > Cc: Lee Jones <lee.jones@linaro.org>
> > Cc: Kevin Wang <kevin1.wang@amd.com>
> > Cc: Chen Li <chenli@uniontech.com>
> > Cc: Luben Tuikov <luben.tuikov@amd.com>
> > Cc: "Marek Olšák" <marek.olsak@amd.com>
> > Cc: Dennis Li <Dennis.Li@amd.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > Cc: Sonny Jiang <sonny.jiang@amd.com>
> > Cc: Boris Brezillon <boris.brezillon@collabora.com>
> > Cc: Tian Tao <tiantao6@hisilicon.com>
> > Cc: Jack Zhang <Jack.Zhang1@amd.com>
> > Cc: etnaviv@lists.freedesktop.org
> > Cc: lima@lists.freedesktop.org
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > Cc: Emma Anholt <emma@anholt.net>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
> >   drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
> >   drivers/gpu/drm/lima/lima_sched.c        |  2 ++
> >   drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
> >   drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
> >   drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
> >   drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
> >   drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
> >   include/drm/gpu_scheduler.h              |  7 +++-
> >   10 files changed, 74 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index c5386d13eb4a..a4ec092af9a7 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
> >       if (r)
> >               goto error_unlock;
> >
> > +     drm_sched_job_arm(&job->base);
> > +
> >       /* No memory allocation is allowed while holding the notifier lock.
> >        * The lock is held until amdgpu_cs_submit is finished and fence is
> >        * added to BOs.
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > index d33e6d97cc89..5ddb955d2315 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
> >       if (r)
> >               return r;
> >
> > +     drm_sched_job_arm(&job->base);
> > +
> >       *f = dma_fence_get(&job->base.s_fence->finished);
> >       amdgpu_job_free_resources(job);
> >       drm_sched_entity_push_job(&job->base, entity);
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > index feb6da1b6ceb..05f412204118 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
> >       if (ret)
> >               goto out_unlock;
> >
> > +     drm_sched_job_arm(&submit->sched_job);
> > +
> >       submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
> >       submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
> >                                               submit->out_fence, 0,
> > diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> > index dba8329937a3..38f755580507 100644
> > --- a/drivers/gpu/drm/lima/lima_sched.c
> > +++ b/drivers/gpu/drm/lima/lima_sched.c
> > @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
> >               return err;
> >       }
> >
> > +     drm_sched_job_arm(&task->base);
> > +
> >       task->num_bos = num_bos;
> >       task->vm = lima_vm_get(vm);
> >
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index 71a72fb50e6b..2992dc85325f 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
> >               goto unlock;
> >       }
> >
> > +     drm_sched_job_arm(&job->base);
> > +
> >       job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
> >
> >       ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > index 79554aa4dbb1..f7347c284886 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> >    * @sched_job: job to submit
> >    * @entity: scheduler entity
> >    *
> > - * Note: To guarantee that the order of insertion to queue matches
> > - * the job's fence sequence number this function should be
> > - * called with drm_sched_job_init under common lock.
> > + * Note: To guarantee that the order of insertion to queue matches the job's
> > + * fence sequence number this function should be called with drm_sched_job_arm()
> > + * under common lock.
> >    *
> >    * Returns 0 for success, negative error code otherwise.
> >    */
> > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> > index 69de2c76731f..c451ee9a30d7 100644
> > --- a/drivers/gpu/drm/scheduler/sched_fence.c
> > +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> > @@ -90,7 +90,7 @@ static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
> >    *
> >    * Free up the fence memory after the RCU grace period.
> >    */
> > -static void drm_sched_fence_free(struct rcu_head *rcu)
> > +void drm_sched_fence_free(struct rcu_head *rcu)
> >   {
> >       struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
> >       struct drm_sched_fence *fence = to_drm_sched_fence(f);
> > @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> >   }
> >   EXPORT_SYMBOL(to_drm_sched_fence);
> >
> > -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> > -                                            void *owner)
> > +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> > +                                           void *owner)
> >   {
> >       struct drm_sched_fence *fence = NULL;
> > -     unsigned seq;
> >
> >       fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
> >       if (fence == NULL)
> > @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> >       fence->sched = entity->rq->sched;
> >       spin_lock_init(&fence->lock);
> >
> > +     return fence;
> > +}
> > +
> > +void drm_sched_fence_init(struct drm_sched_fence *fence,
> > +                       struct drm_sched_entity *entity)
> > +{
> > +     unsigned seq;
> > +
> >       seq = atomic_inc_return(&entity->fence_seq);
> >       dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
> >                      &fence->lock, entity->fence_context, seq);
> >       dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
> >                      &fence->lock, entity->fence_context + 1, seq);
> > -
> > -     return fence;
> >   }
> >
> >   module_init(drm_sched_fence_slab_init);
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 33c414d55fab..5e84e1500c32 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -48,9 +48,11 @@
> >   #include <linux/wait.h>
> >   #include <linux/sched.h>
> >   #include <linux/completion.h>
> > +#include <linux/dma-resv.h>
> >   #include <uapi/linux/sched/types.h>
> >
> >   #include <drm/drm_print.h>
> > +#include <drm/drm_gem.h>
> >   #include <drm/gpu_scheduler.h>
> >   #include <drm/spsc_queue.h>
> >
> > @@ -569,7 +571,6 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
> >
> >   /**
> >    * drm_sched_job_init - init a scheduler job
> > - *
> >    * @job: scheduler job to init
> >    * @entity: scheduler entity to use
> >    * @owner: job owner for debugging
> > @@ -577,6 +578,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
> >    * Refer to drm_sched_entity_push_job() documentation
> >    * for locking considerations.
> >    *
> > + * Drivers must make sure drm_sched_job_cleanup() if this function returns
> > + * successfully, even when @job is aborted before drm_sched_job_arm() is called.
> > + *
> >    * Returns 0 for success, negative error code otherwise.
> >    */
> >   int drm_sched_job_init(struct drm_sched_job *job,
> > @@ -594,7 +598,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >       job->sched = sched;
> >       job->entity = entity;
> >       job->s_priority = entity->rq - sched->sched_rq;
> > -     job->s_fence = drm_sched_fence_create(entity, owner);
> > +     job->s_fence = drm_sched_fence_alloc(entity, owner);
> >       if (!job->s_fence)
> >               return -ENOMEM;
> >       job->id = atomic64_inc_return(&sched->job_id_count);
> > @@ -606,13 +610,47 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >   EXPORT_SYMBOL(drm_sched_job_init);
> >
> >   /**
> > - * drm_sched_job_cleanup - clean up scheduler job resources
> > + * drm_sched_job_arm - arm a scheduler job for execution
> > + * @job: scheduler job to arm
> > + *
> > + * This arms a scheduler job for execution. Specifically it initializes the
> > + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
> > + * or other places that need to track the completion of this job.
> > + *
> > + * Refer to drm_sched_entity_push_job() documentation for locking
> > + * considerations.
> >    *
> > + * This can only be called if drm_sched_job_init() succeeded.
> > + */
> > +void drm_sched_job_arm(struct drm_sched_job *job)
> > +{
> > +     drm_sched_fence_init(job->s_fence, job->entity);
> > +}
> > +EXPORT_SYMBOL(drm_sched_job_arm);
> > +
> > +/**
> > + * drm_sched_job_cleanup - clean up scheduler job resources
> >    * @job: scheduler job to clean up
> > + *
> > + * Cleans up the resources allocated with drm_sched_job_init().
> > + *
> > + * Drivers should call this from their error unwind code if @job is aborted
> > + * before drm_sched_job_arm() is called.
> > + *
> > + * After that point of no return @job is committed to be executed by the
> > + * scheduler, and this function should be called from the
> > + * &drm_sched_backend_ops.free_job callback.
> >    */
> >   void drm_sched_job_cleanup(struct drm_sched_job *job)
> >   {
> > -     dma_fence_put(&job->s_fence->finished);
> > +     if (!kref_read(&job->s_fence->finished.refcount)) {
> > +             /* drm_sched_job_arm() has been called */
> > +             dma_fence_put(&job->s_fence->finished);
> > +     } else {
> > +             /* aborted job before committing to run it */
> > +             drm_sched_fence_free(&job->s_fence->finished.rcu);
> > +     }
> > +
> >       job->s_fence = NULL;
> >   }
> >   EXPORT_SYMBOL(drm_sched_job_cleanup);
> > diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> > index 4eb354226972..5c3a99027ecd 100644
> > --- a/drivers/gpu/drm/v3d/v3d_gem.c
> > +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> > @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
> >       if (ret)
> >               return ret;
> >
> > +     drm_sched_job_arm(&job->base);
> > +
> >       job->done_fence = dma_fence_get(&job->base.s_fence->finished);
> >
> >       /* put by scheduler job completion */
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 88ae7f331bb1..83afc3aa8e2f 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -348,6 +348,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
> >   int drm_sched_job_init(struct drm_sched_job *job,
> >                      struct drm_sched_entity *entity,
> >                      void *owner);
> > +void drm_sched_job_arm(struct drm_sched_job *job);
> >   void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> >                                   struct drm_gpu_scheduler **sched_list,
> >                                      unsigned int num_sched_list);
> > @@ -387,8 +388,12 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> >                                  enum drm_sched_priority priority);
> >   bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> >
> > -struct drm_sched_fence *drm_sched_fence_create(
> > +struct drm_sched_fence *drm_sched_fence_alloc(
> >       struct drm_sched_entity *s_entity, void *owner);
> > +void drm_sched_fence_init(struct drm_sched_fence *fence,
> > +                       struct drm_sched_entity *entity);
> > +void drm_sched_fence_free(struct rcu_head *rcu);
> > +
> >   void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
> >   void drm_sched_fence_finished(struct drm_sched_fence *fence);
> >
>
Christian König July 7, 2021, 11:57 a.m. UTC | #3
Am 07.07.21 um 13:14 schrieb Daniel Vetter:
> On Wed, Jul 7, 2021 at 11:30 AM Christian König
> <christian.koenig@amd.com> wrote:
>> Am 02.07.21 um 23:38 schrieb Daniel Vetter:
>>> This is a very confusingly named function, because not just does it
>>> init an object, it arms it and provides a point of no return for
>>> pushing a job into the scheduler. It would be nice if that's a bit
>>> clearer in the interface.
>>>
>>> But the real reason is that I want to push the dependency tracking
>>> helpers into the scheduler code, and that means drm_sched_job_init
>>> must be called a lot earlier, without arming the job.
>>>
>>> v2:
>>> - don't change .gitignore (Steven)
>>> - don't forget v3d (Emma)
>>>
>>> v3: Emma noticed that I leak the memory allocated in
>>> drm_sched_job_init if we bail out before the point of no return in
>>> subsequent driver patches. To be able to fix this change
>>> drm_sched_job_cleanup() so it can handle being called both before and
>>> after drm_sched_job_arm().
>> Thinking more about this, I'm not sure if this really works.
>>
>> See drm_sched_job_init() was also calling drm_sched_entity_select_rq()
>> to update the entity->rq association.
>>
>> And that can only be done later on when we arm the fence as well.
> Hm yeah, but that's a bug in the existing code I think: We already
> fail to clean up if we fail to allocate the fences. So I think the
> right thing to do here is to split the checks into job_init, and do
> the actual arming/rq selection in job_arm? I'm not entirely sure
> what's all going on there, the first check looks a bit like trying to
> schedule before the entity is set up, which is a driver bug and should
> have a WARN_ON?

No you misunderstood me, the problem is something else.

You asked previously why the call to drm_sched_job_init() was so late in 
the CS.

The reason for this was not alone the scheduler fence init, but also the 
call to drm_sched_entity_select_rq().

> The 2nd check around last_scheduled I have honeslty no idea what it's
> even trying to do.

You mean that here?

         fence = READ_ONCE(entity->last_scheduled);
         if (fence && !dma_fence_is_signaled(fence))
                 return;

This makes sure that load balancing is not moving the entity to a 
different scheduler while there are still jobs running from this entity 
on the hardware,

Regards
Christian.

> -Daniel
>
>> Christian.
>>
>>> Also improve the kerneldoc for this.
>>>
>>> Acked-by: Steven Price <steven.price@arm.com> (v2)
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
>>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
>>> Cc: Qiang Yu <yuq825@gmail.com>
>>> Cc: Rob Herring <robh@kernel.org>
>>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
>>> Cc: Steven Price <steven.price@arm.com>
>>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
>>> Cc: David Airlie <airlied@linux.ie>
>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>> Cc: "Christian König" <christian.koenig@amd.com>
>>> Cc: Masahiro Yamada <masahiroy@kernel.org>
>>> Cc: Kees Cook <keescook@chromium.org>
>>> Cc: Adam Borowski <kilobyte@angband.pl>
>>> Cc: Nick Terrell <terrelln@fb.com>
>>> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
>>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
>>> Cc: Sami Tolvanen <samitolvanen@google.com>
>>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>> Cc: Dave Airlie <airlied@redhat.com>
>>> Cc: Nirmoy Das <nirmoy.das@amd.com>
>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>> Cc: Lee Jones <lee.jones@linaro.org>
>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>> Cc: Chen Li <chenli@uniontech.com>
>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>> Cc: "Marek Olšák" <marek.olsak@amd.com>
>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> Cc: Sonny Jiang <sonny.jiang@amd.com>
>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
>>> Cc: Tian Tao <tiantao6@hisilicon.com>
>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
>>> Cc: etnaviv@lists.freedesktop.org
>>> Cc: lima@lists.freedesktop.org
>>> Cc: linux-media@vger.kernel.org
>>> Cc: linaro-mm-sig@lists.linaro.org
>>> Cc: Emma Anholt <emma@anholt.net>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
>>>    drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
>>>    drivers/gpu/drm/lima/lima_sched.c        |  2 ++
>>>    drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
>>>    drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
>>>    drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
>>>    drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
>>>    drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
>>>    include/drm/gpu_scheduler.h              |  7 +++-
>>>    10 files changed, 74 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> index c5386d13eb4a..a4ec092af9a7 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>>>        if (r)
>>>                goto error_unlock;
>>>
>>> +     drm_sched_job_arm(&job->base);
>>> +
>>>        /* No memory allocation is allowed while holding the notifier lock.
>>>         * The lock is held until amdgpu_cs_submit is finished and fence is
>>>         * added to BOs.
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> index d33e6d97cc89..5ddb955d2315 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
>>>        if (r)
>>>                return r;
>>>
>>> +     drm_sched_job_arm(&job->base);
>>> +
>>>        *f = dma_fence_get(&job->base.s_fence->finished);
>>>        amdgpu_job_free_resources(job);
>>>        drm_sched_entity_push_job(&job->base, entity);
>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>> index feb6da1b6ceb..05f412204118 100644
>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
>>>        if (ret)
>>>                goto out_unlock;
>>>
>>> +     drm_sched_job_arm(&submit->sched_job);
>>> +
>>>        submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
>>>        submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
>>>                                                submit->out_fence, 0,
>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>> index dba8329937a3..38f755580507 100644
>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
>>>                return err;
>>>        }
>>>
>>> +     drm_sched_job_arm(&task->base);
>>> +
>>>        task->num_bos = num_bos;
>>>        task->vm = lima_vm_get(vm);
>>>
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> index 71a72fb50e6b..2992dc85325f 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
>>>                goto unlock;
>>>        }
>>>
>>> +     drm_sched_job_arm(&job->base);
>>> +
>>>        job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>>>
>>>        ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>>> index 79554aa4dbb1..f7347c284886 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>>>     * @sched_job: job to submit
>>>     * @entity: scheduler entity
>>>     *
>>> - * Note: To guarantee that the order of insertion to queue matches
>>> - * the job's fence sequence number this function should be
>>> - * called with drm_sched_job_init under common lock.
>>> + * Note: To guarantee that the order of insertion to queue matches the job's
>>> + * fence sequence number this function should be called with drm_sched_job_arm()
>>> + * under common lock.
>>>     *
>>>     * Returns 0 for success, negative error code otherwise.
>>>     */
>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
>>> index 69de2c76731f..c451ee9a30d7 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
>>> @@ -90,7 +90,7 @@ static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
>>>     *
>>>     * Free up the fence memory after the RCU grace period.
>>>     */
>>> -static void drm_sched_fence_free(struct rcu_head *rcu)
>>> +void drm_sched_fence_free(struct rcu_head *rcu)
>>>    {
>>>        struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
>>>        struct drm_sched_fence *fence = to_drm_sched_fence(f);
>>> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>>>    }
>>>    EXPORT_SYMBOL(to_drm_sched_fence);
>>>
>>> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>>> -                                            void *owner)
>>> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
>>> +                                           void *owner)
>>>    {
>>>        struct drm_sched_fence *fence = NULL;
>>> -     unsigned seq;
>>>
>>>        fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
>>>        if (fence == NULL)
>>> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>>>        fence->sched = entity->rq->sched;
>>>        spin_lock_init(&fence->lock);
>>>
>>> +     return fence;
>>> +}
>>> +
>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
>>> +                       struct drm_sched_entity *entity)
>>> +{
>>> +     unsigned seq;
>>> +
>>>        seq = atomic_inc_return(&entity->fence_seq);
>>>        dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
>>>                       &fence->lock, entity->fence_context, seq);
>>>        dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
>>>                       &fence->lock, entity->fence_context + 1, seq);
>>> -
>>> -     return fence;
>>>    }
>>>
>>>    module_init(drm_sched_fence_slab_init);
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 33c414d55fab..5e84e1500c32 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -48,9 +48,11 @@
>>>    #include <linux/wait.h>
>>>    #include <linux/sched.h>
>>>    #include <linux/completion.h>
>>> +#include <linux/dma-resv.h>
>>>    #include <uapi/linux/sched/types.h>
>>>
>>>    #include <drm/drm_print.h>
>>> +#include <drm/drm_gem.h>
>>>    #include <drm/gpu_scheduler.h>
>>>    #include <drm/spsc_queue.h>
>>>
>>> @@ -569,7 +571,6 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
>>>
>>>    /**
>>>     * drm_sched_job_init - init a scheduler job
>>> - *
>>>     * @job: scheduler job to init
>>>     * @entity: scheduler entity to use
>>>     * @owner: job owner for debugging
>>> @@ -577,6 +578,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
>>>     * Refer to drm_sched_entity_push_job() documentation
>>>     * for locking considerations.
>>>     *
>>> + * Drivers must make sure drm_sched_job_cleanup() if this function returns
>>> + * successfully, even when @job is aborted before drm_sched_job_arm() is called.
>>> + *
>>>     * Returns 0 for success, negative error code otherwise.
>>>     */
>>>    int drm_sched_job_init(struct drm_sched_job *job,
>>> @@ -594,7 +598,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>        job->sched = sched;
>>>        job->entity = entity;
>>>        job->s_priority = entity->rq - sched->sched_rq;
>>> -     job->s_fence = drm_sched_fence_create(entity, owner);
>>> +     job->s_fence = drm_sched_fence_alloc(entity, owner);
>>>        if (!job->s_fence)
>>>                return -ENOMEM;
>>>        job->id = atomic64_inc_return(&sched->job_id_count);
>>> @@ -606,13 +610,47 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>    EXPORT_SYMBOL(drm_sched_job_init);
>>>
>>>    /**
>>> - * drm_sched_job_cleanup - clean up scheduler job resources
>>> + * drm_sched_job_arm - arm a scheduler job for execution
>>> + * @job: scheduler job to arm
>>> + *
>>> + * This arms a scheduler job for execution. Specifically it initializes the
>>> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
>>> + * or other places that need to track the completion of this job.
>>> + *
>>> + * Refer to drm_sched_entity_push_job() documentation for locking
>>> + * considerations.
>>>     *
>>> + * This can only be called if drm_sched_job_init() succeeded.
>>> + */
>>> +void drm_sched_job_arm(struct drm_sched_job *job)
>>> +{
>>> +     drm_sched_fence_init(job->s_fence, job->entity);
>>> +}
>>> +EXPORT_SYMBOL(drm_sched_job_arm);
>>> +
>>> +/**
>>> + * drm_sched_job_cleanup - clean up scheduler job resources
>>>     * @job: scheduler job to clean up
>>> + *
>>> + * Cleans up the resources allocated with drm_sched_job_init().
>>> + *
>>> + * Drivers should call this from their error unwind code if @job is aborted
>>> + * before drm_sched_job_arm() is called.
>>> + *
>>> + * After that point of no return @job is committed to be executed by the
>>> + * scheduler, and this function should be called from the
>>> + * &drm_sched_backend_ops.free_job callback.
>>>     */
>>>    void drm_sched_job_cleanup(struct drm_sched_job *job)
>>>    {
>>> -     dma_fence_put(&job->s_fence->finished);
>>> +     if (!kref_read(&job->s_fence->finished.refcount)) {
>>> +             /* drm_sched_job_arm() has been called */
>>> +             dma_fence_put(&job->s_fence->finished);
>>> +     } else {
>>> +             /* aborted job before committing to run it */
>>> +             drm_sched_fence_free(&job->s_fence->finished.rcu);
>>> +     }
>>> +
>>>        job->s_fence = NULL;
>>>    }
>>>    EXPORT_SYMBOL(drm_sched_job_cleanup);
>>> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
>>> index 4eb354226972..5c3a99027ecd 100644
>>> --- a/drivers/gpu/drm/v3d/v3d_gem.c
>>> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
>>> @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
>>>        if (ret)
>>>                return ret;
>>>
>>> +     drm_sched_job_arm(&job->base);
>>> +
>>>        job->done_fence = dma_fence_get(&job->base.s_fence->finished);
>>>
>>>        /* put by scheduler job completion */
>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>> index 88ae7f331bb1..83afc3aa8e2f 100644
>>> --- a/include/drm/gpu_scheduler.h
>>> +++ b/include/drm/gpu_scheduler.h
>>> @@ -348,6 +348,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
>>>    int drm_sched_job_init(struct drm_sched_job *job,
>>>                       struct drm_sched_entity *entity,
>>>                       void *owner);
>>> +void drm_sched_job_arm(struct drm_sched_job *job);
>>>    void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>>>                                    struct drm_gpu_scheduler **sched_list,
>>>                                       unsigned int num_sched_list);
>>> @@ -387,8 +388,12 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>>>                                   enum drm_sched_priority priority);
>>>    bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>>>
>>> -struct drm_sched_fence *drm_sched_fence_create(
>>> +struct drm_sched_fence *drm_sched_fence_alloc(
>>>        struct drm_sched_entity *s_entity, void *owner);
>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
>>> +                       struct drm_sched_entity *entity);
>>> +void drm_sched_fence_free(struct rcu_head *rcu);
>>> +
>>>    void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
>>>    void drm_sched_fence_finished(struct drm_sched_fence *fence);
>>>
>
Daniel Vetter July 7, 2021, 12:13 p.m. UTC | #4
On Wed, Jul 7, 2021 at 1:57 PM Christian König <christian.koenig@amd.com> wrote:
> Am 07.07.21 um 13:14 schrieb Daniel Vetter:
> > On Wed, Jul 7, 2021 at 11:30 AM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Am 02.07.21 um 23:38 schrieb Daniel Vetter:
> >>> This is a very confusingly named function, because not just does it
> >>> init an object, it arms it and provides a point of no return for
> >>> pushing a job into the scheduler. It would be nice if that's a bit
> >>> clearer in the interface.
> >>>
> >>> But the real reason is that I want to push the dependency tracking
> >>> helpers into the scheduler code, and that means drm_sched_job_init
> >>> must be called a lot earlier, without arming the job.
> >>>
> >>> v2:
> >>> - don't change .gitignore (Steven)
> >>> - don't forget v3d (Emma)
> >>>
> >>> v3: Emma noticed that I leak the memory allocated in
> >>> drm_sched_job_init if we bail out before the point of no return in
> >>> subsequent driver patches. To be able to fix this change
> >>> drm_sched_job_cleanup() so it can handle being called both before and
> >>> after drm_sched_job_arm().
> >> Thinking more about this, I'm not sure if this really works.
> >>
> >> See drm_sched_job_init() was also calling drm_sched_entity_select_rq()
> >> to update the entity->rq association.
> >>
> >> And that can only be done later on when we arm the fence as well.
> > Hm yeah, but that's a bug in the existing code I think: We already
> > fail to clean up if we fail to allocate the fences. So I think the
> > right thing to do here is to split the checks into job_init, and do
> > the actual arming/rq selection in job_arm? I'm not entirely sure
> > what's all going on there, the first check looks a bit like trying to
> > schedule before the entity is set up, which is a driver bug and should
> > have a WARN_ON?
>
> No you misunderstood me, the problem is something else.
>
> You asked previously why the call to drm_sched_job_init() was so late in
> the CS.
>
> The reason for this was not alone the scheduler fence init, but also the
> call to drm_sched_entity_select_rq().

Ah ok, I think I can fix that. Needs a prep patch to first make
drm_sched_entity_select infallible, then should be easy to do.

> > The 2nd check around last_scheduled I have honeslty no idea what it's
> > even trying to do.
>
> You mean that here?
>
>          fence = READ_ONCE(entity->last_scheduled);
>          if (fence && !dma_fence_is_signaled(fence))
>                  return;
>
> This makes sure that load balancing is not moving the entity to a
> different scheduler while there are still jobs running from this entity
> on the hardware,

Yeah after a nap that idea crossed my mind too. But now I have locking
questions, afaiui the scheduler thread updates this, without taking
any locks - entity dequeuing is lockless. And here we read the fence
and then seem to yolo check whether it's signalled? What's preventing
a use-after-free here? There's no rcu or anything going on here at
all, and it's outside of the spinlock section, which starts a bit
further down.
-Daniel

>
> Regards
> Christian.
>
> > -Daniel
> >
> >> Christian.
> >>
> >>> Also improve the kerneldoc for this.
> >>>
> >>> Acked-by: Steven Price <steven.price@arm.com> (v2)
> >>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>> Cc: Lucas Stach <l.stach@pengutronix.de>
> >>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> >>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> >>> Cc: Qiang Yu <yuq825@gmail.com>
> >>> Cc: Rob Herring <robh@kernel.org>
> >>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> >>> Cc: Steven Price <steven.price@arm.com>
> >>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> >>> Cc: David Airlie <airlied@linux.ie>
> >>> Cc: Daniel Vetter <daniel@ffwll.ch>
> >>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> >>> Cc: "Christian König" <christian.koenig@amd.com>
> >>> Cc: Masahiro Yamada <masahiroy@kernel.org>
> >>> Cc: Kees Cook <keescook@chromium.org>
> >>> Cc: Adam Borowski <kilobyte@angband.pl>
> >>> Cc: Nick Terrell <terrelln@fb.com>
> >>> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> >>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> >>> Cc: Sami Tolvanen <samitolvanen@google.com>
> >>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> >>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>> Cc: Dave Airlie <airlied@redhat.com>
> >>> Cc: Nirmoy Das <nirmoy.das@amd.com>
> >>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> >>> Cc: Lee Jones <lee.jones@linaro.org>
> >>> Cc: Kevin Wang <kevin1.wang@amd.com>
> >>> Cc: Chen Li <chenli@uniontech.com>
> >>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>> Cc: "Marek Olšák" <marek.olsak@amd.com>
> >>> Cc: Dennis Li <Dennis.Li@amd.com>
> >>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> >>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>> Cc: Sonny Jiang <sonny.jiang@amd.com>
> >>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> >>> Cc: Tian Tao <tiantao6@hisilicon.com>
> >>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> >>> Cc: etnaviv@lists.freedesktop.org
> >>> Cc: lima@lists.freedesktop.org
> >>> Cc: linux-media@vger.kernel.org
> >>> Cc: linaro-mm-sig@lists.linaro.org
> >>> Cc: Emma Anholt <emma@anholt.net>
> >>> ---
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
> >>>    drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
> >>>    drivers/gpu/drm/lima/lima_sched.c        |  2 ++
> >>>    drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
> >>>    drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
> >>>    drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
> >>>    drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
> >>>    drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
> >>>    include/drm/gpu_scheduler.h              |  7 +++-
> >>>    10 files changed, 74 insertions(+), 14 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> index c5386d13eb4a..a4ec092af9a7 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
> >>>        if (r)
> >>>                goto error_unlock;
> >>>
> >>> +     drm_sched_job_arm(&job->base);
> >>> +
> >>>        /* No memory allocation is allowed while holding the notifier lock.
> >>>         * The lock is held until amdgpu_cs_submit is finished and fence is
> >>>         * added to BOs.
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>> index d33e6d97cc89..5ddb955d2315 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
> >>>        if (r)
> >>>                return r;
> >>>
> >>> +     drm_sched_job_arm(&job->base);
> >>> +
> >>>        *f = dma_fence_get(&job->base.s_fence->finished);
> >>>        amdgpu_job_free_resources(job);
> >>>        drm_sched_entity_push_job(&job->base, entity);
> >>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>> index feb6da1b6ceb..05f412204118 100644
> >>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
> >>>        if (ret)
> >>>                goto out_unlock;
> >>>
> >>> +     drm_sched_job_arm(&submit->sched_job);
> >>> +
> >>>        submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
> >>>        submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
> >>>                                                submit->out_fence, 0,
> >>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> >>> index dba8329937a3..38f755580507 100644
> >>> --- a/drivers/gpu/drm/lima/lima_sched.c
> >>> +++ b/drivers/gpu/drm/lima/lima_sched.c
> >>> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
> >>>                return err;
> >>>        }
> >>>
> >>> +     drm_sched_job_arm(&task->base);
> >>> +
> >>>        task->num_bos = num_bos;
> >>>        task->vm = lima_vm_get(vm);
> >>>
> >>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> >>> index 71a72fb50e6b..2992dc85325f 100644
> >>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> >>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> >>> @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
> >>>                goto unlock;
> >>>        }
> >>>
> >>> +     drm_sched_job_arm(&job->base);
> >>> +
> >>>        job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
> >>>
> >>>        ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> >>> index 79554aa4dbb1..f7347c284886 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> >>> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> >>>     * @sched_job: job to submit
> >>>     * @entity: scheduler entity
> >>>     *
> >>> - * Note: To guarantee that the order of insertion to queue matches
> >>> - * the job's fence sequence number this function should be
> >>> - * called with drm_sched_job_init under common lock.
> >>> + * Note: To guarantee that the order of insertion to queue matches the job's
> >>> + * fence sequence number this function should be called with drm_sched_job_arm()
> >>> + * under common lock.
> >>>     *
> >>>     * Returns 0 for success, negative error code otherwise.
> >>>     */
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> >>> index 69de2c76731f..c451ee9a30d7 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> >>> @@ -90,7 +90,7 @@ static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
> >>>     *
> >>>     * Free up the fence memory after the RCU grace period.
> >>>     */
> >>> -static void drm_sched_fence_free(struct rcu_head *rcu)
> >>> +void drm_sched_fence_free(struct rcu_head *rcu)
> >>>    {
> >>>        struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
> >>>        struct drm_sched_fence *fence = to_drm_sched_fence(f);
> >>> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> >>>    }
> >>>    EXPORT_SYMBOL(to_drm_sched_fence);
> >>>
> >>> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> >>> -                                            void *owner)
> >>> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> >>> +                                           void *owner)
> >>>    {
> >>>        struct drm_sched_fence *fence = NULL;
> >>> -     unsigned seq;
> >>>
> >>>        fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
> >>>        if (fence == NULL)
> >>> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> >>>        fence->sched = entity->rq->sched;
> >>>        spin_lock_init(&fence->lock);
> >>>
> >>> +     return fence;
> >>> +}
> >>> +
> >>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> >>> +                       struct drm_sched_entity *entity)
> >>> +{
> >>> +     unsigned seq;
> >>> +
> >>>        seq = atomic_inc_return(&entity->fence_seq);
> >>>        dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
> >>>                       &fence->lock, entity->fence_context, seq);
> >>>        dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
> >>>                       &fence->lock, entity->fence_context + 1, seq);
> >>> -
> >>> -     return fence;
> >>>    }
> >>>
> >>>    module_init(drm_sched_fence_slab_init);
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>> index 33c414d55fab..5e84e1500c32 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>> @@ -48,9 +48,11 @@
> >>>    #include <linux/wait.h>
> >>>    #include <linux/sched.h>
> >>>    #include <linux/completion.h>
> >>> +#include <linux/dma-resv.h>
> >>>    #include <uapi/linux/sched/types.h>
> >>>
> >>>    #include <drm/drm_print.h>
> >>> +#include <drm/drm_gem.h>
> >>>    #include <drm/gpu_scheduler.h>
> >>>    #include <drm/spsc_queue.h>
> >>>
> >>> @@ -569,7 +571,6 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
> >>>
> >>>    /**
> >>>     * drm_sched_job_init - init a scheduler job
> >>> - *
> >>>     * @job: scheduler job to init
> >>>     * @entity: scheduler entity to use
> >>>     * @owner: job owner for debugging
> >>> @@ -577,6 +578,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
> >>>     * Refer to drm_sched_entity_push_job() documentation
> >>>     * for locking considerations.
> >>>     *
> >>> + * Drivers must make sure drm_sched_job_cleanup() if this function returns
> >>> + * successfully, even when @job is aborted before drm_sched_job_arm() is called.
> >>> + *
> >>>     * Returns 0 for success, negative error code otherwise.
> >>>     */
> >>>    int drm_sched_job_init(struct drm_sched_job *job,
> >>> @@ -594,7 +598,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >>>        job->sched = sched;
> >>>        job->entity = entity;
> >>>        job->s_priority = entity->rq - sched->sched_rq;
> >>> -     job->s_fence = drm_sched_fence_create(entity, owner);
> >>> +     job->s_fence = drm_sched_fence_alloc(entity, owner);
> >>>        if (!job->s_fence)
> >>>                return -ENOMEM;
> >>>        job->id = atomic64_inc_return(&sched->job_id_count);
> >>> @@ -606,13 +610,47 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >>>    EXPORT_SYMBOL(drm_sched_job_init);
> >>>
> >>>    /**
> >>> - * drm_sched_job_cleanup - clean up scheduler job resources
> >>> + * drm_sched_job_arm - arm a scheduler job for execution
> >>> + * @job: scheduler job to arm
> >>> + *
> >>> + * This arms a scheduler job for execution. Specifically it initializes the
> >>> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
> >>> + * or other places that need to track the completion of this job.
> >>> + *
> >>> + * Refer to drm_sched_entity_push_job() documentation for locking
> >>> + * considerations.
> >>>     *
> >>> + * This can only be called if drm_sched_job_init() succeeded.
> >>> + */
> >>> +void drm_sched_job_arm(struct drm_sched_job *job)
> >>> +{
> >>> +     drm_sched_fence_init(job->s_fence, job->entity);
> >>> +}
> >>> +EXPORT_SYMBOL(drm_sched_job_arm);
> >>> +
> >>> +/**
> >>> + * drm_sched_job_cleanup - clean up scheduler job resources
> >>>     * @job: scheduler job to clean up
> >>> + *
> >>> + * Cleans up the resources allocated with drm_sched_job_init().
> >>> + *
> >>> + * Drivers should call this from their error unwind code if @job is aborted
> >>> + * before drm_sched_job_arm() is called.
> >>> + *
> >>> + * After that point of no return @job is committed to be executed by the
> >>> + * scheduler, and this function should be called from the
> >>> + * &drm_sched_backend_ops.free_job callback.
> >>>     */
> >>>    void drm_sched_job_cleanup(struct drm_sched_job *job)
> >>>    {
> >>> -     dma_fence_put(&job->s_fence->finished);
> >>> +     if (!kref_read(&job->s_fence->finished.refcount)) {
> >>> +             /* drm_sched_job_arm() has been called */
> >>> +             dma_fence_put(&job->s_fence->finished);
> >>> +     } else {
> >>> +             /* aborted job before committing to run it */
> >>> +             drm_sched_fence_free(&job->s_fence->finished.rcu);
> >>> +     }
> >>> +
> >>>        job->s_fence = NULL;
> >>>    }
> >>>    EXPORT_SYMBOL(drm_sched_job_cleanup);
> >>> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> >>> index 4eb354226972..5c3a99027ecd 100644
> >>> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> >>> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> >>> @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
> >>>        if (ret)
> >>>                return ret;
> >>>
> >>> +     drm_sched_job_arm(&job->base);
> >>> +
> >>>        job->done_fence = dma_fence_get(&job->base.s_fence->finished);
> >>>
> >>>        /* put by scheduler job completion */
> >>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> >>> index 88ae7f331bb1..83afc3aa8e2f 100644
> >>> --- a/include/drm/gpu_scheduler.h
> >>> +++ b/include/drm/gpu_scheduler.h
> >>> @@ -348,6 +348,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
> >>>    int drm_sched_job_init(struct drm_sched_job *job,
> >>>                       struct drm_sched_entity *entity,
> >>>                       void *owner);
> >>> +void drm_sched_job_arm(struct drm_sched_job *job);
> >>>    void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> >>>                                    struct drm_gpu_scheduler **sched_list,
> >>>                                       unsigned int num_sched_list);
> >>> @@ -387,8 +388,12 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> >>>                                   enum drm_sched_priority priority);
> >>>    bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> >>>
> >>> -struct drm_sched_fence *drm_sched_fence_create(
> >>> +struct drm_sched_fence *drm_sched_fence_alloc(
> >>>        struct drm_sched_entity *s_entity, void *owner);
> >>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> >>> +                       struct drm_sched_entity *entity);
> >>> +void drm_sched_fence_free(struct rcu_head *rcu);
> >>> +
> >>>    void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
> >>>    void drm_sched_fence_finished(struct drm_sched_fence *fence);
> >>>
> >
>
Christian König July 7, 2021, 12:58 p.m. UTC | #5
Am 07.07.21 um 14:13 schrieb Daniel Vetter:
> On Wed, Jul 7, 2021 at 1:57 PM Christian König <christian.koenig@amd.com> wrote:
>> Am 07.07.21 um 13:14 schrieb Daniel Vetter:
>>> On Wed, Jul 7, 2021 at 11:30 AM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>> Am 02.07.21 um 23:38 schrieb Daniel Vetter:
>>>>> This is a very confusingly named function, because not just does it
>>>>> init an object, it arms it and provides a point of no return for
>>>>> pushing a job into the scheduler. It would be nice if that's a bit
>>>>> clearer in the interface.
>>>>>
>>>>> But the real reason is that I want to push the dependency tracking
>>>>> helpers into the scheduler code, and that means drm_sched_job_init
>>>>> must be called a lot earlier, without arming the job.
>>>>>
>>>>> v2:
>>>>> - don't change .gitignore (Steven)
>>>>> - don't forget v3d (Emma)
>>>>>
>>>>> v3: Emma noticed that I leak the memory allocated in
>>>>> drm_sched_job_init if we bail out before the point of no return in
>>>>> subsequent driver patches. To be able to fix this change
>>>>> drm_sched_job_cleanup() so it can handle being called both before and
>>>>> after drm_sched_job_arm().
>>>> Thinking more about this, I'm not sure if this really works.
>>>>
>>>> See drm_sched_job_init() was also calling drm_sched_entity_select_rq()
>>>> to update the entity->rq association.
>>>>
>>>> And that can only be done later on when we arm the fence as well.
>>> Hm yeah, but that's a bug in the existing code I think: We already
>>> fail to clean up if we fail to allocate the fences. So I think the
>>> right thing to do here is to split the checks into job_init, and do
>>> the actual arming/rq selection in job_arm? I'm not entirely sure
>>> what's all going on there, the first check looks a bit like trying to
>>> schedule before the entity is set up, which is a driver bug and should
>>> have a WARN_ON?
>> No you misunderstood me, the problem is something else.
>>
>> You asked previously why the call to drm_sched_job_init() was so late in
>> the CS.
>>
>> The reason for this was not alone the scheduler fence init, but also the
>> call to drm_sched_entity_select_rq().
> Ah ok, I think I can fix that. Needs a prep patch to first make
> drm_sched_entity_select infallible, then should be easy to do.
>
>>> The 2nd check around last_scheduled I have honeslty no idea what it's
>>> even trying to do.
>> You mean that here?
>>
>>           fence = READ_ONCE(entity->last_scheduled);
>>           if (fence && !dma_fence_is_signaled(fence))
>>                   return;
>>
>> This makes sure that load balancing is not moving the entity to a
>> different scheduler while there are still jobs running from this entity
>> on the hardware,
> Yeah after a nap that idea crossed my mind too. But now I have locking
> questions, afaiui the scheduler thread updates this, without taking
> any locks - entity dequeuing is lockless. And here we read the fence
> and then seem to yolo check whether it's signalled? What's preventing
> a use-after-free here? There's no rcu or anything going on here at
> all, and it's outside of the spinlock section, which starts a bit
> further down.

The last_scheduled fence of an entity can only change when there are 
jobs on the entities queued, and we have just ruled that out in the 
check before.

Christian.


> -Daniel
>
>> Regards
>> Christian.
>>
>>> -Daniel
>>>
>>>> Christian.
>>>>
>>>>> Also improve the kerneldoc for this.
>>>>>
>>>>> Acked-by: Steven Price <steven.price@arm.com> (v2)
>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>>>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
>>>>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
>>>>> Cc: Qiang Yu <yuq825@gmail.com>
>>>>> Cc: Rob Herring <robh@kernel.org>
>>>>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
>>>>> Cc: Steven Price <steven.price@arm.com>
>>>>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
>>>>> Cc: David Airlie <airlied@linux.ie>
>>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>> Cc: Masahiro Yamada <masahiroy@kernel.org>
>>>>> Cc: Kees Cook <keescook@chromium.org>
>>>>> Cc: Adam Borowski <kilobyte@angband.pl>
>>>>> Cc: Nick Terrell <terrelln@fb.com>
>>>>> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
>>>>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
>>>>> Cc: Sami Tolvanen <samitolvanen@google.com>
>>>>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>> Cc: Dave Airlie <airlied@redhat.com>
>>>>> Cc: Nirmoy Das <nirmoy.das@amd.com>
>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>>>> Cc: Lee Jones <lee.jones@linaro.org>
>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>>>> Cc: Chen Li <chenli@uniontech.com>
>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>> Cc: "Marek Olšák" <marek.olsak@amd.com>
>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> Cc: Sonny Jiang <sonny.jiang@amd.com>
>>>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
>>>>> Cc: Tian Tao <tiantao6@hisilicon.com>
>>>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
>>>>> Cc: etnaviv@lists.freedesktop.org
>>>>> Cc: lima@lists.freedesktop.org
>>>>> Cc: linux-media@vger.kernel.org
>>>>> Cc: linaro-mm-sig@lists.linaro.org
>>>>> Cc: Emma Anholt <emma@anholt.net>
>>>>> ---
>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
>>>>>     drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
>>>>>     drivers/gpu/drm/lima/lima_sched.c        |  2 ++
>>>>>     drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
>>>>>     drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
>>>>>     drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
>>>>>     drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
>>>>>     drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
>>>>>     include/drm/gpu_scheduler.h              |  7 +++-
>>>>>     10 files changed, 74 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>> index c5386d13eb4a..a4ec092af9a7 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>>>>>         if (r)
>>>>>                 goto error_unlock;
>>>>>
>>>>> +     drm_sched_job_arm(&job->base);
>>>>> +
>>>>>         /* No memory allocation is allowed while holding the notifier lock.
>>>>>          * The lock is held until amdgpu_cs_submit is finished and fence is
>>>>>          * added to BOs.
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>> index d33e6d97cc89..5ddb955d2315 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
>>>>>         if (r)
>>>>>                 return r;
>>>>>
>>>>> +     drm_sched_job_arm(&job->base);
>>>>> +
>>>>>         *f = dma_fence_get(&job->base.s_fence->finished);
>>>>>         amdgpu_job_free_resources(job);
>>>>>         drm_sched_entity_push_job(&job->base, entity);
>>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>> index feb6da1b6ceb..05f412204118 100644
>>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
>>>>>         if (ret)
>>>>>                 goto out_unlock;
>>>>>
>>>>> +     drm_sched_job_arm(&submit->sched_job);
>>>>> +
>>>>>         submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
>>>>>         submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
>>>>>                                                 submit->out_fence, 0,
>>>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>>>> index dba8329937a3..38f755580507 100644
>>>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>>>> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
>>>>>                 return err;
>>>>>         }
>>>>>
>>>>> +     drm_sched_job_arm(&task->base);
>>>>> +
>>>>>         task->num_bos = num_bos;
>>>>>         task->vm = lima_vm_get(vm);
>>>>>
>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>> index 71a72fb50e6b..2992dc85325f 100644
>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>> @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
>>>>>                 goto unlock;
>>>>>         }
>>>>>
>>>>> +     drm_sched_job_arm(&job->base);
>>>>> +
>>>>>         job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>>>>>
>>>>>         ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>>>>> index 79554aa4dbb1..f7347c284886 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>>>> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>>>>>      * @sched_job: job to submit
>>>>>      * @entity: scheduler entity
>>>>>      *
>>>>> - * Note: To guarantee that the order of insertion to queue matches
>>>>> - * the job's fence sequence number this function should be
>>>>> - * called with drm_sched_job_init under common lock.
>>>>> + * Note: To guarantee that the order of insertion to queue matches the job's
>>>>> + * fence sequence number this function should be called with drm_sched_job_arm()
>>>>> + * under common lock.
>>>>>      *
>>>>>      * Returns 0 for success, negative error code otherwise.
>>>>>      */
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
>>>>> index 69de2c76731f..c451ee9a30d7 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
>>>>> @@ -90,7 +90,7 @@ static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
>>>>>      *
>>>>>      * Free up the fence memory after the RCU grace period.
>>>>>      */
>>>>> -static void drm_sched_fence_free(struct rcu_head *rcu)
>>>>> +void drm_sched_fence_free(struct rcu_head *rcu)
>>>>>     {
>>>>>         struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
>>>>>         struct drm_sched_fence *fence = to_drm_sched_fence(f);
>>>>> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>>>>>     }
>>>>>     EXPORT_SYMBOL(to_drm_sched_fence);
>>>>>
>>>>> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>>>>> -                                            void *owner)
>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
>>>>> +                                           void *owner)
>>>>>     {
>>>>>         struct drm_sched_fence *fence = NULL;
>>>>> -     unsigned seq;
>>>>>
>>>>>         fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
>>>>>         if (fence == NULL)
>>>>> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>>>>>         fence->sched = entity->rq->sched;
>>>>>         spin_lock_init(&fence->lock);
>>>>>
>>>>> +     return fence;
>>>>> +}
>>>>> +
>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
>>>>> +                       struct drm_sched_entity *entity)
>>>>> +{
>>>>> +     unsigned seq;
>>>>> +
>>>>>         seq = atomic_inc_return(&entity->fence_seq);
>>>>>         dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
>>>>>                        &fence->lock, entity->fence_context, seq);
>>>>>         dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
>>>>>                        &fence->lock, entity->fence_context + 1, seq);
>>>>> -
>>>>> -     return fence;
>>>>>     }
>>>>>
>>>>>     module_init(drm_sched_fence_slab_init);
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> index 33c414d55fab..5e84e1500c32 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -48,9 +48,11 @@
>>>>>     #include <linux/wait.h>
>>>>>     #include <linux/sched.h>
>>>>>     #include <linux/completion.h>
>>>>> +#include <linux/dma-resv.h>
>>>>>     #include <uapi/linux/sched/types.h>
>>>>>
>>>>>     #include <drm/drm_print.h>
>>>>> +#include <drm/drm_gem.h>
>>>>>     #include <drm/gpu_scheduler.h>
>>>>>     #include <drm/spsc_queue.h>
>>>>>
>>>>> @@ -569,7 +571,6 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
>>>>>
>>>>>     /**
>>>>>      * drm_sched_job_init - init a scheduler job
>>>>> - *
>>>>>      * @job: scheduler job to init
>>>>>      * @entity: scheduler entity to use
>>>>>      * @owner: job owner for debugging
>>>>> @@ -577,6 +578,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
>>>>>      * Refer to drm_sched_entity_push_job() documentation
>>>>>      * for locking considerations.
>>>>>      *
>>>>> + * Drivers must make sure drm_sched_job_cleanup() if this function returns
>>>>> + * successfully, even when @job is aborted before drm_sched_job_arm() is called.
>>>>> + *
>>>>>      * Returns 0 for success, negative error code otherwise.
>>>>>      */
>>>>>     int drm_sched_job_init(struct drm_sched_job *job,
>>>>> @@ -594,7 +598,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>>>         job->sched = sched;
>>>>>         job->entity = entity;
>>>>>         job->s_priority = entity->rq - sched->sched_rq;
>>>>> -     job->s_fence = drm_sched_fence_create(entity, owner);
>>>>> +     job->s_fence = drm_sched_fence_alloc(entity, owner);
>>>>>         if (!job->s_fence)
>>>>>                 return -ENOMEM;
>>>>>         job->id = atomic64_inc_return(&sched->job_id_count);
>>>>> @@ -606,13 +610,47 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>>>     EXPORT_SYMBOL(drm_sched_job_init);
>>>>>
>>>>>     /**
>>>>> - * drm_sched_job_cleanup - clean up scheduler job resources
>>>>> + * drm_sched_job_arm - arm a scheduler job for execution
>>>>> + * @job: scheduler job to arm
>>>>> + *
>>>>> + * This arms a scheduler job for execution. Specifically it initializes the
>>>>> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
>>>>> + * or other places that need to track the completion of this job.
>>>>> + *
>>>>> + * Refer to drm_sched_entity_push_job() documentation for locking
>>>>> + * considerations.
>>>>>      *
>>>>> + * This can only be called if drm_sched_job_init() succeeded.
>>>>> + */
>>>>> +void drm_sched_job_arm(struct drm_sched_job *job)
>>>>> +{
>>>>> +     drm_sched_fence_init(job->s_fence, job->entity);
>>>>> +}
>>>>> +EXPORT_SYMBOL(drm_sched_job_arm);
>>>>> +
>>>>> +/**
>>>>> + * drm_sched_job_cleanup - clean up scheduler job resources
>>>>>      * @job: scheduler job to clean up
>>>>> + *
>>>>> + * Cleans up the resources allocated with drm_sched_job_init().
>>>>> + *
>>>>> + * Drivers should call this from their error unwind code if @job is aborted
>>>>> + * before drm_sched_job_arm() is called.
>>>>> + *
>>>>> + * After that point of no return @job is committed to be executed by the
>>>>> + * scheduler, and this function should be called from the
>>>>> + * &drm_sched_backend_ops.free_job callback.
>>>>>      */
>>>>>     void drm_sched_job_cleanup(struct drm_sched_job *job)
>>>>>     {
>>>>> -     dma_fence_put(&job->s_fence->finished);
>>>>> +     if (!kref_read(&job->s_fence->finished.refcount)) {
>>>>> +             /* drm_sched_job_arm() has been called */
>>>>> +             dma_fence_put(&job->s_fence->finished);
>>>>> +     } else {
>>>>> +             /* aborted job before committing to run it */
>>>>> +             drm_sched_fence_free(&job->s_fence->finished.rcu);
>>>>> +     }
>>>>> +
>>>>>         job->s_fence = NULL;
>>>>>     }
>>>>>     EXPORT_SYMBOL(drm_sched_job_cleanup);
>>>>> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
>>>>> index 4eb354226972..5c3a99027ecd 100644
>>>>> --- a/drivers/gpu/drm/v3d/v3d_gem.c
>>>>> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
>>>>> @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
>>>>>         if (ret)
>>>>>                 return ret;
>>>>>
>>>>> +     drm_sched_job_arm(&job->base);
>>>>> +
>>>>>         job->done_fence = dma_fence_get(&job->base.s_fence->finished);
>>>>>
>>>>>         /* put by scheduler job completion */
>>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>>>> index 88ae7f331bb1..83afc3aa8e2f 100644
>>>>> --- a/include/drm/gpu_scheduler.h
>>>>> +++ b/include/drm/gpu_scheduler.h
>>>>> @@ -348,6 +348,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
>>>>>     int drm_sched_job_init(struct drm_sched_job *job,
>>>>>                        struct drm_sched_entity *entity,
>>>>>                        void *owner);
>>>>> +void drm_sched_job_arm(struct drm_sched_job *job);
>>>>>     void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>>>>>                                     struct drm_gpu_scheduler **sched_list,
>>>>>                                        unsigned int num_sched_list);
>>>>> @@ -387,8 +388,12 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>>>>>                                    enum drm_sched_priority priority);
>>>>>     bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>>>>>
>>>>> -struct drm_sched_fence *drm_sched_fence_create(
>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(
>>>>>         struct drm_sched_entity *s_entity, void *owner);
>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
>>>>> +                       struct drm_sched_entity *entity);
>>>>> +void drm_sched_fence_free(struct rcu_head *rcu);
>>>>> +
>>>>>     void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
>>>>>     void drm_sched_fence_finished(struct drm_sched_fence *fence);
>>>>>
>
Daniel Vetter July 7, 2021, 4:32 p.m. UTC | #6
On Wed, Jul 7, 2021 at 2:58 PM Christian König <christian.koenig@amd.com> wrote:
> Am 07.07.21 um 14:13 schrieb Daniel Vetter:
> > On Wed, Jul 7, 2021 at 1:57 PM Christian König <christian.koenig@amd.com> wrote:
> >> Am 07.07.21 um 13:14 schrieb Daniel Vetter:
> >>> On Wed, Jul 7, 2021 at 11:30 AM Christian König
> >>> <christian.koenig@amd.com> wrote:
> >>>> Am 02.07.21 um 23:38 schrieb Daniel Vetter:
> >>>>> This is a very confusingly named function, because not just does it
> >>>>> init an object, it arms it and provides a point of no return for
> >>>>> pushing a job into the scheduler. It would be nice if that's a bit
> >>>>> clearer in the interface.
> >>>>>
> >>>>> But the real reason is that I want to push the dependency tracking
> >>>>> helpers into the scheduler code, and that means drm_sched_job_init
> >>>>> must be called a lot earlier, without arming the job.
> >>>>>
> >>>>> v2:
> >>>>> - don't change .gitignore (Steven)
> >>>>> - don't forget v3d (Emma)
> >>>>>
> >>>>> v3: Emma noticed that I leak the memory allocated in
> >>>>> drm_sched_job_init if we bail out before the point of no return in
> >>>>> subsequent driver patches. To be able to fix this change
> >>>>> drm_sched_job_cleanup() so it can handle being called both before and
> >>>>> after drm_sched_job_arm().
> >>>> Thinking more about this, I'm not sure if this really works.
> >>>>
> >>>> See drm_sched_job_init() was also calling drm_sched_entity_select_rq()
> >>>> to update the entity->rq association.
> >>>>
> >>>> And that can only be done later on when we arm the fence as well.
> >>> Hm yeah, but that's a bug in the existing code I think: We already
> >>> fail to clean up if we fail to allocate the fences. So I think the
> >>> right thing to do here is to split the checks into job_init, and do
> >>> the actual arming/rq selection in job_arm? I'm not entirely sure
> >>> what's all going on there, the first check looks a bit like trying to
> >>> schedule before the entity is set up, which is a driver bug and should
> >>> have a WARN_ON?
> >> No you misunderstood me, the problem is something else.
> >>
> >> You asked previously why the call to drm_sched_job_init() was so late in
> >> the CS.
> >>
> >> The reason for this was not alone the scheduler fence init, but also the
> >> call to drm_sched_entity_select_rq().
> > Ah ok, I think I can fix that. Needs a prep patch to first make
> > drm_sched_entity_select infallible, then should be easy to do.
> >
> >>> The 2nd check around last_scheduled I have honeslty no idea what it's
> >>> even trying to do.
> >> You mean that here?
> >>
> >>           fence = READ_ONCE(entity->last_scheduled);
> >>           if (fence && !dma_fence_is_signaled(fence))
> >>                   return;
> >>
> >> This makes sure that load balancing is not moving the entity to a
> >> different scheduler while there are still jobs running from this entity
> >> on the hardware,
> > Yeah after a nap that idea crossed my mind too. But now I have locking
> > questions, afaiui the scheduler thread updates this, without taking
> > any locks - entity dequeuing is lockless. And here we read the fence
> > and then seem to yolo check whether it's signalled? What's preventing
> > a use-after-free here? There's no rcu or anything going on here at
> > all, and it's outside of the spinlock section, which starts a bit
> > further down.
>
> The last_scheduled fence of an entity can only change when there are
> jobs on the entities queued, and we have just ruled that out in the
> check before.

There aren't any barriers, so the cpu could easily run the two checks
the other way round. I'll ponder this and figure out where exactly we
need docs for the constraint and/or barriers to make this work as
intended. As-is I'm not seeing how it does ...
-Daniel

> Christian.
>
>
> > -Daniel
> >
> >> Regards
> >> Christian.
> >>
> >>> -Daniel
> >>>
> >>>> Christian.
> >>>>
> >>>>> Also improve the kerneldoc for this.
> >>>>>
> >>>>> Acked-by: Steven Price <steven.price@arm.com> (v2)
> >>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>>>> Cc: Lucas Stach <l.stach@pengutronix.de>
> >>>>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> >>>>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> >>>>> Cc: Qiang Yu <yuq825@gmail.com>
> >>>>> Cc: Rob Herring <robh@kernel.org>
> >>>>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> >>>>> Cc: Steven Price <steven.price@arm.com>
> >>>>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> >>>>> Cc: David Airlie <airlied@linux.ie>
> >>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
> >>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> >>>>> Cc: "Christian König" <christian.koenig@amd.com>
> >>>>> Cc: Masahiro Yamada <masahiroy@kernel.org>
> >>>>> Cc: Kees Cook <keescook@chromium.org>
> >>>>> Cc: Adam Borowski <kilobyte@angband.pl>
> >>>>> Cc: Nick Terrell <terrelln@fb.com>
> >>>>> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> >>>>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> >>>>> Cc: Sami Tolvanen <samitolvanen@google.com>
> >>>>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> >>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>>>> Cc: Dave Airlie <airlied@redhat.com>
> >>>>> Cc: Nirmoy Das <nirmoy.das@amd.com>
> >>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> >>>>> Cc: Lee Jones <lee.jones@linaro.org>
> >>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
> >>>>> Cc: Chen Li <chenli@uniontech.com>
> >>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>>>> Cc: "Marek Olšák" <marek.olsak@amd.com>
> >>>>> Cc: Dennis Li <Dennis.Li@amd.com>
> >>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> >>>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>>>> Cc: Sonny Jiang <sonny.jiang@amd.com>
> >>>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> >>>>> Cc: Tian Tao <tiantao6@hisilicon.com>
> >>>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> >>>>> Cc: etnaviv@lists.freedesktop.org
> >>>>> Cc: lima@lists.freedesktop.org
> >>>>> Cc: linux-media@vger.kernel.org
> >>>>> Cc: linaro-mm-sig@lists.linaro.org
> >>>>> Cc: Emma Anholt <emma@anholt.net>
> >>>>> ---
> >>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
> >>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
> >>>>>     drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
> >>>>>     drivers/gpu/drm/lima/lima_sched.c        |  2 ++
> >>>>>     drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
> >>>>>     drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
> >>>>>     drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
> >>>>>     drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
> >>>>>     drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
> >>>>>     include/drm/gpu_scheduler.h              |  7 +++-
> >>>>>     10 files changed, 74 insertions(+), 14 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>> index c5386d13eb4a..a4ec092af9a7 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
> >>>>>         if (r)
> >>>>>                 goto error_unlock;
> >>>>>
> >>>>> +     drm_sched_job_arm(&job->base);
> >>>>> +
> >>>>>         /* No memory allocation is allowed while holding the notifier lock.
> >>>>>          * The lock is held until amdgpu_cs_submit is finished and fence is
> >>>>>          * added to BOs.
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>>>> index d33e6d97cc89..5ddb955d2315 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>>>> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
> >>>>>         if (r)
> >>>>>                 return r;
> >>>>>
> >>>>> +     drm_sched_job_arm(&job->base);
> >>>>> +
> >>>>>         *f = dma_fence_get(&job->base.s_fence->finished);
> >>>>>         amdgpu_job_free_resources(job);
> >>>>>         drm_sched_entity_push_job(&job->base, entity);
> >>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>>>> index feb6da1b6ceb..05f412204118 100644
> >>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>>>> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
> >>>>>         if (ret)
> >>>>>                 goto out_unlock;
> >>>>>
> >>>>> +     drm_sched_job_arm(&submit->sched_job);
> >>>>> +
> >>>>>         submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
> >>>>>         submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
> >>>>>                                                 submit->out_fence, 0,
> >>>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> >>>>> index dba8329937a3..38f755580507 100644
> >>>>> --- a/drivers/gpu/drm/lima/lima_sched.c
> >>>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
> >>>>> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
> >>>>>                 return err;
> >>>>>         }
> >>>>>
> >>>>> +     drm_sched_job_arm(&task->base);
> >>>>> +
> >>>>>         task->num_bos = num_bos;
> >>>>>         task->vm = lima_vm_get(vm);
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> >>>>> index 71a72fb50e6b..2992dc85325f 100644
> >>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> >>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> >>>>> @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
> >>>>>                 goto unlock;
> >>>>>         }
> >>>>>
> >>>>> +     drm_sched_job_arm(&job->base);
> >>>>> +
> >>>>>         job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
> >>>>>
> >>>>>         ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> >>>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> >>>>> index 79554aa4dbb1..f7347c284886 100644
> >>>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> >>>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> >>>>> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> >>>>>      * @sched_job: job to submit
> >>>>>      * @entity: scheduler entity
> >>>>>      *
> >>>>> - * Note: To guarantee that the order of insertion to queue matches
> >>>>> - * the job's fence sequence number this function should be
> >>>>> - * called with drm_sched_job_init under common lock.
> >>>>> + * Note: To guarantee that the order of insertion to queue matches the job's
> >>>>> + * fence sequence number this function should be called with drm_sched_job_arm()
> >>>>> + * under common lock.
> >>>>>      *
> >>>>>      * Returns 0 for success, negative error code otherwise.
> >>>>>      */
> >>>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> >>>>> index 69de2c76731f..c451ee9a30d7 100644
> >>>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> >>>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> >>>>> @@ -90,7 +90,7 @@ static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
> >>>>>      *
> >>>>>      * Free up the fence memory after the RCU grace period.
> >>>>>      */
> >>>>> -static void drm_sched_fence_free(struct rcu_head *rcu)
> >>>>> +void drm_sched_fence_free(struct rcu_head *rcu)
> >>>>>     {
> >>>>>         struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
> >>>>>         struct drm_sched_fence *fence = to_drm_sched_fence(f);
> >>>>> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> >>>>>     }
> >>>>>     EXPORT_SYMBOL(to_drm_sched_fence);
> >>>>>
> >>>>> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> >>>>> -                                            void *owner)
> >>>>> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> >>>>> +                                           void *owner)
> >>>>>     {
> >>>>>         struct drm_sched_fence *fence = NULL;
> >>>>> -     unsigned seq;
> >>>>>
> >>>>>         fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
> >>>>>         if (fence == NULL)
> >>>>> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> >>>>>         fence->sched = entity->rq->sched;
> >>>>>         spin_lock_init(&fence->lock);
> >>>>>
> >>>>> +     return fence;
> >>>>> +}
> >>>>> +
> >>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> >>>>> +                       struct drm_sched_entity *entity)
> >>>>> +{
> >>>>> +     unsigned seq;
> >>>>> +
> >>>>>         seq = atomic_inc_return(&entity->fence_seq);
> >>>>>         dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
> >>>>>                        &fence->lock, entity->fence_context, seq);
> >>>>>         dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
> >>>>>                        &fence->lock, entity->fence_context + 1, seq);
> >>>>> -
> >>>>> -     return fence;
> >>>>>     }
> >>>>>
> >>>>>     module_init(drm_sched_fence_slab_init);
> >>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> index 33c414d55fab..5e84e1500c32 100644
> >>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> @@ -48,9 +48,11 @@
> >>>>>     #include <linux/wait.h>
> >>>>>     #include <linux/sched.h>
> >>>>>     #include <linux/completion.h>
> >>>>> +#include <linux/dma-resv.h>
> >>>>>     #include <uapi/linux/sched/types.h>
> >>>>>
> >>>>>     #include <drm/drm_print.h>
> >>>>> +#include <drm/drm_gem.h>
> >>>>>     #include <drm/gpu_scheduler.h>
> >>>>>     #include <drm/spsc_queue.h>
> >>>>>
> >>>>> @@ -569,7 +571,6 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
> >>>>>
> >>>>>     /**
> >>>>>      * drm_sched_job_init - init a scheduler job
> >>>>> - *
> >>>>>      * @job: scheduler job to init
> >>>>>      * @entity: scheduler entity to use
> >>>>>      * @owner: job owner for debugging
> >>>>> @@ -577,6 +578,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
> >>>>>      * Refer to drm_sched_entity_push_job() documentation
> >>>>>      * for locking considerations.
> >>>>>      *
> >>>>> + * Drivers must make sure drm_sched_job_cleanup() if this function returns
> >>>>> + * successfully, even when @job is aborted before drm_sched_job_arm() is called.
> >>>>> + *
> >>>>>      * Returns 0 for success, negative error code otherwise.
> >>>>>      */
> >>>>>     int drm_sched_job_init(struct drm_sched_job *job,
> >>>>> @@ -594,7 +598,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >>>>>         job->sched = sched;
> >>>>>         job->entity = entity;
> >>>>>         job->s_priority = entity->rq - sched->sched_rq;
> >>>>> -     job->s_fence = drm_sched_fence_create(entity, owner);
> >>>>> +     job->s_fence = drm_sched_fence_alloc(entity, owner);
> >>>>>         if (!job->s_fence)
> >>>>>                 return -ENOMEM;
> >>>>>         job->id = atomic64_inc_return(&sched->job_id_count);
> >>>>> @@ -606,13 +610,47 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >>>>>     EXPORT_SYMBOL(drm_sched_job_init);
> >>>>>
> >>>>>     /**
> >>>>> - * drm_sched_job_cleanup - clean up scheduler job resources
> >>>>> + * drm_sched_job_arm - arm a scheduler job for execution
> >>>>> + * @job: scheduler job to arm
> >>>>> + *
> >>>>> + * This arms a scheduler job for execution. Specifically it initializes the
> >>>>> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
> >>>>> + * or other places that need to track the completion of this job.
> >>>>> + *
> >>>>> + * Refer to drm_sched_entity_push_job() documentation for locking
> >>>>> + * considerations.
> >>>>>      *
> >>>>> + * This can only be called if drm_sched_job_init() succeeded.
> >>>>> + */
> >>>>> +void drm_sched_job_arm(struct drm_sched_job *job)
> >>>>> +{
> >>>>> +     drm_sched_fence_init(job->s_fence, job->entity);
> >>>>> +}
> >>>>> +EXPORT_SYMBOL(drm_sched_job_arm);
> >>>>> +
> >>>>> +/**
> >>>>> + * drm_sched_job_cleanup - clean up scheduler job resources
> >>>>>      * @job: scheduler job to clean up
> >>>>> + *
> >>>>> + * Cleans up the resources allocated with drm_sched_job_init().
> >>>>> + *
> >>>>> + * Drivers should call this from their error unwind code if @job is aborted
> >>>>> + * before drm_sched_job_arm() is called.
> >>>>> + *
> >>>>> + * After that point of no return @job is committed to be executed by the
> >>>>> + * scheduler, and this function should be called from the
> >>>>> + * &drm_sched_backend_ops.free_job callback.
> >>>>>      */
> >>>>>     void drm_sched_job_cleanup(struct drm_sched_job *job)
> >>>>>     {
> >>>>> -     dma_fence_put(&job->s_fence->finished);
> >>>>> +     if (!kref_read(&job->s_fence->finished.refcount)) {
> >>>>> +             /* drm_sched_job_arm() has been called */
> >>>>> +             dma_fence_put(&job->s_fence->finished);
> >>>>> +     } else {
> >>>>> +             /* aborted job before committing to run it */
> >>>>> +             drm_sched_fence_free(&job->s_fence->finished.rcu);
> >>>>> +     }
> >>>>> +
> >>>>>         job->s_fence = NULL;
> >>>>>     }
> >>>>>     EXPORT_SYMBOL(drm_sched_job_cleanup);
> >>>>> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> >>>>> index 4eb354226972..5c3a99027ecd 100644
> >>>>> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> >>>>> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> >>>>> @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
> >>>>>         if (ret)
> >>>>>                 return ret;
> >>>>>
> >>>>> +     drm_sched_job_arm(&job->base);
> >>>>> +
> >>>>>         job->done_fence = dma_fence_get(&job->base.s_fence->finished);
> >>>>>
> >>>>>         /* put by scheduler job completion */
> >>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> >>>>> index 88ae7f331bb1..83afc3aa8e2f 100644
> >>>>> --- a/include/drm/gpu_scheduler.h
> >>>>> +++ b/include/drm/gpu_scheduler.h
> >>>>> @@ -348,6 +348,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
> >>>>>     int drm_sched_job_init(struct drm_sched_job *job,
> >>>>>                        struct drm_sched_entity *entity,
> >>>>>                        void *owner);
> >>>>> +void drm_sched_job_arm(struct drm_sched_job *job);
> >>>>>     void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> >>>>>                                     struct drm_gpu_scheduler **sched_list,
> >>>>>                                        unsigned int num_sched_list);
> >>>>> @@ -387,8 +388,12 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> >>>>>                                    enum drm_sched_priority priority);
> >>>>>     bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> >>>>>
> >>>>> -struct drm_sched_fence *drm_sched_fence_create(
> >>>>> +struct drm_sched_fence *drm_sched_fence_alloc(
> >>>>>         struct drm_sched_entity *s_entity, void *owner);
> >>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> >>>>> +                       struct drm_sched_entity *entity);
> >>>>> +void drm_sched_fence_free(struct rcu_head *rcu);
> >>>>> +
> >>>>>     void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
> >>>>>     void drm_sched_fence_finished(struct drm_sched_fence *fence);
> >>>>>
> >
>
Christian König July 8, 2021, 6:56 a.m. UTC | #7
Am 07.07.21 um 18:32 schrieb Daniel Vetter:
> On Wed, Jul 7, 2021 at 2:58 PM Christian König <christian.koenig@amd.com> wrote:
>> Am 07.07.21 um 14:13 schrieb Daniel Vetter:
>>> On Wed, Jul 7, 2021 at 1:57 PM Christian König <christian.koenig@amd.com> wrote:
>>>> Am 07.07.21 um 13:14 schrieb Daniel Vetter:
>>>>> On Wed, Jul 7, 2021 at 11:30 AM Christian König
>>>>> <christian.koenig@amd.com> wrote:
>>>>>> Am 02.07.21 um 23:38 schrieb Daniel Vetter:
>>>>>>> This is a very confusingly named function, because not just does it
>>>>>>> init an object, it arms it and provides a point of no return for
>>>>>>> pushing a job into the scheduler. It would be nice if that's a bit
>>>>>>> clearer in the interface.
>>>>>>>
>>>>>>> But the real reason is that I want to push the dependency tracking
>>>>>>> helpers into the scheduler code, and that means drm_sched_job_init
>>>>>>> must be called a lot earlier, without arming the job.
>>>>>>>
>>>>>>> v2:
>>>>>>> - don't change .gitignore (Steven)
>>>>>>> - don't forget v3d (Emma)
>>>>>>>
>>>>>>> v3: Emma noticed that I leak the memory allocated in
>>>>>>> drm_sched_job_init if we bail out before the point of no return in
>>>>>>> subsequent driver patches. To be able to fix this change
>>>>>>> drm_sched_job_cleanup() so it can handle being called both before and
>>>>>>> after drm_sched_job_arm().
>>>>>> Thinking more about this, I'm not sure if this really works.
>>>>>>
>>>>>> See drm_sched_job_init() was also calling drm_sched_entity_select_rq()
>>>>>> to update the entity->rq association.
>>>>>>
>>>>>> And that can only be done later on when we arm the fence as well.
>>>>> Hm yeah, but that's a bug in the existing code I think: We already
>>>>> fail to clean up if we fail to allocate the fences. So I think the
>>>>> right thing to do here is to split the checks into job_init, and do
>>>>> the actual arming/rq selection in job_arm? I'm not entirely sure
>>>>> what's all going on there, the first check looks a bit like trying to
>>>>> schedule before the entity is set up, which is a driver bug and should
>>>>> have a WARN_ON?
>>>> No you misunderstood me, the problem is something else.
>>>>
>>>> You asked previously why the call to drm_sched_job_init() was so late in
>>>> the CS.
>>>>
>>>> The reason for this was not alone the scheduler fence init, but also the
>>>> call to drm_sched_entity_select_rq().
>>> Ah ok, I think I can fix that. Needs a prep patch to first make
>>> drm_sched_entity_select infallible, then should be easy to do.
>>>
>>>>> The 2nd check around last_scheduled I have honeslty no idea what it's
>>>>> even trying to do.
>>>> You mean that here?
>>>>
>>>>            fence = READ_ONCE(entity->last_scheduled);
>>>>            if (fence && !dma_fence_is_signaled(fence))
>>>>                    return;
>>>>
>>>> This makes sure that load balancing is not moving the entity to a
>>>> different scheduler while there are still jobs running from this entity
>>>> on the hardware,
>>> Yeah after a nap that idea crossed my mind too. But now I have locking
>>> questions, afaiui the scheduler thread updates this, without taking
>>> any locks - entity dequeuing is lockless. And here we read the fence
>>> and then seem to yolo check whether it's signalled? What's preventing
>>> a use-after-free here? There's no rcu or anything going on here at
>>> all, and it's outside of the spinlock section, which starts a bit
>>> further down.
>> The last_scheduled fence of an entity can only change when there are
>> jobs on the entities queued, and we have just ruled that out in the
>> check before.
> There aren't any barriers, so the cpu could easily run the two checks
> the other way round. I'll ponder this and figure out where exactly we
> need docs for the constraint and/or barriers to make this work as
> intended. As-is I'm not seeing how it does ...

spsc_queue_count() provides the necessary barrier with the atomic_read().

But yes a comment would be really nice here. I had to think for a while 
why we don't need this as well.

Christian.

> -Daniel
>
>> Christian.
>>
>>
>>> -Daniel
>>>
>>>> Regards
>>>> Christian.
>>>>
>>>>> -Daniel
>>>>>
>>>>>> Christian.
>>>>>>
>>>>>>> Also improve the kerneldoc for this.
>>>>>>>
>>>>>>> Acked-by: Steven Price <steven.price@arm.com> (v2)
>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>>>>>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
>>>>>>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
>>>>>>> Cc: Qiang Yu <yuq825@gmail.com>
>>>>>>> Cc: Rob Herring <robh@kernel.org>
>>>>>>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
>>>>>>> Cc: Steven Price <steven.price@arm.com>
>>>>>>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
>>>>>>> Cc: David Airlie <airlied@linux.ie>
>>>>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>>>> Cc: Masahiro Yamada <masahiroy@kernel.org>
>>>>>>> Cc: Kees Cook <keescook@chromium.org>
>>>>>>> Cc: Adam Borowski <kilobyte@angband.pl>
>>>>>>> Cc: Nick Terrell <terrelln@fb.com>
>>>>>>> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
>>>>>>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
>>>>>>> Cc: Sami Tolvanen <samitolvanen@google.com>
>>>>>>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>>>> Cc: Dave Airlie <airlied@redhat.com>
>>>>>>> Cc: Nirmoy Das <nirmoy.das@amd.com>
>>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>>>>>> Cc: Lee Jones <lee.jones@linaro.org>
>>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>>>>>> Cc: Chen Li <chenli@uniontech.com>
>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>>>> Cc: "Marek Olšák" <marek.olsak@amd.com>
>>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>>>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>>>>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>> Cc: Sonny Jiang <sonny.jiang@amd.com>
>>>>>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
>>>>>>> Cc: Tian Tao <tiantao6@hisilicon.com>
>>>>>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
>>>>>>> Cc: etnaviv@lists.freedesktop.org
>>>>>>> Cc: lima@lists.freedesktop.org
>>>>>>> Cc: linux-media@vger.kernel.org
>>>>>>> Cc: linaro-mm-sig@lists.linaro.org
>>>>>>> Cc: Emma Anholt <emma@anholt.net>
>>>>>>> ---
>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
>>>>>>>      drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
>>>>>>>      drivers/gpu/drm/lima/lima_sched.c        |  2 ++
>>>>>>>      drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
>>>>>>>      drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
>>>>>>>      drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
>>>>>>>      drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
>>>>>>>      drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
>>>>>>>      include/drm/gpu_scheduler.h              |  7 +++-
>>>>>>>      10 files changed, 74 insertions(+), 14 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>> index c5386d13eb4a..a4ec092af9a7 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>>>>>>>          if (r)
>>>>>>>                  goto error_unlock;
>>>>>>>
>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>> +
>>>>>>>          /* No memory allocation is allowed while holding the notifier lock.
>>>>>>>           * The lock is held until amdgpu_cs_submit is finished and fence is
>>>>>>>           * added to BOs.
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>> index d33e6d97cc89..5ddb955d2315 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
>>>>>>>          if (r)
>>>>>>>                  return r;
>>>>>>>
>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>> +
>>>>>>>          *f = dma_fence_get(&job->base.s_fence->finished);
>>>>>>>          amdgpu_job_free_resources(job);
>>>>>>>          drm_sched_entity_push_job(&job->base, entity);
>>>>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>> index feb6da1b6ceb..05f412204118 100644
>>>>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
>>>>>>>          if (ret)
>>>>>>>                  goto out_unlock;
>>>>>>>
>>>>>>> +     drm_sched_job_arm(&submit->sched_job);
>>>>>>> +
>>>>>>>          submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
>>>>>>>          submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
>>>>>>>                                                  submit->out_fence, 0,
>>>>>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>>>>>> index dba8329937a3..38f755580507 100644
>>>>>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>>>>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>>>>>> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
>>>>>>>                  return err;
>>>>>>>          }
>>>>>>>
>>>>>>> +     drm_sched_job_arm(&task->base);
>>>>>>> +
>>>>>>>          task->num_bos = num_bos;
>>>>>>>          task->vm = lima_vm_get(vm);
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>> index 71a72fb50e6b..2992dc85325f 100644
>>>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>> @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
>>>>>>>                  goto unlock;
>>>>>>>          }
>>>>>>>
>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>> +
>>>>>>>          job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>>>>>>>
>>>>>>>          ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>>>>>>> index 79554aa4dbb1..f7347c284886 100644
>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>>>>>> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>>>>>>>       * @sched_job: job to submit
>>>>>>>       * @entity: scheduler entity
>>>>>>>       *
>>>>>>> - * Note: To guarantee that the order of insertion to queue matches
>>>>>>> - * the job's fence sequence number this function should be
>>>>>>> - * called with drm_sched_job_init under common lock.
>>>>>>> + * Note: To guarantee that the order of insertion to queue matches the job's
>>>>>>> + * fence sequence number this function should be called with drm_sched_job_arm()
>>>>>>> + * under common lock.
>>>>>>>       *
>>>>>>>       * Returns 0 for success, negative error code otherwise.
>>>>>>>       */
>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
>>>>>>> index 69de2c76731f..c451ee9a30d7 100644
>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
>>>>>>> @@ -90,7 +90,7 @@ static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
>>>>>>>       *
>>>>>>>       * Free up the fence memory after the RCU grace period.
>>>>>>>       */
>>>>>>> -static void drm_sched_fence_free(struct rcu_head *rcu)
>>>>>>> +void drm_sched_fence_free(struct rcu_head *rcu)
>>>>>>>      {
>>>>>>>          struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
>>>>>>>          struct drm_sched_fence *fence = to_drm_sched_fence(f);
>>>>>>> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>>>>>>>      }
>>>>>>>      EXPORT_SYMBOL(to_drm_sched_fence);
>>>>>>>
>>>>>>> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>>>>>>> -                                            void *owner)
>>>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
>>>>>>> +                                           void *owner)
>>>>>>>      {
>>>>>>>          struct drm_sched_fence *fence = NULL;
>>>>>>> -     unsigned seq;
>>>>>>>
>>>>>>>          fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
>>>>>>>          if (fence == NULL)
>>>>>>> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>>>>>>>          fence->sched = entity->rq->sched;
>>>>>>>          spin_lock_init(&fence->lock);
>>>>>>>
>>>>>>> +     return fence;
>>>>>>> +}
>>>>>>> +
>>>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
>>>>>>> +                       struct drm_sched_entity *entity)
>>>>>>> +{
>>>>>>> +     unsigned seq;
>>>>>>> +
>>>>>>>          seq = atomic_inc_return(&entity->fence_seq);
>>>>>>>          dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
>>>>>>>                         &fence->lock, entity->fence_context, seq);
>>>>>>>          dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
>>>>>>>                         &fence->lock, entity->fence_context + 1, seq);
>>>>>>> -
>>>>>>> -     return fence;
>>>>>>>      }
>>>>>>>
>>>>>>>      module_init(drm_sched_fence_slab_init);
>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> index 33c414d55fab..5e84e1500c32 100644
>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> @@ -48,9 +48,11 @@
>>>>>>>      #include <linux/wait.h>
>>>>>>>      #include <linux/sched.h>
>>>>>>>      #include <linux/completion.h>
>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>      #include <uapi/linux/sched/types.h>
>>>>>>>
>>>>>>>      #include <drm/drm_print.h>
>>>>>>> +#include <drm/drm_gem.h>
>>>>>>>      #include <drm/gpu_scheduler.h>
>>>>>>>      #include <drm/spsc_queue.h>
>>>>>>>
>>>>>>> @@ -569,7 +571,6 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
>>>>>>>
>>>>>>>      /**
>>>>>>>       * drm_sched_job_init - init a scheduler job
>>>>>>> - *
>>>>>>>       * @job: scheduler job to init
>>>>>>>       * @entity: scheduler entity to use
>>>>>>>       * @owner: job owner for debugging
>>>>>>> @@ -577,6 +578,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
>>>>>>>       * Refer to drm_sched_entity_push_job() documentation
>>>>>>>       * for locking considerations.
>>>>>>>       *
>>>>>>> + * Drivers must make sure drm_sched_job_cleanup() if this function returns
>>>>>>> + * successfully, even when @job is aborted before drm_sched_job_arm() is called.
>>>>>>> + *
>>>>>>>       * Returns 0 for success, negative error code otherwise.
>>>>>>>       */
>>>>>>>      int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>> @@ -594,7 +598,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>          job->sched = sched;
>>>>>>>          job->entity = entity;
>>>>>>>          job->s_priority = entity->rq - sched->sched_rq;
>>>>>>> -     job->s_fence = drm_sched_fence_create(entity, owner);
>>>>>>> +     job->s_fence = drm_sched_fence_alloc(entity, owner);
>>>>>>>          if (!job->s_fence)
>>>>>>>                  return -ENOMEM;
>>>>>>>          job->id = atomic64_inc_return(&sched->job_id_count);
>>>>>>> @@ -606,13 +610,47 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>      EXPORT_SYMBOL(drm_sched_job_init);
>>>>>>>
>>>>>>>      /**
>>>>>>> - * drm_sched_job_cleanup - clean up scheduler job resources
>>>>>>> + * drm_sched_job_arm - arm a scheduler job for execution
>>>>>>> + * @job: scheduler job to arm
>>>>>>> + *
>>>>>>> + * This arms a scheduler job for execution. Specifically it initializes the
>>>>>>> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
>>>>>>> + * or other places that need to track the completion of this job.
>>>>>>> + *
>>>>>>> + * Refer to drm_sched_entity_push_job() documentation for locking
>>>>>>> + * considerations.
>>>>>>>       *
>>>>>>> + * This can only be called if drm_sched_job_init() succeeded.
>>>>>>> + */
>>>>>>> +void drm_sched_job_arm(struct drm_sched_job *job)
>>>>>>> +{
>>>>>>> +     drm_sched_fence_init(job->s_fence, job->entity);
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL(drm_sched_job_arm);
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_sched_job_cleanup - clean up scheduler job resources
>>>>>>>       * @job: scheduler job to clean up
>>>>>>> + *
>>>>>>> + * Cleans up the resources allocated with drm_sched_job_init().
>>>>>>> + *
>>>>>>> + * Drivers should call this from their error unwind code if @job is aborted
>>>>>>> + * before drm_sched_job_arm() is called.
>>>>>>> + *
>>>>>>> + * After that point of no return @job is committed to be executed by the
>>>>>>> + * scheduler, and this function should be called from the
>>>>>>> + * &drm_sched_backend_ops.free_job callback.
>>>>>>>       */
>>>>>>>      void drm_sched_job_cleanup(struct drm_sched_job *job)
>>>>>>>      {
>>>>>>> -     dma_fence_put(&job->s_fence->finished);
>>>>>>> +     if (!kref_read(&job->s_fence->finished.refcount)) {
>>>>>>> +             /* drm_sched_job_arm() has been called */
>>>>>>> +             dma_fence_put(&job->s_fence->finished);
>>>>>>> +     } else {
>>>>>>> +             /* aborted job before committing to run it */
>>>>>>> +             drm_sched_fence_free(&job->s_fence->finished.rcu);
>>>>>>> +     }
>>>>>>> +
>>>>>>>          job->s_fence = NULL;
>>>>>>>      }
>>>>>>>      EXPORT_SYMBOL(drm_sched_job_cleanup);
>>>>>>> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
>>>>>>> index 4eb354226972..5c3a99027ecd 100644
>>>>>>> --- a/drivers/gpu/drm/v3d/v3d_gem.c
>>>>>>> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
>>>>>>> @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
>>>>>>>          if (ret)
>>>>>>>                  return ret;
>>>>>>>
>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>> +
>>>>>>>          job->done_fence = dma_fence_get(&job->base.s_fence->finished);
>>>>>>>
>>>>>>>          /* put by scheduler job completion */
>>>>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>>>>>> index 88ae7f331bb1..83afc3aa8e2f 100644
>>>>>>> --- a/include/drm/gpu_scheduler.h
>>>>>>> +++ b/include/drm/gpu_scheduler.h
>>>>>>> @@ -348,6 +348,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
>>>>>>>      int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>                         struct drm_sched_entity *entity,
>>>>>>>                         void *owner);
>>>>>>> +void drm_sched_job_arm(struct drm_sched_job *job);
>>>>>>>      void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>>>>>>>                                      struct drm_gpu_scheduler **sched_list,
>>>>>>>                                         unsigned int num_sched_list);
>>>>>>> @@ -387,8 +388,12 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>>>>>>>                                     enum drm_sched_priority priority);
>>>>>>>      bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>>>>>>>
>>>>>>> -struct drm_sched_fence *drm_sched_fence_create(
>>>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(
>>>>>>>          struct drm_sched_entity *s_entity, void *owner);
>>>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
>>>>>>> +                       struct drm_sched_entity *entity);
>>>>>>> +void drm_sched_fence_free(struct rcu_head *rcu);
>>>>>>> +
>>>>>>>      void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
>>>>>>>      void drm_sched_fence_finished(struct drm_sched_fence *fence);
>>>>>>>
>
Daniel Vetter July 8, 2021, 7:09 a.m. UTC | #8
On Thu, Jul 8, 2021 at 8:56 AM Christian König <christian.koenig@amd.com> wrote:
>
> Am 07.07.21 um 18:32 schrieb Daniel Vetter:
> > On Wed, Jul 7, 2021 at 2:58 PM Christian König <christian.koenig@amd.com> wrote:
> >> Am 07.07.21 um 14:13 schrieb Daniel Vetter:
> >>> On Wed, Jul 7, 2021 at 1:57 PM Christian König <christian.koenig@amd.com> wrote:
> >>>> Am 07.07.21 um 13:14 schrieb Daniel Vetter:
> >>>>> On Wed, Jul 7, 2021 at 11:30 AM Christian König
> >>>>> <christian.koenig@amd.com> wrote:
> >>>>>> Am 02.07.21 um 23:38 schrieb Daniel Vetter:
> >>>>>>> This is a very confusingly named function, because not just does it
> >>>>>>> init an object, it arms it and provides a point of no return for
> >>>>>>> pushing a job into the scheduler. It would be nice if that's a bit
> >>>>>>> clearer in the interface.
> >>>>>>>
> >>>>>>> But the real reason is that I want to push the dependency tracking
> >>>>>>> helpers into the scheduler code, and that means drm_sched_job_init
> >>>>>>> must be called a lot earlier, without arming the job.
> >>>>>>>
> >>>>>>> v2:
> >>>>>>> - don't change .gitignore (Steven)
> >>>>>>> - don't forget v3d (Emma)
> >>>>>>>
> >>>>>>> v3: Emma noticed that I leak the memory allocated in
> >>>>>>> drm_sched_job_init if we bail out before the point of no return in
> >>>>>>> subsequent driver patches. To be able to fix this change
> >>>>>>> drm_sched_job_cleanup() so it can handle being called both before and
> >>>>>>> after drm_sched_job_arm().
> >>>>>> Thinking more about this, I'm not sure if this really works.
> >>>>>>
> >>>>>> See drm_sched_job_init() was also calling drm_sched_entity_select_rq()
> >>>>>> to update the entity->rq association.
> >>>>>>
> >>>>>> And that can only be done later on when we arm the fence as well.
> >>>>> Hm yeah, but that's a bug in the existing code I think: We already
> >>>>> fail to clean up if we fail to allocate the fences. So I think the
> >>>>> right thing to do here is to split the checks into job_init, and do
> >>>>> the actual arming/rq selection in job_arm? I'm not entirely sure
> >>>>> what's all going on there, the first check looks a bit like trying to
> >>>>> schedule before the entity is set up, which is a driver bug and should
> >>>>> have a WARN_ON?
> >>>> No you misunderstood me, the problem is something else.
> >>>>
> >>>> You asked previously why the call to drm_sched_job_init() was so late in
> >>>> the CS.
> >>>>
> >>>> The reason for this was not alone the scheduler fence init, but also the
> >>>> call to drm_sched_entity_select_rq().
> >>> Ah ok, I think I can fix that. Needs a prep patch to first make
> >>> drm_sched_entity_select infallible, then should be easy to do.
> >>>
> >>>>> The 2nd check around last_scheduled I have honeslty no idea what it's
> >>>>> even trying to do.
> >>>> You mean that here?
> >>>>
> >>>>            fence = READ_ONCE(entity->last_scheduled);
> >>>>            if (fence && !dma_fence_is_signaled(fence))
> >>>>                    return;
> >>>>
> >>>> This makes sure that load balancing is not moving the entity to a
> >>>> different scheduler while there are still jobs running from this entity
> >>>> on the hardware,
> >>> Yeah after a nap that idea crossed my mind too. But now I have locking
> >>> questions, afaiui the scheduler thread updates this, without taking
> >>> any locks - entity dequeuing is lockless. And here we read the fence
> >>> and then seem to yolo check whether it's signalled? What's preventing
> >>> a use-after-free here? There's no rcu or anything going on here at
> >>> all, and it's outside of the spinlock section, which starts a bit
> >>> further down.
> >> The last_scheduled fence of an entity can only change when there are
> >> jobs on the entities queued, and we have just ruled that out in the
> >> check before.
> > There aren't any barriers, so the cpu could easily run the two checks
> > the other way round. I'll ponder this and figure out where exactly we
> > need docs for the constraint and/or barriers to make this work as
> > intended. As-is I'm not seeing how it does ...
>
> spsc_queue_count() provides the necessary barrier with the atomic_read().

atomic_t is fully unordered, except when it's a read-modify-write
atomic op, then it's a full barrier. So yeah you need more here. But
also since you only need a read barrier on one side, and a write
barrier on the other, you don't actually need a cpu barriers on x86.
And READ_ONCE gives you the compiler barrier on one side at least, I
haven't found it on the writer side yet.

> But yes a comment would be really nice here. I had to think for a while
> why we don't need this as well.

I'm typing a patch, which after a night's sleep I realized has the
wrong barriers. And now I'm also typing some doc improvements for
drm_sched_entity and related functions.

>
> Christian.
>
> > -Daniel
> >
> >> Christian.
> >>
> >>
> >>> -Daniel
> >>>
> >>>> Regards
> >>>> Christian.
> >>>>
> >>>>> -Daniel
> >>>>>
> >>>>>> Christian.
> >>>>>>
> >>>>>>> Also improve the kerneldoc for this.
> >>>>>>>
> >>>>>>> Acked-by: Steven Price <steven.price@arm.com> (v2)
> >>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>>>>>> Cc: Lucas Stach <l.stach@pengutronix.de>
> >>>>>>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> >>>>>>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> >>>>>>> Cc: Qiang Yu <yuq825@gmail.com>
> >>>>>>> Cc: Rob Herring <robh@kernel.org>
> >>>>>>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> >>>>>>> Cc: Steven Price <steven.price@arm.com>
> >>>>>>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> >>>>>>> Cc: David Airlie <airlied@linux.ie>
> >>>>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
> >>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> >>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
> >>>>>>> Cc: Masahiro Yamada <masahiroy@kernel.org>
> >>>>>>> Cc: Kees Cook <keescook@chromium.org>
> >>>>>>> Cc: Adam Borowski <kilobyte@angband.pl>
> >>>>>>> Cc: Nick Terrell <terrelln@fb.com>
> >>>>>>> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> >>>>>>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> >>>>>>> Cc: Sami Tolvanen <samitolvanen@google.com>
> >>>>>>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> >>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>>>>>> Cc: Dave Airlie <airlied@redhat.com>
> >>>>>>> Cc: Nirmoy Das <nirmoy.das@amd.com>
> >>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> >>>>>>> Cc: Lee Jones <lee.jones@linaro.org>
> >>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
> >>>>>>> Cc: Chen Li <chenli@uniontech.com>
> >>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>>>>>> Cc: "Marek Olšák" <marek.olsak@amd.com>
> >>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
> >>>>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> >>>>>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>>>>>> Cc: Sonny Jiang <sonny.jiang@amd.com>
> >>>>>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> >>>>>>> Cc: Tian Tao <tiantao6@hisilicon.com>
> >>>>>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> >>>>>>> Cc: etnaviv@lists.freedesktop.org
> >>>>>>> Cc: lima@lists.freedesktop.org
> >>>>>>> Cc: linux-media@vger.kernel.org
> >>>>>>> Cc: linaro-mm-sig@lists.linaro.org
> >>>>>>> Cc: Emma Anholt <emma@anholt.net>
> >>>>>>> ---
> >>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
> >>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
> >>>>>>>      drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
> >>>>>>>      drivers/gpu/drm/lima/lima_sched.c        |  2 ++
> >>>>>>>      drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
> >>>>>>>      drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
> >>>>>>>      drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
> >>>>>>>      drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
> >>>>>>>      drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
> >>>>>>>      include/drm/gpu_scheduler.h              |  7 +++-
> >>>>>>>      10 files changed, 74 insertions(+), 14 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> index c5386d13eb4a..a4ec092af9a7 100644
> >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
> >>>>>>>          if (r)
> >>>>>>>                  goto error_unlock;
> >>>>>>>
> >>>>>>> +     drm_sched_job_arm(&job->base);
> >>>>>>> +
> >>>>>>>          /* No memory allocation is allowed while holding the notifier lock.
> >>>>>>>           * The lock is held until amdgpu_cs_submit is finished and fence is
> >>>>>>>           * added to BOs.
> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>>>>>> index d33e6d97cc89..5ddb955d2315 100644
> >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>>>>>> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
> >>>>>>>          if (r)
> >>>>>>>                  return r;
> >>>>>>>
> >>>>>>> +     drm_sched_job_arm(&job->base);
> >>>>>>> +
> >>>>>>>          *f = dma_fence_get(&job->base.s_fence->finished);
> >>>>>>>          amdgpu_job_free_resources(job);
> >>>>>>>          drm_sched_entity_push_job(&job->base, entity);
> >>>>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>>>>>> index feb6da1b6ceb..05f412204118 100644
> >>>>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>>>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>>>>>> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
> >>>>>>>          if (ret)
> >>>>>>>                  goto out_unlock;
> >>>>>>>
> >>>>>>> +     drm_sched_job_arm(&submit->sched_job);
> >>>>>>> +
> >>>>>>>          submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
> >>>>>>>          submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
> >>>>>>>                                                  submit->out_fence, 0,
> >>>>>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> >>>>>>> index dba8329937a3..38f755580507 100644
> >>>>>>> --- a/drivers/gpu/drm/lima/lima_sched.c
> >>>>>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
> >>>>>>> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
> >>>>>>>                  return err;
> >>>>>>>          }
> >>>>>>>
> >>>>>>> +     drm_sched_job_arm(&task->base);
> >>>>>>> +
> >>>>>>>          task->num_bos = num_bos;
> >>>>>>>          task->vm = lima_vm_get(vm);
> >>>>>>>
> >>>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> >>>>>>> index 71a72fb50e6b..2992dc85325f 100644
> >>>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> >>>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> >>>>>>> @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
> >>>>>>>                  goto unlock;
> >>>>>>>          }
> >>>>>>>
> >>>>>>> +     drm_sched_job_arm(&job->base);
> >>>>>>> +
> >>>>>>>          job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
> >>>>>>>
> >>>>>>>          ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> >>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> >>>>>>> index 79554aa4dbb1..f7347c284886 100644
> >>>>>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> >>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> >>>>>>> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> >>>>>>>       * @sched_job: job to submit
> >>>>>>>       * @entity: scheduler entity
> >>>>>>>       *
> >>>>>>> - * Note: To guarantee that the order of insertion to queue matches
> >>>>>>> - * the job's fence sequence number this function should be
> >>>>>>> - * called with drm_sched_job_init under common lock.
> >>>>>>> + * Note: To guarantee that the order of insertion to queue matches the job's
> >>>>>>> + * fence sequence number this function should be called with drm_sched_job_arm()
> >>>>>>> + * under common lock.
> >>>>>>>       *
> >>>>>>>       * Returns 0 for success, negative error code otherwise.
> >>>>>>>       */
> >>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> >>>>>>> index 69de2c76731f..c451ee9a30d7 100644
> >>>>>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> >>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> >>>>>>> @@ -90,7 +90,7 @@ static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
> >>>>>>>       *
> >>>>>>>       * Free up the fence memory after the RCU grace period.
> >>>>>>>       */
> >>>>>>> -static void drm_sched_fence_free(struct rcu_head *rcu)
> >>>>>>> +void drm_sched_fence_free(struct rcu_head *rcu)
> >>>>>>>      {
> >>>>>>>          struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
> >>>>>>>          struct drm_sched_fence *fence = to_drm_sched_fence(f);
> >>>>>>> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> >>>>>>>      }
> >>>>>>>      EXPORT_SYMBOL(to_drm_sched_fence);
> >>>>>>>
> >>>>>>> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> >>>>>>> -                                            void *owner)
> >>>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> >>>>>>> +                                           void *owner)
> >>>>>>>      {
> >>>>>>>          struct drm_sched_fence *fence = NULL;
> >>>>>>> -     unsigned seq;
> >>>>>>>
> >>>>>>>          fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
> >>>>>>>          if (fence == NULL)
> >>>>>>> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> >>>>>>>          fence->sched = entity->rq->sched;
> >>>>>>>          spin_lock_init(&fence->lock);
> >>>>>>>
> >>>>>>> +     return fence;
> >>>>>>> +}
> >>>>>>> +
> >>>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> >>>>>>> +                       struct drm_sched_entity *entity)
> >>>>>>> +{
> >>>>>>> +     unsigned seq;
> >>>>>>> +
> >>>>>>>          seq = atomic_inc_return(&entity->fence_seq);
> >>>>>>>          dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
> >>>>>>>                         &fence->lock, entity->fence_context, seq);
> >>>>>>>          dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
> >>>>>>>                         &fence->lock, entity->fence_context + 1, seq);
> >>>>>>> -
> >>>>>>> -     return fence;
> >>>>>>>      }
> >>>>>>>
> >>>>>>>      module_init(drm_sched_fence_slab_init);
> >>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>>>> index 33c414d55fab..5e84e1500c32 100644
> >>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>>>> @@ -48,9 +48,11 @@
> >>>>>>>      #include <linux/wait.h>
> >>>>>>>      #include <linux/sched.h>
> >>>>>>>      #include <linux/completion.h>
> >>>>>>> +#include <linux/dma-resv.h>
> >>>>>>>      #include <uapi/linux/sched/types.h>
> >>>>>>>
> >>>>>>>      #include <drm/drm_print.h>
> >>>>>>> +#include <drm/drm_gem.h>
> >>>>>>>      #include <drm/gpu_scheduler.h>
> >>>>>>>      #include <drm/spsc_queue.h>
> >>>>>>>
> >>>>>>> @@ -569,7 +571,6 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
> >>>>>>>
> >>>>>>>      /**
> >>>>>>>       * drm_sched_job_init - init a scheduler job
> >>>>>>> - *
> >>>>>>>       * @job: scheduler job to init
> >>>>>>>       * @entity: scheduler entity to use
> >>>>>>>       * @owner: job owner for debugging
> >>>>>>> @@ -577,6 +578,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
> >>>>>>>       * Refer to drm_sched_entity_push_job() documentation
> >>>>>>>       * for locking considerations.
> >>>>>>>       *
> >>>>>>> + * Drivers must make sure drm_sched_job_cleanup() if this function returns
> >>>>>>> + * successfully, even when @job is aborted before drm_sched_job_arm() is called.
> >>>>>>> + *
> >>>>>>>       * Returns 0 for success, negative error code otherwise.
> >>>>>>>       */
> >>>>>>>      int drm_sched_job_init(struct drm_sched_job *job,
> >>>>>>> @@ -594,7 +598,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >>>>>>>          job->sched = sched;
> >>>>>>>          job->entity = entity;
> >>>>>>>          job->s_priority = entity->rq - sched->sched_rq;
> >>>>>>> -     job->s_fence = drm_sched_fence_create(entity, owner);
> >>>>>>> +     job->s_fence = drm_sched_fence_alloc(entity, owner);
> >>>>>>>          if (!job->s_fence)
> >>>>>>>                  return -ENOMEM;
> >>>>>>>          job->id = atomic64_inc_return(&sched->job_id_count);
> >>>>>>> @@ -606,13 +610,47 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >>>>>>>      EXPORT_SYMBOL(drm_sched_job_init);
> >>>>>>>
> >>>>>>>      /**
> >>>>>>> - * drm_sched_job_cleanup - clean up scheduler job resources
> >>>>>>> + * drm_sched_job_arm - arm a scheduler job for execution
> >>>>>>> + * @job: scheduler job to arm
> >>>>>>> + *
> >>>>>>> + * This arms a scheduler job for execution. Specifically it initializes the
> >>>>>>> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
> >>>>>>> + * or other places that need to track the completion of this job.
> >>>>>>> + *
> >>>>>>> + * Refer to drm_sched_entity_push_job() documentation for locking
> >>>>>>> + * considerations.
> >>>>>>>       *
> >>>>>>> + * This can only be called if drm_sched_job_init() succeeded.
> >>>>>>> + */
> >>>>>>> +void drm_sched_job_arm(struct drm_sched_job *job)
> >>>>>>> +{
> >>>>>>> +     drm_sched_fence_init(job->s_fence, job->entity);
> >>>>>>> +}
> >>>>>>> +EXPORT_SYMBOL(drm_sched_job_arm);
> >>>>>>> +
> >>>>>>> +/**
> >>>>>>> + * drm_sched_job_cleanup - clean up scheduler job resources
> >>>>>>>       * @job: scheduler job to clean up
> >>>>>>> + *
> >>>>>>> + * Cleans up the resources allocated with drm_sched_job_init().
> >>>>>>> + *
> >>>>>>> + * Drivers should call this from their error unwind code if @job is aborted
> >>>>>>> + * before drm_sched_job_arm() is called.
> >>>>>>> + *
> >>>>>>> + * After that point of no return @job is committed to be executed by the
> >>>>>>> + * scheduler, and this function should be called from the
> >>>>>>> + * &drm_sched_backend_ops.free_job callback.
> >>>>>>>       */
> >>>>>>>      void drm_sched_job_cleanup(struct drm_sched_job *job)
> >>>>>>>      {
> >>>>>>> -     dma_fence_put(&job->s_fence->finished);
> >>>>>>> +     if (!kref_read(&job->s_fence->finished.refcount)) {
> >>>>>>> +             /* drm_sched_job_arm() has been called */
> >>>>>>> +             dma_fence_put(&job->s_fence->finished);
> >>>>>>> +     } else {
> >>>>>>> +             /* aborted job before committing to run it */
> >>>>>>> +             drm_sched_fence_free(&job->s_fence->finished.rcu);
> >>>>>>> +     }
> >>>>>>> +
> >>>>>>>          job->s_fence = NULL;
> >>>>>>>      }
> >>>>>>>      EXPORT_SYMBOL(drm_sched_job_cleanup);
> >>>>>>> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> >>>>>>> index 4eb354226972..5c3a99027ecd 100644
> >>>>>>> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> >>>>>>> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> >>>>>>> @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
> >>>>>>>          if (ret)
> >>>>>>>                  return ret;
> >>>>>>>
> >>>>>>> +     drm_sched_job_arm(&job->base);
> >>>>>>> +
> >>>>>>>          job->done_fence = dma_fence_get(&job->base.s_fence->finished);
> >>>>>>>
> >>>>>>>          /* put by scheduler job completion */
> >>>>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> >>>>>>> index 88ae7f331bb1..83afc3aa8e2f 100644
> >>>>>>> --- a/include/drm/gpu_scheduler.h
> >>>>>>> +++ b/include/drm/gpu_scheduler.h
> >>>>>>> @@ -348,6 +348,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
> >>>>>>>      int drm_sched_job_init(struct drm_sched_job *job,
> >>>>>>>                         struct drm_sched_entity *entity,
> >>>>>>>                         void *owner);
> >>>>>>> +void drm_sched_job_arm(struct drm_sched_job *job);
> >>>>>>>      void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> >>>>>>>                                      struct drm_gpu_scheduler **sched_list,
> >>>>>>>                                         unsigned int num_sched_list);
> >>>>>>> @@ -387,8 +388,12 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> >>>>>>>                                     enum drm_sched_priority priority);
> >>>>>>>      bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> >>>>>>>
> >>>>>>> -struct drm_sched_fence *drm_sched_fence_create(
> >>>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(
> >>>>>>>          struct drm_sched_entity *s_entity, void *owner);
> >>>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> >>>>>>> +                       struct drm_sched_entity *entity);
> >>>>>>> +void drm_sched_fence_free(struct rcu_head *rcu);
> >>>>>>> +
> >>>>>>>      void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
> >>>>>>>      void drm_sched_fence_finished(struct drm_sched_fence *fence);
> >>>>>>>
> >
>
Daniel Vetter July 8, 2021, 7:19 a.m. UTC | #9
On Thu, Jul 8, 2021 at 9:09 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> On Thu, Jul 8, 2021 at 8:56 AM Christian König <christian.koenig@amd.com> wrote:
> > Am 07.07.21 um 18:32 schrieb Daniel Vetter:
> > > On Wed, Jul 7, 2021 at 2:58 PM Christian König <christian.koenig@amd.com> wrote:
> > >> Am 07.07.21 um 14:13 schrieb Daniel Vetter:
> > >>> On Wed, Jul 7, 2021 at 1:57 PM Christian König <christian.koenig@amd.com> wrote:
> > >>>> Am 07.07.21 um 13:14 schrieb Daniel Vetter:
> > >>>>> On Wed, Jul 7, 2021 at 11:30 AM Christian König
> > >>>>> <christian.koenig@amd.com> wrote:
> > >>>>>> Am 02.07.21 um 23:38 schrieb Daniel Vetter:
> > >>>>>>> This is a very confusingly named function, because not just does it
> > >>>>>>> init an object, it arms it and provides a point of no return for
> > >>>>>>> pushing a job into the scheduler. It would be nice if that's a bit
> > >>>>>>> clearer in the interface.
> > >>>>>>>
> > >>>>>>> But the real reason is that I want to push the dependency tracking
> > >>>>>>> helpers into the scheduler code, and that means drm_sched_job_init
> > >>>>>>> must be called a lot earlier, without arming the job.
> > >>>>>>>
> > >>>>>>> v2:
> > >>>>>>> - don't change .gitignore (Steven)
> > >>>>>>> - don't forget v3d (Emma)
> > >>>>>>>
> > >>>>>>> v3: Emma noticed that I leak the memory allocated in
> > >>>>>>> drm_sched_job_init if we bail out before the point of no return in
> > >>>>>>> subsequent driver patches. To be able to fix this change
> > >>>>>>> drm_sched_job_cleanup() so it can handle being called both before and
> > >>>>>>> after drm_sched_job_arm().
> > >>>>>> Thinking more about this, I'm not sure if this really works.
> > >>>>>>
> > >>>>>> See drm_sched_job_init() was also calling drm_sched_entity_select_rq()
> > >>>>>> to update the entity->rq association.
> > >>>>>>
> > >>>>>> And that can only be done later on when we arm the fence as well.
> > >>>>> Hm yeah, but that's a bug in the existing code I think: We already
> > >>>>> fail to clean up if we fail to allocate the fences. So I think the
> > >>>>> right thing to do here is to split the checks into job_init, and do
> > >>>>> the actual arming/rq selection in job_arm? I'm not entirely sure
> > >>>>> what's all going on there, the first check looks a bit like trying to
> > >>>>> schedule before the entity is set up, which is a driver bug and should
> > >>>>> have a WARN_ON?
> > >>>> No you misunderstood me, the problem is something else.
> > >>>>
> > >>>> You asked previously why the call to drm_sched_job_init() was so late in
> > >>>> the CS.
> > >>>>
> > >>>> The reason for this was not alone the scheduler fence init, but also the
> > >>>> call to drm_sched_entity_select_rq().
> > >>> Ah ok, I think I can fix that. Needs a prep patch to first make
> > >>> drm_sched_entity_select infallible, then should be easy to do.
> > >>>
> > >>>>> The 2nd check around last_scheduled I have honeslty no idea what it's
> > >>>>> even trying to do.
> > >>>> You mean that here?
> > >>>>
> > >>>>            fence = READ_ONCE(entity->last_scheduled);
> > >>>>            if (fence && !dma_fence_is_signaled(fence))
> > >>>>                    return;
> > >>>>
> > >>>> This makes sure that load balancing is not moving the entity to a
> > >>>> different scheduler while there are still jobs running from this entity
> > >>>> on the hardware,
> > >>> Yeah after a nap that idea crossed my mind too. But now I have locking
> > >>> questions, afaiui the scheduler thread updates this, without taking
> > >>> any locks - entity dequeuing is lockless. And here we read the fence
> > >>> and then seem to yolo check whether it's signalled? What's preventing
> > >>> a use-after-free here? There's no rcu or anything going on here at
> > >>> all, and it's outside of the spinlock section, which starts a bit
> > >>> further down.
> > >> The last_scheduled fence of an entity can only change when there are
> > >> jobs on the entities queued, and we have just ruled that out in the
> > >> check before.
> > > There aren't any barriers, so the cpu could easily run the two checks
> > > the other way round. I'll ponder this and figure out where exactly we
> > > need docs for the constraint and/or barriers to make this work as
> > > intended. As-is I'm not seeing how it does ...
> >
> > spsc_queue_count() provides the necessary barrier with the atomic_read().
>
> atomic_t is fully unordered, except when it's a read-modify-write

Wasn't awake yet, I think the rule is read-modify-write and return
previous value gives you full barrier. So stuff like cmpxchg, but also
a few others. See atomic_t.txt under ODERING heading (yes that
maintainer refuses to accept .rst so I can't just link you to the
right section, it's silly). get/set and even RMW atomic ops that don't
return anything are all fully unordered.
-Daniel


> atomic op, then it's a full barrier. So yeah you need more here. But
> also since you only need a read barrier on one side, and a write
> barrier on the other, you don't actually need a cpu barriers on x86.
> And READ_ONCE gives you the compiler barrier on one side at least, I
> haven't found it on the writer side yet.
>
> > But yes a comment would be really nice here. I had to think for a while
> > why we don't need this as well.
>
> I'm typing a patch, which after a night's sleep I realized has the
> wrong barriers. And now I'm also typing some doc improvements for
> drm_sched_entity and related functions.
>
> >
> > Christian.
> >
> > > -Daniel
> > >
> > >> Christian.
> > >>
> > >>
> > >>> -Daniel
> > >>>
> > >>>> Regards
> > >>>> Christian.
> > >>>>
> > >>>>> -Daniel
> > >>>>>
> > >>>>>> Christian.
> > >>>>>>
> > >>>>>>> Also improve the kerneldoc for this.
> > >>>>>>>
> > >>>>>>> Acked-by: Steven Price <steven.price@arm.com> (v2)
> > >>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > >>>>>>> Cc: Lucas Stach <l.stach@pengutronix.de>
> > >>>>>>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> > >>>>>>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> > >>>>>>> Cc: Qiang Yu <yuq825@gmail.com>
> > >>>>>>> Cc: Rob Herring <robh@kernel.org>
> > >>>>>>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> > >>>>>>> Cc: Steven Price <steven.price@arm.com>
> > >>>>>>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> > >>>>>>> Cc: David Airlie <airlied@linux.ie>
> > >>>>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
> > >>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > >>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
> > >>>>>>> Cc: Masahiro Yamada <masahiroy@kernel.org>
> > >>>>>>> Cc: Kees Cook <keescook@chromium.org>
> > >>>>>>> Cc: Adam Borowski <kilobyte@angband.pl>
> > >>>>>>> Cc: Nick Terrell <terrelln@fb.com>
> > >>>>>>> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > >>>>>>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> > >>>>>>> Cc: Sami Tolvanen <samitolvanen@google.com>
> > >>>>>>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > >>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> > >>>>>>> Cc: Dave Airlie <airlied@redhat.com>
> > >>>>>>> Cc: Nirmoy Das <nirmoy.das@amd.com>
> > >>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> > >>>>>>> Cc: Lee Jones <lee.jones@linaro.org>
> > >>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
> > >>>>>>> Cc: Chen Li <chenli@uniontech.com>
> > >>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> > >>>>>>> Cc: "Marek Olšák" <marek.olsak@amd.com>
> > >>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
> > >>>>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > >>>>>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > >>>>>>> Cc: Sonny Jiang <sonny.jiang@amd.com>
> > >>>>>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> > >>>>>>> Cc: Tian Tao <tiantao6@hisilicon.com>
> > >>>>>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> > >>>>>>> Cc: etnaviv@lists.freedesktop.org
> > >>>>>>> Cc: lima@lists.freedesktop.org
> > >>>>>>> Cc: linux-media@vger.kernel.org
> > >>>>>>> Cc: linaro-mm-sig@lists.linaro.org
> > >>>>>>> Cc: Emma Anholt <emma@anholt.net>
> > >>>>>>> ---
> > >>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
> > >>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
> > >>>>>>>      drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
> > >>>>>>>      drivers/gpu/drm/lima/lima_sched.c        |  2 ++
> > >>>>>>>      drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
> > >>>>>>>      drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
> > >>>>>>>      drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
> > >>>>>>>      drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
> > >>>>>>>      drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
> > >>>>>>>      include/drm/gpu_scheduler.h              |  7 +++-
> > >>>>>>>      10 files changed, 74 insertions(+), 14 deletions(-)
> > >>>>>>>
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > >>>>>>> index c5386d13eb4a..a4ec092af9a7 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > >>>>>>> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
> > >>>>>>>          if (r)
> > >>>>>>>                  goto error_unlock;
> > >>>>>>>
> > >>>>>>> +     drm_sched_job_arm(&job->base);
> > >>>>>>> +
> > >>>>>>>          /* No memory allocation is allowed while holding the notifier lock.
> > >>>>>>>           * The lock is held until amdgpu_cs_submit is finished and fence is
> > >>>>>>>           * added to BOs.
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > >>>>>>> index d33e6d97cc89..5ddb955d2315 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > >>>>>>> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
> > >>>>>>>          if (r)
> > >>>>>>>                  return r;
> > >>>>>>>
> > >>>>>>> +     drm_sched_job_arm(&job->base);
> > >>>>>>> +
> > >>>>>>>          *f = dma_fence_get(&job->base.s_fence->finished);
> > >>>>>>>          amdgpu_job_free_resources(job);
> > >>>>>>>          drm_sched_entity_push_job(&job->base, entity);
> > >>>>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > >>>>>>> index feb6da1b6ceb..05f412204118 100644
> > >>>>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > >>>>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > >>>>>>> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
> > >>>>>>>          if (ret)
> > >>>>>>>                  goto out_unlock;
> > >>>>>>>
> > >>>>>>> +     drm_sched_job_arm(&submit->sched_job);
> > >>>>>>> +
> > >>>>>>>          submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
> > >>>>>>>          submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
> > >>>>>>>                                                  submit->out_fence, 0,
> > >>>>>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> > >>>>>>> index dba8329937a3..38f755580507 100644
> > >>>>>>> --- a/drivers/gpu/drm/lima/lima_sched.c
> > >>>>>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
> > >>>>>>> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
> > >>>>>>>                  return err;
> > >>>>>>>          }
> > >>>>>>>
> > >>>>>>> +     drm_sched_job_arm(&task->base);
> > >>>>>>> +
> > >>>>>>>          task->num_bos = num_bos;
> > >>>>>>>          task->vm = lima_vm_get(vm);
> > >>>>>>>
> > >>>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> > >>>>>>> index 71a72fb50e6b..2992dc85325f 100644
> > >>>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > >>>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > >>>>>>> @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
> > >>>>>>>                  goto unlock;
> > >>>>>>>          }
> > >>>>>>>
> > >>>>>>> +     drm_sched_job_arm(&job->base);
> > >>>>>>> +
> > >>>>>>>          job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
> > >>>>>>>
> > >>>>>>>          ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> > >>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > >>>>>>> index 79554aa4dbb1..f7347c284886 100644
> > >>>>>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > >>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > >>>>>>> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> > >>>>>>>       * @sched_job: job to submit
> > >>>>>>>       * @entity: scheduler entity
> > >>>>>>>       *
> > >>>>>>> - * Note: To guarantee that the order of insertion to queue matches
> > >>>>>>> - * the job's fence sequence number this function should be
> > >>>>>>> - * called with drm_sched_job_init under common lock.
> > >>>>>>> + * Note: To guarantee that the order of insertion to queue matches the job's
> > >>>>>>> + * fence sequence number this function should be called with drm_sched_job_arm()
> > >>>>>>> + * under common lock.
> > >>>>>>>       *
> > >>>>>>>       * Returns 0 for success, negative error code otherwise.
> > >>>>>>>       */
> > >>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> > >>>>>>> index 69de2c76731f..c451ee9a30d7 100644
> > >>>>>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> > >>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> > >>>>>>> @@ -90,7 +90,7 @@ static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
> > >>>>>>>       *
> > >>>>>>>       * Free up the fence memory after the RCU grace period.
> > >>>>>>>       */
> > >>>>>>> -static void drm_sched_fence_free(struct rcu_head *rcu)
> > >>>>>>> +void drm_sched_fence_free(struct rcu_head *rcu)
> > >>>>>>>      {
> > >>>>>>>          struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
> > >>>>>>>          struct drm_sched_fence *fence = to_drm_sched_fence(f);
> > >>>>>>> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> > >>>>>>>      }
> > >>>>>>>      EXPORT_SYMBOL(to_drm_sched_fence);
> > >>>>>>>
> > >>>>>>> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> > >>>>>>> -                                            void *owner)
> > >>>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> > >>>>>>> +                                           void *owner)
> > >>>>>>>      {
> > >>>>>>>          struct drm_sched_fence *fence = NULL;
> > >>>>>>> -     unsigned seq;
> > >>>>>>>
> > >>>>>>>          fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
> > >>>>>>>          if (fence == NULL)
> > >>>>>>> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> > >>>>>>>          fence->sched = entity->rq->sched;
> > >>>>>>>          spin_lock_init(&fence->lock);
> > >>>>>>>
> > >>>>>>> +     return fence;
> > >>>>>>> +}
> > >>>>>>> +
> > >>>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> > >>>>>>> +                       struct drm_sched_entity *entity)
> > >>>>>>> +{
> > >>>>>>> +     unsigned seq;
> > >>>>>>> +
> > >>>>>>>          seq = atomic_inc_return(&entity->fence_seq);
> > >>>>>>>          dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
> > >>>>>>>                         &fence->lock, entity->fence_context, seq);
> > >>>>>>>          dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
> > >>>>>>>                         &fence->lock, entity->fence_context + 1, seq);
> > >>>>>>> -
> > >>>>>>> -     return fence;
> > >>>>>>>      }
> > >>>>>>>
> > >>>>>>>      module_init(drm_sched_fence_slab_init);
> > >>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > >>>>>>> index 33c414d55fab..5e84e1500c32 100644
> > >>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> > >>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > >>>>>>> @@ -48,9 +48,11 @@
> > >>>>>>>      #include <linux/wait.h>
> > >>>>>>>      #include <linux/sched.h>
> > >>>>>>>      #include <linux/completion.h>
> > >>>>>>> +#include <linux/dma-resv.h>
> > >>>>>>>      #include <uapi/linux/sched/types.h>
> > >>>>>>>
> > >>>>>>>      #include <drm/drm_print.h>
> > >>>>>>> +#include <drm/drm_gem.h>
> > >>>>>>>      #include <drm/gpu_scheduler.h>
> > >>>>>>>      #include <drm/spsc_queue.h>
> > >>>>>>>
> > >>>>>>> @@ -569,7 +571,6 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
> > >>>>>>>
> > >>>>>>>      /**
> > >>>>>>>       * drm_sched_job_init - init a scheduler job
> > >>>>>>> - *
> > >>>>>>>       * @job: scheduler job to init
> > >>>>>>>       * @entity: scheduler entity to use
> > >>>>>>>       * @owner: job owner for debugging
> > >>>>>>> @@ -577,6 +578,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
> > >>>>>>>       * Refer to drm_sched_entity_push_job() documentation
> > >>>>>>>       * for locking considerations.
> > >>>>>>>       *
> > >>>>>>> + * Drivers must make sure drm_sched_job_cleanup() if this function returns
> > >>>>>>> + * successfully, even when @job is aborted before drm_sched_job_arm() is called.
> > >>>>>>> + *
> > >>>>>>>       * Returns 0 for success, negative error code otherwise.
> > >>>>>>>       */
> > >>>>>>>      int drm_sched_job_init(struct drm_sched_job *job,
> > >>>>>>> @@ -594,7 +598,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> > >>>>>>>          job->sched = sched;
> > >>>>>>>          job->entity = entity;
> > >>>>>>>          job->s_priority = entity->rq - sched->sched_rq;
> > >>>>>>> -     job->s_fence = drm_sched_fence_create(entity, owner);
> > >>>>>>> +     job->s_fence = drm_sched_fence_alloc(entity, owner);
> > >>>>>>>          if (!job->s_fence)
> > >>>>>>>                  return -ENOMEM;
> > >>>>>>>          job->id = atomic64_inc_return(&sched->job_id_count);
> > >>>>>>> @@ -606,13 +610,47 @@ int drm_sched_job_init(struct drm_sched_job *job,
> > >>>>>>>      EXPORT_SYMBOL(drm_sched_job_init);
> > >>>>>>>
> > >>>>>>>      /**
> > >>>>>>> - * drm_sched_job_cleanup - clean up scheduler job resources
> > >>>>>>> + * drm_sched_job_arm - arm a scheduler job for execution
> > >>>>>>> + * @job: scheduler job to arm
> > >>>>>>> + *
> > >>>>>>> + * This arms a scheduler job for execution. Specifically it initializes the
> > >>>>>>> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
> > >>>>>>> + * or other places that need to track the completion of this job.
> > >>>>>>> + *
> > >>>>>>> + * Refer to drm_sched_entity_push_job() documentation for locking
> > >>>>>>> + * considerations.
> > >>>>>>>       *
> > >>>>>>> + * This can only be called if drm_sched_job_init() succeeded.
> > >>>>>>> + */
> > >>>>>>> +void drm_sched_job_arm(struct drm_sched_job *job)
> > >>>>>>> +{
> > >>>>>>> +     drm_sched_fence_init(job->s_fence, job->entity);
> > >>>>>>> +}
> > >>>>>>> +EXPORT_SYMBOL(drm_sched_job_arm);
> > >>>>>>> +
> > >>>>>>> +/**
> > >>>>>>> + * drm_sched_job_cleanup - clean up scheduler job resources
> > >>>>>>>       * @job: scheduler job to clean up
> > >>>>>>> + *
> > >>>>>>> + * Cleans up the resources allocated with drm_sched_job_init().
> > >>>>>>> + *
> > >>>>>>> + * Drivers should call this from their error unwind code if @job is aborted
> > >>>>>>> + * before drm_sched_job_arm() is called.
> > >>>>>>> + *
> > >>>>>>> + * After that point of no return @job is committed to be executed by the
> > >>>>>>> + * scheduler, and this function should be called from the
> > >>>>>>> + * &drm_sched_backend_ops.free_job callback.
> > >>>>>>>       */
> > >>>>>>>      void drm_sched_job_cleanup(struct drm_sched_job *job)
> > >>>>>>>      {
> > >>>>>>> -     dma_fence_put(&job->s_fence->finished);
> > >>>>>>> +     if (!kref_read(&job->s_fence->finished.refcount)) {
> > >>>>>>> +             /* drm_sched_job_arm() has been called */
> > >>>>>>> +             dma_fence_put(&job->s_fence->finished);
> > >>>>>>> +     } else {
> > >>>>>>> +             /* aborted job before committing to run it */
> > >>>>>>> +             drm_sched_fence_free(&job->s_fence->finished.rcu);
> > >>>>>>> +     }
> > >>>>>>> +
> > >>>>>>>          job->s_fence = NULL;
> > >>>>>>>      }
> > >>>>>>>      EXPORT_SYMBOL(drm_sched_job_cleanup);
> > >>>>>>> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> > >>>>>>> index 4eb354226972..5c3a99027ecd 100644
> > >>>>>>> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> > >>>>>>> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> > >>>>>>> @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
> > >>>>>>>          if (ret)
> > >>>>>>>                  return ret;
> > >>>>>>>
> > >>>>>>> +     drm_sched_job_arm(&job->base);
> > >>>>>>> +
> > >>>>>>>          job->done_fence = dma_fence_get(&job->base.s_fence->finished);
> > >>>>>>>
> > >>>>>>>          /* put by scheduler job completion */
> > >>>>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > >>>>>>> index 88ae7f331bb1..83afc3aa8e2f 100644
> > >>>>>>> --- a/include/drm/gpu_scheduler.h
> > >>>>>>> +++ b/include/drm/gpu_scheduler.h
> > >>>>>>> @@ -348,6 +348,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
> > >>>>>>>      int drm_sched_job_init(struct drm_sched_job *job,
> > >>>>>>>                         struct drm_sched_entity *entity,
> > >>>>>>>                         void *owner);
> > >>>>>>> +void drm_sched_job_arm(struct drm_sched_job *job);
> > >>>>>>>      void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> > >>>>>>>                                      struct drm_gpu_scheduler **sched_list,
> > >>>>>>>                                         unsigned int num_sched_list);
> > >>>>>>> @@ -387,8 +388,12 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> > >>>>>>>                                     enum drm_sched_priority priority);
> > >>>>>>>      bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> > >>>>>>>
> > >>>>>>> -struct drm_sched_fence *drm_sched_fence_create(
> > >>>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(
> > >>>>>>>          struct drm_sched_entity *s_entity, void *owner);
> > >>>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> > >>>>>>> +                       struct drm_sched_entity *entity);
> > >>>>>>> +void drm_sched_fence_free(struct rcu_head *rcu);
> > >>>>>>> +
> > >>>>>>>      void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
> > >>>>>>>      void drm_sched_fence_finished(struct drm_sched_fence *fence);
> > >>>>>>>
> > >
> >
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Christian König July 8, 2021, 7:53 a.m. UTC | #10
Am 08.07.21 um 09:19 schrieb Daniel Vetter:
> On Thu, Jul 8, 2021 at 9:09 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>> On Thu, Jul 8, 2021 at 8:56 AM Christian König <christian.koenig@amd.com> wrote:
>>> Am 07.07.21 um 18:32 schrieb Daniel Vetter:
>>>> On Wed, Jul 7, 2021 at 2:58 PM Christian König <christian.koenig@amd.com> wrote:
>>>>> Am 07.07.21 um 14:13 schrieb Daniel Vetter:
>>>>>> On Wed, Jul 7, 2021 at 1:57 PM Christian König <christian.koenig@amd.com> wrote:
>>>>>>> Am 07.07.21 um 13:14 schrieb Daniel Vetter:
>>>>>>>> On Wed, Jul 7, 2021 at 11:30 AM Christian König
>>>>>>>> <christian.koenig@amd.com> wrote:
>>>>>>>>> Am 02.07.21 um 23:38 schrieb Daniel Vetter:
>>>>>>>>>> This is a very confusingly named function, because not just does it
>>>>>>>>>> init an object, it arms it and provides a point of no return for
>>>>>>>>>> pushing a job into the scheduler. It would be nice if that's a bit
>>>>>>>>>> clearer in the interface.
>>>>>>>>>>
>>>>>>>>>> But the real reason is that I want to push the dependency tracking
>>>>>>>>>> helpers into the scheduler code, and that means drm_sched_job_init
>>>>>>>>>> must be called a lot earlier, without arming the job.
>>>>>>>>>>
>>>>>>>>>> v2:
>>>>>>>>>> - don't change .gitignore (Steven)
>>>>>>>>>> - don't forget v3d (Emma)
>>>>>>>>>>
>>>>>>>>>> v3: Emma noticed that I leak the memory allocated in
>>>>>>>>>> drm_sched_job_init if we bail out before the point of no return in
>>>>>>>>>> subsequent driver patches. To be able to fix this change
>>>>>>>>>> drm_sched_job_cleanup() so it can handle being called both before and
>>>>>>>>>> after drm_sched_job_arm().
>>>>>>>>> Thinking more about this, I'm not sure if this really works.
>>>>>>>>>
>>>>>>>>> See drm_sched_job_init() was also calling drm_sched_entity_select_rq()
>>>>>>>>> to update the entity->rq association.
>>>>>>>>>
>>>>>>>>> And that can only be done later on when we arm the fence as well.
>>>>>>>> Hm yeah, but that's a bug in the existing code I think: We already
>>>>>>>> fail to clean up if we fail to allocate the fences. So I think the
>>>>>>>> right thing to do here is to split the checks into job_init, and do
>>>>>>>> the actual arming/rq selection in job_arm? I'm not entirely sure
>>>>>>>> what's all going on there, the first check looks a bit like trying to
>>>>>>>> schedule before the entity is set up, which is a driver bug and should
>>>>>>>> have a WARN_ON?
>>>>>>> No you misunderstood me, the problem is something else.
>>>>>>>
>>>>>>> You asked previously why the call to drm_sched_job_init() was so late in
>>>>>>> the CS.
>>>>>>>
>>>>>>> The reason for this was not alone the scheduler fence init, but also the
>>>>>>> call to drm_sched_entity_select_rq().
>>>>>> Ah ok, I think I can fix that. Needs a prep patch to first make
>>>>>> drm_sched_entity_select infallible, then should be easy to do.
>>>>>>
>>>>>>>> The 2nd check around last_scheduled I have honeslty no idea what it's
>>>>>>>> even trying to do.
>>>>>>> You mean that here?
>>>>>>>
>>>>>>>             fence = READ_ONCE(entity->last_scheduled);
>>>>>>>             if (fence && !dma_fence_is_signaled(fence))
>>>>>>>                     return;
>>>>>>>
>>>>>>> This makes sure that load balancing is not moving the entity to a
>>>>>>> different scheduler while there are still jobs running from this entity
>>>>>>> on the hardware,
>>>>>> Yeah after a nap that idea crossed my mind too. But now I have locking
>>>>>> questions, afaiui the scheduler thread updates this, without taking
>>>>>> any locks - entity dequeuing is lockless. And here we read the fence
>>>>>> and then seem to yolo check whether it's signalled? What's preventing
>>>>>> a use-after-free here? There's no rcu or anything going on here at
>>>>>> all, and it's outside of the spinlock section, which starts a bit
>>>>>> further down.
>>>>> The last_scheduled fence of an entity can only change when there are
>>>>> jobs on the entities queued, and we have just ruled that out in the
>>>>> check before.
>>>> There aren't any barriers, so the cpu could easily run the two checks
>>>> the other way round. I'll ponder this and figure out where exactly we
>>>> need docs for the constraint and/or barriers to make this work as
>>>> intended. As-is I'm not seeing how it does ...
>>> spsc_queue_count() provides the necessary barrier with the atomic_read().
>> atomic_t is fully unordered, except when it's a read-modify-write
> Wasn't awake yet, I think the rule is read-modify-write and return
> previous value gives you full barrier. So stuff like cmpxchg, but also
> a few others. See atomic_t.txt under ODERING heading (yes that
> maintainer refuses to accept .rst so I can't just link you to the
> right section, it's silly). get/set and even RMW atomic ops that don't
> return anything are all fully unordered.

As far as I know that not completely correct. The rules around atomics i 
once learned are:

1. Everything which modifies something is a write barrier.
2. Everything which returns something is a read barrier.

And I know a whole bunch of use cases where this is relied upon in the 
core kernel, so I'm pretty sure that's correct.

In this case the write barrier is the atomic_dec() in spsc_queue_pop() 
and the read barrier is the aromic_read() in spsc_queue_count().

The READ_ONCE() is actually not even necessary as far as I can see.

Christian.

> -Daniel
>
>
>> atomic op, then it's a full barrier. So yeah you need more here. But
>> also since you only need a read barrier on one side, and a write
>> barrier on the other, you don't actually need a cpu barriers on x86.
>> And READ_ONCE gives you the compiler barrier on one side at least, I
>> haven't found it on the writer side yet.
>>
>>> But yes a comment would be really nice here. I had to think for a while
>>> why we don't need this as well.
>> I'm typing a patch, which after a night's sleep I realized has the
>> wrong barriers. And now I'm also typing some doc improvements for
>> drm_sched_entity and related functions.
>>
>>> Christian.
>>>
>>>> -Daniel
>>>>
>>>>> Christian.
>>>>>
>>>>>
>>>>>> -Daniel
>>>>>>
>>>>>>> Regards
>>>>>>> Christian.
>>>>>>>
>>>>>>>> -Daniel
>>>>>>>>
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>> Also improve the kerneldoc for this.
>>>>>>>>>>
>>>>>>>>>> Acked-by: Steven Price <steven.price@arm.com> (v2)
>>>>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>>>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>>>>>>>>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
>>>>>>>>>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
>>>>>>>>>> Cc: Qiang Yu <yuq825@gmail.com>
>>>>>>>>>> Cc: Rob Herring <robh@kernel.org>
>>>>>>>>>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
>>>>>>>>>> Cc: Steven Price <steven.price@arm.com>
>>>>>>>>>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
>>>>>>>>>> Cc: David Airlie <airlied@linux.ie>
>>>>>>>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>>>>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>>>>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>>>>>>> Cc: Masahiro Yamada <masahiroy@kernel.org>
>>>>>>>>>> Cc: Kees Cook <keescook@chromium.org>
>>>>>>>>>> Cc: Adam Borowski <kilobyte@angband.pl>
>>>>>>>>>> Cc: Nick Terrell <terrelln@fb.com>
>>>>>>>>>> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
>>>>>>>>>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
>>>>>>>>>> Cc: Sami Tolvanen <samitolvanen@google.com>
>>>>>>>>>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
>>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>>>>>>> Cc: Dave Airlie <airlied@redhat.com>
>>>>>>>>>> Cc: Nirmoy Das <nirmoy.das@amd.com>
>>>>>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>>>>>>>>> Cc: Lee Jones <lee.jones@linaro.org>
>>>>>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>>>>>>>>> Cc: Chen Li <chenli@uniontech.com>
>>>>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>>>>>>> Cc: "Marek Olšák" <marek.olsak@amd.com>
>>>>>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>>>>>>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>>>>>>>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>>>>> Cc: Sonny Jiang <sonny.jiang@amd.com>
>>>>>>>>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
>>>>>>>>>> Cc: Tian Tao <tiantao6@hisilicon.com>
>>>>>>>>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
>>>>>>>>>> Cc: etnaviv@lists.freedesktop.org
>>>>>>>>>> Cc: lima@lists.freedesktop.org
>>>>>>>>>> Cc: linux-media@vger.kernel.org
>>>>>>>>>> Cc: linaro-mm-sig@lists.linaro.org
>>>>>>>>>> Cc: Emma Anholt <emma@anholt.net>
>>>>>>>>>> ---
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
>>>>>>>>>>       drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
>>>>>>>>>>       drivers/gpu/drm/lima/lima_sched.c        |  2 ++
>>>>>>>>>>       drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
>>>>>>>>>>       drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
>>>>>>>>>>       drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
>>>>>>>>>>       drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
>>>>>>>>>>       drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
>>>>>>>>>>       include/drm/gpu_scheduler.h              |  7 +++-
>>>>>>>>>>       10 files changed, 74 insertions(+), 14 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> index c5386d13eb4a..a4ec092af9a7 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>>>>>>>>>>           if (r)
>>>>>>>>>>                   goto error_unlock;
>>>>>>>>>>
>>>>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>>>>> +
>>>>>>>>>>           /* No memory allocation is allowed while holding the notifier lock.
>>>>>>>>>>            * The lock is held until amdgpu_cs_submit is finished and fence is
>>>>>>>>>>            * added to BOs.
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>>>>> index d33e6d97cc89..5ddb955d2315 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>>>>> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
>>>>>>>>>>           if (r)
>>>>>>>>>>                   return r;
>>>>>>>>>>
>>>>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>>>>> +
>>>>>>>>>>           *f = dma_fence_get(&job->base.s_fence->finished);
>>>>>>>>>>           amdgpu_job_free_resources(job);
>>>>>>>>>>           drm_sched_entity_push_job(&job->base, entity);
>>>>>>>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>>>>> index feb6da1b6ceb..05f412204118 100644
>>>>>>>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>>>>> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
>>>>>>>>>>           if (ret)
>>>>>>>>>>                   goto out_unlock;
>>>>>>>>>>
>>>>>>>>>> +     drm_sched_job_arm(&submit->sched_job);
>>>>>>>>>> +
>>>>>>>>>>           submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
>>>>>>>>>>           submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
>>>>>>>>>>                                                   submit->out_fence, 0,
>>>>>>>>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>>>>>>>>> index dba8329937a3..38f755580507 100644
>>>>>>>>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>>>>>>>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>>>>>>>>> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
>>>>>>>>>>                   return err;
>>>>>>>>>>           }
>>>>>>>>>>
>>>>>>>>>> +     drm_sched_job_arm(&task->base);
>>>>>>>>>> +
>>>>>>>>>>           task->num_bos = num_bos;
>>>>>>>>>>           task->vm = lima_vm_get(vm);
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>>>>> index 71a72fb50e6b..2992dc85325f 100644
>>>>>>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>>>>> @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
>>>>>>>>>>                   goto unlock;
>>>>>>>>>>           }
>>>>>>>>>>
>>>>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>>>>> +
>>>>>>>>>>           job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>>>>>>>>>>
>>>>>>>>>>           ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>>>>>>>>>> index 79554aa4dbb1..f7347c284886 100644
>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>>>>>>>>> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>>>>>>>>>>        * @sched_job: job to submit
>>>>>>>>>>        * @entity: scheduler entity
>>>>>>>>>>        *
>>>>>>>>>> - * Note: To guarantee that the order of insertion to queue matches
>>>>>>>>>> - * the job's fence sequence number this function should be
>>>>>>>>>> - * called with drm_sched_job_init under common lock.
>>>>>>>>>> + * Note: To guarantee that the order of insertion to queue matches the job's
>>>>>>>>>> + * fence sequence number this function should be called with drm_sched_job_arm()
>>>>>>>>>> + * under common lock.
>>>>>>>>>>        *
>>>>>>>>>>        * Returns 0 for success, negative error code otherwise.
>>>>>>>>>>        */
>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
>>>>>>>>>> index 69de2c76731f..c451ee9a30d7 100644
>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
>>>>>>>>>> @@ -90,7 +90,7 @@ static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
>>>>>>>>>>        *
>>>>>>>>>>        * Free up the fence memory after the RCU grace period.
>>>>>>>>>>        */
>>>>>>>>>> -static void drm_sched_fence_free(struct rcu_head *rcu)
>>>>>>>>>> +void drm_sched_fence_free(struct rcu_head *rcu)
>>>>>>>>>>       {
>>>>>>>>>>           struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
>>>>>>>>>>           struct drm_sched_fence *fence = to_drm_sched_fence(f);
>>>>>>>>>> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>>>>>>>>>>       }
>>>>>>>>>>       EXPORT_SYMBOL(to_drm_sched_fence);
>>>>>>>>>>
>>>>>>>>>> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>>>>>>>>>> -                                            void *owner)
>>>>>>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
>>>>>>>>>> +                                           void *owner)
>>>>>>>>>>       {
>>>>>>>>>>           struct drm_sched_fence *fence = NULL;
>>>>>>>>>> -     unsigned seq;
>>>>>>>>>>
>>>>>>>>>>           fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
>>>>>>>>>>           if (fence == NULL)
>>>>>>>>>> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>>>>>>>>>>           fence->sched = entity->rq->sched;
>>>>>>>>>>           spin_lock_init(&fence->lock);
>>>>>>>>>>
>>>>>>>>>> +     return fence;
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
>>>>>>>>>> +                       struct drm_sched_entity *entity)
>>>>>>>>>> +{
>>>>>>>>>> +     unsigned seq;
>>>>>>>>>> +
>>>>>>>>>>           seq = atomic_inc_return(&entity->fence_seq);
>>>>>>>>>>           dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
>>>>>>>>>>                          &fence->lock, entity->fence_context, seq);
>>>>>>>>>>           dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
>>>>>>>>>>                          &fence->lock, entity->fence_context + 1, seq);
>>>>>>>>>> -
>>>>>>>>>> -     return fence;
>>>>>>>>>>       }
>>>>>>>>>>
>>>>>>>>>>       module_init(drm_sched_fence_slab_init);
>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>>>> index 33c414d55fab..5e84e1500c32 100644
>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>>>> @@ -48,9 +48,11 @@
>>>>>>>>>>       #include <linux/wait.h>
>>>>>>>>>>       #include <linux/sched.h>
>>>>>>>>>>       #include <linux/completion.h>
>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>       #include <uapi/linux/sched/types.h>
>>>>>>>>>>
>>>>>>>>>>       #include <drm/drm_print.h>
>>>>>>>>>> +#include <drm/drm_gem.h>
>>>>>>>>>>       #include <drm/gpu_scheduler.h>
>>>>>>>>>>       #include <drm/spsc_queue.h>
>>>>>>>>>>
>>>>>>>>>> @@ -569,7 +571,6 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
>>>>>>>>>>
>>>>>>>>>>       /**
>>>>>>>>>>        * drm_sched_job_init - init a scheduler job
>>>>>>>>>> - *
>>>>>>>>>>        * @job: scheduler job to init
>>>>>>>>>>        * @entity: scheduler entity to use
>>>>>>>>>>        * @owner: job owner for debugging
>>>>>>>>>> @@ -577,6 +578,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
>>>>>>>>>>        * Refer to drm_sched_entity_push_job() documentation
>>>>>>>>>>        * for locking considerations.
>>>>>>>>>>        *
>>>>>>>>>> + * Drivers must make sure drm_sched_job_cleanup() if this function returns
>>>>>>>>>> + * successfully, even when @job is aborted before drm_sched_job_arm() is called.
>>>>>>>>>> + *
>>>>>>>>>>        * Returns 0 for success, negative error code otherwise.
>>>>>>>>>>        */
>>>>>>>>>>       int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>>>> @@ -594,7 +598,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>>>>           job->sched = sched;
>>>>>>>>>>           job->entity = entity;
>>>>>>>>>>           job->s_priority = entity->rq - sched->sched_rq;
>>>>>>>>>> -     job->s_fence = drm_sched_fence_create(entity, owner);
>>>>>>>>>> +     job->s_fence = drm_sched_fence_alloc(entity, owner);
>>>>>>>>>>           if (!job->s_fence)
>>>>>>>>>>                   return -ENOMEM;
>>>>>>>>>>           job->id = atomic64_inc_return(&sched->job_id_count);
>>>>>>>>>> @@ -606,13 +610,47 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>>>>       EXPORT_SYMBOL(drm_sched_job_init);
>>>>>>>>>>
>>>>>>>>>>       /**
>>>>>>>>>> - * drm_sched_job_cleanup - clean up scheduler job resources
>>>>>>>>>> + * drm_sched_job_arm - arm a scheduler job for execution
>>>>>>>>>> + * @job: scheduler job to arm
>>>>>>>>>> + *
>>>>>>>>>> + * This arms a scheduler job for execution. Specifically it initializes the
>>>>>>>>>> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
>>>>>>>>>> + * or other places that need to track the completion of this job.
>>>>>>>>>> + *
>>>>>>>>>> + * Refer to drm_sched_entity_push_job() documentation for locking
>>>>>>>>>> + * considerations.
>>>>>>>>>>        *
>>>>>>>>>> + * This can only be called if drm_sched_job_init() succeeded.
>>>>>>>>>> + */
>>>>>>>>>> +void drm_sched_job_arm(struct drm_sched_job *job)
>>>>>>>>>> +{
>>>>>>>>>> +     drm_sched_fence_init(job->s_fence, job->entity);
>>>>>>>>>> +}
>>>>>>>>>> +EXPORT_SYMBOL(drm_sched_job_arm);
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_sched_job_cleanup - clean up scheduler job resources
>>>>>>>>>>        * @job: scheduler job to clean up
>>>>>>>>>> + *
>>>>>>>>>> + * Cleans up the resources allocated with drm_sched_job_init().
>>>>>>>>>> + *
>>>>>>>>>> + * Drivers should call this from their error unwind code if @job is aborted
>>>>>>>>>> + * before drm_sched_job_arm() is called.
>>>>>>>>>> + *
>>>>>>>>>> + * After that point of no return @job is committed to be executed by the
>>>>>>>>>> + * scheduler, and this function should be called from the
>>>>>>>>>> + * &drm_sched_backend_ops.free_job callback.
>>>>>>>>>>        */
>>>>>>>>>>       void drm_sched_job_cleanup(struct drm_sched_job *job)
>>>>>>>>>>       {
>>>>>>>>>> -     dma_fence_put(&job->s_fence->finished);
>>>>>>>>>> +     if (!kref_read(&job->s_fence->finished.refcount)) {
>>>>>>>>>> +             /* drm_sched_job_arm() has been called */
>>>>>>>>>> +             dma_fence_put(&job->s_fence->finished);
>>>>>>>>>> +     } else {
>>>>>>>>>> +             /* aborted job before committing to run it */
>>>>>>>>>> +             drm_sched_fence_free(&job->s_fence->finished.rcu);
>>>>>>>>>> +     }
>>>>>>>>>> +
>>>>>>>>>>           job->s_fence = NULL;
>>>>>>>>>>       }
>>>>>>>>>>       EXPORT_SYMBOL(drm_sched_job_cleanup);
>>>>>>>>>> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
>>>>>>>>>> index 4eb354226972..5c3a99027ecd 100644
>>>>>>>>>> --- a/drivers/gpu/drm/v3d/v3d_gem.c
>>>>>>>>>> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
>>>>>>>>>> @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
>>>>>>>>>>           if (ret)
>>>>>>>>>>                   return ret;
>>>>>>>>>>
>>>>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>>>>> +
>>>>>>>>>>           job->done_fence = dma_fence_get(&job->base.s_fence->finished);
>>>>>>>>>>
>>>>>>>>>>           /* put by scheduler job completion */
>>>>>>>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>>>>>>>>> index 88ae7f331bb1..83afc3aa8e2f 100644
>>>>>>>>>> --- a/include/drm/gpu_scheduler.h
>>>>>>>>>> +++ b/include/drm/gpu_scheduler.h
>>>>>>>>>> @@ -348,6 +348,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
>>>>>>>>>>       int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>>>>                          struct drm_sched_entity *entity,
>>>>>>>>>>                          void *owner);
>>>>>>>>>> +void drm_sched_job_arm(struct drm_sched_job *job);
>>>>>>>>>>       void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>>>>>>>>>>                                       struct drm_gpu_scheduler **sched_list,
>>>>>>>>>>                                          unsigned int num_sched_list);
>>>>>>>>>> @@ -387,8 +388,12 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>>>>>>>>>>                                      enum drm_sched_priority priority);
>>>>>>>>>>       bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>>>>>>>>>>
>>>>>>>>>> -struct drm_sched_fence *drm_sched_fence_create(
>>>>>>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(
>>>>>>>>>>           struct drm_sched_entity *s_entity, void *owner);
>>>>>>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
>>>>>>>>>> +                       struct drm_sched_entity *entity);
>>>>>>>>>> +void drm_sched_fence_free(struct rcu_head *rcu);
>>>>>>>>>> +
>>>>>>>>>>       void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
>>>>>>>>>>       void drm_sched_fence_finished(struct drm_sched_fence *fence);
>>>>>>>>>>
>>
>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C1ac51fc78f9f4e2f08a808d941e0c013%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637613255881294371%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=PRGZl6tUAc7FrL39mu%2BBV2AfC02Mz9R2Neqs5TjdB6M%3D&amp;reserved=0
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C1ac51fc78f9f4e2f08a808d941e0c013%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637613255881294371%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=PRGZl6tUAc7FrL39mu%2BBV2AfC02Mz9R2Neqs5TjdB6M%3D&amp;reserved=0
Daniel Vetter July 8, 2021, 10:02 a.m. UTC | #11
On Thu, Jul 08, 2021 at 09:53:00AM +0200, Christian König wrote:
> Am 08.07.21 um 09:19 schrieb Daniel Vetter:
> > On Thu, Jul 8, 2021 at 9:09 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > On Thu, Jul 8, 2021 at 8:56 AM Christian König <christian.koenig@amd.com> wrote:
> > > > Am 07.07.21 um 18:32 schrieb Daniel Vetter:
> > > > > On Wed, Jul 7, 2021 at 2:58 PM Christian König <christian.koenig@amd.com> wrote:
> > > > > > Am 07.07.21 um 14:13 schrieb Daniel Vetter:
> > > > > > > On Wed, Jul 7, 2021 at 1:57 PM Christian König <christian.koenig@amd.com> wrote:
> > > > > > > > Am 07.07.21 um 13:14 schrieb Daniel Vetter:
> > > > > > > > > On Wed, Jul 7, 2021 at 11:30 AM Christian König
> > > > > > > > > <christian.koenig@amd.com> wrote:
> > > > > > > > > > Am 02.07.21 um 23:38 schrieb Daniel Vetter:
> > > > > > > > > > > This is a very confusingly named function, because not just does it
> > > > > > > > > > > init an object, it arms it and provides a point of no return for
> > > > > > > > > > > pushing a job into the scheduler. It would be nice if that's a bit
> > > > > > > > > > > clearer in the interface.
> > > > > > > > > > > 
> > > > > > > > > > > But the real reason is that I want to push the dependency tracking
> > > > > > > > > > > helpers into the scheduler code, and that means drm_sched_job_init
> > > > > > > > > > > must be called a lot earlier, without arming the job.
> > > > > > > > > > > 
> > > > > > > > > > > v2:
> > > > > > > > > > > - don't change .gitignore (Steven)
> > > > > > > > > > > - don't forget v3d (Emma)
> > > > > > > > > > > 
> > > > > > > > > > > v3: Emma noticed that I leak the memory allocated in
> > > > > > > > > > > drm_sched_job_init if we bail out before the point of no return in
> > > > > > > > > > > subsequent driver patches. To be able to fix this change
> > > > > > > > > > > drm_sched_job_cleanup() so it can handle being called both before and
> > > > > > > > > > > after drm_sched_job_arm().
> > > > > > > > > > Thinking more about this, I'm not sure if this really works.
> > > > > > > > > > 
> > > > > > > > > > See drm_sched_job_init() was also calling drm_sched_entity_select_rq()
> > > > > > > > > > to update the entity->rq association.
> > > > > > > > > > 
> > > > > > > > > > And that can only be done later on when we arm the fence as well.
> > > > > > > > > Hm yeah, but that's a bug in the existing code I think: We already
> > > > > > > > > fail to clean up if we fail to allocate the fences. So I think the
> > > > > > > > > right thing to do here is to split the checks into job_init, and do
> > > > > > > > > the actual arming/rq selection in job_arm? I'm not entirely sure
> > > > > > > > > what's all going on there, the first check looks a bit like trying to
> > > > > > > > > schedule before the entity is set up, which is a driver bug and should
> > > > > > > > > have a WARN_ON?
> > > > > > > > No you misunderstood me, the problem is something else.
> > > > > > > > 
> > > > > > > > You asked previously why the call to drm_sched_job_init() was so late in
> > > > > > > > the CS.
> > > > > > > > 
> > > > > > > > The reason for this was not alone the scheduler fence init, but also the
> > > > > > > > call to drm_sched_entity_select_rq().
> > > > > > > Ah ok, I think I can fix that. Needs a prep patch to first make
> > > > > > > drm_sched_entity_select infallible, then should be easy to do.
> > > > > > > 
> > > > > > > > > The 2nd check around last_scheduled I have honeslty no idea what it's
> > > > > > > > > even trying to do.
> > > > > > > > You mean that here?
> > > > > > > > 
> > > > > > > >             fence = READ_ONCE(entity->last_scheduled);
> > > > > > > >             if (fence && !dma_fence_is_signaled(fence))
> > > > > > > >                     return;
> > > > > > > > 
> > > > > > > > This makes sure that load balancing is not moving the entity to a
> > > > > > > > different scheduler while there are still jobs running from this entity
> > > > > > > > on the hardware,
> > > > > > > Yeah after a nap that idea crossed my mind too. But now I have locking
> > > > > > > questions, afaiui the scheduler thread updates this, without taking
> > > > > > > any locks - entity dequeuing is lockless. And here we read the fence
> > > > > > > and then seem to yolo check whether it's signalled? What's preventing
> > > > > > > a use-after-free here? There's no rcu or anything going on here at
> > > > > > > all, and it's outside of the spinlock section, which starts a bit
> > > > > > > further down.
> > > > > > The last_scheduled fence of an entity can only change when there are
> > > > > > jobs on the entities queued, and we have just ruled that out in the
> > > > > > check before.
> > > > > There aren't any barriers, so the cpu could easily run the two checks
> > > > > the other way round. I'll ponder this and figure out where exactly we
> > > > > need docs for the constraint and/or barriers to make this work as
> > > > > intended. As-is I'm not seeing how it does ...
> > > > spsc_queue_count() provides the necessary barrier with the atomic_read().
> > > atomic_t is fully unordered, except when it's a read-modify-write
> > Wasn't awake yet, I think the rule is read-modify-write and return
> > previous value gives you full barrier. So stuff like cmpxchg, but also
> > a few others. See atomic_t.txt under ODERING heading (yes that
> > maintainer refuses to accept .rst so I can't just link you to the
> > right section, it's silly). get/set and even RMW atomic ops that don't
> > return anything are all fully unordered.
> 
> As far as I know that not completely correct. The rules around atomics i
> once learned are:
> 
> 1. Everything which modifies something is a write barrier.
> 2. Everything which returns something is a read barrier.
> 
> And I know a whole bunch of use cases where this is relied upon in the core
> kernel, so I'm pretty sure that's correct.

That's against what the doc says, and also it would mean stuff like
atomic_read_acquire or smp_mb__after/before_atomic is completely pointless.

On x86 you're right, anywhere else where there's no total store ordering I
you're wrong.

If there's code that relies on this it needs to be fixed and properly
documented. I did go through the squeue code a bit, and might be better to
just replace this with a core data structure.
-Daniel

> In this case the write barrier is the atomic_dec() in spsc_queue_pop() and
> the read barrier is the aromic_read() in spsc_queue_count().
> 
> The READ_ONCE() is actually not even necessary as far as I can see.
> 
> Christian.
> 
> > -Daniel
> > 
> > 
> > > atomic op, then it's a full barrier. So yeah you need more here. But
> > > also since you only need a read barrier on one side, and a write
> > > barrier on the other, you don't actually need a cpu barriers on x86.
> > > And READ_ONCE gives you the compiler barrier on one side at least, I
> > > haven't found it on the writer side yet.
> > > 
> > > > But yes a comment would be really nice here. I had to think for a while
> > > > why we don't need this as well.
> > > I'm typing a patch, which after a night's sleep I realized has the
> > > wrong barriers. And now I'm also typing some doc improvements for
> > > drm_sched_entity and related functions.
> > > 
> > > > Christian.
> > > > 
> > > > > -Daniel
> > > > > 
> > > > > > Christian.
> > > > > > 
> > > > > > 
> > > > > > > -Daniel
> > > > > > > 
> > > > > > > > Regards
> > > > > > > > Christian.
> > > > > > > > 
> > > > > > > > > -Daniel
> > > > > > > > > 
> > > > > > > > > > Christian.
> > > > > > > > > > 
> > > > > > > > > > > Also improve the kerneldoc for this.
> > > > > > > > > > > 
> > > > > > > > > > > Acked-by: Steven Price <steven.price@arm.com> (v2)
> > > > > > > > > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > > > > > > > > Cc: Lucas Stach <l.stach@pengutronix.de>
> > > > > > > > > > > Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> > > > > > > > > > > Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> > > > > > > > > > > Cc: Qiang Yu <yuq825@gmail.com>
> > > > > > > > > > > Cc: Rob Herring <robh@kernel.org>
> > > > > > > > > > > Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> > > > > > > > > > > Cc: Steven Price <steven.price@arm.com>
> > > > > > > > > > > Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> > > > > > > > > > > Cc: David Airlie <airlied@linux.ie>
> > > > > > > > > > > Cc: Daniel Vetter <daniel@ffwll.ch>
> > > > > > > > > > > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > > > > > > > > > > Cc: "Christian König" <christian.koenig@amd.com>
> > > > > > > > > > > Cc: Masahiro Yamada <masahiroy@kernel.org>
> > > > > > > > > > > Cc: Kees Cook <keescook@chromium.org>
> > > > > > > > > > > Cc: Adam Borowski <kilobyte@angband.pl>
> > > > > > > > > > > Cc: Nick Terrell <terrelln@fb.com>
> > > > > > > > > > > Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > > > > > > > > > > Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> > > > > > > > > > > Cc: Sami Tolvanen <samitolvanen@google.com>
> > > > > > > > > > > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > > > > > > > > > > Cc: Alex Deucher <alexander.deucher@amd.com>
> > > > > > > > > > > Cc: Dave Airlie <airlied@redhat.com>
> > > > > > > > > > > Cc: Nirmoy Das <nirmoy.das@amd.com>
> > > > > > > > > > > Cc: Deepak R Varma <mh12gx2825@gmail.com>
> > > > > > > > > > > Cc: Lee Jones <lee.jones@linaro.org>
> > > > > > > > > > > Cc: Kevin Wang <kevin1.wang@amd.com>
> > > > > > > > > > > Cc: Chen Li <chenli@uniontech.com>
> > > > > > > > > > > Cc: Luben Tuikov <luben.tuikov@amd.com>
> > > > > > > > > > > Cc: "Marek Olšák" <marek.olsak@amd.com>
> > > > > > > > > > > Cc: Dennis Li <Dennis.Li@amd.com>
> > > > > > > > > > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > > > > > > > > > Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > > > > > > > > > > Cc: Sonny Jiang <sonny.jiang@amd.com>
> > > > > > > > > > > Cc: Boris Brezillon <boris.brezillon@collabora.com>
> > > > > > > > > > > Cc: Tian Tao <tiantao6@hisilicon.com>
> > > > > > > > > > > Cc: Jack Zhang <Jack.Zhang1@amd.com>
> > > > > > > > > > > Cc: etnaviv@lists.freedesktop.org
> > > > > > > > > > > Cc: lima@lists.freedesktop.org
> > > > > > > > > > > Cc: linux-media@vger.kernel.org
> > > > > > > > > > > Cc: linaro-mm-sig@lists.linaro.org
> > > > > > > > > > > Cc: Emma Anholt <emma@anholt.net>
> > > > > > > > > > > ---
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
> > > > > > > > > > >       drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
> > > > > > > > > > >       drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
> > > > > > > > > > >       drivers/gpu/drm/lima/lima_sched.c        |  2 ++
> > > > > > > > > > >       drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
> > > > > > > > > > >       drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
> > > > > > > > > > >       drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
> > > > > > > > > > >       drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
> > > > > > > > > > >       drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
> > > > > > > > > > >       include/drm/gpu_scheduler.h              |  7 +++-
> > > > > > > > > > >       10 files changed, 74 insertions(+), 14 deletions(-)
> > > > > > > > > > > 
> > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > > > > > > > > index c5386d13eb4a..a4ec092af9a7 100644
> > > > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > > > > > > > > @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
> > > > > > > > > > >           if (r)
> > > > > > > > > > >                   goto error_unlock;
> > > > > > > > > > > 
> > > > > > > > > > > +     drm_sched_job_arm(&job->base);
> > > > > > > > > > > +
> > > > > > > > > > >           /* No memory allocation is allowed while holding the notifier lock.
> > > > > > > > > > >            * The lock is held until amdgpu_cs_submit is finished and fence is
> > > > > > > > > > >            * added to BOs.
> > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > > > > > > > > index d33e6d97cc89..5ddb955d2315 100644
> > > > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > > > > > > > > @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
> > > > > > > > > > >           if (r)
> > > > > > > > > > >                   return r;
> > > > > > > > > > > 
> > > > > > > > > > > +     drm_sched_job_arm(&job->base);
> > > > > > > > > > > +
> > > > > > > > > > >           *f = dma_fence_get(&job->base.s_fence->finished);
> > > > > > > > > > >           amdgpu_job_free_resources(job);
> > > > > > > > > > >           drm_sched_entity_push_job(&job->base, entity);
> > > > > > > > > > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > > > > > > > > > > index feb6da1b6ceb..05f412204118 100644
> > > > > > > > > > > --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > > > > > > > > > > +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > > > > > > > > > > @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
> > > > > > > > > > >           if (ret)
> > > > > > > > > > >                   goto out_unlock;
> > > > > > > > > > > 
> > > > > > > > > > > +     drm_sched_job_arm(&submit->sched_job);
> > > > > > > > > > > +
> > > > > > > > > > >           submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
> > > > > > > > > > >           submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
> > > > > > > > > > >                                                   submit->out_fence, 0,
> > > > > > > > > > > diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> > > > > > > > > > > index dba8329937a3..38f755580507 100644
> > > > > > > > > > > --- a/drivers/gpu/drm/lima/lima_sched.c
> > > > > > > > > > > +++ b/drivers/gpu/drm/lima/lima_sched.c
> > > > > > > > > > > @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
> > > > > > > > > > >                   return err;
> > > > > > > > > > >           }
> > > > > > > > > > > 
> > > > > > > > > > > +     drm_sched_job_arm(&task->base);
> > > > > > > > > > > +
> > > > > > > > > > >           task->num_bos = num_bos;
> > > > > > > > > > >           task->vm = lima_vm_get(vm);
> > > > > > > > > > > 
> > > > > > > > > > > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> > > > > > > > > > > index 71a72fb50e6b..2992dc85325f 100644
> > > > > > > > > > > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > > > > > > > > > > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > > > > > > > > > > @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
> > > > > > > > > > >                   goto unlock;
> > > > > > > > > > >           }
> > > > > > > > > > > 
> > > > > > > > > > > +     drm_sched_job_arm(&job->base);
> > > > > > > > > > > +
> > > > > > > > > > >           job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
> > > > > > > > > > > 
> > > > > > > > > > >           ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> > > > > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > > > > > > > > > > index 79554aa4dbb1..f7347c284886 100644
> > > > > > > > > > > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > > > > > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > > > > > > > > > > @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> > > > > > > > > > >        * @sched_job: job to submit
> > > > > > > > > > >        * @entity: scheduler entity
> > > > > > > > > > >        *
> > > > > > > > > > > - * Note: To guarantee that the order of insertion to queue matches
> > > > > > > > > > > - * the job's fence sequence number this function should be
> > > > > > > > > > > - * called with drm_sched_job_init under common lock.
> > > > > > > > > > > + * Note: To guarantee that the order of insertion to queue matches the job's
> > > > > > > > > > > + * fence sequence number this function should be called with drm_sched_job_arm()
> > > > > > > > > > > + * under common lock.
> > > > > > > > > > >        *
> > > > > > > > > > >        * Returns 0 for success, negative error code otherwise.
> > > > > > > > > > >        */
> > > > > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> > > > > > > > > > > index 69de2c76731f..c451ee9a30d7 100644
> > > > > > > > > > > --- a/drivers/gpu/drm/scheduler/sched_fence.c
> > > > > > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> > > > > > > > > > > @@ -90,7 +90,7 @@ static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
> > > > > > > > > > >        *
> > > > > > > > > > >        * Free up the fence memory after the RCU grace period.
> > > > > > > > > > >        */
> > > > > > > > > > > -static void drm_sched_fence_free(struct rcu_head *rcu)
> > > > > > > > > > > +void drm_sched_fence_free(struct rcu_head *rcu)
> > > > > > > > > > >       {
> > > > > > > > > > >           struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
> > > > > > > > > > >           struct drm_sched_fence *fence = to_drm_sched_fence(f);
> > > > > > > > > > > @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> > > > > > > > > > >       }
> > > > > > > > > > >       EXPORT_SYMBOL(to_drm_sched_fence);
> > > > > > > > > > > 
> > > > > > > > > > > -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> > > > > > > > > > > -                                            void *owner)
> > > > > > > > > > > +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> > > > > > > > > > > +                                           void *owner)
> > > > > > > > > > >       {
> > > > > > > > > > >           struct drm_sched_fence *fence = NULL;
> > > > > > > > > > > -     unsigned seq;
> > > > > > > > > > > 
> > > > > > > > > > >           fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
> > > > > > > > > > >           if (fence == NULL)
> > > > > > > > > > > @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> > > > > > > > > > >           fence->sched = entity->rq->sched;
> > > > > > > > > > >           spin_lock_init(&fence->lock);
> > > > > > > > > > > 
> > > > > > > > > > > +     return fence;
> > > > > > > > > > > +}
> > > > > > > > > > > +
> > > > > > > > > > > +void drm_sched_fence_init(struct drm_sched_fence *fence,
> > > > > > > > > > > +                       struct drm_sched_entity *entity)
> > > > > > > > > > > +{
> > > > > > > > > > > +     unsigned seq;
> > > > > > > > > > > +
> > > > > > > > > > >           seq = atomic_inc_return(&entity->fence_seq);
> > > > > > > > > > >           dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
> > > > > > > > > > >                          &fence->lock, entity->fence_context, seq);
> > > > > > > > > > >           dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
> > > > > > > > > > >                          &fence->lock, entity->fence_context + 1, seq);
> > > > > > > > > > > -
> > > > > > > > > > > -     return fence;
> > > > > > > > > > >       }
> > > > > > > > > > > 
> > > > > > > > > > >       module_init(drm_sched_fence_slab_init);
> > > > > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > > > > > index 33c414d55fab..5e84e1500c32 100644
> > > > > > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > > > > > @@ -48,9 +48,11 @@
> > > > > > > > > > >       #include <linux/wait.h>
> > > > > > > > > > >       #include <linux/sched.h>
> > > > > > > > > > >       #include <linux/completion.h>
> > > > > > > > > > > +#include <linux/dma-resv.h>
> > > > > > > > > > >       #include <uapi/linux/sched/types.h>
> > > > > > > > > > > 
> > > > > > > > > > >       #include <drm/drm_print.h>
> > > > > > > > > > > +#include <drm/drm_gem.h>
> > > > > > > > > > >       #include <drm/gpu_scheduler.h>
> > > > > > > > > > >       #include <drm/spsc_queue.h>
> > > > > > > > > > > 
> > > > > > > > > > > @@ -569,7 +571,6 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
> > > > > > > > > > > 
> > > > > > > > > > >       /**
> > > > > > > > > > >        * drm_sched_job_init - init a scheduler job
> > > > > > > > > > > - *
> > > > > > > > > > >        * @job: scheduler job to init
> > > > > > > > > > >        * @entity: scheduler entity to use
> > > > > > > > > > >        * @owner: job owner for debugging
> > > > > > > > > > > @@ -577,6 +578,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
> > > > > > > > > > >        * Refer to drm_sched_entity_push_job() documentation
> > > > > > > > > > >        * for locking considerations.
> > > > > > > > > > >        *
> > > > > > > > > > > + * Drivers must make sure drm_sched_job_cleanup() if this function returns
> > > > > > > > > > > + * successfully, even when @job is aborted before drm_sched_job_arm() is called.
> > > > > > > > > > > + *
> > > > > > > > > > >        * Returns 0 for success, negative error code otherwise.
> > > > > > > > > > >        */
> > > > > > > > > > >       int drm_sched_job_init(struct drm_sched_job *job,
> > > > > > > > > > > @@ -594,7 +598,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> > > > > > > > > > >           job->sched = sched;
> > > > > > > > > > >           job->entity = entity;
> > > > > > > > > > >           job->s_priority = entity->rq - sched->sched_rq;
> > > > > > > > > > > -     job->s_fence = drm_sched_fence_create(entity, owner);
> > > > > > > > > > > +     job->s_fence = drm_sched_fence_alloc(entity, owner);
> > > > > > > > > > >           if (!job->s_fence)
> > > > > > > > > > >                   return -ENOMEM;
> > > > > > > > > > >           job->id = atomic64_inc_return(&sched->job_id_count);
> > > > > > > > > > > @@ -606,13 +610,47 @@ int drm_sched_job_init(struct drm_sched_job *job,
> > > > > > > > > > >       EXPORT_SYMBOL(drm_sched_job_init);
> > > > > > > > > > > 
> > > > > > > > > > >       /**
> > > > > > > > > > > - * drm_sched_job_cleanup - clean up scheduler job resources
> > > > > > > > > > > + * drm_sched_job_arm - arm a scheduler job for execution
> > > > > > > > > > > + * @job: scheduler job to arm
> > > > > > > > > > > + *
> > > > > > > > > > > + * This arms a scheduler job for execution. Specifically it initializes the
> > > > > > > > > > > + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
> > > > > > > > > > > + * or other places that need to track the completion of this job.
> > > > > > > > > > > + *
> > > > > > > > > > > + * Refer to drm_sched_entity_push_job() documentation for locking
> > > > > > > > > > > + * considerations.
> > > > > > > > > > >        *
> > > > > > > > > > > + * This can only be called if drm_sched_job_init() succeeded.
> > > > > > > > > > > + */
> > > > > > > > > > > +void drm_sched_job_arm(struct drm_sched_job *job)
> > > > > > > > > > > +{
> > > > > > > > > > > +     drm_sched_fence_init(job->s_fence, job->entity);
> > > > > > > > > > > +}
> > > > > > > > > > > +EXPORT_SYMBOL(drm_sched_job_arm);
> > > > > > > > > > > +
> > > > > > > > > > > +/**
> > > > > > > > > > > + * drm_sched_job_cleanup - clean up scheduler job resources
> > > > > > > > > > >        * @job: scheduler job to clean up
> > > > > > > > > > > + *
> > > > > > > > > > > + * Cleans up the resources allocated with drm_sched_job_init().
> > > > > > > > > > > + *
> > > > > > > > > > > + * Drivers should call this from their error unwind code if @job is aborted
> > > > > > > > > > > + * before drm_sched_job_arm() is called.
> > > > > > > > > > > + *
> > > > > > > > > > > + * After that point of no return @job is committed to be executed by the
> > > > > > > > > > > + * scheduler, and this function should be called from the
> > > > > > > > > > > + * &drm_sched_backend_ops.free_job callback.
> > > > > > > > > > >        */
> > > > > > > > > > >       void drm_sched_job_cleanup(struct drm_sched_job *job)
> > > > > > > > > > >       {
> > > > > > > > > > > -     dma_fence_put(&job->s_fence->finished);
> > > > > > > > > > > +     if (!kref_read(&job->s_fence->finished.refcount)) {
> > > > > > > > > > > +             /* drm_sched_job_arm() has been called */
> > > > > > > > > > > +             dma_fence_put(&job->s_fence->finished);
> > > > > > > > > > > +     } else {
> > > > > > > > > > > +             /* aborted job before committing to run it */
> > > > > > > > > > > +             drm_sched_fence_free(&job->s_fence->finished.rcu);
> > > > > > > > > > > +     }
> > > > > > > > > > > +
> > > > > > > > > > >           job->s_fence = NULL;
> > > > > > > > > > >       }
> > > > > > > > > > >       EXPORT_SYMBOL(drm_sched_job_cleanup);
> > > > > > > > > > > diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> > > > > > > > > > > index 4eb354226972..5c3a99027ecd 100644
> > > > > > > > > > > --- a/drivers/gpu/drm/v3d/v3d_gem.c
> > > > > > > > > > > +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> > > > > > > > > > > @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
> > > > > > > > > > >           if (ret)
> > > > > > > > > > >                   return ret;
> > > > > > > > > > > 
> > > > > > > > > > > +     drm_sched_job_arm(&job->base);
> > > > > > > > > > > +
> > > > > > > > > > >           job->done_fence = dma_fence_get(&job->base.s_fence->finished);
> > > > > > > > > > > 
> > > > > > > > > > >           /* put by scheduler job completion */
> > > > > > > > > > > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > > > > > > > > > > index 88ae7f331bb1..83afc3aa8e2f 100644
> > > > > > > > > > > --- a/include/drm/gpu_scheduler.h
> > > > > > > > > > > +++ b/include/drm/gpu_scheduler.h
> > > > > > > > > > > @@ -348,6 +348,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
> > > > > > > > > > >       int drm_sched_job_init(struct drm_sched_job *job,
> > > > > > > > > > >                          struct drm_sched_entity *entity,
> > > > > > > > > > >                          void *owner);
> > > > > > > > > > > +void drm_sched_job_arm(struct drm_sched_job *job);
> > > > > > > > > > >       void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> > > > > > > > > > >                                       struct drm_gpu_scheduler **sched_list,
> > > > > > > > > > >                                          unsigned int num_sched_list);
> > > > > > > > > > > @@ -387,8 +388,12 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> > > > > > > > > > >                                      enum drm_sched_priority priority);
> > > > > > > > > > >       bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> > > > > > > > > > > 
> > > > > > > > > > > -struct drm_sched_fence *drm_sched_fence_create(
> > > > > > > > > > > +struct drm_sched_fence *drm_sched_fence_alloc(
> > > > > > > > > > >           struct drm_sched_entity *s_entity, void *owner);
> > > > > > > > > > > +void drm_sched_fence_init(struct drm_sched_fence *fence,
> > > > > > > > > > > +                       struct drm_sched_entity *entity);
> > > > > > > > > > > +void drm_sched_fence_free(struct rcu_head *rcu);
> > > > > > > > > > > +
> > > > > > > > > > >       void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
> > > > > > > > > > >       void drm_sched_fence_finished(struct drm_sched_fence *fence);
> > > > > > > > > > > 
> > > 
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C1ac51fc78f9f4e2f08a808d941e0c013%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637613255881294371%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=PRGZl6tUAc7FrL39mu%2BBV2AfC02Mz9R2Neqs5TjdB6M%3D&amp;reserved=0
> > 
> > 
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C1ac51fc78f9f4e2f08a808d941e0c013%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637613255881294371%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=PRGZl6tUAc7FrL39mu%2BBV2AfC02Mz9R2Neqs5TjdB6M%3D&amp;reserved=0
>
Christian König July 8, 2021, 10:54 a.m. UTC | #12
Am 08.07.21 um 12:02 schrieb Daniel Vetter:
> On Thu, Jul 08, 2021 at 09:53:00AM +0200, Christian König wrote:
>> Am 08.07.21 um 09:19 schrieb Daniel Vetter:
>>> On Thu, Jul 8, 2021 at 9:09 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>>> On Thu, Jul 8, 2021 at 8:56 AM Christian König <christian.koenig@amd.com> wrote:
>>>>> Am 07.07.21 um 18:32 schrieb Daniel Vetter:
>>>>>> On Wed, Jul 7, 2021 at 2:58 PM Christian König <christian.koenig@amd.com> wrote:
>>>>>>> Am 07.07.21 um 14:13 schrieb Daniel Vetter:
>>>>>>>> On Wed, Jul 7, 2021 at 1:57 PM Christian König <christian.koenig@amd.com> wrote:
>>>>>>>>> Am 07.07.21 um 13:14 schrieb Daniel Vetter:
>>>>>>>>>> On Wed, Jul 7, 2021 at 11:30 AM Christian König
>>>>>>>>>> <christian.koenig@amd.com> wrote:
>>>>>>>>>>> Am 02.07.21 um 23:38 schrieb Daniel Vetter:
>>>>>>>>>>>> This is a very confusingly named function, because not just does it
>>>>>>>>>>>> init an object, it arms it and provides a point of no return for
>>>>>>>>>>>> pushing a job into the scheduler. It would be nice if that's a bit
>>>>>>>>>>>> clearer in the interface.
>>>>>>>>>>>>
>>>>>>>>>>>> But the real reason is that I want to push the dependency tracking
>>>>>>>>>>>> helpers into the scheduler code, and that means drm_sched_job_init
>>>>>>>>>>>> must be called a lot earlier, without arming the job.
>>>>>>>>>>>>
>>>>>>>>>>>> v2:
>>>>>>>>>>>> - don't change .gitignore (Steven)
>>>>>>>>>>>> - don't forget v3d (Emma)
>>>>>>>>>>>>
>>>>>>>>>>>> v3: Emma noticed that I leak the memory allocated in
>>>>>>>>>>>> drm_sched_job_init if we bail out before the point of no return in
>>>>>>>>>>>> subsequent driver patches. To be able to fix this change
>>>>>>>>>>>> drm_sched_job_cleanup() so it can handle being called both before and
>>>>>>>>>>>> after drm_sched_job_arm().
>>>>>>>>>>> Thinking more about this, I'm not sure if this really works.
>>>>>>>>>>>
>>>>>>>>>>> See drm_sched_job_init() was also calling drm_sched_entity_select_rq()
>>>>>>>>>>> to update the entity->rq association.
>>>>>>>>>>>
>>>>>>>>>>> And that can only be done later on when we arm the fence as well.
>>>>>>>>>> Hm yeah, but that's a bug in the existing code I think: We already
>>>>>>>>>> fail to clean up if we fail to allocate the fences. So I think the
>>>>>>>>>> right thing to do here is to split the checks into job_init, and do
>>>>>>>>>> the actual arming/rq selection in job_arm? I'm not entirely sure
>>>>>>>>>> what's all going on there, the first check looks a bit like trying to
>>>>>>>>>> schedule before the entity is set up, which is a driver bug and should
>>>>>>>>>> have a WARN_ON?
>>>>>>>>> No you misunderstood me, the problem is something else.
>>>>>>>>>
>>>>>>>>> You asked previously why the call to drm_sched_job_init() was so late in
>>>>>>>>> the CS.
>>>>>>>>>
>>>>>>>>> The reason for this was not alone the scheduler fence init, but also the
>>>>>>>>> call to drm_sched_entity_select_rq().
>>>>>>>> Ah ok, I think I can fix that. Needs a prep patch to first make
>>>>>>>> drm_sched_entity_select infallible, then should be easy to do.
>>>>>>>>
>>>>>>>>>> The 2nd check around last_scheduled I have honeslty no idea what it's
>>>>>>>>>> even trying to do.
>>>>>>>>> You mean that here?
>>>>>>>>>
>>>>>>>>>              fence = READ_ONCE(entity->last_scheduled);
>>>>>>>>>              if (fence && !dma_fence_is_signaled(fence))
>>>>>>>>>                      return;
>>>>>>>>>
>>>>>>>>> This makes sure that load balancing is not moving the entity to a
>>>>>>>>> different scheduler while there are still jobs running from this entity
>>>>>>>>> on the hardware,
>>>>>>>> Yeah after a nap that idea crossed my mind too. But now I have locking
>>>>>>>> questions, afaiui the scheduler thread updates this, without taking
>>>>>>>> any locks - entity dequeuing is lockless. And here we read the fence
>>>>>>>> and then seem to yolo check whether it's signalled? What's preventing
>>>>>>>> a use-after-free here? There's no rcu or anything going on here at
>>>>>>>> all, and it's outside of the spinlock section, which starts a bit
>>>>>>>> further down.
>>>>>>> The last_scheduled fence of an entity can only change when there are
>>>>>>> jobs on the entities queued, and we have just ruled that out in the
>>>>>>> check before.
>>>>>> There aren't any barriers, so the cpu could easily run the two checks
>>>>>> the other way round. I'll ponder this and figure out where exactly we
>>>>>> need docs for the constraint and/or barriers to make this work as
>>>>>> intended. As-is I'm not seeing how it does ...
>>>>> spsc_queue_count() provides the necessary barrier with the atomic_read().
>>>> atomic_t is fully unordered, except when it's a read-modify-write
>>> Wasn't awake yet, I think the rule is read-modify-write and return
>>> previous value gives you full barrier. So stuff like cmpxchg, but also
>>> a few others. See atomic_t.txt under ODERING heading (yes that
>>> maintainer refuses to accept .rst so I can't just link you to the
>>> right section, it's silly). get/set and even RMW atomic ops that don't
>>> return anything are all fully unordered.
>> As far as I know that not completely correct. The rules around atomics i
>> once learned are:
>>
>> 1. Everything which modifies something is a write barrier.
>> 2. Everything which returns something is a read barrier.
>>
>> And I know a whole bunch of use cases where this is relied upon in the core
>> kernel, so I'm pretty sure that's correct.
> That's against what the doc says, and also it would mean stuff like
> atomic_read_acquire or smp_mb__after/before_atomic is completely pointless.
>
> On x86 you're right, anywhere else where there's no total store ordering I
> you're wrong.

Good to know. I always thought that atomic_read_acquire() was just for 
documentation purpose.



> If there's code that relies on this it needs to be fixed and properly
> documented. I did go through the squeue code a bit, and might be better to
> just replace this with a core data structure.

Well the spsc was especially crafted for this use case and performed 
quite a bit better then a double linked list.

Or what core data structure do you have in mind?

Christian.

> -Daniel
>
>> In this case the write barrier is the atomic_dec() in spsc_queue_pop() and
>> the read barrier is the aromic_read() in spsc_queue_count().
>>
>> The READ_ONCE() is actually not even necessary as far as I can see.
>>
>> Christian.
>>
>>> -Daniel
>>>
>>>
>>>> atomic op, then it's a full barrier. So yeah you need more here. But
>>>> also since you only need a read barrier on one side, and a write
>>>> barrier on the other, you don't actually need a cpu barriers on x86.
>>>> And READ_ONCE gives you the compiler barrier on one side at least, I
>>>> haven't found it on the writer side yet.
>>>>
>>>>> But yes a comment would be really nice here. I had to think for a while
>>>>> why we don't need this as well.
>>>> I'm typing a patch, which after a night's sleep I realized has the
>>>> wrong barriers. And now I'm also typing some doc improvements for
>>>> drm_sched_entity and related functions.
>>>>
>>>>> Christian.
>>>>>
>>>>>> -Daniel
>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>>>
>>>>>>>> -Daniel
>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>> -Daniel
>>>>>>>>>>
>>>>>>>>>>> Christian.
>>>>>>>>>>>
>>>>>>>>>>>> Also improve the kerneldoc for this.
>>>>>>>>>>>>
>>>>>>>>>>>> Acked-by: Steven Price <steven.price@arm.com> (v2)
>>>>>>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>>>>>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>>>>>>>>>>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
>>>>>>>>>>>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
>>>>>>>>>>>> Cc: Qiang Yu <yuq825@gmail.com>
>>>>>>>>>>>> Cc: Rob Herring <robh@kernel.org>
>>>>>>>>>>>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
>>>>>>>>>>>> Cc: Steven Price <steven.price@arm.com>
>>>>>>>>>>>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
>>>>>>>>>>>> Cc: David Airlie <airlied@linux.ie>
>>>>>>>>>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>>>>>>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>>>>>>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>>>>>>>>> Cc: Masahiro Yamada <masahiroy@kernel.org>
>>>>>>>>>>>> Cc: Kees Cook <keescook@chromium.org>
>>>>>>>>>>>> Cc: Adam Borowski <kilobyte@angband.pl>
>>>>>>>>>>>> Cc: Nick Terrell <terrelln@fb.com>
>>>>>>>>>>>> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
>>>>>>>>>>>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
>>>>>>>>>>>> Cc: Sami Tolvanen <samitolvanen@google.com>
>>>>>>>>>>>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
>>>>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>>>>>>>>> Cc: Dave Airlie <airlied@redhat.com>
>>>>>>>>>>>> Cc: Nirmoy Das <nirmoy.das@amd.com>
>>>>>>>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>>>>>>>>>>> Cc: Lee Jones <lee.jones@linaro.org>
>>>>>>>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>>>>>>>>>>> Cc: Chen Li <chenli@uniontech.com>
>>>>>>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>>>>>>>>> Cc: "Marek Olšák" <marek.olsak@amd.com>
>>>>>>>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>>>>>>>>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>>>>>>>>>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>>>>>>> Cc: Sonny Jiang <sonny.jiang@amd.com>
>>>>>>>>>>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
>>>>>>>>>>>> Cc: Tian Tao <tiantao6@hisilicon.com>
>>>>>>>>>>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
>>>>>>>>>>>> Cc: etnaviv@lists.freedesktop.org
>>>>>>>>>>>> Cc: lima@lists.freedesktop.org
>>>>>>>>>>>> Cc: linux-media@vger.kernel.org
>>>>>>>>>>>> Cc: linaro-mm-sig@lists.linaro.org
>>>>>>>>>>>> Cc: Emma Anholt <emma@anholt.net>
>>>>>>>>>>>> ---
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
>>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
>>>>>>>>>>>>        drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
>>>>>>>>>>>>        drivers/gpu/drm/lima/lima_sched.c        |  2 ++
>>>>>>>>>>>>        drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
>>>>>>>>>>>>        drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
>>>>>>>>>>>>        drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
>>>>>>>>>>>>        drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
>>>>>>>>>>>>        drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
>>>>>>>>>>>>        include/drm/gpu_scheduler.h              |  7 +++-
>>>>>>>>>>>>        10 files changed, 74 insertions(+), 14 deletions(-)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>>> index c5386d13eb4a..a4ec092af9a7 100644
>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>>> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>>>>>>>>>>>>            if (r)
>>>>>>>>>>>>                    goto error_unlock;
>>>>>>>>>>>>
>>>>>>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>>>>>>> +
>>>>>>>>>>>>            /* No memory allocation is allowed while holding the notifier lock.
>>>>>>>>>>>>             * The lock is held until amdgpu_cs_submit is finished and fence is
>>>>>>>>>>>>             * added to BOs.
>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>>>>>>> index d33e6d97cc89..5ddb955d2315 100644
>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>>>>>>> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
>>>>>>>>>>>>            if (r)
>>>>>>>>>>>>                    return r;
>>>>>>>>>>>>
>>>>>>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>>>>>>> +
>>>>>>>>>>>>            *f = dma_fence_get(&job->base.s_fence->finished);
>>>>>>>>>>>>            amdgpu_job_free_resources(job);
>>>>>>>>>>>>            drm_sched_entity_push_job(&job->base, entity);
>>>>>>>>>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>>>>>>> index feb6da1b6ceb..05f412204118 100644
>>>>>>>>>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>>>>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>>>>>>> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
>>>>>>>>>>>>            if (ret)
>>>>>>>>>>>>                    goto out_unlock;
>>>>>>>>>>>>
>>>>>>>>>>>> +     drm_sched_job_arm(&submit->sched_job);
>>>>>>>>>>>> +
>>>>>>>>>>>>            submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
>>>>>>>>>>>>            submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
>>>>>>>>>>>>                                                    submit->out_fence, 0,
>>>>>>>>>>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>>>>>>>>>>> index dba8329937a3..38f755580507 100644
>>>>>>>>>>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>>>>>>>>>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>>>>>>>>>>> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
>>>>>>>>>>>>                    return err;
>>>>>>>>>>>>            }
>>>>>>>>>>>>
>>>>>>>>>>>> +     drm_sched_job_arm(&task->base);
>>>>>>>>>>>> +
>>>>>>>>>>>>            task->num_bos = num_bos;
>>>>>>>>>>>>            task->vm = lima_vm_get(vm);
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>>>>>>> index 71a72fb50e6b..2992dc85325f 100644
>>>>>>>>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>>>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>>>>>>> @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
>>>>>>>>>>>>                    goto unlock;
>>>>>>>>>>>>            }
>>>>>>>>>>>>
>>>>>>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>>>>>>> +
>>>>>>>>>>>>            job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>>>>>>>>>>>>
>>>>>>>>>>>>            ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
>>>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>>>>>>>>>>>> index 79554aa4dbb1..f7347c284886 100644
>>>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>>>>>>>>>>> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>>>>>>>>>>>>         * @sched_job: job to submit
>>>>>>>>>>>>         * @entity: scheduler entity
>>>>>>>>>>>>         *
>>>>>>>>>>>> - * Note: To guarantee that the order of insertion to queue matches
>>>>>>>>>>>> - * the job's fence sequence number this function should be
>>>>>>>>>>>> - * called with drm_sched_job_init under common lock.
>>>>>>>>>>>> + * Note: To guarantee that the order of insertion to queue matches the job's
>>>>>>>>>>>> + * fence sequence number this function should be called with drm_sched_job_arm()
>>>>>>>>>>>> + * under common lock.
>>>>>>>>>>>>         *
>>>>>>>>>>>>         * Returns 0 for success, negative error code otherwise.
>>>>>>>>>>>>         */
>>>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
>>>>>>>>>>>> index 69de2c76731f..c451ee9a30d7 100644
>>>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
>>>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
>>>>>>>>>>>> @@ -90,7 +90,7 @@ static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
>>>>>>>>>>>>         *
>>>>>>>>>>>>         * Free up the fence memory after the RCU grace period.
>>>>>>>>>>>>         */
>>>>>>>>>>>> -static void drm_sched_fence_free(struct rcu_head *rcu)
>>>>>>>>>>>> +void drm_sched_fence_free(struct rcu_head *rcu)
>>>>>>>>>>>>        {
>>>>>>>>>>>>            struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
>>>>>>>>>>>>            struct drm_sched_fence *fence = to_drm_sched_fence(f);
>>>>>>>>>>>> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>>>>>>>>>>>>        }
>>>>>>>>>>>>        EXPORT_SYMBOL(to_drm_sched_fence);
>>>>>>>>>>>>
>>>>>>>>>>>> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>>>>>>>>>>>> -                                            void *owner)
>>>>>>>>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
>>>>>>>>>>>> +                                           void *owner)
>>>>>>>>>>>>        {
>>>>>>>>>>>>            struct drm_sched_fence *fence = NULL;
>>>>>>>>>>>> -     unsigned seq;
>>>>>>>>>>>>
>>>>>>>>>>>>            fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
>>>>>>>>>>>>            if (fence == NULL)
>>>>>>>>>>>> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>>>>>>>>>>>>            fence->sched = entity->rq->sched;
>>>>>>>>>>>>            spin_lock_init(&fence->lock);
>>>>>>>>>>>>
>>>>>>>>>>>> +     return fence;
>>>>>>>>>>>> +}
>>>>>>>>>>>> +
>>>>>>>>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
>>>>>>>>>>>> +                       struct drm_sched_entity *entity)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +     unsigned seq;
>>>>>>>>>>>> +
>>>>>>>>>>>>            seq = atomic_inc_return(&entity->fence_seq);
>>>>>>>>>>>>            dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
>>>>>>>>>>>>                           &fence->lock, entity->fence_context, seq);
>>>>>>>>>>>>            dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
>>>>>>>>>>>>                           &fence->lock, entity->fence_context + 1, seq);
>>>>>>>>>>>> -
>>>>>>>>>>>> -     return fence;
>>>>>>>>>>>>        }
>>>>>>>>>>>>
>>>>>>>>>>>>        module_init(drm_sched_fence_slab_init);
>>>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>>>>>> index 33c414d55fab..5e84e1500c32 100644
>>>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>>>>>> @@ -48,9 +48,11 @@
>>>>>>>>>>>>        #include <linux/wait.h>
>>>>>>>>>>>>        #include <linux/sched.h>
>>>>>>>>>>>>        #include <linux/completion.h>
>>>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>>>        #include <uapi/linux/sched/types.h>
>>>>>>>>>>>>
>>>>>>>>>>>>        #include <drm/drm_print.h>
>>>>>>>>>>>> +#include <drm/drm_gem.h>
>>>>>>>>>>>>        #include <drm/gpu_scheduler.h>
>>>>>>>>>>>>        #include <drm/spsc_queue.h>
>>>>>>>>>>>>
>>>>>>>>>>>> @@ -569,7 +571,6 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
>>>>>>>>>>>>
>>>>>>>>>>>>        /**
>>>>>>>>>>>>         * drm_sched_job_init - init a scheduler job
>>>>>>>>>>>> - *
>>>>>>>>>>>>         * @job: scheduler job to init
>>>>>>>>>>>>         * @entity: scheduler entity to use
>>>>>>>>>>>>         * @owner: job owner for debugging
>>>>>>>>>>>> @@ -577,6 +578,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
>>>>>>>>>>>>         * Refer to drm_sched_entity_push_job() documentation
>>>>>>>>>>>>         * for locking considerations.
>>>>>>>>>>>>         *
>>>>>>>>>>>> + * Drivers must make sure drm_sched_job_cleanup() if this function returns
>>>>>>>>>>>> + * successfully, even when @job is aborted before drm_sched_job_arm() is called.
>>>>>>>>>>>> + *
>>>>>>>>>>>>         * Returns 0 for success, negative error code otherwise.
>>>>>>>>>>>>         */
>>>>>>>>>>>>        int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>>>>>> @@ -594,7 +598,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>>>>>>            job->sched = sched;
>>>>>>>>>>>>            job->entity = entity;
>>>>>>>>>>>>            job->s_priority = entity->rq - sched->sched_rq;
>>>>>>>>>>>> -     job->s_fence = drm_sched_fence_create(entity, owner);
>>>>>>>>>>>> +     job->s_fence = drm_sched_fence_alloc(entity, owner);
>>>>>>>>>>>>            if (!job->s_fence)
>>>>>>>>>>>>                    return -ENOMEM;
>>>>>>>>>>>>            job->id = atomic64_inc_return(&sched->job_id_count);
>>>>>>>>>>>> @@ -606,13 +610,47 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>>>>>>        EXPORT_SYMBOL(drm_sched_job_init);
>>>>>>>>>>>>
>>>>>>>>>>>>        /**
>>>>>>>>>>>> - * drm_sched_job_cleanup - clean up scheduler job resources
>>>>>>>>>>>> + * drm_sched_job_arm - arm a scheduler job for execution
>>>>>>>>>>>> + * @job: scheduler job to arm
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * This arms a scheduler job for execution. Specifically it initializes the
>>>>>>>>>>>> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
>>>>>>>>>>>> + * or other places that need to track the completion of this job.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Refer to drm_sched_entity_push_job() documentation for locking
>>>>>>>>>>>> + * considerations.
>>>>>>>>>>>>         *
>>>>>>>>>>>> + * This can only be called if drm_sched_job_init() succeeded.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +void drm_sched_job_arm(struct drm_sched_job *job)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +     drm_sched_fence_init(job->s_fence, job->entity);
>>>>>>>>>>>> +}
>>>>>>>>>>>> +EXPORT_SYMBOL(drm_sched_job_arm);
>>>>>>>>>>>> +
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_sched_job_cleanup - clean up scheduler job resources
>>>>>>>>>>>>         * @job: scheduler job to clean up
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Cleans up the resources allocated with drm_sched_job_init().
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Drivers should call this from their error unwind code if @job is aborted
>>>>>>>>>>>> + * before drm_sched_job_arm() is called.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * After that point of no return @job is committed to be executed by the
>>>>>>>>>>>> + * scheduler, and this function should be called from the
>>>>>>>>>>>> + * &drm_sched_backend_ops.free_job callback.
>>>>>>>>>>>>         */
>>>>>>>>>>>>        void drm_sched_job_cleanup(struct drm_sched_job *job)
>>>>>>>>>>>>        {
>>>>>>>>>>>> -     dma_fence_put(&job->s_fence->finished);
>>>>>>>>>>>> +     if (!kref_read(&job->s_fence->finished.refcount)) {
>>>>>>>>>>>> +             /* drm_sched_job_arm() has been called */
>>>>>>>>>>>> +             dma_fence_put(&job->s_fence->finished);
>>>>>>>>>>>> +     } else {
>>>>>>>>>>>> +             /* aborted job before committing to run it */
>>>>>>>>>>>> +             drm_sched_fence_free(&job->s_fence->finished.rcu);
>>>>>>>>>>>> +     }
>>>>>>>>>>>> +
>>>>>>>>>>>>            job->s_fence = NULL;
>>>>>>>>>>>>        }
>>>>>>>>>>>>        EXPORT_SYMBOL(drm_sched_job_cleanup);
>>>>>>>>>>>> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
>>>>>>>>>>>> index 4eb354226972..5c3a99027ecd 100644
>>>>>>>>>>>> --- a/drivers/gpu/drm/v3d/v3d_gem.c
>>>>>>>>>>>> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
>>>>>>>>>>>> @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
>>>>>>>>>>>>            if (ret)
>>>>>>>>>>>>                    return ret;
>>>>>>>>>>>>
>>>>>>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>>>>>>> +
>>>>>>>>>>>>            job->done_fence = dma_fence_get(&job->base.s_fence->finished);
>>>>>>>>>>>>
>>>>>>>>>>>>            /* put by scheduler job completion */
>>>>>>>>>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>>>>>>>>>>> index 88ae7f331bb1..83afc3aa8e2f 100644
>>>>>>>>>>>> --- a/include/drm/gpu_scheduler.h
>>>>>>>>>>>> +++ b/include/drm/gpu_scheduler.h
>>>>>>>>>>>> @@ -348,6 +348,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
>>>>>>>>>>>>        int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>>>>>>                           struct drm_sched_entity *entity,
>>>>>>>>>>>>                           void *owner);
>>>>>>>>>>>> +void drm_sched_job_arm(struct drm_sched_job *job);
>>>>>>>>>>>>        void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>>>>>>>>>>>>                                        struct drm_gpu_scheduler **sched_list,
>>>>>>>>>>>>                                           unsigned int num_sched_list);
>>>>>>>>>>>> @@ -387,8 +388,12 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>>>>>>>>>>>>                                       enum drm_sched_priority priority);
>>>>>>>>>>>>        bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>>>>>>>>>>>>
>>>>>>>>>>>> -struct drm_sched_fence *drm_sched_fence_create(
>>>>>>>>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(
>>>>>>>>>>>>            struct drm_sched_entity *s_entity, void *owner);
>>>>>>>>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
>>>>>>>>>>>> +                       struct drm_sched_entity *entity);
>>>>>>>>>>>> +void drm_sched_fence_free(struct rcu_head *rcu);
>>>>>>>>>>>> +
>>>>>>>>>>>>        void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
>>>>>>>>>>>>        void drm_sched_fence_finished(struct drm_sched_fence *fence);
>>>>>>>>>>>>
>>>> --
>>>> Daniel Vetter
>>>> Software Engineer, Intel Corporation
>>>> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C7086790381b9415f60e708d941f78266%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637613353580226578%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=9GhYoGHD6TlcrW5dvT9Z%2BFukW%2F8%2BicK2t8180coEsJY%3D&amp;reserved=0
>>>
>>> --
>>> Daniel Vetter
>>> Software Engineer, Intel Corporation
>>> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C7086790381b9415f60e708d941f78266%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637613353580236571%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=Yt%2FirDjTmtDUjQS1xlYg4x5mz82cHkNyLPkNNpO31ro%3D&amp;reserved=0
Daniel Vetter July 8, 2021, 11:20 a.m. UTC | #13
On Thu, Jul 8, 2021 at 12:54 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 08.07.21 um 12:02 schrieb Daniel Vetter:
> > On Thu, Jul 08, 2021 at 09:53:00AM +0200, Christian König wrote:
> >> Am 08.07.21 um 09:19 schrieb Daniel Vetter:
> >>> On Thu, Jul 8, 2021 at 9:09 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >>>> On Thu, Jul 8, 2021 at 8:56 AM Christian König <christian.koenig@amd.com> wrote:
> >>>>> Am 07.07.21 um 18:32 schrieb Daniel Vetter:
> >>>>>> On Wed, Jul 7, 2021 at 2:58 PM Christian König <christian.koenig@amd.com> wrote:
> >>>>>>> Am 07.07.21 um 14:13 schrieb Daniel Vetter:
> >>>>>>>> On Wed, Jul 7, 2021 at 1:57 PM Christian König <christian.koenig@amd.com> wrote:
> >>>>>>>>> Am 07.07.21 um 13:14 schrieb Daniel Vetter:
> >>>>>>>>>> On Wed, Jul 7, 2021 at 11:30 AM Christian König
> >>>>>>>>>> <christian.koenig@amd.com> wrote:
> >>>>>>>>>>> Am 02.07.21 um 23:38 schrieb Daniel Vetter:
> >>>>>>>>>>>> This is a very confusingly named function, because not just does it
> >>>>>>>>>>>> init an object, it arms it and provides a point of no return for
> >>>>>>>>>>>> pushing a job into the scheduler. It would be nice if that's a bit
> >>>>>>>>>>>> clearer in the interface.
> >>>>>>>>>>>>
> >>>>>>>>>>>> But the real reason is that I want to push the dependency tracking
> >>>>>>>>>>>> helpers into the scheduler code, and that means drm_sched_job_init
> >>>>>>>>>>>> must be called a lot earlier, without arming the job.
> >>>>>>>>>>>>
> >>>>>>>>>>>> v2:
> >>>>>>>>>>>> - don't change .gitignore (Steven)
> >>>>>>>>>>>> - don't forget v3d (Emma)
> >>>>>>>>>>>>
> >>>>>>>>>>>> v3: Emma noticed that I leak the memory allocated in
> >>>>>>>>>>>> drm_sched_job_init if we bail out before the point of no return in
> >>>>>>>>>>>> subsequent driver patches. To be able to fix this change
> >>>>>>>>>>>> drm_sched_job_cleanup() so it can handle being called both before and
> >>>>>>>>>>>> after drm_sched_job_arm().
> >>>>>>>>>>> Thinking more about this, I'm not sure if this really works.
> >>>>>>>>>>>
> >>>>>>>>>>> See drm_sched_job_init() was also calling drm_sched_entity_select_rq()
> >>>>>>>>>>> to update the entity->rq association.
> >>>>>>>>>>>
> >>>>>>>>>>> And that can only be done later on when we arm the fence as well.
> >>>>>>>>>> Hm yeah, but that's a bug in the existing code I think: We already
> >>>>>>>>>> fail to clean up if we fail to allocate the fences. So I think the
> >>>>>>>>>> right thing to do here is to split the checks into job_init, and do
> >>>>>>>>>> the actual arming/rq selection in job_arm? I'm not entirely sure
> >>>>>>>>>> what's all going on there, the first check looks a bit like trying to
> >>>>>>>>>> schedule before the entity is set up, which is a driver bug and should
> >>>>>>>>>> have a WARN_ON?
> >>>>>>>>> No you misunderstood me, the problem is something else.
> >>>>>>>>>
> >>>>>>>>> You asked previously why the call to drm_sched_job_init() was so late in
> >>>>>>>>> the CS.
> >>>>>>>>>
> >>>>>>>>> The reason for this was not alone the scheduler fence init, but also the
> >>>>>>>>> call to drm_sched_entity_select_rq().
> >>>>>>>> Ah ok, I think I can fix that. Needs a prep patch to first make
> >>>>>>>> drm_sched_entity_select infallible, then should be easy to do.
> >>>>>>>>
> >>>>>>>>>> The 2nd check around last_scheduled I have honeslty no idea what it's
> >>>>>>>>>> even trying to do.
> >>>>>>>>> You mean that here?
> >>>>>>>>>
> >>>>>>>>>              fence = READ_ONCE(entity->last_scheduled);
> >>>>>>>>>              if (fence && !dma_fence_is_signaled(fence))
> >>>>>>>>>                      return;
> >>>>>>>>>
> >>>>>>>>> This makes sure that load balancing is not moving the entity to a
> >>>>>>>>> different scheduler while there are still jobs running from this entity
> >>>>>>>>> on the hardware,
> >>>>>>>> Yeah after a nap that idea crossed my mind too. But now I have locking
> >>>>>>>> questions, afaiui the scheduler thread updates this, without taking
> >>>>>>>> any locks - entity dequeuing is lockless. And here we read the fence
> >>>>>>>> and then seem to yolo check whether it's signalled? What's preventing
> >>>>>>>> a use-after-free here? There's no rcu or anything going on here at
> >>>>>>>> all, and it's outside of the spinlock section, which starts a bit
> >>>>>>>> further down.
> >>>>>>> The last_scheduled fence of an entity can only change when there are
> >>>>>>> jobs on the entities queued, and we have just ruled that out in the
> >>>>>>> check before.
> >>>>>> There aren't any barriers, so the cpu could easily run the two checks
> >>>>>> the other way round. I'll ponder this and figure out where exactly we
> >>>>>> need docs for the constraint and/or barriers to make this work as
> >>>>>> intended. As-is I'm not seeing how it does ...
> >>>>> spsc_queue_count() provides the necessary barrier with the atomic_read().
> >>>> atomic_t is fully unordered, except when it's a read-modify-write
> >>> Wasn't awake yet, I think the rule is read-modify-write and return
> >>> previous value gives you full barrier. So stuff like cmpxchg, but also
> >>> a few others. See atomic_t.txt under ODERING heading (yes that
> >>> maintainer refuses to accept .rst so I can't just link you to the
> >>> right section, it's silly). get/set and even RMW atomic ops that don't
> >>> return anything are all fully unordered.
> >> As far as I know that not completely correct. The rules around atomics i
> >> once learned are:
> >>
> >> 1. Everything which modifies something is a write barrier.
> >> 2. Everything which returns something is a read barrier.
> >>
> >> And I know a whole bunch of use cases where this is relied upon in the core
> >> kernel, so I'm pretty sure that's correct.
> > That's against what the doc says, and also it would mean stuff like
> > atomic_read_acquire or smp_mb__after/before_atomic is completely pointless.
> >
> > On x86 you're right, anywhere else where there's no total store ordering I
> > you're wrong.
>
> Good to know. I always thought that atomic_read_acquire() was just for
> documentation purpose.

Maybe you mixed it up with C++ atomics (which I think are now also in
C)? Those are strongly ordered by default (you can get the weakly
ordered kernel-style one too). It's a bit unfortunate that the default
semantics are exactly opposite between kernel and userspace :-/

> > If there's code that relies on this it needs to be fixed and properly
> > documented. I did go through the squeue code a bit, and might be better to
> > just replace this with a core data structure.
>
> Well the spsc was especially crafted for this use case and performed
> quite a bit better then a double linked list.

Yeah  double-linked list is awful.

> Or what core data structure do you have in mind?

Hm I thought there's a ready-made queue primitive, but there's just
llist.h. Which I think is roughly what the scheduler queue also does.
Minus the atomic_t for counting how many there are, and aside from the
tracepoints I don't think we're using those anywhere, we just check
for is_empty in the code (from a quick look only).
-Daniel

>
> Christian.
>
> > -Daniel
> >
> >> In this case the write barrier is the atomic_dec() in spsc_queue_pop() and
> >> the read barrier is the aromic_read() in spsc_queue_count().
> >>
> >> The READ_ONCE() is actually not even necessary as far as I can see.
> >>
> >> Christian.
> >>
> >>> -Daniel
> >>>
> >>>
> >>>> atomic op, then it's a full barrier. So yeah you need more here. But
> >>>> also since you only need a read barrier on one side, and a write
> >>>> barrier on the other, you don't actually need a cpu barriers on x86.
> >>>> And READ_ONCE gives you the compiler barrier on one side at least, I
> >>>> haven't found it on the writer side yet.
> >>>>
> >>>>> But yes a comment would be really nice here. I had to think for a while
> >>>>> why we don't need this as well.
> >>>> I'm typing a patch, which after a night's sleep I realized has the
> >>>> wrong barriers. And now I'm also typing some doc improvements for
> >>>> drm_sched_entity and related functions.
> >>>>
> >>>>> Christian.
> >>>>>
> >>>>>> -Daniel
> >>>>>>
> >>>>>>> Christian.
> >>>>>>>
> >>>>>>>
> >>>>>>>> -Daniel
> >>>>>>>>
> >>>>>>>>> Regards
> >>>>>>>>> Christian.
> >>>>>>>>>
> >>>>>>>>>> -Daniel
> >>>>>>>>>>
> >>>>>>>>>>> Christian.
> >>>>>>>>>>>
> >>>>>>>>>>>> Also improve the kerneldoc for this.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Acked-by: Steven Price <steven.price@arm.com> (v2)
> >>>>>>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>>>>>>>>>>> Cc: Lucas Stach <l.stach@pengutronix.de>
> >>>>>>>>>>>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> >>>>>>>>>>>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> >>>>>>>>>>>> Cc: Qiang Yu <yuq825@gmail.com>
> >>>>>>>>>>>> Cc: Rob Herring <robh@kernel.org>
> >>>>>>>>>>>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> >>>>>>>>>>>> Cc: Steven Price <steven.price@arm.com>
> >>>>>>>>>>>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> >>>>>>>>>>>> Cc: David Airlie <airlied@linux.ie>
> >>>>>>>>>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
> >>>>>>>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> >>>>>>>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
> >>>>>>>>>>>> Cc: Masahiro Yamada <masahiroy@kernel.org>
> >>>>>>>>>>>> Cc: Kees Cook <keescook@chromium.org>
> >>>>>>>>>>>> Cc: Adam Borowski <kilobyte@angband.pl>
> >>>>>>>>>>>> Cc: Nick Terrell <terrelln@fb.com>
> >>>>>>>>>>>> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> >>>>>>>>>>>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> >>>>>>>>>>>> Cc: Sami Tolvanen <samitolvanen@google.com>
> >>>>>>>>>>>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> >>>>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>>>>>>>>>>> Cc: Dave Airlie <airlied@redhat.com>
> >>>>>>>>>>>> Cc: Nirmoy Das <nirmoy.das@amd.com>
> >>>>>>>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> >>>>>>>>>>>> Cc: Lee Jones <lee.jones@linaro.org>
> >>>>>>>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
> >>>>>>>>>>>> Cc: Chen Li <chenli@uniontech.com>
> >>>>>>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>>>>>>>>>>> Cc: "Marek Olšák" <marek.olsak@amd.com>
> >>>>>>>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
> >>>>>>>>>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> >>>>>>>>>>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>>>>>>>>>>> Cc: Sonny Jiang <sonny.jiang@amd.com>
> >>>>>>>>>>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> >>>>>>>>>>>> Cc: Tian Tao <tiantao6@hisilicon.com>
> >>>>>>>>>>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> >>>>>>>>>>>> Cc: etnaviv@lists.freedesktop.org
> >>>>>>>>>>>> Cc: lima@lists.freedesktop.org
> >>>>>>>>>>>> Cc: linux-media@vger.kernel.org
> >>>>>>>>>>>> Cc: linaro-mm-sig@lists.linaro.org
> >>>>>>>>>>>> Cc: Emma Anholt <emma@anholt.net>
> >>>>>>>>>>>> ---
> >>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
> >>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
> >>>>>>>>>>>>        drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
> >>>>>>>>>>>>        drivers/gpu/drm/lima/lima_sched.c        |  2 ++
> >>>>>>>>>>>>        drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
> >>>>>>>>>>>>        drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
> >>>>>>>>>>>>        drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
> >>>>>>>>>>>>        drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
> >>>>>>>>>>>>        drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
> >>>>>>>>>>>>        include/drm/gpu_scheduler.h              |  7 +++-
> >>>>>>>>>>>>        10 files changed, 74 insertions(+), 14 deletions(-)
> >>>>>>>>>>>>
> >>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>>>>>>> index c5386d13eb4a..a4ec092af9a7 100644
> >>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>>>>>>> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
> >>>>>>>>>>>>            if (r)
> >>>>>>>>>>>>                    goto error_unlock;
> >>>>>>>>>>>>
> >>>>>>>>>>>> +     drm_sched_job_arm(&job->base);
> >>>>>>>>>>>> +
> >>>>>>>>>>>>            /* No memory allocation is allowed while holding the notifier lock.
> >>>>>>>>>>>>             * The lock is held until amdgpu_cs_submit is finished and fence is
> >>>>>>>>>>>>             * added to BOs.
> >>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>>>>>>>>>>> index d33e6d97cc89..5ddb955d2315 100644
> >>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>>>>>>>>>>> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
> >>>>>>>>>>>>            if (r)
> >>>>>>>>>>>>                    return r;
> >>>>>>>>>>>>
> >>>>>>>>>>>> +     drm_sched_job_arm(&job->base);
> >>>>>>>>>>>> +
> >>>>>>>>>>>>            *f = dma_fence_get(&job->base.s_fence->finished);
> >>>>>>>>>>>>            amdgpu_job_free_resources(job);
> >>>>>>>>>>>>            drm_sched_entity_push_job(&job->base, entity);
> >>>>>>>>>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>>>>>>>>>>> index feb6da1b6ceb..05f412204118 100644
> >>>>>>>>>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>>>>>>>>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>>>>>>>>>>> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
> >>>>>>>>>>>>            if (ret)
> >>>>>>>>>>>>                    goto out_unlock;
> >>>>>>>>>>>>
> >>>>>>>>>>>> +     drm_sched_job_arm(&submit->sched_job);
> >>>>>>>>>>>> +
> >>>>>>>>>>>>            submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
> >>>>>>>>>>>>            submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
> >>>>>>>>>>>>                                                    submit->out_fence, 0,
> >>>>>>>>>>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> >>>>>>>>>>>> index dba8329937a3..38f755580507 100644
> >>>>>>>>>>>> --- a/drivers/gpu/drm/lima/lima_sched.c
> >>>>>>>>>>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
> >>>>>>>>>>>> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
> >>>>>>>>>>>>                    return err;
> >>>>>>>>>>>>            }
> >>>>>>>>>>>>
> >>>>>>>>>>>> +     drm_sched_job_arm(&task->base);
> >>>>>>>>>>>> +
> >>>>>>>>>>>>            task->num_bos = num_bos;
> >>>>>>>>>>>>            task->vm = lima_vm_get(vm);
> >>>>>>>>>>>>
> >>>>>>>>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> >>>>>>>>>>>> index 71a72fb50e6b..2992dc85325f 100644
> >>>>>>>>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> >>>>>>>>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> >>>>>>>>>>>> @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
> >>>>>>>>>>>>                    goto unlock;
> >>>>>>>>>>>>            }
> >>>>>>>>>>>>
> >>>>>>>>>>>> +     drm_sched_job_arm(&job->base);
> >>>>>>>>>>>> +
> >>>>>>>>>>>>            job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
> >>>>>>>>>>>>
> >>>>>>>>>>>>            ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> >>>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> >>>>>>>>>>>> index 79554aa4dbb1..f7347c284886 100644
> >>>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> >>>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> >>>>>>>>>>>> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> >>>>>>>>>>>>         * @sched_job: job to submit
> >>>>>>>>>>>>         * @entity: scheduler entity
> >>>>>>>>>>>>         *
> >>>>>>>>>>>> - * Note: To guarantee that the order of insertion to queue matches
> >>>>>>>>>>>> - * the job's fence sequence number this function should be
> >>>>>>>>>>>> - * called with drm_sched_job_init under common lock.
> >>>>>>>>>>>> + * Note: To guarantee that the order of insertion to queue matches the job's
> >>>>>>>>>>>> + * fence sequence number this function should be called with drm_sched_job_arm()
> >>>>>>>>>>>> + * under common lock.
> >>>>>>>>>>>>         *
> >>>>>>>>>>>>         * Returns 0 for success, negative error code otherwise.
> >>>>>>>>>>>>         */
> >>>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> >>>>>>>>>>>> index 69de2c76731f..c451ee9a30d7 100644
> >>>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> >>>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> >>>>>>>>>>>> @@ -90,7 +90,7 @@ static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
> >>>>>>>>>>>>         *
> >>>>>>>>>>>>         * Free up the fence memory after the RCU grace period.
> >>>>>>>>>>>>         */
> >>>>>>>>>>>> -static void drm_sched_fence_free(struct rcu_head *rcu)
> >>>>>>>>>>>> +void drm_sched_fence_free(struct rcu_head *rcu)
> >>>>>>>>>>>>        {
> >>>>>>>>>>>>            struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
> >>>>>>>>>>>>            struct drm_sched_fence *fence = to_drm_sched_fence(f);
> >>>>>>>>>>>> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> >>>>>>>>>>>>        }
> >>>>>>>>>>>>        EXPORT_SYMBOL(to_drm_sched_fence);
> >>>>>>>>>>>>
> >>>>>>>>>>>> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> >>>>>>>>>>>> -                                            void *owner)
> >>>>>>>>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> >>>>>>>>>>>> +                                           void *owner)
> >>>>>>>>>>>>        {
> >>>>>>>>>>>>            struct drm_sched_fence *fence = NULL;
> >>>>>>>>>>>> -     unsigned seq;
> >>>>>>>>>>>>
> >>>>>>>>>>>>            fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
> >>>>>>>>>>>>            if (fence == NULL)
> >>>>>>>>>>>> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> >>>>>>>>>>>>            fence->sched = entity->rq->sched;
> >>>>>>>>>>>>            spin_lock_init(&fence->lock);
> >>>>>>>>>>>>
> >>>>>>>>>>>> +     return fence;
> >>>>>>>>>>>> +}
> >>>>>>>>>>>> +
> >>>>>>>>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> >>>>>>>>>>>> +                       struct drm_sched_entity *entity)
> >>>>>>>>>>>> +{
> >>>>>>>>>>>> +     unsigned seq;
> >>>>>>>>>>>> +
> >>>>>>>>>>>>            seq = atomic_inc_return(&entity->fence_seq);
> >>>>>>>>>>>>            dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
> >>>>>>>>>>>>                           &fence->lock, entity->fence_context, seq);
> >>>>>>>>>>>>            dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
> >>>>>>>>>>>>                           &fence->lock, entity->fence_context + 1, seq);
> >>>>>>>>>>>> -
> >>>>>>>>>>>> -     return fence;
> >>>>>>>>>>>>        }
> >>>>>>>>>>>>
> >>>>>>>>>>>>        module_init(drm_sched_fence_slab_init);
> >>>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>>>>>>>>> index 33c414d55fab..5e84e1500c32 100644
> >>>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>>>>>>>>> @@ -48,9 +48,11 @@
> >>>>>>>>>>>>        #include <linux/wait.h>
> >>>>>>>>>>>>        #include <linux/sched.h>
> >>>>>>>>>>>>        #include <linux/completion.h>
> >>>>>>>>>>>> +#include <linux/dma-resv.h>
> >>>>>>>>>>>>        #include <uapi/linux/sched/types.h>
> >>>>>>>>>>>>
> >>>>>>>>>>>>        #include <drm/drm_print.h>
> >>>>>>>>>>>> +#include <drm/drm_gem.h>
> >>>>>>>>>>>>        #include <drm/gpu_scheduler.h>
> >>>>>>>>>>>>        #include <drm/spsc_queue.h>
> >>>>>>>>>>>>
> >>>>>>>>>>>> @@ -569,7 +571,6 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
> >>>>>>>>>>>>
> >>>>>>>>>>>>        /**
> >>>>>>>>>>>>         * drm_sched_job_init - init a scheduler job
> >>>>>>>>>>>> - *
> >>>>>>>>>>>>         * @job: scheduler job to init
> >>>>>>>>>>>>         * @entity: scheduler entity to use
> >>>>>>>>>>>>         * @owner: job owner for debugging
> >>>>>>>>>>>> @@ -577,6 +578,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
> >>>>>>>>>>>>         * Refer to drm_sched_entity_push_job() documentation
> >>>>>>>>>>>>         * for locking considerations.
> >>>>>>>>>>>>         *
> >>>>>>>>>>>> + * Drivers must make sure drm_sched_job_cleanup() if this function returns
> >>>>>>>>>>>> + * successfully, even when @job is aborted before drm_sched_job_arm() is called.
> >>>>>>>>>>>> + *
> >>>>>>>>>>>>         * Returns 0 for success, negative error code otherwise.
> >>>>>>>>>>>>         */
> >>>>>>>>>>>>        int drm_sched_job_init(struct drm_sched_job *job,
> >>>>>>>>>>>> @@ -594,7 +598,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >>>>>>>>>>>>            job->sched = sched;
> >>>>>>>>>>>>            job->entity = entity;
> >>>>>>>>>>>>            job->s_priority = entity->rq - sched->sched_rq;
> >>>>>>>>>>>> -     job->s_fence = drm_sched_fence_create(entity, owner);
> >>>>>>>>>>>> +     job->s_fence = drm_sched_fence_alloc(entity, owner);
> >>>>>>>>>>>>            if (!job->s_fence)
> >>>>>>>>>>>>                    return -ENOMEM;
> >>>>>>>>>>>>            job->id = atomic64_inc_return(&sched->job_id_count);
> >>>>>>>>>>>> @@ -606,13 +610,47 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >>>>>>>>>>>>        EXPORT_SYMBOL(drm_sched_job_init);
> >>>>>>>>>>>>
> >>>>>>>>>>>>        /**
> >>>>>>>>>>>> - * drm_sched_job_cleanup - clean up scheduler job resources
> >>>>>>>>>>>> + * drm_sched_job_arm - arm a scheduler job for execution
> >>>>>>>>>>>> + * @job: scheduler job to arm
> >>>>>>>>>>>> + *
> >>>>>>>>>>>> + * This arms a scheduler job for execution. Specifically it initializes the
> >>>>>>>>>>>> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
> >>>>>>>>>>>> + * or other places that need to track the completion of this job.
> >>>>>>>>>>>> + *
> >>>>>>>>>>>> + * Refer to drm_sched_entity_push_job() documentation for locking
> >>>>>>>>>>>> + * considerations.
> >>>>>>>>>>>>         *
> >>>>>>>>>>>> + * This can only be called if drm_sched_job_init() succeeded.
> >>>>>>>>>>>> + */
> >>>>>>>>>>>> +void drm_sched_job_arm(struct drm_sched_job *job)
> >>>>>>>>>>>> +{
> >>>>>>>>>>>> +     drm_sched_fence_init(job->s_fence, job->entity);
> >>>>>>>>>>>> +}
> >>>>>>>>>>>> +EXPORT_SYMBOL(drm_sched_job_arm);
> >>>>>>>>>>>> +
> >>>>>>>>>>>> +/**
> >>>>>>>>>>>> + * drm_sched_job_cleanup - clean up scheduler job resources
> >>>>>>>>>>>>         * @job: scheduler job to clean up
> >>>>>>>>>>>> + *
> >>>>>>>>>>>> + * Cleans up the resources allocated with drm_sched_job_init().
> >>>>>>>>>>>> + *
> >>>>>>>>>>>> + * Drivers should call this from their error unwind code if @job is aborted
> >>>>>>>>>>>> + * before drm_sched_job_arm() is called.
> >>>>>>>>>>>> + *
> >>>>>>>>>>>> + * After that point of no return @job is committed to be executed by the
> >>>>>>>>>>>> + * scheduler, and this function should be called from the
> >>>>>>>>>>>> + * &drm_sched_backend_ops.free_job callback.
> >>>>>>>>>>>>         */
> >>>>>>>>>>>>        void drm_sched_job_cleanup(struct drm_sched_job *job)
> >>>>>>>>>>>>        {
> >>>>>>>>>>>> -     dma_fence_put(&job->s_fence->finished);
> >>>>>>>>>>>> +     if (!kref_read(&job->s_fence->finished.refcount)) {
> >>>>>>>>>>>> +             /* drm_sched_job_arm() has been called */
> >>>>>>>>>>>> +             dma_fence_put(&job->s_fence->finished);
> >>>>>>>>>>>> +     } else {
> >>>>>>>>>>>> +             /* aborted job before committing to run it */
> >>>>>>>>>>>> +             drm_sched_fence_free(&job->s_fence->finished.rcu);
> >>>>>>>>>>>> +     }
> >>>>>>>>>>>> +
> >>>>>>>>>>>>            job->s_fence = NULL;
> >>>>>>>>>>>>        }
> >>>>>>>>>>>>        EXPORT_SYMBOL(drm_sched_job_cleanup);
> >>>>>>>>>>>> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> >>>>>>>>>>>> index 4eb354226972..5c3a99027ecd 100644
> >>>>>>>>>>>> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> >>>>>>>>>>>> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> >>>>>>>>>>>> @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
> >>>>>>>>>>>>            if (ret)
> >>>>>>>>>>>>                    return ret;
> >>>>>>>>>>>>
> >>>>>>>>>>>> +     drm_sched_job_arm(&job->base);
> >>>>>>>>>>>> +
> >>>>>>>>>>>>            job->done_fence = dma_fence_get(&job->base.s_fence->finished);
> >>>>>>>>>>>>
> >>>>>>>>>>>>            /* put by scheduler job completion */
> >>>>>>>>>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> >>>>>>>>>>>> index 88ae7f331bb1..83afc3aa8e2f 100644
> >>>>>>>>>>>> --- a/include/drm/gpu_scheduler.h
> >>>>>>>>>>>> +++ b/include/drm/gpu_scheduler.h
> >>>>>>>>>>>> @@ -348,6 +348,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
> >>>>>>>>>>>>        int drm_sched_job_init(struct drm_sched_job *job,
> >>>>>>>>>>>>                           struct drm_sched_entity *entity,
> >>>>>>>>>>>>                           void *owner);
> >>>>>>>>>>>> +void drm_sched_job_arm(struct drm_sched_job *job);
> >>>>>>>>>>>>        void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> >>>>>>>>>>>>                                        struct drm_gpu_scheduler **sched_list,
> >>>>>>>>>>>>                                           unsigned int num_sched_list);
> >>>>>>>>>>>> @@ -387,8 +388,12 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> >>>>>>>>>>>>                                       enum drm_sched_priority priority);
> >>>>>>>>>>>>        bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> >>>>>>>>>>>>
> >>>>>>>>>>>> -struct drm_sched_fence *drm_sched_fence_create(
> >>>>>>>>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(
> >>>>>>>>>>>>            struct drm_sched_entity *s_entity, void *owner);
> >>>>>>>>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> >>>>>>>>>>>> +                       struct drm_sched_entity *entity);
> >>>>>>>>>>>> +void drm_sched_fence_free(struct rcu_head *rcu);
> >>>>>>>>>>>> +
> >>>>>>>>>>>>        void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
> >>>>>>>>>>>>        void drm_sched_fence_finished(struct drm_sched_fence *fence);
> >>>>>>>>>>>>
> >>>> --
> >>>> Daniel Vetter
> >>>> Software Engineer, Intel Corporation
> >>>> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C7086790381b9415f60e708d941f78266%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637613353580226578%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=9GhYoGHD6TlcrW5dvT9Z%2BFukW%2F8%2BicK2t8180coEsJY%3D&amp;reserved=0
> >>>
> >>> --
> >>> Daniel Vetter
> >>> Software Engineer, Intel Corporation
> >>> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C7086790381b9415f60e708d941f78266%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637613353580236571%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=Yt%2FirDjTmtDUjQS1xlYg4x5mz82cHkNyLPkNNpO31ro%3D&amp;reserved=0
>
Christian König July 8, 2021, 11:28 a.m. UTC | #14
Am 08.07.21 um 13:20 schrieb Daniel Vetter:
> On Thu, Jul 8, 2021 at 12:54 PM Christian König
> <christian.koenig@amd.com> wrote:
>> [SNIP]
>>>> As far as I know that not completely correct. The rules around atomics i
>>>> once learned are:
>>>>
>>>> 1. Everything which modifies something is a write barrier.
>>>> 2. Everything which returns something is a read barrier.
>>>>
>>>> And I know a whole bunch of use cases where this is relied upon in the core
>>>> kernel, so I'm pretty sure that's correct.
>>> That's against what the doc says, and also it would mean stuff like
>>> atomic_read_acquire or smp_mb__after/before_atomic is completely pointless.
>>>
>>> On x86 you're right, anywhere else where there's no total store ordering I
>>> you're wrong.
>> Good to know. I always thought that atomic_read_acquire() was just for
>> documentation purpose.
> Maybe you mixed it up with C++ atomics (which I think are now also in
> C)? Those are strongly ordered by default (you can get the weakly
> ordered kernel-style one too). It's a bit unfortunate that the default
> semantics are exactly opposite between kernel and userspace :-/

Yeah, that's most likely it.

>>> If there's code that relies on this it needs to be fixed and properly
>>> documented. I did go through the squeue code a bit, and might be better to
>>> just replace this with a core data structure.
>> Well the spsc was especially crafted for this use case and performed
>> quite a bit better then a double linked list.
> Yeah  double-linked list is awful.
>
>> Or what core data structure do you have in mind?
> Hm I thought there's a ready-made queue primitive, but there's just
> llist.h. Which I think is roughly what the scheduler queue also does.
> Minus the atomic_t for counting how many there are, and aside from the
> tracepoints I don't think we're using those anywhere, we just check
> for is_empty in the code (from a quick look only).

I think we just need to replace the atomic_read() with 
atomic_read_acquire() and the atomic_dec() with atomic_dec_return_release().

Apart from that everything should be working as far as I can see. And 
yes llist.h doesn't really do much different, it just doesn't keeps a 
tail pointer.

Christian.

> -Daniel
>
>> Christian.
>>
>>> -Daniel
>>>
>>>> In this case the write barrier is the atomic_dec() in spsc_queue_pop() and
>>>> the read barrier is the aromic_read() in spsc_queue_count().
>>>>
>>>> The READ_ONCE() is actually not even necessary as far as I can see.
>>>>
>>>> Christian.
>>>>
>>>>> -Daniel
>>>>>
>>>>>
>>>>>> atomic op, then it's a full barrier. So yeah you need more here. But
>>>>>> also since you only need a read barrier on one side, and a write
>>>>>> barrier on the other, you don't actually need a cpu barriers on x86.
>>>>>> And READ_ONCE gives you the compiler barrier on one side at least, I
>>>>>> haven't found it on the writer side yet.
>>>>>>
>>>>>>> But yes a comment would be really nice here. I had to think for a while
>>>>>>> why we don't need this as well.
>>>>>> I'm typing a patch, which after a night's sleep I realized has the
>>>>>> wrong barriers. And now I'm also typing some doc improvements for
>>>>>> drm_sched_entity and related functions.
>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>>>> -Daniel
>>>>>>>>
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -Daniel
>>>>>>>>>>
>>>>>>>>>>> Regards
>>>>>>>>>>> Christian.
>>>>>>>>>>>
>>>>>>>>>>>> -Daniel
>>>>>>>>>>>>
>>>>>>>>>>>>> Christian.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also improve the kerneldoc for this.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Acked-by: Steven Price <steven.price@arm.com> (v2)
>>>>>>>>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>>>>>>>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>>>>>>>>>>>>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
>>>>>>>>>>>>>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
>>>>>>>>>>>>>> Cc: Qiang Yu <yuq825@gmail.com>
>>>>>>>>>>>>>> Cc: Rob Herring <robh@kernel.org>
>>>>>>>>>>>>>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
>>>>>>>>>>>>>> Cc: Steven Price <steven.price@arm.com>
>>>>>>>>>>>>>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
>>>>>>>>>>>>>> Cc: David Airlie <airlied@linux.ie>
>>>>>>>>>>>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>>>>>>>>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>>>>>>>>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>>>>>>>>>>> Cc: Masahiro Yamada <masahiroy@kernel.org>
>>>>>>>>>>>>>> Cc: Kees Cook <keescook@chromium.org>
>>>>>>>>>>>>>> Cc: Adam Borowski <kilobyte@angband.pl>
>>>>>>>>>>>>>> Cc: Nick Terrell <terrelln@fb.com>
>>>>>>>>>>>>>> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
>>>>>>>>>>>>>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
>>>>>>>>>>>>>> Cc: Sami Tolvanen <samitolvanen@google.com>
>>>>>>>>>>>>>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
>>>>>>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>>>>>>>>>>> Cc: Dave Airlie <airlied@redhat.com>
>>>>>>>>>>>>>> Cc: Nirmoy Das <nirmoy.das@amd.com>
>>>>>>>>>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>>>>>>>>>>>>> Cc: Lee Jones <lee.jones@linaro.org>
>>>>>>>>>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>>>>>>>>>>>>> Cc: Chen Li <chenli@uniontech.com>
>>>>>>>>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>>>>>>>>>>> Cc: "Marek Olšák" <marek.olsak@amd.com>
>>>>>>>>>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>>>>>>>>>>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>>>>>>>>>>>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>>>>>>>>> Cc: Sonny Jiang <sonny.jiang@amd.com>
>>>>>>>>>>>>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
>>>>>>>>>>>>>> Cc: Tian Tao <tiantao6@hisilicon.com>
>>>>>>>>>>>>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
>>>>>>>>>>>>>> Cc: etnaviv@lists.freedesktop.org
>>>>>>>>>>>>>> Cc: lima@lists.freedesktop.org
>>>>>>>>>>>>>> Cc: linux-media@vger.kernel.org
>>>>>>>>>>>>>> Cc: linaro-mm-sig@lists.linaro.org
>>>>>>>>>>>>>> Cc: Emma Anholt <emma@anholt.net>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>         drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
>>>>>>>>>>>>>>         drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
>>>>>>>>>>>>>>         drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
>>>>>>>>>>>>>>         drivers/gpu/drm/lima/lima_sched.c        |  2 ++
>>>>>>>>>>>>>>         drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
>>>>>>>>>>>>>>         drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
>>>>>>>>>>>>>>         drivers/gpu/drm/scheduler/sched_fence.c  | 17 +++++----
>>>>>>>>>>>>>>         drivers/gpu/drm/scheduler/sched_main.c   | 46 +++++++++++++++++++++---
>>>>>>>>>>>>>>         drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
>>>>>>>>>>>>>>         include/drm/gpu_scheduler.h              |  7 +++-
>>>>>>>>>>>>>>         10 files changed, 74 insertions(+), 14 deletions(-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>>>>> index c5386d13eb4a..a4ec092af9a7 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>>>>> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>>>>>>>>>>>>>>             if (r)
>>>>>>>>>>>>>>                     goto error_unlock;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>             /* No memory allocation is allowed while holding the notifier lock.
>>>>>>>>>>>>>>              * The lock is held until amdgpu_cs_submit is finished and fence is
>>>>>>>>>>>>>>              * added to BOs.
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>>>>>>>>> index d33e6d97cc89..5ddb955d2315 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>>>>>>>>> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
>>>>>>>>>>>>>>             if (r)
>>>>>>>>>>>>>>                     return r;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>             *f = dma_fence_get(&job->base.s_fence->finished);
>>>>>>>>>>>>>>             amdgpu_job_free_resources(job);
>>>>>>>>>>>>>>             drm_sched_entity_push_job(&job->base, entity);
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>>>>>>>>> index feb6da1b6ceb..05f412204118 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>>>>>>>>>>>>> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
>>>>>>>>>>>>>>             if (ret)
>>>>>>>>>>>>>>                     goto out_unlock;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +     drm_sched_job_arm(&submit->sched_job);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>             submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
>>>>>>>>>>>>>>             submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
>>>>>>>>>>>>>>                                                     submit->out_fence, 0,
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>>>>>>>>>>>>> index dba8329937a3..38f755580507 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>>>>>>>>>>>>> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
>>>>>>>>>>>>>>                     return err;
>>>>>>>>>>>>>>             }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +     drm_sched_job_arm(&task->base);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>             task->num_bos = num_bos;
>>>>>>>>>>>>>>             task->vm = lima_vm_get(vm);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>>>>>>>>> index 71a72fb50e6b..2992dc85325f 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>>>>>>>>>>>> @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
>>>>>>>>>>>>>>                     goto unlock;
>>>>>>>>>>>>>>             }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>             job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>             ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>>>>>>>>>>>>>> index 79554aa4dbb1..f7347c284886 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>>>>>>>>>>>>> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>>>>>>>>>>>>>>          * @sched_job: job to submit
>>>>>>>>>>>>>>          * @entity: scheduler entity
>>>>>>>>>>>>>>          *
>>>>>>>>>>>>>> - * Note: To guarantee that the order of insertion to queue matches
>>>>>>>>>>>>>> - * the job's fence sequence number this function should be
>>>>>>>>>>>>>> - * called with drm_sched_job_init under common lock.
>>>>>>>>>>>>>> + * Note: To guarantee that the order of insertion to queue matches the job's
>>>>>>>>>>>>>> + * fence sequence number this function should be called with drm_sched_job_arm()
>>>>>>>>>>>>>> + * under common lock.
>>>>>>>>>>>>>>          *
>>>>>>>>>>>>>>          * Returns 0 for success, negative error code otherwise.
>>>>>>>>>>>>>>          */
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
>>>>>>>>>>>>>> index 69de2c76731f..c451ee9a30d7 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
>>>>>>>>>>>>>> @@ -90,7 +90,7 @@ static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
>>>>>>>>>>>>>>          *
>>>>>>>>>>>>>>          * Free up the fence memory after the RCU grace period.
>>>>>>>>>>>>>>          */
>>>>>>>>>>>>>> -static void drm_sched_fence_free(struct rcu_head *rcu)
>>>>>>>>>>>>>> +void drm_sched_fence_free(struct rcu_head *rcu)
>>>>>>>>>>>>>>         {
>>>>>>>>>>>>>>             struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
>>>>>>>>>>>>>>             struct drm_sched_fence *fence = to_drm_sched_fence(f);
>>>>>>>>>>>>>> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         EXPORT_SYMBOL(to_drm_sched_fence);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>>>>>>>>>>>>>> -                                            void *owner)
>>>>>>>>>>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
>>>>>>>>>>>>>> +                                           void *owner)
>>>>>>>>>>>>>>         {
>>>>>>>>>>>>>>             struct drm_sched_fence *fence = NULL;
>>>>>>>>>>>>>> -     unsigned seq;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>             fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
>>>>>>>>>>>>>>             if (fence == NULL)
>>>>>>>>>>>>>> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>>>>>>>>>>>>>>             fence->sched = entity->rq->sched;
>>>>>>>>>>>>>>             spin_lock_init(&fence->lock);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +     return fence;
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
>>>>>>>>>>>>>> +                       struct drm_sched_entity *entity)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +     unsigned seq;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>             seq = atomic_inc_return(&entity->fence_seq);
>>>>>>>>>>>>>>             dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
>>>>>>>>>>>>>>                            &fence->lock, entity->fence_context, seq);
>>>>>>>>>>>>>>             dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
>>>>>>>>>>>>>>                            &fence->lock, entity->fence_context + 1, seq);
>>>>>>>>>>>>>> -
>>>>>>>>>>>>>> -     return fence;
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         module_init(drm_sched_fence_slab_init);
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>>>>>>>> index 33c414d55fab..5e84e1500c32 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>>>>>>>> @@ -48,9 +48,11 @@
>>>>>>>>>>>>>>         #include <linux/wait.h>
>>>>>>>>>>>>>>         #include <linux/sched.h>
>>>>>>>>>>>>>>         #include <linux/completion.h>
>>>>>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>>>>>         #include <uapi/linux/sched/types.h>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         #include <drm/drm_print.h>
>>>>>>>>>>>>>> +#include <drm/drm_gem.h>
>>>>>>>>>>>>>>         #include <drm/gpu_scheduler.h>
>>>>>>>>>>>>>>         #include <drm/spsc_queue.h>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> @@ -569,7 +571,6 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         /**
>>>>>>>>>>>>>>          * drm_sched_job_init - init a scheduler job
>>>>>>>>>>>>>> - *
>>>>>>>>>>>>>>          * @job: scheduler job to init
>>>>>>>>>>>>>>          * @entity: scheduler entity to use
>>>>>>>>>>>>>>          * @owner: job owner for debugging
>>>>>>>>>>>>>> @@ -577,6 +578,9 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
>>>>>>>>>>>>>>          * Refer to drm_sched_entity_push_job() documentation
>>>>>>>>>>>>>>          * for locking considerations.
>>>>>>>>>>>>>>          *
>>>>>>>>>>>>>> + * Drivers must make sure drm_sched_job_cleanup() if this function returns
>>>>>>>>>>>>>> + * successfully, even when @job is aborted before drm_sched_job_arm() is called.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>          * Returns 0 for success, negative error code otherwise.
>>>>>>>>>>>>>>          */
>>>>>>>>>>>>>>         int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>>>>>>>> @@ -594,7 +598,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>>>>>>>>             job->sched = sched;
>>>>>>>>>>>>>>             job->entity = entity;
>>>>>>>>>>>>>>             job->s_priority = entity->rq - sched->sched_rq;
>>>>>>>>>>>>>> -     job->s_fence = drm_sched_fence_create(entity, owner);
>>>>>>>>>>>>>> +     job->s_fence = drm_sched_fence_alloc(entity, owner);
>>>>>>>>>>>>>>             if (!job->s_fence)
>>>>>>>>>>>>>>                     return -ENOMEM;
>>>>>>>>>>>>>>             job->id = atomic64_inc_return(&sched->job_id_count);
>>>>>>>>>>>>>> @@ -606,13 +610,47 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>>>>>>>>         EXPORT_SYMBOL(drm_sched_job_init);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         /**
>>>>>>>>>>>>>> - * drm_sched_job_cleanup - clean up scheduler job resources
>>>>>>>>>>>>>> + * drm_sched_job_arm - arm a scheduler job for execution
>>>>>>>>>>>>>> + * @job: scheduler job to arm
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * This arms a scheduler job for execution. Specifically it initializes the
>>>>>>>>>>>>>> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
>>>>>>>>>>>>>> + * or other places that need to track the completion of this job.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Refer to drm_sched_entity_push_job() documentation for locking
>>>>>>>>>>>>>> + * considerations.
>>>>>>>>>>>>>>          *
>>>>>>>>>>>>>> + * This can only be called if drm_sched_job_init() succeeded.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +void drm_sched_job_arm(struct drm_sched_job *job)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +     drm_sched_fence_init(job->s_fence, job->entity);
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +EXPORT_SYMBOL(drm_sched_job_arm);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_sched_job_cleanup - clean up scheduler job resources
>>>>>>>>>>>>>>          * @job: scheduler job to clean up
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Cleans up the resources allocated with drm_sched_job_init().
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Drivers should call this from their error unwind code if @job is aborted
>>>>>>>>>>>>>> + * before drm_sched_job_arm() is called.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * After that point of no return @job is committed to be executed by the
>>>>>>>>>>>>>> + * scheduler, and this function should be called from the
>>>>>>>>>>>>>> + * &drm_sched_backend_ops.free_job callback.
>>>>>>>>>>>>>>          */
>>>>>>>>>>>>>>         void drm_sched_job_cleanup(struct drm_sched_job *job)
>>>>>>>>>>>>>>         {
>>>>>>>>>>>>>> -     dma_fence_put(&job->s_fence->finished);
>>>>>>>>>>>>>> +     if (!kref_read(&job->s_fence->finished.refcount)) {
>>>>>>>>>>>>>> +             /* drm_sched_job_arm() has been called */
>>>>>>>>>>>>>> +             dma_fence_put(&job->s_fence->finished);
>>>>>>>>>>>>>> +     } else {
>>>>>>>>>>>>>> +             /* aborted job before committing to run it */
>>>>>>>>>>>>>> +             drm_sched_fence_free(&job->s_fence->finished.rcu);
>>>>>>>>>>>>>> +     }
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>             job->s_fence = NULL;
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         EXPORT_SYMBOL(drm_sched_job_cleanup);
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
>>>>>>>>>>>>>> index 4eb354226972..5c3a99027ecd 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/v3d/v3d_gem.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
>>>>>>>>>>>>>> @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
>>>>>>>>>>>>>>             if (ret)
>>>>>>>>>>>>>>                     return ret;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +     drm_sched_job_arm(&job->base);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>             job->done_fence = dma_fence_get(&job->base.s_fence->finished);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>             /* put by scheduler job completion */
>>>>>>>>>>>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>>>>>>>>>>>>> index 88ae7f331bb1..83afc3aa8e2f 100644
>>>>>>>>>>>>>> --- a/include/drm/gpu_scheduler.h
>>>>>>>>>>>>>> +++ b/include/drm/gpu_scheduler.h
>>>>>>>>>>>>>> @@ -348,6 +348,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
>>>>>>>>>>>>>>         int drm_sched_job_init(struct drm_sched_job *job,
>>>>>>>>>>>>>>                            struct drm_sched_entity *entity,
>>>>>>>>>>>>>>                            void *owner);
>>>>>>>>>>>>>> +void drm_sched_job_arm(struct drm_sched_job *job);
>>>>>>>>>>>>>>         void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>>>>>>>>>>>>>>                                         struct drm_gpu_scheduler **sched_list,
>>>>>>>>>>>>>>                                            unsigned int num_sched_list);
>>>>>>>>>>>>>> @@ -387,8 +388,12 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>>>>>>>>>>>>>>                                        enum drm_sched_priority priority);
>>>>>>>>>>>>>>         bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -struct drm_sched_fence *drm_sched_fence_create(
>>>>>>>>>>>>>> +struct drm_sched_fence *drm_sched_fence_alloc(
>>>>>>>>>>>>>>             struct drm_sched_entity *s_entity, void *owner);
>>>>>>>>>>>>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
>>>>>>>>>>>>>> +                       struct drm_sched_entity *entity);
>>>>>>>>>>>>>> +void drm_sched_fence_free(struct rcu_head *rcu);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>         void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
>>>>>>>>>>>>>>         void drm_sched_fence_finished(struct drm_sched_fence *fence);
>>>>>>>>>>>>>>
>>>>>> --
>>>>>> Daniel Vetter
>>>>>> Software Engineer, Intel Corporation
>>>>>> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C9ff11edafb334411dbf508d942026d53%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637613400464979063%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=MMhNTs1WSu%2B07ho3MOap4fbbpAh2vkCd0IJ0snpYvYo%3D&amp;reserved=0
>>>>> --
>>>>> Daniel Vetter
>>>>> Software Engineer, Intel Corporation
>>>>> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C9ff11edafb334411dbf508d942026d53%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637613400464979063%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=MMhNTs1WSu%2B07ho3MOap4fbbpAh2vkCd0IJ0snpYvYo%3D&amp;reserved=0
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index c5386d13eb4a..a4ec092af9a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1226,6 +1226,8 @@  static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 	if (r)
 		goto error_unlock;
 
+	drm_sched_job_arm(&job->base);
+
 	/* No memory allocation is allowed while holding the notifier lock.
 	 * The lock is held until amdgpu_cs_submit is finished and fence is
 	 * added to BOs.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d33e6d97cc89..5ddb955d2315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -170,6 +170,8 @@  int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
 	if (r)
 		return r;
 
+	drm_sched_job_arm(&job->base);
+
 	*f = dma_fence_get(&job->base.s_fence->finished);
 	amdgpu_job_free_resources(job);
 	drm_sched_entity_push_job(&job->base, entity);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index feb6da1b6ceb..05f412204118 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -163,6 +163,8 @@  int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
 	if (ret)
 		goto out_unlock;
 
+	drm_sched_job_arm(&submit->sched_job);
+
 	submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
 	submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
 						submit->out_fence, 0,
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index dba8329937a3..38f755580507 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -129,6 +129,8 @@  int lima_sched_task_init(struct lima_sched_task *task,
 		return err;
 	}
 
+	drm_sched_job_arm(&task->base);
+
 	task->num_bos = num_bos;
 	task->vm = lima_vm_get(vm);
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 71a72fb50e6b..2992dc85325f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -288,6 +288,8 @@  int panfrost_job_push(struct panfrost_job *job)
 		goto unlock;
 	}
 
+	drm_sched_job_arm(&job->base);
+
 	job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
 
 	ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 79554aa4dbb1..f7347c284886 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -485,9 +485,9 @@  void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
  * @sched_job: job to submit
  * @entity: scheduler entity
  *
- * Note: To guarantee that the order of insertion to queue matches
- * the job's fence sequence number this function should be
- * called with drm_sched_job_init under common lock.
+ * Note: To guarantee that the order of insertion to queue matches the job's
+ * fence sequence number this function should be called with drm_sched_job_arm()
+ * under common lock.
  *
  * Returns 0 for success, negative error code otherwise.
  */
diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
index 69de2c76731f..c451ee9a30d7 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -90,7 +90,7 @@  static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
  *
  * Free up the fence memory after the RCU grace period.
  */
-static void drm_sched_fence_free(struct rcu_head *rcu)
+void drm_sched_fence_free(struct rcu_head *rcu)
 {
 	struct dma_fence *f = container_of(rcu, struct dma_fence, rcu);
 	struct drm_sched_fence *fence = to_drm_sched_fence(f);
@@ -152,11 +152,10 @@  struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
 }
 EXPORT_SYMBOL(to_drm_sched_fence);
 
-struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
-					       void *owner)
+struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
+					      void *owner)
 {
 	struct drm_sched_fence *fence = NULL;
-	unsigned seq;
 
 	fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
 	if (fence == NULL)
@@ -166,13 +165,19 @@  struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
 	fence->sched = entity->rq->sched;
 	spin_lock_init(&fence->lock);
 
+	return fence;
+}
+
+void drm_sched_fence_init(struct drm_sched_fence *fence,
+			  struct drm_sched_entity *entity)
+{
+	unsigned seq;
+
 	seq = atomic_inc_return(&entity->fence_seq);
 	dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
 		       &fence->lock, entity->fence_context, seq);
 	dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
 		       &fence->lock, entity->fence_context + 1, seq);
-
-	return fence;
 }
 
 module_init(drm_sched_fence_slab_init);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 33c414d55fab..5e84e1500c32 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -48,9 +48,11 @@ 
 #include <linux/wait.h>
 #include <linux/sched.h>
 #include <linux/completion.h>
+#include <linux/dma-resv.h>
 #include <uapi/linux/sched/types.h>
 
 #include <drm/drm_print.h>
+#include <drm/drm_gem.h>
 #include <drm/gpu_scheduler.h>
 #include <drm/spsc_queue.h>
 
@@ -569,7 +571,6 @@  EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
 
 /**
  * drm_sched_job_init - init a scheduler job
- *
  * @job: scheduler job to init
  * @entity: scheduler entity to use
  * @owner: job owner for debugging
@@ -577,6 +578,9 @@  EXPORT_SYMBOL(drm_sched_resubmit_jobs_ext);
  * Refer to drm_sched_entity_push_job() documentation
  * for locking considerations.
  *
+ * Drivers must make sure drm_sched_job_cleanup() if this function returns
+ * successfully, even when @job is aborted before drm_sched_job_arm() is called.
+ *
  * Returns 0 for success, negative error code otherwise.
  */
 int drm_sched_job_init(struct drm_sched_job *job,
@@ -594,7 +598,7 @@  int drm_sched_job_init(struct drm_sched_job *job,
 	job->sched = sched;
 	job->entity = entity;
 	job->s_priority = entity->rq - sched->sched_rq;
-	job->s_fence = drm_sched_fence_create(entity, owner);
+	job->s_fence = drm_sched_fence_alloc(entity, owner);
 	if (!job->s_fence)
 		return -ENOMEM;
 	job->id = atomic64_inc_return(&sched->job_id_count);
@@ -606,13 +610,47 @@  int drm_sched_job_init(struct drm_sched_job *job,
 EXPORT_SYMBOL(drm_sched_job_init);
 
 /**
- * drm_sched_job_cleanup - clean up scheduler job resources
+ * drm_sched_job_arm - arm a scheduler job for execution
+ * @job: scheduler job to arm
+ *
+ * This arms a scheduler job for execution. Specifically it initializes the
+ * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
+ * or other places that need to track the completion of this job.
+ *
+ * Refer to drm_sched_entity_push_job() documentation for locking
+ * considerations.
  *
+ * This can only be called if drm_sched_job_init() succeeded.
+ */
+void drm_sched_job_arm(struct drm_sched_job *job)
+{
+	drm_sched_fence_init(job->s_fence, job->entity);
+}
+EXPORT_SYMBOL(drm_sched_job_arm);
+
+/**
+ * drm_sched_job_cleanup - clean up scheduler job resources
  * @job: scheduler job to clean up
+ *
+ * Cleans up the resources allocated with drm_sched_job_init().
+ *
+ * Drivers should call this from their error unwind code if @job is aborted
+ * before drm_sched_job_arm() is called.
+ *
+ * After that point of no return @job is committed to be executed by the
+ * scheduler, and this function should be called from the
+ * &drm_sched_backend_ops.free_job callback.
  */
 void drm_sched_job_cleanup(struct drm_sched_job *job)
 {
-	dma_fence_put(&job->s_fence->finished);
+	if (!kref_read(&job->s_fence->finished.refcount)) {
+		/* drm_sched_job_arm() has been called */
+		dma_fence_put(&job->s_fence->finished);
+	} else {
+		/* aborted job before committing to run it */
+		drm_sched_fence_free(&job->s_fence->finished.rcu);
+	}
+
 	job->s_fence = NULL;
 }
 EXPORT_SYMBOL(drm_sched_job_cleanup);
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 4eb354226972..5c3a99027ecd 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -475,6 +475,8 @@  v3d_push_job(struct v3d_file_priv *v3d_priv,
 	if (ret)
 		return ret;
 
+	drm_sched_job_arm(&job->base);
+
 	job->done_fence = dma_fence_get(&job->base.s_fence->finished);
 
 	/* put by scheduler job completion */
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 88ae7f331bb1..83afc3aa8e2f 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -348,6 +348,7 @@  void drm_sched_fini(struct drm_gpu_scheduler *sched);
 int drm_sched_job_init(struct drm_sched_job *job,
 		       struct drm_sched_entity *entity,
 		       void *owner);
+void drm_sched_job_arm(struct drm_sched_job *job);
 void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
 				    struct drm_gpu_scheduler **sched_list,
                                    unsigned int num_sched_list);
@@ -387,8 +388,12 @@  void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
 				   enum drm_sched_priority priority);
 bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
 
-struct drm_sched_fence *drm_sched_fence_create(
+struct drm_sched_fence *drm_sched_fence_alloc(
 	struct drm_sched_entity *s_entity, void *owner);
+void drm_sched_fence_init(struct drm_sched_fence *fence,
+			  struct drm_sched_entity *entity);
+void drm_sched_fence_free(struct rcu_head *rcu);
+
 void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
 void drm_sched_fence_finished(struct drm_sched_fence *fence);