Message ID | 20250109133710.39404-4-phasta@kernel.org (mailing list archive) |
---|---|
State | New |
Series | drm/sched: Documentation and refcount improvements |
On 09/01/2025 13:37, Philipp Stanner wrote:
> From: Philipp Stanner <pstanner@redhat.com>
>
> drm_sched_backend_ops.run_job() returns a dma_fence for the scheduler.
> That fence is signalled by the driver once the hardware completed the
> associated job. The scheduler does not increment the reference count on
> that fence, but implicitly expects to inherit this fence from run_job().
>
> This is relatively subtle and prone to misunderstandings.
>
> This implies that, to keep a reference for itself, a driver needs to
> call dma_fence_get() in addition to dma_fence_init() in that callback.
>
> It's further complicated by the fact that the scheduler even decrements
> the refcount in drm_sched_run_job_work() since it created a new
> reference in drm_sched_fence_scheduled(). It does, however, still use
> its pointer to the fence after calling dma_fence_put() - which is safe
> because of the aforementioned new reference, but actually still violates
> the refcounting rules.
>
> Improve the explanatory comment for that decrement.
>
> Move the call to dma_fence_put() to the position behind the last usage
> of the fence.
>
> Document the necessity to increment the reference count in
> drm_sched_backend_ops.run_job().
>
> Signed-off-by: Philipp Stanner <pstanner@redhat.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 10 +++++++---
>  include/drm/gpu_scheduler.h            | 19 +++++++++++++++----
>  2 files changed, 22 insertions(+), 7 deletions(-)
>
> [...]
>
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 95e17504e46a..d5cd2a78f27c 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -420,10 +420,21 @@ struct drm_sched_backend_ops {
>  				  struct drm_sched_entity *s_entity);
>
>  	/**
> -	 * @run_job: Called to execute the job once all of the dependencies
> -	 * have been resolved. This may be called multiple times, if
> -	 * timedout_job() has happened and drm_sched_job_recovery()
> -	 * decides to try it again.
> +	 * @run_job: Called to execute the job once all of the dependencies
> +	 * have been resolved. This may be called multiple times, if
> +	 * timedout_job() has happened and drm_sched_job_recovery() decides to
> +	 * try it again.
> +	 *
> +	 * @sched_job: the job to run
> +	 *
> +	 * Returns: dma_fence the driver must signal once the hardware has
> +	 * completed the job ("hardware fence").
> +	 *
> +	 * Note that the scheduler expects to 'inherit' its own reference to
> +	 * this fence from the callback. It does not invoke an extra
> +	 * dma_fence_get() on it. Consequently, this callback must take a
> +	 * reference for the scheduler, and additional ones for the driver's
> +	 * respective needs.

Another thing which would be really good to document here is what other
things run_job can return apart from the fence.

For instance amdgpu can return an ERR_PTR but I haven't looked into
other drivers.

Regards,

Tvrtko

>  	 */
>  	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>
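For illustration, a driver-side run_job() that follows the refcounting contract described in the new documentation might look roughly like the sketch below. The foo_* names, the fence ops, the context/seqno helpers and the submission helper are hypothetical placeholders, not an existing driver.

#include <linux/dma-fence.h>
#include <drm/gpu_scheduler.h>

struct foo_job {
	struct drm_sched_job base;
	struct dma_fence hw_fence;	/* signalled from the driver's completion IRQ */
	spinlock_t fence_lock;
};

static struct dma_fence *foo_run_job(struct drm_sched_job *sched_job)
{
	struct foo_job *job = container_of(sched_job, struct foo_job, base);

	/*
	 * dma_fence_init() sets the refcount to 1. That single reference is
	 * the one the scheduler "inherits" from the returned pointer.
	 */
	dma_fence_init(&job->hw_fence, &foo_fence_ops, &job->fence_lock,
		       foo_fence_context, foo_next_seqno());

	/*
	 * The driver keeps using the fence afterwards (to signal it on
	 * completion), so it must take an additional reference for itself.
	 */
	dma_fence_get(&job->hw_fence);

	foo_submit_to_hw(job);

	return &job->hw_fence;	/* this reference now belongs to the scheduler */
}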
On Thu, 2025-01-09 at 13:44 +0000, Tvrtko Ursulin wrote:
>
> On 09/01/2025 13:37, Philipp Stanner wrote:
> > From: Philipp Stanner <pstanner@redhat.com>
> >
> > drm_sched_backend_ops.run_job() returns a dma_fence for the scheduler.
> > That fence is signalled by the driver once the hardware completed the
> > associated job. The scheduler does not increment the reference count on
> > that fence, but implicitly expects to inherit this fence from run_job().
> >
> > [...]
> >
> > +	 * Note that the scheduler expects to 'inherit' its own reference to
> > +	 * this fence from the callback. It does not invoke an extra
> > +	 * dma_fence_get() on it. Consequently, this callback must take a
> > +	 * reference for the scheduler, and additional ones for the driver's
> > +	 * respective needs.
>
> Another thing which would be really good to document here is what other
> things run_job can return apart from the fence.
>
> For instance amdgpu can return an ERR_PTR but I haven't looked into
> other drivers.

ERR_PTR and NULL are both fine. But good catch, I'll add that in v2.

P.

> Regards,
>
> Tvrtko
>
> >  	 */
> >  	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
> >
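A sketch of those return conventions, building on the hypothetical foo_job structure above: NULL and ERR_PTR() both end up in the else branch of drm_sched_run_job_work(), which completes the job via drm_sched_job_done(). The cancelled flag and the submission helper are made-up names.

static struct dma_fence *foo_run_job(struct drm_sched_job *sched_job)
{
	struct foo_job *job = container_of(sched_job, struct foo_job, base);
	int ret;

	if (job->cancelled)
		return NULL;		/* no HW fence; the scheduler completes the job with error 0 */

	ret = foo_submit_to_hw(job);
	if (ret)
		return ERR_PTR(ret);	/* the scheduler completes the job with this error code */

	/* Success: hand an extra reference to the scheduler, keep our own. */
	return dma_fence_get(&job->hw_fence);
}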
On 09.01.25 at 14:37, Philipp Stanner wrote:
> From: Philipp Stanner <pstanner@redhat.com>
>
> drm_sched_backend_ops.run_job() returns a dma_fence for the scheduler.
> That fence is signalled by the driver once the hardware completed the
> associated job. The scheduler does not increment the reference count on
> that fence, but implicitly expects to inherit this fence from run_job().
>
> [...]
>
>  	/**
> -	 * @run_job: Called to execute the job once all of the dependencies
> -	 * have been resolved. This may be called multiple times, if
> -	 * timedout_job() has happened and drm_sched_job_recovery()
> -	 * decides to try it again.
> +	 * @run_job: Called to execute the job once all of the dependencies
> +	 * have been resolved. This may be called multiple times, if
> +	 * timedout_job() has happened and drm_sched_job_recovery() decides to
> +	 * try it again.

I just came to realize that this hack with calling run_job multiple times
won't work any more with this patch here.

The background is that you can't allocate memory for a newly returned
fence, and as far as I know no driver pre-allocates multiple HW fences
for a job.

So at least amdgpu used to re-use the same HW fence as return value over
and over again and just re-initialized its reference count. I removed
that hack from amdgpu, but just FYI, it could be that other drivers did
the same.

Apart from that concern I think that this patch is really the right thing
and that driver hacks relying on the order of dropping references are
fundamentally broken approaches.

So feel free to add Reviewed-by: Christian König <christian.koenig@amd.com>.

Regards,
Christian.

> +	 *
> +	 * @sched_job: the job to run
> +	 *
> +	 * Returns: dma_fence the driver must signal once the hardware has
> +	 * completed the job ("hardware fence").
> +	 *
> +	 * Note that the scheduler expects to 'inherit' its own reference to
> +	 * this fence from the callback. It does not invoke an extra
> +	 * dma_fence_get() on it. Consequently, this callback must take a
> +	 * reference for the scheduler, and additional ones for the driver's
> +	 * respective needs.
>  	 */
>  	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>
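A rough sketch of the kind of re-use hack described here, again with hypothetical foo_* names rather than amdgpu's actual code. The point is that resetting the embedded fence's refcount on a resubmission clobbers references the scheduler side may still hold, which is why such hacks are fundamentally broken.

static struct dma_fence *foo_run_job(struct drm_sched_job *sched_job)
{
	struct foo_job *job = container_of(sched_job, struct foo_job, base);

	if (!job->submitted) {
		dma_fence_init(&job->hw_fence, &foo_fence_ops,
			       &job->fence_lock, foo_fence_context,
			       foo_next_seqno());
		job->submitted = true;
	} else {
		/*
		 * BROKEN: run_job() must not allocate memory for a fresh
		 * fence, so this hack recycled the embedded fence by
		 * resetting its refcount. The scheduler fence and pending
		 * callbacks may still hold references to the previous
		 * incarnation, and those are silently wiped out here.
		 */
		kref_init(&job->hw_fence.refcount);
		job->hw_fence.error = 0;
	}

	foo_submit_to_hw(job);

	return dma_fence_get(&job->hw_fence);
}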
On Thu, Jan 09, 2025 at 02:37:10PM +0100, Philipp Stanner wrote:
> From: Philipp Stanner <pstanner@redhat.com>
>
> drm_sched_backend_ops.run_job() returns a dma_fence for the scheduler.
> That fence is signalled by the driver once the hardware completed the
> associated job. The scheduler does not increment the reference count on
> that fence, but implicitly expects to inherit this fence from run_job().
>
> [...]
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 57da84908752..5f46c01eb01e 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1218,15 +1218,19 @@ static void drm_sched_run_job_work(struct work_struct *w)
>  	drm_sched_fence_scheduled(s_fence, fence);
>
>  	if (!IS_ERR_OR_NULL(fence)) {
> -		/* Drop for original kref_init of the fence */
> -		dma_fence_put(fence);
> -
>  		r = dma_fence_add_callback(fence, &sched_job->cb,
>  					   drm_sched_job_done_cb);
>  		if (r == -ENOENT)
>  			drm_sched_job_done(sched_job, fence->error);
>  		else if (r)
>  			DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n", r);
> +
> +		/*
> +		 * s_fence took a new reference to fence in the call to
> +		 * drm_sched_fence_scheduled() above. The reference passed by

I think mentioning that in this context is a bit misleading.

The reason we can put the fence here is that we stop using the local
fence pointer we have a reference for (from run_job()). This has nothing
to do with the fact that drm_sched_fence_scheduled() took its own
reference when it stored a copy of this fence pointer in a separate data
structure.

With that fixed,

Reviewed-by: Danilo Krummrich <dakr@kernel.org>

> +		 * run_job() above is now not needed any longer. Drop it.
> +		 */
> +		dma_fence_put(fence);
>  	} else {
>  		drm_sched_job_done(sched_job, IS_ERR(fence) ?
>  				   PTR_ERR(fence) : 0);
>
> [...]
>
>  	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>
> --
> 2.47.1
>
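Reduced to its core, the rule being pointed out here concerns the lifetime of the local pointer, not who else holds references. A minimal sketch with made-up helper names:

static void consume_hw_fence(struct dma_fence *fence)
{
	/* 'fence' came from run_job(); this function owns one reference. */

	other_users_take_their_own_references(fence);

	if (dma_fence_is_signaled(fence))
		pr_debug("already signalled, error=%d\n", fence->error);

	/*
	 * Dropping our reference is legal here solely because the last
	 * dereference of the local pointer is above - independent of any
	 * references other users may still hold.
	 */
	dma_fence_put(fence);
	/* 'fence' must not be dereferenced through this pointer any more. */
}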
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 57da84908752..5f46c01eb01e 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1218,15 +1218,19 @@ static void drm_sched_run_job_work(struct work_struct *w)
 	drm_sched_fence_scheduled(s_fence, fence);
 
 	if (!IS_ERR_OR_NULL(fence)) {
-		/* Drop for original kref_init of the fence */
-		dma_fence_put(fence);
-
 		r = dma_fence_add_callback(fence, &sched_job->cb,
 					   drm_sched_job_done_cb);
 		if (r == -ENOENT)
 			drm_sched_job_done(sched_job, fence->error);
 		else if (r)
 			DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n", r);
+
+		/*
+		 * s_fence took a new reference to fence in the call to
+		 * drm_sched_fence_scheduled() above. The reference passed by
+		 * run_job() above is now not needed any longer. Drop it.
+		 */
+		dma_fence_put(fence);
 	} else {
 		drm_sched_job_done(sched_job, IS_ERR(fence) ?
 				   PTR_ERR(fence) : 0);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 95e17504e46a..d5cd2a78f27c 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -420,10 +420,21 @@ struct drm_sched_backend_ops {
 				  struct drm_sched_entity *s_entity);
 
 	/**
-	 * @run_job: Called to execute the job once all of the dependencies
-	 * have been resolved. This may be called multiple times, if
-	 * timedout_job() has happened and drm_sched_job_recovery()
-	 * decides to try it again.
+	 * @run_job: Called to execute the job once all of the dependencies
+	 * have been resolved. This may be called multiple times, if
+	 * timedout_job() has happened and drm_sched_job_recovery() decides to
+	 * try it again.
+	 *
+	 * @sched_job: the job to run
+	 *
+	 * Returns: dma_fence the driver must signal once the hardware has
+	 * completed the job ("hardware fence").
+	 *
+	 * Note that the scheduler expects to 'inherit' its own reference to
+	 * this fence from the callback. It does not invoke an extra
+	 * dma_fence_get() on it. Consequently, this callback must take a
+	 * reference for the scheduler, and additional ones for the driver's
+	 * respective needs.
 	 */
 	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
 