Message ID | 20191129135908.2439529-9-boris.brezillon@collabora.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | panfrost: Fixes for 5.4 | expand |
On 29/11/2019 13:59, Boris Brezillon wrote: > Userspace might tag a BO purgeable while it's still referenced by GPU > jobs. We need to make sure the shrinker does not purge such BOs until > all jobs referencing it are finished. > > Fixes: 013b65101315 ("drm/panfrost: Add madvise and shrinker support") > Cc: <stable@vger.kernel.org> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> I'm happy with this, but I would also argue that it was previously user space's job not to mark a BO purgeable while it's in use by the GPU. This is in some ways an ABI change so we should be sure this is something that we want to support "forever" more. But if Mesa has 'always' been assuming this behaviour we might as well match. Reviewed-by: Steven Price <steven.price@arm.com> Steve > --- > drivers/gpu/drm/panfrost/panfrost_drv.c | 1 + > drivers/gpu/drm/panfrost/panfrost_gem.h | 6 ++++++ > drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c | 2 ++ > drivers/gpu/drm/panfrost/panfrost_job.c | 7 ++++++- > 4 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c > index b406b5243b40..297c0e7304d2 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_drv.c > +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c > @@ -166,6 +166,7 @@ panfrost_lookup_bos(struct drm_device *dev, > break; > } > > + atomic_inc(&bo->gpu_usecount); > job->mappings[i] = mapping; > } > > diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h > index ca1bc9019600..b3517ff9630c 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_gem.h > +++ b/drivers/gpu/drm/panfrost/panfrost_gem.h > @@ -30,6 +30,12 @@ struct panfrost_gem_object { > struct mutex lock; > } mappings; > > + /* > + * Count the number of jobs referencing this BO so we don't let the > + * shrinker reclaim this object prematurely. > + */ > + atomic_t gpu_usecount; > + > bool noexec :1; > bool is_heap :1; > }; > diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c > index b36df326c860..288e46c40673 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c > +++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c > @@ -41,6 +41,8 @@ static bool panfrost_gem_purge(struct drm_gem_object *obj) > struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj); > struct panfrost_gem_object *bo = to_panfrost_bo(obj); > > + if (atomic_read(&bo->gpu_usecount)) > + return false; > > if (!mutex_trylock(&shmem->pages_lock)) > return false; > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c > index c85d45be3b5e..2b12aa87ff32 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_job.c > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c > @@ -270,8 +270,13 @@ static void panfrost_job_cleanup(struct kref *ref) > dma_fence_put(job->render_done_fence); > > if (job->mappings) { > - for (i = 0; i < job->bo_count; i++) > + for (i = 0; i < job->bo_count; i++) { > + if (!job->mappings[i]) > + break; > + > + atomic_dec(&job->mappings[i]->obj->gpu_usecount); > panfrost_gem_mapping_put(job->mappings[i]); > + } > kvfree(job->mappings); > } > >
On Fri, 29 Nov 2019 15:48:15 +0000 Steven Price <steven.price@arm.com> wrote: > On 29/11/2019 13:59, Boris Brezillon wrote: > > Userspace might tag a BO purgeable while it's still referenced by GPU > > jobs. We need to make sure the shrinker does not purge such BOs until > > all jobs referencing it are finished. > > > > Fixes: 013b65101315 ("drm/panfrost: Add madvise and shrinker support") > > Cc: <stable@vger.kernel.org> > > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> > > I'm happy with this, but I would also argue that it was previously user > space's job not to mark a BO purgeable while it's in use by the GPU. I was not aware of this design choice. > This is in some ways an ABI change so we should be sure this is > something that we want to support "forever" more. Well, in that case, the ABI change is bringing extra robustness, with AFAICT, no downside for those that were taking care of that in userspace. > But if Mesa has > 'always' been assuming this behaviour we might as well match. I think so, and VC4 seems to have the same expectations. > > Reviewed-by: Steven Price <steven.price@arm.com> Thanks for the review.
On 29/11/2019 16:07, Boris Brezillon wrote: > On Fri, 29 Nov 2019 15:48:15 +0000 > Steven Price <steven.price@arm.com> wrote: > >> On 29/11/2019 13:59, Boris Brezillon wrote: >>> Userspace might tag a BO purgeable while it's still referenced by GPU >>> jobs. We need to make sure the shrinker does not purge such BOs until >>> all jobs referencing it are finished. >>> >>> Fixes: 013b65101315 ("drm/panfrost: Add madvise and shrinker support") >>> Cc: <stable@vger.kernel.org> >>> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> >> >> I'm happy with this, but I would also argue that it was previously user >> space's job not to mark a BO purgeable while it's in use by the GPU. > > I was not aware of this design choice. When I was maintaining mali_kbase I would have said no to this ;) But thankfully I've moved on! I'm not sure whether anyone actually made a design choice for Panfrost here. >> This is in some ways an ABI change so we should be sure this is >> something that we want to support "forever" more. > > Well, in that case, the ABI change is bringing extra robustness, with > AFAICT, no downside for those that were taking care of that in > userspace. Yes, there's no downside for user space - this is just giving user space extra freedom. >> But if Mesa has >> 'always' been assuming this behaviour we might as well match. > > I think so, and VC4 seems to have the same expectations. This seems like enough justification to me. Steve >> >> Reviewed-by: Steven Price <steven.price@arm.com> > > Thanks for the review. > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel >
On 29/11/2019 1:59 pm, Boris Brezillon wrote: > Userspace might tag a BO purgeable while it's still referenced by GPU > jobs. We need to make sure the shrinker does not purge such BOs until > all jobs referencing it are finished. Nit: for extra robustness, perhaps it's worth using the refcount_t API rather than bare atomic_t? Robin. > Fixes: 013b65101315 ("drm/panfrost: Add madvise and shrinker support") > Cc: <stable@vger.kernel.org> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> > --- > drivers/gpu/drm/panfrost/panfrost_drv.c | 1 + > drivers/gpu/drm/panfrost/panfrost_gem.h | 6 ++++++ > drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c | 2 ++ > drivers/gpu/drm/panfrost/panfrost_job.c | 7 ++++++- > 4 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c > index b406b5243b40..297c0e7304d2 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_drv.c > +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c > @@ -166,6 +166,7 @@ panfrost_lookup_bos(struct drm_device *dev, > break; > } > > + atomic_inc(&bo->gpu_usecount); > job->mappings[i] = mapping; > } > > diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h > index ca1bc9019600..b3517ff9630c 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_gem.h > +++ b/drivers/gpu/drm/panfrost/panfrost_gem.h > @@ -30,6 +30,12 @@ struct panfrost_gem_object { > struct mutex lock; > } mappings; > > + /* > + * Count the number of jobs referencing this BO so we don't let the > + * shrinker reclaim this object prematurely. > + */ > + atomic_t gpu_usecount; > + > bool noexec :1; > bool is_heap :1; > }; > diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c > index b36df326c860..288e46c40673 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c > +++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c > @@ -41,6 +41,8 @@ static bool panfrost_gem_purge(struct drm_gem_object *obj) > struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj); > struct panfrost_gem_object *bo = to_panfrost_bo(obj); > > + if (atomic_read(&bo->gpu_usecount)) > + return false; > > if (!mutex_trylock(&shmem->pages_lock)) > return false; > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c > index c85d45be3b5e..2b12aa87ff32 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_job.c > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c > @@ -270,8 +270,13 @@ static void panfrost_job_cleanup(struct kref *ref) > dma_fence_put(job->render_done_fence); > > if (job->mappings) { > - for (i = 0; i < job->bo_count; i++) > + for (i = 0; i < job->bo_count; i++) { > + if (!job->mappings[i]) > + break; > + > + atomic_dec(&job->mappings[i]->obj->gpu_usecount); > panfrost_gem_mapping_put(job->mappings[i]); > + } > kvfree(job->mappings); > } > >
On Mon, 2 Dec 2019 12:50:20 +0000 Robin Murphy <robin.murphy@arm.com> wrote: > On 29/11/2019 1:59 pm, Boris Brezillon wrote: > > Userspace might tag a BO purgeable while it's still referenced by GPU > > jobs. We need to make sure the shrinker does not purge such BOs until > > all jobs referencing it are finished. > > Nit: for extra robustness, perhaps it's worth using the refcount_t API > rather than bare atomic_t? I considered doing that. The problem is, we start counting from 0, not 1, and the refcount API assumes counters start at 0, and should never see a 0 -> 1 transition. I guess we could do if (refcount_inc_not_zero()) { ... } else { refcount_set(1); ... } so we at least get the integer overflow/underflow protection. Anyway, I'm reworking the gem_shmem code so we can refcount the sgt users (I actually re-use the page_use_count counter and turn into a refcount_t so we don't need to take the lock if it's > 0). With this change I think I won't need the gpu_usecount.
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c index b406b5243b40..297c0e7304d2 100644 --- a/drivers/gpu/drm/panfrost/panfrost_drv.c +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c @@ -166,6 +166,7 @@ panfrost_lookup_bos(struct drm_device *dev, break; } + atomic_inc(&bo->gpu_usecount); job->mappings[i] = mapping; } diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h index ca1bc9019600..b3517ff9630c 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gem.h +++ b/drivers/gpu/drm/panfrost/panfrost_gem.h @@ -30,6 +30,12 @@ struct panfrost_gem_object { struct mutex lock; } mappings; + /* + * Count the number of jobs referencing this BO so we don't let the + * shrinker reclaim this object prematurely. + */ + atomic_t gpu_usecount; + bool noexec :1; bool is_heap :1; }; diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c index b36df326c860..288e46c40673 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c +++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c @@ -41,6 +41,8 @@ static bool panfrost_gem_purge(struct drm_gem_object *obj) struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj); struct panfrost_gem_object *bo = to_panfrost_bo(obj); + if (atomic_read(&bo->gpu_usecount)) + return false; if (!mutex_trylock(&shmem->pages_lock)) return false; diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c index c85d45be3b5e..2b12aa87ff32 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/panfrost_job.c @@ -270,8 +270,13 @@ static void panfrost_job_cleanup(struct kref *ref) dma_fence_put(job->render_done_fence); if (job->mappings) { - for (i = 0; i < job->bo_count; i++) + for (i = 0; i < job->bo_count; i++) { + if (!job->mappings[i]) + break; + + atomic_dec(&job->mappings[i]->obj->gpu_usecount); panfrost_gem_mapping_put(job->mappings[i]); + } kvfree(job->mappings); }
Userspace might tag a BO purgeable while it's still referenced by GPU jobs. We need to make sure the shrinker does not purge such BOs until all jobs referencing it are finished. Fixes: 013b65101315 ("drm/panfrost: Add madvise and shrinker support") Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> --- drivers/gpu/drm/panfrost/panfrost_drv.c | 1 + drivers/gpu/drm/panfrost/panfrost_gem.h | 6 ++++++ drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c | 2 ++ drivers/gpu/drm/panfrost/panfrost_job.c | 7 ++++++- 4 files changed, 15 insertions(+), 1 deletion(-)