
[1/2] accel/ivpu: Add dma fence to command buffers only

Message ID 20230331113603.2802515-2-stanislaw.gruszka@linux.intel.com (mailing list archive)
State New, archived
Series accel/ivpu: Fixes for linux-6.3-rc6

Commit Message

Stanislaw Gruszka March 31, 2023, 11:36 a.m. UTC
From: Karol Wachowski <karol.wachowski@linux.intel.com>

Currently job->done_fence is added to every BO handle within a job. If a job
handle (command buffer) is shared between multiple submits, the KMD will add
the fence to each of them. Then bo_wait_ioctl() executed on the command buffer
will exit only when all jobs containing that handle are done.

This creates a deadlock scenario for the user mode driver when a job handle
is added as a dependency of another job: bo_wait_ioctl() of the first job
will wait until the second job finishes, and the second job cannot finish
before the first one.

Adding fences only to the job buffer handle allows user space to execute
bo_wait_ioctl() on the job even if its handle is submitted with another job.

Fixes: cd7272215c44 ("accel/ivpu: Add command buffer submission logic")
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
---
 drivers/accel/ivpu/ivpu_job.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

Comments

Jeffrey Hugo April 4, 2023, 4:26 p.m. UTC | #1
On 3/31/2023 5:36 AM, Stanislaw Gruszka wrote:
> From: Karol Wachowski <karol.wachowski@linux.intel.com>
> 
> Currently job->done_fence is added to every BO handle within a job. If a job
> handle (command buffer) is shared between multiple submits, the KMD will add
> the fence to each of them. Then bo_wait_ioctl() executed on the command buffer
> will exit only when all jobs containing that handle are done.
> 
> This creates a deadlock scenario for the user mode driver when a job handle
> is added as a dependency of another job: bo_wait_ioctl() of the first job
> will wait until the second job finishes, and the second job cannot finish
> before the first one.
> 
> Adding fences only to the job buffer handle allows user space to execute
> bo_wait_ioctl() on the job even if its handle is submitted with another job.
> 
> Fixes: cd7272215c44 ("accel/ivpu: Add command buffer submission logic")
> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>

Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Daniel Vetter April 6, 2023, 4:57 p.m. UTC | #2
On Fri, Mar 31, 2023 at 01:36:02PM +0200, Stanislaw Gruszka wrote:
> From: Karol Wachowski <karol.wachowski@linux.intel.com>
> 
> Currently job->done_fence is added to every BO handle within a job. If a job
> handle (command buffer) is shared between multiple submits, the KMD will add
> the fence to each of them. Then bo_wait_ioctl() executed on the command buffer
> will exit only when all jobs containing that handle are done.
> 
> This creates a deadlock scenario for the user mode driver when a job handle
> is added as a dependency of another job: bo_wait_ioctl() of the first job
> will wait until the second job finishes, and the second job cannot finish
> before the first one.
> 
> Adding fences only to the job buffer handle allows user space to execute
> bo_wait_ioctl() on the job even if its handle is submitted with another job.
> 
> Fixes: cd7272215c44 ("accel/ivpu: Add command buffer submission logic")
> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>

Uh this is confused on a lot of levels ...

First, having a new driver which uses the dma_resv/bo implicit fencing for
umd synchronization is not great at all. The modern way of doing this is:
- in/out dependencies are handled with drm_syncobj
- userspace waits on the drm_syncobj, not with a driver-private wait ioctl
  on a specific bo

The other issue is that the below starts to fall over once you do dynamic
memory management, for that case you _always_ have to install a fence.

Now fear not, there's a solution here:
- you always install a fence (i.e. revert this patch), but by default it is a
  DMA_RESV_USAGE_BOOKKEEP fence. See the kerneldoc for enum dma_resv_usage
  for what that means.
- only for the special job bo you set the DMA_RESV_USAGE_READ flag. You
  want read because really that's what the gpu is doing for the job bo.
- the bo_wait ioctl then waits for write access internally

Should result in the same uapi as your patch here, but without abusing
dma_resv in a way that'll go boom.

Note that userspace can get at the dma_resv READ/WRITE fences through
ioctls on a dma-buf, so which one you pick here is uabi relevant.
bo-as-job-fence is USAGE_READ.

Please take care of this in -next.

Cheers, Daniel

> ---
>  drivers/accel/ivpu/ivpu_job.c | 18 +++++++-----------
>  1 file changed, 7 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
> index 910301fae435..3c6f1e16cf2f 100644
> --- a/drivers/accel/ivpu/ivpu_job.c
> +++ b/drivers/accel/ivpu/ivpu_job.c
> @@ -461,26 +461,22 @@ ivpu_job_prepare_bos_for_submit(struct drm_file *file, struct ivpu_job *job, u32
>  
>  	job->cmd_buf_vpu_addr = bo->vpu_addr + commands_offset;
>  
> -	ret = drm_gem_lock_reservations((struct drm_gem_object **)job->bos, buf_count,
> -					&acquire_ctx);
> +	ret = drm_gem_lock_reservations((struct drm_gem_object **)job->bos, 1, &acquire_ctx);
>  	if (ret) {
>  		ivpu_warn(vdev, "Failed to lock reservations: %d\n", ret);
>  		return ret;
>  	}
>  
> -	for (i = 0; i < buf_count; i++) {
> -		ret = dma_resv_reserve_fences(job->bos[i]->base.resv, 1);
> -		if (ret) {
> -			ivpu_warn(vdev, "Failed to reserve fences: %d\n", ret);
> -			goto unlock_reservations;
> -		}
> +	ret = dma_resv_reserve_fences(bo->base.resv, 1);
> +	if (ret) {
> +		ivpu_warn(vdev, "Failed to reserve fences: %d\n", ret);
> +		goto unlock_reservations;
>  	}
>  
> -	for (i = 0; i < buf_count; i++)
> -		dma_resv_add_fence(job->bos[i]->base.resv, job->done_fence, DMA_RESV_USAGE_WRITE);
> +	dma_resv_add_fence(bo->base.resv, job->done_fence, DMA_RESV_USAGE_WRITE);
>  
>  unlock_reservations:
> -	drm_gem_unlock_reservations((struct drm_gem_object **)job->bos, buf_count, &acquire_ctx);
> +	drm_gem_unlock_reservations((struct drm_gem_object **)job->bos, 1, &acquire_ctx);
>  
>  	wmb(); /* Flush write combining buffers */
>  
> -- 
> 2.25.1
>
Stanislaw Gruszka April 12, 2023, 8:50 a.m. UTC | #3
On Thu, Apr 06, 2023 at 06:57:34PM +0200, Daniel Vetter wrote:
> On Fri, Mar 31, 2023 at 01:36:02PM +0200, Stanislaw Gruszka wrote:
> > From: Karol Wachowski <karol.wachowski@linux.intel.com>
> > 
> > Currently job->done_fence is added to every BO handle within a job. If a job
> > handle (command buffer) is shared between multiple submits, the KMD will add
> > the fence to each of them. Then bo_wait_ioctl() executed on the command buffer
> > will exit only when all jobs containing that handle are done.
> > 
> > This creates a deadlock scenario for the user mode driver when a job handle
> > is added as a dependency of another job: bo_wait_ioctl() of the first job
> > will wait until the second job finishes, and the second job cannot finish
> > before the first one.
> > 
> > Adding fences only to the job buffer handle allows user space to execute
> > bo_wait_ioctl() on the job even if its handle is submitted with another job.
> > 
> > Fixes: cd7272215c44 ("accel/ivpu: Add command buffer submission logic")
> > Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
> > Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
> 
> Uh this is confused on a lot of levels ...
> 
> First, having a new driver which uses the dma_resv/bo implicit fencing for
> umd synchronization is not great at all. The modern way of doing this is:
> - in/out dependencies are handled with drm_syncobj
> - userspace waits on the drm_syncobj, not with a driver-private wait ioctl
>   on a specific bo

We have syncobj on our TODO list, will work on it.

> The other issue is that the below starts to fall over once you do dynamic
> memory management, for that case you _always_ have to install a fence.
> 
> Now fear not, there's a solution here:
> - you always install a fence (i.e. revert this patch), but by default it is a
>   DMA_RESV_USAGE_BOOKKEEP fence. See the kerneldoc for enum dma_resv_usage
>   for what that means.
> - only for the special job bo you set the DMA_RESV_USAGE_READ flag. You
>   want read because really that's what the gpu is doing for the job bo.
> - the bo_wait ioctl then waits for write access internally

In our case the VPU can write to the command buffer (there is a special
context save area), so I think keeping USAGE_WRITE is appropriate.

> Should result in the same uapi as your patch here, but without abusing
> dma_resv in a way that'll go boom.
> 
> Note that userspace can get at the dma_resv READ/WRITE fences through
> ioctls on a dma-buf, so which one you pick here is uabi relevant.
> bo-as-job-fence is USAGE_READ.
> 
> Please take care of this in -next.

Sure.

Regards
Stanislaw

Patch

diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 910301fae435..3c6f1e16cf2f 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -461,26 +461,22 @@  ivpu_job_prepare_bos_for_submit(struct drm_file *file, struct ivpu_job *job, u32
 
 	job->cmd_buf_vpu_addr = bo->vpu_addr + commands_offset;
 
-	ret = drm_gem_lock_reservations((struct drm_gem_object **)job->bos, buf_count,
-					&acquire_ctx);
+	ret = drm_gem_lock_reservations((struct drm_gem_object **)job->bos, 1, &acquire_ctx);
 	if (ret) {
 		ivpu_warn(vdev, "Failed to lock reservations: %d\n", ret);
 		return ret;
 	}
 
-	for (i = 0; i < buf_count; i++) {
-		ret = dma_resv_reserve_fences(job->bos[i]->base.resv, 1);
-		if (ret) {
-			ivpu_warn(vdev, "Failed to reserve fences: %d\n", ret);
-			goto unlock_reservations;
-		}
+	ret = dma_resv_reserve_fences(bo->base.resv, 1);
+	if (ret) {
+		ivpu_warn(vdev, "Failed to reserve fences: %d\n", ret);
+		goto unlock_reservations;
 	}
 
-	for (i = 0; i < buf_count; i++)
-		dma_resv_add_fence(job->bos[i]->base.resv, job->done_fence, DMA_RESV_USAGE_WRITE);
+	dma_resv_add_fence(bo->base.resv, job->done_fence, DMA_RESV_USAGE_WRITE);
 
 unlock_reservations:
-	drm_gem_unlock_reservations((struct drm_gem_object **)job->bos, buf_count, &acquire_ctx);
+	drm_gem_unlock_reservations((struct drm_gem_object **)job->bos, 1, &acquire_ctx);
 
 	wmb(); /* Flush write combining buffers */