diff mbox series

drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail

Message ID 20200727213017.852589-1-daniel.vetter@ffwll.ch (mailing list archive)
State New, archived
Headers show
Series drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail | expand

Commit Message

Daniel Vetter July 27, 2020, 9:30 p.m. UTC
Trying to grab dma_resv_lock while in commit_tail before we've done
all the code that leads to the eventual signalling of the vblank event
(which can be a dma_fence) is deadlock-y. Don't do that.

Here the solution is easy because just grabbing locks to read
something races anyway. We don't need to bother, READ_ONCE is
equivalent. And avoids the locking issue.

v2: Also take into account tmz_surface boolean, plus just delete the
old code.

Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
DC-folks, I think this split out patch from my series here

https://lore.kernel.org/dri-devel/20200707201229.472834-1-daniel.vetter@ffwll.ch/

should be ready for review/merging. I fixed it up a bit so that it's not
just a gross hack :-)

Cheers, Daniel


---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

Comments

Christian König July 28, 2020, 6:56 a.m. UTC | #1
Am 27.07.20 um 23:30 schrieb Daniel Vetter:
> Trying to grab dma_resv_lock while in commit_tail before we've done
> all the code that leads to the eventual signalling of the vblank event
> (which can be a dma_fence) is deadlock-y. Don't do that.
>
> Here the solution is easy because just grabbing locks to read
> something races anyway. We don't need to bother, READ_ONCE is
> equivalent. And avoids the locking issue.
>
> v2: Also take into account tmz_surface boolean, plus just delete the
> old code.
>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> Cc: linux-rdma@vger.kernel.org
> Cc: amd-gfx@lists.freedesktop.org
> Cc: intel-gfx@lists.freedesktop.org
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
> DC-folks, I think this split out patch from my series here
>
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fdri-devel%2F20200707201229.472834-1-daniel.vetter%40ffwll.ch%2F&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7C8a4f5736682a4b5c943e08d832747ab1%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637314823145521840&amp;sdata=qd7Nrox62Lr%2FXWbJJFVskg9RYL4%2FoRVCFjR6rUDMA5E%3D&amp;reserved=0
>
> should be ready for review/merging. I fixed it up a bit so that it's not
> just a gross hack :-)
>
> Cheers, Daniel
>
>
> ---
>   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 19 ++++++-------------
>   1 file changed, 6 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 21ec64fe5527..a20b62b1f2ef 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -6959,20 +6959,13 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>   			DRM_ERROR("Waiting for fences timed out!");
>   
>   		/*
> -		 * TODO This might fail and hence better not used, wait
> -		 * explicitly on fences instead
> -		 * and in general should be called for
> -		 * blocking commit to as per framework helpers
> +		 * We cannot reserve buffers here, which means the normal flag
> +		 * access functions don't work. Paper over this with READ_ONCE,
> +		 * but maybe the flags are invariant enough that not even that
> +		 * would be needed.
>   		 */
> -		r = amdgpu_bo_reserve(abo, true);
> -		if (unlikely(r != 0))
> -			DRM_ERROR("failed to reserve buffer before flip\n");
> -
> -		amdgpu_bo_get_tiling_flags(abo, &tiling_flags);
> -
> -		tmz_surface = amdgpu_bo_encrypted(abo);
> -
> -		amdgpu_bo_unreserve(abo);
> +		tiling_flags = READ_ONCE(abo->tiling_flags);
> +		tmz_surface = READ_ONCE(abo->flags) & AMDGPU_GEM_CREATE_ENCRYPTED;

Yeah, the abo->flags are mostly fixed after creation, especially the 
encrypted flag can't change or we corrupt page table tables. So that 
should work fine.

Anybody who picks this up feel free to add an Reviewed-by: Christian 
König <christian.koenig@amd.com>.

Regards,
Christian.

>   
>   		fill_dc_plane_info_and_addr(
>   			dm->adev, new_plane_state, tiling_flags,
Alex Deucher July 29, 2020, 10:10 p.m. UTC | #2
Applied.  Thanks!

Alex

On Tue, Jul 28, 2020 at 2:56 AM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 27.07.20 um 23:30 schrieb Daniel Vetter:
> > Trying to grab dma_resv_lock while in commit_tail before we've done
> > all the code that leads to the eventual signalling of the vblank event
> > (which can be a dma_fence) is deadlock-y. Don't do that.
> >
> > Here the solution is easy because just grabbing locks to read
> > something races anyway. We don't need to bother, READ_ONCE is
> > equivalent. And avoids the locking issue.
> >
> > v2: Also take into account tmz_surface boolean, plus just delete the
> > old code.
> >
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > Cc: linux-rdma@vger.kernel.org
> > Cc: amd-gfx@lists.freedesktop.org
> > Cc: intel-gfx@lists.freedesktop.org
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > ---
> > DC-folks, I think this split out patch from my series here
> >
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fdri-devel%2F20200707201229.472834-1-daniel.vetter%40ffwll.ch%2F&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7C8a4f5736682a4b5c943e08d832747ab1%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637314823145521840&amp;sdata=qd7Nrox62Lr%2FXWbJJFVskg9RYL4%2FoRVCFjR6rUDMA5E%3D&amp;reserved=0
> >
> > should be ready for review/merging. I fixed it up a bit so that it's not
> > just a gross hack :-)
> >
> > Cheers, Daniel
> >
> >
> > ---
> >   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 19 ++++++-------------
> >   1 file changed, 6 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 21ec64fe5527..a20b62b1f2ef 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -6959,20 +6959,13 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
> >                       DRM_ERROR("Waiting for fences timed out!");
> >
> >               /*
> > -              * TODO This might fail and hence better not used, wait
> > -              * explicitly on fences instead
> > -              * and in general should be called for
> > -              * blocking commit to as per framework helpers
> > +              * We cannot reserve buffers here, which means the normal flag
> > +              * access functions don't work. Paper over this with READ_ONCE,
> > +              * but maybe the flags are invariant enough that not even that
> > +              * would be needed.
> >                */
> > -             r = amdgpu_bo_reserve(abo, true);
> > -             if (unlikely(r != 0))
> > -                     DRM_ERROR("failed to reserve buffer before flip\n");
> > -
> > -             amdgpu_bo_get_tiling_flags(abo, &tiling_flags);
> > -
> > -             tmz_surface = amdgpu_bo_encrypted(abo);
> > -
> > -             amdgpu_bo_unreserve(abo);
> > +             tiling_flags = READ_ONCE(abo->tiling_flags);
> > +             tmz_surface = READ_ONCE(abo->flags) & AMDGPU_GEM_CREATE_ENCRYPTED;
>
> Yeah, the abo->flags are mostly fixed after creation, especially the
> encrypted flag can't change or we corrupt page table tables. So that
> should work fine.
>
> Anybody who picks this up feel free to add an Reviewed-by: Christian
> König <christian.koenig@amd.com>.
>
> Regards,
> Christian.
>
> >
> >               fill_dc_plane_info_and_addr(
> >                       dm->adev, new_plane_state, tiling_flags,
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
diff mbox series

Patch

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 21ec64fe5527..a20b62b1f2ef 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6959,20 +6959,13 @@  static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
 			DRM_ERROR("Waiting for fences timed out!");
 
 		/*
-		 * TODO This might fail and hence better not used, wait
-		 * explicitly on fences instead
-		 * and in general should be called for
-		 * blocking commit to as per framework helpers
+		 * We cannot reserve buffers here, which means the normal flag
+		 * access functions don't work. Paper over this with READ_ONCE,
+		 * but maybe the flags are invariant enough that not even that
+		 * would be needed.
 		 */
-		r = amdgpu_bo_reserve(abo, true);
-		if (unlikely(r != 0))
-			DRM_ERROR("failed to reserve buffer before flip\n");
-
-		amdgpu_bo_get_tiling_flags(abo, &tiling_flags);
-
-		tmz_surface = amdgpu_bo_encrypted(abo);
-
-		amdgpu_bo_unreserve(abo);
+		tiling_flags = READ_ONCE(abo->tiling_flags);
+		tmz_surface = READ_ONCE(abo->flags) & AMDGPU_GEM_CREATE_ENCRYPTED;
 
 		fill_dc_plane_info_and_addr(
 			dm->adev, new_plane_state, tiling_flags,