diff mbox series

[1/2] drm/ttm: fix locking in vmap/vunmap TTM GEM helpers

Message ID 20220715111533.467012-1-christian.koenig@amd.com (mailing list archive)
State New, archived
Headers show
Series [1/2] drm/ttm: fix locking in vmap/vunmap TTM GEM helpers | expand

Commit Message

Christian König July 15, 2022, 11:15 a.m. UTC
I've stumbled over this while reviewing patches for DMA-buf and it looks
like we completely messed the locking up here.

In general most TTM function should only be called while holding the
appropriate BO resv lock. Without this we could break the internal
buffer object state here.

Only compile tested!

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 43676605f890 drm/ttm: Add vmap/vunmap to TTM and TTM GEM helpers
---
 drivers/gpu/drm/drm_gem_ttm_helper.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Dmitry Osipenko July 15, 2022, 11:31 a.m. UTC | #1
On 7/15/22 14:15, Christian König wrote:
> I've stumbled over this while reviewing patches for DMA-buf and it looks
> like we completely messed the locking up here.
> 
> In general most TTM function should only be called while holding the
> appropriate BO resv lock. Without this we could break the internal
> buffer object state here.

Could you please clarify which part of the TTM state is affected?
Christian König July 15, 2022, 11:33 a.m. UTC | #2
Am 15.07.22 um 13:31 schrieb Dmitry Osipenko:
> On 7/15/22 14:15, Christian König wrote:
>> I've stumbled over this while reviewing patches for DMA-buf and it looks
>> like we completely messed the locking up here.
>>
>> In general most TTM function should only be called while holding the
>> appropriate BO resv lock. Without this we could break the internal
>> buffer object state here.
> Could you please clarify which part of the TTM state is affected?

The ttm_bo_vmap() function calls ttm_mem_io_reserve() with bo->resource 
as parameter.

Since bo->resource is protected by the lock we must make sure that we 
are holding the lock while doing this.

Regards,
Christian.
Dmitry Osipenko July 15, 2022, 12:52 p.m. UTC | #3
On 7/15/22 14:33, Christian König wrote:
> Am 15.07.22 um 13:31 schrieb Dmitry Osipenko:
>> On 7/15/22 14:15, Christian König wrote:
>>> I've stumbled over this while reviewing patches for DMA-buf and it looks
>>> like we completely messed the locking up here.
>>>
>>> In general most TTM function should only be called while holding the
>>> appropriate BO resv lock. Without this we could break the internal
>>> buffer object state here.
>> Could you please clarify which part of the TTM state is affected?
> 
> The ttm_bo_vmap() function calls ttm_mem_io_reserve() with bo->resource
> as parameter.
> 
> Since bo->resource is protected by the lock we must make sure that we
> are holding the lock while doing this.

Thanks, I see now that the resource members need locking.
Thomas Zimmermann July 15, 2022, 3:36 p.m. UTC | #4
Hi

Am 15.07.22 um 13:15 schrieb Christian König:
> I've stumbled over this while reviewing patches for DMA-buf and it looks
> like we completely messed the locking up here.
> 
> In general most TTM function should only be called while holding the
> appropriate BO resv lock. Without this we could break the internal
> buffer object state here.

I remember that I tried to fix this before and you mentioned that it's 
not allowed to hold this lock here. It would possibly deadlock with 
exported buffers. Did this change with Dmitry's rework of the locking 
semantics?

Best regards
Thomas

> 
> Only compile tested!
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> Fixes: 43676605f890 drm/ttm: Add vmap/vunmap to TTM and TTM GEM helpers
> ---
>   drivers/gpu/drm/drm_gem_ttm_helper.c | 9 ++++++++-
>   1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_ttm_helper.c b/drivers/gpu/drm/drm_gem_ttm_helper.c
> index d5962a34c01d..e5fc875990c4 100644
> --- a/drivers/gpu/drm/drm_gem_ttm_helper.c
> +++ b/drivers/gpu/drm/drm_gem_ttm_helper.c
> @@ -64,8 +64,13 @@ int drm_gem_ttm_vmap(struct drm_gem_object *gem,
>   		     struct iosys_map *map)
>   {
>   	struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
> +	int ret;
> +
> +	dma_resv_lock(gem->resv, NULL);
> +	ret = ttm_bo_vmap(bo, map);
> +	dma_resv_unlock(gem->resv);
>   
> -	return ttm_bo_vmap(bo, map);
> +	return ret;
>   }
>   EXPORT_SYMBOL(drm_gem_ttm_vmap);
>   
> @@ -82,7 +87,9 @@ void drm_gem_ttm_vunmap(struct drm_gem_object *gem,
>   {
>   	struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
>   
> +	dma_resv_lock(gem->resv, NULL);
>   	ttm_bo_vunmap(bo, map);
> +	dma_resv_unlock(gem->resv);
>   }
>   EXPORT_SYMBOL(drm_gem_ttm_vunmap);
>
Christian König July 15, 2022, 3:58 p.m. UTC | #5
Am 15.07.22 um 17:36 schrieb Thomas Zimmermann:
> Hi
>
> Am 15.07.22 um 13:15 schrieb Christian König:
>> I've stumbled over this while reviewing patches for DMA-buf and it looks
>> like we completely messed the locking up here.
>>
>> In general most TTM function should only be called while holding the
>> appropriate BO resv lock. Without this we could break the internal
>> buffer object state here.
>
> I remember that I tried to fix this before and you mentioned that it's 
> not allowed to hold this lock here. It would possibly deadlock with 
> exported buffers. Did this change with Dmitry's rework of the locking 
> semantics?

No, can you point me to that?

My assumption was that drm_gem_vmap()/vunmap() is always called with the 
lock held, but Dmitry's work is now suggesting otherwise.

We certainly need to hold the lock when we call ttm_bo_vmap()/vunmap() 
because it needs to access the bo->resource.

Thanks,
Christian.

>
> Best regards
> Thomas
>
>>
>> Only compile tested!
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> Fixes: 43676605f890 drm/ttm: Add vmap/vunmap to TTM and TTM GEM helpers
>> ---
>>   drivers/gpu/drm/drm_gem_ttm_helper.c | 9 ++++++++-
>>   1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_gem_ttm_helper.c 
>> b/drivers/gpu/drm/drm_gem_ttm_helper.c
>> index d5962a34c01d..e5fc875990c4 100644
>> --- a/drivers/gpu/drm/drm_gem_ttm_helper.c
>> +++ b/drivers/gpu/drm/drm_gem_ttm_helper.c
>> @@ -64,8 +64,13 @@ int drm_gem_ttm_vmap(struct drm_gem_object *gem,
>>                struct iosys_map *map)
>>   {
>>       struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
>> +    int ret;
>> +
>> +    dma_resv_lock(gem->resv, NULL);
>> +    ret = ttm_bo_vmap(bo, map);
>> +    dma_resv_unlock(gem->resv);
>>   -    return ttm_bo_vmap(bo, map);
>> +    return ret;
>>   }
>>   EXPORT_SYMBOL(drm_gem_ttm_vmap);
>>   @@ -82,7 +87,9 @@ void drm_gem_ttm_vunmap(struct drm_gem_object *gem,
>>   {
>>       struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
>>   +    dma_resv_lock(gem->resv, NULL);
>>       ttm_bo_vunmap(bo, map);
>> +    dma_resv_unlock(gem->resv);
>>   }
>>   EXPORT_SYMBOL(drm_gem_ttm_vunmap);
>
Dmitry Osipenko July 17, 2022, 9:34 p.m. UTC | #6
On 7/15/22 18:58, Christian König wrote:
> Am 15.07.22 um 17:36 schrieb Thomas Zimmermann:
>> Hi
>>
>> Am 15.07.22 um 13:15 schrieb Christian König:
>>> I've stumbled over this while reviewing patches for DMA-buf and it looks
>>> like we completely messed the locking up here.
>>>
>>> In general most TTM function should only be called while holding the
>>> appropriate BO resv lock. Without this we could break the internal
>>> buffer object state here.
>>
>> I remember that I tried to fix this before and you mentioned that it's
>> not allowed to hold this lock here. It would possibly deadlock with
>> exported buffers. Did this change with Dmitry's rework of the locking
>> semantics?
> 
> No, can you point me to that?
> 
> My assumption was that drm_gem_vmap()/vunmap() is always called with the
> lock held, but Dmitry's work is now suggesting otherwise.
> 
> We certainly need to hold the lock when we call ttm_bo_vmap()/vunmap()
> because it needs to access the bo->resource.

The today's vmapping code paths all should be unlocked for the exported
TTM dma-bufs. My locking patches will change that. To me the Christian's
fix looks good. If there are out-of-tree drivers that do the locking on
the importer side, then we don't really care about them in the upstream
and they will have to adapt.

Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Thomas Zimmermann July 19, 2022, 7:19 a.m. UTC | #7
Hi

Am 15.07.22 um 17:58 schrieb Christian König:
> Am 15.07.22 um 17:36 schrieb Thomas Zimmermann:
>> Hi
>>
>> Am 15.07.22 um 13:15 schrieb Christian König:
>>> I've stumbled over this while reviewing patches for DMA-buf and it looks
>>> like we completely messed the locking up here.
>>>
>>> In general most TTM function should only be called while holding the
>>> appropriate BO resv lock. Without this we could break the internal
>>> buffer object state here.
>>
>> I remember that I tried to fix this before and you mentioned that it's 
>> not allowed to hold this lock here. It would possibly deadlock with 
>> exported buffers. Did this change with Dmitry's rework of the locking 
>> semantics?
> 
> No, can you point me to that?

I thought it was somewhere in the lengthy discussion at

 
https://lore.kernel.org/dri-devel/20201112132117.27228-1-tzimmermann@suse.de/

but now I cannot find it any longer. :/

> 
> My assumption was that drm_gem_vmap()/vunmap() is always called with the 
> lock held, but Dmitry's work is now suggesting otherwise.

IIRC the lock was supposed to be held, but every drivers does it 
differently. And dma-buf locking essentially worked by chance.

Best regards
Thomas

> 
> We certainly need to hold the lock when we call ttm_bo_vmap()/vunmap() 
> because it needs to access the bo->resource.
> 
> Thanks,
> Christian.
> 
>>
>> Best regards
>> Thomas
>>
>>>
>>> Only compile tested!
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> Fixes: 43676605f890 drm/ttm: Add vmap/vunmap to TTM and TTM GEM helpers
>>> ---
>>>   drivers/gpu/drm/drm_gem_ttm_helper.c | 9 ++++++++-
>>>   1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/drm_gem_ttm_helper.c 
>>> b/drivers/gpu/drm/drm_gem_ttm_helper.c
>>> index d5962a34c01d..e5fc875990c4 100644
>>> --- a/drivers/gpu/drm/drm_gem_ttm_helper.c
>>> +++ b/drivers/gpu/drm/drm_gem_ttm_helper.c
>>> @@ -64,8 +64,13 @@ int drm_gem_ttm_vmap(struct drm_gem_object *gem,
>>>                struct iosys_map *map)
>>>   {
>>>       struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
>>> +    int ret;
>>> +
>>> +    dma_resv_lock(gem->resv, NULL);
>>> +    ret = ttm_bo_vmap(bo, map);
>>> +    dma_resv_unlock(gem->resv);
>>>   -    return ttm_bo_vmap(bo, map);
>>> +    return ret;
>>>   }
>>>   EXPORT_SYMBOL(drm_gem_ttm_vmap);
>>>   @@ -82,7 +87,9 @@ void drm_gem_ttm_vunmap(struct drm_gem_object *gem,
>>>   {
>>>       struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
>>>   +    dma_resv_lock(gem->resv, NULL);
>>>       ttm_bo_vunmap(bo, map);
>>> +    dma_resv_unlock(gem->resv);
>>>   }
>>>   EXPORT_SYMBOL(drm_gem_ttm_vunmap);
>>
>
Christian König July 19, 2022, 7:27 a.m. UTC | #8
Am 19.07.22 um 09:19 schrieb Thomas Zimmermann:
> Hi
>
> Am 15.07.22 um 17:58 schrieb Christian König:
>> Am 15.07.22 um 17:36 schrieb Thomas Zimmermann:
>>> Hi
>>>
>>> Am 15.07.22 um 13:15 schrieb Christian König:
>>>> I've stumbled over this while reviewing patches for DMA-buf and it 
>>>> looks
>>>> like we completely messed the locking up here.
>>>>
>>>> In general most TTM function should only be called while holding the
>>>> appropriate BO resv lock. Without this we could break the internal
>>>> buffer object state here.
>>>
>>> I remember that I tried to fix this before and you mentioned that 
>>> it's not allowed to hold this lock here. It would possibly deadlock 
>>> with exported buffers. Did this change with Dmitry's rework of the 
>>> locking semantics?
>>
>> No, can you point me to that?
>
> I thought it was somewhere in the lengthy discussion at
>
>
> https://lore.kernel.org/dri-devel/20201112132117.27228-1-tzimmermann@suse.de/ 
>
>
> but now I cannot find it any longer. :/
>
>>
>> My assumption was that drm_gem_vmap()/vunmap() is always called with 
>> the lock held, but Dmitry's work is now suggesting otherwise.
>
> IIRC the lock was supposed to be held, but every drivers does it 
> differently. And dma-buf locking essentially worked by chance.

I guess we where just lucky that nobody moved the BO while it was being 
mapped.

I've now pushed the patch to take the look to drm-misc-fixes. If some 
call path really called this while already holding the locks we will 
obviously run into trouble, but I think that should be pretty easy to 
sort out.

Regards,
Christian.

>
> Best regards
> Thomas
>
>>
>> We certainly need to hold the lock when we call 
>> ttm_bo_vmap()/vunmap() because it needs to access the bo->resource.
>>
>> Thanks,
>> Christian.
>>
>>>
>>> Best regards
>>> Thomas
>>>
>>>>
>>>> Only compile tested!
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>> Fixes: 43676605f890 drm/ttm: Add vmap/vunmap to TTM and TTM GEM 
>>>> helpers
>>>> ---
>>>>   drivers/gpu/drm/drm_gem_ttm_helper.c | 9 ++++++++-
>>>>   1 file changed, 8 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/drm_gem_ttm_helper.c 
>>>> b/drivers/gpu/drm/drm_gem_ttm_helper.c
>>>> index d5962a34c01d..e5fc875990c4 100644
>>>> --- a/drivers/gpu/drm/drm_gem_ttm_helper.c
>>>> +++ b/drivers/gpu/drm/drm_gem_ttm_helper.c
>>>> @@ -64,8 +64,13 @@ int drm_gem_ttm_vmap(struct drm_gem_object *gem,
>>>>                struct iosys_map *map)
>>>>   {
>>>>       struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
>>>> +    int ret;
>>>> +
>>>> +    dma_resv_lock(gem->resv, NULL);
>>>> +    ret = ttm_bo_vmap(bo, map);
>>>> +    dma_resv_unlock(gem->resv);
>>>>   -    return ttm_bo_vmap(bo, map);
>>>> +    return ret;
>>>>   }
>>>>   EXPORT_SYMBOL(drm_gem_ttm_vmap);
>>>>   @@ -82,7 +87,9 @@ void drm_gem_ttm_vunmap(struct drm_gem_object 
>>>> *gem,
>>>>   {
>>>>       struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
>>>>   +    dma_resv_lock(gem->resv, NULL);
>>>>       ttm_bo_vunmap(bo, map);
>>>> +    dma_resv_unlock(gem->resv);
>>>>   }
>>>>   EXPORT_SYMBOL(drm_gem_ttm_vunmap);
>>>
>>
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/drm_gem_ttm_helper.c b/drivers/gpu/drm/drm_gem_ttm_helper.c
index d5962a34c01d..e5fc875990c4 100644
--- a/drivers/gpu/drm/drm_gem_ttm_helper.c
+++ b/drivers/gpu/drm/drm_gem_ttm_helper.c
@@ -64,8 +64,13 @@  int drm_gem_ttm_vmap(struct drm_gem_object *gem,
 		     struct iosys_map *map)
 {
 	struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
+	int ret;
+
+	dma_resv_lock(gem->resv, NULL);
+	ret = ttm_bo_vmap(bo, map);
+	dma_resv_unlock(gem->resv);
 
-	return ttm_bo_vmap(bo, map);
+	return ret;
 }
 EXPORT_SYMBOL(drm_gem_ttm_vmap);
 
@@ -82,7 +87,9 @@  void drm_gem_ttm_vunmap(struct drm_gem_object *gem,
 {
 	struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
 
+	dma_resv_lock(gem->resv, NULL);
 	ttm_bo_vunmap(bo, map);
+	dma_resv_unlock(gem->resv);
 }
 EXPORT_SYMBOL(drm_gem_ttm_vunmap);