drm/ttm: fix one use-after-free

Message ID 20230705053544.346139-1-Lang.Yu@amd.com (mailing list archive)
State New, archived

Commit Message

Lang Yu July 5, 2023, 5:35 a.m. UTC
[   67.399887] refcount_t: underflow; use-after-free.
[   67.399901] WARNING: CPU: 0 PID: 3172 at lib/refcount.c:28 refcount_warn_saturate+0xc2/0x110
[   67.400124] RIP: 0010:refcount_warn_saturate+0xc2/0x110
[   67.400173] Call Trace:
[   67.400176]  <TASK>
[   67.400181]  ttm_mem_evict_first+0x4fe/0x5b0 [ttm]
[   67.400216]  ttm_bo_mem_space+0x1e3/0x240 [ttm]
[   67.400239]  ttm_bo_validate+0xc7/0x190 [ttm]
[   67.400253]  ? ww_mutex_trylock+0x1b1/0x390
[   67.400266]  ttm_bo_init_reserved+0x183/0x1c0 [ttm]
[   67.400280]  ? __rwlock_init+0x3d/0x70
[   67.400292]  amdgpu_bo_create+0x1cd/0x4f0 [amdgpu]
[   67.400607]  ? __pfx_amdgpu_bo_user_destroy+0x10/0x10 [amdgpu]
[   67.400980]  amdgpu_bo_create_user+0x38/0x70 [amdgpu]
[   67.401291]  amdgpu_gem_object_create+0x77/0xb0 [amdgpu]
[   67.401641]  ? __pfx_amdgpu_bo_user_destroy+0x10/0x10 [amdgpu]
[   67.401958]  amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x228/0xa30 [amdgpu]
[   67.402433]  kfd_ioctl_alloc_memory_of_gpu+0x14e/0x390 [amdgpu]
[   67.402824]  ? lock_release+0x13f/0x290
[   67.402838]  kfd_ioctl+0x1e0/0x640 [amdgpu]
[   67.403205]  ? __pfx_kfd_ioctl_alloc_memory_of_gpu+0x10/0x10 [amdgpu]
[   67.403579]  ? tomoyo_file_ioctl+0x19/0x20
[   67.403590]  __x64_sys_ioctl+0x95/0xd0
[   67.403601]  do_syscall_64+0x3b/0x90
[   67.403609]  entry_SYSCALL_64_after_hwframe+0x72/0xdc

Fixes: 9bff18d13473 ("drm/ttm: use per BO cleanup workers")

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Lang Yu July 5, 2023, 8:31 a.m. UTC | #1
Please ignore this patch; it will cause another issue.
I will send a new one.

Regards,
Lang

Christian König July 5, 2023, 9:06 a.m. UTC | #2
I was just about to complain that this is certainly incorrect.

But it's strange that ttm_mem_evict_first causes the warning in the 
first place since it should never try to evict a BO which is about to be 
destroyed.

Regards,
Christian.

On 05.07.23 at 10:31, Lang Yu wrote:
> Please ignore this patch, it will cause another issue.
> Will send a new one.
>
> Regards,
> Lang

Patch

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index bd5dae4d1624..e047b191001c 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -308,6 +308,9 @@ static void ttm_bo_delayed_delete(struct work_struct *work)
 
 	bo = container_of(work, typeof(*bo), delayed_delete);
 
+	if (!ttm_bo_get_unless_zero(bo))
+		return;
+
 	dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP, false,
 			      MAX_SCHEDULE_TIMEOUT);
 	dma_resv_lock(bo->base.resv, NULL);
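
The guard added by this patch relies on ttm_bo_get_unless_zero(), which takes a reference only if the BO's reference count has not already dropped to zero. For readers unfamiliar with the pattern, here is a minimal userspace sketch using C11 atomics instead of the kernel's kref/refcount_t. All demo_* names are hypothetical and exist only for illustration; this is not TTM's implementation and says nothing about what the eventual fix should look like.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a refcounted buffer object. */
struct demo_bo {
	atomic_uint refcount;
};

/* A freshly created object starts with one reference held by its creator. */
static void demo_bo_init(struct demo_bo *bo)
{
	atomic_init(&bo->refcount, 1);
}

/* Take a reference only if the count is still non-zero. */
static bool demo_bo_get_unless_zero(struct demo_bo *bo)
{
	unsigned int old = atomic_load(&bo->refcount);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&bo->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool demo_bo_put(struct demo_bo *bo)
{
	unsigned int old = atomic_fetch_sub(&bo->refcount, 1);

	/* A put on a zero count is the kind of underflow refcount_t warns about. */
	assert(old != 0);
	return old == 1;
}
```

In this sketch every successful demo_bo_get_unless_zero() has to be balanced by exactly one demo_bo_put(). An extra put underflows the counter, and a missing put means the count never reaches zero.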