diff mbox series

drm/ttm: fix missing NULL check in ttm_device_swapout

Message ID 20220603104604.456991-1-christian.koenig@amd.com (mailing list archive)
State New, archived
Headers show
Series drm/ttm: fix missing NULL check in ttm_device_swapout | expand

Commit Message

Christian König June 3, 2022, 10:46 a.m. UTC
Resources about to be destructed are not tied to BOs any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_device.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Felix Kuehling June 3, 2022, 7:37 p.m. UTC | #1
On 2022-06-03 06:46, Christian König wrote:
> Resources about to be destructed are not tied to BOs any more.
I've been seeing a backtrace in that area with a patch series I'm 
working on, but didn't have enough time to track it down yet. I'll try 
if this patch fixes it.

Regards,
   Felix


>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_device.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index a0562ab386f5..e7147e304637 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -156,8 +156,12 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>   
>   		ttm_resource_manager_for_each_res(man, &cursor, res) {
>   			struct ttm_buffer_object *bo = res->bo;
> -			uint32_t num_pages = PFN_UP(bo->base.size);
> +			uint32_t num_pages;
>   
> +			if (!bo)
> +				continue;
> +
> +			num_pages = PFN_UP(bo->base.size);
>   			ret = ttm_bo_swapout(bo, ctx, gfp_flags);
>   			/* ttm_bo_swapout has dropped the lru_lock */
>   			if (!ret)
Felix Kuehling June 3, 2022, 10:44 p.m. UTC | #2
[+amd-gfx]


On 2022-06-03 15:37, Felix Kuehling wrote:
>
> On 2022-06-03 06:46, Christian König wrote:
>> Resources about to be destructed are not tied to BOs any more.
> I've been seeing a backtrace in that area with a patch series I'm 
> working on, but didn't have enough time to track it down yet. I'll try 
> if this patch fixes it.

The patch doesn't apply on amd-staging-drm-next. I made the following 
change instead, which fixes my problem (and I do see the pr_err being 
triggered):

--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -157,6 +157,10 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
                         list_for_each_entry(bo, &man->lru[j], lru) {
                                 uint32_t num_pages = PFN_UP(bo->base.size);
  
+                               if (!bo->resource) {
+                                       pr_err("### bo->resource is NULL\n");
+                                       continue;
+                               }
                                 ret = ttm_bo_swapout(bo, ctx, gfp_flags);
                                 /* ttm_bo_swapout has dropped the lru_lock */
                                 if (!ret)

>
> Regards,
>   Felix
>
>
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/ttm/ttm_device.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_device.c 
>> b/drivers/gpu/drm/ttm/ttm_device.c
>> index a0562ab386f5..e7147e304637 100644
>> --- a/drivers/gpu/drm/ttm/ttm_device.c
>> +++ b/drivers/gpu/drm/ttm/ttm_device.c
>> @@ -156,8 +156,12 @@ int ttm_device_swapout(struct ttm_device *bdev, 
>> struct ttm_operation_ctx *ctx,
>>             ttm_resource_manager_for_each_res(man, &cursor, res) {
>>               struct ttm_buffer_object *bo = res->bo;
>> -            uint32_t num_pages = PFN_UP(bo->base.size);
>> +            uint32_t num_pages;
>>   +            if (!bo)
>> +                continue;
>> +
>> +            num_pages = PFN_UP(bo->base.size);
>>               ret = ttm_bo_swapout(bo, ctx, gfp_flags);
>>               /* ttm_bo_swapout has dropped the lru_lock */
>>               if (!ret)
Christian König June 6, 2022, 10:20 a.m. UTC | #3
Am 04.06.22 um 00:44 schrieb Felix Kuehling:
> [+amd-gfx]
>
>
> On 2022-06-03 15:37, Felix Kuehling wrote:
>>
>> On 2022-06-03 06:46, Christian König wrote:
>>> Resources about to be destructed are not tied to BOs any more.
>> I've been seeing a backtrace in that area with a patch series I'm 
>> working on, but didn't have enough time to track it down yet. I'll 
>> try if this patch fixes it.
>
> The patch doesn't apply on amd-staging-drm-next. I made the following 
> change instead, which fixes my problem (and I do see the pr_err being 
> triggered):
>
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -157,6 +157,10 @@ int ttm_device_swapout(struct ttm_device *bdev, 
> struct ttm_operation_ctx *ctx,
>                         list_for_each_entry(bo, &man->lru[j], lru) {
>                                 uint32_t num_pages = 
> PFN_UP(bo->base.size);
>
> +                               if (!bo->resource) {
> +                                       pr_err("### bo->resource is 
> NULL\n");
> +                                       continue;
> +                               }

Yeah, that should be functional identical.

Can I get an rb for that? Going to provide backports to older kernels as 
well then.

Regards,
Christian.

> ret = ttm_bo_swapout(bo, ctx, gfp_flags);
>                                 /* ttm_bo_swapout has dropped the 
> lru_lock */
>                                 if (!ret)
>
>>
>> Regards,
>>   Felix
>>
>>
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>   drivers/gpu/drm/ttm/ttm_device.c | 6 +++++-
>>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_device.c 
>>> b/drivers/gpu/drm/ttm/ttm_device.c
>>> index a0562ab386f5..e7147e304637 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_device.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_device.c
>>> @@ -156,8 +156,12 @@ int ttm_device_swapout(struct ttm_device *bdev, 
>>> struct ttm_operation_ctx *ctx,
>>>             ttm_resource_manager_for_each_res(man, &cursor, res) {
>>>               struct ttm_buffer_object *bo = res->bo;
>>> -            uint32_t num_pages = PFN_UP(bo->base.size);
>>> +            uint32_t num_pages;
>>>   +            if (!bo)
>>> +                continue;
>>> +
>>> +            num_pages = PFN_UP(bo->base.size);
>>>               ret = ttm_bo_swapout(bo, ctx, gfp_flags);
>>>               /* ttm_bo_swapout has dropped the lru_lock */
>>>               if (!ret)
Felix Kuehling June 6, 2022, 1:15 p.m. UTC | #4
Am 2022-06-03 um 06:46 schrieb Christian König:
> Resources about to be destructed are not tied to BOs any more.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>


> ---
>   drivers/gpu/drm/ttm/ttm_device.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index a0562ab386f5..e7147e304637 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -156,8 +156,12 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>   
>   		ttm_resource_manager_for_each_res(man, &cursor, res) {
>   			struct ttm_buffer_object *bo = res->bo;
> -			uint32_t num_pages = PFN_UP(bo->base.size);
> +			uint32_t num_pages;
>   
> +			if (!bo)
> +				continue;
> +
> +			num_pages = PFN_UP(bo->base.size);
>   			ret = ttm_bo_swapout(bo, ctx, gfp_flags);
>   			/* ttm_bo_swapout has dropped the lru_lock */
>   			if (!ret)
diff mbox series

Patch

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index a0562ab386f5..e7147e304637 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -156,8 +156,12 @@  int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
 
 		ttm_resource_manager_for_each_res(man, &cursor, res) {
 			struct ttm_buffer_object *bo = res->bo;
-			uint32_t num_pages = PFN_UP(bo->base.size);
+			uint32_t num_pages;
 
+			if (!bo)
+				continue;
+
+			num_pages = PFN_UP(bo->base.size);
 			ret = ttm_bo_swapout(bo, ctx, gfp_flags);
 			/* ttm_bo_swapout has dropped the lru_lock */
 			if (!ret)