drm/amdkfd: fix potential null pointer dereference on pointer peer_dev

Message ID 20190629133114.14271-1-colin.king@canonical.com (mailing list archive)
State New, archived

Commit Message

Colin King June 29, 2019, 1:31 p.m. UTC
From: Colin Ian King <colin.king@canonical.com>

The call to kfd_topology_device_by_proximity_domain can return a NULL
pointer so add a null pointer check on peer_dev to the existing null
pointer check on peer_dev->gpu to avoid any potential null pointer
dereferences.

Addresses-Coverity: ("Dereference on null return value")
Fixes: ae9a25aea7f3 ("drm/amdkfd: Generate xGMI direct iolink")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
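
For illustration only, here is a minimal user-space sketch of the guarded loop this patch proposes. It is not the kernel code: `struct gpu_dev`, `struct topo_dev`, `topo_table`, `lookup_by_proximity_domain` and `count_hive_peers` are hypothetical stand-ins, and the table-based lookup is an assumption made so that the example compiles on its own. The point it shows is that `||` short-circuits, so `peer_dev->gpu` is only read once `peer_dev` is known to be non-NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct gpu_dev { uint64_t hive_id; };
struct topo_dev { struct gpu_dev *gpu; };

#define MAX_NODES 8
static struct topo_dev *topo_table[MAX_NODES];

/* Stand-in lookup: returns NULL when no device has this proximity domain. */
static struct topo_dev *lookup_by_proximity_domain(uint32_t nid)
{
	return nid < MAX_NODES ? topo_table[nid] : NULL;
}

/* Count peers in the same hive among nodes [0, proximity_domain). */
static int count_hive_peers(const struct gpu_dev *kdev,
			    uint32_t proximity_domain)
{
	int peers = 0;
	uint32_t nid;

	assert(kdev != NULL);
	for (nid = 0; nid < proximity_domain; ++nid) {
		struct topo_dev *peer_dev = lookup_by_proximity_domain(nid);

		/* || short-circuits: ->gpu is read only if peer_dev is non-NULL */
		if (!peer_dev || !peer_dev->gpu)
			continue;
		if (peer_dev->gpu->hive_id != kdev->hive_id)
			continue;
		peers++;
	}
	return peers;
}
```

With only the original `!peer_dev->gpu` test, any empty slot in `topo_table` below `proximity_domain` would be dereferenced; with the combined test it is skipped like a node that has no GPU.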

Comments

Liu, Shaoyun July 2, 2019, 3:29 p.m. UTC | #1
According to the code comment, "we will loop GPUs that already be processed
(with lower value of proximity_domain)", so the device should already have
been added to the topology_device_list. In that case,
kfd_topology_device_by_proximity_domain will not return a NULL pointer.
If you really do get a null pointer dereference here, we must have a
bigger problem, one that cannot be solved by adding a null check here.

Regards

shaoyun.liu

On 2019-06-29 9:31 a.m., Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
>
> The call to kfd_topology_device_by_proximity_domain can return a NULL
> pointer so add a null pointer check on peer_dev to the existing null
> pointer check on peer_dev->gpu to avoid any potential null pointer
> dereferences.
>
> Addresses-Coverity: ("Dereference on null return value")
> Fixes: ae9a25aea7f3 ("drm/amdkfd: Generate xGMI direct iolink")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> index 4e3fc284f6ac..cb6b46cfa6c2 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> @@ -1293,7 +1293,7 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>   	if (kdev->hive_id) {
>   		for (nid = 0; nid < proximity_domain; ++nid) {
>   			peer_dev = kfd_topology_device_by_proximity_domain(nid);
> -			if (!peer_dev->gpu)
> +			if (!peer_dev || !peer_dev->gpu)
>   				continue;
>   			if (peer_dev->gpu->hive_id != kdev->hive_id)
>   				continue;
Felix Kuehling July 2, 2019, 6:24 p.m. UTC | #2
I think this could happen if KFD initialization fails for a device. 
Currently we'd add the device, and then remove it again. That may leave 
a gap in the proximity domains. Oak just had a fix recently to clean 
that up by only adding KFD devices to the topology after successful 
initialization.

Regards,
   Felix
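
A minimal sketch of the scenario described above, under stated assumptions: it models the topology as a small array, hands out proximity domains from a counter that is never rewound, and removes a device after a failed initialization. None of this is the real kfd_topology code; `struct model_dev`, `model_add_device`, `model_remove_device` and `model_find_by_domain` are hypothetical names, and whether the driver actually leaves the number unused is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEVS 8

/* Hypothetical model of a topology list; not the real kfd_topology code. */
struct model_dev {
	uint32_t proximity_domain;
	int in_use;
};

static struct model_dev dev_list[MAX_DEVS];
static uint32_t next_domain;	/* assumed to only ever increase */

/* Add a device and hand it the next proximity domain. */
static struct model_dev *model_add_device(void)
{
	size_t i;

	for (i = 0; i < MAX_DEVS; i++) {
		if (!dev_list[i].in_use) {
			dev_list[i].in_use = 1;
			dev_list[i].proximity_domain = next_domain++;
			return &dev_list[i];
		}
	}
	return NULL;	/* list full */
}

/* Remove a device, e.g. after its initialization failed. */
static void model_remove_device(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->in_use = 0;	/* its proximity domain is not handed out again */
}

/* Look a device up by proximity domain; NULL if the number is a gap. */
static struct model_dev *model_find_by_domain(uint32_t nid)
{
	size_t i;

	for (i = 0; i < MAX_DEVS; i++) {
		if (dev_list[i].in_use && dev_list[i].proximity_domain == nid)
			return &dev_list[i];
	}
	return NULL;
}
```

In this model, adding two devices, removing the second and adding a third leaves domains 0 and 2 populated and domain 1 empty, so a loop over every `nid` below 2 sees a NULL lookup result for `nid == 1`.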

On 2019-07-02 11:29 a.m., Liu, Shaoyun wrote:
> According to the code comment, "we will loop GPUs that already be processed
> (with lower value of proximity_domain)", so the device should already have
> been added to the topology_device_list. In that case,
> kfd_topology_device_by_proximity_domain will not return a NULL pointer.
> If you really do get a null pointer dereference here, we must have a
> bigger problem, one that cannot be solved by adding a null check here.
>
> Regards
>
> shaoyun.liu
>
> On 2019-06-29 9:31 a.m., Colin King wrote:
>> From: Colin Ian King <colin.king@canonical.com>
>>
>> The call to kfd_topology_device_by_proximity_domain can return a NULL
>> pointer so add a null pointer check on peer_dev to the existing null
>> pointer check on peer_dev->gpu to avoid any potential null pointer
>> dereferences.
>>
>> Addresses-Coverity: ("Dereference on null return value")
>> Fixes: ae9a25aea7f3 ("drm/amdkfd: Generate xGMI direct iolink")
>> Signed-off-by: Colin Ian King <colin.king@canonical.com>
>> ---
>>    drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 2 +-
>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> index 4e3fc284f6ac..cb6b46cfa6c2 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> @@ -1293,7 +1293,7 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>>    	if (kdev->hive_id) {
>>    		for (nid = 0; nid < proximity_domain; ++nid) {
>>    			peer_dev = kfd_topology_device_by_proximity_domain(nid);
>> -			if (!peer_dev->gpu)
>> +			if (!peer_dev || !peer_dev->gpu)
>>    				continue;
>>    			if (peer_dev->gpu->hive_id != kdev->hive_id)
>>    				continue;
Patch

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index 4e3fc284f6ac..cb6b46cfa6c2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -1293,7 +1293,7 @@  static int kfd_create_vcrat_image_gpu(void *pcrat_image,
 	if (kdev->hive_id) {
 		for (nid = 0; nid < proximity_domain; ++nid) {
 			peer_dev = kfd_topology_device_by_proximity_domain(nid);
-			if (!peer_dev->gpu)
+			if (!peer_dev || !peer_dev->gpu)
 				continue;
 			if (peer_dev->gpu->hive_id != kdev->hive_id)
 				continue;