
[v1] Fix: SYNCOBJ TIMELINE Test failed.

Message ID 20220629060236.3283445-1-jesse.zhang@amd.com (mailing list archive)
State New, archived

Commit Message

Zhang, Jesse(Jie) June 29, 2022, 6:02 a.m. UTC
The issue is caused by commit

721255b527 (drm/syncobj: flatten dma_fence_chains on transfer),

because it uses the point of the dma_fence incorrectly.

Correct the point of the dma_fence by using the fence array.

Signed-off-by: jie1zhan <jesse.zhang@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com>
---
 drivers/gpu/drm/drm_syncobj.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Christian König June 29, 2022, 9:12 a.m. UTC | #1
Am 29.06.22 um 08:02 schrieb jie1zhan:
> The issue is caused by commit
>
> 721255b527 (drm/syncobj: flatten dma_fence_chains on transfer),
>
> because it uses the point of the dma_fence incorrectly.
>
> Correct the point of the dma_fence by using the fence array.

Well, that patch is just utter nonsense as far as I can see.

>
> Signed-off-by: jie1zhan <jesse.zhang@amd.com>
>
> Reviewed-by: Christian König <christian.koenig@amd.com>
>
> Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com>

I have strong doubts that Nirmoy has reviewed this and I certainly 
haven't reviewed it.

Christian.

> ---
>   drivers/gpu/drm/drm_syncobj.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
> index 7e48dcd1bee4..d5db818f1c76 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -887,7 +887,7 @@ static int drm_syncobj_flatten_chain(struct dma_fence **f)
>   		goto free_fences;
>   
>   	dma_fence_put(*f);
> -	*f = &array->base;
> +	*f = array->fences[0];
>   	return 0;
>   
>   free_fences:
Nirmoy Das June 30, 2022, 7:01 a.m. UTC | #2
On 6/29/2022 11:12 AM, Christian König wrote:
> Am 29.06.22 um 08:02 schrieb jie1zhan:
>> The issue is caused by commit
>>
>> 721255b527 (drm/syncobj: flatten dma_fence_chains on transfer),
>>
>> because it uses the point of the dma_fence incorrectly.
>>
>> Correct the point of the dma_fence by using the fence array.
>
> Well, that patch is just utter nonsense as far as I can see.
>
>>
>> Signed-off-by: jie1zhan <jesse.zhang@amd.com>
>>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>
>> Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com>
>
> I have strong doubts that Nirmoy has reviewed this and I certainly 
> haven't reviewed it.


I haven't reviewed this either.


Nirmoy

>
> Christian.
>
>> ---
>>   drivers/gpu/drm/drm_syncobj.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
>> index 7e48dcd1bee4..d5db818f1c76 100644
>> --- a/drivers/gpu/drm/drm_syncobj.c
>> +++ b/drivers/gpu/drm/drm_syncobj.c
>> @@ -887,7 +887,7 @@ static int drm_syncobj_flatten_chain(struct dma_fence **f)
>>   		goto free_fences;
>>   
>>   	dma_fence_put(*f);
>> -	*f = &array->base;
>> +	*f = array->fences[0];
>>   	return 0;
>>   
>>   free_fences:
>
Zhang, Jesse(Jie) June 30, 2022, 3:26 p.m. UTC | #3
[AMD Official Use Only - General]


Hi Christian,
If we revert the following patch, the "syncobj timeline test" passes:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=721255b52700b320c4ae2e23d57f7d9ad1db50b9


The following log was provided by the AMD CQE team. They ran the amdgpu_test tool on Ubuntu 22 (kernel version 5.15.0-39):
Suite: SYNCOBJ TIMELINE Tests
  Test: syncobj timeline test ...FAILED
    1. sources/drm/tests/amdgpu/syncobj_tests.c:299  - CU_ASSERT_EQUAL(payload,18)
    2. sources/drm/tests/amdgpu/syncobj_tests.c:309  - CU_ASSERT_EQUAL(payload,20)
More detailed information is available in the attachment.

So we need to fix this issue. If you have a better solution, please let me know.

Thanks
Jesse

-----Original Message-----
From: Koenig, Christian <Christian.Koenig@amd.com> 
Sent: Wednesday, 29 June 2022 5:12 pm
To: Zhang, Jesse(Jie) <Jesse.Zhang@amd.com>; broonie@kernel.org; alsa-devel@alsa-project.org
Cc: Mukunda, Vijendar <Vijendar.Mukunda@amd.com>; Hiregoudar, Basavaraj <Basavaraj.Hiregoudar@amd.com>; Dommati, Sunil-kumar <Sunil-kumar.Dommati@amd.com>; Pandey, Ajit Kumar <AjitKumar.Pandey@amd.com>; Nirmoy Das <nirmoy.das@linux.intel.com>; Maarten Lankhorst <maarten.lankhorst@linux.intel.com>; Maxime Ripard <mripard@kernel.org>; Thomas Zimmermann <tzimmermann@suse.de>; David Airlie <airlied@linux.ie>; Daniel Vetter <daniel@ffwll.ch>; Sumit Semwal <sumit.semwal@linaro.org>; open list:DRM DRIVERS <dri-devel@lists.freedesktop.org>; open list <linux-kernel@vger.kernel.org>; open list:DMA BUFFER SHARING FRAMEWORK <linux-media@vger.kernel.org>; moderated list:DMA BUFFER SHARING FRAMEWORK <linaro-mm-sig@lists.linaro.org>
Subject: Re: [PATCH v1] Fix: SYNCOBJ TIMELINE Test failed.

Am 29.06.22 um 08:02 schrieb jie1zhan:
> The issue is caused by commit
>
> 721255b527 (drm/syncobj: flatten dma_fence_chains on transfer),
>
> because it uses the point of the dma_fence incorrectly.
>
> Correct the point of the dma_fence by using the fence array.

Well, that patch is just utter nonsense as far as I can see.

>
> Signed-off-by: jie1zhan <jesse.zhang@amd.com>
>
> Reviewed-by: Christian König <christian.koenig@amd.com>
>
> Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com>

I have strong doubts that Nirmoy has reviewed this and I certainly haven't reviewed it.

Christian.

> ---
>   drivers/gpu/drm/drm_syncobj.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
> index 7e48dcd1bee4..d5db818f1c76 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -887,7 +887,7 @@ static int drm_syncobj_flatten_chain(struct dma_fence **f)
>   		goto free_fences;
>   
>   	dma_fence_put(*f);
> -	*f = &array->base;
> +	*f = array->fences[0];
>   	return 0;
>   
>   free_fences:
[AMD Official Use Only - General]


Hi Christian,

Our QA found that the "Syncobj timeline" test fails on Ubuntu 22 (kernel version 5.15.0-39). The related ticket is:

https://ontrack-internal.amd.com/browse/SWDEV-343186

We traced the root cause of this issue and found that it is caused by your patch:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab66fdace8581ef3b4e7cf5381a168ed4058d779

I have prepared a patch; please help review it.

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 7e48dcd1bee4..d5db818f1c76 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -887,7 +887,7 @@ static int drm_syncobj_flatten_chain(struct dma_fence **f)
 		goto free_fences;
 
 	dma_fence_put(*f);
-	*f = &array->base;
+	*f = array->fences[0];
 	return 0;

The patch file is attached.

Thanks
Jesse
Christian König June 30, 2022, 3:41 p.m. UTC | #4
Hi Jesse,

yes, I know; that's a well-known bug.

The Intel guys have already narrowed it down to a missing 
dma_fence_enable_signaling() in the syncobj code path.

I strongly suggest working together with them to find where that needs 
to be added instead.

Regards,
Christian.
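
To illustrate what a missing dma_fence_enable_signaling() call can look like in principle, here is a minimal toy model in C. It is a sketch under assumptions, not kernel code: struct lazy_fence and the lazy_fence_* helpers are hypothetical names, and the model only assumes that completion becomes visible to observers once signaling has been enabled. It says nothing about where the call belongs in the syncobj code path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model for illustration; NOT the kernel's struct dma_fence. */
struct lazy_fence {
	bool work_done;          /* the underlying work has finished */
	bool signaling_enabled;  /* someone asked to be notified */
	bool signaled;           /* the state observers actually see */
};

/* Completion is published to observers only if signaling was enabled. */
void lazy_fence_complete(struct lazy_fence *f)
{
	f->work_done = true;
	if (f->signaling_enabled)
		f->signaled = true;
}

/* Enabling signaling late also publishes work that already finished. */
void lazy_fence_enable_signaling(struct lazy_fence *f)
{
	f->signaling_enabled = true;
	if (f->work_done)
		f->signaled = true;
}

/* Observers only look at the published state. */
bool lazy_fence_is_signaled(const struct lazy_fence *f)
{
	return f->signaled;
}
```

In this model, a fence that completed before lazy_fence_enable_signaling() was called still reads as unsignaled until that call is made, which is the kind of stale state a missing enable call could produce.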


Patch

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 7e48dcd1bee4..d5db818f1c76 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -887,7 +887,7 @@  static int drm_syncobj_flatten_chain(struct dma_fence **f)
 		goto free_fences;
 
 	dma_fence_put(*f);
-	*f = &array->base;
+	*f = array->fences[0];
 	return 0;
 
 free_fences:
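
Why the one-line change above is not equivalent to the original can be shown with a toy model. The sketch below assumes that the array fence behaves as an all-of container, i.e. that it counts as signaled only once every component fence has signaled. struct toy_fence, struct toy_fence_array and the toy_* helpers are hypothetical stand-ins, not the kernel's dma_fence types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT the kernel's
 * struct dma_fence / struct dma_fence_array. */
struct toy_fence {
	bool signaled;
};

struct toy_fence_array {
	struct toy_fence **fences;	/* component fences */
	size_t num_fences;
};

/* An all-of array counts as signaled only once every component has. */
static bool toy_array_is_signaled(const struct toy_fence_array *array)
{
	for (size_t i = 0; i < array->num_fences; i++)
		if (!array->fences[i]->signaled)
			return false;
	return true;
}

/* Returns true when waiting on fences[0] alone would finish too early. */
bool toy_first_fence_finishes_early(void)
{
	struct toy_fence a = { .signaled = true };
	struct toy_fence b = { .signaled = false };
	struct toy_fence *fences[] = { &a, &b };
	struct toy_fence_array array = { .fences = fences, .num_fences = 2 };

	return array.fences[0]->signaled && !toy_array_is_signaled(&array);
}
```

toy_first_fence_finishes_early() returns true here: fences[0] is already signaled while the array as a whole is not, so a caller handed fences[0] instead of the array would stop waiting before the remaining fence completes.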