Message ID | 20220629060236.3283445-1-jesse.zhang@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v1] Fix: SYNCOBJ TIMELINE Test failed. | expand |
Am 29.06.22 um 08:02 schrieb jie1zhan: > The issue cause by the commit : > > 721255b527(drm/syncobj: flatten dma_fence_chains on transfer). > > Because it use the point of dma_fence incorrectly > > Correct the point of dma_fence by fence array Well that patch is just utterly nonsense as far as I can see. > > Signed-off-by: jie1zhan <jesse.zhang@amd.com> > > Reviewed-by: Christian König <christian.koenig@amd.com> > > Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com> I have strong doubts that Nirmoy has reviewed this and I certainly haven't reviewed it. Christian. > --- > drivers/gpu/drm/drm_syncobj.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c > index 7e48dcd1bee4..d5db818f1c76 100644 > --- a/drivers/gpu/drm/drm_syncobj.c > +++ b/drivers/gpu/drm/drm_syncobj.c > @@ -887,7 +887,7 @@ static int drm_syncobj_flatten_chain(struct dma_fence **f) > goto free_fences; > > dma_fence_put(*f); > - *f = &array->base; > + *f = array->fences[0]; > return 0; > > free_fences:
On 6/29/2022 11:12 AM, Christian König wrote: > Am 29.06.22 um 08:02 schrieb jie1zhan: >> The issue cause by the commit : >> >> 721255b527(drm/syncobj: flatten dma_fence_chains on transfer). >> >> Because it use the point of dma_fence incorrectly >> >> Correct the point of dma_fence by fence array > > Well that patch is just utterly nonsense as far as I can see. > >> >> Signed-off-by: jie1zhan <jesse.zhang@amd.com> >> >> Reviewed-by: Christian König <christian.koenig@amd.com> >> >> Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com> > > I have strong doubts that Nirmoy has reviewed this and I certainly > haven't reviewed it. I haven't reviewed this either. Nirmoy > > Christian. > >> --- >> drivers/gpu/drm/drm_syncobj.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/drm_syncobj.c >> b/drivers/gpu/drm/drm_syncobj.c >> index 7e48dcd1bee4..d5db818f1c76 100644 >> --- a/drivers/gpu/drm/drm_syncobj.c >> +++ b/drivers/gpu/drm/drm_syncobj.c >> @@ -887,7 +887,7 @@ static int drm_syncobj_flatten_chain(struct >> dma_fence **f) >> goto free_fences; >> dma_fence_put(*f); >> - *f = &array->base; >> + *f = array->fences[0]; >> return 0; >> free_fences: >
[AMD Official Use Only - General] Hi Christian, If we remove the following patch, the "syncobj timeline test" can pass. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=721255b52700b320c4ae2e23d57f7d9ad1db50b9 The following log is provided by AMD CQE team. They run the amdgpu_test tool on ubuntu22 (kernel version 5.15.0-39) Suite: SYNCOBJ TIMELINE Tests Test: syncobj timeline test ...FAILED 1. sources/drm/tests/amdgpu/syncobj_tests.c:299 - CU_ASSERT_EQUAL(payload,18) 2. sources/drm/tests/amdgpu/syncobj_tests.c:309 - CU_ASSERT_EQUAL(payload,20) You can get more detail information by the attachment. So we need fix this issue. And if you have any better solution to solve the issue, please let me know. Thanks Jesse -----Original Message----- From: Koenig, Christian <Christian.Koenig@amd.com> Sent: Wednesday, 29 June 2022 5:12 pm To: Zhang, Jesse(Jie) <Jesse.Zhang@amd.com>; broonie@kernel.org; alsa-devel@alsa-project.org Cc: Mukunda, Vijendar <Vijendar.Mukunda@amd.com>; Hiregoudar, Basavaraj <Basavaraj.Hiregoudar@amd.com>; Dommati, Sunil-kumar <Sunil-kumar.Dommati@amd.com>; Pandey, Ajit Kumar <AjitKumar.Pandey@amd.com>; Nirmoy Das <nirmoy.das@linux.intel.com>; Maarten Lankhorst <maarten.lankhorst@linux.intel.com>; Maxime Ripard <mripard@kernel.org>; Thomas Zimmermann <tzimmermann@suse.de>; David Airlie <airlied@linux.ie>; Daniel Vetter <daniel@ffwll.ch>; Sumit Semwal <sumit.semwal@linaro.org>; open list:DRM DRIVERS <dri-devel@lists.freedesktop.org>; open list <linux-kernel@vger.kernel.org>; open list:DMA BUFFER SHARING FRAMEWORK <linux-media@vger.kernel.org>; moderated list:DMA BUFFER SHARING FRAMEWORK <linaro-mm-sig@lists.linaro.org> Subject: Re: [PATCH v1] Fix: SYNCOBJ TIMELINE Test failed. Am 29.06.22 um 08:02 schrieb jie1zhan: > The issue cause by the commit : > > 721255b527(drm/syncobj: flatten dma_fence_chains on transfer). > > Because it use the point of dma_fence incorrectly > > Correct the point of dma_fence by fence array Well that patch is just utterly nonsense as far as I can see. > > Signed-off-by: jie1zhan <jesse.zhang@amd.com> > > Reviewed-by: Christian König <christian.koenig@amd.com> > > Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com> I have strong doubts that Nirmoy has reviewed this and I certainly haven't reviewed it. Christian. > --- > drivers/gpu/drm/drm_syncobj.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_syncobj.c > b/drivers/gpu/drm/drm_syncobj.c index 7e48dcd1bee4..d5db818f1c76 > 100644 > --- a/drivers/gpu/drm/drm_syncobj.c > +++ b/drivers/gpu/drm/drm_syncobj.c > @@ -887,7 +887,7 @@ static int drm_syncobj_flatten_chain(struct dma_fence **f) > goto free_fences; > > dma_fence_put(*f); > - *f = &array->base; > + *f = array->fences[0]; > return 0; > > free_fences: [AMD Official Use Only - General] Hi Christian, Our QA find the "Syncobj timeline" test failed on ubuntu22 (kernel version 5.15.0-39). The relate ticket as following: https://ontrack-internal.amd.com/browse/SWDEV-343186 We trace the root cause of this issue, and found it cause by your patch. As the following: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab66fdace8581ef3b4e7cf5381a168ed4058d779. I add a patch , please help to review. diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 7e48dcd1bee4..d5db818f1c76 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -887,7 +887,7 @@ static int drm_syncobj_flatten_chain(struct dma_fence **f) goto free_fences; dma_fence_put(*f); - *f = &array->base; + *f = array->fences[0]; return 0; Attach the patch file. Thanks Jesse
Hi Jesse, yes, I know that's a well known bug. The Intel guys have already narrowed it down to a missing dma_fence_enable_signaling() in the syncobj code path. I strongly suggest to work together with them to find where that needs to be added instead. Regards, Christian. Am 30.06.22 um 17:26 schrieb Zhang, Jesse(Jie): > [AMD Official Use Only - General] > > > Hi Christian, > If we remove the following patch, the "syncobj timeline test" can pass. > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=721255b52700b320c4ae2e23d57f7d9ad1db50b9 > > > The following log is provided by AMD CQE team. They run the amdgpu_test tool on ubuntu22 (kernel version 5.15.0-39) > Suite: SYNCOBJ TIMELINE Tests > Test: syncobj timeline test ...FAILED > 1. sources/drm/tests/amdgpu/syncobj_tests.c:299 - CU_ASSERT_EQUAL(payload,18) > 2. sources/drm/tests/amdgpu/syncobj_tests.c:309 - CU_ASSERT_EQUAL(payload,20) > You can get more detail information by the attachment. > > So we need fix this issue. And if you have any better solution to solve the issue, please let me know. > > Thanks > Jesse > > -----Original Message----- > From: Koenig, Christian <Christian.Koenig@amd.com> > Sent: Wednesday, 29 June 2022 5:12 pm > To: Zhang, Jesse(Jie) <Jesse.Zhang@amd.com>; broonie@kernel.org; alsa-devel@alsa-project.org > Cc: Mukunda, Vijendar <Vijendar.Mukunda@amd.com>; Hiregoudar, Basavaraj <Basavaraj.Hiregoudar@amd.com>; Dommati, Sunil-kumar <Sunil-kumar.Dommati@amd.com>; Pandey, Ajit Kumar <AjitKumar.Pandey@amd.com>; Nirmoy Das <nirmoy.das@linux.intel.com>; Maarten Lankhorst <maarten.lankhorst@linux.intel.com>; Maxime Ripard <mripard@kernel.org>; Thomas Zimmermann <tzimmermann@suse.de>; David Airlie <airlied@linux.ie>; Daniel Vetter <daniel@ffwll.ch>; Sumit Semwal <sumit.semwal@linaro.org>; open list:DRM DRIVERS <dri-devel@lists.freedesktop.org>; open list <linux-kernel@vger.kernel.org>; open list:DMA BUFFER SHARING FRAMEWORK <linux-media@vger.kernel.org>; moderated list:DMA BUFFER SHARING FRAMEWORK <linaro-mm-sig@lists.linaro.org> > Subject: Re: [PATCH v1] Fix: SYNCOBJ TIMELINE Test failed. > > Am 29.06.22 um 08:02 schrieb jie1zhan: >> The issue cause by the commit : >> >> 721255b527(drm/syncobj: flatten dma_fence_chains on transfer). >> >> Because it use the point of dma_fence incorrectly >> >> Correct the point of dma_fence by fence array > Well that patch is just utterly nonsense as far as I can see. > >> Signed-off-by: jie1zhan <jesse.zhang@amd.com> >> >> Reviewed-by: Christian König <christian.koenig@amd.com> >> >> Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com> > I have strong doubts that Nirmoy has reviewed this and I certainly haven't reviewed it. > > Christian. > >> --- >> drivers/gpu/drm/drm_syncobj.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/drm_syncobj.c >> b/drivers/gpu/drm/drm_syncobj.c index 7e48dcd1bee4..d5db818f1c76 >> 100644 >> --- a/drivers/gpu/drm/drm_syncobj.c >> +++ b/drivers/gpu/drm/drm_syncobj.c >> @@ -887,7 +887,7 @@ static int drm_syncobj_flatten_chain(struct dma_fence **f) >> goto free_fences; >> >> dma_fence_put(*f); >> - *f = &array->base; >> + *f = array->fences[0]; >> return 0; >> >> free_fences:
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 7e48dcd1bee4..d5db818f1c76 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -887,7 +887,7 @@ static int drm_syncobj_flatten_chain(struct dma_fence **f) goto free_fences; dma_fence_put(*f); - *f = &array->base; + *f = array->fences[0]; return 0; free_fences: