diff mbox series

[topic/core-for-CI] Revert "drm/i915/dg2: Add relocation exception"

Message ID 20220323174638.11017-1-zbigniew.kempczynski@intel.com (mailing list archive)
State New, archived
Headers show
Series [topic/core-for-CI] Revert "drm/i915/dg2: Add relocation exception" | expand

Commit Message

Zbigniew Kempczyński March 23, 2022, 5:46 p.m. UTC
This reverts commit 904ebf2ba89edaeba5c7c10540e43dba63541dc6.

Failures on dg2 tests were caused by invalid alignment when local memory
was in use. Changes which adopt alignment according to gen were already
merged in IGT so lets revert relocation temporary enabler for dg2. Keeping
it is a little bit problematic for IGT because on premerge we would see
results with kernel which supports relocation. To see no-relocation
results we need to send disabler (like this revert), point IGT with
"Test-with" tag what is cumbersome and time consuming so lets do this
permanently. If we will see some failures they need to be fixed instead
of keeping relocation enabler.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Zbigniew Kempczyński March 24, 2022, 8 a.m. UTC | #1
On Wed, Mar 23, 2022 at 06:51:17PM +0000, Patchwork wrote:
>    Patch Details
> 
>    Series:  Revert "drm/i915/dg2: Add relocation exception" (rev2)              
>    URL:     https://patchwork.freedesktop.org/series/101669/                    
>    State:   failure                                                             
>    Details: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22660/index.html 
> 
>            CI Bug Log - changes from CI_DRM_11398 -> Patchwork_22660
> 
> Summary
> 
>    FAILURE
> 
>    Serious unknown changes coming with Patchwork_22660 absolutely need to be
>    verified manually.
> 
>    If you think the reported changes have nothing to do with the changes
>    introduced in Patchwork_22660, please notify your bug team to allow them
>    to document this new failure mode, which will reduce false positives in
>    CI.
> 
>    External URL:
>    https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22660/index.html
> 
> Participating hosts (45 -> 42)
> 
>    Additional (5): bat-dg2-8 bat-dg2-9 fi-kbl-8809g bat-rpls-1 bat-jsl-1
>    Missing (8): fi-kbl-soraka shard-tglu fi-hsw-4200u bat-adlm-1 fi-bsw-cyan
>    fi-ctg-p8600 shard-rkl fi-bdw-samus
> 
> Possible new issues
> 
>    Here are the unknown changes that may have been introduced in
>    Patchwork_22660:
> 
>   IGT changes
> 
>     Possible regressions
> 
>      * igt@i915_selftest@live@hangcheck:
>           * fi-hsw-4770: PASS -> INCOMPLETE

Unrelated to the change.

> 
>     Suppressed
> 
>    The following results come from untrusted machines, tests, or statuses.
>    They do not affect the overall result.
> 
>      * igt@gem_busy@busy@all:
> 
>           * {bat-dg2-8}: NOTRUN -> INCOMPLETE
>      * igt@gem_exec_gttfill@basic:
> 
>           * {bat-dg2-9}: NOTRUN -> SKIP
>      * igt@gem_exec_suspend@basic-s0@smem:
> 
>           * {bat-dg2-9}: NOTRUN -> FAIL +7 similar issues

For this one I got success on no-reloc code:

./gem_exec_suspend --run basic-S0
IGT-Version: 1.26-NO-GIT (x86_64) (Linux: 5.17.0+ x86_64)
Starting subtest: basic-S0
Starting dynamic subtest: smem
[cmd] rtcwake: assuming RTC uses UTC ...
rtcwake: wakeup from "freeze" using /dev/rtc0 at Thu Mar 24 07:56:03 2022
Dynamic subtest smem: SUCCESS (2.817s)
Starting dynamic subtest: lmem0
[cmd] rtcwake: assuming RTC uses UTC ...
rtcwake: wakeup from "freeze" using /dev/rtc0 at Thu Mar 24 07:56:20 2022
Dynamic subtest lmem0: SUCCESS (3.184s)
Subtest basic-S0: SUCCESS (6.001s)

@Lucas - please consider reverting relocations for dg2. This will speed
up our work and I've fixed yesterday all potential problems with no-reloc
for BAT on dg2.
--
Zbigniew

> 
> Known issues
> 
>    Here are the changes found in Patchwork_22660 that come from known issues:
> 
>   IGT changes
> 
>     Issues hit
> 
>      * igt@amdgpu/amd_cs_nop@sync-fork-compute0:
> 
>           * fi-snb-2600: NOTRUN -> SKIP (fdo#109271) +17 similar issues
>      * igt@gem_exec_suspend@basic-s0@smem:
> 
>           * fi-kbl-8809g: NOTRUN -> DMESG-WARN (i915#4962) +1 similar issue
>      * igt@gem_huc_copy@huc-copy:
> 
>           * fi-kbl-8809g: NOTRUN -> SKIP (fdo#109271 / i915#2190)
>      * igt@gem_lmem_swapping@random-engines:
> 
>           * fi-kbl-8809g: NOTRUN -> SKIP (fdo#109271 / i915#4613) +3 similar
>             issues
>      * igt@i915_pm_rpm@basic-rte:
> 
>           * fi-kbl-8809g: NOTRUN -> SKIP (fdo#109271) +54 similar issues
>      * igt@i915_selftest@live@gt_engines:
> 
>           * bat-dg1-6: PASS -> INCOMPLETE (i915#4418)
>      * igt@kms_chamelium@hdmi-edid-read:
> 
>           * fi-kbl-8809g: NOTRUN -> SKIP (fdo#109271 / fdo#111827) +8 similar
>             issues
>      * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-c:
> 
>           * fi-kbl-8809g: NOTRUN -> SKIP (fdo#109271 / i915#5341)
>      * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
> 
>           * fi-kbl-8809g: NOTRUN -> SKIP (fdo#109271 / i915#533)
>      * igt@runner@aborted:
> 
>           * fi-hsw-4770: NOTRUN -> FAIL (fdo#109271 / i915#1436 / i915#2722 /
>             i915#4312)
> 
>           * bat-dg1-6: NOTRUN -> FAIL (i915#4312 / i915#5257)
> 
>     Possible fixes
> 
>      * igt@i915_selftest@live@gt_timelines:
> 
>           * {bat-rpls-2}: DMESG-WARN (i915#4391) -> PASS
>      * igt@i915_selftest@live@hangcheck:
> 
>           * fi-snb-2600: INCOMPLETE (i915#3921) -> PASS
>      * igt@kms_busy@basic@modeset:
> 
>           * {bat-adlp-6}: DMESG-WARN (i915#3576) -> PASS
>      * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-c:
> 
>           * fi-cfl-8109u: DMESG-WARN (i915#295 / i915#5341) -> PASS
>      * igt@kms_pipe_crc_basic@read-crc-pipe-b:
> 
>           * fi-cfl-8109u: DMESG-WARN (i915#295) -> PASS +10 similar issues
> 
>     Warnings
> 
>      * igt@runner@aborted:
>           * fi-apl-guc: FAIL (i915#2426 / i915#4312) -> FAIL (i915#4312)
> 
>    {name}: This element is suppressed. This means it is ignored when
>    computing
>    the status of the difference (SUCCESS, WARNING, or FAILURE).
> 
> Build changes
> 
>      * Linux: CI_DRM_11398 -> Patchwork_22660
> 
>    CI-20190529: 20190529
>    CI_DRM_11398: da54e0aff302424358b14f443a9be2f84bb6ca47 @
>    git://anongit.freedesktop.org/gfx-ci/linux
>    IGT_6389: fa423f527496936a759eb838b023642deea7625f @
>    https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
>    Patchwork_22660: 1efe3a8db87fa1aa51256528bd863ffaadea5689 @
>    git://anongit.freedesktop.org/gfx-ci/linux
> 
>    == Linux commits ==
> 
>    1efe3a8db87f Revert "drm/i915/dg2: Add relocation exception"
Lucas De Marchi March 29, 2022, 7:50 p.m. UTC | #2
On Thu, Mar 24, 2022 at 09:00:52AM +0100, Zbigniew Kempczyński wrote:
>On Wed, Mar 23, 2022 at 06:51:17PM +0000, Patchwork wrote:
>>    Patch Details
>>
>>    Series:  Revert "drm/i915/dg2: Add relocation exception" (rev2)
>>    URL:     https://patchwork.freedesktop.org/series/101669/
>>    State:   failure
>>    Details: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22660/index.html
>>
>>            CI Bug Log - changes from CI_DRM_11398 -> Patchwork_22660
>>
>> Summary
>>
>>    FAILURE
>>
>>    Serious unknown changes coming with Patchwork_22660 absolutely need to be
>>    verified manually.
>>
>>    If you think the reported changes have nothing to do with the changes
>>    introduced in Patchwork_22660, please notify your bug team to allow them
>>    to document this new failure mode, which will reduce false positives in
>>    CI.
>>
>>    External URL:
>>    https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22660/index.html
>>
>> Participating hosts (45 -> 42)
>>
>>    Additional (5): bat-dg2-8 bat-dg2-9 fi-kbl-8809g bat-rpls-1 bat-jsl-1
>>    Missing (8): fi-kbl-soraka shard-tglu fi-hsw-4200u bat-adlm-1 fi-bsw-cyan
>>    fi-ctg-p8600 shard-rkl fi-bdw-samus
>>
>> Possible new issues
>>
>>    Here are the unknown changes that may have been introduced in
>>    Patchwork_22660:
>>
>>   IGT changes
>>
>>     Possible regressions
>>
>>      * igt@i915_selftest@live@hangcheck:
>>           * fi-hsw-4770: PASS -> INCOMPLETE
>
>Unrelated to the change.
>
>>
>>     Suppressed
>>
>>    The following results come from untrusted machines, tests, or statuses.
>>    They do not affect the overall result.
>>
>>      * igt@gem_busy@busy@all:
>>
>>           * {bat-dg2-8}: NOTRUN -> INCOMPLETE
>>      * igt@gem_exec_gttfill@basic:
>>
>>           * {bat-dg2-9}: NOTRUN -> SKIP
>>      * igt@gem_exec_suspend@basic-s0@smem:
>>
>>           * {bat-dg2-9}: NOTRUN -> FAIL +7 similar issues
>
>For this one I got success on no-reloc code:
>
>./gem_exec_suspend --run basic-S0
>IGT-Version: 1.26-NO-GIT (x86_64) (Linux: 5.17.0+ x86_64)
>Starting subtest: basic-S0
>Starting dynamic subtest: smem
>[cmd] rtcwake: assuming RTC uses UTC ...
>rtcwake: wakeup from "freeze" using /dev/rtc0 at Thu Mar 24 07:56:03 2022
>Dynamic subtest smem: SUCCESS (2.817s)
>Starting dynamic subtest: lmem0
>[cmd] rtcwake: assuming RTC uses UTC ...
>rtcwake: wakeup from "freeze" using /dev/rtc0 at Thu Mar 24 07:56:20 2022
>Dynamic subtest lmem0: SUCCESS (3.184s)
>Subtest basic-S0: SUCCESS (6.001s)
>
>@Lucas - please consider reverting relocations for dg2. This will speed
>up our work and I've fixed yesterday all potential problems with no-reloc
>for BAT on dg2.

Please trigger a re-test on this to compare to a recent drm-tip.
It would be bad if this then blocks the execution of other tests and we
drop the pass and execution rates.

Lucas De Marchi
Lucas De Marchi March 31, 2022, 5:16 p.m. UTC | #3
On Wed, Mar 23, 2022 at 06:46:38PM +0100, Zbigniew Kempczyński wrote:
>This reverts commit 904ebf2ba89edaeba5c7c10540e43dba63541dc6.
>
>Failures on dg2 tests were caused by invalid alignment when local memory
>was in use. Changes which adopt alignment according to gen were already
>merged in IGT so lets revert relocation temporary enabler for dg2. Keeping
>it is a little bit problematic for IGT because on premerge we would see
>results with kernel which supports relocation. To see no-relocation
>results we need to send disabler (like this revert), point IGT with
>"Test-with" tag what is cumbersome and time consuming so lets do this
>permanently. If we will see some failures they need to be fixed instead
>of keeping relocation enabler.
>
>Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
>Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>Cc: Dave Airlie <airlied@redhat.com>
>Cc: Daniel Vetter <daniel.vetter@intel.com>
>Cc: Jason Ekstrand <jason@jlekstrand.net>
>Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>

thanks. I double checked BAT and things seem to be equivalent now
without that hack. I removed it from topic/core-for-CI.

How are we with the igt tests executed in full run?

thanks
Lucas De Marchi


>---
> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>index 42a49fd2f2ab..8b0b4aeb6716 100644
>--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>@@ -501,7 +501,7 @@ static bool platform_has_relocs_enabled(const struct i915_execbuffer *eb)
> 	 */
> 	if (GRAPHICS_VER(eb->i915) < 12 || IS_TIGERLAKE(eb->i915) ||
> 	    IS_ROCKETLAKE(eb->i915) || IS_ALDERLAKE_S(eb->i915) ||
>-	    IS_ALDERLAKE_P(eb->i915) || IS_DG2(eb->i915))
>+	    IS_ALDERLAKE_P(eb->i915))
> 		return true;
>
> 	return false;
>-- 
>2.32.0
>
Lucas De Marchi April 6, 2022, 3:36 p.m. UTC | #4
On Thu, Mar 31, 2022 at 10:16:02AM -0700, Lucas De Marchi wrote:
>On Wed, Mar 23, 2022 at 06:46:38PM +0100, Zbigniew Kempczyński wrote:
>>This reverts commit 904ebf2ba89edaeba5c7c10540e43dba63541dc6.
>>
>>Failures on dg2 tests were caused by invalid alignment when local memory
>>was in use. Changes which adopt alignment according to gen were already
>>merged in IGT so lets revert relocation temporary enabler for dg2. Keeping
>>it is a little bit problematic for IGT because on premerge we would see
>>results with kernel which supports relocation. To see no-relocation
>>results we need to send disabler (like this revert), point IGT with
>>"Test-with" tag what is cumbersome and time consuming so lets do this
>>permanently. If we will see some failures they need to be fixed instead
>>of keeping relocation enabler.
>>
>>Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
>>Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>>Cc: Dave Airlie <airlied@redhat.com>
>>Cc: Daniel Vetter <daniel.vetter@intel.com>
>>Cc: Jason Ekstrand <jason@jlekstrand.net>
>>Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>
>thanks. I double checked BAT and things seem to be equivalent now
>without that hack. I removed it from topic/core-for-CI.
>
>How are we with the igt tests executed in full run?

ping?

Also, we still have this exception for ADL-P, ADL-S and RKL. Should we
go ahead and try removing them?

Lucas De Marchi
Zbigniew Kempczyński May 27, 2022, 6:54 a.m. UTC | #5
On Wed, Apr 06, 2022 at 08:36:32AM -0700, Lucas De Marchi wrote:
> On Thu, Mar 31, 2022 at 10:16:02AM -0700, Lucas De Marchi wrote:
> > On Wed, Mar 23, 2022 at 06:46:38PM +0100, Zbigniew Kempczyński wrote:
> > > This reverts commit 904ebf2ba89edaeba5c7c10540e43dba63541dc6.
> > > 
> > > Failures on dg2 tests were caused by invalid alignment when local memory
> > > was in use. Changes which adopt alignment according to gen were already
> > > merged in IGT so lets revert relocation temporary enabler for dg2. Keeping
> > > it is a little bit problematic for IGT because on premerge we would see
> > > results with kernel which supports relocation. To see no-relocation
> > > results we need to send disabler (like this revert), point IGT with
> > > "Test-with" tag what is cumbersome and time consuming so lets do this
> > > permanently. If we will see some failures they need to be fixed instead
> > > of keeping relocation enabler.
> > > 
> > > Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> > > Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> > > Cc: Dave Airlie <airlied@redhat.com>
> > > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > > Cc: Jason Ekstrand <jason@jlekstrand.net>
> > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > 
> > thanks. I double checked BAT and things seem to be equivalent now
> > without that hack. I removed it from topic/core-for-CI.
> > 
> > How are we with the igt tests executed in full run?
> 
> ping?
> 
> Also, we still have this exception for ADL-P, ADL-S and RKL. Should we
> go ahead and try removing them?
> 
> Lucas De Marchi

Yes, please drop ADL*/RKL patch in core-for-CI. 

--
Zbigniew
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 42a49fd2f2ab..8b0b4aeb6716 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -501,7 +501,7 @@  static bool platform_has_relocs_enabled(const struct i915_execbuffer *eb)
 	 */
 	if (GRAPHICS_VER(eb->i915) < 12 || IS_TIGERLAKE(eb->i915) ||
 	    IS_ROCKETLAKE(eb->i915) || IS_ALDERLAKE_S(eb->i915) ||
-	    IS_ALDERLAKE_P(eb->i915) || IS_DG2(eb->i915))
+	    IS_ALDERLAKE_P(eb->i915))
 		return true;
 
 	return false;