mbox series

[0/4] Update whitelist support for new hardware

Message ID 20190614002838.3072-1-robert.m.fosha@intel.com (mailing list archive)
Headers show
Series Update whitelist support for new hardware | expand

Message

Fosha, Robert M June 14, 2019, 12:28 a.m. UTC
Recent hardware adds support for finer-grained control over
whitelisting, allowing registers to be whitelisted independently
for reads and/or writes. This series will add the basic plumbing
to support that.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Robert M. Fosha <robert.m.fosha@intel.com>

John Harrison (4):
  drm/i915: Support flags in whitlist WAs
  drm/i915: Support whitelist workarounds on all engines
  drm/i915: Add whitelist workarounds for CFL
  drm/i915: Add whitelist workarounds for ICL

 drivers/gpu/drm/i915/gt/intel_workarounds.c | 191 +++++++++++++++++---
 drivers/gpu/drm/i915/i915_reg.h             |   7 +
 2 files changed, 172 insertions(+), 26 deletions(-)

Comments

Tvrtko Ursulin June 14, 2019, 5:19 p.m. UTC | #1
On 14/06/2019 15:23, Patchwork wrote:
> == Series Details ==
> 
> Series: Update whitelist support for new hardware
> URL   : https://patchwork.freedesktop.org/series/62076/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_6265 -> Patchwork_13279
> ====================================================
> 
> Summary
> -------
> 
>    **FAILURE**
> 
>    Serious unknown changes coming with Patchwork_13279 absolutely need to be
>    verified manually.
>    
>    If you think the reported changes have nothing to do with the changes
>    introduced in Patchwork_13279, please notify your bug team to allow them
>    to document this new failure mode, which will reduce false positives in CI.
> 
>    External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13279/
> 
> Possible new issues
> -------------------
> 
>    Here are the unknown changes that may have been introduced in Patchwork_13279:
> 
> ### IGT changes ###
> 
> #### Possible regressions ####
> 
>    * igt@i915_selftest@live_workarounds:
>      - fi-cfl-guc:         [PASS][1] -> [DMESG-FAIL][2]
>     [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6265/fi-cfl-guc/igt@i915_selftest@live_workarounds.html
>     [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13279/fi-cfl-guc/igt@i915_selftest@live_workarounds.html
>      - fi-cfl-8109u:       [PASS][3] -> [DMESG-FAIL][4]
>     [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6265/fi-cfl-8109u/igt@i915_selftest@live_workarounds.html
>     [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13279/fi-cfl-8109u/igt@i915_selftest@live_workarounds.html
>      - fi-whl-u:           [PASS][5] -> [DMESG-FAIL][6]
>     [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6265/fi-whl-u/igt@i915_selftest@live_workarounds.html
>     [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13279/fi-whl-u/igt@i915_selftest@live_workarounds.html
>      - fi-cfl-8700k:       [PASS][7] -> [DMESG-FAIL][8]
>     [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6265/fi-cfl-8700k/igt@i915_selftest@live_workarounds.html
>     [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13279/fi-cfl-8700k/igt@i915_selftest@live_workarounds.html
>      - fi-cml-u2:          [PASS][9] -> [DMESG-FAIL][10]
>     [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6265/fi-cml-u2/igt@i915_selftest@live_workarounds.html
>     [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13279/fi-cml-u2/igt@i915_selftest@live_workarounds.html

Read-back of 0x20ec (GEN9_CS_DEBUG_MODE1) seems to be causing issues. It 
could be put on the wo_register list if appropriate. And/or made 
RING_FORCE_TO_NONPRIV_RW.

However now I realize that handling of new whitelisting modes is missing 
from the selftest altogether. All RING_FORCE_TO_NONPRIV_RD and 
RING_FORCE_TO_NONPRIV_WR probably need explicit handling there - first 
ones to check they cannot be written to, and skip read-back on the 
second one.

Regards,

Tvrtko

> 
>    
> Known issues
> ------------
> 
>    Here are the changes found in Patchwork_13279 that come from known issues:
> 
> ### IGT changes ###
> 
> #### Issues hit ####
> 
>    * igt@gem_exec_create@basic:
>      - fi-cml-u:           [PASS][11] -> [INCOMPLETE][12] ([fdo#110566])
>     [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6265/fi-cml-u/igt@gem_exec_create@basic.html
>     [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13279/fi-cml-u/igt@gem_exec_create@basic.html
> 
>    * igt@i915_selftest@live_blt:
>      - fi-skl-iommu:       [PASS][13] -> [INCOMPLETE][14] ([fdo#108602])
>     [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6265/fi-skl-iommu/igt@i915_selftest@live_blt.html
>     [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13279/fi-skl-iommu/igt@i915_selftest@live_blt.html
> 
>    
> #### Possible fixes ####
> 
>    * igt@kms_chamelium@hdmi-hpd-fast:
>      - fi-kbl-7500u:       [FAIL][15] ([fdo#109485]) -> [PASS][16]
>     [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6265/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
>     [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13279/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
> 
>    
>    [fdo#108602]: https://bugs.freedesktop.org/show_bug.cgi?id=108602
>    [fdo#109485]: https://bugs.freedesktop.org/show_bug.cgi?id=109485
>    [fdo#110566]: https://bugs.freedesktop.org/show_bug.cgi?id=110566
> 
> 
> Participating hosts (53 -> 47)
> ------------------------------
> 
>    Additional (1): fi-byt-j1900
>    Missing    (7): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper fi-bdw-samus
> 
> 
> Build changes
> -------------
> 
>    * Linux: CI_DRM_6265 -> Patchwork_13279
> 
>    CI_DRM_6265: 657b9f601946cab518d8911ea92dc0f437a1f4b4 @ git://anongit.freedesktop.org/gfx-ci/linux
>    IGT_5055: 495287320225e7f180d384cad7b207b77154438f @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
>    Patchwork_13279: f0e2fb83377c7146bce4b758e107f46194607038 @ git://anongit.freedesktop.org/gfx-ci/linux
> 
> 
> == Linux commits ==
> 
> f0e2fb83377c drm/i915: Add whitelist workarounds for ICL
> 4330ded0139b drm/i915: Add whitelist workarounds for CFL
> 4ac7341292ad drm/i915: Support whitelist workarounds on all engines
> 75becc1f2fcf drm/i915: Support flags in whitlist WAs
> 
> == Logs ==
> 
> For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13279/
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
John Harrison June 14, 2019, 6:17 p.m. UTC | #2
On 6/14/2019 10:19, Tvrtko Ursulin wrote:
>
> Read-back of 0x20ec (GEN9_CS_DEBUG_MODE1) seems to be causing issues. 
> It could be put on the wo_register list if appropriate. And/or made 
> RING_FORCE_TO_NONPRIV_RW.
>
> However now I realize that handling of new whitelisting modes is 
> missing from the selftest altogether. All RING_FORCE_TO_NONPRIV_RD and 
> RING_FORCE_TO_NONPRIV_WR probably need explicit handling there - first 
> ones to check they cannot be written to, and skip read-back on the 
> second one.
>
> Regards,
>
> Tvrtko

It looks like there are multiple issues.

I think the quickest/simplest option right now is to drop a bunch of the 
ICL updates (because there are too many for the current 12 slot maximum) 
and to drop the problematic CFL updates.

The only changes that are urgently needed are the HUC status registers. 
Unfortunately, they are read-only entries. So the self-test will need to 
be updated to cope with those. Not sure what it can do though. Do we 
just skip read-only entries? The test can't know what a valid value to 
read back is.

The rest we can put back in later on once all the other issues have been 
resolved.

John.
Chris Wilson June 14, 2019, 6:22 p.m. UTC | #3
Quoting John Harrison (2019-06-14 19:17:25)
> On 6/14/2019 10:19, Tvrtko Ursulin wrote:
> >
> > Read-back of 0x20ec (GEN9_CS_DEBUG_MODE1) seems to be causing issues. 
> > It could be put on the wo_register list if appropriate. And/or made 
> > RING_FORCE_TO_NONPRIV_RW.
> >
> > However now I realize that handling of new whitelisting modes is 
> > missing from the selftest altogether. All RING_FORCE_TO_NONPRIV_RD and 
> > RING_FORCE_TO_NONPRIV_WR probably need explicit handling there - first 
> > ones to check they cannot be written to, and skip read-back on the 
> > second one.
> >
> > Regards,
> >
> > Tvrtko
> 
> It looks like there are multiple issues.
> 
> I think the quickest/simplest option right now is to drop a bunch of the 
> ICL updates (because there are too many for the current 12 slot maximum) 
> and to drop the problematic CFL updates.
> 
> The only changes that are urgently needed are the HUC status registers. 
> Unfortunately, they are read-only entries. So the self-test will need to 
> be updated to cope with those. Not sure what it can do though. Do we 
> just skip read-only entries? The test can't know what a valid value to 
> read back is.

There's an exception list [wo_register] for registers that don't
conform to the test expectations.
-Chris
Tvrtko Ursulin June 14, 2019, 6:44 p.m. UTC | #4
On 14/06/2019 19:22, Chris Wilson wrote:
> Quoting John Harrison (2019-06-14 19:17:25)
>> On 6/14/2019 10:19, Tvrtko Ursulin wrote:
>>>
>>> Read-back of 0x20ec (GEN9_CS_DEBUG_MODE1) seems to be causing issues.
>>> It could be put on the wo_register list if appropriate. And/or made
>>> RING_FORCE_TO_NONPRIV_RW.
>>>
>>> However now I realize that handling of new whitelisting modes is
>>> missing from the selftest altogether. All RING_FORCE_TO_NONPRIV_RD and
>>> RING_FORCE_TO_NONPRIV_WR probably need explicit handling there - first
>>> ones to check they cannot be written to, and skip read-back on the
>>> second one.
>>>
>>> Regards,
>>>
>>> Tvrtko
>>
>> It looks like there are multiple issues.
>>
>> I think the quickest/simplest option right now is to drop a bunch of the
>> ICL updates (because there are too many for the current 12 slot maximum)
>> and to drop the problematic CFL updates.
>>
>> The only changes that are urgently needed are the HUC status registers.
>> Unfortunately, they are read-only entries. So the self-test will need to
>> be updated to cope with those. Not sure what it can do though. Do we
>> just skip read-only entries? The test can't know what a valid value to
>> read back is.

Plan sounds good to me. (Drop everything apart from HuC.)

> There's an exception list [wo_register] for registers that don't
> conform to the test expectations.

But that's not appropriate for read-only ones. I think for these we 
modify the selftest to read before and after writing and check value is 
unchanged. Is that doable today or Monday at latest?

Regards,

Tvrtko