[CI,0/3] add more probe failures

Message ID 20190802122125.21952-1-michal.wajdeczko@intel.com (mailing list archive)

Message

Michal Wajdeczko Aug. 2, 2019, 12:21 p.m. UTC
v3: fix Gen9 issue discovered by the v2
v4: rebased
v5: more injected errors and more fixes

Michal Wajdeczko (3):
  drm/i915: Add i915 to i915_inject_probe_failure
  drm/i915/uc: Inject probe errors into intel_uc_init_hw
  drm/i915/wopcm: Don't fail on WOPCM partitioning failure

 .../gpu/drm/i915/display/intel_connector.c    |  2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  4 +++
 drivers/gpu/drm/i915/gt/uc/intel_huc.c        |  8 +++--
 drivers/gpu/drm/i915/gt/uc/intel_uc.c         | 25 +++++++++++++---
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c      | 18 +++++++----
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h      | 14 ++++-----
 drivers/gpu/drm/i915/i915_drv.c               | 27 +++++++++--------
 drivers/gpu/drm/i915/i915_drv.h               | 12 ++++----
 drivers/gpu/drm/i915/i915_gem.c               | 18 ++++-------
 drivers/gpu/drm/i915/i915_pci.c               |  2 +-
 drivers/gpu/drm/i915/intel_gvt.c              |  2 +-
 drivers/gpu/drm/i915/intel_uncore.c           |  2 +-
 drivers/gpu/drm/i915/intel_wopcm.c            | 30 +++++++++----------
 drivers/gpu/drm/i915/intel_wopcm.h            |  2 +-
 15 files changed, 97 insertions(+), 71 deletions(-)

Comments

Chris Wilson Aug. 2, 2019, 1:30 p.m. UTC | #1
Quoting Patchwork (2019-08-02 14:12:38)
> == Series Details ==
> 
> Series: add more probe failures (rev6)
> URL   : https://patchwork.freedesktop.org/series/64390/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_6614 -> Patchwork_13848
> ====================================================
> 
> Summary
> -------
> 
>   **FAILURE**
> 
>   Serious unknown changes coming with Patchwork_13848 absolutely need to be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in Patchwork_13848, please notify your bug team to allow them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13848/
> 
> Possible new issues
> -------------------
> 
>   Here are the unknown changes that may have been introduced in Patchwork_13848:
> 
> ### IGT changes ###
> 
> #### Possible regressions ####
> 
>   * igt@i915_module_load@reload-with-fault-injection:
>     - fi-cfl-guc:         [PASS][1] -> [DMESG-WARN][2]
>    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6614/fi-cfl-guc/igt@i915_module_load@reload-with-fault-injection.html
>    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13848/fi-cfl-guc/igt@i915_module_load@reload-with-fault-injection.html

Looks like we didn't flush the cleanup we started before reporting the
error. Maybe,
https://patchwork.freedesktop.org/patch/321279/?series=64612&rev=1
-Chris

Michal Wajdeczko Aug. 2, 2019, 5:02 p.m. UTC | #2
On Fri, 02 Aug 2019 15:30:19 +0200, Chris Wilson  
<chris@chris-wilson.co.uk> wrote:

> Quoting Patchwork (2019-08-02 14:12:38)
>> == Series Details ==
>>
>> Series: add more probe failures (rev6)
>> URL   : https://patchwork.freedesktop.org/series/64390/
>> State : failure
>>
>> == Summary ==
>>
>> CI Bug Log - changes from CI_DRM_6614 -> Patchwork_13848
>> ====================================================
>>
>> Summary
>> -------
>>
>>   **FAILURE**
>>
>>   Serious unknown changes coming with Patchwork_13848 absolutely need to be
>>   verified manually.
>>
>>   If you think the reported changes have nothing to do with the changes
>>   introduced in Patchwork_13848, please notify your bug team to allow them
>>   to document this new failure mode, which will reduce false positives in CI.
>>
>>   External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13848/
>>
>> Possible new issues
>> -------------------
>>
>>   Here are the unknown changes that may have been introduced in Patchwork_13848:
>>
>> ### IGT changes ###
>>
>> #### Possible regressions ####
>>
>>   * igt@i915_module_load@reload-with-fault-injection:
>>     - fi-cfl-guc:         [PASS][1] -> [DMESG-WARN][2]
>>    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6614/fi-cfl-guc/igt@i915_module_load@reload-with-fault-injection.html
>>    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13848/fi-cfl-guc/igt@i915_module_load@reload-with-fault-injection.html
>
> Looks like we didn't flush the cleanup we started before reporting the
> error. Maybe,

it's a "leaked" guc err log that we now try to free after gem, see:

https://cgit.freedesktop.org/drm/drm-tip/commit/?id=6f76098fe0f3f0b519b2ad528b4319195d6d0f73

Daniele Ceraolo Spurio Aug. 2, 2019, 6:07 p.m. UTC | #3
On 8/2/19 10:02 AM, Michal Wajdeczko wrote:
> On Fri, 02 Aug 2019 15:30:19 +0200, Chris Wilson 
> <chris@chris-wilson.co.uk> wrote:
> 
>> Quoting Patchwork (2019-08-02 14:12:38)
>>> == Series Details ==
>>>
>>> Series: add more probe failures (rev6)
>>> URL   : https://patchwork.freedesktop.org/series/64390/
>>> State : failure
>>>
>>> == Summary ==
>>>
>>> CI Bug Log - changes from CI_DRM_6614 -> Patchwork_13848
>>> ====================================================
>>>
>>> Summary
>>> -------
>>>
>>>   **FAILURE**
>>>
>>>   Serious unknown changes coming with Patchwork_13848 absolutely need to be
>>>   verified manually.
>>>
>>>   If you think the reported changes have nothing to do with the changes
>>>   introduced in Patchwork_13848, please notify your bug team to allow them
>>>   to document this new failure mode, which will reduce false positives in CI.
>>>
>>>   External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13848/
>>>
>>> Possible new issues
>>> -------------------
>>>
>>>   Here are the unknown changes that may have been introduced in Patchwork_13848:
>>>
>>> ### IGT changes ###
>>>
>>> #### Possible regressions ####
>>>
>>>   * igt@i915_module_load@reload-with-fault-injection:
>>>     - fi-cfl-guc:         [PASS][1] -> [DMESG-WARN][2]
>>>    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6614/fi-cfl-guc/igt@i915_module_load@reload-with-fault-injection.html
>>>    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13848/fi-cfl-guc/igt@i915_module_load@reload-with-fault-injection.html
>>>
>>
>> Looks like we didn't flush the cleanup we started before reporting the
>> error. Maybe,
> 
> it's "leaked" guc err log, that we now try to free after gem, see:
> 
> https://cgit.freedesktop.org/drm/drm-tip/commit/?id=6f76098fe0f3f0b519b2ad528b4319195d6d0f73 
> 

Not sure why, but I was convinced that the log release was in uc_fini
(I should probably have checked). IMO late_release isn't the place to
release the logs, but I'm not sure where to put it, since we don't really
do anything with the log at the moment (we stopped wedging on uc failure
a long time ago, so the log is never accessible). What's the
plan with https://patchwork.freedesktop.org/patch/298002/? That would
make the log useful again and skip uc_fini on load failure, making that
the best place to put the release.

Daniele

Michal Wajdeczko Aug. 2, 2019, 6:14 p.m. UTC | #4
On Fri, 02 Aug 2019 20:07:22 +0200, Daniele Ceraolo Spurio  
<daniele.ceraolospurio@intel.com> wrote:

>
>
> On 8/2/19 10:02 AM, Michal Wajdeczko wrote:
>> On Fri, 02 Aug 2019 15:30:19 +0200, Chris Wilson  
>> <chris@chris-wilson.co.uk> wrote:
>>
>>> Quoting Patchwork (2019-08-02 14:12:38)
>>>> == Series Details ==
>>>>
>>>> Series: add more probe failures (rev6)
>>>> URL   : https://patchwork.freedesktop.org/series/64390/
>>>> State : failure
>>>>
>>>> == Summary ==
>>>>
>>>> CI Bug Log - changes from CI_DRM_6614 -> Patchwork_13848
>>>> ====================================================
>>>>
>>>> Summary
>>>> -------
>>>>
>>>>   **FAILURE**
>>>>
>>>>   Serious unknown changes coming with Patchwork_13848 absolutely need to be
>>>>   verified manually.
>>>>
>>>>   If you think the reported changes have nothing to do with the changes
>>>>   introduced in Patchwork_13848, please notify your bug team to allow them
>>>>   to document this new failure mode, which will reduce false positives in CI.
>>>>
>>>>   External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13848/
>>>>
>>>> Possible new issues
>>>> -------------------
>>>>
>>>>   Here are the unknown changes that may have been introduced in Patchwork_13848:
>>>>
>>>> ### IGT changes ###
>>>>
>>>> #### Possible regressions ####
>>>>
>>>>   * igt@i915_module_load@reload-with-fault-injection:
>>>>     - fi-cfl-guc:         [PASS][1] -> [DMESG-WARN][2]
>>>>    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6614/fi-cfl-guc/igt@i915_module_load@reload-with-fault-injection.html
>>>>    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13848/fi-cfl-guc/igt@i915_module_load@reload-with-fault-injection.html
>>>
>>> Looks like we didn't flush the cleanup we started before reporting the
>>> error. Maybe,
>>
>> it's "leaked" guc err log, that we now try to free after gem, see:
>>
>> https://cgit.freedesktop.org/drm/drm-tip/commit/?id=6f76098fe0f3f0b519b2ad528b4319195d6d0f73
>
> Not sure why but I was convinced that the log release was in uc_fini  
> (but I should probably have checked). late_release isn't the place to  
> release the logs IMO, but not sure where to put it since we don't really  
> do anything with the log at the moment (we stopped wedging on uc failure  
> a long time ago and the log is therefore never accessible). What's the  
> plan with https://patchwork.freedesktop.org/patch/298002/? that would  
> make the log useful again and skip uc_fini on load failure, making that  
> the best place to put the release in.

Restoring -EIO is next on my list.
For now I want to make the probe error path more robust.
An updated series with a fix for the last issue will be sent shortly.

~Michal
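
The series discussed above adds more injected probe failures and hardens the unwind path they exercise. As a rough, self-contained illustration of that general technique (not the i915 code: every name below, including inject_at, inject_probe_failure(), init_log(), init_gem(), init_hw() and run_probe(), is hypothetical), a checkpoint counter can force each init step to fail in turn, and the caller can then check that the error path left nothing allocated:

```c
#include <errno.h>
#include <stdio.h>

/* All names here are hypothetical; this is not the i915 implementation. */

static unsigned int inject_at;	/* fail at this checkpoint, 0 = never */
static unsigned int checkpoint;	/* checkpoints passed so far */
static int log_allocated, gem_ready;

/* Return -ENODEV at the selected checkpoint, 0 everywhere else. */
static int inject_probe_failure(const char *func)
{
	if (!inject_at || ++checkpoint != inject_at)
		return 0;

	fprintf(stderr, "injecting failure at checkpoint %u in %s\n",
		checkpoint, func);
	return -ENODEV;
}

static int init_log(void)
{
	int err = inject_probe_failure(__func__);

	if (err)
		return err;
	log_allocated = 1;
	return 0;
}

static void fini_log(void)
{
	log_allocated = 0;
}

static int init_gem(void)
{
	int err = inject_probe_failure(__func__);

	if (err)
		return err;
	gem_ready = 1;
	return 0;
}

static void fini_gem(void)
{
	gem_ready = 0;
}

/* Last init step: acquires nothing, so it only needs the checkpoint. */
static int init_hw(void)
{
	return inject_probe_failure(__func__);
}

/* Init in order; on error, unwind in reverse order of acquisition. */
static int probe(void)
{
	int err;

	err = init_log();
	if (err)
		return err;

	err = init_gem();
	if (err)
		goto err_log;

	err = init_hw();
	if (err)
		goto err_gem;

	return 0;

err_gem:
	fini_gem();
err_log:
	fini_log();
	return err;
}

/* Probe with a failure at checkpoint fail_at; report what the unwind missed. */
static int run_probe(unsigned int fail_at, int *leaked)
{
	int err;

	inject_at = fail_at;
	checkpoint = 0;
	log_allocated = gem_ready = 0;

	err = probe();
	*leaked = err ? log_allocated + gem_ready : 0;

	if (!err) {
		/* Normal teardown after a successful probe. */
		fini_gem();
		fini_log();
	}
	return err;
}
```

Calling run_probe() with fail_at set to 1, 2 and 3 fails each step in turn; a non-zero *leaked would point at a resource the unwind path forgot to release, which is the kind of leak described above for the guc err log.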