drm/i915/tgl/psr: Fix glitches when doing frontbuffer modifications

Message ID 20201002231627.24528-1-jose.souza@intel.com (mailing list archive)
State New, archived
Series drm/i915/tgl/psr: Fix glitches when doing frontbuffer modifications

Commit Message

Souza, Jose Oct. 2, 2020, 11:16 p.m. UTC
Writes to CURSURFLIVE in TGL are causing IOMMU errors and visual
glitches that are often reproduced when executing CPU-intensive
workloads while an eDP 4K panel is attached.

Manually exiting PSR causes the frontbuffer to be updated without
glitches and the IOMMU errors are also gone, but this comes at the cost
of less time with PSR active.

So use this workaround until this issue is root-caused and a better
fix is found.

The current code is already able to re-enable PSR after this exit if
there are no other frontbuffer modifications.

Add a new if block in psr_force_hw_tracking_exit() instead of reusing
the else/gen8- block because the plan is to revert this workaround
as soon as a better solution is found.

Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
---
 drivers/gpu/drm/i915/display/intel_psr.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

Comments

Souza, Jose Oct. 5, 2020, 6:55 p.m. UTC | #1
On Sat, 2020-10-03 at 01:26 +0000, Patchwork wrote:
> Patch Details
> Series:	drm/i915/tgl/psr: Fix glitches when doing frontbuffer modifications
> URL:	https://patchwork.freedesktop.org/series/82351/
> State:	failure
> Details:	https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18617/index.html
> CI Bug Log - changes from CI_DRM_9093_full -> Patchwork_18617_full
> Summary
> FAILURE
> 
> Serious unknown changes coming with Patchwork_18617_full absolutely need to be
> verified manually.
> 
> If you think the reported changes have nothing to do with the changes
> introduced in Patchwork_18617_full, please notify your bug team to allow them
> to document this new failure mode, which will reduce false positives in CI.
> 
> Possible new issues
> Here are the unknown changes that may have been introduced in Patchwork_18617_full:
> 
> IGT changes
> Possible regressions
> igt@gem_userptr_blits@unsync-unmap-cycles:
> 
> shard-skl: PASS -> TIMEOUT
> igt@kms_cursor_edge_walk@pipe-c-64x64-right-edge:
> 
> shard-hsw: PASS -> INCOMPLETE

The two above are not related, as this change only affects TGL.


> igt@kms_cursor_legacy@all-pipes-forked-bo:
> 
> shard-tglb: PASS -> INCOMPLETE

Judging by the logs, something went badly wrong while executing this test, but it does not look related.


> igt@kms_psr2_su@frontbuffer:
> 
> shard-tglb: PASS -> FAIL +1 similar issue


This failure is expected with this change.

> Suppressed
> The following results come from untrusted machines, tests, or statuses.
> They do not affect the overall result.
> 
> {igt@kms_async_flips@test-time-stamp}:
> shard-tglb: PASS -> FAIL
> Known issues
> Here are the changes found in Patchwork_18617_full that come from known issues:
> 
> IGT changes
> Issues hit
> igt@gem_exec_create@madvise:
> 
> shard-glk: PASS -> DMESG-WARN (i915#118 / i915#95)
> igt@i915_pm_rc6_residency@rc6-idle:
> 
> shard-hsw: PASS -> FAIL (i915#1860)
> igt@kms_cursor_crc@pipe-b-cursor-suspend:
> 
> shard-skl: PASS -> INCOMPLETE (i915#300)
> igt@kms_cursor_legacy@flip-vs-cursor-atomic:
> 
> shard-tglb: PASS -> FAIL (i915#2346) +3 similar issues
> igt@kms_cursor_legacy@flip-vs-cursor-busy-crc-atomic:
> 
> shard-kbl: PASS -> DMESG-WARN (i915#1982)
> igt@kms_draw_crc@draw-method-xrgb2101010-mmap-wc-untiled:
> 
> shard-apl: PASS -> DMESG-WARN (i915#1635 / i915#1982)
> igt@kms_flip@flip-vs-blocking-wf-vblank@a-edp1:
> 
> shard-skl: PASS -> DMESG-WARN (i915#1982) +9 similar issues
> igt@kms_flip@flip-vs-expired-vblank@c-edp1:
> 
> shard-skl: PASS -> FAIL (i915#79)
> igt@kms_flip@flip-vs-suspend-interruptible@a-dp1:
> 
> shard-kbl: PASS -> DMESG-WARN (i915#180) +3 similar issues
> igt@kms_flip_tiling@flip-changes-tiling:
> 
> shard-skl: PASS -> FAIL (i915#699)
> igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-indfb-draw-mmap-cpu:
> 
> shard-skl: PASS -> FAIL (i915#49) +1 similar issue
> igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-indfb-draw-mmap-gtt:
> 
> shard-tglb: PASS -> DMESG-WARN (i915#1982) +1 similar issue
> igt@kms_hdr@bpc-switch-dpms:
> 
> shard-skl: PASS -> FAIL (i915#1188)
> igt@kms_plane@plane-panning-bottom-right-suspend-pipe-c-planes:
> 
> shard-iclb: PASS -> INCOMPLETE (i915#1185 / i915#250)
> igt@kms_plane_alpha_blend@pipe-b-coverage-7efc:
> 
> shard-skl: PASS -> FAIL (fdo#108145 / i915#265) +1 similar issue
> igt@kms_psr@psr2_sprite_mmap_gtt:
> 
> shard-iclb: PASS -> SKIP (fdo#109441)
> Possible fixes
> igt@gem_exec_reloc@basic-cpu-gtt-active:
> 
> shard-skl: DMESG-WARN (i915#1982) -> PASS +3 similar issues
> igt@gem_exec_reloc@basic-many-active@vecs0:
> 
> shard-glk: FAIL (i915#2389) -> PASS
> {igt@kms_async_flips@async-flip-with-page-flip-events}:
> 
> shard-kbl: FAIL (i915#2521) -> PASS
> igt@kms_big_fb@linear-8bpp-rotate-0:
> 
> shard-apl: DMESG-WARN (i915#1635 / i915#1982) -> PASS
> igt@kms_flip@plain-flip-fb-recreate@b-edp1:
> 
> shard-skl: FAIL (i915#2122) -> PASS
> igt@kms_flip_tiling@flip-changes-tiling-yf:
> 
> shard-kbl: DMESG-WARN (i915#1982) -> PASS
> igt@kms_frontbuffer_tracking@psr-1p-primscrn-cur-indfb-draw-render:
> 
> shard-tglb: DMESG-WARN (i915#1982) -> PASS
> igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
> 
> shard-skl: FAIL (fdo#108145 / i915#265) -> PASS
> igt@kms_psr@psr2_cursor_mmap_cpu:
> 
> shard-iclb: SKIP (fdo#109441) -> PASS +1 similar issue
> igt@kms_setmode@basic:
> 
> shard-glk: FAIL (i915#31) -> PASS
> igt@kms_vblank@pipe-b-ts-continuation-suspend:
> 
> shard-kbl: DMESG-WARN (i915#180) -> PASS
> Warnings
> igt@i915_pm_rc6_residency@rc6-idle:
> 
> shard-iclb: FAIL (i915#1515) -> WARN (i915#1515)
> igt@kms_cursor_legacy@flip-vs-cursor-atomic:
> 
> shard-skl: DMESG-FAIL (i915#1982) -> DMESG-WARN (i915#1982)
> igt@kms_vblank@pipe-a-ts-continuation-suspend:
> 
> shard-skl: INCOMPLETE (i915#198) -> DMESG-WARN (i915#1982)
> igt@runner@aborted:
> 
> shard-skl: FAIL (i915#1436) -> FAIL (i915#1611 / i915#2029)
> {name}: This element is suppressed. This means it is ignored when computing
> the status of the difference (SUCCESS, WARNING, or FAILURE).
> 
> Participating hosts (11 -> 10)
> Missing (1): pig-glk-j5005
> 
> Build changes
> Linux: CI_DRM_9093 -> Patchwork_18617
> CI-20190529: 20190529
> CI_DRM_9093: 827ebff930c6340ed1c1c274909717525951c496 @ git://anongit.freedesktop.org/gfx-ci/linux
> IGT_5798: 430bad5a53c08125fbd48978ed6a66f61a33a40b @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
> Patchwork_18617: f1ada30987fd65e158d57983299cc772f8af8a7a @ git://anongit.freedesktop.org/gfx-ci/linux
> piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit
> 
>
Gwan-gyeong Mun Oct. 12, 2020, 6:12 p.m. UTC | #2
After applying this patch, the psr screen glitch issue is still seen.
On Fri, 2020-10-02 at 16:16 -0700, José Roberto de Souza wrote:
> Writes to CURSURFLIVE in TGL are causing IOMMU errors and visual
> glitches that are often reproduced when executing CPU intensive
> workloads while a eDP 4K panel is attached.
> 
> Manually exiting PSR causes the frontbuffer to be updated without
> glitches and the IOMMU errors are also gone but this comes at the
> cost
> of less time with PSR active.
> 
> So using this workaround until this issue is root caused and a better
> fix is found.
> 
> The current code is already ready to enable PSR after this exit if
> there is not other frontbuffer modifications.
> 
> Adding a new if block in psr_force_hw_tracking_exit() instead of
> reuse
> the else/gen8- block because the plan is to revert this workaround
> as soon as a better solution is found.
> 
> Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_psr.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c
> index 8a9d0bdde1bf..8630121dbbbe 100644
> --- a/drivers/gpu/drm/i915/display/intel_psr.c
> +++ b/drivers/gpu/drm/i915/display/intel_psr.c
> @@ -1152,7 +1152,21 @@ void intel_psr_disable(struct intel_dp *intel_dp,
>  
>  static void psr_force_hw_tracking_exit(struct drm_i915_private *dev_priv)
>  {
> -	if (INTEL_GEN(dev_priv) >= 9)
> +	if (IS_TIGERLAKE(dev_priv))
> +		/*
> +		 * Writes to CURSURFLIVE in TGL are causing IOMMU errors and
> +		 * visual glitches that are often reproduced when executing
> +		 * CPU intensive workloads while a eDP 4K panel is attached.
> +		 *
> +		 * Manually exiting PSR causes the frontbuffer to be updated
> +		 * without glitches and the IOMMU errors are also gone but
> +		 * this comes at the cost of less time with PSR active.
> +		 *
> +		 * So using this workaround until this issue is root caused
> +		 * and a better fix is found.
> +		 */
> +		intel_psr_exit(dev_priv);
> +	else if (INTEL_GEN(dev_priv) >= 9)
>  		/*
>  		 * Display WA #0884: skl+
>  		 * This documented WA for bxt can be safely applied
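
The branch order in the hunk above can be modeled outside the kernel. The
following is only a user-space sketch, not driver code: struct psr_platform,
enum psr_exit_action and pick_psr_exit_action() are hypothetical names made
up for illustration, and the assumption that the final else branch also
exits PSR manually is taken from the commit message's mention of the
"else/gen8- block".

```c
#include <stdbool.h>

/* Hypothetical stand-in for the platform checks used in the patch. */
struct psr_platform {
	int gen;           /* models INTEL_GEN(dev_priv) */
	bool is_tigerlake; /* models IS_TIGERLAKE(dev_priv) */
};

/* The two ways the function can force PSR out of its active state. */
enum psr_exit_action {
	PSR_ACTION_MANUAL_EXIT,      /* models intel_psr_exit() */
	PSR_ACTION_CURSURFLIVE_WRITE /* models the Display WA #0884 write */
};

/*
 * Mirrors the branch order of the patched function: the TGL check comes
 * first, so TGL never reaches the gen9+ CURSURFLIVE write even though
 * its gen number would satisfy "gen >= 9".
 */
static enum psr_exit_action
pick_psr_exit_action(const struct psr_platform *p)
{
	if (p->is_tigerlake)
		return PSR_ACTION_MANUAL_EXIT;
	else if (p->gen >= 9)
		return PSR_ACTION_CURSURFLIVE_WRITE;
	else
		return PSR_ACTION_MANUAL_EXIT; /* assumed gen8- behavior */
}
```

Keeping the TGL case as its own leading branch, rather than folding it into
the final else, matches the stated intent of making the workaround easy to
revert later.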
Souza, Jose Oct. 12, 2020, 6:15 p.m. UTC | #3
On Mon, 2020-10-12 at 19:12 +0100, Mun, Gwan-gyeong wrote:
> After applying this patch, the psr screen glitch issue is still seen.

Same IOMMU errors too? On my end it is fixed.
Can you also give it a try without the DMC firmware and without these changes?


Gwan-gyeong Mun Oct. 12, 2020, 7:04 p.m. UTC | #4
On Mon, 2020-10-12 at 11:15 -0700, Souza, Jose wrote:
> On Mon, 2020-10-12 at 19:12 +0100, Mun, Gwan-gyeong wrote:
> > After applying this patch, the psr screen glitch issue is still
> > seen.
> 
> Same IOMMU errors too? In my end it is fixed.
> Can you also give a try without the DMC firmware and without this
> changes?
> 
- The result with the DMC firmware (tgl_dmc_ver2_08.bin, the latest drm-tip
requires this version) shows the PSR screen glitch issue.
- The result without the DMC firmware does not show the PSR screen glitch issue.
Gwan-gyeong Mun Oct. 22, 2020, 12:43 p.m. UTC | #5
1. While testing the problematic scenario on drm-tip, the IOMMU (DMAR)
errors below did not always show up.
   (Sometimes the error messages were raised, but sometimes they were
not, on the same kernel and scenario.)
  
DMAR: DRHD: handling fault status reg 2
DMAR: [DMA Read] Request device [00:02.0] PASID 0xffffffff fault addr
0xfc001000 [fault reason 06] PTE Read access is not set
DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Read] Request device [00:02.0] PASID 0xffffffff fault addr
0xfc000000 [fault reason 06] PTE Read access is not set

2. After applying this patch, the screen glitch issues have been
remarkably alleviated.
  - The screen glitches still show up, but infrequently.
  - I agree to apply this patch as a workaround, with the explanation
below added.
 
3. The DC state and PSR enable/disable scenarios have been changed by
this patch.
      
 (1) Before applying patch
  enable psr 
    -> (front buffer updates) 
        -> intel_psr_flush 
                  ^   -> psr_force_hw_tracking_exit()
                  |                   : write CURSURFLIVE 
                  |                          |
                  |  (front buffer updates)  |
                  +--------------------------+

    PSR enabled -------------------------------------- -->
   ( DC state controlled by DMC firmware)
                                   

 (2) After applying patch
  enable psr 
   ^  -> (front buffer updates) 
   |       -> intel_psr_flush
   |          -> psr_force_hw_tracking_exit()
   |               : call intel_psr_exit()
   |                                 -> disable psr
   |                                         |
   |                                         |
   +-----------------------------------------+

PSR enabled ---------------------------> PSR disabled
  ^                                          |
  |                                          |
  +------------------------------------------+
   ( DC state controlled by DMC firmware)

Repeatedly enabling and disabling PSR under rapid screen updates
prevents entering the low-power DC states.
Inferring from this scenario, the patch indirectly touches the DC state,
and that alleviates the issue.
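
The two flows sketched in the diagrams above can be expressed as a toy state
model. This is a rough illustration under stated assumptions, not i915 code:
the event names, struct psr_model and psr_model_run() are hypothetical, and
"idle" simply stands for the point where the driver re-enables PSR because
no further frontbuffer modifications arrived.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical events: a frontbuffer flush, or an idle period after it. */
enum fb_event { FB_FLUSH, FB_IDLE };

struct psr_model {
	bool enabled;               /* is PSR currently enabled? */
	unsigned int disable_count; /* enabled -> disabled transitions */
};

/*
 * Advance the model by one event.
 * manual_exit == false: flow (1), the flush only nudges hardware tracking
 *                       (CURSURFLIVE write) and PSR stays enabled.
 * manual_exit == true:  flow (2), the flush disables PSR until the next
 *                       idle period re-enables it.
 */
static void psr_model_step(struct psr_model *m, enum fb_event ev,
			   bool manual_exit)
{
	if (ev == FB_FLUSH) {
		if (manual_exit && m->enabled) {
			m->enabled = false;
			m->disable_count++;
		}
	} else {
		m->enabled = true;
	}
}

/* Run a whole event sequence and report how often PSR was disabled. */
static unsigned int psr_model_run(const enum fb_event *events, size_t n,
				  bool manual_exit)
{
	struct psr_model m = { .enabled = true, .disable_count = 0 };

	for (size_t i = 0; i < n; i++)
		psr_model_step(&m, events[i], manual_exit);
	return m.disable_count;
}
```

Feeding the same event sequence through both modes shows the trade-off
described above: with manual_exit set, each burst of flushes costs one PSR
disable/enable cycle, while without it the count stays at zero.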


Gwan-gyeong Mun Oct. 22, 2020, 12:48 p.m. UTC | #6
On Thu, 2020-10-22 at 12:43 +0000, Mun, Gwan-gyeong wrote:
with the previous comments,
Tested-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Souza, Jose Oct. 23, 2020, 9:27 p.m. UTC | #7
On Thu, 2020-10-22 at 13:48 +0100, Mun, Gwan-gyeong wrote:
> On Thu, 2020-10-22 at 12:43 +0000, Mun, Gwan-gyeong wrote:
> > 1. While testing the problematic scenario, it has not always shown
> > the
> > IOMMU DAMR related below errors on the drm-tip. 
> >    (sometimes the error messages raised, but some times it has not
> > happened on the same kernel and scenario.
> >   
> > 
> > 
> > 
> > DMAR: DRHD: handling fault status reg 2
> > DMAR: [DMA Read] Request device [00:02.0] PASID 0xffffffff fault addr
> > 0xfc001000 [fault reason 06] PTE Read access is not set
> > DMAR: DRHD: handling fault status reg 3
> > DMAR: [DMA Read] Request device [00:02.0] PASID 0xffffffff fault addr
> > 0xfc000000 [fault reason 06] PTE Read access is not set
> > 
> > 2  After applying this patch the screen glitch issues have been
> > remarkably alleviated.
> >   - Eventhough there infrequently showed the screen glitch issues.
> >   - But I agree to apply this patch as a workaround by adding the
> > explanation below.
> >  
> > 
> > 
> > 
> > 3. The dc state and PSR enable/disable scenarios has been changed by
> > this patch.
> >       
> > 
> > 
> > 
> > (1)Before applying patch
> >   enable psr 
> >     -> (front buffer updates) 
> >         -> intel_psr_flush 
> >                   ^   -> psr_force_hw_tracking_exit()
> >                   |                   : write CURSURFLIVE 
> >                   |                          |
> >                   |  (front buffer updates)  |
> >                   +--------------------------+
> > 
> >     PSR enabled -------------------------------------- -->
> >    ( DC state controlled by DMC firmware)


This scenario also causes the display to leave DC5, as it needs to wake up to send the PSR1 full update or the PSR2 selective update.
But I agree that doing a PSR exit will keep PSR active for a few frames less than the CURSURFLIVE write does.

But we already have this scenario for platforms older than gen9, so all these comments are not necessary.

> > 
> >  (2) After applying patch
> >   enable psr 
> >    ^  -> (front buffer updates) 
> >    |       -> intel_psr_flush
> >    |          -> psr_force_hw_tracking_exit()
> >    |               : call intel_psr_exit()
> >    |                                 -> disable psr
> >    |                                         |
> >    |                                         |
> >    +-----------------------------------------+
> > 
> > PSR enabled ---------------------------> PSR disabled
> >   ^                                          |
> >   |                                          |
> >   +------------------------------------------+
> >    ( DC state controlled by DMC firmware)
> > 
> > the repeating of enabling and disabling of PSR by the rapid screen
> > updates prevents entering of low power dc states.
> > Infereing from this scenario, it indirectly touches DC state and it
> > alleviates the issue.
> > 
> with the previous comments,
> Tested-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
> Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>

Thanks, I am going to push this and send the IGT-side changes.

Souza, Jose Oct. 23, 2020, 9:31 p.m. UTC | #8
On Mon, 2020-10-05 at 21:48 +0000, Patchwork wrote:
> Patch Details
> Series:	drm/i915/tgl/psr: Fix glitches when doing frontbuffer modifications (rev2)
> URL:	https://patchwork.freedesktop.org/series/82351/
> State:	failure
> Details:	https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18625/index.html
> CI Bug Log - changes from CI_DRM_9097_full -> Patchwork_18625_full
> Summary
> FAILURE
> 
> Serious unknown changes coming with Patchwork_18625_full absolutely need to be
> verified manually.
> 
> If you think the reported changes have nothing to do with the changes
> introduced in Patchwork_18625_full, please notify your bug team to allow them
> to document this new failure mode, which will reduce false positives in CI.
> 
> Possible new issues
> Here are the unknown changes that may have been introduced in Patchwork_18625_full:
> 
> IGT changes
> Possible regressions
> igt@kms_cursor_legacy@all-pipes-forked-move:
> 
> shard-tglb: PASS -> INCOMPLETE

The failure above is not related.

>  * igt@kms_psr2_su@frontbuffer:shard-tglb: PASS -> FAIL +1 similar issue

The failure above is expected with this patch; I will follow up with an IGT change to skip this test on TGL.

Patch pushed to dinq, thanks for the review GG.

> Known issues
> Here are the changes found in Patchwork_18625_full that come from known issues:
>
> IGT changes
> Issues hit
>  * igt@gem_exec_reloc@basic-many-active@rcs0:shard-hsw: PASS -> FAIL (i915#2389)
>  * igt@i915_pm_rc6_residency@rc6-fence:shard-hsw: PASS -> WARN (i915#1519)
>  * igt@kms_cursor_crc@pipe-a-cursor-suspend:shard-kbl: PASS -> DMESG-WARN (i915#180) +5 similar issues
>  * igt@kms_cursor_edge_walk@pipe-c-128x128-right-edge:shard-glk: PASS -> DMESG-WARN (i915#1982)
>  * igt@kms_cursor_legacy@flip-vs-cursor-atomic:shard-tglb: PASS -> FAIL (i915#2346) +3 similar issues
>  * igt@kms_cursor_legacy@pipe-c-torture-bo:shard-tglb: PASS -> DMESG-WARN (i915#128)
>  * igt@kms_hdr@bpc-switch-dpms:shard-skl: PASS -> FAIL (i915#1188)
>  * igt@kms_plane@plane-panning-bottom-right-pipe-b-planes:shard-skl: PASS -> DMESG-WARN (i915#1982) +9 similar issues
>  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes:shard-skl: PASS -> INCOMPLETE (i915#648)
>  * igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min:shard-skl: PASS -> FAIL (fdo#108145 / i915#265) +1 similar issue
>  * igt@kms_psr@psr2_sprite_plane_move:shard-iclb: PASS -> SKIP (fdo#109441) +2 similar issues
> Possible fixes
>  * igt@gem_exec_reloc@basic-many-active@bcs0:shard-glk: FAIL (i915#2389) -> PASS
>  * igt@gem_mmap_gtt@medium-copy-xy:shard-iclb: DMESG-WARN (i915#1982) -> PASS
>  * igt@gem_mmap_offset@blt-coherency:shard-glk: FAIL (i915#2328) -> PASS
>  * igt@gem_userptr_blits@sync-unmap-cycles:shard-skl: TIMEOUT (i915#2424) -> PASS
>  * {igt@kms_async_flips@async-flip-with-page-flip-events}:shard-kbl: FAIL (i915#2521) -> PASS
>  * igt@kms_atomic_transition@plane-all-modeset-transition-fencing-internal-panels@edp-1-pipe-a:shard-skl: DMESG-WARN (i915#1982) -> PASS +4 similar issues
>  * igt@kms_flip@flip-vs-blocking-wf-vblank@b-edp1:shard-tglb: DMESG-WARN (i915#1982) -> PASS +3 similar issues
>  * igt@kms_flip@flip-vs-expired-vblank-interruptible@a-edp1:shard-skl: FAIL (i915#2122) -> PASS +1 similar issue
>  * igt@kms_flip@flip-vs-suspend-interruptible@a-dp1:shard-kbl: DMESG-WARN (i915#180) -> PASS +6 similar issues
>  * igt@kms_frontbuffer_tracking@fbc-1p-pri-indfb-multidraw:shard-kbl: DMESG-WARN (i915#1982) -> PASS
>  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b:shard-skl: INCOMPLETE (i915#198) -> PASS +1 similar issue
>  * igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min:shard-skl: FAIL (fdo#108145 / i915#265) -> PASS
>  * igt@kms_psr@psr2_cursor_render:shard-iclb: SKIP (fdo#109441) -> PASS +1 similar issue
>  * igt@kms_setmode@basic:shard-apl: FAIL (i915#1635 / i915#31) -> PASS
>  * igt@sysfs_timeslice_duration@timeout@vecs0:shard-glk: FAIL (i915#1755) -> PASS
> {name}: This element is suppressed. This means it is ignored when computing
>  the status of the difference (SUCCESS, WARNING, or FAILURE).
> Participating hosts (11 -> 12)
> Additional (1): pig-snb-2600
> Build changes
>  * Linux: CI_DRM_9097 -> Patchwork_18625
> CI-20190529: 20190529
>  CI_DRM_9097: 5f854df6a9500c0888864bb0be25995ccb696e41 @ git://anongit.freedesktop.org/gfx-ci/linux
>  IGT_5800: 982ca4122fd4f04ad3dfa80c6246f190b36e0c72 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
>  Patchwork_18625: 68870e5d103dec6a4a8c09849a771dee2dbd4ecb @ git://anongit.freedesktop.org/gfx-ci/linux
>  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

Patch

diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c
index 8a9d0bdde1bf..8630121dbbbe 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -1152,7 +1152,21 @@  void intel_psr_disable(struct intel_dp *intel_dp,
 
 static void psr_force_hw_tracking_exit(struct drm_i915_private *dev_priv)
 {
-	if (INTEL_GEN(dev_priv) >= 9)
+	if (IS_TIGERLAKE(dev_priv))
+		/*
+		 * Writes to CURSURFLIVE in TGL are causing IOMMU errors and
+		 * visual glitches that are often reproduced when executing
+		 * CPU intensive workloads while a eDP 4K panel is attached.
+		 *
+		 * Manually exiting PSR causes the frontbuffer to be updated
+		 * without glitches and the IOMMU errors are also gone but
+		 * this comes at the cost of less time with PSR active.
+		 *
+		 * So using this workaround until this issue is root caused
+		 * and a better fix is found.
+		 */
+		intel_psr_exit(dev_priv);
+	else if (INTEL_GEN(dev_priv) >= 9)
 		/*
 		 * Display WA #0884: skl+
 		 * This documented WA for bxt can be safely applied