diff mbox

[4/6] drm/i915/skl: Always wait for pipes to update after a flush

Message ID 1469048403-32016-5-git-send-email-cpaul@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

cpaul@redhat.com July 20, 2016, 9 p.m. UTC
As we've learned, all watermark updates on Skylake have to be strictly
atomic or things fail. While the bspec doesn't mandate that we need to
wait for pipes to finish after the third iteration of flushes, not doing
so gives us the opportunity to break this atomicity later. This example
assumes that we're lucky enough not to be interrupted by the scheduler
at any point during this:

 - Start with pipe A and pipe B enabled
 - Enable pipe C
 - Flush pipe A in pass 1, wait until update finishes
 - Flush pipe B in pass 3, continue without waiting for next vblank
 - Start another wm update
 - We enter the next vblank for pipe B before we finish writing all the
   vm values
 - *Underrun*

As such, we always need to wait for each pipe we flush to update so as
to never break this atomicity.

Fixes: 0e8fb7ba7ca5 ("drm/i915/skl: Flush the WM configuration")
Signed-off-by: Lyude <cpaul@redhat.com>
Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com> <cpaul@redhat.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

Comments

Matt Roper July 21, 2016, 12:27 a.m. UTC | #1
On Wed, Jul 20, 2016 at 05:00:00PM -0400, Lyude wrote:
> As we've learned, all watermark updates on Skylake have to be strictly
> atomic or things fail. While the bspec doesn't mandate that we need to
> wait for pipes to finish after the third iteration of flushes, not doing
> so gives us the opportunity to break this atomicity later. This example
> assumes that we're lucky enough not to be interrupted by the scheduler
> at any point during this:
> 
>  - Start with pipe A and pipe B enabled
>  - Enable pipe C
>  - Flush pipe A in pass 1, wait until update finishes
>  - Flush pipe B in pass 3, continue without waiting for next vblank
>  - Start another wm update
>  - We enter the next vblank for pipe B before we finish writing all the
>    vm values
>  - *Underrun*
> 
> As such, we always need to wait for each pipe we flush to update so as
> to never break this atomicity.

I'm not sure I follow this commit.  If we're enabling a new pipe, the
the allocation for A and B are generally going to shrink, so they'll
usually be flushed in passes 1 and 2, not 3.

My understanding is that the problem that still remains (now that your
first three patches have made progress towards fixing underruns) is that
our DDB updates and flushes (which come from update_watermarks) happen
pre-evasion, whereas the watermarks themselves now happen under evasion.
We really want both the new DDB value and the new watermark value to be
written together and take effect on the same vblank.  I think the
problem is that you might have a shrinking DDB allocation (e.g., because
a new pipe was added or you changed a mode that changed the DDB balance)
which some of the existing WM values exceed.  You can have a sequence
like this:

  - update_wm:
     - write new (smaller) DDB
     - flush DDB
  - vblank happens, old (big) wm + new (small) ddb = underrun
  - vblank evasion:
     - write new plane regs and WM's
     - flush
  - post-evasion vblank happens, underrun is corrected

I think ultimately we want to move the DDB register writes into the
update functions that happen under evasion, just like you did for the WM
registers.  However just doing this the straightforward way won't
satisfy our requirements about pipe update ordering (the three passes
you see today in skl_flush_wm_values).  To make that work, I think the
general approach is that we need to basically replace the
for_each_crtc_in_state() loop in intel_atomic_commit_tail() with some
new CRTC iterator that processes CRTC's in a more intelligent ordering.
We've computed our DDB changes during the atomic check phase, so we
already know which allocations are shrinking, growing, etc. and we
should be able to calculate an appropriate CRTC ordering at the same
time.

With an intelligent CRTC iterator that follows the pre-computed pipe
ordering rules (and adds the necessary vblank waits between each
"phase"), I think we should be able to just write both DDB and WM values
in the skl_update_primary_plane() and similar functions and let the
existing flushes that happen take care of flushing them out at the
appropriate time.  Of course I've kicked that idea around in my head for
a while, but haven't had time to actually write any code for it, so I
may be completely overlooking some stumbling block that makes it much
more complicated than I'm envisioning.


Matt

> 
> Fixes: 0e8fb7ba7ca5 ("drm/i915/skl: Flush the WM configuration")
> Signed-off-by: Lyude <cpaul@redhat.com>
> Cc: stable@vger.kernel.org
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
> Cc: Hans de Goede <hdegoede@redhat.com> <cpaul@redhat.com>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 788db86..2e31df4 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3859,8 +3859,11 @@ static void skl_flush_wm_values(struct drm_i915_private *dev_priv,
>  	/*
>  	 * Third pass: flush the pipes that got more space allocated.
>  	 *
> -	 * We don't need to actively wait for the update here, next vblank
> -	 * will just get more DDB space with the correct WM values.
> +	 * While the hardware doesn't require to wait for the next vblank here,
> +	 * continuing before the pipe finishes updating could result in us
> +	 * trying to update the wm values again before the pipe finishes
> +	 * updating, which results in the hardware using intermediate wm values
> +	 * and subsequently underrunning pipes.
>  	 */
>  	for_each_intel_crtc(dev, crtc) {
>  		if (!crtc->active)
> @@ -3876,6 +3879,16 @@ static void skl_flush_wm_values(struct drm_i915_private *dev_priv,
>  			continue;
>  
>  		skl_wm_flush_pipe(dev_priv, pipe, 3);
> +
> +		/*
> +		 * The only time we can get away with not waiting for an update
> +		 * is when we just enabled the pipe, e.g. when it doesn't have
> +		 * vblanks enabled anyway.
> +		 */
> +		if (drm_crtc_vblank_get(&crtc->base) == 0) {
> +			intel_wait_for_vblank(dev, pipe);
> +			drm_crtc_vblank_put(&crtc->base);
> +		}
>  	}
>  }
>  
> -- 
> 2.7.4
>
cpaul@redhat.com July 21, 2016, 5:01 p.m. UTC | #2
Two things for this one:
1. I completely messed up the description on this patchset, this was for fixing
underruns on pipe disablement. I'm impressed I wrote up that whole description
without realizing that at all, lol.
2. This patch may not actually be necessary. On the original git branch I was
testing this on it was required to prevent underruns on pipe disables, but
trying this on nightly I don't seem to reproduce those underruns even when I
remove this patch, so I guess we can drop this from the series

On Wed, 2016-07-20 at 17:27 -0700, Matt Roper wrote:
> On Wed, Jul 20, 2016 at 05:00:00PM -0400, Lyude wrote:
> > 
> > As we've learned, all watermark updates on Skylake have to be strictly
> > atomic or things fail. While the bspec doesn't mandate that we need to
> > wait for pipes to finish after the third iteration of flushes, not doing
> > so gives us the opportunity to break this atomicity later. This example
> > assumes that we're lucky enough not to be interrupted by the scheduler
> > at any point during this:
> > 
> >  - Start with pipe A and pipe B enabled
> >  - Enable pipe C
> >  - Flush pipe A in pass 1, wait until update finishes
> >  - Flush pipe B in pass 3, continue without waiting for next vblank
> >  - Start another wm update
> >  - We enter the next vblank for pipe B before we finish writing all the
> >    vm values
> >  - *Underrun*
> > 
> > As such, we always need to wait for each pipe we flush to update so as
> > to never break this atomicity.
> 
> I'm not sure I follow this commit.  If we're enabling a new pipe, the
> the allocation for A and B are generally going to shrink, so they'll
> usually be flushed in passes 1 and 2, not 3.
> 
> My understanding is that the problem that still remains (now that your
> first three patches have made progress towards fixing underruns) is that
> our DDB updates and flushes (which come from update_watermarks) happen
> pre-evasion, whereas the watermarks themselves now happen under evasion.
> We really want both the new DDB value and the new watermark value to be
> written together and take effect on the same vblank.  I think the
> problem is that you might have a shrinking DDB allocation (e.g., because
> a new pipe was added or you changed a mode that changed the DDB balance)
> which some of the existing WM values exceed.  You can have a sequence
> like this:
> 
>   - update_wm:
>      - write new (smaller) DDB
>      - flush DDB
>   - vblank happens, old (big) wm + new (small) ddb = underrun
>   - vblank evasion:
>      - write new plane regs and WM's
>      - flush
>   - post-evasion vblank happens, underrun is corrected
> 
> I think ultimately we want to move the DDB register writes into the
> update functions that happen under evasion, just like you did for the WM
> registers.  However just doing this the straightforward way won't
> satisfy our requirements about pipe update ordering (the three passes
> you see today in skl_flush_wm_values).  To make that work, I think the
> general approach is that we need to basically replace the
> for_each_crtc_in_state() loop in intel_atomic_commit_tail() with some
> new CRTC iterator that processes CRTC's in a more intelligent ordering.
> We've computed our DDB changes during the atomic check phase, so we
> already know which allocations are shrinking, growing, etc. and we
> should be able to calculate an appropriate CRTC ordering at the same
> time.
> 
> With an intelligent CRTC iterator that follows the pre-computed pipe
> ordering rules (and adds the necessary vblank waits between each
> "phase"), I think we should be able to just write both DDB and WM values
> in the skl_update_primary_plane() and similar functions and let the
> existing flushes that happen take care of flushing them out at the
> appropriate time.  Of course I've kicked that idea around in my head for
> a while, but haven't had time to actually write any code for it, so I
> may be completely overlooking some stumbling block that makes it much
> more complicated than I'm envisioning.
> 
> 
> Matt
> 
> > 
> > 
> > Fixes: 0e8fb7ba7ca5 ("drm/i915/skl: Flush the WM configuration")
> > Signed-off-by: Lyude <cpaul@redhat.com>
> > Cc: stable@vger.kernel.org
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
> > Cc: Hans de Goede <hdegoede@redhat.com> <cpaul@redhat.com>
> > Cc: Matt Roper <matthew.d.roper@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_pm.c | 17 +++++++++++++++--
> >  1 file changed, 15 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > b/drivers/gpu/drm/i915/intel_pm.c
> > index 788db86..2e31df4 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -3859,8 +3859,11 @@ static void skl_flush_wm_values(struct
> > drm_i915_private *dev_priv,
> >  	/*
> >  	 * Third pass: flush the pipes that got more space allocated.
> >  	 *
> > -	 * We don't need to actively wait for the update here, next vblank
> > -	 * will just get more DDB space with the correct WM values.
> > +	 * While the hardware doesn't require to wait for the next vblank
> > here,
> > +	 * continuing before the pipe finishes updating could result in us
> > +	 * trying to update the wm values again before the pipe finishes
> > +	 * updating, which results in the hardware using intermediate wm
> > values
> > +	 * and subsequently underrunning pipes.
> >  	 */
> >  	for_each_intel_crtc(dev, crtc) {
> >  		if (!crtc->active)
> > @@ -3876,6 +3879,16 @@ static void skl_flush_wm_values(struct
> > drm_i915_private *dev_priv,
> >  			continue;
> >  
> >  		skl_wm_flush_pipe(dev_priv, pipe, 3);
> > +
> > +		/*
> > +		 * The only time we can get away with not waiting for an
> > update
> > +		 * is when we just enabled the pipe, e.g. when it doesn't
> > have
> > +		 * vblanks enabled anyway.
> > +		 */
> > +		if (drm_crtc_vblank_get(&crtc->base) == 0) {
> > +			intel_wait_for_vblank(dev, pipe);
> > +			drm_crtc_vblank_put(&crtc->base);
> > +		}
> >  	}
> >  }
> >  
> > -- 
> > 2.7.4
> > 
>
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 788db86..2e31df4 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3859,8 +3859,11 @@  static void skl_flush_wm_values(struct drm_i915_private *dev_priv,
 	/*
 	 * Third pass: flush the pipes that got more space allocated.
 	 *
-	 * We don't need to actively wait for the update here, next vblank
-	 * will just get more DDB space with the correct WM values.
+	 * While the hardware doesn't require to wait for the next vblank here,
+	 * continuing before the pipe finishes updating could result in us
+	 * trying to update the wm values again before the pipe finishes
+	 * updating, which results in the hardware using intermediate wm values
+	 * and subsequently underrunning pipes.
 	 */
 	for_each_intel_crtc(dev, crtc) {
 		if (!crtc->active)
@@ -3876,6 +3879,16 @@  static void skl_flush_wm_values(struct drm_i915_private *dev_priv,
 			continue;
 
 		skl_wm_flush_pipe(dev_priv, pipe, 3);
+
+		/*
+		 * The only time we can get away with not waiting for an update
+		 * is when we just enabled the pipe, e.g. when it doesn't have
+		 * vblanks enabled anyway.
+		 */
+		if (drm_crtc_vblank_get(&crtc->base) == 0) {
+			intel_wait_for_vblank(dev, pipe);
+			drm_crtc_vblank_put(&crtc->base);
+		}
 	}
 }