[1/3] drm/i915: Disallow plane x+w>stride on ilk+ with X-tiling

Message ID 20210209021918.16234-1-ville.syrjala@linux.intel.com (mailing list archive)
State New, archived

Commit Message

Ville Syrjälä Feb. 9, 2021, 2:19 a.m. UTC
From: Ville Syrjälä <ville.syrjala@linux.intel.com>

ilk+ planes get notably unhappy when the plane x+w exceeds
the stride. This wasn't a problem previously because we
always aligned SURF to the closest tile boundary so the
x offset never got particularly large. But now with async
flips we have to align to 256KiB instead and thus this
becomes a real issue.

On ilk/snb/ivb it looks like the accesses just wrap
early to the next tile row when scanout goes past the
SURF+n*stride boundary, hsw/bdw suffer more heavily and
start to underrun constantly. i965/g4x appear to be immune.
vlv/chv I've not yet checked.

Let's borrow another trick from the skl+ code and search
backwards for a better SURF offset in the hopes of getting the
x offset below the limit. IIRC when I ran into a similar issue
on skl years ago it was causing the hardware to fall over
pretty hard as well.

And let's be consistent and include i965/g4x in the check
as well, just in case I just got super lucky somehow when
I wasn't able to reproduce the issue. Not that it really
matters since we still use 4k SURF alignment for i965/g4x
anyway.

Fixes: 6ede6b0616b2 ("drm/i915: Implement async flips for vlv/chv")
Fixes: 4bb18054adc4 ("drm/i915: Implement async flip for ilk/snb")
Fixes: 2a636e240c77 ("drm/i915: Implement async flip for ivb/hsw")
Fixes: cda195f13abd ("drm/i915: Implement async flips for bdw")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/display/i9xx_plane.c | 27 +++++++++++++++++++++++
 1 file changed, 27 insertions(+)

Comments

Chris Wilson Feb. 9, 2021, 9:22 a.m. UTC | #1
Quoting Ville Syrjala (2021-02-09 02:19:16)
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> ilk+ planes get notably unhappy when the plane x+w exceeds
> the stride. This wasn't a problem previously because we
> always aligned SURF to the closest tile boundary so the
> x offset never got particularly large. But now with async
> flips we have to align to 256KiB instead and thus this
> becomes a real issue.
> 
> On ilk/snb/ivb it looks like the accesses just wrap
> early to the next tile row when scanout goes past the
> SURF+n*stride boundary, hsw/bdw suffer more heavily and
> start to underrun constantly. i965/g4x appear to be immune.
> vlv/chv I've not yet checked.
> 
> Let's borrow another trick from the skl+ code and search
> backwards for a better SURF offset in the hopes of getting the
> x offset below the limit. IIRC when I ran into a similar issue
> on skl years ago it was causing the hardware to fall over
> pretty hard as well.
> 
> And let's be consistent and include i965/g4x in the check
> as well, just in case I just got super lucky somehow when
> I wasn't able to reproduce the issue. Not that it really
> matters since we still use 4k SURF alignment for i965/g4x
> anyway.
> 
> Fixes: 6ede6b0616b2 ("drm/i915: Implement async flips for vlv/chv")
> Fixes: 4bb18054adc4 ("drm/i915: Implement async flip for ilk/snb")
> Fixes: 2a636e240c77 ("drm/i915: Implement async flip for ivb/hsw")
> Fixes: cda195f13abd ("drm/i915: Implement async flips for bdw")
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/display/i9xx_plane.c | 27 +++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/display/i9xx_plane.c b/drivers/gpu/drm/i915/display/i9xx_plane.c
> index 0523e2c79d16..8a52beaed2da 100644
> --- a/drivers/gpu/drm/i915/display/i9xx_plane.c
> +++ b/drivers/gpu/drm/i915/display/i9xx_plane.c
> @@ -255,6 +255,33 @@ int i9xx_check_plane_surface(struct intel_plane_state *plane_state)
>         else
>                 offset = 0;
>  
> +       /*
> +        * When using an X-tiled surface the plane starts to
> +        * misbehave if the x offset + width exceeds the stride.
> +        * hsw/bdw: underrun galore
> +        * ilk/snb/ivb: wrap to the next tile row mid scanout
> +        * i965/g4x: so far appear immune to this
> +        * vlv/chv: TODO check
> +        *
> +        * Linear surfaces seem to work just fine, even on hsw/bdw
> +        * despite them not using the linear offset anymore.
> +        */
> +       if (INTEL_GEN(dev_priv) >= 4 && fb->modifier == I915_FORMAT_MOD_X_TILED) {
> +               u32 alignment = intel_surf_alignment(fb, 0);
> +               int cpp = fb->format->cpp[0];
> +
> +               while ((src_x + src_w) * cpp > plane_state->color_plane[0].stride) {
> +                       if (offset == 0) {
> +                               drm_dbg_kms(&dev_priv->drm,
> +                                           "Unable to find suitable display surface offset due to X-tiling\n");
> +                               return -EINVAL;
> +                       }
> +
> +                       offset = intel_plane_adjust_aligned_offset(&src_x, &src_y, plane_state, 0,
> +                                                                  offset, offset - alignment);

As offset decreases, src_x goes up, but modulo the pitch. So long as
the alignment is not a multiple of the pitch, src_x will change on each
iteration. And after the adjustment, the offset is stored in
plane_state.

So this loop would fail for any power-of-two stride, but at the same
time that would put the src_x + src_w out-of-bounds in the supplied
coordinates. The only way src_x + src_w would exceed stride legally is
if we have chosen an aligned offset that causes that, thus there should
exist an offset where src_x + src_w does not exceed the stride.

The reason for choosing a nearby tile offset was to reduce src_x/src_y
to fit within the crtc limits. While remapping could be used to solve
that, the aligned_offset computation allows reuse of a single view.

Since offset, src_x are a function of the plane input parameters, this
should be possible to exercise with carefully selected framebuffers and
modesetting. Right? Is there a test case for this?

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
Chris Wilson Feb. 9, 2021, 9:50 a.m. UTC | #2
Quoting Chris Wilson (2021-02-09 09:22:09)
> Quoting Ville Syrjala (2021-02-09 02:19:16)
> > +               while ((src_x + src_w) * cpp > plane_state->color_plane[0].stride) {
> > +                       if (offset == 0) {
> > +                               drm_dbg_kms(&dev_priv->drm,
> > +                                           "Unable to find suitable display surface offset due to X-tiling\n");
> > +                               return -EINVAL;
> > +                       }
> > +
> > +                       offset = intel_plane_adjust_aligned_offset(&src_x, &src_y, plane_state, 0,
> > +                                                                  offset, offset - alignment);

> The reason for choosing a nearby tile offset was to reduce src_x/src_y
> to fit within the crtc limits. While remapping could be used to solve
> that, the aligned_offset computation allows reuse of a single view.

Should there not be a second constraint on the loop to make sure src_x +
src_w is less than 4095/8191/etc?
-Chris
Ville Syrjälä Feb. 9, 2021, 2:44 p.m. UTC | #3
On Tue, Feb 09, 2021 at 09:50:28AM +0000, Chris Wilson wrote:
> Quoting Chris Wilson (2021-02-09 09:22:09)
> > Quoting Ville Syrjala (2021-02-09 02:19:16)
> > > +               while ((src_x + src_w) * cpp > plane_state->color_plane[0].stride) {
> > > +                       if (offset == 0) {
> > > +                               drm_dbg_kms(&dev_priv->drm,
> > > +                                           "Unable to find suitable display surface offset due to X-tiling\n");
> > > +                               return -EINVAL;
> > > +                       }
> > > +
> > > +                       offset = intel_plane_adjust_aligned_offset(&src_x, &src_y, plane_state, 0,
> > > +                                                                  offset, offset - alignment);
> 
> > The reason for choosing a nearby tile offset was to reduce src_x/src_y
> > to fit within the crtc limits. While remapping could be used to solve
> > that, the aligned_offset computation allows reuse of a single view.
> 
> Should there not be a second constraint on the loop to make sure src_x +
> src_w is less than 4095/8191/etc?

Yeah, but we don't have that in the skl code either atm.
Should add it to both.

And if it can actually fail I guess we should just fall back
to remapping rather than telling the user they can't have a
working display. So far I never did the mental gymnastics to
come up with an actually failing scenario.
Ville Syrjälä Feb. 9, 2021, 2:51 p.m. UTC | #4
On Tue, Feb 09, 2021 at 04:44:12PM +0200, Ville Syrjälä wrote:
> On Tue, Feb 09, 2021 at 09:50:28AM +0000, Chris Wilson wrote:
> > Quoting Chris Wilson (2021-02-09 09:22:09)
> > > Quoting Ville Syrjala (2021-02-09 02:19:16)
> > > > +               while ((src_x + src_w) * cpp > plane_state->color_plane[0].stride) {
> > > > +                       if (offset == 0) {
> > > > +                               drm_dbg_kms(&dev_priv->drm,
> > > > +                                           "Unable to find suitable display surface offset due to X-tiling\n");
> > > > +                               return -EINVAL;
> > > > +                       }
> > > > +
> > > > +                       offset = intel_plane_adjust_aligned_offset(&src_x, &src_y, plane_state, 0,
> > > > +                                                                  offset, offset - alignment);
> > 
> > > The reason for choosing a nearby tile offset was to reduce src_x/src_y
> > > to fit within the crtc limits. While remapping could be used to solve
> > > that, the aligned_offset computation allows reuse of a single view.
> > 
> > Should there not be a second constraint on the loop to make sure src_x +
> > src_w is less than 4095/8191/etc?
> 
> Yeah, but we don't have that in the skl code either atm.
> Should add it to both.

Actually no. We already cap the max stride such that it never
exceeds that limit. So the single check already covers that.

What I think we should be checking is that src_y stays below the
appropriate limit. Although I'm not sure if we could realistically
hit a case where that fails but still find a suitably aligned
offset before hitting 0. Oh and I've not actually confirmed
whether src_y+src_h also has an upper limit or not.
Ville Syrjälä Feb. 9, 2021, 3:09 p.m. UTC | #5
On Tue, Feb 09, 2021 at 03:22:09AM -0000, Patchwork wrote:
> == Series Details ==
> 
> Series: series starting with [1/3] drm/i915: Disallow plane x+w>stride on ilk+ with X-tiling
> URL   : https://patchwork.freedesktop.org/series/86882/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_9747 -> Patchwork_19637
> ====================================================
> 
> Summary
> -------
> 
>   **FAILURE**
> 
>   Serious unknown changes coming with Patchwork_19637 absolutely need to be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in Patchwork_19637, please notify your bug team to allow them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19637/index.html
> 
> Possible new issues
> -------------------
> 
>   Here are the unknown changes that may have been introduced in Patchwork_19637:
> 
> ### IGT changes ###
> 
> #### Possible regressions ####
> 
>   * igt@vgem_basic@unload:
>     - fi-kbl-soraka:      NOTRUN -> [DMESG-WARN][1]
>    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19637/fi-kbl-soraka/igt@vgem_basic@unload.html

<3> [558.016425] i915 0000:00:02.0: [drm] *ERROR* Potential atomic update failure on pipe A

I guess we've been throwing that under these two:
https://gitlab.freedesktop.org/drm/intel/-/issues/86
https://gitlab.freedesktop.org/drm/intel/-/issues/558
Ville Syrjälä Feb. 9, 2021, 3:21 p.m. UTC | #6
On Tue, Feb 09, 2021 at 09:22:09AM +0000, Chris Wilson wrote:
> Quoting Ville Syrjala (2021-02-09 02:19:16)
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > ilk+ planes get notably unhappy when the plane x+w exceeds
> > the stride. This wasn't a problem previously because we
> > always aligned SURF to the closest tile boundary so the
> > x offset never got particularly large. But now with async
> > flips we have to align to 256KiB instead and thus this
> > becomes a real issue.
> > 
> > On ilk/snb/ivb it looks like the accesses just wrap
> > early to the next tile row when scanout goes past the
> > SURF+n*stride boundary, hsw/bdw suffer more heavily and
> > start to underrun constantly. i965/g4x appear to be immune.
> > vlv/chv I've not yet checked.
> > 
> > Let's borrow another trick from the skl+ code and search
> > backwards for a better SURF offset in the hopes of getting the
> > x offset below the limit. IIRC when I ran into a similar issue
> > on skl years ago it was causing the hardware to fall over
> > pretty hard as well.
> > 
> > And let's be consistent and include i965/g4x in the check
> > as well, just in case I just got super lucky somehow when
> > I wasn't able to reproduce the issue. Not that it really
> > matters since we still use 4k SURF alignment for i965/g4x
> > anyway.
> > 
> > Fixes: 6ede6b0616b2 ("drm/i915: Implement async flips for vlv/chv")
> > Fixes: 4bb18054adc4 ("drm/i915: Implement async flip for ilk/snb")
> > Fixes: 2a636e240c77 ("drm/i915: Implement async flip for ivb/hsw")
> > Fixes: cda195f13abd ("drm/i915: Implement async flips for bdw")
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/display/i9xx_plane.c | 27 +++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/i9xx_plane.c b/drivers/gpu/drm/i915/display/i9xx_plane.c
> > index 0523e2c79d16..8a52beaed2da 100644
> > --- a/drivers/gpu/drm/i915/display/i9xx_plane.c
> > +++ b/drivers/gpu/drm/i915/display/i9xx_plane.c
> > @@ -255,6 +255,33 @@ int i9xx_check_plane_surface(struct intel_plane_state *plane_state)
> >         else
> >                 offset = 0;
> >  
> > +       /*
> > +        * When using an X-tiled surface the plane starts to
> > +        * misbehave if the x offset + width exceeds the stride.
> > +        * hsw/bdw: underrun galore
> > +        * ilk/snb/ivb: wrap to the next tile row mid scanout
> > +        * i965/g4x: so far appear immune to this
> > +        * vlv/chv: TODO check
> > +        *
> > +        * Linear surfaces seem to work just fine, even on hsw/bdw
> > +        * despite them not using the linear offset anymore.
> > +        */
> > +       if (INTEL_GEN(dev_priv) >= 4 && fb->modifier == I915_FORMAT_MOD_X_TILED) {
> > +               u32 alignment = intel_surf_alignment(fb, 0);
> > +               int cpp = fb->format->cpp[0];
> > +
> > +               while ((src_x + src_w) * cpp > plane_state->color_plane[0].stride) {
> > +                       if (offset == 0) {
> > +                               drm_dbg_kms(&dev_priv->drm,
> > +                                           "Unable to find suitable display surface offset due to X-tiling\n");
> > +                               return -EINVAL;
> > +                       }
> > +
> > +                       offset = intel_plane_adjust_aligned_offset(&src_x, &src_y, plane_state, 0,
> > +                                                                  offset, offset - alignment);
> 
> As offset decreases, src_x goes up, but modulo the pitch. So long as
> the alignment is not a multiple of the pitch, src_x will change on each
> iteration. And after the adjustment, the offset is stored in
> plane_state.
> 
> So this loop would fail for any power-of-two stride, but at the same
> time that would put the src_x + src_w out-of-bounds in the supplied
> coordinates. The only way src_x + src_w would exceed stride legally is
> if we have chosen an aligned offset that causes that, thus there should
> exist an offset where src_x + src_w does not exceed the stride.
> 
> The reason for choosing a nearby tile offset was to reduce src_x/src_y
> to fit within the crtc limits. While remapping could be used to solve
> that, the aligned_offset computation allows reuse of a single view.
> 
> Since offset, src_x are a function of the plane input parameters, this
> should be possible to exercise with carefully selected framebuffers and
> modesetting. Right? Is there a test case for this?

My idea was to extend kms_big_fb for this sort of thing.
While I originally made it purely to test remapping it should
be possible to extend it for non-remapped fbs as well. IIRC 
J-P did at least some work towards that goal, but I guess
it's only in the internal copy for whatever reason.

> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Ta.
Juha-Pekka Heikkila Feb. 10, 2021, 12:05 p.m. UTC | #7
On 9.2.2021 17.21, Ville Syrjälä wrote:
> On Tue, Feb 09, 2021 at 09:22:09AM +0000, Chris Wilson wrote:
>> Quoting Ville Syrjala (2021-02-09 02:19:16)
>>> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>>
>>> ilk+ planes get notably unhappy when the plane x+w exceeds
>>> the stride. This wasn't a problem previously because we
>>> always aligned SURF to the closest tile boundary so the
>>> x offset never got particularly large. But now with async
>>> flips we have to align to 256KiB instead and thus this
>>> becomes a real issue.
>>>
>>> On ilk/snb/ivb it looks like the accesses just wrap
>>> early to the next tile row when scanout goes past the
>>> SURF+n*stride boundary, hsw/bdw suffer more heavily and
>>> start to underrun constantly. i965/g4x appear to be immune.
>>> vlv/chv I've not yet checked.
>>>
>>> Let's borrow another trick from the skl+ code and search
>>> backwards for a better SURF offset in the hopes of getting the
>>> x offset below the limit. IIRC when I ran into a similar issue
>>> on skl years ago it was causing the hardware to fall over
>>> pretty hard as well.
>>>
>>> And let's be consistent and include i965/g4x in the check
>>> as well, just in case I just got super lucky somehow when
>>> I wasn't able to reproduce the issue. Not that it really
>>> matters since we still use 4k SURF alignment for i965/g4x
>>> anyway.
>>>
>>> Fixes: 6ede6b0616b2 ("drm/i915: Implement async flips for vlv/chv")
>>> Fixes: 4bb18054adc4 ("drm/i915: Implement async flip for ilk/snb")
>>> Fixes: 2a636e240c77 ("drm/i915: Implement async flip for ivb/hsw")
>>> Fixes: cda195f13abd ("drm/i915: Implement async flips for bdw")
>>> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/display/i9xx_plane.c | 27 +++++++++++++++++++++++
>>>   1 file changed, 27 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/display/i9xx_plane.c b/drivers/gpu/drm/i915/display/i9xx_plane.c
>>> index 0523e2c79d16..8a52beaed2da 100644
>>> --- a/drivers/gpu/drm/i915/display/i9xx_plane.c
>>> +++ b/drivers/gpu/drm/i915/display/i9xx_plane.c
>>> @@ -255,6 +255,33 @@ int i9xx_check_plane_surface(struct intel_plane_state *plane_state)
>>>          else
>>>                  offset = 0;
>>>   
>>> +       /*
>>> +        * When using an X-tiled surface the plane starts to
>>> +        * misbehave if the x offset + width exceeds the stride.
>>> +        * hsw/bdw: underrun galore
>>> +        * ilk/snb/ivb: wrap to the next tile row mid scanout
>>> +        * i965/g4x: so far appear immune to this
>>> +        * vlv/chv: TODO check
>>> +        *
>>> +        * Linear surfaces seem to work just fine, even on hsw/bdw
>>> +        * despite them not using the linear offset anymore.
>>> +        */
>>> +       if (INTEL_GEN(dev_priv) >= 4 && fb->modifier == I915_FORMAT_MOD_X_TILED) {
>>> +               u32 alignment = intel_surf_alignment(fb, 0);
>>> +               int cpp = fb->format->cpp[0];
>>> +
>>> +               while ((src_x + src_w) * cpp > plane_state->color_plane[0].stride) {
>>> +                       if (offset == 0) {
>>> +                               drm_dbg_kms(&dev_priv->drm,
>>> +                                           "Unable to find suitable display surface offset due to X-tiling\n");
>>> +                               return -EINVAL;
>>> +                       }
>>> +
>>> +                       offset = intel_plane_adjust_aligned_offset(&src_x, &src_y, plane_state, 0,
>>> +                                                                  offset, offset - alignment);
>>
>> As offset decreases, src_x goes up, but modulo the pitch. So long as
>> the alignment is not a multiple of the pitch, src_x will change on each
>> iteration. And after the adjustment, the offset is stored in
>> plane_state.
>>
>> So this loop would fail for any power-of-two stride, but at the same
>> time that would put the src_x + src_w out-of-bounds in the supplied
>> coordinates. The only way src_x + src_w would exceed stride legally is
>> if we have chosen an aligned offset that causes that, thus there should
>> exist an offset where src_x + src_w does not exceed the stride.
>>
>> The reason for choosing a nearby tile offset was to reduce src_x/src_y
>> to fit within the crtc limits. While remapping could be used to solve
>> that, the aligned_offset computation allows reuse of a single view.
>>
>> Since offset, src_x are a function of the plane input parameters, this
>> should be possible to exercise with carefully selected framebuffers and
>> modesetting. Right? Is there a test case for this?
> 
> My idea was to extend kms_big_fb for this sort of thing.
> While I originally made it purely to test remapping it should
> be possible to extend it for non-remapped fbs as well. IIRC
> J-P did at least some work towards that goal, but I guess
> it's only in the internal copy for whatever reason.

There are max-hw-stride subtests in kms_big_fb which would cover
this, but they are all in internal trees.

Patch

diff --git a/drivers/gpu/drm/i915/display/i9xx_plane.c b/drivers/gpu/drm/i915/display/i9xx_plane.c
index 0523e2c79d16..8a52beaed2da 100644
--- a/drivers/gpu/drm/i915/display/i9xx_plane.c
+++ b/drivers/gpu/drm/i915/display/i9xx_plane.c
@@ -255,6 +255,33 @@  int i9xx_check_plane_surface(struct intel_plane_state *plane_state)
 	else
 		offset = 0;
 
+	/*
+	 * When using an X-tiled surface the plane starts to
+	 * misbehave if the x offset + width exceeds the stride.
+	 * hsw/bdw: underrun galore
+	 * ilk/snb/ivb: wrap to the next tile row mid scanout
+	 * i965/g4x: so far appear immune to this
+	 * vlv/chv: TODO check
+	 *
+	 * Linear surfaces seem to work just fine, even on hsw/bdw
+	 * despite them not using the linear offset anymore.
+	 */
+	if (INTEL_GEN(dev_priv) >= 4 && fb->modifier == I915_FORMAT_MOD_X_TILED) {
+		u32 alignment = intel_surf_alignment(fb, 0);
+		int cpp = fb->format->cpp[0];
+
+		while ((src_x + src_w) * cpp > plane_state->color_plane[0].stride) {
+			if (offset == 0) {
+				drm_dbg_kms(&dev_priv->drm,
+					    "Unable to find suitable display surface offset due to X-tiling\n");
+				return -EINVAL;
+			}
+
+			offset = intel_plane_adjust_aligned_offset(&src_x, &src_y, plane_state, 0,
+								   offset, offset - alignment);
+		}
+	}
+
 	/*
 	 * Put the final coordinates back so that the src
 	 * coordinate checks will see the right values.