diff mbox

[5/5] drm/i915: fix gtt space allocated for tiled objects

Message ID 1357327133.4147.9.camel@ideak-mobl (mailing list archive)
State New, archived
Headers show

Commit Message

Imre Deak Jan. 4, 2013, 7:18 p.m. UTC
On Fri, 2013-01-04 at 17:47 +0000, Chris Wilson wrote:
> On Fri, 04 Jan 2013 19:23:42 +0200, Imre Deak <imre.deak@intel.com> wrote:
> > On Fri, 2013-01-04 at 17:07 +0000, Chris Wilson wrote:
> > > On Fri,  4 Jan 2013 18:42:00 +0200, Imre Deak <imre.deak@intel.com> wrote:
> > > > The gtt space needed for tiled objects might be bigger than the linear
> > > > size programmed into the correpsonding fence register. For example for
> > > > the following buffer on a Gen5+ HW:
> > > > 
> > > > - allocation size: 4096 bytes
> > > > - tiling mode: X tiled
> > > > - stride: 1536
> > > > 
> > > > we need (1536 / 512) * 4096 bytes of gtt space to cover all the pixels
> > > > in the buffer, but at the moment we allocate only 4096. This means that
> > > > any buffer following this tiled buffer in the gtt space will be
> > > > corrupted if pixels belonging to the 2nd and 3rd tiles are written.
> > > > 
> > > > Fix this by rounding up the size of the allocated gtt space to the next
> > > > tile row address. The page frames beyond the allocation size will be
> > > > backed by the single gtt scratch page used already elsewhere for similar
> > > > padding.
> > > > 
> > > > Note that this is more of a security/robustness problem and not fixing any
> > > > reported issue that I know of. This is because applications will normally
> > > > access only the part of the buffer that is tile row size aligned.
> > > 
> > > There should not be any reported issues because all userspace already
> > > allocates up to the end of tile-row and stride should be enforced to be
> > > a multiple of tile-width. So the use of DIV_ROUND_UP implies a
> > > programming error that should have been reported back to userspace
> > > earlier. We can extend that by checking to make sure userspace has
> > > allocated a valid buffer, that is, it has allocated sufficient pages for
> > > the sampler access into the tiled buffer (or reject the set-tiling).
> > > -Chris
> > 
> > Ok, I tested this with older UXA that still allocated non-aligned
> > buffers. If that's not the case any more then rejecting set-tiling if
> > it's called on a non tile-row size aligned buffer would work too.
> 
> Meep. A long time ago we got the calcuations wrong (slightly less for
> gen2), but it was and still is a userspace bug with the potential of
> handing the GPU.
> 
> A multiple of tile_height tall and a mutiple of tile_width across should
> always be an exact number of pages (and an exact multiple of tile-row
> pages). And we should have been obeying that since the introduction of
> set-tiling (ignoring the aforementioned bugs) - so I'd really like to
> see any evidence of userspace getting that wrong.

On Ubuntu 12.10, with xf86-video-intel 2.20.9 I get the attached log
with the following patch applied:

 	if (stride < tile_width)
 		return false;
dmesg | grep unaligned | cut -c 16- | sort | uniq -c
      2 unaligned tiling: comm compiz size 10485760 mode X stride 6656 size-mod-tilerow-size 49152
      2 unaligned tiling: comm compiz size 10485760 mode Y stride 6400 size-mod-tilerow-size 40960
      1 unaligned tiling: comm compiz size 1572864 mode Y stride 2944 size-mod-tilerow-size 65536
      1 unaligned tiling: comm compiz size 163840 mode X stride 6144 size-mod-tilerow-size 16384
      1 unaligned tiling: comm compiz size 163840 mode Y stride 1536 size-mod-tilerow-size 16384
      2 unaligned tiling: comm compiz size 229376 mode Y stride 1152 size-mod-tilerow-size 8192
      1 unaligned tiling: comm compiz size 2621440 mode Y stride 4736 size-mod-tilerow-size 45056
      1 unaligned tiling: comm compiz size 262144 mode X stride 5120 size-mod-tilerow-size 16384
      2 unaligned tiling: comm compiz size 327680 mode X stride 6144 size-mod-tilerow-size 32768
      6 unaligned tiling: comm compiz size 327680 mode X stride 6656 size-mod-tilerow-size 8192
      1 unaligned tiling: comm compiz size 327680 mode Y stride 3072 size-mod-tilerow-size 32768
      1 unaligned tiling: comm compiz size 327680 mode Y stride 4736 size-mod-tilerow-size 24576
      1 unaligned tiling: comm compiz size 3670016 mode X stride 3072 size-mod-tilerow-size 8192
      1 unaligned tiling: comm compiz size 458752 mode Y stride 6400 size-mod-tilerow-size 49152
      1 unaligned tiling: comm compiz size 524288 mode Y stride 384 size-mod-tilerow-size 8192
      1 unaligned tiling: comm compiz size 6291456 mode X stride 6656 size-mod-tilerow-size 8192
      2 unaligned tiling: comm compiz size 6291456 mode Y stride 6400 size-mod-tilerow-size 147456
      6 unaligned tiling: comm compiz size 65536 mode Y stride 640 size-mod-tilerow-size 4096
      1 unaligned tiling: comm unity_support_t size 6291456 mode Y stride 6400 size-mod-tilerow-size 147456
     69 unaligned tiling: comm Xorg size 114688 mode X stride 2560 size-mod-tilerow-size 12288
    152 unaligned tiling: comm Xorg size 1310720 mode X stride 1536 size-mod-tilerow-size 8192
      3 unaligned tiling: comm Xorg size 131072 mode X stride 2560 size-mod-tilerow-size 8192
    317 unaligned tiling: comm Xorg size 131072 mode X stride 3072 size-mod-tilerow-size 8192
      4 unaligned tiling: comm Xorg size 163840 mode X stride 1536 size-mod-tilerow-size 4096
     87 unaligned tiling: comm Xorg size 163840 mode X stride 3072 size-mod-tilerow-size 16384
   1104 unaligned tiling: comm Xorg size 163840 mode X stride 3584 size-mod-tilerow-size 20480
      1 unaligned tiling: comm Xorg size 163840 mode X stride 4608 size-mod-tilerow-size 16384
      1 unaligned tiling: comm Xorg size 163840 mode X stride 6656 size-mod-tilerow-size 4096
      2 unaligned tiling: comm Xorg size 196608 mode X stride 2560 size-mod-tilerow-size 12288
     12 unaligned tiling: comm Xorg size 196608 mode X stride 3584 size-mod-tilerow-size 24576
     20 unaligned tiling: comm Xorg size 196608 mode X stride 4608 size-mod-tilerow-size 12288
      2 unaligned tiling: comm Xorg size 2097152 mode X stride 2560 size-mod-tilerow-size 8192
     10 unaligned tiling: comm Xorg size 229376 mode X stride 3072 size-mod-tilerow-size 8192
    193 unaligned tiling: comm Xorg size 2621440 mode X stride 4608 size-mod-tilerow-size 4096
      4 unaligned tiling: comm Xorg size 262144 mode X stride 1536 size-mod-tilerow-size 4096
     16 unaligned tiling: comm Xorg size 262144 mode X stride 4608 size-mod-tilerow-size 4096
      1 unaligned tiling: comm Xorg size 3145728 mode X stride 5632 size-mod-tilerow-size 36864
      3 unaligned tiling: comm Xorg size 327680 mode X stride 1536 size-mod-tilerow-size 8192
      2 unaligned tiling: comm Xorg size 393216 mode X stride 4608 size-mod-tilerow-size 24576
      3 unaligned tiling: comm Xorg size 40960 mode X stride 1536 size-mod-tilerow-size 4096
      1 unaligned tiling: comm Xorg size 5242880 mode X stride 5632 size-mod-tilerow-size 16384
    116 unaligned tiling: comm Xorg size 6291456 mode X stride 6656 size-mod-tilerow-size 8192
    280 unaligned tiling: comm Xorg size 65536 mode X stride 1536 size-mod-tilerow-size 4096
      6 unaligned tiling: comm Xorg size 65536 mode X stride 2560 size-mod-tilerow-size 4096

Comments

Chris Wilson Jan. 4, 2013, 8:32 p.m. UTC | #1
On Fri, 04 Jan 2013 21:18:53 +0200, Imre Deak <imre.deak@intel.com> wrote:
> On Fri, 2013-01-04 at 17:47 +0000, Chris Wilson wrote:
> > Meep. A long time ago we got the calcuations wrong (slightly less for
> > gen2), but it was and still is a userspace bug with the potential of
> > handing the GPU.
> > 
> > A multiple of tile_height tall and a mutiple of tile_width across should
> > always be an exact number of pages (and an exact multiple of tile-row
> > pages). And we should have been obeying that since the introduction of
> > set-tiling (ignoring the aforementioned bugs) - so I'd really like to
> > see any evidence of userspace getting that wrong.
> 
> On Ubuntu 12.10, with xf86-video-intel 2.20.9 I get the attached log
> with the following patch applied:
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c
> b/drivers/gpu/drm/i915/i915_gem_tiling.c
> index 7b0a3b3..846e96f 100644
> --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> @@ -236,9 +236,17 @@ i915_tiling_ok(struct drm_device *dev, int stride,
> int size, int tiling_mode)
>  	if (INTEL_INFO(dev)->gen >= 4) {
>  		if (stride & (tile_width - 1))
>  			return false;
> -		return true;
>  	}
>  
> +	if (size % (stride / tile_width * PAGE_SIZE))
> +		printk("unaligned tiling: comm %.*s size %u mode %c stride %d
> size-mod-tilerow-size %zd\n",
> +			(int)sizeof(current->comm), current->comm,
> +			size, tiling_mode == I915_TILING_X ? 'X' : 'Y',
> +			stride, size % (stride / tile_width * PAGE_SIZE));
> +
> +	if (INTEL_INFO(dev)->gen >= 4)
> +		return true;
> +
>  	/* Pre-965 needs power of two tile widths */
>  	if (stride < tile_width)
>  		return false;
> 
> 
> dmesg | grep unaligned | cut -c 16- | sort | uniq -c
>       2 unaligned tiling: comm compiz size 10485760 mode X stride 6656 size-mod-tilerow-size 49152
>       2 unaligned tiling: comm compiz size 10485760 mode Y stride 6400 size-mod-tilerow-size 40960
>       1 unaligned tiling: comm compiz size 1572864 mode Y stride 2944 size-mod-tilerow-size 65536

Hmm, I also forgot to mention we then pluck a bo out of cache returning
a potentially larger than required size. However, looks like I need to
double check userspace to make sure we are overallocating tile rows.

So we have a choice of overallocating the GTT for fenced regions, or
rounding the fence region to a tile row which preserves the existing
functional userspace behaviour. The right answer is indeed to modify the
behaviour to overallocate physical space so the fence region doesn't
overlap another bo. Next up you need to review what happens after a
change of tiling and whether we rebind before the next fenced GTT
access.
-Chris
Imre Deak Jan. 4, 2013, 8:54 p.m. UTC | #2
On Fri, 2013-01-04 at 20:32 +0000, Chris Wilson wrote:
> On Fri, 04 Jan 2013 21:18:53 +0200, Imre Deak <imre.deak@intel.com> wrote:
> > On Fri, 2013-01-04 at 17:47 +0000, Chris Wilson wrote:
> > > Meep. A long time ago we got the calcuations wrong (slightly less for
> > > gen2), but it was and still is a userspace bug with the potential of
> > > handing the GPU.
> > > 
> > > A multiple of tile_height tall and a mutiple of tile_width across should
> > > always be an exact number of pages (and an exact multiple of tile-row
> > > pages). And we should have been obeying that since the introduction of
> > > set-tiling (ignoring the aforementioned bugs) - so I'd really like to
> > > see any evidence of userspace getting that wrong.
> > 
> > On Ubuntu 12.10, with xf86-video-intel 2.20.9 I get the attached log
> > with the following patch applied:
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c
> > b/drivers/gpu/drm/i915/i915_gem_tiling.c
> > index 7b0a3b3..846e96f 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> > @@ -236,9 +236,17 @@ i915_tiling_ok(struct drm_device *dev, int stride,
> > int size, int tiling_mode)
> >  	if (INTEL_INFO(dev)->gen >= 4) {
> >  		if (stride & (tile_width - 1))
> >  			return false;
> > -		return true;
> >  	}
> >  
> > +	if (size % (stride / tile_width * PAGE_SIZE))
> > +		printk("unaligned tiling: comm %.*s size %u mode %c stride %d
> > size-mod-tilerow-size %zd\n",
> > +			(int)sizeof(current->comm), current->comm,
> > +			size, tiling_mode == I915_TILING_X ? 'X' : 'Y',
> > +			stride, size % (stride / tile_width * PAGE_SIZE));
> > +
> > +	if (INTEL_INFO(dev)->gen >= 4)
> > +		return true;
> > +
> >  	/* Pre-965 needs power of two tile widths */
> >  	if (stride < tile_width)
> >  		return false;
> > 
> > 
> > dmesg | grep unaligned | cut -c 16- | sort | uniq -c
> >       2 unaligned tiling: comm compiz size 10485760 mode X stride 6656 size-mod-tilerow-size 49152
> >       2 unaligned tiling: comm compiz size 10485760 mode Y stride 6400 size-mod-tilerow-size 40960
> >       1 unaligned tiling: comm compiz size 1572864 mode Y stride 2944 size-mod-tilerow-size 65536
> 
> Hmm, I also forgot to mention we then pluck a bo out of cache returning
> a potentially larger than required size. However, looks like I need to
> double check userspace to make sure we are overallocating tile rows.

Ok, I checked now the result is somewhat similar with 2.20.17.

> So we have a choice of overallocating the GTT for fenced regions, or
> rounding the fence region to a tile row which preserves the existing
> functional userspace behaviour.

Do you mean rounding the size down to the closest tile row address (and
program that to the fence reg)? I had such a version, but I thought it
got too complicated due to the old HW's power-of-two alignment
requirement. Here we would still need to over allocate if the
power-of-two size happens to be non tile-row size aligned.. But I guess
it's still doable if we want to save some gtt space.

> The right answer is indeed to modify the
> behaviour to overallocate physical space so the fence region doesn't
> overlap another bo. Next up you need to review what happens after a
> change of tiling and whether we rebind before the next fenced GTT
> access.

Ah right, atm only an incorrect alignment will cause an unbind, but we
would also need to check the new gtt size if we chose to over allocate.

--Imre
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c
b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 7b0a3b3..846e96f 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -236,9 +236,17 @@  i915_tiling_ok(struct drm_device *dev, int stride,
int size, int tiling_mode)
 	if (INTEL_INFO(dev)->gen >= 4) {
 		if (stride & (tile_width - 1))
 			return false;
-		return true;
 	}
 
+	if (size % (stride / tile_width * PAGE_SIZE))
+		printk("unaligned tiling: comm %.*s size %u mode %c stride %d
size-mod-tilerow-size %zd\n",
+			(int)sizeof(current->comm), current->comm,
+			size, tiling_mode == I915_TILING_X ? 'X' : 'Y',
+			stride, size % (stride / tile_width * PAGE_SIZE));
+
+	if (INTEL_INFO(dev)->gen >= 4)
+		return true;
+
 	/* Pre-965 needs power of two tile widths */