Message ID | 1357317721-6313-6-git-send-email-imre.deak@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, 4 Jan 2013 18:42:00 +0200, Imre Deak <imre.deak@intel.com> wrote: > The gtt space needed for tiled objects might be bigger than the linear > size programmed into the correpsonding fence register. For example for > the following buffer on a Gen5+ HW: > > - allocation size: 4096 bytes > - tiling mode: X tiled > - stride: 1536 > > we need (1536 / 512) * 4096 bytes of gtt space to cover all the pixels > in the buffer, but at the moment we allocate only 4096. This means that > any buffer following this tiled buffer in the gtt space will be > corrupted if pixels belonging to the 2nd and 3rd tiles are written. > > Fix this by rounding up the size of the allocated gtt space to the next > tile row address. The page frames beyond the allocation size will be > backed by the single gtt scratch page used already elsewhere for similar > padding. > > Note that this is more of a security/robustness problem and not fixing any > reported issue that I know of. This is because applications will normally > access only the part of the buffer that is tile row size aligned. There should not be any reported issues because all userspace already allocates up to the end of tile-row and stride should be enforced to be a multiple of tile-width. So the use of DIV_ROUND_UP implies a programming error that should have been reported back to userspace earlier. We can extend that by checking to make sure userspace has allocated a valid buffer, that is, it has allocated sufficient pages for the sampler access into the tiled buffer (or reject the set-tiling). -Chris
On Fri, 2013-01-04 at 17:07 +0000, Chris Wilson wrote: > On Fri, 4 Jan 2013 18:42:00 +0200, Imre Deak <imre.deak@intel.com> wrote: > > The gtt space needed for tiled objects might be bigger than the linear > > size programmed into the correpsonding fence register. For example for > > the following buffer on a Gen5+ HW: > > > > - allocation size: 4096 bytes > > - tiling mode: X tiled > > - stride: 1536 > > > > we need (1536 / 512) * 4096 bytes of gtt space to cover all the pixels > > in the buffer, but at the moment we allocate only 4096. This means that > > any buffer following this tiled buffer in the gtt space will be > > corrupted if pixels belonging to the 2nd and 3rd tiles are written. > > > > Fix this by rounding up the size of the allocated gtt space to the next > > tile row address. The page frames beyond the allocation size will be > > backed by the single gtt scratch page used already elsewhere for similar > > padding. > > > > Note that this is more of a security/robustness problem and not fixing any > > reported issue that I know of. This is because applications will normally > > access only the part of the buffer that is tile row size aligned. > > There should not be any reported issues because all userspace already > allocates up to the end of tile-row and stride should be enforced to be > a multiple of tile-width. So the use of DIV_ROUND_UP implies a > programming error that should have been reported back to userspace > earlier. We can extend that by checking to make sure userspace has > allocated a valid buffer, that is, it has allocated sufficient pages for > the sampler access into the tiled buffer (or reject the set-tiling). > -Chris Ok, I tested this with older UXA that still allocated non-aligned buffers. If that's not the case any more then rejecting set-tiling if it's called on a non tile-row size aligned buffer would work too. --Imre
On Fri, 04 Jan 2013 19:23:42 +0200, Imre Deak <imre.deak@intel.com> wrote: > On Fri, 2013-01-04 at 17:07 +0000, Chris Wilson wrote: > > On Fri, 4 Jan 2013 18:42:00 +0200, Imre Deak <imre.deak@intel.com> wrote: > > > The gtt space needed for tiled objects might be bigger than the linear > > > size programmed into the correpsonding fence register. For example for > > > the following buffer on a Gen5+ HW: > > > > > > - allocation size: 4096 bytes > > > - tiling mode: X tiled > > > - stride: 1536 > > > > > > we need (1536 / 512) * 4096 bytes of gtt space to cover all the pixels > > > in the buffer, but at the moment we allocate only 4096. This means that > > > any buffer following this tiled buffer in the gtt space will be > > > corrupted if pixels belonging to the 2nd and 3rd tiles are written. > > > > > > Fix this by rounding up the size of the allocated gtt space to the next > > > tile row address. The page frames beyond the allocation size will be > > > backed by the single gtt scratch page used already elsewhere for similar > > > padding. > > > > > > Note that this is more of a security/robustness problem and not fixing any > > > reported issue that I know of. This is because applications will normally > > > access only the part of the buffer that is tile row size aligned. > > > > There should not be any reported issues because all userspace already > > allocates up to the end of tile-row and stride should be enforced to be > > a multiple of tile-width. So the use of DIV_ROUND_UP implies a > > programming error that should have been reported back to userspace > > earlier. We can extend that by checking to make sure userspace has > > allocated a valid buffer, that is, it has allocated sufficient pages for > > the sampler access into the tiled buffer (or reject the set-tiling). > > -Chris > > Ok, I tested this with older UXA that still allocated non-aligned > buffers. If that's not the case any more then rejecting set-tiling if > it's called on a non tile-row size aligned buffer would work too. Meep. A long time ago we got the calcuations wrong (slightly less for gen2), but it was and still is a userspace bug with the potential of handing the GPU. A multiple of tile_height tall and a mutiple of tile_width across should always be an exact number of pages (and an exact multiple of tile-row pages). And we should have been obeying that since the introduction of set-tiling (ignoring the aforementioned bugs) - so I'd really like to see any evidence of userspace getting that wrong. -Chris
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a20edca..3a755e4 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1577,7 +1577,11 @@ void i915_gem_free_all_phys_object(struct drm_device *dev); void i915_gem_release(struct drm_device *dev, struct drm_file *file); uint32_t -i915_gem_get_gtt_size(struct drm_device *dev, uint32_t size, int tiling_mode); +i915_gem_get_gtt_linear_size(struct drm_device *dev, uint32_t size, + int tiling_mode); +uint32_t +i915_gem_get_gtt_physical_size(struct drm_device *dev, uint32_t size, + int tiling_mode, int stride); uint32_t i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size, int tiling_mode, bool fenced); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index a878b9f..548ee8b 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1476,7 +1476,8 @@ i915_gem_get_tile_width(struct drm_device *dev, int tiling_mode) } uint32_t -i915_gem_get_gtt_size(struct drm_device *dev, uint32_t size, int tiling_mode) +i915_gem_get_gtt_linear_size(struct drm_device *dev, uint32_t size, + int tiling_mode) { uint32_t gtt_size; @@ -1496,6 +1497,26 @@ i915_gem_get_gtt_size(struct drm_device *dev, uint32_t size, int tiling_mode) return gtt_size; } +uint32_t +i915_gem_get_gtt_physical_size(struct drm_device *dev, uint32_t size, + int tiling_mode, int stride) +{ + uint32_t linear_size; + int tile_y_num; + int tile_row_size; + + if (tiling_mode == I915_TILING_NONE) + return size; + + linear_size = i915_gem_get_gtt_linear_size(dev, size, tiling_mode); + tile_y_num = DIV_ROUND_UP(stride, + i915_gem_get_tile_width(dev, tiling_mode)); + tile_row_size = tile_y_num * PAGE_SIZE; + size = roundup(size, tile_row_size); + + return max(size, linear_size); +} + /** * i915_gem_get_gtt_alignment - return required GTT alignment for an object * @obj: object to check @@ -1519,7 +1540,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size, * Previous chips need to be aligned to the size of the smallest * fence register that can contain the object. */ - return i915_gem_get_gtt_size(dev, size, tiling_mode); + return i915_gem_get_gtt_linear_size(dev, size, tiling_mode); } static int i915_gem_object_create_mmap_offset(struct drm_i915_gem_object *obj) @@ -2566,8 +2587,10 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg, } if (obj) { - u32 size = obj->gtt_space->size; + u32 size; + size = i915_gem_get_gtt_linear_size(dev, obj->base.size, + obj->tiling_mode); val = (uint64_t)((obj->gtt_offset + size - 4096) & 0xfffff000) << 32; val |= obj->gtt_offset & 0xfffff000; @@ -2590,10 +2613,12 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg, u32 val; if (obj) { - u32 size = obj->gtt_space->size; + u32 size; int pitch_val; int tile_width; + size = i915_gem_get_gtt_linear_size(dev, obj->base.size, + obj->tiling_mode); WARN((obj->gtt_offset & ~I915_FENCE_START_MASK) || (size & -size) != size || (obj->gtt_offset & (size - 1)), @@ -2634,9 +2659,11 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, uint32_t val; if (obj) { - u32 size = obj->gtt_space->size; + u32 size; uint32_t pitch_val; + size = i915_gem_get_gtt_linear_size(dev, obj->base.size, + obj->tiling_mode); WARN((obj->gtt_offset & ~I830_FENCE_START_MASK) || (size & -size) != size || (obj->gtt_offset & (size - 1)), @@ -2929,9 +2956,8 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj, return -EINVAL; } - fence_size = i915_gem_get_gtt_size(dev, - obj->base.size, - obj->tiling_mode); + fence_size = i915_gem_get_gtt_physical_size(dev, obj->base.size, + obj->tiling_mode, obj->stride); fence_alignment = i915_gem_get_gtt_alignment(dev, obj->base.size, obj->tiling_mode, true); diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c index 7b0a3b3..e738fcb 100644 --- a/drivers/gpu/drm/i915/i915_gem_tiling.c +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c @@ -251,7 +251,8 @@ i915_tiling_ok(struct drm_device *dev, int stride, int size, int tiling_mode) /* Is the current GTT allocation valid for the change in tiling? */ static bool -i915_gem_object_fence_ok(struct drm_i915_gem_object *obj, int tiling_mode) +i915_gem_object_fence_ok(struct drm_i915_gem_object *obj, int tiling_mode, + int stride) { u32 size; @@ -269,7 +270,8 @@ i915_gem_object_fence_ok(struct drm_i915_gem_object *obj, int tiling_mode) return false; } - size = i915_gem_get_gtt_size(obj->base.dev, obj->base.size, tiling_mode); + size = i915_gem_get_gtt_physical_size(obj->base.dev, obj->base.size, + tiling_mode, stride); if (obj->gtt_space->size != size) return false; @@ -355,7 +357,8 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, obj->map_and_fenceable = obj->gtt_space == NULL || (obj->gtt_offset + obj->base.size <= dev_priv->mm.gtt_mappable_end && - i915_gem_object_fence_ok(obj, args->tiling_mode)); + i915_gem_object_fence_ok(obj, args->tiling_mode, + args->stride)); /* Rebind if we need a change of alignment */ if (!obj->map_and_fenceable) {
The gtt space needed for tiled objects might be bigger than the linear size programmed into the correpsonding fence register. For example for the following buffer on a Gen5+ HW: - allocation size: 4096 bytes - tiling mode: X tiled - stride: 1536 we need (1536 / 512) * 4096 bytes of gtt space to cover all the pixels in the buffer, but at the moment we allocate only 4096. This means that any buffer following this tiled buffer in the gtt space will be corrupted if pixels belonging to the 2nd and 3rd tiles are written. Fix this by rounding up the size of the allocated gtt space to the next tile row address. The page frames beyond the allocation size will be backed by the single gtt scratch page used already elsewhere for similar padding. Note that this is more of a security/robustness problem and not fixing any reported issue that I know of. This is because applications will normally access only the part of the buffer that is tile row size aligned. Signed-off-by: Imre Deak <imre.deak@intel.com> --- drivers/gpu/drm/i915/i915_drv.h | 6 ++++- drivers/gpu/drm/i915/i915_gem.c | 42 ++++++++++++++++++++++++++------ drivers/gpu/drm/i915/i915_gem_tiling.c | 9 ++++--- 3 files changed, 45 insertions(+), 12 deletions(-)