Message ID | 20220307134038.30525-5-ramalingam.c@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915/ttm: Evict and restore of compressed object |
On 07/03/2022 13:40, Ramalingam C wrote:
> On Xe-HP and later devices, dedicated compression control state (CCS)
> stored in local memory is used for each surface, to support the
> 3D and media compression formats.
>
> The memory required for the CCS of the entire local memory is 1/256 of
> the local memory size. So before the kernel boot, the required memory
> is reserved for the CCS data and a secure register will be programmed
> with the CCS base address.
>
> So when an object is allocated in local memory, we don't need to
> explicitly allocate the space for ccs data. But when the obj is evicted
> into the smem, extra space is needed in smem to hold the compression
> related data along with the obj, i.e. obj_size + (obj_size/256).
>
> Hence when smem pages are allocated for an obj with lmem placement
> possibility we create with the extra pages required for the ccs data for
> the obj size.
>
> v2:
>   Used imperative wording [Thomas]
>
> Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
> cc: Christian Koenig <christian.koenig@amd.com>
> cc: Hellstrom Thomas <thomas.hellstrom@intel.com>
> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index 1a8262f5f692..c7a36861c38d 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -20,6 +20,7 @@
>  #include "gem/i915_gem_ttm.h"
>  #include "gem/i915_gem_ttm_move.h"
>  #include "gem/i915_gem_ttm_pm.h"
> +#include "gt/intel_gpu_commands.h"
>
>  #define I915_TTM_PRIO_PURGE 0
>  #define I915_TTM_PRIO_NO_PAGES 1
> @@ -255,12 +256,27 @@ static const struct i915_refct_sgt_ops tt_rsgt_ops = {
>  	.release = i915_ttm_tt_release
>  };
>
> +static inline bool
> +i915_gem_object_has_lmem_placement(struct drm_i915_gem_object *obj)
> +{
> +	int i;
> +
> +	for (i = 0; i < obj->mm.n_placements; i++)
> +		if (obj->mm.placements[i]->type == INTEL_MEMORY_LOCAL)
> +			return true;
> +
> +	return false;
> +}
> +
>  static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
>  					 uint32_t page_flags)
>  {
> +	struct drm_i915_private *i915 = container_of(bo->bdev, typeof(*i915),
> +						     bdev);
>  	struct ttm_resource_manager *man =
>  		ttm_manager_type(bo->bdev, bo->resource->mem_type);
>  	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
> +	unsigned long ccs_pages = 0;
>  	enum ttm_caching caching;
>  	struct i915_ttm_tt *i915_tt;
>  	int ret;
> @@ -283,7 +299,12 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
>  		i915_tt->is_shmem = true;
>  	}
>
> -	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching, 0);
> +	if (HAS_FLAT_CCS(i915) && i915_gem_object_has_lmem_placement(obj))
> +		ccs_pages = DIV_ROUND_UP(DIV_ROUND_UP(bo->base.size,
> +						      NUM_BYTES_PER_CCS_BYTE),
> +					 PAGE_SIZE);

Did you figure out how to handle the case where we have LMEM + SMEM, and
are unable to place the object into LMEM, and then it just ends up being
kept in SMEM? AFAIK the vm.insert_entries code has always just assumed
that the vma sg_table size is the same as the vma->size, and so will
happily create PTEs for the hidden ccs page(s), which might corrupt the
user's vm, depending on the exact layout.

Also it looks like the _shmem_writeback() call should now use
ttm_tt->num_pages, instead of the object size?

> +
> +	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching, ccs_pages);
>  	if (ret)
>  		goto err_free;
>
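The two concerns in the reply above come down to which page count each code path should use once the backing store is larger than the object. The following is a minimal standalone sketch of that distinction, not i915 code: `struct sketch_backing` and every `sketch_` function are hypothetical names introduced for illustration, and a 4 KiB page size is assumed.

```c
/* Assumption: 4 KiB pages; the kernel's PAGE_SIZE is architecture dependent. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical stand-in for an object's backing store; not an i915 type. */
struct sketch_backing {
	unsigned long obj_size;  /* user-visible size in bytes, page aligned */
	unsigned long num_pages; /* pages actually allocated, incl. hidden CCS pages */
};

/* PTEs should cover only the user-visible part of the object. */
static inline unsigned long
sketch_ptes_to_insert(const struct sketch_backing *b)
{
	return b->obj_size / SKETCH_PAGE_SIZE;
}

/* Writeback, as the reply suggests, would cover every allocated page. */
static inline unsigned long
sketch_pages_to_writeback(const struct sketch_backing *b)
{
	return b->num_pages;
}

/* Backing pages that must stay out of the user's address space. */
static inline unsigned long
sketch_hidden_pages(const struct sketch_backing *b)
{
	return b->num_pages - sketch_ptes_to_insert(b);
}
```

With a 1 MiB object backed by 257 pages, the sketch maps 256 pages, writes back 257, and keeps 1 page hidden. Code that derives the PTE count from the backing size instead would map that extra page as well.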
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index 1a8262f5f692..c7a36861c38d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -20,6 +20,7 @@
 #include "gem/i915_gem_ttm.h"
 #include "gem/i915_gem_ttm_move.h"
 #include "gem/i915_gem_ttm_pm.h"
+#include "gt/intel_gpu_commands.h"
 
 #define I915_TTM_PRIO_PURGE 0
 #define I915_TTM_PRIO_NO_PAGES 1
@@ -255,12 +256,27 @@ static const struct i915_refct_sgt_ops tt_rsgt_ops = {
 	.release = i915_ttm_tt_release
 };
 
+static inline bool
+i915_gem_object_has_lmem_placement(struct drm_i915_gem_object *obj)
+{
+	int i;
+
+	for (i = 0; i < obj->mm.n_placements; i++)
+		if (obj->mm.placements[i]->type == INTEL_MEMORY_LOCAL)
+			return true;
+
+	return false;
+}
+
 static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
 					 uint32_t page_flags)
 {
+	struct drm_i915_private *i915 = container_of(bo->bdev, typeof(*i915),
+						     bdev);
 	struct ttm_resource_manager *man =
 		ttm_manager_type(bo->bdev, bo->resource->mem_type);
 	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
+	unsigned long ccs_pages = 0;
 	enum ttm_caching caching;
 	struct i915_ttm_tt *i915_tt;
 	int ret;
@@ -283,7 +299,12 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
 		i915_tt->is_shmem = true;
 	}
 
-	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching, 0);
+	if (HAS_FLAT_CCS(i915) && i915_gem_object_has_lmem_placement(obj))
+		ccs_pages = DIV_ROUND_UP(DIV_ROUND_UP(bo->base.size,
+						      NUM_BYTES_PER_CCS_BYTE),
+					 PAGE_SIZE);
+
+	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching, ccs_pages);
 	if (ret)
 		goto err_free;
 
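
The nested `DIV_ROUND_UP` in the patch can be checked outside the kernel. The sketch below reproduces the arithmetic in plain C under two assumptions that the patch itself does not show: `NUM_BYTES_PER_CCS_BYTE` is taken to be 256 (matching the 1/256 ratio in the commit message) and the page size is taken to be 4 KiB. The `SKETCH_` macros and both helper functions are illustrative names, not kernel symbols.

```c
/* Userspace re-definition mirroring the kernel's DIV_ROUND_UP helper. */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Assumption: one CCS byte per 256 bytes of object data. */
#define SKETCH_BYTES_PER_CCS_BYTE 256UL

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096UL

/* Extra pages needed to hold the CCS data of an object of obj_size bytes. */
static inline unsigned long ccs_pages_for(unsigned long obj_size)
{
	unsigned long ccs_bytes =
		SKETCH_DIV_ROUND_UP(obj_size, SKETCH_BYTES_PER_CCS_BYTE);

	return SKETCH_DIV_ROUND_UP(ccs_bytes, SKETCH_PAGE_SIZE);
}

/* Total system-memory pages: object data plus the hidden CCS pages. */
static inline unsigned long smem_pages_for(unsigned long obj_size)
{
	return SKETCH_DIV_ROUND_UP(obj_size, SKETCH_PAGE_SIZE) +
	       ccs_pages_for(obj_size);
}
```

Under these assumptions a 1 MiB object needs 4096 bytes of CCS data, so one extra page on top of its 256 data pages. Because the result is rounded up to whole pages, any non-empty object costs at least one extra page, so a single 4 KiB object doubles its system-memory footprint.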