Message ID | 20220201104132.3050-1-ramalingam.c@intel.com (mailing list archive) |
---|---|
Series | drm/i915/dg2: Enabling 64k page size and flat ccs |
Just a note here.

To enable DG2 with basic support sooner on CI, we have taken a subset of
this series separately at
https://patchwork.freedesktop.org/series/100419/

The remaining patches will be pursued on top of the above series.

Thanks for the review comments. We will address them together with the
reviewers.

Thanks,
Ram.

On 2022-02-01 at 16:11:13 +0530, Ramalingam C wrote:
> This series introduces the enabling patches for the new memory compression
> feature Flat CCS and 64K page support for i915 local memory, along with
> documentation on the uAPI impact. The details of the feature and its
> implications for the uAPI are included below, and are also added to
> Documentation/gpu/rfc/i915_dg2.rst
>
> DG2 64K page size support:
> =========================
>
> On discrete platforms, starting from DG2, we have to contend with GTT
> page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
> objects. Specifically the hardware only supports 64K or larger GTT
> page sizes for such memory. The kernel will already ensure that all
> I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
> sizes underneath.
>
> Note that the returned size here will always reflect any required
> rounding up done by the kernel, i.e. 4K will now become 64K on devices
> such as DG2.
>
> Special DG2 GTT address alignment requirement:
>
> The GTT alignment will also need to be at least 2M for such objects.
>
> Note that due to how the hardware implements 64K GTT page support, we
> have some further complications:
>
>   1) The entire PDE (which covers a 2MB virtual address range) must
>   contain only 64K PTEs, i.e. mixing 4K and 64K PTEs in the same
>   PDE is forbidden by the hardware.
>
>   2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
>   objects.
>
> To keep things simple for userland, we mandate that any GTT mappings
> must be aligned to and rounded up to 2MB.
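The rounding and alignment rules quoted above can be sketched in a few lines
of C. This is only an illustration of the arithmetic, not code from the
series: the helper names (round_up_pow2, lmem_object_size, lmem_gtt_range,
lmem_gtt_offset_ok, same_pde) are hypothetical and exist neither in i915 nor
in the uAPI headers.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_64K (UINT64_C(64) << 10)
#define SZ_2M  (UINT64_C(2) << 20)

/* Round x up to the next multiple of a power-of-two alignment. */
static inline uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: the size reported back for a device-memory object,
 * assuming the kernel rounds the request up to the 64K minimum page size.
 */
static inline uint64_t lmem_object_size(uint64_t requested)
{
	return round_up_pow2(requested, SZ_64K);
}

/* Hypothetical helper: GTT range to reserve, rounded up to 2MB. */
static inline uint64_t lmem_gtt_range(uint64_t object_size)
{
	return round_up_pow2(object_size, SZ_2M);
}

/* Non-zero if a GTT offset meets the 2MB alignment requirement. */
static inline int lmem_gtt_offset_ok(uint64_t offset)
{
	return (offset & (SZ_2M - 1)) == 0;
}

/* Non-zero if two GTT addresses fall under the same 2MB PDE range. */
static inline int same_pde(uint64_t a, uint64_t b)
{
	return (a / SZ_2M) == (b / SZ_2M);
}
```

With these helpers a 4K request becomes a 64K object that occupies a full 2MB
GTT range. Because every such range starts and ends on a 2MB boundary, no
other mapping can land in the same PDE, which is how the alignment rule
sidesteps the ban on mixing 4K and 64K PTEs.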
> As this only wastes virtual
> address space and avoids userland having to copy any needlessly
> complicated PDE sharing scheme (coloring) and only affects DG2, this
> is deemed to be a good compromise.
>
> Flat CCS support for lmem
> =========================
> On Xe-HP and later devices, we use dedicated compression control state
> (CCS) stored in local memory for each surface, to support the 3D and
> media compression formats.
>
> The memory required for the CCS of the entire local memory is 1/256 of
> the local memory size. So before the kernel boots, the required memory is
> reserved for the CCS data and a secure register will be programmed with
> the CCS base address.
>
> Flat CCS data needs to be cleared when an lmem object is allocated, and
> CCS data can be copied in and out of the CCS region through
> XY_CTRL_SURF_COPY_BLT. The CPU can't access the CCS data directly.
>
> When we exhaust the lmem, if the object's placements support smem, then
> we can directly decompress the compressed lmem object into smem and
> start using it from smem itself.
>
> But when we need to swap out a compressed lmem object into an smem
> region even though the object's placements don't support smem, we copy
> the lmem content as-is into the smem region along with the CCS data (using
> XY_CTRL_SURF_COPY_BLT). When the object is referenced again, the lmem
> content will be swapped in along with restoration of the CCS data (using
> XY_CTRL_SURF_COPY_BLT) at the corresponding location.
>
> Flat-CCS Modifiers for different compression formats
> ====================================================
> I915_FORMAT_MOD_4_TILED_DG2_RC_CCS - used to indicate the buffers of
> Flat CCS render compression formats. Though the general layout is the same
> as I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS, a new hashing/compression
> algorithm is used. Render compression uses 128 byte compression blocks.
>
> I915_FORMAT_MOD_4_TILED_DG2_MC_CCS - used to indicate the buffers of Flat
> CCS media compression formats.
> Though the general layout is the same as
> I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS, a new hashing/compression algorithm
> is used. Media compression uses 256 byte compression blocks.
>
> I915_FORMAT_MOD_4_TILED_DG2_RC_CCS_CC - used to indicate the buffers of
> Flat CCS clear color render compression formats. Unified compression
> format for clear color render compression. The general layout is a tiled
> layout using 4KB tiles, i.e. Tile4 layout. The fast clear color value
> expected by HW is located in the fb at offset 0 of plane#1.
>
> v2:
>   Fixed some formatting issues and platform naming issues
>   Added some more documentation on Flat-CCS
>
> v3:
>   Plane programming is handled for flat-ccs and clear color
>   Tile4 and flat ccs modifier patches are rebased on table based
>   modifier reference method
>   Three patches are squashed
>   Y tile is pruned for DG2.
>   flat_ccs_cc plane format info is added
>   Added mesa, compute and media people for required uAPI ack.
>
> v4:
>   Rebasing of the patches
>
> v5:
>   KDoc is enhanced for cc modifier. [Nanley & Lionel]
>   Inbuilt macro usage for functional fix [Bob]
>   Addressed review comments from Matt
>   Platform coverage fix for modifiers [Imre]
>
> Abdiel Janulgue (1):
>   drm/i915/lmem: Enable lmem for platforms with Flat CCS
>
> Anshuman Gupta (1):
>   drm/i915/dg2: Flat CCS Support
>
> Ayaz A Siddiqui (1):
>   drm/i915/gt: Clear compress metadata for Xe_HP platforms
>
> CQ Tang (1):
>   drm/i915/xehpsdv: Add has_flat_ccs to device info
>
> Matt Roper (1):
>   drm/i915/dg2: Add DG2 unified compression
>
> Matthew Auld (6):
>   drm/i915: enforce min GTT alignment for discrete cards
>   drm/i915: support 64K GTT pages for discrete cards
>   drm/i915/gtt: allow overriding the pt alignment
>   drm/i915/gtt: add xehpsdv_ppgtt_insert_entry
>   drm/i915/migrate: add acceleration support for DG2
>   drm/i915/uapi: document behaviour for DG2 64K support
>
> Mika Kahola (1):
>   uapi/drm/dg2: Introduce format modifier for DG2 clear color
>
> Ramalingam C (4):
>   drm/i915: add needs_compact_pt flag
>   Doc/gpu/rfc/i915: i915 DG2 64k pagesize uAPI
>   drm/i915/Flat-CCS: Document on Flat-CCS memory compression
>   Doc/gpu/rfc/i915: i915 DG2 flat-CCS uAPI
>
> Robert Beckett (1):
>   drm/i915: add gtt misalignment test
>
> Stanislav Lisovskiy (2):
>   drm/i915: Introduce new Tile 4 format
>   drm/i915/dg2: Tile 4 plane format support
>
>  Documentation/gpu/rfc/i915_dg2.rst            |  32 ++
>  Documentation/gpu/rfc/index.rst               |   3 +
>  drivers/gpu/drm/i915/display/intel_display.c  |   5 +-
>  drivers/gpu/drm/i915/display/intel_fb.c       |  68 +++-
>  drivers/gpu/drm/i915/display/intel_fb.h       |   1 +
>  drivers/gpu/drm/i915/display/intel_fbc.c      |   1 +
>  .../drm/i915/display/intel_plane_initial.c    |   1 +
>  .../drm/i915/display/skl_universal_plane.c    |  70 +++-
>  .../gpu/drm/i915/gem/selftests/huge_pages.c   |  60 ++++
>  .../i915/gem/selftests/i915_gem_client_blt.c  |  21 +-
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 158 +++++++-
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |  14 +
>  drivers/gpu/drm/i915/gt/intel_gt.c            |  19 +
>  drivers/gpu/drm/i915/gt/intel_gt.h            |   1 +
>  drivers/gpu/drm/i915/gt/intel_gtt.c           |  12 +
>  drivers/gpu/drm/i915/gt/intel_gtt.h           |  31 +-
>  drivers/gpu/drm/i915/gt/intel_migrate.c       | 336 ++++++++++++++++--
>  drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  17 +-
>  drivers/gpu/drm/i915/gt/intel_region_lmem.c   |  24 +-
>  drivers/gpu/drm/i915/i915_drv.h               |  18 +-
>  drivers/gpu/drm/i915/i915_pci.c               |   4 +
>  drivers/gpu/drm/i915/i915_reg.h               |   4 +
>  drivers/gpu/drm/i915/i915_vma.c               |   9 +
>  drivers/gpu/drm/i915/intel_device_info.h      |   3 +
>  drivers/gpu/drm/i915/intel_pm.c               |   1 +
>  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 224 ++++++++++--
>  include/uapi/drm/drm_fourcc.h                 |  43 +++
>  include/uapi/drm/i915_drm.h                   |  44 ++-
>  28 files changed, 1102 insertions(+), 122 deletions(-)
>  create mode 100644 Documentation/gpu/rfc/i915_dg2.rst
>
> --
> 2.20.1
>
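
For a feel of the Flat CCS numbers in the cover letter, here is a minimal
sketch of the 1/256 sizing arithmetic. The names (FLAT_CCS_RATIO,
flat_ccs_bytes, swapout_bytes) are hypothetical and not taken from the
patches. The swap-out size is an assumption derived from the statement that
the lmem content is copied as-is along with its CCS data, so the actual
driver may round differently (for example to whole pages).

```c
#include <assert.h>
#include <stdint.h>

/* One byte of CCS covers 256 bytes of local memory (the 1/256 ratio). */
#define FLAT_CCS_RATIO UINT64_C(256)

/* Integer division that rounds up instead of truncating. */
static inline uint64_t div_round_up(uint64_t n, uint64_t d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/* Hypothetical helper: CCS bytes needed to cover `bytes` of local memory. */
static inline uint64_t flat_ccs_bytes(uint64_t bytes)
{
	return div_round_up(bytes, FLAT_CCS_RATIO);
}

/*
 * Hypothetical helper: smem needed to hold a compressed lmem object that is
 * swapped out without decompression, i.e. raw content plus its CCS data.
 * Assumption: no extra padding beyond the 1/256 ratio.
 */
static inline uint64_t swapout_bytes(uint64_t object_bytes)
{
	return object_bytes + flat_ccs_bytes(object_bytes);
}
```

Under these assumptions a device with 16 GiB of local memory reserves 64 MiB
for CCS, and a 2 MiB compressed object needs 8 KiB of CCS data carried
alongside it when it is swapped out as-is.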