
[RFC,0/2] drm/i915/ttm: Evict and store of compressed object

Message ID 20220207093743.14467-1-ramalingam.c@intel.com (mailing list archive)

Message

Ramalingam C Feb. 7, 2022, 9:37 a.m. UTC
On flat-ccs capable platforms we need to evict and restore the ccs data
along with the corresponding main memory.

This ccs data can only be accessed through the BLT engine, using a special
cmd ( )

To support the above requirement on flat-ccs enabled i915 platforms, this
series adds a new param called ccs_pages_needed to ttm_tt_init(),
to increase the ttm_tt->num_pages of system memory when the obj has the
possibility of lmem placement.

This series applies on top of the flat-ccs enabling series:
https://patchwork.freedesktop.org/series/95686/

For more about the flat-ccs feature, please have a look at
https://patchwork.freedesktop.org/patch/471777/?series=95686&rev=5

Testing of the series is WIP; I am looking forward to early review of
the amendment to ttm_tt_init and the approach.

Ramalingam C (2):
  drm/i915/ttm: Add extra pages for handling ccs data
  drm/i915/migrate: Evict and restore the ccs data

 drivers/gpu/drm/drm_gem_vram_helper.c      |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c    |  23 +-
 drivers/gpu/drm/i915/gt/intel_migrate.c    | 283 +++++++++++----------
 drivers/gpu/drm/qxl/qxl_ttm.c              |   2 +-
 drivers/gpu/drm/ttm/ttm_agp_backend.c      |   2 +-
 drivers/gpu/drm/ttm/ttm_tt.c               |  12 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c |   2 +-
 include/drm/ttm/ttm_tt.h                   |   4 +-
 8 files changed, 191 insertions(+), 139 deletions(-)

Comments

Christian König Feb. 7, 2022, 11:41 a.m. UTC | #1
Am 07.02.22 um 10:37 schrieb Ramalingam C:
> On flat-ccs capable platforms we need to evict and restore the ccs data
> along with the corresponding main memory.
>
> This ccs data can only be accessed through the BLT engine, using a special
> cmd ( )
>
> To support the above requirement on flat-ccs enabled i915 platforms, this
> series adds a new param called ccs_pages_needed to ttm_tt_init(),
> to increase the ttm_tt->num_pages of system memory when the obj has the
> possibility of lmem placement.

Well, the question is why isn't the buffer object allocated with the extra
space in the first place?

Regards,
Christian.

Hellstrom, Thomas Feb. 7, 2022, 1:49 p.m. UTC | #2
Hi, Christian,

On Mon, 2022-02-07 at 12:41 +0100, Christian König wrote:
> Am 07.02.22 um 10:37 schrieb Ramalingam C:
> > On flat-ccs capable platforms we need to evict and restore the ccs data
> > along with the corresponding main memory.
> > 
> > This ccs data can only be accessed through the BLT engine, using a special
> > cmd ( )
> > 
> > To support the above requirement on flat-ccs enabled i915 platforms, this
> > series adds a new param called ccs_pages_needed to ttm_tt_init(),
> > to increase the ttm_tt->num_pages of system memory when the obj has the
> > possibility of lmem placement.
> 
> Well, the question is why isn't the buffer object allocated with the extra
> space in the first place?

That wastes precious VRAM. The extra space is needed only when the bo
is evicted.

We had a short discussion on this previously here:
https://lists.freedesktop.org/archives/dri-devel/2021-August/321161.html

Thanks,
Thomas


Ramalingam C Feb. 7, 2022, 1:53 p.m. UTC | #3
On 2022-02-07 at 12:41:59 +0100, Christian König wrote:
> Am 07.02.22 um 10:37 schrieb Ramalingam C:
> > On flat-ccs capable platforms we need to evict and restore the ccs data
> > along with the corresponding main memory.
> > 
> > This ccs data can only be accessed through the BLT engine, using a special
> > cmd ( )
> > 
> > To support the above requirement on flat-ccs enabled i915 platforms, this
> > series adds a new param called ccs_pages_needed to ttm_tt_init(),
> > to increase the ttm_tt->num_pages of system memory when the obj has the
> > possibility of lmem placement.
> 
> Well, the question is why isn't the buffer object allocated with the extra
> space in the first place?
Hi Christian,

On Xe-HP and later devices, we use dedicated compression control state (CCS)
stored in local memory for each surface, to support the 3D and media
compression formats.

The memory required for the CCS of the entire local memory is 1/256 of the
local memory size. So before the kernel boots, the required memory is reserved
for the CCS data and a secure register is programmed with the CCS base
address.

So when we allocate an object in local memory we don't need to explicitly
allocate the space for the ccs data. But when we evict the obj into smem,
we need smem space of obj_size + (obj_size/256) to hold the
compression-related data along with the obj.

Hence when we create smem for an obj with the possibility of lmem placement,
we create it with the extra space.
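
The size arithmetic above can be sketched in plain C. This is only an
illustration, not code from the series: the 4 KiB page size, the SKETCH_*
macros and all function names are assumptions made for the example, and
rounding the CCS data up to whole pages is my reading of "extra pages".

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for illustration; the kernel has its own PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u
/* From the explanation above: CCS data is 1/256 of the main-memory size. */
#define SKETCH_CCS_RATIO 256u

/* Divide n by d, rounding up (d must be non-zero). */
static size_t div_round_up(size_t n, size_t d)
{
	assert(d > 0);
	return (n + d - 1) / d;
}

/* Bytes of CCS data that accompany an object of obj_size bytes. */
static size_t ccs_bytes(size_t obj_size)
{
	return div_round_up(obj_size, SKETCH_CCS_RATIO);
}

/* Whole smem pages needed to hold the object plus its CCS data. */
static size_t smem_pages_with_ccs(size_t obj_size)
{
	size_t obj_pages = div_round_up(obj_size, SKETCH_PAGE_SIZE);
	size_t ccs_pages = div_round_up(ccs_bytes(obj_size), SKETCH_PAGE_SIZE);

	return obj_pages + ccs_pages;
}
```

Under these assumptions a 1 MiB object needs 256 pages for its contents plus
1 page for 4 KiB of CCS data, so 257 smem pages in total.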

 Ram.
Christian König Feb. 7, 2022, 2:37 p.m. UTC | #4
Am 07.02.22 um 14:53 schrieb Ramalingam C:
> On 2022-02-07 at 12:41:59 +0100, Christian König wrote:
>> Am 07.02.22 um 10:37 schrieb Ramalingam C:
>>> On flat-ccs capable platforms we need to evict and restore the ccs data
>>> along with the corresponding main memory.
>>>
>>> This ccs data can only be accessed through the BLT engine, using a special
>>> cmd ( )
>>>
>>> To support the above requirement on flat-ccs enabled i915 platforms, this
>>> series adds a new param called ccs_pages_needed to ttm_tt_init(),
>>> to increase the ttm_tt->num_pages of system memory when the obj has the
>>> possibility of lmem placement.
>> Well, the question is why isn't the buffer object allocated with the extra
>> space in the first place?
> Hi Christian,
>
> On Xe-HP and later devices, we use dedicated compression control state (CCS)
> stored in local memory for each surface, to support the 3D and media
> compression formats.
>
> The memory required for the CCS of the entire local memory is 1/256 of the
> local memory size. So before the kernel boots, the required memory is reserved
> for the CCS data and a secure register is programmed with the CCS base
> address.
>
> So when we allocate an object in local memory we don't need to explicitly
> allocate the space for the ccs data. But when we evict the obj into smem,
> we need smem space of obj_size + (obj_size/256) to hold the
> compression-related data along with the obj.
>
> Hence when we create smem for an obj with the possibility of lmem placement,
> we create it with the extra space.

That's exactly what I've been missing in the cover letter and/or commit
messages, comments, etc.

Overall that sounds like a valid explanation to me, just one comment on the
code/naming:

>   int ttm_tt_init(struct ttm_tt *ttm, struct ttm_buffer_object *bo,
> -		uint32_t page_flags, enum ttm_caching caching)
> +		uint32_t page_flags, enum ttm_caching caching,
> +		unsigned long ccs_pages)

Please don't try to leak any i915-specific stuff into common TTM code.

For example, use the wording extra_pages instead of ccs_pages here.
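
A minimal sketch of that split, assuming a simplified stand-in for the TT
object: struct sketch_tt, sketch_tt_init() and sketch_driver_extra_pages()
are hypothetical names invented for this example and are not the real TTM
or i915 API. The common init only sees a generic page count, while the
driver decides why the pages are needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed page size for illustration only. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for a TT object; not the real struct ttm_tt. */
struct sketch_tt {
	size_t num_pages;
};

/*
 * Common code: it only knows the driver asked for some extra pages,
 * not what they will hold.
 */
static int sketch_tt_init(struct sketch_tt *tt, size_t bo_size,
			  size_t extra_pages)
{
	assert(tt != NULL);
	tt->num_pages = (bo_size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE
			+ extra_pages;
	return 0;
}

/*
 * Driver side: only here is it known that the extra pages hold CCS data
 * (1/256 of the object size), and only for objects that may live in lmem.
 */
static size_t sketch_driver_extra_pages(size_t bo_size, bool can_be_in_lmem)
{
	size_t ccs_bytes;

	if (!can_be_in_lmem)
		return 0;

	ccs_bytes = (bo_size + 255) / 256;
	return (ccs_bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

With this shape, other drivers would simply pass 0 for extra_pages, which
would match the one-line changes to the non-i915 callers in the diffstat.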

Apart from that, looks good to me,
Christian.

Ramalingam C Feb. 7, 2022, 2:47 p.m. UTC | #5
On 2022-02-07 at 15:37:09 +0100, Christian König wrote:
> Am 07.02.22 um 14:53 schrieb Ramalingam C:
> > On 2022-02-07 at 12:41:59 +0100, Christian König wrote:
> > > Am 07.02.22 um 10:37 schrieb Ramalingam C:
> > > > On flat-ccs capable platforms we need to evict and restore the ccs data
> > > > along with the corresponding main memory.
> > > > 
> > > > This ccs data can only be accessed through the BLT engine, using a special
> > > > cmd ( )
> > > > 
> > > > To support the above requirement on flat-ccs enabled i915 platforms, this
> > > > series adds a new param called ccs_pages_needed to ttm_tt_init(),
> > > > to increase the ttm_tt->num_pages of system memory when the obj has the
> > > > possibility of lmem placement.
> > > Well, the question is why isn't the buffer object allocated with the extra
> > > space in the first place?
> > Hi Christian,
> > 
> > On Xe-HP and later devices, we use dedicated compression control state (CCS)
> > stored in local memory for each surface, to support the 3D and media
> > compression formats.
> > 
> > The memory required for the CCS of the entire local memory is 1/256 of the
> > local memory size. So before the kernel boots, the required memory is reserved
> > for the CCS data and a secure register is programmed with the CCS base
> > address.
> > 
> > So when we allocate an object in local memory we don't need to explicitly
> > allocate the space for the ccs data. But when we evict the obj into smem,
> > we need smem space of obj_size + (obj_size/256) to hold the
> > compression-related data along with the obj.
> > 
> > Hence when we create smem for an obj with the possibility of lmem placement,
> > we create it with the extra space.
> 
> That's exactly what I've been missing in the cover letter and/or commit
> messages, comments, etc.
> 
> Overall that sounds like a valid explanation to me, just one comment on the
> code/naming:
> 
> >   int ttm_tt_init(struct ttm_tt *ttm, struct ttm_buffer_object *bo,
> > -		uint32_t page_flags, enum ttm_caching caching)
> > +		uint32_t page_flags, enum ttm_caching caching,
> > +		unsigned long ccs_pages)
> 
> Please don't try to leak any i915-specific stuff into common TTM code.
> 
> For example, use the wording extra_pages instead of ccs_pages here.
> 
> Apart from that, looks good to me,

Thank you. I will address the comments on naming.

Ram
Nirmoy Das Feb. 7, 2022, 2:49 p.m. UTC | #6
Thanks for the clarification, Ram!
