[0/1] Replace shmem memory region and object backend with TTM

Message ID: 20220427113404.401741-1-adrian.larumbe@collabora.com

Adrián Larumbe April 27, 2022, 11:34 a.m. UTC
This patch is an attempt at eliminating the old shmem memory region and GEM
object backend, in favour of a TTM-based one that is able to manage objects
placed on both system and local memory.

Known issues:

Many GPU hangs on machines of GEN <= 5. My assumption is that this has
 something to do with caching, but everywhere across the TTM backend code
 I've tried to handle object creation and getting its pages with the same
 set of caching and coherency properties as in the old shmem backend.

The object passed to shmem_create_from_object is somehow not being flushed
 after being written to in lrc_init_state. It seems that with the new
 backend, when pinning an intel_context, either i915_gem_object_pin_map is
 not creating a kernel mapping with the right caching properties, or
 flushing it afterwards doesn't do anything.

 This leads to a GPU hang because the engine's default state that is read
 with shmem_read doesn't reflect what had previously been written into it
 by vmap'ing the object's pages. The only workaround I could find was
 manually setting the shmem file's pages dirty and putting them back, but
 this looks hacky and wasteful for big BOs.
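
As a rough user-space analogy of the expected behaviour (this is not i915
code), the sketch below stores data through a shared mapping of a file and
then reads it back through the file's read path. Everything in it is an
assumption made for illustration: an ordinary temporary file stands in for
the shmem file, mmap() for the vmap'ed kernel mapping, pread() for
shmem_read, msync() for the flush, and the function name is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 if bytes stored through a shared mapping
 * are visible through pread() on the same file, 1 if the contents differ,
 * and -1 if any setup step fails.
 */
int mapping_write_visible_to_read(void)
{
	char path[] = "/tmp/mapping_demo_XXXXXX";
	const char msg[] = "default engine state";
	char readback[sizeof(msg)] = { 0 };
	long page = sysconf(_SC_PAGESIZE);
	char *map;
	int result = -1;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the file lives on only through the open fd */

	if (page <= 0 || ftruncate(fd, page) != 0)
		goto out_close;

	map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	/* Write through the mapping, as the vmap'ed writer would. */
	memcpy(map, msg, sizeof(msg));

	/* Explicit flush; stands in for the flush after the write. */
	if (msync(map, (size_t)page, MS_SYNC) != 0)
		goto out_unmap;

	/* Read through the file, as the shmem_read-style reader would. */
	if (pread(fd, readback, sizeof(readback), 0) != (ssize_t)sizeof(readback))
		goto out_unmap;

	result = memcmp(readback, msg, sizeof(msg)) == 0 ? 0 : 1;

out_unmap:
	munmap(map, (size_t)page);
out_close:
	close(fd);
	return result;
}
```

A return value of 1 would correspond to the failure described above, where
the reader sees stale contents despite the write and flush.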

Besides all this, I haven't yet implemented the pread callback for the TTM
object backend, as it seems CI's BAT test list doesn't exercise it.

Adrian Larumbe (1):
  drm/i915: Replace shmem memory region and object backend with TTM

 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c   |  12 +-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c     |  32 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h   |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_phys.c     |   5 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c    | 397 +------------------
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c      | 212 +++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.h      |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c |  11 +-
 drivers/gpu/drm/i915/gt/shmem_utils.c        |  64 ++-
 drivers/gpu/drm/i915/intel_memory_region.c   |   7 +-
 10 files changed, 333 insertions(+), 412 deletions(-)

Comments

Tvrtko Ursulin April 29, 2022, 9:14 a.m. UTC | #1
On 27/04/2022 12:34, Adrian Larumbe wrote:
> This patch is an attempt at eliminating the old shmem memory region and GEM
> object backend, in favour of a TTM-based one that is able to manage objects
> placed on both system and local memory.
> 
> Known issues:
> 
> Many GPU hangs on machines of GEN <= 5. My assumption is that this has
>   something to do with caching, but everywhere across the TTM backend code
>   I've tried to handle object creation and getting its pages with the same
>   set of caching and coherency properties as in the old shmem backend.
> 
> The object passed to shmem_create_from_object is somehow not being flushed
>   after being written to in lrc_init_state. It seems that with the new
>   backend, when pinning an intel_context, either i915_gem_object_pin_map is
>   not creating a kernel mapping with the right caching properties, or
>   flushing it afterwards doesn't do anything.
> 
>   This leads to a GPU hang because the engine's default state that is read
>   with shmem_read doesn't reflect what had previously been written into it
>   by vmap'ing the object's pages. The only workaround I could find was
>   manually setting the shmem file's pages dirty and putting them back, but
>   this looks hacky and wasteful for big BOs.

As an aside, it sounds like RFC would be the appropriate classification
for the series.

But anyway, the thing I need to mention: what is the state of THP support
in the TTM backend? If it is not there, it is something we absolutely need
to have in order to avoid serious perf regressions.

In the shmem backend it comes from the i915_gemfs_init call, which your
patch removes, even though it leaves the now-unused file dangling.
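
For anyone wanting to check the shmem THP policy on a test machine, here is
a minimal user-space sketch, not i915 code. It assumes the standard sysfs
knob /sys/kernel/mm/transparent_hugepage/shmem_enabled, which lists the
available policies with the active one in square brackets. Both function
names are hypothetical, and the result only reflects the system-wide knob;
it says nothing about what a private tmpfs mount or the TTM backend
actually does.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy the bracketed token from a sysfs-style line
 * such as "always within_size advise [never] deny force" into out.
 * Returns 0 on success, -1 if no bracketed token exists or out is too small.
 */
int selected_policy(const char *line, char *out, size_t out_len)
{
	const char *start = strchr(line, '[');
	const char *end;
	size_t len;

	if (!start)
		return -1;
	end = strchr(start + 1, ']');
	if (!end)
		return -1;

	len = (size_t)(end - (start + 1));
	if (len + 1 > out_len)
		return -1;

	memcpy(out, start + 1, len);
	out[len] = '\0';
	return 0;
}

/*
 * Hypothetical helper: read the system-wide shmem THP policy.
 * Returns -1 if the sysfs file is missing or cannot be parsed.
 */
int shmem_thp_policy(char *out, size_t out_len)
{
	char line[128];
	int ret = -1;
	FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/shmem_enabled", "r");

	if (!f)
		return -1;
	if (fgets(line, sizeof(line), f))
		ret = selected_policy(line, out, out_len);
	fclose(f);
	return ret;
}
```

Comparing perf numbers with this knob in different states on the old
backend would give a rough idea of how much a backend without huge pages
stands to lose.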

Regards,

Tvrtko
