[v7,0/8] TTM LRU-walk cherry-picks

Message ID 20240705153206.68526-1-thomas.hellstrom@linux.intel.com (mailing list archive)

Thomas Hellström July 5, 2024, 3:31 p.m. UTC
These are cherry-picks from the xe shrinker series here:

https://patchwork.freedesktop.org/series/131815/

extracted to speed up review progress and inclusion.

The series provides a restartable LRU walk and makes it possible to
resume the walk after dropping the lock to evict or swap out.
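
As a rough illustration of the idea only (this is a userspace sketch,
not the TTM code), a walk can be made restartable by linking a
placeholder "hitch" node into the list right after the item being
processed, so the position survives removal of that item while the lock
is dropped. All names below (lru_node, lru_cursor, lru_cursor_next(),
lru_walk_evict_all(), ...) are hypothetical and chosen for this example,
and the lock itself is only indicated in comments:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node types: real items and cursor placeholders. */
enum node_type { NODE_ITEM, NODE_HITCH };

struct lru_node {
	struct lru_node *prev, *next;
	enum node_type type;
	int id;			/* payload, only meaningful for NODE_ITEM */
};

/* Circular list with a sentinel head, in the style of list_head. */
static inline void lru_init(struct lru_node *head)
{
	head->prev = head->next = head;
	head->type = NODE_HITCH;	/* the sentinel is never an item */
	head->id = -1;
}

static inline void lru_insert_after(struct lru_node *pos, struct lru_node *n)
{
	n->prev = pos;
	n->next = pos->next;
	pos->next->prev = n;
	pos->next = n;
}

static inline void lru_add_tail(struct lru_node *head, struct lru_node *n)
{
	lru_insert_after(head->prev, n);
}

/* Unlink a node and leave it self-linked, so a second del is harmless. */
static inline void lru_del(struct lru_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* The cursor owns a hitch node that marks the walk position in the list. */
struct lru_cursor {
	struct lru_node *head;
	struct lru_node hitch;
};

static inline void lru_cursor_init(struct lru_cursor *c, struct lru_node *head)
{
	c->head = head;
	c->hitch.type = NODE_HITCH;
	c->hitch.id = -1;
	lru_insert_after(head, &c->hitch);	/* start at the list front */
}

/*
 * Return the next item after the hitch, skipping other walkers' hitches,
 * and move the hitch to just after that item. Returns NULL at list end.
 * Assumed to be called with the list lock held.
 */
static inline struct lru_node *lru_cursor_next(struct lru_cursor *c)
{
	struct lru_node *n = c->hitch.next;

	if (n == &c->hitch)	/* hitch already unlinked: walk is over */
		return NULL;

	while (n != c->head && n->type != NODE_ITEM)
		n = n->next;

	lru_del(&c->hitch);
	if (n == c->head) {
		lru_add_tail(c->head, &c->hitch);	/* park at the tail */
		return NULL;
	}
	lru_insert_after(n, &c->hitch);
	return n;
}

static inline void lru_cursor_fini(struct lru_cursor *c)
{
	lru_del(&c->hitch);
}

/* Visit up to max items, unlinking each one; returns the number visited. */
int lru_walk_evict_all(struct lru_node *head, int *out, int max)
{
	struct lru_cursor c;
	struct lru_node *n;
	int count = 0;

	lru_cursor_init(&c, head);
	while (count < max && (n = lru_cursor_next(&c)) != NULL) {
		assert(n->type == NODE_ITEM);
		/* A real walker would drop the list lock here ... */
		lru_del(n);	/* safe: the hitch, not n, holds the position */
		out[count++] = n->id;
		/* ... and retake it before asking for the next item. */
	}
	lru_cursor_fini(&c);
	return count;
}
```

Because the position is held by the hitch rather than by a pointer to
the current or next item, either of those may be removed while the lock
is dropped without invalidating the walk.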

Patches 1-4 implement restartable LRU list iteration.

Patch 5 implements an LRU walker + resv locking helper.

Patch 6 moves TTM swapping over to the walker.

Patch 7 moves TTM eviction over to the walker.

Patch 8 balances the struct ttm_resource_cursor interface.

v2:
- Squash obsolete revision history in the patch commit messages.
- Fix a couple of review comments by Christian
- Don't store the mem_type in the TTM managers but in the
  resource cursor.
- Rename introduced TTM *back_up* function names to *backup*
- Add ttm pool recovery fault injection.
- Shrinker xe kunit test
- Various bugfixes

v3:
- Address some review comments from Matthew Brost and Christian König.
- Use the restartable LRU walk for TTM swapping and eviction.
- Provide a POC drm_exec locking implementation for exhaustive
  eviction. (Christian König).

v4:
- Remove the RFC exhaustive eviction part. While the path to exhaustive
  eviction is pretty clear and demonstrated in v3, there is still some
  drm_exec work that needs to be agreed and implemented.
- Add shrinker power management. On some hw we need to wake when shrinking.
- Fix the lru walker helper for -EALREADY errors.
- Add drm/xe: Increase the XE_PL_TT watermark.

v5:
- Also update the TTM kunit tests.
- Handle ghost and zombie objects in the shrinker.
- A couple of compile- and UAF fixes reported by Kernel Build Robot and
  Dan Carpenter.

v6:
- Address review comments from Matthew Brost as detailed in patches
  4/12, 5/12, 6/12, 7/12, 8/12.

v7:
- Drop previous patches 8-12 for now and concentrate on 1-7
- Add a new patch 8 to balance the ttm_resource_cursor interface
  (Christian König)
- Fix various style comments from Christian König in patches 5-7.
- Update Reviewed-by: and Acked-by: tags.

Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>

Thomas Hellström (8):
  drm/ttm: Allow TTM LRU list nodes of different types
  drm/ttm: Slightly clean up LRU list iteration
  drm/ttm: Use LRU hitches
  drm/ttm, drm/amdgpu, drm/xe: Consider hitch moves within bulk sublist
    moves
  drm/ttm: Provide a generic LRU walker helper
  drm/ttm: Use the LRU walker helper for swapping
  drm/ttm: Use the LRU walker for eviction
  drm/ttm: Balance ttm_resource_cursor_init() and
    ttm_resource_cursor_fini()

 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |   4 +
 drivers/gpu/drm/ttm/tests/ttm_bo_test.c       |   6 +-
 drivers/gpu/drm/ttm/tests/ttm_resource_test.c |   2 +-
 drivers/gpu/drm/ttm/ttm_bo.c                  | 461 ++++++++----------
 drivers/gpu/drm/ttm/ttm_bo_util.c             | 153 ++++++
 drivers/gpu/drm/ttm/ttm_device.c              |  29 +-
 drivers/gpu/drm/ttm/ttm_resource.c            | 269 +++++++---
 drivers/gpu/drm/xe/xe_vm.c                    |   4 +
 include/drm/ttm/ttm_bo.h                      |  48 +-
 include/drm/ttm/ttm_resource.h                | 109 ++++-
 10 files changed, 732 insertions(+), 353 deletions(-)