[RFC,0/8] Per client engine busyness

Message ID 20191219180019.25562-1-tvrtko.ursulin@linux.intel.com

Tvrtko Ursulin Dec. 19, 2019, 6 p.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Another re-spin of the per-client engine busyness series. Highlights from this
version:

 * Now tracks GPU time for clients who exit with GPU work left running.
 * No more global toggle - it is now constantly on.

Internally we track time spent on engines for each struct intel_context. This
can serve as a building block for several features from the want list:
smarter scheduler decisions, getrusage(2)-like per-GEM-context functionality
wanted by some customers, a cgroups controller, dynamic SSEU tuning, ...

Externally, in sysfs, we expose time spent on GPU per client and per engine
class.

The sysfs interface enables us to implement a "top-like" tool for GPU tasks.
Here is a "screenshot":
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
intel-gpu-top -  906/ 955 MHz;    0% RC6;  5.30 Watts;      933 irqs/s

      IMC reads:     4414 MiB/s
     IMC writes:     3805 MiB/s

          ENGINE      BUSY                                      MI_SEMA MI_WAIT
     Render/3D/0   93.46% |████████████████████████████████▋  |      0%      0%
       Blitter/0    0.00% |                                   |      0%      0%
         Video/0    0.00% |                                   |      0%      0%
  VideoEnhance/0    0.00% |                                   |      0%      0%

  PID            NAME  Render/3D      Blitter        Video      VideoEnhance
 2733       neverball |██████▌     ||            ||            ||            |
 2047            Xorg |███▊        ||            ||            ||            |
 2737        glxgears |█▍          ||            ||            ||            |
 2128           xfwm4 |            ||            ||            ||            |
 2047            Xorg |            ||            ||            ||            |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Implementation-wise, we add a bunch of files in sysfs, like:

	# cd /sys/class/drm/card0/clients/
	# tree
	.
	├── 7
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	├── 8
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	├── 9
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	└── enable_stats

Files in the 'busy' directories are numbered using the engine class ABI values,
and each contains the accumulated nanoseconds the client has spent on engines
of the respective class.
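
As an illustration of how user space might consume this layout, here is a
minimal Python sketch that walks a clients directory and collects the per-class
counters. The read_clients() helper and the ENGINE_CLASSES table are
illustrative names and not part of this series. The class values 0-3 follow the
I915_ENGINE_CLASS_* uAPI constants; the sketch assumes each 'busy' file holds a
single decimal integer and treats 'pid' and 'name' as opaque strings.

```python
import os

# Engine class ABI values (I915_ENGINE_CLASS_* in the i915 uAPI header).
ENGINE_CLASSES = {0: "Render/3D", 1: "Blitter", 2: "Video", 3: "VideoEnhance"}


def _read(path):
    """Return the stripped contents of a small sysfs-style text file."""
    with open(path) as f:
        return f.read().strip()


def read_clients(root="/sys/class/drm/card0/clients"):
    """Return {client_id: {"pid": str, "name": str, "busy": {class: ns}}}."""
    clients = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        # Only numbered directories are clients; skip files like enable_stats.
        if not (entry.isdigit() and os.path.isdir(path)):
            continue
        try:
            busy_dir = os.path.join(path, "busy")
            # Assumption: one decimal nanosecond counter per engine class file.
            busy = {
                int(cls): int(_read(os.path.join(busy_dir, cls)))
                for cls in os.listdir(busy_dir)
                if cls.isdigit()
            }
            clients[int(entry)] = {
                "pid": _read(os.path.join(path, "pid")),
                "name": _read(os.path.join(path, "name")),
                "busy": busy,
            }
        except (OSError, ValueError):
            # The client may have exited between listdir() and the reads.
            continue
    return clients
```

Calling read_clients() with no argument reads the path shown in the tree above;
passing a different root makes the helper easy to exercise against a mock
directory.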

I will post the corresponding patch to intel_gpu_top for reference as well.
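
Because the counters are cumulative, a top-like tool would presumably sample
them twice and divide the difference by the wall-clock interval. A minimal
Python sketch of that arithmetic follows. busy_percent() is a hypothetical
helper and is not taken from the intel_gpu_top patch.

```python
def busy_percent(prev, curr, elapsed_ns):
    """Per-class utilisation (percent) of one client between two samples.

    prev and curr map engine class -> accumulated busy nanoseconds, and
    elapsed_ns is the wall-clock time between the two samples.
    """
    if elapsed_ns <= 0:
        raise ValueError("elapsed_ns must be positive")
    result = {}
    for cls, now in curr.items():
        # A class with no earlier sample has no meaningful delta yet.
        if cls not in prev:
            continue
        delta = now - prev[cls]
        # Guard against a counter that appears to go backwards.
        result[cls] = 100.0 * max(delta, 0) / elapsed_ns
    return result
```

For example, a client whose render counter advanced by 0.5 s over a 1 s
interval would be reported as 50% busy on that class. Since time is accumulated
per engine class, a value above 100% looks possible on parts with more than one
engine in a class, which is why the sketch does not clamp the result.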

Tvrtko Ursulin (8):
  drm/i915: Switch context id allocation directly to xarray
  drm/i915: Reference count struct drm_i915_file_private
  drm/i915: Expose list of clients in sysfs
  drm/i915: Update client name on context create
  drm/i915: Track per-context engine busyness
  drm/i915: Track all user contexts per client
  drm/i915: Contexts can use struct pid stored in the client
  drm/i915: Expose per-engine client busyness

 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 113 ++++++---
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  16 +-
 .../gpu/drm/i915/gem/selftests/mock_context.c |   3 +-
 drivers/gpu/drm/i915/gt/intel_context.c       |  20 ++
 drivers/gpu/drm/i915/gt/intel_context.h       |  11 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |   9 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  16 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  52 +++-
 drivers/gpu/drm/i915/i915_debugfs.c           |   7 +-
 drivers/gpu/drm/i915/i915_drv.c               |   4 -
 drivers/gpu/drm/i915/i915_drv.h               |  66 ++++-
 drivers/gpu/drm/i915/i915_gem.c               | 236 +++++++++++++++++-
 drivers/gpu/drm/i915/i915_gpu_error.c         |   6 +-
 drivers/gpu/drm/i915/i915_sysfs.c             |   8 +
 14 files changed, 486 insertions(+), 81 deletions(-)