[0/9] Per client engine busyness

Message ID 20200318110146.22339-1-tvrtko.ursulin@linux.intel.com (mailing list archive)

Message

Tvrtko Ursulin March 18, 2020, 11:01 a.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Another re-spin of the per-client engine busyness series. Highlights from this
version:

 * Checkpatch cleanup and bits of review feedback only.

Internally we track time spent on engines for each struct intel_context. This
can serve as a building block for several features from the want list:
smarter scheduler decisions, getrusage(2)-like per-GEM-context functionality
wanted by some customers, a cgroups controller, dynamic SSEU tuning, and so on.

Externally, in sysfs, we expose time spent on GPU per client and per engine
class.

The sysfs interface enables us to implement a "top-like" tool for GPU tasks. A
"screenshot":
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
intel-gpu-top -  906/ 955 MHz;    0% RC6;  5.30 Watts;      933 irqs/s

      IMC reads:     4414 MiB/s
     IMC writes:     3805 MiB/s

          ENGINE      BUSY                                      MI_SEMA MI_WAIT
     Render/3D/0   93.46% |████████████████████████████████▋  |      0%      0%
       Blitter/0    0.00% |                                   |      0%      0%
         Video/0    0.00% |                                   |      0%      0%
  VideoEnhance/0    0.00% |                                   |      0%      0%

  PID            NAME  Render/3D      Blitter        Video      VideoEnhance
 2733       neverball |██████▌     ||            ||            ||            |
 2047            Xorg |███▊        ||            ||            ||            |
 2737        glxgears |█▍          ||            ||            ||            |
 2128           xfwm4 |            ||            ||            ||            |
 2047            Xorg |            ||            ||            ||            |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Implementation-wise, we add a bunch of files in sysfs like:

	# cd /sys/class/drm/card0/clients/
	# tree
	.
	├── 7
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	├── 8
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	└── 9
	    ├── busy
	    │   ├── 0
	    │   ├── 1
	    │   ├── 2
	    │   └── 3
	    ├── name
	    └── pid

Files in the 'busy' directories are named after the engine class ABI values and
contain the accumulated nanoseconds each client has spent on engines of the
respective class.

It is still an RFC since it is missing dedicated test cases to ensure things
really work as advertised.

Tvrtko Ursulin (9):
  drm/i915: Update client name on context create
  drm/i915: Make GEM contexts track DRM clients
  drm/i915: Use explicit flag to mark unreachable intel_context
  drm/i915: Track runtime spent in unreachable intel_contexts
  drm/i915: Track runtime spent in closed GEM contexts
  drm/i915: Track all user contexts per client
  drm/i915: Expose per-engine client busyness
  drm/i915: Track context current active time
  drm/i915: Prefer software tracked context busyness

 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  63 +++-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  21 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
 drivers/gpu/drm/i915/gt/intel_context.c       |  18 +-
 drivers/gpu/drm/i915/gt/intel_context.h       |   6 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |  25 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  55 +++-
 drivers/gpu/drm/i915/gt/selftest_lrc.c        |  10 +-
 drivers/gpu/drm/i915/i915_debugfs.c           |  29 +-
 drivers/gpu/drm/i915/i915_drm_client.c        | 274 +++++++++++++++++-
 drivers/gpu/drm/i915/i915_drm_client.h        |  33 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  25 +-
 12 files changed, 473 insertions(+), 88 deletions(-)

Comments

Tvrtko Ursulin March 18, 2020, 12:16 p.m. UTC | #1
On 18/03/2020 11:01, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Another re-spin of the per-client engine busyness series. Highlights from this
> version:
> 

Broken version with one patch missing, apologies for the spam.

Regards,

Tvrtko

> [snip]