Message ID: 20181003120406.6784-1-tvrtko.ursulin@linux.intel.com (mailing list archive)
Series: 21st century intel_gpu_top
Quoting Tvrtko Ursulin (2018-10-03 13:03:53)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> A collection of patches which I have been sending before, sometimes together and
> sometimes separately, which enable intel_gpu_top to report queue depths (also
> translates as overall GPU load average) and per DRM client per engine busyness.

Queued falls apart with v.engine and I don't have a good suggestion for
a remedy. :(
-Chris
On 03/10/2018 13:36, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-10-03 13:03:53)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> A collection of patches which I have been sending before, sometimes together and
>> sometimes separately, which enable intel_gpu_top to report queue depths (also
>> translates as overall GPU load average) and per DRM client per engine busyness.
>
> Queued falls apart with v.engine and I don't have a good suggestion for
> a remedy. :(

Indeed, I forgot about it. I have now even found a few months old branch with
queued and runnable removed already.

I think we also talked about the option of exposing aggregate engine class
counters, but that also has problems. We could go global and not expose this
per engine, but that wouldn't make <gen11 users happy.

Regards,

Tvrtko
On 03/10/2018 13:03, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> A collection of patches which I have been sending before, sometimes together and
> sometimes separately, which enable intel_gpu_top to report queue depths (also
> translates as overall GPU load average) and per DRM client per engine busyness.
>
> This enables a fancy intel_gpu_top which looks like this (a picture is worth a
> thousand words):
>
> intel-gpu-top - load avg 3.30, 1.51, 0.08; 949/ 949 MHz; 0% RC6; 14.66 Watts; 3605 irqs/s
>
>       IMC reads:     4651 MiB/s
>      IMC writes:       25 MiB/s
>
>           ENGINE      BUSY                                                      Q   r   R  MI_SEMA MI_WAIT
>      Render/3D/0    61.51% |█████████████████████████████████████████████▌   |  3   0   1      0%      0%
>        Blitter/0     0.00% |                                                  |  0   0   0      0%      0%
>          Video/0    60.86% |█████████████████████████████████████████████    |  1   0   1      0%      0%
>          Video/1    59.04% |███████████████████████████████████████████▋     |  1   0   1      0%      0%
>   VideoEnhance/0     0.00% |                                                  |  0   0   0      0%      0%
>
>   PID            NAME  Render/3D/0      Blitter/0        Video/0          Video/1      VideoEnhance/0
> 23373        gem_wsim |█████▎         ||               ||████████▍      ||█████▎        ||              |
> 23374        gem_wsim |███▉           ||               ||██▏            ||███           ||              |
> 23375        gem_wsim |███            ||               ||█▍             ||███▌          ||              |
>
> All of this work actually came to be via different feature requests not directly
> asking for this. Things like engine queue depth query and per context engine
> busyness ioctl. Those bits need userspace which is not there yet and so I have
> removed them from this posting to avoid confusion.
>
> What remains is a set of patches which add some PMU counters and a completely
> new sysfs interface to enable intel_gpu_top to read the per client stats.
>
> IGT counterpart will be sent separately.

FWIW at least one more person thinks this would be a nice to have feature -
https://twitter.com/IntelGraphics/status/1047991913972826112.

But it sure feels weird to cross-link twitter to intel-gfx! Sign of times.. :)

Regards,

Tvrtko

>
> Tvrtko Ursulin (13):
>   drm/i915/pmu: Fix enable count array size and bounds checking
>   drm/i915: Keep a count of requests waiting for a slot on GPU
>   drm/i915: Keep a count of requests submitted from userspace
>   drm/i915/pmu: Add queued counter
>   drm/i915/pmu: Add runnable counter
>   drm/i915/pmu: Add running counter
>   drm/i915: Store engine backpointer in the intel_context
>   drm/i915: Move intel_engine_context_in/out into intel_lrc.c
>   drm/i915: Track per-context engine busyness
>   drm/i915: Expose list of clients in sysfs
>   drm/i915: Update client name on context create
>   drm/i915: Expose per-engine client busyness
>   drm/i915: Add sysfs toggle to enable per-client engine stats
>
>  drivers/gpu/drm/i915/i915_drv.h         |  39 +++++
>  drivers/gpu/drm/i915/i915_gem.c         | 197 +++++++++++++++++++++++-
>  drivers/gpu/drm/i915/i915_gem_context.c |  18 ++-
>  drivers/gpu/drm/i915/i915_gem_context.h |  18 +++
>  drivers/gpu/drm/i915/i915_pmu.c         | 103 +++++++++++--
>  drivers/gpu/drm/i915/i915_request.c     |  10 ++
>  drivers/gpu/drm/i915/i915_sysfs.c       |  81 ++++++++++
>  drivers/gpu/drm/i915/intel_engine_cs.c  |  33 +++-
>  drivers/gpu/drm/i915/intel_lrc.c        | 109 ++++++++++++-
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  76 +++------
>  include/uapi/drm/i915_drm.h             |  19 ++-
>  11 files changed, 614 insertions(+), 89 deletions(-)
>
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A collection of patches which I have been sending before, sometimes together and
sometimes separately, which enable intel_gpu_top to report queue depths (also
translates as overall GPU load average) and per DRM client per engine busyness.

This enables a fancy intel_gpu_top which looks like this (a picture is worth a
thousand words):

intel-gpu-top - load avg 3.30, 1.51, 0.08; 949/ 949 MHz; 0% RC6; 14.66 Watts; 3605 irqs/s

      IMC reads:     4651 MiB/s
     IMC writes:       25 MiB/s

          ENGINE      BUSY                                                      Q   r   R  MI_SEMA MI_WAIT
     Render/3D/0    61.51% |█████████████████████████████████████████████▌   |  3   0   1      0%      0%
       Blitter/0     0.00% |                                                  |  0   0   0      0%      0%
         Video/0    60.86% |█████████████████████████████████████████████    |  1   0   1      0%      0%
         Video/1    59.04% |███████████████████████████████████████████▋     |  1   0   1      0%      0%
  VideoEnhance/0     0.00% |                                                  |  0   0   0      0%      0%

  PID            NAME  Render/3D/0      Blitter/0        Video/0          Video/1      VideoEnhance/0
23373        gem_wsim |█████▎         ||               ||████████▍      ||█████▎        ||              |
23374        gem_wsim |███▉           ||               ||██▏            ||███           ||              |
23375        gem_wsim |███            ||               ||█▍             ||███▌          ||              |

All of this work actually came to be via different feature requests not directly
asking for this. Things like engine queue depth query and per context engine
busyness ioctl. Those bits need userspace which is not there yet and so I have
removed them from this posting to avoid confusion.

What remains is a set of patches which add some PMU counters and a completely
new sysfs interface to enable intel_gpu_top to read the per client stats.

IGT counterpart will be sent separately.

Tvrtko Ursulin (13):
  drm/i915/pmu: Fix enable count array size and bounds checking
  drm/i915: Keep a count of requests waiting for a slot on GPU
  drm/i915: Keep a count of requests submitted from userspace
  drm/i915/pmu: Add queued counter
  drm/i915/pmu: Add runnable counter
  drm/i915/pmu: Add running counter
  drm/i915: Store engine backpointer in the intel_context
  drm/i915: Move intel_engine_context_in/out into intel_lrc.c
  drm/i915: Track per-context engine busyness
  drm/i915: Expose list of clients in sysfs
  drm/i915: Update client name on context create
  drm/i915: Expose per-engine client busyness
  drm/i915: Add sysfs toggle to enable per-client engine stats

 drivers/gpu/drm/i915/i915_drv.h         |  39 +++++
 drivers/gpu/drm/i915/i915_gem.c         | 197 +++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_context.c |  18 ++-
 drivers/gpu/drm/i915/i915_gem_context.h |  18 +++
 drivers/gpu/drm/i915/i915_pmu.c         | 103 +++++++++++--
 drivers/gpu/drm/i915/i915_request.c     |  10 ++
 drivers/gpu/drm/i915/i915_sysfs.c       |  81 ++++++++++
 drivers/gpu/drm/i915/intel_engine_cs.c  |  33 +++-
 drivers/gpu/drm/i915/intel_lrc.c        | 109 ++++++++++++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  76 +++------
 include/uapi/drm/i915_drm.h             |  19 ++-
 11 files changed, 614 insertions(+), 89 deletions(-)
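As a rough illustration of how the proposed per-client sysfs interface could be consumed, the sketch below polls per-client busyness the way a tool such as intel_gpu_top might. The directory layout (/sys/class/drm/card0/clients/<id>/name and .../busy/<engine class>), the engine class numbering, and the assumption that busyness is exported as a monotonically increasing nanosecond counter are illustrative guesses based on the cover letter, not the exact ABI added by the patches. The queued/runnable/running counters, by contrast, are PMU counters and would presumably be read through the existing i915 perf event interface rather than through sysfs.

/*
 * Minimal sketch: sample per-client render busyness from a sysfs layout
 * like the one this series proposes. The paths below, e.g.
 * /sys/class/drm/card0/clients/<id>/name and .../busy/0, are illustrative
 * assumptions, not the exact ABI of the patches. Busyness is assumed to be
 * a monotonically increasing count of nanoseconds of GPU time, so the
 * utilisation is the delta between two samples over the sampling interval.
 */
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CLIENTS_DIR "/sys/class/drm/card0/clients" /* assumed location */

static int read_str(const char *path, char *buf, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0'; /* strip trailing newline */
	return 0;
}

static unsigned long long read_u64(const char *path)
{
	char buf[32];

	if (read_str(path, buf, sizeof(buf)))
		return 0;
	return strtoull(buf, NULL, 10);
}

int main(void)
{
	DIR *dir = opendir(CLIENTS_DIR);
	struct dirent *d;

	if (!dir) {
		perror(CLIENTS_DIR);
		return EXIT_FAILURE;
	}

	while ((d = readdir(dir))) {
		char path[512], name[64];
		unsigned long long t0, t1;

		if (d->d_name[0] == '.')
			continue;

		snprintf(path, sizeof(path), "%s/%s/name", CLIENTS_DIR, d->d_name);
		if (read_str(path, name, sizeof(name)))
			continue;

		/* Engine class 0 == render; sample twice, one second apart. */
		snprintf(path, sizeof(path), "%s/%s/busy/0", CLIENTS_DIR, d->d_name);
		t0 = read_u64(path);
		sleep(1);
		t1 = read_u64(path);

		printf("%8s %16s render: %5.1f%%\n", d->d_name, name,
		       (t1 - t0) / 1e9 * 100.0);
	}

	closedir(dir);
	return EXIT_SUCCESS;
}

A real monitor would rescan the clients directory once per refresh interval and turn the counter deltas into the per-PID bars shown in the example output above, rather than sleeping once per client as this sketch does for brevity.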