[v4,0/8] drm/xe: Per client usage

Message-ID: 20240515214258.59209-1-lucas.demarchi@intel.com

Lucas De Marchi May 15, 2024, 9:42 p.m. UTC
v4 of https://lore.kernel.org/all/20240507224510.442971-1-lucas.demarchi@intel.com

Add per-client usage statistics to xe. This ports xe to use the common
method in drm to export the usage to userspace per client (where 1
client == 1 drm fd open).

However, instead of using the current format measured in nsec, this
creates a new one. The intention is to avoid mixing the GPU clock domain
with the CPU clock. This also covers a few more use cases without
extra complications.

This version was tested on an ADL-P, and gputop was also checked with
i915 to make sure it did not regress. The last patch also contains the
documentation for the new key and sample output, as requested in v1.

The pre-existing drm-cycles-<keystr> is used as is, which allows gputop
to work with xe.
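
As a rough illustration of how a userspace tool could consume the two
keys, the sketch below parses fdinfo-style lines and derives a busy
percentage from two samples using only GPU-clock deltas. The struct and
function names are hypothetical and are not taken from gputop or from
this series; it also assumes both counters are 64-bit values.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One fdinfo reading for a single engine class (hypothetical struct). */
struct cycles_sample {
	uint64_t cycles;       /* value of drm-cycles-<keystr> */
	uint64_t total_cycles; /* value of drm-total-cycles-<keystr> */
};

/*
 * Parse one "key:<whitespace>value" line. Returns true and stores the
 * value only if the line's key is exactly `key`.
 */
static bool parse_fdinfo_u64(const char *line, const char *key, uint64_t *val)
{
	size_t len = strlen(key);

	if (strncmp(line, key, len) != 0 || line[len] != ':')
		return false;

	/* The conversion skips the whitespace that follows the colon. */
	return sscanf(line + len + 1, "%" SCNu64, val) == 1;
}

/*
 * Busy percentage between two samples. Both deltas are in the GPU clock
 * domain, so the CPU time elapsed between the reads does not enter the
 * calculation. Returns 0 when the total counter did not advance.
 */
static double engine_busy_pct(const struct cycles_sample *prev,
			      const struct cycles_sample *cur)
{
	uint64_t busy, total;

	assert(prev && cur);

	busy = cur->cycles - prev->cycles;
	total = cur->total_cycles - prev->total_cycles;
	if (total == 0)
		return 0.0;

	return 100.0 * (double)busy / (double)total;
}
```

A caller would read the client's fdinfo twice, fill one cycles_sample
per read for the engine class of interest, and pass both to
engine_busy_pct().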

The last patch still has some open discussion from v2, so we may need
to hold it a little longer.

v2:
  - Create a new drm-total-cycles instead of re-using drm-engine with a
    different unit
  - Add documentation for the new interface and clarify usage of
    xe_lrc_update_timestamp()

v3:
  - Fix bugs in "drm/xe: Add helper to accumulate exec queue runtime" -
    see commit message
  - Reorder commits so the ones that are useful in other patch series
    come first

v4:
  - Fix some comments and documentation
  - Add 2 patches so we cache on the gt the mask of engines visible to
    userspace and the per-class capacity. Previously we were doing this
    during the query, but besides not being very efficient, as pointed
    out by Tvrtko, we were also not correctly handling reserved engines
    and engines "hidden" by e.g. ccs_mode. So move that part to be
    executed on init and when changing the available engines.
  - Simplify the fdinfo output loop now that the information is cached
    on the gt. This also moves the read of the gpu timestamp out of the
    loop, as suggested by Tvrtko, and uses the helpers implemented in
    the new patches.
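
The caching idea from the v4 notes can be pictured with the standalone
sketch below: the visible-engine mask and per-class capacity are
recomputed only when availability changes, so a query just reads the
cached values. This is not the code from the patches; every type, field
and function name here is hypothetical, and the sketch assumes at most
64 engines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical engine classes and descriptor, for illustration only. */
enum engine_class { CLASS_RENDER, CLASS_COPY, CLASS_COMPUTE, CLASS_MAX };

struct engine {
	enum engine_class cls;
	bool hidden; /* reserved, or disabled by a mode setting */
};

/* Cached summary of what userspace is allowed to see. */
struct user_engines {
	uint64_t mask;                    /* bit i set: engines[i] is visible */
	unsigned int capacity[CLASS_MAX]; /* visible engines per class */
};

/* Rebuild the cache; meant to run on init and on availability changes. */
static void user_engines_update(struct user_engines *cache,
				const struct engine *engines,
				unsigned int count)
{
	unsigned int i;

	assert(cache && engines && count <= 64);

	memset(cache, 0, sizeof(*cache));
	for (i = 0; i < count; i++) {
		if (engines[i].hidden)
			continue;
		cache->mask |= UINT64_C(1) << i;
		cache->capacity[engines[i].cls]++;
	}
}
```

With this shape, the output path only iterates over classes with a
non-zero capacity and never has to re-derive which engines are hidden.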

Lucas De Marchi (6):
  drm/xe: Promote xe_hw_engine_class_to_str()
  drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion
  drm/xe: Add helper to capture engine timestamp
  drm/xe: Cache data about user-visible engines
  drm/xe: Add helper to return any available hw engine
  drm/xe/client: Print runtime to fdinfo

Umesh Nerlige Ramappa (2):
  drm/xe/lrc: Add helper to capture context timestamp
  drm/xe: Add helper to accumulate exec queue runtime

 Documentation/gpu/drm-usage-stats.rst         |  21 ++-
 Documentation/gpu/xe/index.rst                |   1 +
 Documentation/gpu/xe/xe-drm-usage-stats.rst   |  10 ++
 drivers/gpu/drm/xe/regs/xe_lrc_layout.h       |   1 +
 drivers/gpu/drm/xe/xe_device_types.h          |   3 +
 drivers/gpu/drm/xe/xe_drm_client.c            | 121 +++++++++++++++++-
 drivers/gpu/drm/xe/xe_exec_queue.c            |  37 ++++++
 drivers/gpu/drm/xe/xe_exec_queue.h            |   1 +
 drivers/gpu/drm/xe/xe_execlist.c              |   1 +
 drivers/gpu/drm/xe/xe_gt.c                    |  34 +++++
 drivers/gpu/drm/xe/xe_gt.h                    |  20 +++
 drivers/gpu/drm/xe/xe_gt_ccs_mode.c           |   1 +
 drivers/gpu/drm/xe/xe_gt_types.h              |  21 ++-
 drivers/gpu/drm/xe/xe_guc_submit.c            |   2 +
 drivers/gpu/drm/xe/xe_hw_engine.c             |  27 ++++
 drivers/gpu/drm/xe/xe_hw_engine.h             |   3 +
 drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c |  18 ---
 drivers/gpu/drm/xe/xe_lrc.c                   |  12 ++
 drivers/gpu/drm/xe/xe_lrc.h                   |  14 ++
 drivers/gpu/drm/xe/xe_lrc_types.h             |   3 +
 20 files changed, 329 insertions(+), 22 deletions(-)
 create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst