[v4,0/1] Delay disabling GuC scheduling of an idle context

Message ID 20221006060224.691844-1-alan.previn.teres.alexis@intel.com (mailing list archive)

Alan Previn Oct. 6, 2022, 6:02 a.m. UTC
This series adds a delay before disabling scheduling of a guc-context
once that context has become idle, to avoid the costly re-registration
that may otherwise be needed immediately afterwards. The patch's commit
message explains this in detail.
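
To make the hysteresis idea concrete, here is a minimal userspace sketch.
It is not the driver code: every name in it (model_ctx, model_pin,
model_unpin, model_tick, model_count_enables) is hypothetical, time is a
simulated millisecond counter, and the whole "re-register and re-enable
scheduling" cost is collapsed into a single counter. It only illustrates
that arming a delayed disable, and cancelling it when the context is
pinned again, avoids repeated enable/disable cycles.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; none of these names exist in the i915 driver. */
struct model_ctx {
	int pin_count;             /* current users of the context */
	bool sched_enabled;        /* scheduling currently enabled? */
	bool disable_pending;      /* delayed disable armed? */
	unsigned long deadline_ms; /* when the armed disable fires */
	unsigned int nr_enables;   /* number of costly (re-)enables */
};

/* First pin cancels any armed disable; re-enable only if really disabled. */
static void model_pin(struct model_ctx *c)
{
	if (c->pin_count++ == 0) {
		c->disable_pending = false;
		if (!c->sched_enabled) {
			c->sched_enabled = true;
			c->nr_enables++;
		}
	}
}

/* Last unpin either disables at once (delay 0) or arms a delayed disable. */
static void model_unpin(struct model_ctx *c, unsigned long now_ms,
			unsigned long delay_ms)
{
	assert(c->pin_count > 0);
	if (--c->pin_count)
		return;

	if (delay_ms == 0) {
		c->sched_enabled = false;
	} else {
		c->disable_pending = true;
		c->deadline_ms = now_ms + delay_ms;
	}
}

/* Stand-in for the delayed worker: disable once the deadline has passed. */
static void model_tick(struct model_ctx *c, unsigned long now_ms)
{
	if (c->disable_pending && now_ms >= c->deadline_ms) {
		c->disable_pending = false;
		c->sched_enabled = false;
	}
}

/* Three 1 ms bursts of work, started 10 ms apart; returns enable count. */
unsigned int model_count_enables(unsigned long delay_ms)
{
	struct model_ctx c = {0};
	unsigned long t;

	for (t = 0; t < 30; t += 10) {
		model_tick(&c, t);
		model_pin(&c);
		model_unpin(&c, t + 1, delay_ms);
	}
	return c.nr_enables;
}
```

In this model, a delay shorter than the idle gap between bursts behaves
like no delay at all, while a delay longer than the gap keeps scheduling
enabled across all three bursts. The burst timings are arbitrary values
chosen for the illustration.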

The original version of this series was posted by Matthew Brost back in
Oct 2021 (https://patchwork.freedesktop.org/series/96167/). However, no
real-world workload performance data was available until recently to
demonstrate its intended results.

This series is a redo of a prior patch that was reverted:
2ccddb758079d0c62ce03e69ee8929bb212f7799 drm/i915/guc: Add delay to
disable scheduling after pin count goes to zero

The cause of the revert is now fixed here (it was not caught earlier due
to issues with CI reporting at that time). Two additional changes are
included in this redo, which restarts as a new series with new revision
numbering:
 - Resolve the race between guc_request_alloc and guc_context_close in
   completing the delayed disable-guc-scheduling worker.
 - Make the GT reset flow properly cancel the delayed disable-sched
   worker and close contexts that were still awaiting that delayed task.

Changes from prior revs:
   v3: - Removed the now unused context_has_committed_requests (John
         Harrison)
       - Renamed the worker from sched_disable_delay to
         sched_disable_delay_work
   v2: - Removed unused function prototype.
   v1: - Changed the added guc's sched_disable_foo debugfs tunable knobs
         to unsigned int type (Tvrtko Ursulin)
       - Added more comments in the race-resolution code change
         between guc_request_alloc and context-close (Tvrtko Ursulin)
       - Increased the timeout on the race-resolution code change
         between guc_request_alloc and context-close (Daniele Ceraolo Spurio)
       - As part of the guc reset preparation flow, instead of creating a new
         function (taking a whole round of locks) to deal with the contexts
         that are in the midst of awaiting the delayed-disable-sched worker,
         move that code inside scrub_guc_desc_for_outstanding_g2h before
         we check for 'pending_disable' contexts.

Matthew Brost (1):
  drm/i915/guc: Delay disabling guc_id scheduling for better hysteresis

 drivers/gpu/drm/i915/gem/i915_gem_context.c   |   2 +-
 drivers/gpu/drm/i915/gt/intel_context.h       |   8 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |   7 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h        |  16 ++
 .../gpu/drm/i915/gt/uc/intel_guc_debugfs.c    |  61 +++++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 210 +++++++++++++++---
 drivers/gpu/drm/i915/i915_selftest.h          |   2 +
 7 files changed, 274 insertions(+), 32 deletions(-)