diff mbox series

drm/i915: Fix context runtime accounting

Message ID 20230320151423.1708436-1-tvrtko.ursulin@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series drm/i915: Fix context runtime accounting | expand

Commit Message

Tvrtko Ursulin March 20, 2023, 3:14 p.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

When considering whether to mark one context as stopped and another as
started we need to look at whether the previous and new _contexts_ are
different and not just requests. Otherwise the software tracked context
start time was incorrectly updated to the most recent lite-restore time-
stamp, which was in some cases resulting in active time going backward,
until the context switch (typically the hearbeat pulse) would synchronise
with the hardware tracked context runtime. Easiest use case to observe
this behaviour was with a full screen clients with close to 100% engine
load.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: bb6287cb1886 ("drm/i915: Track context current active time")
Cc: <stable@vger.kernel.org> # v5.19+
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

Comments

Matthew Auld March 31, 2023, 6:25 a.m. UTC | #1
On Mon, 20 Mar 2023 at 15:14, Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> When considering whether to mark one context as stopped and another as
> started we need to look at whether the previous and new _contexts_ are
> different and not just requests. Otherwise the software tracked context
> start time was incorrectly updated to the most recent lite-restore time-
> stamp, which was in some cases resulting in active time going backward,
> until the context switch (typically the hearbeat pulse) would synchronise
> with the hardware tracked context runtime. Easiest use case to observe
> this behaviour was with a full screen clients with close to 100% engine
> load.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Fixes: bb6287cb1886 ("drm/i915: Track context current active time")
> Cc: <stable@vger.kernel.org> # v5.19+

Seems reasonable to me, fwiw,
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Tvrtko Ursulin March 31, 2023, 8:35 a.m. UTC | #2
On 31/03/2023 07:25, Matthew Auld wrote:
> On Mon, 20 Mar 2023 at 15:14, Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> When considering whether to mark one context as stopped and another as
>> started we need to look at whether the previous and new _contexts_ are
>> different and not just requests. Otherwise the software tracked context
>> start time was incorrectly updated to the most recent lite-restore time-
>> stamp, which was in some cases resulting in active time going backward,
>> until the context switch (typically the hearbeat pulse) would synchronise
>> with the hardware tracked context runtime. Easiest use case to observe
>> this behaviour was with a full screen clients with close to 100% engine
>> load.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Fixes: bb6287cb1886 ("drm/i915: Track context current active time")
>> Cc: <stable@vger.kernel.org> # v5.19+
> 
> Seems reasonable to me, fwiw,
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>

Thanks, pushed!

Regards,

Tvrtko
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 1bbe6708d0a7..750326434677 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2018,6 +2018,8 @@  process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
 	 * inspecting the queue to see if we need to resumbit.
 	 */
 	if (*prev != *execlists->active) { /* elide lite-restores */
+		struct intel_context *prev_ce = NULL, *active_ce = NULL;
+
 		/*
 		 * Note the inherent discrepancy between the HW runtime,
 		 * recorded as part of the context switch, and the CPU
@@ -2029,9 +2031,15 @@  process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
 		 * and correct overselves later when updating from HW.
 		 */
 		if (*prev)
-			lrc_runtime_stop((*prev)->context);
+			prev_ce = (*prev)->context;
 		if (*execlists->active)
-			lrc_runtime_start((*execlists->active)->context);
+			active_ce = (*execlists->active)->context;
+		if (prev_ce != active_ce) {
+			if (prev_ce)
+				lrc_runtime_stop(prev_ce);
+			if (active_ce)
+				lrc_runtime_start(active_ce);
+		}
 		new_timeslice(execlists);
 	}