
i915/pmu: Fix zero delta busyness issue

Message ID 20250121222557.721355-1-umesh.nerlige.ramappa@intel.com (mailing list archive)
State New
Series i915/pmu: Fix zero delta busyness issue

Commit Message

Umesh Nerlige Ramappa Jan. 21, 2025, 10:25 p.m. UTC
When running igt@gem_exec_balancer@individual for multiple iterations,
it is seen that the delta busyness returned by PMU is 0. The issue stems
from a combination of two implementation-specific details:

1) gt_park is throttling __update_guc_busyness_stats() so that it does
not hog PCI bandwidth for some use cases. (Ref: 59bcdb564b3ba)

2) busyness implementation always returns monotonically increasing
counters. (Ref: cf907f6d29421)

If an application queries an engine while it is active,
engine->stats.guc.running is set to true. Following that, if all PM
wakerefs are released, then the gt is parked. At this time the throttling
of __update_guc_busyness_stats() may result in a missed update to the
running state of the engine (due to (1) above). This means subsequent
calls to guc_engine_busyness() will think that the engine is still
running and will keep updating the cached counter (stats->total).
This results in an inflated cached counter.

Later, when the application runs a workload and queries busyness, we
return the cached value, since it is larger than the actual value (due to
(2) above).

All subsequent queries will return the same large (inflated) value, so
the application sees a delta busyness of zero.

To fix the issue:
- reset the running state of engines in busyness_park outside of the
  throttling code.
- when busyness is queried and a PM wakeref is not active, do not
  calculate active engine time, since we do not expect engines to be
  active without a wakeref.

Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13366
Fixes: cf907f6d2942 ("i915/guc: Ensure busyness counter increases motonically")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c  | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

Comments

Rodrigo Vivi Jan. 21, 2025, 11:10 p.m. UTC | #1
On Tue, Jan 21, 2025 at 02:25:57PM -0800, Umesh Nerlige Ramappa wrote:

drm/i915/pmu as tag please...

> When running igt@gem_exec_balancer@individual for multiple iterations,
> it is seen that the delta busyness returned by PMU is 0. The issue stems
> from a combination of 2 implementation specific details:
> 
> 1) gt_park is throttling __update_guc_busyness_stats() so that it does
> not hog PCI bandwidth for some use cases. (Ref: 59bcdb564b3ba)
> 
> 2) busyness implementation always returns monotonically increasing
> counters. (Ref: cf907f6d29421)
> 
> If an application queried an engine while it was active,
> engine->stats.guc.running is set to true. Following that, if all PM
> wakeref's are released, then gt is parked. At this time the throttling
> of __update_guc_busyness_stats() may result in a missed update to the
> running state of the engine (due to (1) above). This means subsequent
> calls to guc_engine_busyness() will think that the engine is still
> running and they will keep updating the cached counter (stats->total).
> This results in an inflated cached counter.
> 
> Later when the application runs a workload and queries for busyness, we
> return the cached value since it is larger than the actual value (due to
> (2) above)
> 
> All subsequent queries will return the same large (inflated) value, so
> the application sees a delta busyness of zero.
> 
> In order to fix the issue,
> - reset the running state of engines in busyness_park outside of the
>   throttling code.
> - when busyness is queried and PM wakeref is not active, do not
>   calculate active engine time since we do not expect engines to be
>   active without a wakeref.
> 
> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13366
> Fixes: cf907f6d2942 ("i915/guc: Ensure busyness counter increases motonically")
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> ---
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c  | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 12f1ba7ca9c1..b7a831e1fc85 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -1372,7 +1372,7 @@ static ktime_t guc_engine_busyness(struct intel_engine_cs *engine, ktime_t *now)
>  	}
>  
>  	total = intel_gt_clock_interval_to_ns(gt, stats->total_gt_clks);
> -	if (stats->running) {
> +	if (wakeref && stats->running) {

do you really need to check this wakeref if you are now setting running to
false earlier?

And if we do, what about moving this entire block to inside the
existing if (wakeref) ?!

>  		u64 clk = guc->timestamp.gt_stamp - stats->start_gt_clk;
>  
>  		total += intel_gt_clock_interval_to_ns(gt, clk);
> @@ -1469,6 +1469,19 @@ static void __reset_guc_busyness_stats(struct intel_guc *guc)
>  	spin_unlock_irqrestore(&guc->timestamp.lock, flags);
>  }
>  
> +static void __update_guc_busyness_running_state(struct intel_guc *guc)
> +{
> +	struct intel_gt *gt = guc_to_gt(guc);
> +	struct intel_engine_cs *engine;
> +	enum intel_engine_id id;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&guc->timestamp.lock, flags);
> +	for_each_engine(engine, gt, id)
> +		engine->stats.guc.running = false;
> +	spin_unlock_irqrestore(&guc->timestamp.lock, flags);
> +}
> +
>  static void __update_guc_busyness_stats(struct intel_guc *guc)
>  {
>  	struct intel_gt *gt = guc_to_gt(guc);
> @@ -1619,6 +1632,9 @@ void intel_guc_busyness_park(struct intel_gt *gt)
>  	if (!guc_submission_initialized(guc))
>  		return;
>  
> +	/* Assume no engines are running and set running state to false */
> +	__update_guc_busyness_running_state(guc);
> +

Why not to move the entire __reset_guc_busyness_stats earlier then?

>  	/*
>  	 * There is a race with suspend flow where the worker runs after suspend
>  	 * and causes an unclaimed register access warning. Cancel the worker
> -- 
> 2.34.1
>
Umesh Nerlige Ramappa Jan. 21, 2025, 11:58 p.m. UTC | #2
On Tue, Jan 21, 2025 at 06:10:34PM -0500, Rodrigo Vivi wrote:
>On Tue, Jan 21, 2025 at 02:25:57PM -0800, Umesh Nerlige Ramappa wrote:
>
>drm/i915/pmu as tag please...

will do

>
>> When running igt@gem_exec_balancer@individual for multiple iterations,
>> it is seen that the delta busyness returned by PMU is 0. The issue stems
>> from a combination of 2 implementation specific details:
>>
>> 1) gt_park is throttling __update_guc_busyness_stats() so that it does
>> not hog PCI bandwidth for some use cases. (Ref: 59bcdb564b3ba)
>>
>> 2) busyness implementation always returns monotonically increasing
>> counters. (Ref: cf907f6d29421)
>>
>> If an application queried an engine while it was active,
>> engine->stats.guc.running is set to true. Following that, if all PM
>> wakeref's are released, then gt is parked. At this time the throttling
>> of __update_guc_busyness_stats() may result in a missed update to the
>> running state of the engine (due to (1) above). This means subsequent
>> calls to guc_engine_busyness() will think that the engine is still
>> running and they will keep updating the cached counter (stats->total).
>> This results in an inflated cached counter.
>>
>> Later when the application runs a workload and queries for busyness, we
>> return the cached value since it is larger than the actual value (due to
>> (2) above)
>>
>> All subsequent queries will return the same large (inflated) value, so
>> the application sees a delta busyness of zero.
>>
>> In order to fix the issue,
>> - reset the running state of engines in busyness_park outside of the
>>   throttling code.
>> - when busyness is queried and PM wakeref is not active, do not
>>   calculate active engine time since we do not expect engines to be
>>   active without a wakeref.
>>
>> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13366
>> Fixes: cf907f6d2942 ("i915/guc: Ensure busyness counter increases motonically")
>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>> ---
>>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c  | 18 +++++++++++++++++-
>>  1 file changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>> index 12f1ba7ca9c1..b7a831e1fc85 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
>> @@ -1372,7 +1372,7 @@ static ktime_t guc_engine_busyness(struct intel_engine_cs *engine, ktime_t *now)
>>  	}
>>
>>  	total = intel_gt_clock_interval_to_ns(gt, stats->total_gt_clks);
>> -	if (stats->running) {
>> +	if (wakeref && stats->running) {
>
>do you really need to check this wakeref if you are now setting running to
>false earlier?

Maybe not. I did try it without this wakeref and that passes too.

>
>And if we do, what about moving this entire block to inside the
>existing if (wakeref) ?!

Yes, looks like I could move it inside the existing wakeref check block, 
but I think I will drop this check since running is being set to false 
in the gt_park code path.

>
>>  		u64 clk = guc->timestamp.gt_stamp - stats->start_gt_clk;
>>
>>  		total += intel_gt_clock_interval_to_ns(gt, clk);
>> @@ -1469,6 +1469,19 @@ static void __reset_guc_busyness_stats(struct intel_guc *guc)
>>  	spin_unlock_irqrestore(&guc->timestamp.lock, flags);
>>  }
>>
>> +static void __update_guc_busyness_running_state(struct intel_guc *guc)
>> +{
>> +	struct intel_gt *gt = guc_to_gt(guc);
>> +	struct intel_engine_cs *engine;
>> +	enum intel_engine_id id;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&guc->timestamp.lock, flags);
>> +	for_each_engine(engine, gt, id)
>> +		engine->stats.guc.running = false;
>> +	spin_unlock_irqrestore(&guc->timestamp.lock, flags);
>> +}
>> +
>>  static void __update_guc_busyness_stats(struct intel_guc *guc)
>>  {
>>  	struct intel_gt *gt = guc_to_gt(guc);
>> @@ -1619,6 +1632,9 @@ void intel_guc_busyness_park(struct intel_gt *gt)
>>  	if (!guc_submission_initialized(guc))
>>  		return;
>>
>> +	/* Assume no engines are running and set running state to false */
>> +	__update_guc_busyness_running_state(guc);
>> +
>
>Why not to move the entire __reset_guc_busyness_stats earlier then?

Not sure what you mean by __reset_guc_busyness_stats because that is 
only called when reset is in progress.

Do you mean __update_guc_busyness_stats()?

Only the running state needs to be updated for every gt_park. Updates to 
other stats can be throttled based on the jiffies code in 
intel_guc_busyness_park().

Thanks,
Umesh

>
>>  	/*
>>  	 * There is a race with suspend flow where the worker runs after suspend
>>  	 * and causes an unclaimed register access warning. Cancel the worker
>> --
>> 2.34.1
>>
Rodrigo Vivi Jan. 22, 2025, 10:12 a.m. UTC | #3
On Tue, Jan 21, 2025 at 03:58:03PM -0800, Umesh Nerlige Ramappa wrote:
> On Tue, Jan 21, 2025 at 06:10:34PM -0500, Rodrigo Vivi wrote:
> > On Tue, Jan 21, 2025 at 02:25:57PM -0800, Umesh Nerlige Ramappa wrote:
> > 
> > drm/i915/pmu as tag please...
> 
> will do
> 
> > 
> > > When running igt@gem_exec_balancer@individual for multiple iterations,
> > > it is seen that the delta busyness returned by PMU is 0. The issue stems
> > > from a combination of 2 implementation specific details:
> > > 
> > > 1) gt_park is throttling __update_guc_busyness_stats() so that it does
> > > not hog PCI bandwidth for some use cases. (Ref: 59bcdb564b3ba)
> > > 
> > > 2) busyness implementation always returns monotonically increasing
> > > counters. (Ref: cf907f6d29421)
> > > 
> > > If an application queried an engine while it was active,
> > > engine->stats.guc.running is set to true. Following that, if all PM
> > > wakeref's are released, then gt is parked. At this time the throttling
> > > of __update_guc_busyness_stats() may result in a missed update to the
> > > running state of the engine (due to (1) above). This means subsequent
> > > calls to guc_engine_busyness() will think that the engine is still
> > > running and they will keep updating the cached counter (stats->total).
> > > This results in an inflated cached counter.
> > > 
> > > Later when the application runs a workload and queries for busyness, we
> > > return the cached value since it is larger than the actual value (due to
> > > (2) above)
> > > 
> > > All subsequent queries will return the same large (inflated) value, so
> > > the application sees a delta busyness of zero.
> > > 
> > > In order to fix the issue,
> > > - reset the running state of engines in busyness_park outside of the
> > >   throttling code.
> > > - when busyness is queried and PM wakeref is not active, do not
> > >   calculate active engine time since we do not expect engines to be
> > >   active without a wakeref.
> > > 
> > > Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13366
> > > Fixes: cf907f6d2942 ("i915/guc: Ensure busyness counter increases motonically")
> > > Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> > > ---
> > >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c  | 18 +++++++++++++++++-
> > >  1 file changed, 17 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > index 12f1ba7ca9c1..b7a831e1fc85 100644
> > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > @@ -1372,7 +1372,7 @@ static ktime_t guc_engine_busyness(struct intel_engine_cs *engine, ktime_t *now)
> > >  	}
> > > 
> > >  	total = intel_gt_clock_interval_to_ns(gt, stats->total_gt_clks);
> > > -	if (stats->running) {
> > > +	if (wakeref && stats->running) {
> > 
> > do you really need to check this wakeref if you are now setting running to
> > false earlier?
> 
> Maybe not. I did try it without this wakeref and that passes too.
> 
> > 
> > And if we do, what about moving this entire block to inside the
> > existing if (wakeref) ?!
> 
> Yes, looks like I could move it inside the existing wakeref check block, but
> I think I will drop this check since running is being set to false in the
> gt_park code path.
> 
> > 
> > >  		u64 clk = guc->timestamp.gt_stamp - stats->start_gt_clk;
> > > 
> > >  		total += intel_gt_clock_interval_to_ns(gt, clk);
> > > @@ -1469,6 +1469,19 @@ static void __reset_guc_busyness_stats(struct intel_guc *guc)
> > >  	spin_unlock_irqrestore(&guc->timestamp.lock, flags);
> > >  }
> > > 
> > > +static void __update_guc_busyness_running_state(struct intel_guc *guc)
> > > +{
> > > +	struct intel_gt *gt = guc_to_gt(guc);
> > > +	struct intel_engine_cs *engine;
> > > +	enum intel_engine_id id;
> > > +	unsigned long flags;
> > > +
> > > +	spin_lock_irqsave(&guc->timestamp.lock, flags);
> > > +	for_each_engine(engine, gt, id)
> > > +		engine->stats.guc.running = false;
> > > +	spin_unlock_irqrestore(&guc->timestamp.lock, flags);
> > > +}
> > > +
> > >  static void __update_guc_busyness_stats(struct intel_guc *guc)
> > >  {
> > >  	struct intel_gt *gt = guc_to_gt(guc);
> > > @@ -1619,6 +1632,9 @@ void intel_guc_busyness_park(struct intel_gt *gt)
> > >  	if (!guc_submission_initialized(guc))
> > >  		return;
> > > 
> > > +	/* Assume no engines are running and set running state to false */
> > > +	__update_guc_busyness_running_state(guc);
> > > +
> > 
> > Why not to move the entire __reset_guc_busyness_stats earlier then?
> 
> Not sure what you mean by __reset_guc_busyness_stats because that is only
> called when reset is in progress.
> 
> Do you mean __update_guc_busyness_stats()?
> 
> Only the running state needs to be updated for every gt_park. Updates to
> other stats can be throttled based on the jiffies code in
> intel_guc_busyness_park().

ah okay then, please ignore me in this last point...

> 
> Thanks,
> Umesh
> 
> > 
> > >  	/*
> > >  	 * There is a race with suspend flow where the worker runs after suspend
> > >  	 * and causes an unclaimed register access warning. Cancel the worker
> > > --
> > > 2.34.1
> > >

Patch

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 12f1ba7ca9c1..b7a831e1fc85 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1372,7 +1372,7 @@  static ktime_t guc_engine_busyness(struct intel_engine_cs *engine, ktime_t *now)
 	}
 
 	total = intel_gt_clock_interval_to_ns(gt, stats->total_gt_clks);
-	if (stats->running) {
+	if (wakeref && stats->running) {
 		u64 clk = guc->timestamp.gt_stamp - stats->start_gt_clk;
 
 		total += intel_gt_clock_interval_to_ns(gt, clk);
@@ -1469,6 +1469,19 @@  static void __reset_guc_busyness_stats(struct intel_guc *guc)
 	spin_unlock_irqrestore(&guc->timestamp.lock, flags);
 }
 
+static void __update_guc_busyness_running_state(struct intel_guc *guc)
+{
+	struct intel_gt *gt = guc_to_gt(guc);
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	unsigned long flags;
+
+	spin_lock_irqsave(&guc->timestamp.lock, flags);
+	for_each_engine(engine, gt, id)
+		engine->stats.guc.running = false;
+	spin_unlock_irqrestore(&guc->timestamp.lock, flags);
+}
+
 static void __update_guc_busyness_stats(struct intel_guc *guc)
 {
 	struct intel_gt *gt = guc_to_gt(guc);
@@ -1619,6 +1632,9 @@  void intel_guc_busyness_park(struct intel_gt *gt)
 	if (!guc_submission_initialized(guc))
 		return;
 
+	/* Assume no engines are running and set running state to false */
+	__update_guc_busyness_running_state(guc);
+
 	/*
 	 * There is a race with suspend flow where the worker runs after suspend
 	 * and causes an unclaimed register access warning. Cancel the worker