
[2/3] drm/i915/guc/slpc: Add waitboost functionality for SLPC

Message ID 20211101043937.35747-3-vinay.belgaumkar@intel.com (mailing list archive)
State New, archived
Series drm/i915/guc/slpc: Implement waitboost for SLPC

Commit Message

Vinay Belgaumkar Nov. 1, 2021, 4:39 a.m. UTC
Add a helper in the RPS code to handle both the SLPC and non-SLPC paths.
When a boost is requested in the SLPC path, we ask GuC to ramp up the
requested frequency by setting the minimum frequency to the boost freq.
Reset the min frequency back to the softlimit when there are no more
waiters.

v2: Schedule a worker thread so that the frequency boost can also be
requested from within an interrupt context.

Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_rps.c         | 26 +++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_rps.h         |  1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 19 +++++++++++++++
 drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h |  1 +
 drivers/gpu/drm/i915/i915_request.c         |  2 +-
 5 files changed, 48 insertions(+), 1 deletion(-)
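
As background, the handler behind slpc->boost_work (added earlier in this
series and not shown in this patch) is what actually raises the SLPC min
frequency to boost_freq. A minimal sketch of what it presumably does is
below; the function name slpc_boost_work and the exact body are assumptions,
only slpc_force_min_freq() and the slpc fields are taken from this patch:

static void slpc_boost_work(struct work_struct *work)
{
	struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work);

	/* Raise min freq to boost_freq while there are still waiters */
	mutex_lock(&slpc->lock);
	if (atomic_read(&slpc->num_waiters))
		slpc_force_min_freq(slpc, slpc->boost_freq);
	mutex_unlock(&slpc->lock);
}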

Comments

Dixit, Ashutosh Nov. 1, 2021, 8:28 p.m. UTC | #1
On Sun, 31 Oct 2021 21:39:36 -0700, Belgaumkar, Vinay wrote:
>
> @@ -945,6 +960,17 @@ void intel_rps_boost(struct i915_request *rq)
>	if (!test_and_set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags)) {
>		struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;
>
> +		if (rps_uses_slpc(rps)) {
> +			slpc = rps_to_slpc(rps);
> +
> +			/* Return if old value is non zero */
> +			if (atomic_fetch_inc(&slpc->num_waiters))
> +				return;
> +
> +			if (intel_rps_get_requested_frequency(rps) < slpc->boost_freq)

I think this check is not needed because:

a. The waitboost code only changes min_freq. i915 code should not depend on
   how GuC changes requested_freq in response to a change in min_freq.

b. What is more worrisome is that when we "de-boost" we set min_freq to
   min_freq_softlimit. If GuC has, e.g., a delay in bringing requested_freq
   down and intel_rps_boost() gets called meanwhile, we will miss the one
   opportunity we have to boost the freq (when num_waiters goes from 0 to
   1; asking GuC to boost when actual_freq is already boost_freq is
   harmless in comparison). So to avoid this risk of missing the chance to
   boost, I think we should delete this check and replace the code above
   with something like:

                if (rps_uses_slpc(rps)) {
                        struct intel_guc_slpc *slpc = rps_to_slpc(rps);

                        if (slpc->boost_freq <= slpc->min_freq_softlimit)
                                return;

                        if (!atomic_fetch_inc(&slpc->num_waiters))
                                schedule_work(&slpc->boost_work);

                        return;
                }

Note that this check:

                if (slpc->boost_freq <= slpc->min_freq_softlimit)
                        return;

(which is basically a degenerate case in which we don't have to do
anything) can probably be implemented when boost_freq is set in sysfs,
or may already be encompassed in "val < slpc->min_freq" in
intel_guc_slpc_set_boost_freq(), in which case this check can also be
dropped from this function.

> +void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc)
> +{
> +	/* Return min back to the softlimit.
> +	 * This is called during request retire,
> +	 * so we don't need to fail that if the
> +	 * set_param fails.
> +	 */

nit: maybe follow kernel multi-line comment format.
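
For reference, the kernel's preferred multi-line comment style opens with a
bare /* on its own line, so the comment above would become:

	/*
	 * Return min back to the softlimit. This is called during
	 * request retire, so we don't need to fail that if the
	 * set_param fails.
	 */
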
Vinay Belgaumkar Nov. 2, 2021, 12:19 a.m. UTC | #2
On 11/1/2021 1:28 PM, Dixit, Ashutosh wrote:
> On Sun, 31 Oct 2021 21:39:36 -0700, Belgaumkar, Vinay wrote:
>>
>> @@ -945,6 +960,17 @@ void intel_rps_boost(struct i915_request *rq)
>> 	if (!test_and_set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags)) {
>> 		struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;
>>
>> +		if (rps_uses_slpc(rps)) {
>> +			slpc = rps_to_slpc(rps);
>> +
>> +			/* Return if old value is non zero */
>> +			if (atomic_fetch_inc(&slpc->num_waiters))
>> +				return;
>> +
>> +			if (intel_rps_get_requested_frequency(rps) < slpc->boost_freq)
> 
> I think this check is not needed because:
> 
> a. The waitboost code only changes min_freq. i915 code should not depend on
>     how GuC changes requested_freq in response to a change in min_freq.
> 
> b. What is more worrisome is that when we "de-boost" we set min_freq to
>     min_freq_softlimit. If GuC has, e.g., a delay in bringing requested_freq
>     down and intel_rps_boost() gets called meanwhile, we will miss the one
>     opportunity we have to boost the freq (when num_waiters goes from 0 to
>     1; asking GuC to boost when actual_freq is already boost_freq is
>     harmless in comparison). So to avoid this risk of missing the chance to
>     boost, I think we should delete this check and replace the code above
>     with something like:
> 
>                  if (rps_uses_slpc(rps)) {
>                          struct intel_guc_slpc *slpc = rps_to_slpc(rps);
> 
>                          if (slpc->boost_freq <= slpc->min_freq_softlimit)
>                                  return;
> 
>                          if (!atomic_fetch_inc(&slpc->num_waiters))
>                                  schedule_work(&slpc->boost_work);
> 
>                          return;
>                  }
> 
> Note that this check:
> 
>                  if (slpc->boost_freq <= slpc->min_freq_softlimit)
>                          return;
> 
> (which is basically a degenerate case in which we don't have to do
> anything) can probably be implemented when boost_freq is set in sysfs,
> or may already be encompassed in "val < slpc->min_freq" in
> intel_guc_slpc_set_boost_freq(), in which case this check can also be
> dropped from this function.

We already have that check in the set_boost_freq function. So, I will just
keep the atomic_fetch_inc check here.
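
The SLPC branch in intel_rps_boost() would then presumably reduce to
something like this (a sketch combining the suggestion above with this
reply, not necessarily the final revision):

		if (rps_uses_slpc(rps)) {
			slpc = rps_to_slpc(rps);

			/* Only schedule the boost when the first waiter shows up */
			if (!atomic_fetch_inc(&slpc->num_waiters))
				schedule_work(&slpc->boost_work);

			return;
		}

The early return also keeps the legacy rps->num_waiters accounting out of
the SLPC path.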

> 
>> +void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc)
>> +{
>> +	/* Return min back to the softlimit.
>> +	 * This is called during request retire,
>> +	 * so we don't need to fail that if the
>> +	 * set_param fails.
>> +	 */
> 
> nit: maybe follow kernel multi-line comment format.
> 
Ok.

Thanks,
Vinay.

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index 5e275f8dda8c..b2d5b1747086 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -936,8 +936,23 @@  void intel_rps_park(struct intel_rps *rps)
 	GT_TRACE(rps_to_gt(rps), "park:%x\n", rps->cur_freq);
 }
 
+void intel_rps_dec_waiters(struct intel_rps *rps)
+{
+	struct intel_guc_slpc *slpc;
+
+	if (rps_uses_slpc(rps)) {
+		slpc = rps_to_slpc(rps);
+
+		intel_guc_slpc_dec_waiters(slpc);
+	} else {
+		atomic_dec(&rps->num_waiters);
+	}
+}
+
 void intel_rps_boost(struct i915_request *rq)
 {
+	struct intel_guc_slpc *slpc;
+
 	if (i915_request_signaled(rq) || i915_request_has_waitboost(rq))
 		return;
 
@@ -945,6 +960,17 @@  void intel_rps_boost(struct i915_request *rq)
 	if (!test_and_set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags)) {
 		struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;
 
+		if (rps_uses_slpc(rps)) {
+			slpc = rps_to_slpc(rps);
+
+			/* Return if old value is non zero */
+			if (atomic_fetch_inc(&slpc->num_waiters))
+				return;
+
+			if (intel_rps_get_requested_frequency(rps) < slpc->boost_freq)
+				schedule_work(&slpc->boost_work);
+		}
+
 		if (atomic_fetch_inc(&rps->num_waiters))
 			return;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h b/drivers/gpu/drm/i915/gt/intel_rps.h
index 11960d64ca82..407e878d5006 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.h
+++ b/drivers/gpu/drm/i915/gt/intel_rps.h
@@ -23,6 +23,7 @@  void intel_rps_disable(struct intel_rps *rps);
 void intel_rps_park(struct intel_rps *rps);
 void intel_rps_unpark(struct intel_rps *rps);
 void intel_rps_boost(struct i915_request *rq);
+void intel_rps_dec_waiters(struct intel_rps *rps);
 
 int intel_rps_set(struct intel_rps *rps, u8 val);
 void intel_rps_mark_interactive(struct intel_rps *rps, bool interactive);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index cc51987b2535..65da454b6693 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -445,7 +445,11 @@  int intel_guc_slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 val)
 	    val > slpc->max_freq_softlimit)
 		return -EINVAL;
 
+	/* Need a lock now since waitboost can be modifying min as well */
+	mutex_lock(&slpc->lock);
+
 	with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
+
 		ret = slpc_set_param(slpc,
 				     SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
 				     val);
@@ -458,6 +462,8 @@  int intel_guc_slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 val)
 	if (!ret)
 		slpc->min_freq_softlimit = val;
 
+	mutex_unlock(&slpc->lock);
+
 	return ret;
 }
 
@@ -643,6 +649,19 @@  int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
 	return 0;
 }
 
+void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc)
+{
+	/* Return min back to the softlimit.
+	 * This is called during request retire,
+	 * so we don't need to fail that if the
+	 * set_param fails.
+	 */
+	mutex_lock(&slpc->lock);
+	if (atomic_dec_and_test(&slpc->num_waiters))
+		slpc_force_min_freq(slpc, slpc->min_freq_softlimit);
+	mutex_unlock(&slpc->lock);
+}
+
 int intel_guc_slpc_print_info(struct intel_guc_slpc *slpc, struct drm_printer *p)
 {
 	struct drm_i915_private *i915 = slpc_to_i915(slpc);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
index b62528647770..d74d6d749bdc 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
@@ -39,5 +39,6 @@  int intel_guc_slpc_get_min_freq(struct intel_guc_slpc *slpc, u32 *val);
 int intel_guc_slpc_print_info(struct intel_guc_slpc *slpc, struct drm_printer *p);
 void intel_guc_pm_intrmsk_enable(struct intel_gt *gt);
 void intel_guc_slpc_boost(struct intel_guc_slpc *slpc);
+void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc);
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 2c3cd6e635b5..08f38e86231d 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -339,7 +339,7 @@  bool i915_request_retire(struct i915_request *rq)
 	}
 
 	if (test_and_set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags))
-		atomic_dec(&rq->engine->gt->rps.num_waiters);
+		intel_rps_dec_waiters(&rq->engine->gt->rps);
 
 	/*
 	 * We only loosely track inflight requests across preemption,