Message ID | 20221022002452.36716-1-vinay.belgaumkar@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v4] drm/i915/slpc: Optmize waitboost for SLPC | expand |
On Fri, 21 Oct 2022 17:24:52 -0700, Vinay Belgaumkar wrote: > Hi Vinay, > Waitboost (when SLPC is enabled) results in a H2G message. This can result > in thousands of messages during a stress test and fill up an already full > CTB. There is no need to request for RP0 if boost_freq and the min softlimit > are the same. > > v2: Add the tracing back, and check requested freq > in the worker thread (Tvrtko) > v3: Check requested freq in dec_waiters as well > v4: Only check min_softlimit against boost_freq. Limit this > optimization for server parts for now. Sorry I didn't follow. Why are we saying limit this only to server? This: if (slpc->min_freq_softlimit == slpc->boost_freq) return; The condition above should work for client too if it is true? But yes it is typically true automatically for server but not for client. Is that what you mean? > > Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> > --- > drivers/gpu/drm/i915/gt/intel_rps.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c > index fc23c562d9b2..32e1f5dde5bb 100644 > --- a/drivers/gpu/drm/i915/gt/intel_rps.c > +++ b/drivers/gpu/drm/i915/gt/intel_rps.c > @@ -1016,9 +1016,15 @@ void intel_rps_boost(struct i915_request *rq) > if (rps_uses_slpc(rps)) { > slpc = rps_to_slpc(rps); > > + if (slpc->min_freq_softlimit == slpc->boost_freq) > + return; nit but is it possible that 'slpc->min_freq_softlimit > slpc->boost_freq' (looks possible to me from the code though we might not have intended it)? Then we can change this to: if (slpc->min_freq_softlimit >= slpc->boost_freq) return; > + > /* Return if old value is non zero */ > - if (!atomic_fetch_inc(&slpc->num_waiters)) > + if (!atomic_fetch_inc(&slpc->num_waiters)) { > + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", > + rq->fence.context, rq->fence.seqno); Another possibility would have been to add the trace to slpc_boost_work but this is matches host turbo so I think it is fine here. > schedule_work(&slpc->boost_work); > + } > > return; > } Thanks. -- Ashutosh
On 10/21/2022 7:11 PM, Dixit, Ashutosh wrote: > On Fri, 21 Oct 2022 17:24:52 -0700, Vinay Belgaumkar wrote: > Hi Vinay, > >> Waitboost (when SLPC is enabled) results in a H2G message. This can result >> in thousands of messages during a stress test and fill up an already full >> CTB. There is no need to request for RP0 if boost_freq and the min softlimit >> are the same. >> >> v2: Add the tracing back, and check requested freq >> in the worker thread (Tvrtko) >> v3: Check requested freq in dec_waiters as well >> v4: Only check min_softlimit against boost_freq. Limit this >> optimization for server parts for now. > Sorry I didn't follow. Why are we saying limit this only to server? This: > > if (slpc->min_freq_softlimit == slpc->boost_freq) > return; > > The condition above should work for client too if it is true? But yes it is > typically true automatically for server but not for client. Is that what > you mean? yes. For client, min_freq_softlimit would typically be RPn. > >> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> >> --- >> drivers/gpu/drm/i915/gt/intel_rps.c | 8 +++++++- >> 1 file changed, 7 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c >> index fc23c562d9b2..32e1f5dde5bb 100644 >> --- a/drivers/gpu/drm/i915/gt/intel_rps.c >> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c >> @@ -1016,9 +1016,15 @@ void intel_rps_boost(struct i915_request *rq) >> if (rps_uses_slpc(rps)) { >> slpc = rps_to_slpc(rps); >> >> + if (slpc->min_freq_softlimit == slpc->boost_freq) >> + return; > nit but is it possible that 'slpc->min_freq_softlimit > slpc->boost_freq' > (looks possible to me from the code though we might not have intended it)? > Then we can change this to: > > if (slpc->min_freq_softlimit >= slpc->boost_freq) > return; > > >> + >> /* Return if old value is non zero */ >> - if (!atomic_fetch_inc(&slpc->num_waiters)) >> + if (!atomic_fetch_inc(&slpc->num_waiters)) { >> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", >> + rq->fence.context, rq->fence.seqno); > Another possibility would have been to add the trace to slpc_boost_work but > this is matches host turbo so I think it is fine here. > >> schedule_work(&slpc->boost_work); >> + } >> >> return; >> } > Thanks. > -- > Ashutosh
On Sat, 22 Oct 2022 10:56:03 -0700, Belgaumkar, Vinay wrote: > Hi Vinay, > >> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c > >> index fc23c562d9b2..32e1f5dde5bb 100644 > >> --- a/drivers/gpu/drm/i915/gt/intel_rps.c > >> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c > >> @@ -1016,9 +1016,15 @@ void intel_rps_boost(struct i915_request *rq) > >> if (rps_uses_slpc(rps)) { > >> slpc = rps_to_slpc(rps); > >> > >> + if (slpc->min_freq_softlimit == slpc->boost_freq) > >> + return; > > nit but is it possible that 'slpc->min_freq_softlimit > slpc->boost_freq' > > (looks possible to me from the code though we might not have intended it)? > > Then we can change this to: > > > > if (slpc->min_freq_softlimit >= slpc->boost_freq) > > return; Any comment about this? It looks clearly possible to me from the code. So with the above change this is: Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
On 10/22/2022 12:22 PM, Dixit, Ashutosh wrote: > On Sat, 22 Oct 2022 10:56:03 -0700, Belgaumkar, Vinay wrote: > Hi Vinay, > >>>> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c >>>> index fc23c562d9b2..32e1f5dde5bb 100644 >>>> --- a/drivers/gpu/drm/i915/gt/intel_rps.c >>>> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c >>>> @@ -1016,9 +1016,15 @@ void intel_rps_boost(struct i915_request *rq) >>>> if (rps_uses_slpc(rps)) { >>>> slpc = rps_to_slpc(rps); >>>> >>>> + if (slpc->min_freq_softlimit == slpc->boost_freq) >>>> + return; >>> nit but is it possible that 'slpc->min_freq_softlimit > slpc->boost_freq' >>> (looks possible to me from the code though we might not have intended it)? >>> Then we can change this to: >>> >>> if (slpc->min_freq_softlimit >= slpc->boost_freq) >>> return; > Any comment about this? It looks clearly possible to me from the code. > > So with the above change this is: > > Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Agree. Thanks, Vinay.
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index fc23c562d9b2..32e1f5dde5bb 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -1016,9 +1016,15 @@ void intel_rps_boost(struct i915_request *rq) if (rps_uses_slpc(rps)) { slpc = rps_to_slpc(rps); + if (slpc->min_freq_softlimit == slpc->boost_freq) + return; + /* Return if old value is non zero */ - if (!atomic_fetch_inc(&slpc->num_waiters)) + if (!atomic_fetch_inc(&slpc->num_waiters)) { + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", + rq->fence.context, rq->fence.seqno); schedule_work(&slpc->boost_work); + } return; }
Waitboost (when SLPC is enabled) results in a H2G message. This can result in thousands of messages during a stress test and fill up an already full CTB. There is no need to request for RP0 if boost_freq and the min softlimit are the same. v2: Add the tracing back, and check requested freq in the worker thread (Tvrtko) v3: Check requested freq in dec_waiters as well v4: Only check min_softlimit against boost_freq. Limit this optimization for server parts for now. Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> --- drivers/gpu/drm/i915/gt/intel_rps.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)