Message ID | 20180120093101.11952-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 1/20/2018 3:01 PM, Chris Wilson wrote: > On gen9+, after an idle period the HW will disable the entire power well > to conserve power (by preventing current leakage). It takes around a 100 > microseconds to bring the power well back online afterwards. With the > current hysteresis value of 25us, we do not have sufficient time to > respond to an interrupt and schedule the next execution before the HW > powers itself down. (At present, we prevent this by grabbing the > forcewake for prolonged periods of time, but that is overkill fixed in > the next patch.) The minimum we want to set the power gating hysteresis > to is the length of time it takes us to service the GPU, which across a > broad spectrum of machines is about 250us. > > (Note this also brings guc latency into the same ballpark as execlists.) > > Testcase: igt/gem_exec_nop/sequential > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> > Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> > Cc: Michel Thierry <michel.thierry@intel.com> > Cc: Michal Winiarski <michal.winiarski@intel.com> > --- > drivers/gpu/drm/i915/intel_pm.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > index 1db79a860b96..6748d3efb537 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -6627,8 +6627,8 @@ static void gen9_enable_rc6(struct drm_i915_private *dev_priv) > I915_WRITE(GEN6_RC_SLEEP, 0); > > /* 2c: Program Coarse Power Gating Policies. */ > - I915_WRITE(GEN9_MEDIA_PG_IDLE_HYSTERESIS, 25); > - I915_WRITE(GEN9_RENDER_PG_IDLE_HYSTERESIS, 25); > + I915_WRITE(GEN9_MEDIA_PG_IDLE_HYSTERESIS, 250); > + I915_WRITE(GEN9_RENDER_PG_IDLE_HYSTERESIS, 250); As per bspec this is in 1.28usec units so it should be 195. With that: Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> > > /* 3a: Enable RC6 */ > I915_WRITE(GEN6_RC6_THRESHOLD, 37500); /* 37.5/125ms per EI */
Quoting Sagar Arun Kamble (2018-01-21 11:01:23) > > > On 1/20/2018 3:01 PM, Chris Wilson wrote: > > On gen9+, after an idle period the HW will disable the entire power well > > to conserve power (by preventing current leakage). It takes around a 100 > > microseconds to bring the power well back online afterwards. With the > > current hysteresis value of 25us, we do not have sufficient time to > > respond to an interrupt and schedule the next execution before the HW > > powers itself down. (At present, we prevent this by grabbing the > > forcewake for prolonged periods of time, but that is overkill fixed in > > the next patch.) The minimum we want to set the power gating hysteresis > > to is the length of time it takes us to service the GPU, which across a > > broad spectrum of machines is about 250us. > > > > (Note this also brings guc latency into the same ballpark as execlists.) > > > > Testcase: igt/gem_exec_nop/sequential > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> > > Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> > > Cc: Michel Thierry <michel.thierry@intel.com> > > Cc: Michal Winiarski <michal.winiarski@intel.com> > > --- > > drivers/gpu/drm/i915/intel_pm.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > > index 1db79a860b96..6748d3efb537 100644 > > --- a/drivers/gpu/drm/i915/intel_pm.c > > +++ b/drivers/gpu/drm/i915/intel_pm.c > > @@ -6627,8 +6627,8 @@ static void gen9_enable_rc6(struct drm_i915_private *dev_priv) > > I915_WRITE(GEN6_RC_SLEEP, 0); > > > > /* 2c: Program Coarse Power Gating Policies. */ > > - I915_WRITE(GEN9_MEDIA_PG_IDLE_HYSTERESIS, 25); > > - I915_WRITE(GEN9_RENDER_PG_IDLE_HYSTERESIS, 25); > > + I915_WRITE(GEN9_MEDIA_PG_IDLE_HYSTERESIS, 250); > > + I915_WRITE(GEN9_RENDER_PG_IDLE_HYSTERESIS, 250); > As per bspec this is in 1.28usec units so it should be 195. Both numbers pulled out of the air, just read us above as approx.us. -Chris
On 1/21/2018 5:55 PM, Chris Wilson wrote: > Quoting Sagar Arun Kamble (2018-01-21 11:01:23) >> >> On 1/20/2018 3:01 PM, Chris Wilson wrote: >>> On gen9+, after an idle period the HW will disable the entire power well >>> to conserve power (by preventing current leakage). It takes around a 100 >>> microseconds to bring the power well back online afterwards. With the >>> current hysteresis value of 25us, we do not have sufficient time to >>> respond to an interrupt and schedule the next execution before the HW >>> powers itself down. (At present, we prevent this by grabbing the >>> forcewake for prolonged periods of time, but that is overkill fixed in >>> the next patch.) The minimum we want to set the power gating hysteresis >>> to is the length of time it takes us to service the GPU, which across a >>> broad spectrum of machines is about 250us. >>> >>> (Note this also brings guc latency into the same ballpark as execlists.) >>> >>> Testcase: igt/gem_exec_nop/sequential >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> >>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> >>> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> >>> Cc: Michel Thierry <michel.thierry@intel.com> >>> Cc: Michal Winiarski <michal.winiarski@intel.com> Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> >>> --- >>> drivers/gpu/drm/i915/intel_pm.c | 4 ++-- >>> 1 file changed, 2 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c >>> index 1db79a860b96..6748d3efb537 100644 >>> --- a/drivers/gpu/drm/i915/intel_pm.c >>> +++ b/drivers/gpu/drm/i915/intel_pm.c >>> @@ -6627,8 +6627,8 @@ static void gen9_enable_rc6(struct drm_i915_private *dev_priv) >>> I915_WRITE(GEN6_RC_SLEEP, 0); >>> >>> /* 2c: Program Coarse Power Gating Policies. */ >>> - I915_WRITE(GEN9_MEDIA_PG_IDLE_HYSTERESIS, 25); >>> - I915_WRITE(GEN9_RENDER_PG_IDLE_HYSTERESIS, 25); >>> + I915_WRITE(GEN9_MEDIA_PG_IDLE_HYSTERESIS, 250); >>> + I915_WRITE(GEN9_RENDER_PG_IDLE_HYSTERESIS, 250); >> As per bspec this is in 1.28usec units so it should be 195. > Both numbers pulled out of the air, just read us above as approx.us. ok ... Interestingly spec too says 25us and also say that it is actually 32us :) > -Chris
Quoting Patchwork (2018-01-22 21:26:48) > == Series Details == > > Series: series starting with [v2] drm/i915: Increase render/media power gating hysteresis for gen9+ (rev3) > URL : https://patchwork.freedesktop.org/series/36842/ > State : success > > == Summary == > > Warning: bzip CI_DRM_3667/shard-glkb6/results7.json.bz2 wasn't in correct JSON format > Test kms_cursor_legacy: > Subgroup cursor-vs-flip-atomic: > fail -> PASS (shard-apl) fdo#103355 > Subgroup flip-vs-cursor-legacy: > pass -> FAIL (shard-apl) fdo#102670 > Test perf: > Subgroup buffer-fill: > fail -> PASS (shard-apl) fdo#103755 > Subgroup oa-exponents: > fail -> PASS (shard-apl) fdo#102254 > Test kms_frontbuffer_tracking: > Subgroup fbc-1p-offscren-pri-shrfb-draw-blt: > fail -> PASS (shard-snb) fdo#101623 +1 > > fdo#103355 https://bugs.freedesktop.org/show_bug.cgi?id=103355 > fdo#102670 https://bugs.freedesktop.org/show_bug.cgi?id=102670 > fdo#103755 https://bugs.freedesktop.org/show_bug.cgi?id=103755 > fdo#102254 https://bugs.freedesktop.org/show_bug.cgi?id=102254 > fdo#101623 https://bugs.freedesktop.org/show_bug.cgi?id=101623 > > shard-apl total:2780 pass:1716 dwarn:1 dfail:0 fail:23 skip:1040 time:14666s > shard-hsw total:2780 pass:1724 dwarn:1 dfail:0 fail:12 skip:1042 time:15570s > shard-snb total:2780 pass:1317 dwarn:1 dfail:0 fail:13 skip:1449 time:8120s > Blacklisted hosts: > shard-kbl total:2780 pass:1835 dwarn:3 dfail:0 fail:26 skip:916 time:11041s > > == Logs == > > For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7738/shards.html Reran on trybot to confirm the kbl errors were just flukes, https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_1695/shards.html and pushed. Given that this is destined for 4.17, that gives us 3 months to soak it before it gets exposed upstream. Fingers crossed! Thanks, -Chris
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 1db79a860b96..6748d3efb537 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -6627,8 +6627,8 @@ static void gen9_enable_rc6(struct drm_i915_private *dev_priv) I915_WRITE(GEN6_RC_SLEEP, 0); /* 2c: Program Coarse Power Gating Policies. */ - I915_WRITE(GEN9_MEDIA_PG_IDLE_HYSTERESIS, 25); - I915_WRITE(GEN9_RENDER_PG_IDLE_HYSTERESIS, 25); + I915_WRITE(GEN9_MEDIA_PG_IDLE_HYSTERESIS, 250); + I915_WRITE(GEN9_RENDER_PG_IDLE_HYSTERESIS, 250); /* 3a: Enable RC6 */ I915_WRITE(GEN6_RC6_THRESHOLD, 37500); /* 37.5/125ms per EI */
On gen9+, after an idle period the HW will disable the entire power well to conserve power (by preventing current leakage). It takes around a 100 microseconds to bring the power well back online afterwards. With the current hysteresis value of 25us, we do not have sufficient time to respond to an interrupt and schedule the next execution before the HW powers itself down. (At present, we prevent this by grabbing the forcewake for prolonged periods of time, but that is overkill fixed in the next patch.) The minimum we want to set the power gating hysteresis to is the length of time it takes us to service the GPU, which across a broad spectrum of machines is about 250us. (Note this also brings guc latency into the same ballpark as execlists.) Testcase: igt/gem_exec_nop/sequential Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> --- drivers/gpu/drm/i915/intel_pm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)