diff mbox series

[13/25] drm/i915: Reduce the RPS shock

Message ID 20190219122215.8941-13-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Series [01/25] drm/i915: Move verify_wm_state() to heap

Commit Message

Chris Wilson Feb. 19, 2019, 12:22 p.m. UTC
Limit deboosting and boosting to keep ourselves at the extremes
when in the respective power modes (i.e. slowly decrease frequencies
while in the HIGH_POWER zone and slowly increase frequencies while
in the LOW_POWER zone). On idle, we will hit the timeout and drop
to the next level quickly, and conversely if busy we expect to
hit a waitboost and rapidly switch into max power.

This should improve the user experience by keeping the GPU clocks higher
than they ostensibly should be (based on simple busyness) by switching
into the INTERACTIVE mode (due to waiting for pageflips) and increasing
clocks via waitboosting. This will cost some additional power; our
saving grace should be rc6 and powergating to keep the extra current
draw in check.

Food for future thought would be deadline scheduling? If we know certain
contexts (high priority compositors) absolutely must hit the next vblank
then we can raise the frequencies ahead of time. Part of this is covered
by per-context frequencies, where userspace is given control over the
frequency range they want the GPU to execute at (for largely the same
problem as this, where the workload is very latency sensitive but at the
EI level appears mostly idle). Indeed, the per-context series does
extend the modeset boosting to include a frequency range tweak which
seems applicable to solving this jittery UX behaviour.

Reported-by: Lyude Paul <lyude@redhat.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109408
References: 0d55babc8392 ("drm/i915: Drop stray clearing of rps->last_adj")
References: 60548c554be2 ("drm/i915: Interactive RPS mode")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>

Quoting Lyude Paul:
> Before reverting 0d55babc8392754352f1058866dd4182ae587d11: [4.20]
>
> 35 measurements [of gnome-shell animations]
> Average: 33.65657142857143 FPS
> FPS observed: 20.8 - 46.87 FPS
> Percentage under 60 FPS: 100.0%
> Percentage under 55 FPS: 100.0%
> Percentage under 50 FPS: 100.0%
> Percentage under 45 FPS: 97.14285714285714%
> Percentage under 40 FPS: 97.14285714285714%
> Percentage under 35 FPS: 45.714285714285715%
> Percentage under 30 FPS: 11.428571428571429%
> Percentage under 25 FPS: 2.857142857142857%
>
> After reverting: [4.19 behaviour]
>
> 30 measurements
> Average: 49.833666666666666 FPS
> FPS observed: 33.85 - 60.0 FPS
> Percentage under 60 FPS: 86.66666666666667%
> Percentage under 55 FPS: 70.0%
> Percentage under 50 FPS: 53.333333333333336%
> Percentage under 45 FPS: 20.0%
> Percentage under 40 FPS: 6.666666666666667%
> Percentage under 35 FPS: 6.666666666666667%
> Percentage under 30 FPS: 0%
> Percentage under 25 FPS: 0%
>
> Patched:
> 42 measurements
> Average: 46.05428571428571 FPS
> FPS observed: 1.82 - 59.98 FPS
> Percentage under 60 FPS: 88.09523809523809%
> Percentage under 55 FPS: 61.904761904761905%
> Percentage under 50 FPS: 45.23809523809524%
> Percentage under 45 FPS: 35.714285714285715%
> Percentage under 40 FPS: 33.33333333333333%
> Percentage under 35 FPS: 19.047619047619047%
> Percentage under 30 FPS: 7.142857142857142%
> Percentage under 25 FPS: 4.761904761904762%

Tested-by: Lyude Paul <lyude@redhat.com>
---
 drivers/gpu/drm/i915/i915_irq.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Lyude Paul Feb. 19, 2019, 9 p.m. UTC | #1
Should this maybe be CC'd for stable for v4.20? If so I've already got a
working port of this patch that I can send to you (I've been running it on my
laptop for a while now, seems to work fine)

On Tue, 2019-02-19 at 12:22 +0000, Chris Wilson wrote:
> Limit deboosting and boosting to keep ourselves at the extremes
> when in the respective power modes (i.e. slowly decrease frequencies
> while in the HIGH_POWER zone and slowly increase frequencies while
> in the LOW_POWER zone). On idle, we will hit the timeout and drop
> to the next level quickly, and conversely if busy we expect to
> hit a waitboost and rapidly switch into max power.
> 
> This should improve the UX experience by keeping the GPU clocks higher
> than they ostensibly should be (based on simple busyness) by switching
> into the INTERACTIVE mode (due to waiting for pageflips) and increasing
> clocks via waitboosting. This will incur some additional power, our
> saving grace should be rc6 and powergating to keep the extra current
> draw in check.
> 
> Food for future thought would be deadline scheduling? If we know certain
> contexts (high priority compositors) absolutely must hit the next vblank
> then we can raise the frequencies ahead of time. Part of this is covered
> by per-context frequencies, where userspace is given control over the
> frequency range they want the GPU to execute at (for largely the same
> problem as this, where the workload is very latency sensitive but at the
> EI level appears mostly idle). Indeed, the per-context series does
> extend the modeset boosting to include a frequency range tweak which
> seems applicable to solving this jittery UX behaviour.
> 
> Reported-by: Lyude Paul <lyude@redhat.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109408
> References: 0d55babc8392 ("drm/i915: Drop stray clearing of rps->last_adj")
> References: 60548c554be2 ("drm/i915: Interactive RPS mode")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Lyude Paul <lyude@redhat.com>
> Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> 
> Quoting Lyude Paul:
> > Before reverting 0d55babc8392754352f1058866dd4182ae587d11: [4.20]
> > 
> > 35 measurements [of gnome-shell animations]
> > Average: 33.65657142857143 FPS
> > FPS observed: 20.8 - 46.87 FPS
> > Percentage under 60 FPS: 100.0%
> > Percentage under 55 FPS: 100.0%
> > Percentage under 50 FPS: 100.0%
> > Percentage under 45 FPS: 97.14285714285714%
> > Percentage under 40 FPS: 97.14285714285714%
> > Percentage under 35 FPS: 45.714285714285715%
> > Percentage under 30 FPS: 11.428571428571429%
> > Percentage under 25 FPS: 2.857142857142857%
> > 
> > After reverting: [4.19 behaviour]
> > 
> > 30 measurements
> > Average: 49.833666666666666 FPS
> > FPS observed: 33.85 - 60.0 FPS
> > Percentage under 60 FPS: 86.66666666666667%
> > Percentage under 55 FPS: 70.0%
> > Percentage under 50 FPS: 53.333333333333336%
> > Percentage under 45 FPS: 20.0%
> > Percentage under 40 FPS: 6.666666666666667%
> > Percentage under 35 FPS: 6.666666666666667%
> > Percentage under 30 FPS: 0%
> > Percentage under 25 FPS: 0%
> > 
> > Patched:
> > 42 measurements
> > Average: 46.05428571428571 FPS
> > FPS observed: 1.82 - 59.98 FPS
> > Percentage under 60 FPS: 88.09523809523809%
> > Percentage under 55 FPS: 61.904761904761905%
> > Percentage under 50 FPS: 45.23809523809524%
> > Percentage under 45 FPS: 35.714285714285715%
> > Percentage under 40 FPS: 33.33333333333333%
> > Percentage under 35 FPS: 19.047619047619047%
> > Percentage under 30 FPS: 7.142857142857142%
> > Percentage under 25 FPS: 4.761904761904762%
> 
> Tested-by: Lyude Paul <lyude@redhat.com>
> ---
>  drivers/gpu/drm/i915/i915_irq.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 92bb32ed27fb..7c7e84e86c6a 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1288,6 +1288,18 @@ static void gen6_pm_rps_work(struct work_struct *work)
>  
>  	rps->last_adj = adj;
>  
> +	/*
> +	 * Limit deboosting and boosting to keep ourselves at the extremes
> +	 * when in the respective power modes (i.e. slowly decrease frequencies
> +	 * while in the HIGH_POWER zone and slowly increase frequencies while
> +	 * in the LOW_POWER zone). On idle, we will hit the timeout and drop
> +	 * to the next level quickly, and conversely if busy we expect to
> +	 * hit a waitboost and rapidly switch into max power.
> +	 */
> +	if ((adj < 0 && rps->power.mode == HIGH_POWER) ||
> +	    (adj > 0 && rps->power.mode == LOW_POWER))
> +		rps->last_adj = 0;
> +
>  	/* sysfs frequency interfaces may have snuck in while servicing the
>  	 * interrupt
>  	 */
Chris Wilson Feb. 20, 2019, 12:05 p.m. UTC | #2
Quoting Lyude Paul (2019-02-19 21:00:08)
> Should this maybe be CC'd for stable for v4.20? If so I've already got a
> working port of this patch that I can send to you (I've been running it on my
> laptop for a while now, seems to work fine)

I wouldn't say no (I am still wondering if we can do better than hitting
waitboost and a slow backoff that just happens to be giving high
frequencies until we dip too low and waitboost again, but that's future
work). So if we can get this in, you can send your patch to GregKH for
4.20.
-Chris
Mika Kuoppala Feb. 20, 2019, 3:14 p.m. UTC | #3
Chris Wilson <chris@chris-wilson.co.uk> writes:

> Limit deboosting and boosting to keep ourselves at the extremes
> when in the respective power modes (i.e. slowly decrease frequencies
> while in the HIGH_POWER zone and slowly increase frequencies while
> in the LOW_POWER zone). On idle, we will hit the timeout and drop
> to the next level quickly, and conversely if busy we expect to
> hit a waitboost and rapidly switch into max power.
>
> This should improve the UX experience by keeping the GPU clocks higher
> than they ostensibly should be (based on simple busyness) by switching
> into the INTERACTIVE mode (due to waiting for pageflips) and increasing
> clocks via waitboosting. This will incur some additional power, our
> saving grace should be rc6 and powergating to keep the extra current
> draw in check.
>
> Food for future thought would be deadline scheduling? If we know certain
> contexts (high priority compositors) absolutely must hit the next vblank
> then we can raise the frequencies ahead of time. Part of this is covered
> by per-context frequencies, where userspace is given control over the
> frequency range they want the GPU to execute at (for largely the same
> problem as this, where the workload is very latency sensitive but at the
> EI level appears mostly idle). Indeed, the per-context series does
> extend the modeset boosting to include a frequency range tweak which
> seems applicable to solving this jittery UX behaviour.
>
> Reported-by: Lyude Paul <lyude@redhat.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109408
> References: 0d55babc8392 ("drm/i915: Drop stray clearing of rps->last_adj")
> References: 60548c554be2 ("drm/i915: Interactive RPS mode")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Lyude Paul <lyude@redhat.com>
> Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
>
> Quoting Lyude Paul:
>> Before reverting 0d55babc8392754352f1058866dd4182ae587d11: [4.20]
>>
>> 35 measurements [of gnome-shell animations]
>> Average: 33.65657142857143 FPS
>> FPS observed: 20.8 - 46.87 FPS
>> Percentage under 60 FPS: 100.0%
>> Percentage under 55 FPS: 100.0%
>> Percentage under 50 FPS: 100.0%
>> Percentage under 45 FPS: 97.14285714285714%
>> Percentage under 40 FPS: 97.14285714285714%
>> Percentage under 35 FPS: 45.714285714285715%
>> Percentage under 30 FPS: 11.428571428571429%
>> Percentage under 25 FPS: 2.857142857142857%
>>
>> After reverting: [4.19 behaviour]
>>
>> 30 measurements
>> Average: 49.833666666666666 FPS
>> FPS observed: 33.85 - 60.0 FPS
>> Percentage under 60 FPS: 86.66666666666667%
>> Percentage under 55 FPS: 70.0%
>> Percentage under 50 FPS: 53.333333333333336%
>> Percentage under 45 FPS: 20.0%
>> Percentage under 40 FPS: 6.666666666666667%
>> Percentage under 35 FPS: 6.666666666666667%
>> Percentage under 30 FPS: 0%
>> Percentage under 25 FPS: 0%
>>
>> Patched:
>> 42 measurements
>> Average: 46.05428571428571 FPS
>> FPS observed: 1.82 - 59.98 FPS
>> Percentage under 60 FPS: 88.09523809523809%
>> Percentage under 55 FPS: 61.904761904761905%
>> Percentage under 50 FPS: 45.23809523809524%
>> Percentage under 45 FPS: 35.714285714285715%
>> Percentage under 40 FPS: 33.33333333333333%
>> Percentage under 35 FPS: 19.047619047619047%
>> Percentage under 30 FPS: 7.142857142857142%
>> Percentage under 25 FPS: 4.761904761904762%
>
> Tested-by: Lyude Paul <lyude@redhat.com>

It does what it says on the tin,
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/i915_irq.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 92bb32ed27fb..7c7e84e86c6a 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1288,6 +1288,18 @@ static void gen6_pm_rps_work(struct work_struct *work)
>  
>  	rps->last_adj = adj;
>  
> +	/*
> +	 * Limit deboosting and boosting to keep ourselves at the extremes
> +	 * when in the respective power modes (i.e. slowly decrease frequencies
> +	 * while in the HIGH_POWER zone and slowly increase frequencies while
> +	 * in the LOW_POWER zone). On idle, we will hit the timeout and drop
> +	 * to the next level quickly, and conversely if busy we expect to
> +	 * hit a waitboost and rapidly switch into max power.
> +	 */
> +	if ((adj < 0 && rps->power.mode == HIGH_POWER) ||
> +	    (adj > 0 && rps->power.mode == LOW_POWER))
> +		rps->last_adj = 0;
> +
>  	/* sysfs frequency interfaces may have snuck in while servicing the
>  	 * interrupt
>  	 */
> -- 
> 2.20.1
Patch

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 92bb32ed27fb..7c7e84e86c6a 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1288,6 +1288,18 @@  static void gen6_pm_rps_work(struct work_struct *work)
 
 	rps->last_adj = adj;
 
+	/*
+	 * Limit deboosting and boosting to keep ourselves at the extremes
+	 * when in the respective power modes (i.e. slowly decrease frequencies
+	 * while in the HIGH_POWER zone and slowly increase frequencies while
+	 * in the LOW_POWER zone). On idle, we will hit the timeout and drop
+	 * to the next level quickly, and conversely if busy we expect to
+	 * hit a waitboost and rapidly switch into max power.
+	 */
+	if ((adj < 0 && rps->power.mode == HIGH_POWER) ||
+	    (adj > 0 && rps->power.mode == LOW_POWER))
+		rps->last_adj = 0;
+
 	/* sysfs frequency interfaces may have snuck in while servicing the
 	 * interrupt
 	 */
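
To make the effect of the added check easier to see, here is a small
standalone model in plain C. It is only a sketch, not driver code. It
assumes that the surrounding worker doubles the adjustment when the next
event goes in the same direction as rps->last_adj and restarts at +/-1
otherwise; that step rule is not shown in this patch. The helpers
next_adj(), clamp_last_adj() and run_events(), the enum rps_event and the
BETWEEN mode are hypothetical names for illustration, and frequency
limits are ignored.

```c
#include <assert.h>

/* HIGH_POWER and LOW_POWER come from the patch; BETWEEN is assumed. */
enum power_mode { LOW_POWER, BETWEEN, HIGH_POWER };

/* Hypothetical event type standing in for the up/down threshold interrupts. */
enum rps_event { EVENT_UP, EVENT_DOWN };

/*
 * Assumed step rule: an event in the same direction as the previous
 * adjustment doubles the step, otherwise the step restarts at +/-1.
 */
static int next_adj(int last_adj, enum rps_event ev)
{
	if (ev == EVENT_UP)
		return last_adj > 0 ? last_adj * 2 : 1;
	return last_adj < 0 ? last_adj * 2 : -1;
}

/*
 * The check added by the patch: forget the previous step when it moves
 * away from the extreme that the current power mode favours.
 */
static int clamp_last_adj(int adj, enum power_mode mode)
{
	if ((adj < 0 && mode == HIGH_POWER) ||
	    (adj > 0 && mode == LOW_POWER))
		return 0;
	return adj;
}

/*
 * Feed n identical events through the model and return the total
 * frequency change in steps. 'patched' selects whether the check from
 * the patch is applied to last_adj.
 */
static int run_events(enum rps_event ev, enum power_mode mode,
		      int n, int patched)
{
	int last_adj = 0, total = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		int adj = next_adj(last_adj, ev);

		total += adj;
		last_adj = patched ? clamp_last_adj(adj, mode) : adj;
	}
	return total;
}
```

Under these assumptions, four consecutive down events in HIGH_POWER lower
the frequency by 1 + 2 + 4 + 8 = 15 steps without the check
(run_events(EVENT_DOWN, HIGH_POWER, 4, 0)) and by only 4 steps with it
(run_events(EVENT_DOWN, HIGH_POWER, 4, 1)). Up events in HIGH_POWER still
accelerate as before, which is the asymmetry the commit message describes.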