[v1,0/5] Improvement stopping tick decision making in 'menu' idle governor

Message ID 1534090171-14464-1-git-send-email-leo.yan@linaro.org (mailing list archive)

Leo Yan Aug. 12, 2018, 4:09 p.m. UTC
We found that CPUs cannot stay in the deepest idle state as expected
when running synthetic workloads with the mainline kernel on an Arm
platform (96boards Hikey620 with eight Cortex-A53 CPUs).

The main issue is the criterion for deciding whether to stop the tick.
Currently the governor checks whether the expected interval is less than
TICK_USEC, but this ignores that the delta to the next tick varies,
because the CPU enters and exits idle states at arbitrary times.
Furthermore, sticking to TICK_USEC as the boundary for the decision
leaves a hole in which a shallow state is selected with the tick
stopped, so the CPU stays in the shallow state for a long time.

This patch series explores a more reasonable way to decide whether to
stop the tick; the most important point is to avoid the powernightmares
issue once these criteria are applied.  Patches 0001 ~ 0003 refactor the
variables and structures for more readable code and provide a function,
menu_decide_stopping_tick(), that encapsulates the decision logic.  The
last two patches are the primary improvements: patch 0004, 'cpuidle:
menu: Don't stay in shallow state for a long time', introduces a new,
stricter criterion for not stopping the tick in shallow-state cases;
patch 0005 uses the dynamic tick delta instead of the static value
TICK_USEC to decide whether the tick expires before or after the
predicted wakeup, and from this comparison we can conclude whether the
tick needs to be stopped.
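
To make the difference between the two criteria concrete, here is a
minimal sketch.  It is not the code from this series: the helper names
stop_tick_static() and stop_tick_dynamic(), their parameters, and the
TICK_USEC value of 4000 (which assumes HZ=250) are hypothetical and only
mirror the description above.

```c
#include <stdbool.h>

/* Assumption: HZ=250, so one full tick period is 4000 us. */
#define TICK_USEC 4000u

/*
 * Static criterion as described above: keep the tick whenever the
 * expected idle interval is shorter than one full tick period.
 */
static bool stop_tick_static(unsigned int predicted_us)
{
	return predicted_us >= TICK_USEC;
}

/*
 * Hypothetical dynamic criterion: compare the prediction with the time
 * left until the next tick, and never stop the tick for a shallow state
 * so the CPU gets a chance to re-evaluate the state later.
 */
static bool stop_tick_dynamic(unsigned int predicted_us,
			      unsigned int delta_next_tick_us,
			      bool state_is_shallow)
{
	/* Wakeup is expected before the tick fires: the tick is harmless. */
	if (predicted_us < delta_next_tick_us)
		return false;

	/* The tick would fire first, but a shallow state must keep it. */
	if (state_is_shallow)
		return false;

	/* Deep state and the tick would cut the idle period short. */
	return true;
}
```

For example, with a predicted idle time of 3000 us and the next tick due
in 1000 us, stop_tick_static(3000) keeps the tick and the CPU is woken
after 1000 us, while stop_tick_dynamic(3000, 1000, false) stops it.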

With a more accurate decision on stopping the tick, one immediate
benefit is that the CPUs have more chances to stay in the deepest state;
it also avoids running the tick unnecessarily and thus avoids the
shallower state introduced by a tick event.  The results in the table
below show the improvement.  We ran a workload generated by rt-app (a
single task with a 5ms period and a duty cycle of
1%/3%/5%/10%/20%/30%/40%); the total running time is 60s.  We collected
the duration of every idle state across all CPUs; the unit is seconds
(s).  For duty cycles 1%/3%/5%/10%/20%, the shallow state C0/C1
durations are reduced and the time has moved to the deepest state, so
the C2 duration improves by ~9s to ~21s.  For duty cycles 30%/40%, the
deepest state duration is on par with and without the patch series, but
there is a minor improvement in the C1 duration, gained from the C0
duration.
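
As a rough aid for reading the test cases, the sketch below works out
the per-period run time and idle gap for a 5ms task at a given duty
cycle.  The helper names are hypothetical, and the model is idealized:
it ignores scheduling overhead and idle entry/exit latency.

```c
/* Task period used in all test cases: 5 ms. */
#define PERIOD_US 5000u

/* Time the task runs in each period for a duty cycle in percent. */
static unsigned int run_us(unsigned int duty_pct)
{
	return PERIOD_US * duty_pct / 100u;
}

/* Idle gap left in each period after the task has run. */
static unsigned int idle_us(unsigned int duty_pct)
{
	return PERIOD_US - run_us(duty_pct);
}

/* True if the idle gap is at least as long as the given tick period. */
static int idle_spans_tick(unsigned int duty_pct, unsigned int tick_us)
{
	return idle_us(duty_pct) >= tick_us;
}
```

With a tick period of 4000 us (an assumption, corresponding to HZ=250),
duty cycles of 1% to 20% leave an idle gap of 4950 us down to 4000 us
per period, while 30% and 40% leave 3500 us and 3000 us.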

Notation used in the table:

state: C0: WFI; C1: CPU OFF; C2: Cluster OFF

All test cases run a single task with a 5ms period:

                Without patches           With patches                 Difference
            -----------------------  -----------------------   --------------------------
Duty cycle    C0     C1       C2       C0      C1      C2        C0        C1        C2
  1%        2.397  16.528  471.905   0.916    2.688  487.328   -1.481   -13.840   +15.422
  3%        3.957  20.541  464.434   1.510    2.398  485.914   -2.447   -18.143   +21.480
  5%        2.866   8.609  474.777   1.166    2.250  483.983   -1.699    -6.359    +9.205
 10%        2.893  28.753  453.277   1.147   14.134  469.190   -1.745   -14.618   +15.913
 20%        7.620  41.086  431.735   1.595   35.055  442.482   -6.024    -6.030   +10.747
 30%        4.394  38.328  431.442   1.964   40.857  430.973   -2.430    +2.529    -0.468
 40%        7.390  29.415  430.914   1.789   34.832  431.588   -5.600    +5.417    -0.673


P.S. For the testing, we applied Rafael's patch 'cpuidle: menu: Handle
stopped tick more aggressively' [1] to avoid selecting an unexpectedly
shallow state after the tick has been stopped.

[1] https://lkml.org/lkml/2018/8/10/259

Leo Yan (5):
  cpuidle: menu: Clean up variables usage in menu_select()
  cpuidle: menu: Record tick delta value in struct menu_device
  cpuidle: menu: Provide menu_decide_stopping_tick()
  cpuidle: menu: Don't stay in shallow state for a long time
  cpuidle: menu: Change to compare prediction with tick delta

 drivers/cpuidle/governors/menu.c | 104 ++++++++++++++++++++++++++++-----------
 1 file changed, 76 insertions(+), 28 deletions(-)

Rafael J. Wysocki Aug. 21, 2018, 8:37 a.m. UTC | #1
On Sunday, August 12, 2018 6:09:26 PM CEST Leo Yan wrote:
> [...]

Overall, I don't like this series, sorry about that.

The majority of changes in it are code reorganization, quite questionable
in a couple of cases, and a similar goal can be achieved with a very simple
patch that I'm going to post shortly.

Thanks,
Rafael