diff mbox series

[v2] powerpc/cpuidle: Set CPUIDLE_FLAG_POLLING for snooze state

Message ID 20221114145611.37669-1-aboorvad@linux.vnet.ibm.com (mailing list archive)
State Handled Elsewhere, archived
Headers show
Series [v2] powerpc/cpuidle: Set CPUIDLE_FLAG_POLLING for snooze state | expand

Commit Message

Aboorva Devarajan Nov. 14, 2022, 2:56 p.m. UTC
During the comparative study of cpuidle governors, it is noticed that the
menu governor does not select CEDE state in some scenarios even though when
the sleep duration of the CPU exceeds the target residency of the CEDE idle
state this is because the CPU exits the snooze "polling" state when snooze
time limit is reached in the snooze_loop(), which is not a real wake up
and it just means that the polling state selection was not adequate.

cpuidle governors rely on CPUIDLE_FLAG_POLLING flag to be set for the
polling states to handle the condition mentioned above.

Hence, set the CPUIDLE_FLAG_POLLING flag for snooze state (polling state)
in powerpc arch to make the cpuidle governor work as expected.

Reference Commits:

- Timeout enabled for snooze state:
  commit 78eaa10f027c
  ("cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state")

- commit dc2251bf98c6
  ("cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol")

- Fix wakeup stats in governor for polling states
  commit 5f26bdceb9c0
  ("cpuidle: menu: Fix wakeup statistics updates for polling state")

Signed-off-by: Aboorva Devarajan <aboorvad@linux.vnet.ibm.com>
---

Changelog: (v1 -> v2)

Added CPUIDLE_POLLING_FLAG to the correct cpuidle_state struct.

Previous version of the patch is stale which was sent by mistake, this 
is the correct version which is tested on powernv, pseries (shared and 
dedicated partitions)

 drivers/cpuidle/cpuidle-powernv.c | 5 ++++-
 drivers/cpuidle/cpuidle-pseries.c | 8 ++++++--
 2 files changed, 10 insertions(+), 3 deletions(-)

Comments

Vishal Chourasia Nov. 19, 2022, 8:49 a.m. UTC | #1
On Mon, Nov 14, 2022 at 08:26:11PM +0530, Aboorva Devarajan wrote:
> During the comparative study of cpuidle governors, it is noticed that the
> menu governor does not select CEDE state in some scenarios even though when
> the sleep duration of the CPU exceeds the target residency of the CEDE idle
> state this is because the CPU exits the snooze "polling" state when snooze
> time limit is reached in the snooze_loop(), which is not a real wake up
> and it just means that the polling state selection was not adequate.
> 
> cpuidle governors rely on CPUIDLE_FLAG_POLLING flag to be set for the
> polling states to handle the condition mentioned above.
> 
> Hence, set the CPUIDLE_FLAG_POLLING flag for snooze state (polling state)
> in powerpc arch to make the cpuidle governor work as expected.
> 
> Reference Commits:
> 
> - Timeout enabled for snooze state:
>   commit 78eaa10f027c
>   ("cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state")
> 
> - commit dc2251bf98c6
>   ("cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol")
> 
> - Fix wakeup stats in governor for polling states
>   commit 5f26bdceb9c0
>   ("cpuidle: menu: Fix wakeup statistics updates for polling state")
> 
> Signed-off-by: Aboorva Devarajan <aboorvad@linux.vnet.ibm.com>
> ---
> 
> Changelog: (v1 -> v2)
> 
> Added CPUIDLE_POLLING_FLAG to the correct cpuidle_state struct.
> 
> Previous version of the patch is stale which was sent by mistake, this 
> is the correct version which is tested on powernv, pseries (shared and 
> dedicated partitions)
> 
>  drivers/cpuidle/cpuidle-powernv.c | 5 ++++-
>  drivers/cpuidle/cpuidle-pseries.c | 8 ++++++--
>  2 files changed, 10 insertions(+), 3 deletions(-)

Thanks for the patch.
Tested it on top of v6.0-rc4
Against workload: https://github.com/gautshen/misc/tree/master/cpuidle-smt-performance

Reviewed-by: Vishal Chourasia <vishalc@linux.vnet.ibm.com>
Tested-by: Vishal Chourasia <vishalc@linux.vnet.ibm.com <mailto:vishalc@linux.vnet.ibm.com>>

|----------------+--------+-----------------------+----------------------|
| wake up period | state  | % time spent (before) | % time spent (after) |
|----------------+--------+-----------------------+----------------------|
| 110 us         | snooze | 95.40 %               | 1.17 %               |
|                | CEDE   | 0.03 %                | 92.67 %              |
|----------------+--------+-----------------------+----------------------|
| 120 us         | snooze | 96.37 %               | 1.18 %               |
|                | CEDE   | 0.05 %                | 94.57 %              |
|----------------+--------+-----------------------+----------------------|
| 130 us         | snooze | 17.12 %               | 1.21 %               |
|                | CEDE   | 78.16 %               | 94.71 %              |
|----------------+--------+-----------------------+----------------------|
| 230 us         | snooze | 95.38 %               | 0.64 %               |
|                | CEDE   | 2.55 %                | 97.06 %              |
|----------------+--------+-----------------------+----------------------|
| 240 us         | snooze | 96.86 %               | 0.62 %               |
|                | CEDE   | 1.14 %                | 97.17 %              |
|----------------+--------+-----------------------+----------------------|
| 250 us         | snooze | 1.38 %                | 0.59 %               |
|                | CEDE   | 96.46 %               | 97.28 %              |
|----------------+--------+-----------------------+----------------------|
| 350 us         | snooze | 62.91 %               | 0.42 %               |
|                | CEDE   | 35.56 %               | 98.04 %              |
|----------------+--------+-----------------------+----------------------|
| 360 us         | snooze | 11.93 %               | 0.34 %               |
|                | CEDE   | 86.56 %               | 98.18 %              |
|----------------+--------+-----------------------+----------------------|
| 370 us         | snooze | 6.21 %                | 0.40 %               |
|                | CEDE   | 92.31 %               | 98.16 %              |
|----------------+--------+-----------------------+----------------------|
| 470 us         | snooze | 42.06 %               | 0.31 %               |
|                | CEDE   | 56.74 %               | 98.54 %              |
|----------------+--------+-----------------------+----------------------|
| 480 us         | snooze | 64.67 %               | 0.30 %               |
|                | CEDE   | 34.14 %               | 98.56 %              |
|----------------+--------+-----------------------+----------------------|
| 490 us         | snooze | 0.57 %                | 0.30 %               |
|                | CEDE   | 98.31 %               | 98.60 %              |
|----------------+--------+-----------------------+----------------------|

Note: *before* and *after* (see table heading), simply mean before applying the
patch and after applying the patch

-- vishal.c
Vaidyanathan Srinivasan Dec. 2, 2022, 11:17 a.m. UTC | #2
* Vishal Chourasia <vishalc@linux.vnet.ibm.com> [2022-11-19 14:19:52]:

> On Mon, Nov 14, 2022 at 08:26:11PM +0530, Aboorva Devarajan wrote:
> > During the comparative study of cpuidle governors, it is noticed that the
> > menu governor does not select CEDE state in some scenarios even though when
> > the sleep duration of the CPU exceeds the target residency of the CEDE idle
> > state this is because the CPU exits the snooze "polling" state when snooze
> > time limit is reached in the snooze_loop(), which is not a real wake up
> > and it just means that the polling state selection was not adequate.
> > 
> > cpuidle governors rely on CPUIDLE_FLAG_POLLING flag to be set for the
> > polling states to handle the condition mentioned above.
> > 
> > Hence, set the CPUIDLE_FLAG_POLLING flag for snooze state (polling state)
> > in powerpc arch to make the cpuidle governor work as expected.
> > 
> > Reference Commits:
> > 
> > - Timeout enabled for snooze state:
> >   commit 78eaa10f027c
> >   ("cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state")
> > 
> > - commit dc2251bf98c6
> >   ("cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol")
> > 
> > - Fix wakeup stats in governor for polling states
> >   commit 5f26bdceb9c0
> >   ("cpuidle: menu: Fix wakeup statistics updates for polling state")
> > 
> > Signed-off-by: Aboorva Devarajan <aboorvad@linux.vnet.ibm.com>
> > ---
> > 
> > Changelog: (v1 -> v2)
> > 
> > Added CPUIDLE_POLLING_FLAG to the correct cpuidle_state struct.
> > 
> > Previous version of the patch is stale which was sent by mistake, this 
> > is the correct version which is tested on powernv, pseries (shared and 
> > dedicated partitions)
> > 
> >  drivers/cpuidle/cpuidle-powernv.c | 5 ++++-
> >  drivers/cpuidle/cpuidle-pseries.c | 8 ++++++--
> >  2 files changed, 10 insertions(+), 3 deletions(-)

Hi Aboorva,

Thanks for the patch. This fixes the unpredictable idle state
selection issue under differ idle interval patterns.

Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>

--Vaidy
Michael Ellerman Dec. 8, 2022, 12:40 p.m. UTC | #3
On Mon, 14 Nov 2022 20:26:11 +0530, Aboorva Devarajan wrote:
> During the comparative study of cpuidle governors, it is noticed that the
> menu governor does not select CEDE state in some scenarios even though when
> the sleep duration of the CPU exceeds the target residency of the CEDE idle
> state this is because the CPU exits the snooze "polling" state when snooze
> time limit is reached in the snooze_loop(), which is not a real wake up
> and it just means that the polling state selection was not adequate.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/cpuidle: Set CPUIDLE_FLAG_POLLING for snooze state
      https://git.kernel.org/powerpc/c/5ddcc03a07ae1ab5062f89a946d9495f1fd8eaa4

cheers
diff mbox series

Patch

diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index 0b5461b3d7dd..9ebedd972df0 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -76,6 +76,7 @@  static int snooze_loop(struct cpuidle_device *dev,
 	local_irq_enable();
 
 	snooze_exit_time = get_tb() + get_snooze_timeout(dev, drv, index);
+	dev->poll_time_limit = false;
 	ppc64_runlatch_off();
 	HMT_very_low();
 	while (!need_resched()) {
@@ -86,6 +87,7 @@  static int snooze_loop(struct cpuidle_device *dev,
 			 * cleared to order subsequent test of need_resched().
 			 */
 			clear_thread_flag(TIF_POLLING_NRFLAG);
+			dev->poll_time_limit = true;
 			smp_mb();
 			break;
 		}
@@ -155,7 +157,8 @@  static struct cpuidle_state powernv_states[CPUIDLE_STATE_MAX] = {
 		.desc = "snooze",
 		.exit_latency = 0,
 		.target_residency = 0,
-		.enter = snooze_loop },
+		.enter = snooze_loop,
+		.flags = CPUIDLE_FLAG_POLLING },
 };
 
 static int powernv_cpuidle_cpu_online(unsigned int cpu)
diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index 7e7ab5597d7a..1bad4d2b7be3 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -44,6 +44,7 @@  static int snooze_loop(struct cpuidle_device *dev,
 	pseries_idle_prolog();
 	local_irq_enable();
 	snooze_exit_time = get_tb() + snooze_timeout;
+	dev->poll_time_limit = false;
 
 	while (!need_resched()) {
 		HMT_low();
@@ -54,6 +55,7 @@  static int snooze_loop(struct cpuidle_device *dev,
 			 * loop anyway. Require a barrier after polling is
 			 * cleared to order subsequent test of need_resched().
 			 */
+			dev->poll_time_limit = true;
 			clear_thread_flag(TIF_POLLING_NRFLAG);
 			smp_mb();
 			break;
@@ -268,7 +270,8 @@  static struct cpuidle_state dedicated_states[NR_DEDICATED_STATES] = {
 		.desc = "snooze",
 		.exit_latency = 0,
 		.target_residency = 0,
-		.enter = &snooze_loop },
+		.enter = &snooze_loop,
+		.flags = CPUIDLE_FLAG_POLLING },
 	{ /* CEDE */
 		.name = "CEDE",
 		.desc = "CEDE",
@@ -286,7 +289,8 @@  static struct cpuidle_state shared_states[] = {
 		.desc = "snooze",
 		.exit_latency = 0,
 		.target_residency = 0,
-		.enter = &snooze_loop },
+		.enter = &snooze_loop,
+		.flags = CPUIDLE_FLAG_POLLING },
 	{ /* Shared Cede */
 		.name = "Shared Cede",
 		.desc = "Shared Cede",