From patchwork Mon Jul 10 09:30:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Artem Bityutskiy
X-Patchwork-Id: 13306577
From: Artem Bityutskiy
To: x86@kernel.org, "Rafael J. Wysocki"
Cc: Linux PM Mailing List, Arjan van de Ven, Artem Bityutskiy, Thomas Gleixner
Subject: [PATCH v4 1/4] x86/umwait: use 'IS_ENABLED()'
Date: Mon, 10 Jul 2023 12:30:57 +0300
Message-Id: <20230710093100.918337-2-dedekind1@gmail.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230710093100.918337-1-dedekind1@gmail.com>
References: <20230710093100.918337-1-dedekind1@gmail.com>
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org

From: Artem Bityutskiy

Both the kernel coding style and the x86 maintainers prefer 'IS_ENABLED()'
over '#ifdef' whenever possible. Switch '__tpause()' to 'IS_ENABLED()'.

Signed-off-by: Artem Bityutskiy
---
 arch/x86/include/asm/mwait.h | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 778df05f8539..03bef2bc28d4 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -130,15 +130,15 @@ static __always_inline void mwait_idle_with_hints(unsigned long eax, unsigned lo
 static inline void __tpause(u32 ecx, u32 edx, u32 eax)
 {
 	/* "tpause %ecx, %edx, %eax;" */
-	#ifdef CONFIG_AS_TPAUSE
-	asm volatile("tpause %%ecx\n"
-		     :
-		     : "c"(ecx), "d"(edx), "a"(eax));
-	#else
-	asm volatile(".byte 0x66, 0x0f, 0xae, 0xf1\t\n"
-		     :
-		     : "c"(ecx), "d"(edx), "a"(eax));
-	#endif
+	if (IS_ENABLED(CONFIG_AS_TPAUSE)) {
+		asm volatile("tpause %%ecx\n"
+			     :
+			     : "c"(ecx), "d"(edx), "a"(eax));
+	} else {
+		asm volatile(".byte 0x66, 0x0f, 0xae, 0xf1\t\n"
+			     :
+			     : "c"(ecx), "d"(edx), "a"(eax));
+	}
 }
 
 #endif /* _ASM_X86_MWAIT_H */

From patchwork Mon Jul 10 09:30:58 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Artem Bityutskiy
X-Patchwork-Id: 13306579
From: Artem Bityutskiy
To: x86@kernel.org, "Rafael J. Wysocki"
Cc: Linux PM Mailing List, Arjan van de Ven, Artem Bityutskiy, Thomas Gleixner
Subject: [PATCH v4 2/4] x86/mwait: Add support for idle via umwait
Date: Mon, 10 Jul 2023 12:30:58 +0300
Message-Id: <20230710093100.918337-3-dedekind1@gmail.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230710093100.918337-1-dedekind1@gmail.com>
References: <20230710093100.918337-1-dedekind1@gmail.com>
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org

From: Artem Bityutskiy

On Intel platforms, C-states are requested using the 'monitor/mwait'
instruction pair, as implemented in 'mwait_idle_with_hints()'. This mechanism
allows for entering C1 and deeper C-states.

Sapphire Rapids Xeon supports two new idle states, C0.1 and C0.2 (referred to
below as C0.x). These idle states have lower latency than C1 and can be
requested with either the 'tpause' or the 'umwait' instruction.

The Linux kernel already supports the 'tpause' instruction and uses it in
delay functions like 'udelay()'.

Add 'umwait' support by implementing the 'umwait_idle()' function. This
function is analogous to 'mwait_idle_with_hints()', but instead of requesting
a C-state with 'monitor/mwait', it requests C0.x with 'umonitor/umwait'.

Tested with both gcc/binutils and clang/llvm.

Signed-off-by: Artem Bityutskiy
---
 arch/x86/include/asm/mwait.h | 67 ++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 03bef2bc28d4..48210f4d7c77 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -141,4 +141,71 @@ static inline void __tpause(u32 ecx, u32 edx, u32 eax)
 	}
 }
 
+#ifdef CONFIG_X86_64
+/*
+ * Monitor a memory address at 'rcx' using the 'umonitor' instruction.
+ */
+static __always_inline void __umonitor(const void *rcx)
+{
+	/* "umonitor %rcx" */
+	if (IS_ENABLED(CONFIG_AS_TPAUSE)) {
+		asm volatile("umonitor %%rcx\n"
+			     :
+			     : "c"(rcx));
+	} else {
+		asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf1\t\n"
+			     :
+			     : "c"(rcx));
+	}
+}
+
+/*
+ * Same as '__tpause()', but uses the 'umwait' instruction. It is very
+ * similar to 'tpause', but also breaks out if the data at the address
+ * monitored with 'umonitor' is modified.
+ */
+static __always_inline void __umwait(u32 ecx, u32 edx, u32 eax)
+{
+	/* "umwait %ecx, %edx, %eax;" */
+	if (IS_ENABLED(CONFIG_AS_TPAUSE)) {
+		asm volatile("umwait %%ecx\n"
+			     :
+			     : "c"(ecx), "d"(edx), "a"(eax));
+	} else {
+		asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf1\t\n"
+			     :
+			     : "c"(ecx), "d"(edx), "a"(eax));
+	}
+}
+
+/*
+ * Enter C0.1 or C0.2 state and stay there until an event happens (an interrupt
+ * or 'need_resched()'), the explicit deadline is reached, or the implicit
+ * global limit is reached.
+ *
+ * The deadline is the absolute TSC value to exit the idle state at. If it
+ * exceeds the global limit in the 'IA32_UMWAIT_CONTROL' register, the global
+ * limit prevails, and the idle state is exited earlier than the deadline.
+ */
+static __always_inline void umwait_idle(u64 deadline, u32 state)
+{
+	if (!current_set_polling_and_test()) {
+		u32 eax, edx;
+
+		eax = lower_32_bits(deadline);
+		edx = upper_32_bits(deadline);
+
+		__umonitor(&current_thread_info()->flags);
+		if (!need_resched())
+			__umwait(state, edx, eax);
+	}
+	current_clr_polling();
+}
+#else /* CONFIG_X86_64 */
+static __always_inline void umwait_idle(u64 deadline, u32 state)
+{
+	WARN_ONCE(1, "umwait CPU instruction is not supported");
+}
+#endif /* !CONFIG_X86_64 */
+
 #endif /* _ASM_X86_MWAIT_H */

From patchwork Mon Jul 10 09:30:59 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Artem Bityutskiy
X-Patchwork-Id: 13306580
From: Artem Bityutskiy
To: x86@kernel.org, "Rafael J. Wysocki"
Cc: Linux PM Mailing List, Arjan van de Ven, Artem Bityutskiy, Thomas Gleixner
Subject: [PATCH v4 3/4] intel_idle: rename 'intel_idle_hlt_irq_on()'
Date: Mon, 10 Jul 2023 12:30:59 +0300
Message-Id: <20230710093100.918337-4-dedekind1@gmail.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230710093100.918337-1-dedekind1@gmail.com>
References: <20230710093100.918337-1-dedekind1@gmail.com>
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org

From: Artem Bityutskiy

Rename 'intel_idle_hlt_irq_on()' to 'intel_idle_hlt_irq()' for consistency
with 'intel_idle_irq()'.

While at it, fix the indentation in the 'intel_idle_hlt_irq()' declaration to
use tabs instead of spaces.

Signed-off-by: Artem Bityutskiy
---
 drivers/idle/intel_idle.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index b930036edbbe..0a835f97de72 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -224,8 +224,8 @@ static __cpuidle int intel_idle_hlt(struct cpuidle_device *dev,
 	return __intel_idle_hlt(dev, drv, index);
 }
 
-static __cpuidle int intel_idle_hlt_irq_on(struct cpuidle_device *dev,
-                                   struct cpuidle_driver *drv, int index)
+static __cpuidle int intel_idle_hlt_irq(struct cpuidle_device *dev,
+					struct cpuidle_driver *drv, int index)
 {
 	int ret;
 
@@ -1900,11 +1900,11 @@ static void state_update_enter_method(struct cpuidle_state *state, int cstate)
 	if (state->enter == intel_idle_hlt) {
 		if (force_irq_on) {
 			pr_info("forced intel_idle_irq for state %d\n", cstate);
-			state->enter = intel_idle_hlt_irq_on;
+			state->enter = intel_idle_hlt_irq;
 		}
 		return;
 	}
-	if (state->enter == intel_idle_hlt_irq_on)
+	if (state->enter == intel_idle_hlt_irq)
 		return; /* no update scenarios */
 
 	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE) {
@@ -1949,7 +1949,7 @@ static bool should_verify_mwait(struct cpuidle_state *state)
 {
 	if (state->enter == intel_idle_hlt)
 		return false;
-	if (state->enter == intel_idle_hlt_irq_on)
+	if (state->enter == intel_idle_hlt_irq)
 		return false;
 
 	return true;

From patchwork Mon Jul 10 09:31:00 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Artem Bityutskiy
X-Patchwork-Id: 13306581
From: Artem Bityutskiy
To: x86@kernel.org, "Rafael J. Wysocki"
Cc: Linux PM Mailing List, Arjan van de Ven, Artem Bityutskiy, Thomas Gleixner
Subject: [PATCH v4 4/4] intel_idle: add C0.2 state for Sapphire Rapids Xeon
Date: Mon, 10 Jul 2023 12:31:00 +0300
Message-Id: <20230710093100.918337-5-dedekind1@gmail.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230710093100.918337-1-dedekind1@gmail.com>
References: <20230710093100.918337-1-dedekind1@gmail.com>
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org

From: Artem Bityutskiy

Add Sapphire Rapids Xeon C0.2 state support. This state has a lower exit
latency than C1 and saves energy compared to POLL.

C0.2 may also improve performance (e.g., as measured by 'hackbench'), because
idle CPU power savings in C0.2 increase the power budget of busy CPUs and
therefore improve their turbo boost.

Suggested-by: Len Brown
Suggested-by: Arjan Van De Ven
Signed-off-by: Artem Bityutskiy
---
 drivers/idle/intel_idle.c | 44 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 0a835f97de72..eb2bcc7f8ea0 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -130,6 +130,11 @@ static unsigned int mwait_substates __initdata;
 #define flg2MWAIT(flags) (((flags) >> 24) & 0xFF)
 #define MWAIT2flg(eax) ((eax & 0xFF) << 24)
 
+/*
+ * The maximum possible 'umwait' deadline value.
+ */
+#define UMWAIT_MAX_DEADLINE	(~((u64)0))
+
 static __always_inline int __intel_idle(struct cpuidle_device *dev,
 					struct cpuidle_driver *drv, int index)
 {
@@ -263,6 +268,32 @@ static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
 	return 0;
 }
 
+/**
+ * intel_idle_umwait_irq - Request C0.x using the 'umwait' instruction.
+ * @dev: cpuidle device of the target CPU.
+ * @drv: cpuidle driver (assumed to point to intel_idle_driver).
+ * @index: Target idle state index.
+ *
+ * Request C0.1 or C0.2 using 'umwait' instruction with interrupts enabled.
+ */
+static __cpuidle int intel_idle_umwait_irq(struct cpuidle_device *dev,
+					   struct cpuidle_driver *drv,
+					   int index)
+{
+	u32 state = flg2MWAIT(drv->states[index].flags);
+
+	raw_local_irq_enable();
+	/*
+	 * Use the maximum possible deadline value. This means that 'C0.x'
+	 * residency will be limited by the global limit in
+	 * 'IA32_UMWAIT_CONTROL'.
+	 */
+	umwait_idle(UMWAIT_MAX_DEADLINE, state);
+	raw_local_irq_disable();
+
+	return index;
+}
+
 /*
  * States are indexed by the cstate number,
  * which is also the index into the MWAIT hint array.
@@ -1006,6 +1037,13 @@ static struct cpuidle_state adl_n_cstates[] __initdata = {
 };
 
 static struct cpuidle_state spr_cstates[] __initdata = {
+	{
+		.name = "C0.2",
+		.desc = "UMWAIT C0.2",
+		.flags = MWAIT2flg(TPAUSE_C02_STATE) | CPUIDLE_FLAG_IRQ_ENABLE,
+		.exit_latency_ns = 200,
+		.target_residency_ns = 200,
+		.enter = &intel_idle_umwait_irq, },
 	{
 		.name = "C1",
 		.desc = "MWAIT 0x00",
@@ -1904,7 +1942,9 @@ static void state_update_enter_method(struct cpuidle_state *state, int cstate)
 		}
 		return;
 	}
-	if (state->enter == intel_idle_hlt_irq)
+
+	if (state->enter == intel_idle_hlt_irq ||
+	    state->enter == intel_idle_umwait_irq)
 		return; /* no update scenarios */
 
 	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE) {
@@ -1951,6 +1991,8 @@ static bool should_verify_mwait(struct cpuidle_state *state)
 		return false;
 	if (state->enter == intel_idle_hlt_irq)
 		return false;
+	if (state->enter == intel_idle_umwait_irq)
+		return false;
 
 	return true;
 }
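
Note for readers of the series (not part of any patch): the sketch below is a
user-space illustration of how the C0.2 table entry and
'intel_idle_umwait_irq()' appear to fit together. The state hint is packed into
bits 31:24 of '.flags' with 'MWAIT2flg()', recovered with 'flg2MWAIT()', and
the 64-bit deadline is split into the EDX:EAX halves that 'umwait' takes. It
makes several assumptions: the 'TPAUSE_C01_STATE' and 'TPAUSE_C02_STATE' values
are quoted from memory of asm/mwait.h, 'SKETCH_FLAG_IRQ_ENABLE' is a
hypothetical stand-in for 'CPUIDLE_FLAG_IRQ_ENABLE', and
'sketch_umwait_regs()' and 'struct umwait_regs' are names invented here. It
does not execute 'umwait'.

```c
#include <stdint.h>

/* User-space copies of the flag helpers quoted in the intel_idle.c hunk. */
#define FLG2MWAIT(flags)	(((flags) >> 24) & 0xFF)
#define MWAIT2FLG(eax)		(((eax) & 0xFFu) << 24)

/* Assumption: hint values as recalled from asm/mwait.h, not from the series. */
#define TPAUSE_C01_STATE	1u
#define TPAUSE_C02_STATE	0u

/* Hypothetical stand-in for CPUIDLE_FLAG_IRQ_ENABLE; the real bit may differ. */
#define SKETCH_FLAG_IRQ_ENABLE	(1u << 14)

/* Same definition as in patch 4/4, with uint64_t in place of u64. */
#define UMWAIT_MAX_DEADLINE	(~((uint64_t)0))

/* Register operands that the kernel code passes to __umwait(). */
struct umwait_regs {
	uint32_t ecx;	/* requested C0.x state hint */
	uint32_t edx;	/* upper 32 bits of the TSC deadline */
	uint32_t eax;	/* lower 32 bits of the TSC deadline */
};

/*
 * Hypothetical helper: compute the operands for a given cpuidle state flags
 * word and absolute TSC deadline. Low flag bits (such as the IRQ-enable
 * stand-in) do not affect the extracted hint.
 */
static struct umwait_regs sketch_umwait_regs(uint32_t state_flags,
					     uint64_t deadline)
{
	struct umwait_regs r;

	r.ecx = FLG2MWAIT(state_flags);
	r.edx = (uint32_t)(deadline >> 32);
	r.eax = (uint32_t)(deadline & 0xFFFFFFFFu);
	return r;
}
```

Under these assumptions, calling 'sketch_umwait_regs()' with
'MWAIT2FLG(TPAUSE_C02_STATE) | SKETCH_FLAG_IRQ_ENABLE' and
'UMWAIT_MAX_DEADLINE' yields a zero hint and all-ones deadline halves. That
matches the comment in patch 4/4: the residency is then bounded only by the
global limit in 'IA32_UMWAIT_CONTROL'.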