From patchwork Thu Aug 17 08:39:20 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Julien Thierry
X-Patchwork-Id: 9905311
From: Julien Thierry
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2 2/2] arm64: use WFE for long delays
Date: Thu, 17 Aug 2017 09:39:20 +0100
Message-Id: <1502959160-30900-3-git-send-email-julien.thierry@arm.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1502959160-30900-1-git-send-email-julien.thierry@arm.com>
References: <1502959160-30900-1-git-send-email-julien.thierry@arm.com>
Cc: Mark Rutland, Catalin Marinas, Will Deacon, Arnd Bergmann, Julien Thierry

The current delay implementation uses the yield instruction, which is a
hint that it is
beneficial to schedule another thread. As this is a hint, it may be
implemented as a NOP, causing all delays to be busy loops. This is the
case for many existing CPUs.

Taking advantage of the generic timer sending periodic events to all
cores, we can use WFE during delays to reduce power consumption. This is
beneficial only for delays longer than the period of the timer event
stream.

If timer event stream is not enabled, delays will behave as yield/busy
loops.

Signed-off-by: Julien Thierry
Cc: Catalin Marinas
Cc: Will Deacon
Cc: Mark Rutland
Cc: Arnd Bergmann
---
 arch/arm64/lib/delay.c      | 25 ++++++++++++++++++++-----
 include/asm-generic/delay.h |  9 +++++++--
 2 files changed, 27 insertions(+), 7 deletions(-)

--
1.9.1

diff --git a/arch/arm64/lib/delay.c b/arch/arm64/lib/delay.c
index dad4ec9..ada7005 100644
--- a/arch/arm64/lib/delay.c
+++ b/arch/arm64/lib/delay.c
@@ -24,10 +24,28 @@
 #include
 #include
 
+#include
+
+#define USECS_TO_CYCLES(TIME_USECS)			\
+	(xloops_to_cycles(usecs_to_xloops(TIME_USECS)))
+
+static inline unsigned long xloops_to_cycles(unsigned long xloops)
+{
+	return (xloops * loops_per_jiffy * HZ) >> 32;
+}
+
 void __delay(unsigned long cycles)
 {
 	cycles_t start = get_cycles();
 
+	if (arch_timer_evtstrm_available()) {
+		const cycles_t timer_evt_period =
+			USECS_TO_CYCLES(1000000 / ARCH_TIMER_EVT_STREAM_FREQ);
+
+		while (get_cycles() - start + timer_evt_period < cycles)
+			wfe();
+	}
+
 	while ((get_cycles() - start) < cycles)
 		cpu_relax();
 }
@@ -35,16 +53,13 @@ void __delay(unsigned long cycles)
 
 inline void __const_udelay(unsigned long xloops)
 {
-	unsigned long loops;
-
-	loops = xloops * loops_per_jiffy * HZ;
-	__delay(loops >> 32);
+	__delay(xloops_to_cycles(xloops));
 }
 EXPORT_SYMBOL(__const_udelay);
 
 void __udelay(unsigned long usecs)
 {
-	__const_udelay(usecs * 0x10C7UL); /* 2**32 / 1000000 (rounded up) */
+	__const_udelay(usecs_to_xloops(usecs));
 }
 EXPORT_SYMBOL(__udelay);
 
diff --git a/include/asm-generic/delay.h b/include/asm-generic/delay.h
index 0f79054..1538e58 100644
--- a/include/asm-generic/delay.h
+++ b/include/asm-generic/delay.h
@@ -10,19 +10,24 @@
 extern void __const_udelay(unsigned long xloops);
 extern void __delay(unsigned long loops);
 
+/* 0x10c7 is 2**32 / 1000000 (rounded up) */
+static inline unsigned long usecs_to_xloops(unsigned long usecs)
+{
+	return usecs * 0x10C7UL;
+}
+
 /*
  * The weird n/20000 thing suppresses a "comparison is always false due to
  * limited range of data type" warning with non-const 8-bit arguments.
  */
 
-/* 0x10c7 is 2**32 / 1000000 (rounded up) */
 #define udelay(n)							\
 	({								\
 		if (__builtin_constant_p(n)) {				\
 			if ((n) / 20000 >= 1)				\
 				 __bad_udelay();			\
 			else						\
-				__const_udelay((n) * 0x10c7ul);		\
+				__const_udelay(usecs_to_xloops(n));	\
 		} else {						\
 			__udelay(n);					\
 		}							\
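
For anyone wanting to check the arithmetic outside the kernel, here is a
minimal userspace sketch of the conversions used above and of the point at
which the WFE loop hands over to the cpu_relax() loop. It is not part of
the patch. struct delay_params and wfe_threshold_cycles() are hypothetical
names, and the fields stand in for loops_per_jiffy, HZ and
ARCH_TIMER_EVT_STREAM_FREQ, whose real values depend on the platform and
kernel configuration.

```c
#include <assert.h>
#include <stdint.h>

/* 2**32 / 1000000, rounded up: the same constant the patch uses. */
#define XLOOPS_PER_USEC 0x10C7ULL

/*
 * Hypothetical stand-ins for loops_per_jiffy, HZ and
 * ARCH_TIMER_EVT_STREAM_FREQ. These are assumptions for illustration;
 * the kernel obtains them from calibration and configuration.
 */
struct delay_params {
	uint64_t loops_per_jiffy;
	uint64_t hz;
	uint64_t evt_stream_freq;	/* events per second */
};

/* Microseconds to 32.32 fixed-point "xloops", as in usecs_to_xloops(). */
static uint64_t usecs_to_xloops(uint64_t usecs)
{
	return usecs * XLOOPS_PER_USEC;
}

/* Same formula as xloops_to_cycles() in the patch, with explicit inputs. */
static uint64_t xloops_to_cycles(const struct delay_params *p, uint64_t xloops)
{
	return (xloops * p->loops_per_jiffy * p->hz) >> 32;
}

/*
 * Elapsed-cycle count below which the first loop keeps issuing wfe().
 * The loop condition "elapsed + period < cycles" is equivalent to
 * "elapsed < cycles - period", so a delay no longer than one event
 * period never enters the WFE loop (threshold 0).
 */
static uint64_t wfe_threshold_cycles(const struct delay_params *p,
				     uint64_t cycles)
{
	uint64_t period;

	assert(p->evt_stream_freq > 0);
	period = xloops_to_cycles(p,
			usecs_to_xloops(1000000 / p->evt_stream_freq));

	return cycles > period ? cycles - period : 0;
}
```

Taking an assumed 50 MHz counter (loops_per_jiffy = 200000, HZ = 250) and
a 10 kHz event stream, a 1000 us delay is 50000 cycles and one event
period is 5000 cycles, so the WFE loop runs until 45000 cycles have
elapsed and the busy loop covers the remainder. A 50 us delay (2500
cycles) is shorter than one period and is spent entirely in the
cpu_relax() loop, which matches the note in the commit message that WFE
only helps for delays longer than the event stream period.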