arm64: Select ARCH_HAS_FAST_MULTIPLIER

Message ID

877b532d8d240c1d9e9db923c84b924443a218ed.1524583390.git.robin.murphy@arm.com (mailing list archive)

State

New, archived

Headers

From: Robin Murphy <robin.murphy@arm.com>
To: will.deacon@arm.com,
	catalin.marinas@arm.com
Subject: [PATCH] arm64: Select ARCH_HAS_FAST_MULTIPLIER
Date: Tue, 24 Apr 2018 16:25:47 +0100
Message-Id: <877b532d8d240c1d9e9db923c84b924443a218ed.1524583390.git.robin.murphy@arm.com>
Precedence: list
Cc: linux-arm-kernel@lists.infradead.org
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org>
Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

Commit Message

Robin Murphy April 24, 2018, 3:25 p.m. UTC

It is probably safe to assume that all Armv8-A implementations have a
multiplier whose efficiency is comparable to or better than a sequence of
three or so register-dependent arithmetic instructions. Select
ARCH_HAS_FAST_MULTIPLIER to get ever-so-slightly nicer codegen in the
few dusty old corners which care.

In a contrived benchmark calling hweight64() in a loop, this does indeed
turn out to be a small win overall, with no measurable impact on
Cortex-A57 but about 5% performance improvement on Cortex-A53.
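
For readers unfamiliar with what the option changes, the following is a
minimal userspace sketch of the two code paths a software hweight64() can
take. It is modeled on the generic bit-counting fallback, not copied from
the kernel tree; the function names hweight64_mul() and hweight64_shift()
are hypothetical and exist only for this illustration. Both variants share
the same first three reduction steps and differ only in how the eight
per-byte counts are summed: one multiply, or three dependent shift-and-add
steps.

```c
#include <stdint.h>

/* Reduce w so that each byte holds the popcount of that byte (0..8). */
static uint64_t bytewise_popcount(uint64_t w)
{
	w -= (w >> 1) & UINT64_C(0x5555555555555555);
	w = (w & UINT64_C(0x3333333333333333)) +
	    ((w >> 2) & UINT64_C(0x3333333333333333));
	return (w + (w >> 4)) & UINT64_C(0x0f0f0f0f0f0f0f0f);
}

/*
 * Hypothetical "fast multiplier" variant: a single multiply sums all
 * eight byte counts into the top byte of the product.
 */
unsigned int hweight64_mul(uint64_t w)
{
	w = bytewise_popcount(w);
	return (unsigned int)((w * UINT64_C(0x0101010101010101)) >> 56);
}

/*
 * Hypothetical "slow multiplier" variant: the same sum built from
 * three register-dependent shift-and-add steps.
 */
unsigned int hweight64_shift(uint64_t w)
{
	w = bytewise_popcount(w);
	w += w >> 8;
	w += w >> 16;
	w += w >> 32;
	return (unsigned int)(w & 0xff);
}
```

Both functions return the same value for every input; selecting
ARCH_HAS_FAST_MULTIPLIER only decides which form gets compiled.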

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---

Apropos of stumbling across this option whilst digging down into some
bitmap-juggling code...

 arch/arm64/Kconfig | 1 +
 1 file changed, 1 insertion(+)

Comments

Will Deacon April 25, 2018, 1:41 p.m. UTC | #1

On Tue, Apr 24, 2018 at 04:25:47PM +0100, Robin Murphy wrote:
> It is probably safe to assume that all Armv8-A implementations have a
> multiplier whose efficiency is comparable to or better than a sequence of
> three or so register-dependent arithmetic instructions. Select
> ARCH_HAS_FAST_MULTIPLIER to get ever-so-slightly nicer codegen in the
> few dusty old corners which care.
> 
> In a contrived benchmark calling hweight64() in a loop, this does indeed
> turn out to be a small win overall, with no measurable impact on
> Cortex-A57 but about 5% performance improvement on Cortex-A53.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---

Acked-by: Will Deacon <will.deacon@arm.com>

Will

Catalin Marinas May 16, 2018, 10:51 a.m. UTC | #2

On Tue, Apr 24, 2018 at 04:25:47PM +0100, Robin Murphy wrote:
> It is probably safe to assume that all Armv8-A implementations have a
> multiplier whose efficiency is comparable to or better than a sequence of
> three or so register-dependent arithmetic instructions. Select
> ARCH_HAS_FAST_MULTIPLIER to get ever-so-slightly nicer codegen in the
> few dusty old corners which care.
> 
> In a contrived benchmark calling hweight64() in a loop, this does indeed
> turn out to be a small win overall, with no measurable impact on
> Cortex-A57 but about 5% performance improvement on Cortex-A53.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

Queued for 4.18. Thanks.

Patch

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index eb2cf4938f6d..9c850f3b398f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -12,6 +12,7 @@ config ARM64
 	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
 	select ARCH_HAS_ELF_RANDOMIZE
+	select ARCH_HAS_FAST_MULTIPLIER
 	select ARCH_HAS_FORTIFY_SOURCE
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select ARCH_HAS_GIGANTIC_PAGE if (MEMORY_ISOLATION && COMPACTION) || CMA
