[2/4] x86: further speed-up to hweight{32,64}()

Series

bitops: hweight<N>() improvements

[0/4] bitops: hweight<N>() improvements
[1/4] bitops: speed up hweight<N>()
[2/4] x86: further speed-up to hweight{32,64}()
[RFC,3/4] Arm64: further speed-up to hweight{32, 64}()
[4/4] x86: use POPCNT for hweight<N>() when available

Message ID

5CF0F9770200007800233E04@prv1-mh.provo.novell.com (mailing list archive)

State

New, archived

Headers

Message-Id: <5CF0F9770200007800233E04@prv1-mh.provo.novell.com>
Date: Fri, 31 May 2019 03:52:55 -0600
From: "Jan Beulich" <JBeulich@suse.com>
To: "xen-devel" <xen-devel@lists.xenproject.org>
References: <5CF0F8530200007800233DE0@prv1-mh.provo.novell.com>
In-Reply-To: <5CF0F8530200007800233DE0@prv1-mh.provo.novell.com>
Mime-Version: 1.0
Content-Disposition: inline
Subject: [Xen-devel] [PATCH 2/4] x86: further speed-up to hweight{32,64}()
Precedence: list
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
 Andrew Cooper <andrew.cooper3@citrix.com>, Wei Liu <wl@xen.org>,
 Roger Pau Monne <roger.pau@citrix.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
Errors-To: xen-devel-bounces@lists.xenproject.org
Sender: "Xen-devel" <xen-devel-bounces@lists.xenproject.org>

Commit Message

Jan Beulich May 31, 2019, 9:52 a.m. UTC

According to Linux commit 0136611c62 ("optimize hweight64 for x86_64")
this is a further improvement over the variant using only bitwise
operations. It's also a slight further code size reduction.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
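
For readers unfamiliar with the two variants being compared, here is a minimal sketch of the general technique. It is not the Xen or Linux source: the names `byte_counts`, `hweight64_shift` and `hweight64_mul` are made up for illustration, and only the constants are the usual ones for this style of population count. Both variants share the first three reduction steps and differ only in how the eight per-byte counts are summed at the end.

```c
#include <stdint.h>

/* Shared part: reduce x to eight per-byte bit counts (each in 0..8). */
static uint64_t byte_counts(uint64_t x)
{
    x -= (x >> 1) & 0x5555555555555555ULL;
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    return (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
}

/* Bitwise-only variant: fold the byte counts with shifts and adds. */
unsigned int hweight64_shift(uint64_t x)
{
    x = byte_counts(x);
    x += x >> 8;
    x += x >> 16;
    x += x >> 32;
    return x & 0xff;
}

/* Multiply variant: one multiplication sums all bytes into the top byte. */
unsigned int hweight64_mul(uint64_t x)
{
    return (byte_counts(x) * 0x0101010101010101ULL) >> 56;
}
```

In this sketch the multiply variant replaces three shift-and-add pairs plus a mask with one multiplication and one shift, which is presumably where both the speed and the code size difference come from on hardware with a fast multiplier.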

Comments

Andrew Cooper May 31, 2019, 7:23 p.m. UTC | #1

On 31/05/2019 02:52, Jan Beulich wrote:
> According to Linux commit 0136611c62 ("optimize hweight64 for x86_64")
> this is a further improvement over the variant using only bitwise
> operations. It's also a slight further code size reduction.
>
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

This should also include ARM64, which also unconditionally selects
HAS_FAST_MULTIPLY in Linux.

As for the x86 side of things, Acked-by: Andrew Cooper
<andrew.cooper3@citrix.com>

>
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -12,6 +12,7 @@ config X86
>  	select HAS_CPUFREQ
>  	select HAS_EHCI
>  	select HAS_EX_TABLE
> +	select HAS_FAST_MULTIPLY
>  	select HAS_GDBSX
>  	select HAS_IOPORTS
>  	select HAS_KEXEC
>
>

Jan Beulich June 3, 2019, 7:52 a.m. UTC | #2

>>> On 31.05.19 at 21:23, <andrew.cooper3@citrix.com> wrote:
> On 31/05/2019 02:52, Jan Beulich wrote:
>> According to Linux commit 0136611c62 ("optimize hweight64 for x86_64")
>> this is a further improvement over the variant using only bitwise
>> operations. It's also a slight further code size reduction.
>>
>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> This should also include ARM64, which also unconditionally selects
> HAS_FAST_MULTIPLY in Linux.

I've very intentionally split the Arm change from the x86 one:
Looking at the generated code I'm unconvinced this is a win
there, and hence I'd prefer if someone could measure this. It
is for this reason that patch 3 was actually sent as RFC.

> As for the x86 side of things, Acked-by: Andrew Cooper
> <andrew.cooper3@citrix.com>

Thanks.

Jan
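
A measurement of the kind requested above could start from a small user-space harness like the one below. This is only a rough sketch under several assumptions: all names (`time_variant`, `hweight64_shift`, `hweight64_mul`) are hypothetical, `clock()` is a coarse timer, the call through a function pointer adds overhead that the hypervisor's inlined code would not have, and the compiler may schedule the loop differently from real call sites. Results would therefore be indicative at best.

```c
#include <stdint.h>
#include <time.h>

typedef unsigned int (*hweight_fn)(uint64_t);

/* Shared part: reduce x to eight per-byte bit counts. */
static uint64_t byte_counts(uint64_t x)
{
    x -= (x >> 1) & 0x5555555555555555ULL;
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    return (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
}

/* Bitwise-only final fold. */
unsigned int hweight64_shift(uint64_t x)
{
    x = byte_counts(x);
    x += x >> 8;
    x += x >> 16;
    x += x >> 32;
    return x & 0xff;
}

/* Multiply-based final fold. */
unsigned int hweight64_mul(uint64_t x)
{
    return (byte_counts(x) * 0x0101010101010101ULL) >> 56;
}

/*
 * Run fn over n pseudo-random inputs. Returns the CPU time used in
 * seconds and stores the sum of all results in *checksum, so the
 * compiler cannot discard the calls and the variants can be compared.
 */
double time_variant(hweight_fn fn, unsigned long n, uint64_t *checksum)
{
    uint64_t v = 0x9e3779b97f4a7c15ULL, sum = 0;
    clock_t start = clock();

    for (unsigned long i = 0; i < n; i++) {
        /* xorshift64 step: cheap, deterministic input sequence */
        v ^= v << 13;
        v ^= v >> 7;
        v ^= v << 17;
        sum += fn(v);
    }
    *checksum = sum;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `time_variant()` once per variant with the same `n` and checking that both checksums match gives a basic sanity check before comparing the timings; building and running it natively on the Arm64 system of interest would be needed for the numbers to mean anything.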

--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -12,6 +12,7 @@  config X86
 	select HAS_CPUFREQ
 	select HAS_EHCI
 	select HAS_EX_TABLE
+	select HAS_FAST_MULTIPLY
 	select HAS_GDBSX
 	select HAS_IOPORTS
 	select HAS_KEXEC
