[v2] dwarf_loader: use a better hashing function

Message ID: 20210212080104.2499483-1-morbo@google.com (mailing list archive)
State: Not Applicable
Delegated to: BPF
Series: [v2] dwarf_loader: use a better hashing function

Checks

Context: netdev/tree_selection
Check: success
Description: Not a local patch

Commit Message

Bill Wendling Feb. 12, 2021, 8:01 a.m. UTC
This hashing function[1] produces better hash table bucket
distributions. The original hashing function always produced zeros in
the three least significant bits. The new hashing function gives a
modest performance boost:

  Original: 0:11.373s
  New:      0:11.110s

for a performance improvement of ~2%.

[1] From the hash function used in libbpf.

Signed-off-by: Bill Wendling <morbo@google.com>
---
 hash.h | 20 +-------------------
 1 file changed, 1 insertion(+), 19 deletions(-)
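
For context: the new multiplier 11400714819323198485LLU is 0x9E3779B97F4A7C15,
i.e. 2^64 divided by the golden ratio, so this is classic Fibonacci
(multiplicative) hashing: the multiply mixes the input into the high bits,
and the shift keeps only the top 'bits' of them. A minimal standalone sketch
of the new scheme (illustrative only, not part of the patch):

  #include <stdint.h>
  #include <stdio.h>

  /* Fibonacci hashing: 11400714819323198485LLU == 0x9E3779B97F4A7C15
   * == floor(2^64 / phi). The multiplication spreads entropy into the
   * high bits, which the shift then selects as the bucket index. */
  static inline uint64_t hash_64_new(uint64_t val, unsigned int bits)
  {
          return (val * 11400714819323198485LLU) >> (64 - bits);
  }

  int main(void)
  {
          /* 8-byte-aligned inputs (e.g. pointer values) still spread
           * across 2^4 = 16 buckets despite their zero low bits. */
          for (uint64_t p = 0; p < 16 * 8; p += 8)
                  printf("%3llu -> bucket %llu\n",
                         (unsigned long long)p,
                         (unsigned long long)hash_64_new(p, 4));
          return 0;
  }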

Comments

Arnaldo Carvalho de Melo Feb. 12, 2021, 12:37 p.m. UTC | #1
On Fri, Feb 12, 2021 at 12:01:04AM -0800, Bill Wendling wrote:
> This hashing function[1] produces better hash table bucket
> distributions. The original hashing function always produced zeros in
> the three least significant bits. The new hashing function gives a
> modest performance boost:

Some tidbits:

You forgot to CC Andrii and also to add this, which I'm doing now:

Suggested-by: Andrii Nakryiko <andrii@kernel.org>

:-)

- Arnaldo
 
>   Original: 0:11.373s
>   New:      0:11.110s
> 
> for a performance improvement of ~2%.
> 
> [1] From the hash function used in libbpf.
> 
> Signed-off-by: Bill Wendling <morbo@google.com>
> ---
>  hash.h | 20 +-------------------
>  1 file changed, 1 insertion(+), 19 deletions(-)
> 
> diff --git a/hash.h b/hash.h
> index d3aa416..6f952c7 100644
> --- a/hash.h
> +++ b/hash.h
> @@ -33,25 +33,7 @@
>  
>  static inline uint64_t hash_64(const uint64_t val, const unsigned int bits)
>  {
> -	uint64_t hash = val;
> -
> -	/*  Sigh, gcc can't optimise this alone like it does for 32 bits. */
> -	uint64_t n = hash;
> -	n <<= 18;
> -	hash -= n;
> -	n <<= 33;
> -	hash -= n;
> -	n <<= 3;
> -	hash += n;
> -	n <<= 3;
> -	hash -= n;
> -	n <<= 4;
> -	hash += n;
> -	n <<= 2;
> -	hash += n;
> -
> -	/* High bits are more random, so use them. */
> -	return hash >> (64 - bits);
> +	return (val * 11400714819323198485LLU) >> (64 - bits);
>  }
>  
>  static inline uint32_t hash_32(uint32_t val, unsigned int bits)
> -- 
> 2.30.0.478.g8a0d178c01-goog
>
Arnaldo Carvalho de Melo Feb. 12, 2021, 12:39 p.m. UTC | #2
On Fri, Feb 12, 2021 at 09:37:22AM -0300, Arnaldo Carvalho de Melo wrote:
> On Fri, Feb 12, 2021 at 12:01:04AM -0800, Bill Wendling wrote:
> > This hashing function[1] produces better hash table bucket
> > distributions. The original hashing function always produced zeros in
> > the three least significant bits. The new hashing function gives a
> > modest performance boost:
> 
> Some tidbits:
> 
> You forgot to CC Andrii and also to add this, which I'm doing now:
> 
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> 
> :-)

See below the full cset, which will go public after some more tests here.

- Arnaldo

commit 9fecc77ed82d429fd3fe49ba275465813228e617 (HEAD -> master)
Author: Bill Wendling <morbo@google.com>
Date:   Fri Feb 12 00:01:04 2021 -0800

    dwarf_loader: Use a better hashing function, from libbpf
    
    This hashing function[1] produces better hash table bucket
    distributions. The original hashing function always produced zeros in
    the three least significant bits. The new hashing function gives a
    modest performance boost:
    
      Original: 0:11.373s
      New:      0:11.110s
    
    for a performance improvement of ~2%.
    
    [1] From the hash function used in libbpf.
    
    Committer notes:
    
    Bill found the suboptimality of the hash function being used; Andrii
    suggested using the libbpf one, which ended up being better.
    
    Signed-off-by: Bill Wendling <morbo@google.com>
    Suggested-by: Andrii Nakryiko <andrii@kernel.org>
    Cc: bpf@vger.kernel.org
    Cc: dwarves@vger.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Bill Wendling Feb. 12, 2021, 8:14 p.m. UTC | #3
On Fri, Feb 12, 2021 at 4:37 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Fri, Feb 12, 2021 at 12:01:04AM -0800, Bill Wendling wrote:
> > This hashing function[1] produces better hash table bucket
> > distributions. The original hashing function always produced zeros in
> > the three least significant bits. The new hashing function gives a
> > modest performance boost:
>
> Some tidbits:
>
> You forgot to CC Andrii and also to add this, which I'm doing now:
>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
>
> :-)
>
Doh! You're right. Sorry about that. Thanks for catching it!

-bw

Patch

diff --git a/hash.h b/hash.h
index d3aa416..6f952c7 100644
--- a/hash.h
+++ b/hash.h
@@ -33,25 +33,7 @@ 
 
 static inline uint64_t hash_64(const uint64_t val, const unsigned int bits)
 {
-	uint64_t hash = val;
-
-	/*  Sigh, gcc can't optimise this alone like it does for 32 bits. */
-	uint64_t n = hash;
-	n <<= 18;
-	hash -= n;
-	n <<= 33;
-	hash -= n;
-	n <<= 3;
-	hash += n;
-	n <<= 3;
-	hash -= n;
-	n <<= 4;
-	hash += n;
-	n <<= 2;
-	hash += n;
-
-	/* High bits are more random, so use them. */
-	return hash >> (64 - bits);
+	return (val * 11400714819323198485LLU) >> (64 - bits);
 }
 
 static inline uint32_t hash_32(uint32_t val, unsigned int bits)
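
A quick standalone check (illustrative only) confirms that the deleted
shift/add sequence was a strength-reduced multiplication by
0x9e37fffffffc0001 (the Linux kernel's old GOLDEN_RATIO_PRIME_64), so the
patch effectively swaps one multiplicative constant for a better-mixing one:

  #include <assert.h>
  #include <stdint.h>
  #include <stdio.h>

  /* The deleted shift/add chain, restored verbatim, with the final
   * shift dropped so we get the raw 64-bit product. */
  static uint64_t old_mix(uint64_t val)
  {
          uint64_t hash = val, n = val;
          n <<= 18; hash -= n;   /* val * (1 - 2^18)  */
          n <<= 33; hash -= n;   /* ...  - 2^51       */
          n <<= 3;  hash += n;   /* ...  + 2^54       */
          n <<= 3;  hash -= n;   /* ...  - 2^57       */
          n <<= 4;  hash += n;   /* ...  + 2^61       */
          n <<= 2;  hash += n;   /* ...  + 2^63       */
          return hash;
  }

  int main(void)
  {
          /* 1 - 2^18 - 2^51 + 2^54 - 2^57 + 2^61 + 2^63 (mod 2^64)
           * == 0x9e37fffffffc0001 */
          for (uint64_t v = 0; v < 1000000; v += 7919)
                  assert(old_mix(v) == v * 0x9e37fffffffc0001ULL);
          printf("old chain == multiply by 0x9e37fffffffc0001\n");
          return 0;
  }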