Message ID | 20210210232327.1965876-1-morbo@google.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | dwarf_loader: use a better hashing function | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote: > > This hashing function[1] produces better hash table bucket > distributions. The original hashing function always produced zeros in > the three least significant bits. > > The new hashing funciton gives a modest performance boost. > > Original New > 0:11.41 0:11.38 > 0:11.36 0:11.34 > 0:11.35 0:11.26 > ----------------------- > Avg: 0:11.373 0:11.327 > > for a performance improvement of 0.4%. > > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes > Can you please also test with the one libbpf uses internally: return (val * 11400714819323198485llu) >> (64 - bits); ? Thanks! > Signed-off-by: Bill Wendling <morbo@google.com> > --- > hash.h | 25 ++++++++++--------------- > 1 file changed, 10 insertions(+), 15 deletions(-) > > diff --git a/hash.h b/hash.h > index d3aa416..ea201ab 100644 > --- a/hash.h > +++ b/hash.h > @@ -33,22 +33,17 @@ > > static inline uint64_t hash_64(const uint64_t val, const unsigned int bits) > { > - uint64_t hash = val; > + uint64_t hash = val * 0x369DEA0F31A53F85UL + 0x255992D382208B61UL; > > - /* Sigh, gcc can't optimise this alone like it does for 32 bits. */ > - uint64_t n = hash; > - n <<= 18; > - hash -= n; > - n <<= 33; > - hash -= n; > - n <<= 3; > - hash += n; > - n <<= 3; > - hash -= n; > - n <<= 4; > - hash += n; > - n <<= 2; > - hash += n; > + hash ^= hash >> 21; > + hash ^= hash << 37; > + hash ^= hash >> 4; > + > + hash *= 0x422E19E1D95D2F0DUL; > + > + hash ^= hash << 20; > + hash ^= hash >> 41; > + hash ^= hash << 5; > > /* High bits are more random, so use them. */ > return hash >> (64 - bits); > -- > 2.30.0.478.g8a0d178c01-goog >
On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote: > > > > This hashing function[1] produces better hash table bucket > > distributions. The original hashing function always produced zeros in > > the three least significant bits. > > > > The new hashing funciton gives a modest performance boost. > > > > Original New > > 0:11.41 0:11.38 > > 0:11.36 0:11.34 > > 0:11.35 0:11.26 > > ----------------------- > > Avg: 0:11.373 0:11.327 > > > > for a performance improvement of 0.4%. > > > > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes > > > > Can you please also test with the one libbpf uses internally: > > return (val * 11400714819323198485llu) >> (64 - bits); > > ? > > Thanks! > It's giving me a running time of ~11.11s, which is even better. Would you like me to submit a patch? -bw
On Wed, Feb 10, 2021 at 5:24 PM Bill Wendling <morbo@google.com> wrote: > > On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > > > On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote: > > > > > > This hashing function[1] produces better hash table bucket > > > distributions. The original hashing function always produced zeros in > > > the three least significant bits. > > > > > > The new hashing funciton gives a modest performance boost. > > > > > > Original New > > > 0:11.41 0:11.38 > > > 0:11.36 0:11.34 > > > 0:11.35 0:11.26 > > > ----------------------- > > > Avg: 0:11.373 0:11.327 > > > > > > for a performance improvement of 0.4%. > > > > > > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes > > > > > > > Can you please also test with the one libbpf uses internally: > > > > return (val * 11400714819323198485llu) >> (64 - bits); > > > > ? > > > > Thanks! > > > It's giving me a running time of ~11.11s, which is even better. Would > you like me to submit a patch? faster is better, so yeah, why not? :) > > -bw
Em Wed, Feb 10, 2021 at 05:31:48PM -0800, Andrii Nakryiko escreveu: > On Wed, Feb 10, 2021 at 5:24 PM Bill Wendling <morbo@google.com> wrote: > > On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > > On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote: > > > > This hashing function[1] produces better hash table bucket > > > > distributions. The original hashing function always produced zeros in > > > > the three least significant bits. > > > > The new hashing funciton gives a modest performance boost. > > > > Original New > > > > 0:11.41 0:11.38 > > > > 0:11.36 0:11.34 > > > > 0:11.35 0:11.26 > > > > ----------------------- > > > > Avg: 0:11.373 0:11.327 > > > > for a performance improvement of 0.4%. > > > > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes > > > Can you please also test with the one libbpf uses internally: > > > return (val * 11400714819323198485llu) >> (64 - bits); > > > ? > > It's giving me a running time of ~11.11s, which is even better. Would > > you like me to submit a patch? > faster is better, so yeah, why not? :) Yeah, I agree, faster is better, please make it so :-) - Arnaldo
On Thu, Feb 11, 2021 at 5:01 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Wed, Feb 10, 2021 at 05:31:48PM -0800, Andrii Nakryiko escreveu: > > On Wed, Feb 10, 2021 at 5:24 PM Bill Wendling <morbo@google.com> wrote: > > > On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > > > On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote: > > > > > This hashing function[1] produces better hash table bucket > > > > > distributions. The original hashing function always produced zeros in > > > > > the three least significant bits. > > > > > > The new hashing funciton gives a modest performance boost. > > > > > > Original New > > > > > 0:11.41 0:11.38 > > > > > 0:11.36 0:11.34 > > > > > 0:11.35 0:11.26 > > > > > ----------------------- > > > > > Avg: 0:11.373 0:11.327 > > > > > > for a performance improvement of 0.4%. > > > > > > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes > > > > > Can you please also test with the one libbpf uses internally: > > > > > return (val * 11400714819323198485llu) >> (64 - bits); > > > > > ? > > > > It's giving me a running time of ~11.11s, which is even better. Would > > > you like me to submit a patch? > > > faster is better, so yeah, why not? :) > > Yeah, I agree, faster is better, please make it so :-) > Your wish is my command! :-) Done. -bw
Em Thu, Feb 11, 2021 at 10:55:32PM -0800, Bill Wendling escreveu: > On Thu, Feb 11, 2021 at 5:01 AM Arnaldo Carvalho de Melo > <acme@kernel.org> wrote: > > > > Em Wed, Feb 10, 2021 at 05:31:48PM -0800, Andrii Nakryiko escreveu: > > > On Wed, Feb 10, 2021 at 5:24 PM Bill Wendling <morbo@google.com> wrote: > > > > On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > > > > On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote: > > > > > > This hashing function[1] produces better hash table bucket > > > > > > distributions. The original hashing function always produced zeros in > > > > > > the three least significant bits. > > > > > > > > The new hashing funciton gives a modest performance boost. > > > > > > > > Original New > > > > > > 0:11.41 0:11.38 > > > > > > 0:11.36 0:11.34 > > > > > > 0:11.35 0:11.26 > > > > > > ----------------------- > > > > > > Avg: 0:11.373 0:11.327 > > > > > > > > for a performance improvement of 0.4%. > > > > > > > > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes > > > > > > > Can you please also test with the one libbpf uses internally: > > > > > > > return (val * 11400714819323198485llu) >> (64 - bits); > > > > > > > ? > > > > > > It's giving me a running time of ~11.11s, which is even better. Would > > > > you like me to submit a patch? > > > > > faster is better, so yeah, why not? :) > > > > Yeah, I agree, faster is better, please make it so :-) > > > Your wish is my command! :-) Done. Thanks, looking for the patch and applying! No go think about something else to make it faster 8-) - Arnaldo
diff --git a/hash.h b/hash.h index d3aa416..ea201ab 100644 --- a/hash.h +++ b/hash.h @@ -33,22 +33,17 @@ static inline uint64_t hash_64(const uint64_t val, const unsigned int bits) { - uint64_t hash = val; + uint64_t hash = val * 0x369DEA0F31A53F85UL + 0x255992D382208B61UL; - /* Sigh, gcc can't optimise this alone like it does for 32 bits. */ - uint64_t n = hash; - n <<= 18; - hash -= n; - n <<= 33; - hash -= n; - n <<= 3; - hash += n; - n <<= 3; - hash -= n; - n <<= 4; - hash += n; - n <<= 2; - hash += n; + hash ^= hash >> 21; + hash ^= hash << 37; + hash ^= hash >> 4; + + hash *= 0x422E19E1D95D2F0DUL; + + hash ^= hash << 20; + hash ^= hash >> 41; + hash ^= hash << 5; /* High bits are more random, so use them. */ return hash >> (64 - bits);
This hashing function[1] produces better hash table bucket distributions. The original hashing function always produced zeros in the three least significant bits. The new hashing funciton gives a modest performance boost. Original New 0:11.41 0:11.38 0:11.36 0:11.34 0:11.35 0:11.26 ----------------------- Avg: 0:11.373 0:11.327 for a performance improvement of 0.4%. [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes Signed-off-by: Bill Wendling <morbo@google.com> --- hash.h | 25 ++++++++++--------------- 1 file changed, 10 insertions(+), 15 deletions(-)