Message ID | 20161214184605.24006-1-Jason@zx2c4.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Dec 14, 2016 at 10:46 AM, Jason A. Donenfeld <Jason@zx2c4.com> wrote: > SipHash is a 64-bit keyed hash function that is actually a > cryptographically secure PRF, like HMAC. Except SipHash is super fast, > and is meant to be used as a hashtable keyed lookup function. > "super fast" is relative. My quick test shows that this faster than Toeplitz (good, but not exactly hard to achieve), but is about 4x slower than jhash. > SipHash isn't just some new trendy hash function. It's been around for a > while, and there really isn't anything that comes remotely close to > being useful in the way SipHash is. With that said, why do we need this? > I don't think we need advertising nor a lesson on hashing. It would be much more useful if you just point us to the paper on siphash (which I assume I http://cr.yp.to/siphash/siphash-20120918.pdf ?). > There are a variety of attacks known as "hashtable poisoning" in which an > attacker forms some data such that the hash of that data will be the > same, and then preceeds to fill up all entries of a hashbucket. This is > a realistic and well-known denial-of-service vector. > > Linux developers already seem to be aware that this is an issue, and > various places that use hash tables in, say, a network context, use a > non-cryptographically secure function (usually jhash) and then try to > twiddle with the key on a time basis (or in many cases just do nothing > and hope that nobody notices). While this is an admirable attempt at > solving the problem, it doesn't actually fix it. SipHash fixes it. > Key rotation is important anyway, without any key rotation even if the key is compromised in siphash by some external means we would have an insecure hash until the system reboots. > (It fixes it in such a sound way that you could even build a stream > cipher out of SipHash that would resist the modern cryptanalysis.) > > There are a modicum of places in the kernel that are vulnerable to > hashtable poisoning attacks, either via userspace vectors or network > vectors, and there's not a reliable mechanism inside the kernel at the > moment to fix it. The first step toward fixing these issues is actually > getting a secure primitive into the kernel for developers to use. Then > we can, bit by bit, port things over to it as deemed appropriate. > > Secondly, a few places are using MD5 for creating secure sequence > numbers, port numbers, or fast random numbers. Siphash is a faster, more > fittting, and more secure replacement for MD5 in those situations. > > Dozens of languages are already using this internally for their hash > tables. Some of the BSDs already use this in their kernels. SipHash is > a widely known high-speed solution to a widely known problem, and it's > time we catch-up. > Maybe so, but we need to do due diligence before considering adopting siphash as the primary hashing in the network stack. Consider that we may very well perform a hash over L4 tuples on _every_ packet. We've done a good job at limiting this to be at most one hash per packet, but nevertheless the performance of the hash function must be take into account. A few points: 1) My quick test shows siphash is about four times more expensive than jhash. On my test system, computing a hash over IPv4 tuple (two 32 bit addresses and 2 16 bit source ports) is 6.9 nsecs in Jenkins hash, 33 nsecs with siphash. Given that we have eliminated most of the packet header hashes this might be tolerable, but still should be looking at ways to optimize. 2) I like moving to use u64 (quad words) in the hash, this is an improvement over Jenkins which is based on 32 bit words. If we put this in the kernel we probably want to have several variants of siphash for specific sizes (e.g. siphash1, siphash2, siphash2, siphashn for hash over one, two, three, or n sixty four bit words). 3) I also tested siphash distribution and Avalanche Effect (these really should have been covered in the paper :-( ). Both of these are good with siphash, in fact almost have identical characteristics to Jenkins hash. Tom > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> > Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> > Cc: Daniel J. Bernstein <djb@cr.yp.to> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Eric Biggers <ebiggers3@gmail.com> > Cc: David Laight <David.Laight@aculab.com> > --- > Changes from v2->v3: > > - There is now a fast aligned version of the function and a not-as-fast > unaligned version. The requirements for each have been documented in > a docbook-style comment. As well, the header now contains a constant > for the expected alignment. > > - The test suite has been updated to check both the unaligned and aligned > version of the function. > > include/linux/siphash.h | 30 ++++++++++ > lib/Kconfig.debug | 6 +- > lib/Makefile | 5 +- > lib/siphash.c | 153 ++++++++++++++++++++++++++++++++++++++++++++++++ > lib/test_siphash.c | 85 +++++++++++++++++++++++++++ > 5 files changed, 274 insertions(+), 5 deletions(-) > create mode 100644 include/linux/siphash.h > create mode 100644 lib/siphash.c > create mode 100644 lib/test_siphash.c > > diff --git a/include/linux/siphash.h b/include/linux/siphash.h > new file mode 100644 > index 000000000000..82dc1a911a2e > --- /dev/null > +++ b/include/linux/siphash.h > @@ -0,0 +1,30 @@ > +/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com> > + * > + * This file is provided under a dual BSD/GPLv2 license. > + * > + * SipHash: a fast short-input PRF > + * https://131002.net/siphash/ > + */ > + > +#ifndef _LINUX_SIPHASH_H > +#define _LINUX_SIPHASH_H > + > +#include <linux/types.h> > + > +enum siphash_lengths { > + SIPHASH24_KEY_LEN = 16, > + SIPHASH24_ALIGNMENT = 8 > +}; > + > +u64 siphash24(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]); > + > +#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS > +static inline u64 siphash24_unaligned(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]) > +{ > + return siphash24(data, len, key); > +} > +#else > +u64 siphash24_unaligned(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]); > +#endif > + > +#endif /* _LINUX_SIPHASH_H */ > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > index e6327d102184..32bbf689fc46 100644 > --- a/lib/Kconfig.debug > +++ b/lib/Kconfig.debug > @@ -1843,9 +1843,9 @@ config TEST_HASH > tristate "Perform selftest on hash functions" > default n > help > - Enable this option to test the kernel's integer (<linux/hash,h>) > - and string (<linux/stringhash.h>) hash functions on boot > - (or module load). > + Enable this option to test the kernel's integer (<linux/hash.h>), > + string (<linux/stringhash.h>), and siphash (<linux/siphash.h>) > + hash functions on boot (or module load). > > This is intended to help people writing architecture-specific > optimized versions. If unsure, say N. > diff --git a/lib/Makefile b/lib/Makefile > index 50144a3aeebd..71d398b04a74 100644 > --- a/lib/Makefile > +++ b/lib/Makefile > @@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \ > sha1.o chacha20.o md5.o irq_regs.o argv_split.o \ > flex_proportions.o ratelimit.o show_mem.o \ > is_single_threaded.o plist.o decompress.o kobject_uevent.o \ > - earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o win_minmax.o > + earlycpio.o seq_buf.o siphash.o \ > + nmi_backtrace.o nodemask.o win_minmax.o > > lib-$(CONFIG_MMU) += ioremap.o > lib-$(CONFIG_SMP) += cpumask.o > @@ -44,7 +45,7 @@ obj-$(CONFIG_TEST_HEXDUMP) += test_hexdump.o > obj-y += kstrtox.o > obj-$(CONFIG_TEST_BPF) += test_bpf.o > obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o > -obj-$(CONFIG_TEST_HASH) += test_hash.o > +obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o > obj-$(CONFIG_TEST_KASAN) += test_kasan.o > obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o > obj-$(CONFIG_TEST_LKM) += test_module.o > diff --git a/lib/siphash.c b/lib/siphash.c > new file mode 100644 > index 000000000000..32acdc26234f > --- /dev/null > +++ b/lib/siphash.c > @@ -0,0 +1,153 @@ > +/* Copyright (C) 2015-2016 Jason A. Donenfeld <Jason@zx2c4.com> > + * Copyright (C) 2012-2014 Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> > + * Copyright (C) 2012-2014 Daniel J. Bernstein <djb@cr.yp.to> > + * > + * This file is provided under a dual BSD/GPLv2 license. > + * > + * SipHash: a fast short-input PRF > + * https://131002.net/siphash/ > + */ > + > +#include <linux/siphash.h> > +#include <linux/kernel.h> > +#include <asm/unaligned.h> > + > +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 > +#include <linux/dcache.h> > +#include <asm/word-at-a-time.h> > +#endif > + > +#define SIPROUND \ > + do { \ > + v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \ > + v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \ > + v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \ > + v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \ > + } while(0) > + > +static inline u16 le16_to_cpuvp(const void *p) > +{ > + return le16_to_cpup(p); > +} > +static inline u32 le32_to_cpuvp(const void *p) > +{ > + return le32_to_cpup(p); > +} > +static inline u64 le64_to_cpuvp(const void *p) > +{ > + return le64_to_cpup(p); > +} > + > +/** > + * siphash24 - compute 64-bit siphash24 PRF value > + * @data: buffer to hash, must be aligned to SIPHASH24_ALIGNMENT > + * @size: size of @data > + * @key: key buffer of size SIPHASH24_KEY_LEN, must be aligned to SIPHASH24_ALIGNMENT > + */ > +u64 siphash24(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]) > +{ > + u64 v0 = 0x736f6d6570736575ULL; > + u64 v1 = 0x646f72616e646f6dULL; > + u64 v2 = 0x6c7967656e657261ULL; > + u64 v3 = 0x7465646279746573ULL; > + u64 b = ((u64)len) << 56; > + u64 k0 = le64_to_cpuvp(key); > + u64 k1 = le64_to_cpuvp(key + sizeof(u64)); > + u64 m; > + const u8 *end = data + len - (len % sizeof(u64)); > + const u8 left = len & (sizeof(u64) - 1); > + v3 ^= k1; > + v2 ^= k0; > + v1 ^= k1; > + v0 ^= k0; > + for (; data != end; data += sizeof(u64)) { > + m = le64_to_cpuvp(data); > + v3 ^= m; > + SIPROUND; > + SIPROUND; > + v0 ^= m; > + } > +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 > + if (left) > + b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) & bytemask_from_count(left))); > +#else > + switch (left) { > + case 7: b |= ((u64)data[6]) << 48; > + case 6: b |= ((u64)data[5]) << 40; > + case 5: b |= ((u64)data[4]) << 32; > + case 4: b |= le32_to_cpuvp(data); break; > + case 3: b |= ((u64)data[2]) << 16; > + case 2: b |= le16_to_cpuvp(data); break; > + case 1: b |= data[0]; > + } > +#endif > + v3 ^= b; > + SIPROUND; > + SIPROUND; > + v0 ^= b; > + v2 ^= 0xff; > + SIPROUND; > + SIPROUND; > + SIPROUND; > + SIPROUND; > + return (v0 ^ v1) ^ (v2 ^ v3); > +} > +EXPORT_SYMBOL(siphash24); > + > +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS > +/** > + * siphash24 - compute 64-bit siphash24 PRF value, without alignment requirements > + * @data: buffer to hash > + * @size: size of @data > + * @key: key buffer of size SIPHASH24_KEY_LEN > + */ > +u64 siphash24_unaligned(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]) > +{ > + u64 v0 = 0x736f6d6570736575ULL; > + u64 v1 = 0x646f72616e646f6dULL; > + u64 v2 = 0x6c7967656e657261ULL; > + u64 v3 = 0x7465646279746573ULL; > + u64 b = ((u64)len) << 56; > + u64 k0 = get_unaligned_le64(key); > + u64 k1 = get_unaligned_le64(key + sizeof(u64)); > + u64 m; > + const u8 *end = data + len - (len % sizeof(u64)); > + const u8 left = len & (sizeof(u64) - 1); > + v3 ^= k1; > + v2 ^= k0; > + v1 ^= k1; > + v0 ^= k0; > + for (; data != end; data += sizeof(u64)) { > + m = get_unaligned_le64(data); > + v3 ^= m; > + SIPROUND; > + SIPROUND; > + v0 ^= m; > + } > +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 > + if (left) > + b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) & bytemask_from_count(left))); > +#else > + switch (left) { > + case 7: b |= ((u64)data[6]) << 48; > + case 6: b |= ((u64)data[5]) << 40; > + case 5: b |= ((u64)data[4]) << 32; > + case 4: b |= get_unaligned_le32(data); break; > + case 3: b |= ((u64)data[2]) << 16; > + case 2: b |= get_unaligned_le16(data); break; > + case 1: b |= data[0]; > + } > +#endif > + v3 ^= b; > + SIPROUND; > + SIPROUND; > + v0 ^= b; > + v2 ^= 0xff; > + SIPROUND; > + SIPROUND; > + SIPROUND; > + SIPROUND; > + return (v0 ^ v1) ^ (v2 ^ v3); > +} > +EXPORT_SYMBOL(siphash24_unaligned); > +#endif > diff --git a/lib/test_siphash.c b/lib/test_siphash.c > new file mode 100644 > index 000000000000..69ac94dec366 > --- /dev/null > +++ b/lib/test_siphash.c > @@ -0,0 +1,85 @@ > +/* Test cases for siphash.c > + * > + * Copyright (C) 2015-2016 Jason A. Donenfeld <Jason@zx2c4.com> > + * > + * This file is provided under a dual BSD/GPLv2 license. > + * > + * SipHash: a fast short-input PRF > + * https://131002.net/siphash/ > + */ > + > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt > + > +#include <linux/siphash.h> > +#include <linux/kernel.h> > +#include <linux/string.h> > +#include <linux/errno.h> > +#include <linux/module.h> > + > +/* Test vectors taken from official reference source available at: > + * https://131002.net/siphash/siphash24.c > + */ > +static const u64 test_vectors[64] = { > + 0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL, > + 0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL, > + 0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL, > + 0x9e0082df0ba9e4b0ULL, 0x7a5dbbc594ddb9f3ULL, 0xf4b32f46226bada7ULL, > + 0x751e8fbc860ee5fbULL, 0x14ea5627c0843d90ULL, 0xf723ca908e7af2eeULL, > + 0xa129ca6149be45e5ULL, 0x3f2acc7f57c29bdbULL, 0x699ae9f52cbe4794ULL, > + 0x4bc1b3f0968dd39cULL, 0xbb6dc91da77961bdULL, 0xbed65cf21aa2ee98ULL, > + 0xd0f2cbb02e3b67c7ULL, 0x93536795e3a33e88ULL, 0xa80c038ccd5ccec8ULL, > + 0xb8ad50c6f649af94ULL, 0xbce192de8a85b8eaULL, 0x17d835b85bbb15f3ULL, > + 0x2f2e6163076bcfadULL, 0xde4daaaca71dc9a5ULL, 0xa6a2506687956571ULL, > + 0xad87a3535c49ef28ULL, 0x32d892fad841c342ULL, 0x7127512f72f27cceULL, > + 0xa7f32346f95978e3ULL, 0x12e0b01abb051238ULL, 0x15e034d40fa197aeULL, > + 0x314dffbe0815a3b4ULL, 0x027990f029623981ULL, 0xcadcd4e59ef40c4dULL, > + 0x9abfd8766a33735cULL, 0x0e3ea96b5304a7d0ULL, 0xad0c42d6fc585992ULL, > + 0x187306c89bc215a9ULL, 0xd4a60abcf3792b95ULL, 0xf935451de4f21df2ULL, > + 0xa9538f0419755787ULL, 0xdb9acddff56ca510ULL, 0xd06c98cd5c0975ebULL, > + 0xe612a3cb9ecba951ULL, 0xc766e62cfcadaf96ULL, 0xee64435a9752fe72ULL, > + 0xa192d576b245165aULL, 0x0a8787bf8ecb74b2ULL, 0x81b3e73d20b49b6fULL, > + 0x7fa8220ba3b2eceaULL, 0x245731c13ca42499ULL, 0xb78dbfaf3a8d83bdULL, > + 0xea1ad565322a1a0bULL, 0x60e61c23a3795013ULL, 0x6606d7e446282b93ULL, > + 0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL, > + 0x958a324ceb064572ULL > +}; > + > +static int __init siphash_test_init(void) > +{ > + u8 in[64] __aligned(SIPHASH24_ALIGNMENT); > + u8 k[16] __aligned(SIPHASH24_ALIGNMENT); > + u8 in_unaligned[65]; > + u8 k_unaligned[65]; > + u8 i; > + int ret = 0; > + > + for (i = 0; i < 16; ++i) { > + k[i] = i; > + k_unaligned[i + 1] = i; > + } > + for (i = 0; i < 64; ++i) { > + in[i] = i; > + in_unaligned[i + 1] = i; > + if (siphash24(in, i, k) != test_vectors[i]) { > + pr_info("self-test aligned %u: FAIL\n", i + 1); > + ret = -EINVAL; > + } > + if (siphash24_unaligned(in_unaligned + 1, i, k_unaligned + 1) != test_vectors[i]) { > + pr_info("self-test unaligned %u: FAIL\n", i + 1); > + ret = -EINVAL; > + } > + } > + if (!ret) > + pr_info("self-tests: pass\n"); > + return ret; > +} > + > +static void __exit siphash_test_exit(void) > +{ > +} > + > +module_init(siphash_test_init); > +module_exit(siphash_test_exit); > + > +MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>"); > +MODULE_LICENSE("Dual BSD/GPL"); > -- > 2.11.0 >
Hi Tom, On Wed, Dec 14, 2016 at 8:18 PM, Tom Herbert <tom@herbertland.com> wrote: > "super fast" is relative. My quick test shows that this faster than > Toeplitz (good, but not exactly hard to achieve), but is about 4x > slower than jhash. Fast relative to other cryptographically secure PRFs. >> SipHash isn't just some new trendy hash function. It's been around for a >> while, and there really isn't anything that comes remotely close to >> being useful in the way SipHash is. With that said, why do we need this? > I don't think we need advertising nor a lesson on hashing. It would be > much more useful if you just point us to the paper on siphash (which I > assume I http://cr.yp.to/siphash/siphash-20120918.pdf ?). Ugh. Sorry. It definitely wasn't my intention to give an uninvited lesson or an annoying advert. For the former, I didn't want to make any expectations about fields of knowledge, because I honest have no idea. For the latter, I wrote that sentence to indicate that siphash isn't just some newfangled hipster function, but something useful and well established. I didn't mean it as a form of advertising. My apologies if I've offended your sensibilities. That cr.yp.to link is fine, or https://131002.net/siphash/siphash.pdf I believe. > Key rotation is important anyway, without any key rotation even if the > key is compromised in siphash by some external means we would have an > insecure hash until the system reboots. I'm a bit surprised to read this. I've never designed a system to be secure even in the event of remote arbitrary kernel memory disclosure, and I wasn't aware this was generally considered an architectural requirement or Linux. In any case, if you want this, I suppose you can have it with siphash too. > Maybe so, but we need to do due diligence before considering adopting > siphash as the primary hashing in the network stack. Consider that we > may very well perform a hash over L4 tuples on _every_ packet. We've > done a good job at limiting this to be at most one hash per packet, > but nevertheless the performance of the hash function must be take > into account. I agree with you. It seems like each case is going to needed to be measured on a case by case basis. In this series I make the first use of siphash in the secure sequence generation and get_random_int/long, where siphash replaces md5, so there's a pretty clear performance in. But for the jhash replacements indeed things are going to need to be individually evaluated. > 1) My quick test shows siphash is about four times more expensive than > jhash. On my test system, computing a hash over IPv4 tuple (two 32 bit > addresses and 2 16 bit source ports) is 6.9 nsecs in Jenkins hash, 33 > nsecs with siphash. Given that we have eliminated most of the packet > header hashes this might be tolerable, but still should be looking at > ways to optimize. > 2) I like moving to use u64 (quad words) in the hash, this is an > improvement over Jenkins which is based on 32 bit words. If we put > this in the kernel we probably want to have several variants of > siphash for specific sizes (e.g. siphash1, siphash2, siphash2, > siphashn for hash over one, two, three, or n sixty four bit words). I think your suggestion for (2) will contribute to further optimizations for (1). In v2, I had another patch in there adding siphash_1word, siphash_2words, etc, like jhash, but I implemented it by taking u32 variables and then just concatenating these into a buffer and passing them to the main siphash function. I removed it from v3 because I thought that these kind of missed the whole point. In particular: a) siphash24_1word, siphash24_2words, siphash24_3words, etc should take u64, not u32, since that's what siphash operates on natively b) Rather than concatenating them in a buffer, I should write specializations of the siphash24 function _especially_ for these size inputs to avoid the copy and to reduce the book keeping. I'll add these functions to v4 implemented like that. Thanks for the useful feedback and benchmarks! Jason
Hey Tom, Just following up on what I mentioned in my last email... On Wed, Dec 14, 2016 at 8:35 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote: > I think your suggestion for (2) will contribute to further > optimizations for (1). In v2, I had another patch in there adding > siphash_1word, siphash_2words, etc, like jhash, but I implemented it > by taking u32 variables and then just concatenating these into a > buffer and passing them to the main siphash function. I removed it > from v3 because I thought that these kind of missed the whole point. > In particular: > > a) siphash24_1word, siphash24_2words, siphash24_3words, etc should > take u64, not u32, since that's what siphash operates on natively I implemented these here: https://git.zx2c4.com/linux-dev/commit/?h=siphash&id=4652b6f3643bdba217e2194d89661348bbac48a0 This will be part of the next version of the series I submit. It's not immediately clear that using it is strictly faster than the struct trick though. However, I'm not yet sure why this would be. Jason
Hi Jason, [auto build test ERROR on linus/master] [also build test ERROR on v4.9 next-20161214] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Jason-A-Donenfeld/siphash-add-cryptographically-secure-hashtable-function/20161215-041458 config: i386-randconfig-i1-201650 (attached as .config) compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4 reproduce: # save the attached .config to linux build tree make ARCH=i386 All errors (new ones prefixed by >>): lib/test_siphash.c: In function 'siphash_test_init': >> lib/test_siphash.c:49:2: error: requested alignment is not an integer constant u8 in[64] __aligned(SIPHASH24_ALIGNMENT); ^ lib/test_siphash.c:50:2: error: requested alignment is not an integer constant u8 k[16] __aligned(SIPHASH24_ALIGNMENT); ^ vim +49 lib/test_siphash.c 43 0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL, 44 0x958a324ceb064572ULL 45 }; 46 47 static int __init siphash_test_init(void) 48 { > 49 u8 in[64] __aligned(SIPHASH24_ALIGNMENT); 50 u8 k[16] __aligned(SIPHASH24_ALIGNMENT); 51 u8 in_unaligned[65]; 52 u8 k_unaligned[65]; --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
Interesting. Evidently gcc 4.8 doesn't like my use of: enum siphash_lengths { SIPHASH24_KEY_LEN = 16, SIPHASH24_ALIGNMENT = 8 }; I'll convert this to the more boring: #define SIPHASH24_KEY_LEN 16 #define SIPHASH24_ALIGNMENT 8
On Wed, Dec 14, 2016 at 12:55 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote: > Hey Tom, > > Just following up on what I mentioned in my last email... > > On Wed, Dec 14, 2016 at 8:35 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote: >> I think your suggestion for (2) will contribute to further >> optimizations for (1). In v2, I had another patch in there adding >> siphash_1word, siphash_2words, etc, like jhash, but I implemented it >> by taking u32 variables and then just concatenating these into a >> buffer and passing them to the main siphash function. I removed it >> from v3 because I thought that these kind of missed the whole point. >> In particular: >> >> a) siphash24_1word, siphash24_2words, siphash24_3words, etc should >> take u64, not u32, since that's what siphash operates on natively > > I implemented these here: > https://git.zx2c4.com/linux-dev/commit/?h=siphash&id=4652b6f3643bdba217e2194d89661348bbac48a0 > Those look good, although I would probably just do 1,2,3 words and then have a function that takes n words like jhash. Might want to call these dword to distinguish from 32 bit words in jhash. Also, what is the significance of "24" in the function and constant names? Can we just drop that and call this siphash? Tom > This will be part of the next version of the series I submit. It's not > immediately clear that using it is strictly faster than the struct > trick though. However, I'm not yet sure why this would be. > > Jason
Hey Tom, On Wed, Dec 14, 2016 at 10:35 PM, Tom Herbert <tom@herbertland.com> wrote: > Those look good, although I would probably just do 1,2,3 words and > then have a function that takes n words like jhash. Might want to call > these dword to distinguish from 32 bit words in jhash. So actually jhash_Nwords makes no sense, since it takes dwords (32-bits) not words (16-bits). The siphash analog should be called siphash24_Nqwords. I think what I'll do is change what I already have to: siphash24_1qword siphash24_2qword siphash24_3qword siphash24_4qword And then add some static inline helpers to assist with smaller u32s like ipv4 addresses called: siphash24_2dword siphash24_4dword siphash24_6dword siphash24_8dword While we're having something new, might as well call it the right thing. > Also, what is the significance of "24" in the function and constant > names? Can we just drop that and call this siphash? SipHash is actually a family of PRFs, differentiated by the number of SIPROUNDs after each 64-bit input is processed and the number of SIPROUNDs at the very end of the function. The best trade-off of speed and security for kernel usage is 2 rounds after each 64-bit input and 4 rounds at the end of the function. This doesn't fall to any known cryptanalysis and it's very fast.
On Wed, Dec 14, 2016 at 2:56 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote: > Hey Tom, > > On Wed, Dec 14, 2016 at 10:35 PM, Tom Herbert <tom@herbertland.com> wrote: >> Those look good, although I would probably just do 1,2,3 words and >> then have a function that takes n words like jhash. Might want to call >> these dword to distinguish from 32 bit words in jhash. > > So actually jhash_Nwords makes no sense, since it takes dwords > (32-bits) not words (16-bits). The siphash analog should be called > siphash24_Nqwords. > Yeah, that's a "bug" with jhash function names. > I think what I'll do is change what I already have to: > siphash24_1qword > siphash24_2qword > siphash24_3qword > siphash24_4qword > > And then add some static inline helpers to assist with smaller u32s > like ipv4 addresses called: > > siphash24_2dword > siphash24_4dword > siphash24_6dword > siphash24_8dword > > While we're having something new, might as well call it the right thing. > I'm confused, doesn't 2dword == 1qword? Anyway, I think the qword functions are good enough. If someone needs to hash over some odd length they can either put them in a structure padded to 64 bits or call the hash function that takes a byte length. > >> Also, what is the significance of "24" in the function and constant >> names? Can we just drop that and call this siphash? > > SipHash is actually a family of PRFs, differentiated by the number of > SIPROUNDs after each 64-bit input is processed and the number of > SIPROUNDs at the very end of the function. The best trade-off of speed > and security for kernel usage is 2 rounds after each 64-bit input and > 4 rounds at the end of the function. This doesn't fall to any known > cryptanalysis and it's very fast. I'd still drop the "24" unless you really think we're going to have multiple variants coming into the kernel. Tom
Hey Tom, On Thu, Dec 15, 2016 at 12:14 AM, Tom Herbert <tom@herbertland.com> wrote: > I'm confused, doesn't 2dword == 1qword? Anyway, I think the qword > functions are good enough. If someone needs to hash over some odd > length they can either put them in a structure padded to 64 bits or > call the hash function that takes a byte length. Yes. Here's an example: static inline u64 siphash24_2dwords(const u32 a, const u32 b, const u8 key[SIPHASH24_KEY_LEN]) { return siphash24_1qword(((u64)b << 32) | a, key); } This winds up being extremely useful and syntactically convenient in a few places. Check out my git branch in about 10 minutes or wait for v4 to be posted tomorrow; these are nice helpers. > I'd still drop the "24" unless you really think we're going to have > multiple variants coming into the kernel. Okay. I don't have a problem with this, unless anybody has some reason to the contrary. Jason
On Wed, Dec 14, 2016 at 2:56 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > So actually jhash_Nwords makes no sense, since it takes dwords > (32-bits) not words (16-bits). The siphash analog should be called > siphash24_Nqwords. No. The bug is talking about "words" in the first place. Depending on your background, a "word" can be generally be either 16 bits or 32 bits (or, in some cases, 18 bits). In theory, a 64-bit entity can be a "word" too, but pretty much nobody uses that. Even architectures that started out with a 64-bit register size and never had any smaller historical baggage (eg alpha) tend to call 32-bit entities "words". So 16 bits can be a word, but some people/architectures will call it a "half-word". To make matters even more confusing, a "quadword" is generally always 64 bits, regardless of the size of "word". So please try to avoid the use of "word" entirely. It's too ambiguous, and it's not even helpful as a "size of the native register". It's almost purely random. For the kernel, we tend use - uX for types that have specific sizes (X being the number of bits) - "[unsigned] long" for native register size But never "word". Linus
Hey Linus, On Thu, Dec 15, 2016 at 12:30 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > No. The bug is talking about "words" in the first place. > > Depending on your background, a "word" can be generally be either 16 > bits or 32 bits (or, in some cases, 18 bits). > > In theory, a 64-bit entity can be a "word" too, but pretty much nobody > uses that. Even architectures that started out with a 64-bit register > size and never had any smaller historical baggage (eg alpha) tend to > call 32-bit entities "words". > > So 16 bits can be a word, but some people/architectures will call it a > "half-word". > > To make matters even more confusing, a "quadword" is generally always > 64 bits, regardless of the size of "word". > > So please try to avoid the use of "word" entirely. It's too ambiguous, > and it's not even helpful as a "size of the native register". It's > almost purely random. > > For the kernel, we tend use > > - uX for types that have specific sizes (X being the number of bits) > > - "[unsigned] long" for native register size > > But never "word". The voice of reason. Have a desired name for this function family? siphash_3u64s siphash_3u64 siphash_three_u64 siphash_3sixityfourbitintegers Or does your reasonable dislike of "word" still allow for the use of dword and qword, so that the current function names of: siphash_3qwords siphash_6dwords are okay? Jason
On Wed, Dec 14, 2016 at 3:34 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > Or does your reasonable dislike of "word" still allow for the use of > dword and qword, so that the current function names of: dword really is confusing to people. If you have a MIPS background, it means 64 bits. While to people with Windows programming backgrounds it means 32 bits. Please try to avoid using it. As mentioned, I think almost everybody agrees on the "q" part being 64 bits, but that may just be me not having seen it in any other context. And before anybody points it out - yes, we already have lots of uses of "dword" in various places. But they tend to be mostly hardware-specific - either architectures or drivers. So I'd _prefer_ to try to keep "word" and "dword" away from generic helper routines. But it's not like anything is really black and white. Linus
From: Linus Torvalds > Sent: 15 December 2016 00:11 > On Wed, Dec 14, 2016 at 3:34 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > > > Or does your reasonable dislike of "word" still allow for the use of > > dword and qword, so that the current function names of: > > dword really is confusing to people. > > If you have a MIPS background, it means 64 bits. While to people with > Windows programming backgrounds it means 32 bits. Guess what a DWORD_PTR is on 64bit windows ... (it is an integer type). David
On Thu, 15 Dec 2016, Jason A. Donenfeld wrote: > > I'd still drop the "24" unless you really think we're going to have > > multiple variants coming into the kernel. > > Okay. I don't have a problem with this, unless anybody has some reason > to the contrary. What if the 2/4-round version falls and we need more rounds to withstand future cryptoanalysis? We'd then have siphash_ and siphash48_ functions, no? My amateurish bike-shedding argument would be "let's keep the 24 then" :-) C.
diff --git a/include/linux/siphash.h b/include/linux/siphash.h new file mode 100644 index 000000000000..82dc1a911a2e --- /dev/null +++ b/include/linux/siphash.h @@ -0,0 +1,30 @@ +/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com> + * + * This file is provided under a dual BSD/GPLv2 license. + * + * SipHash: a fast short-input PRF + * https://131002.net/siphash/ + */ + +#ifndef _LINUX_SIPHASH_H +#define _LINUX_SIPHASH_H + +#include <linux/types.h> + +enum siphash_lengths { + SIPHASH24_KEY_LEN = 16, + SIPHASH24_ALIGNMENT = 8 +}; + +u64 siphash24(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]); + +#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +static inline u64 siphash24_unaligned(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]) +{ + return siphash24(data, len, key); +} +#else +u64 siphash24_unaligned(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]); +#endif + +#endif /* _LINUX_SIPHASH_H */ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index e6327d102184..32bbf689fc46 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1843,9 +1843,9 @@ config TEST_HASH tristate "Perform selftest on hash functions" default n help - Enable this option to test the kernel's integer (<linux/hash,h>) - and string (<linux/stringhash.h>) hash functions on boot - (or module load). + Enable this option to test the kernel's integer (<linux/hash.h>), + string (<linux/stringhash.h>), and siphash (<linux/siphash.h>) + hash functions on boot (or module load). This is intended to help people writing architecture-specific optimized versions. If unsure, say N. diff --git a/lib/Makefile b/lib/Makefile index 50144a3aeebd..71d398b04a74 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \ sha1.o chacha20.o md5.o irq_regs.o argv_split.o \ flex_proportions.o ratelimit.o show_mem.o \ is_single_threaded.o plist.o decompress.o kobject_uevent.o \ - earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o win_minmax.o + earlycpio.o seq_buf.o siphash.o \ + nmi_backtrace.o nodemask.o win_minmax.o lib-$(CONFIG_MMU) += ioremap.o lib-$(CONFIG_SMP) += cpumask.o @@ -44,7 +45,7 @@ obj-$(CONFIG_TEST_HEXDUMP) += test_hexdump.o obj-y += kstrtox.o obj-$(CONFIG_TEST_BPF) += test_bpf.o obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o -obj-$(CONFIG_TEST_HASH) += test_hash.o +obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o obj-$(CONFIG_TEST_KASAN) += test_kasan.o obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o obj-$(CONFIG_TEST_LKM) += test_module.o diff --git a/lib/siphash.c b/lib/siphash.c new file mode 100644 index 000000000000..32acdc26234f --- /dev/null +++ b/lib/siphash.c @@ -0,0 +1,153 @@ +/* Copyright (C) 2015-2016 Jason A. Donenfeld <Jason@zx2c4.com> + * Copyright (C) 2012-2014 Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> + * Copyright (C) 2012-2014 Daniel J. Bernstein <djb@cr.yp.to> + * + * This file is provided under a dual BSD/GPLv2 license. + * + * SipHash: a fast short-input PRF + * https://131002.net/siphash/ + */ + +#include <linux/siphash.h> +#include <linux/kernel.h> +#include <asm/unaligned.h> + +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 +#include <linux/dcache.h> +#include <asm/word-at-a-time.h> +#endif + +#define SIPROUND \ + do { \ + v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \ + v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \ + v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \ + v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \ + } while(0) + +static inline u16 le16_to_cpuvp(const void *p) +{ + return le16_to_cpup(p); +} +static inline u32 le32_to_cpuvp(const void *p) +{ + return le32_to_cpup(p); +} +static inline u64 le64_to_cpuvp(const void *p) +{ + return le64_to_cpup(p); +} + +/** + * siphash24 - compute 64-bit siphash24 PRF value + * @data: buffer to hash, must be aligned to SIPHASH24_ALIGNMENT + * @size: size of @data + * @key: key buffer of size SIPHASH24_KEY_LEN, must be aligned to SIPHASH24_ALIGNMENT + */ +u64 siphash24(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]) +{ + u64 v0 = 0x736f6d6570736575ULL; + u64 v1 = 0x646f72616e646f6dULL; + u64 v2 = 0x6c7967656e657261ULL; + u64 v3 = 0x7465646279746573ULL; + u64 b = ((u64)len) << 56; + u64 k0 = le64_to_cpuvp(key); + u64 k1 = le64_to_cpuvp(key + sizeof(u64)); + u64 m; + const u8 *end = data + len - (len % sizeof(u64)); + const u8 left = len & (sizeof(u64) - 1); + v3 ^= k1; + v2 ^= k0; + v1 ^= k1; + v0 ^= k0; + for (; data != end; data += sizeof(u64)) { + m = le64_to_cpuvp(data); + v3 ^= m; + SIPROUND; + SIPROUND; + v0 ^= m; + } +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 + if (left) + b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) & bytemask_from_count(left))); +#else + switch (left) { + case 7: b |= ((u64)data[6]) << 48; + case 6: b |= ((u64)data[5]) << 40; + case 5: b |= ((u64)data[4]) << 32; + case 4: b |= le32_to_cpuvp(data); break; + case 3: b |= ((u64)data[2]) << 16; + case 2: b |= le16_to_cpuvp(data); break; + case 1: b |= data[0]; + } +#endif + v3 ^= b; + SIPROUND; + SIPROUND; + v0 ^= b; + v2 ^= 0xff; + SIPROUND; + SIPROUND; + SIPROUND; + SIPROUND; + return (v0 ^ v1) ^ (v2 ^ v3); +} +EXPORT_SYMBOL(siphash24); + +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +/** + * siphash24 - compute 64-bit siphash24 PRF value, without alignment requirements + * @data: buffer to hash + * @size: size of @data + * @key: key buffer of size SIPHASH24_KEY_LEN + */ +u64 siphash24_unaligned(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]) +{ + u64 v0 = 0x736f6d6570736575ULL; + u64 v1 = 0x646f72616e646f6dULL; + u64 v2 = 0x6c7967656e657261ULL; + u64 v3 = 0x7465646279746573ULL; + u64 b = ((u64)len) << 56; + u64 k0 = get_unaligned_le64(key); + u64 k1 = get_unaligned_le64(key + sizeof(u64)); + u64 m; + const u8 *end = data + len - (len % sizeof(u64)); + const u8 left = len & (sizeof(u64) - 1); + v3 ^= k1; + v2 ^= k0; + v1 ^= k1; + v0 ^= k0; + for (; data != end; data += sizeof(u64)) { + m = get_unaligned_le64(data); + v3 ^= m; + SIPROUND; + SIPROUND; + v0 ^= m; + } +#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64 + if (left) + b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) & bytemask_from_count(left))); +#else + switch (left) { + case 7: b |= ((u64)data[6]) << 48; + case 6: b |= ((u64)data[5]) << 40; + case 5: b |= ((u64)data[4]) << 32; + case 4: b |= get_unaligned_le32(data); break; + case 3: b |= ((u64)data[2]) << 16; + case 2: b |= get_unaligned_le16(data); break; + case 1: b |= data[0]; + } +#endif + v3 ^= b; + SIPROUND; + SIPROUND; + v0 ^= b; + v2 ^= 0xff; + SIPROUND; + SIPROUND; + SIPROUND; + SIPROUND; + return (v0 ^ v1) ^ (v2 ^ v3); +} +EXPORT_SYMBOL(siphash24_unaligned); +#endif diff --git a/lib/test_siphash.c b/lib/test_siphash.c new file mode 100644 index 000000000000..69ac94dec366 --- /dev/null +++ b/lib/test_siphash.c @@ -0,0 +1,85 @@ +/* Test cases for siphash.c + * + * Copyright (C) 2015-2016 Jason A. Donenfeld <Jason@zx2c4.com> + * + * This file is provided under a dual BSD/GPLv2 license. + * + * SipHash: a fast short-input PRF + * https://131002.net/siphash/ + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include <linux/siphash.h> +#include <linux/kernel.h> +#include <linux/string.h> +#include <linux/errno.h> +#include <linux/module.h> + +/* Test vectors taken from official reference source available at: + * https://131002.net/siphash/siphash24.c + */ +static const u64 test_vectors[64] = { + 0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL, + 0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL, + 0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL, + 0x9e0082df0ba9e4b0ULL, 0x7a5dbbc594ddb9f3ULL, 0xf4b32f46226bada7ULL, + 0x751e8fbc860ee5fbULL, 0x14ea5627c0843d90ULL, 0xf723ca908e7af2eeULL, + 0xa129ca6149be45e5ULL, 0x3f2acc7f57c29bdbULL, 0x699ae9f52cbe4794ULL, + 0x4bc1b3f0968dd39cULL, 0xbb6dc91da77961bdULL, 0xbed65cf21aa2ee98ULL, + 0xd0f2cbb02e3b67c7ULL, 0x93536795e3a33e88ULL, 0xa80c038ccd5ccec8ULL, + 0xb8ad50c6f649af94ULL, 0xbce192de8a85b8eaULL, 0x17d835b85bbb15f3ULL, + 0x2f2e6163076bcfadULL, 0xde4daaaca71dc9a5ULL, 0xa6a2506687956571ULL, + 0xad87a3535c49ef28ULL, 0x32d892fad841c342ULL, 0x7127512f72f27cceULL, + 0xa7f32346f95978e3ULL, 0x12e0b01abb051238ULL, 0x15e034d40fa197aeULL, + 0x314dffbe0815a3b4ULL, 0x027990f029623981ULL, 0xcadcd4e59ef40c4dULL, + 0x9abfd8766a33735cULL, 0x0e3ea96b5304a7d0ULL, 0xad0c42d6fc585992ULL, + 0x187306c89bc215a9ULL, 0xd4a60abcf3792b95ULL, 0xf935451de4f21df2ULL, + 0xa9538f0419755787ULL, 0xdb9acddff56ca510ULL, 0xd06c98cd5c0975ebULL, + 0xe612a3cb9ecba951ULL, 0xc766e62cfcadaf96ULL, 0xee64435a9752fe72ULL, + 0xa192d576b245165aULL, 0x0a8787bf8ecb74b2ULL, 0x81b3e73d20b49b6fULL, + 0x7fa8220ba3b2eceaULL, 0x245731c13ca42499ULL, 0xb78dbfaf3a8d83bdULL, + 0xea1ad565322a1a0bULL, 0x60e61c23a3795013ULL, 0x6606d7e446282b93ULL, + 0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL, + 0x958a324ceb064572ULL +}; + +static int __init siphash_test_init(void) +{ + u8 in[64] __aligned(SIPHASH24_ALIGNMENT); + u8 k[16] __aligned(SIPHASH24_ALIGNMENT); + u8 in_unaligned[65]; + u8 k_unaligned[65]; + u8 i; + int ret = 0; + + for (i = 0; i < 16; ++i) { + k[i] = i; + k_unaligned[i + 1] = i; + } + for (i = 0; i < 64; ++i) { + in[i] = i; + in_unaligned[i + 1] = i; + if (siphash24(in, i, k) != test_vectors[i]) { + pr_info("self-test aligned %u: FAIL\n", i + 1); + ret = -EINVAL; + } + if (siphash24_unaligned(in_unaligned + 1, i, k_unaligned + 1) != test_vectors[i]) { + pr_info("self-test unaligned %u: FAIL\n", i + 1); + ret = -EINVAL; + } + } + if (!ret) + pr_info("self-tests: pass\n"); + return ret; +} + +static void __exit siphash_test_exit(void) +{ +} + +module_init(siphash_test_init); +module_exit(siphash_test_exit); + +MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>"); +MODULE_LICENSE("Dual BSD/GPL");
SipHash is a 64-bit keyed hash function that is actually a cryptographically secure PRF, like HMAC. Except SipHash is super fast, and is meant to be used as a hashtable keyed lookup function. SipHash isn't just some new trendy hash function. It's been around for a while, and there really isn't anything that comes remotely close to being useful in the way SipHash is. With that said, why do we need this? There are a variety of attacks known as "hashtable poisoning" in which an attacker forms some data such that the hash of that data will be the same, and then preceeds to fill up all entries of a hashbucket. This is a realistic and well-known denial-of-service vector. Linux developers already seem to be aware that this is an issue, and various places that use hash tables in, say, a network context, use a non-cryptographically secure function (usually jhash) and then try to twiddle with the key on a time basis (or in many cases just do nothing and hope that nobody notices). While this is an admirable attempt at solving the problem, it doesn't actually fix it. SipHash fixes it. (It fixes it in such a sound way that you could even build a stream cipher out of SipHash that would resist the modern cryptanalysis.) There are a modicum of places in the kernel that are vulnerable to hashtable poisoning attacks, either via userspace vectors or network vectors, and there's not a reliable mechanism inside the kernel at the moment to fix it. The first step toward fixing these issues is actually getting a secure primitive into the kernel for developers to use. Then we can, bit by bit, port things over to it as deemed appropriate. Secondly, a few places are using MD5 for creating secure sequence numbers, port numbers, or fast random numbers. Siphash is a faster, more fittting, and more secure replacement for MD5 in those situations. Dozens of languages are already using this internally for their hash tables. Some of the BSDs already use this in their kernels. SipHash is a widely known high-speed solution to a widely known problem, and it's time we catch-up. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> Cc: Daniel J. Bernstein <djb@cr.yp.to> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: David Laight <David.Laight@aculab.com> --- Changes from v2->v3: - There is now a fast aligned version of the function and a not-as-fast unaligned version. The requirements for each have been documented in a docbook-style comment. As well, the header now contains a constant for the expected alignment. - The test suite has been updated to check both the unaligned and aligned version of the function. include/linux/siphash.h | 30 ++++++++++ lib/Kconfig.debug | 6 +- lib/Makefile | 5 +- lib/siphash.c | 153 ++++++++++++++++++++++++++++++++++++++++++++++++ lib/test_siphash.c | 85 +++++++++++++++++++++++++++ 5 files changed, 274 insertions(+), 5 deletions(-) create mode 100644 include/linux/siphash.h create mode 100644 lib/siphash.c create mode 100644 lib/test_siphash.c