random: use max-period linear interrupt extractor

Message ID 20220218121054.45304-1-Jason@zx2c4.com (mailing list archive)
State Not Applicable
Delegated to: Herbert Xu

Commit Message

Jason A. Donenfeld Feb. 18, 2022, 12:10 p.m. UTC
The current fast_mix() function is a piece of classic mailing list
crypto, where it just sort of sprung up without a lot of real analysis
of what precisely it was accomplishing. As an ARX permutation of only
two rounds, there are some easily searchable differential trails in it,
and as a means of preventing malicious interrupts, it completely fails,
since it xors new data into the entire state every time. It can't really
be analyzed as a random permutation, because it clearly isn't, and it
can't be analyzed as an interesting linear algebraic structure either,
because it's also not that. There really is very little one can say
about it in terms of entropy accumulation. It might diffuse bits, some
of the time, maybe, we hope, I guess. But for the most part, it fails to
accomplish anything concrete.

As a reminder, the simple goal of add_interrupt_randomness() is to
simply accumulate entropy until ~64 interrupts have elapsed, and then
dump it into the main input pool, which uses a cryptographic hash.

It would be nice to have something cryptographically strong in the
interrupt handler itself, in case a malicious interrupt compromises a
per-cpu fast pool within the 64 interrupts / 1 second window, and then
inside of that same window somehow can control its return address and
cycle counter, even if that's a bit far-fetched. However, with a very
CPU-limited budget, actually doing that remains an active research
project (and perhaps there'll be something useful for Linux to come out
of it). And while that abundance of caution would be nice, this isn't
*currently* the security model, and we don't yet have a fast enough
solution to make it our security model.  Plus there's not exactly a
pressing need to do that. (And for the avoidance of doubt, the actual
cluster of 64 accumulated interrupts still gets dumped into our
cryptographically secure input pool.)

So, for now we are going to stick with the existing interrupt security
model, which assumes that each cluster of 64 interrupt data samples is
mostly non-malicious and not colluding with an infoleaker. With this as
our goal, we can then endeavor to simply accumulate entropy linearly,
discarding the least amount of it, and make sure that accumulation is
sound, unlike the case with the current fast_mix().

It turns out that this security model is also the trade-off that other
operating systems have taken. The NT kernel, for example, uses something
very simple to accumulate entropy in interrupts, `s = ror32(s, 5) ^ x`.
Dodis et al. analyzed this in <https://eprint.iacr.org/2021/523>, and
found that rotation by 7 would have been better than 5, but that
otherwise, simple linear constructions like this can be used as an
entropy collector for 2-monotone distributions.

However, when considering this for our four-word accumulation, versus
NT's one-word, we run into potential problems because the words don't
contribute to each other, and some may well be fixed, which means we'd
need something to schedule on top. And more importantly, our
distribution is not 2-monotone like NT's, because in addition to the
cycle counter, we also include in those 4 words a register value, a
return address, and an inverted jiffies. (Whether capturing anything
beyond the cycle counter in the interrupt handler is even adding much of
value is a question for a different time.)

So since we have 4 words, and not all of them are 2-monotone, we instead
look for a proven linear extractor that works for more complex
distributions. It turns out that a max-period linear feedback shift
register fits this bill quite well, easily extending to the larger state
size and to the fact that we're mixing in more than just the cycle
counter. By picking a linear function with no non-trivial invariant
subspace, unlike NT's rotate-xor, we benefit from the analysis of
<https://eprint.iacr.org/2021/1002>.  This paper proves that those types
of linear functions, used in the form `s = f(s) ^ x`, make very good
entropy extractors for the type of complex distributions that we have.

This commit implements one such very fast and high diffusion max-period
linear function in a Feistel-like fashion, which pipelines quite well.
On an i7-11850H, this takes 34 cycles, versus the original's 65 cycles.
(Though, as a note for posterity: if later this is replaced with some
sort of cryptographic hash function, I'll still be keeping 65 cycles as
my target.)

Comments

Greg Kroah-Hartman Feb. 19, 2022, 7:47 a.m. UTC | #1
On Fri, Feb 18, 2022 at 01:10:54PM +0100, Jason A. Donenfeld wrote:
> [...]
Jason A. Donenfeld Feb. 21, 2022, 10:09 a.m. UTC | #2
Just FYI, I'll probably sit on this for a bit before merging and will
likely send a v2. In practice this is probably fine. But I'd like to
spend a bit more time working through the theory before settling on
this.

Patch

diff --git a/drivers/char/random.c b/drivers/char/random.c
index caa276cfbc76..4dc751ad3854 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1195,44 +1195,33 @@  void add_bootloader_randomness(const void *buf, size_t size)
 EXPORT_SYMBOL_GPL(add_bootloader_randomness);
 
 struct fast_pool {
-	union {
-		u32 pool32[4];
-		u64 pool64[2];
-	};
 	struct work_struct mix;
 	unsigned long last;
 	unsigned int count;
+	u32 pool[4];
 	u16 reg_idx;
 };
 
 /*
- * This is a fast mixing routine used by the interrupt randomness
- * collector. It's hardcoded for an 128 bit pool and assumes that any
- * locks that might be needed are taken by the caller.
+ * This is a max-period LFSR, mixing 128 bits into the 128-bit pool.
+ * It assumes that its inputs are non-malicious. It is designed to
+ * be much faster than computing a cryptographic hash function, yet
+ * still accumulate entropy, though it has no security on its own.
  */
-static void fast_mix(u32 pool[4])
+static void fast_mix(u32 h[4], const u32 v[4])
 {
-	u32 a = pool[0],	b = pool[1];
-	u32 c = pool[2],	d = pool[3];
-
-	a += b;			c += d;
-	b = rol32(b, 6);	d = rol32(d, 27);
-	d ^= a;			b ^= c;
-
-	a += b;			c += d;
-	b = rol32(b, 16);	d = rol32(d, 14);
-	d ^= a;			b ^= c;
-
-	a += b;			c += d;
-	b = rol32(b, 6);	d = rol32(d, 27);
-	d ^= a;			b ^= c;
-
-	a += b;			c += d;
-	b = rol32(b, 16);	d = rol32(d, 14);
-	d ^= a;			b ^= c;
+	size_t i;
 
-	pool[0] = a;  pool[1] = b;
-	pool[2] = c;  pool[3] = d;
+	for (i = 0; i < 4; ++i) {
+		u32 w = h[0] ^ h[1] ^ v[i] ^ h[3];
+		w ^= w << 17;
+		w ^= w >> 6;
+		w ^= w >> 9;
+		h[0] = h[1];
+		h[1] = h[2];
+		h[2] = h[3];
+		h[3] = w;
+	}
 }
 
 static DEFINE_PER_CPU(struct fast_pool, irq_randomness);
@@ -1291,7 +1280,7 @@  static void mix_interrupt_randomness(struct work_struct *work)
 	 * Copy the pool to the stack so that the mixer always has a
 	 * consistent view, before we reenable irqs again.
 	 */
-	memcpy(pool, fast_pool->pool32, sizeof(pool));
+	memcpy(pool, fast_pool->pool, sizeof(pool));
 	fast_pool->count = 0;
 	fast_pool->last = jiffies;
 	local_irq_enable();
@@ -1309,35 +1298,39 @@  void add_interrupt_randomness(int irq)
 	unsigned long now = jiffies;
 	cycles_t cycles = random_get_entropy();
 	unsigned int new_count;
+	union {
+		u32 u32[4];
+		u64 u64[2];
+	} irq_data;
 
 	if (cycles == 0)
 		cycles = get_reg(fast_pool, regs);
 
 	if (sizeof(cycles) == 8)
-		fast_pool->pool64[0] ^= cycles ^ rol64(now, 32) ^ irq;
+		irq_data.u64[0] = cycles ^ rol64(now, 32) ^ irq;
 	else {
-		fast_pool->pool32[0] ^= cycles ^ irq;
-		fast_pool->pool32[1] ^= now;
+		irq_data.u32[0] = cycles ^ irq;
+		irq_data.u32[1] = now;
 	}
 
 	if (sizeof(unsigned long) == 8)
-		fast_pool->pool64[1] ^= regs ? instruction_pointer(regs) : _RET_IP_;
+		irq_data.u64[1] = regs ? instruction_pointer(regs) : _RET_IP_;
 	else {
-		fast_pool->pool32[2] ^= regs ? instruction_pointer(regs) : _RET_IP_;
-		fast_pool->pool32[3] ^= get_reg(fast_pool, regs);
+		irq_data.u32[2] = regs ? instruction_pointer(regs) : _RET_IP_;
+		irq_data.u32[3] = get_reg(fast_pool, regs);
 	}
 
-	fast_mix(fast_pool->pool32);
+	fast_mix(fast_pool->pool, irq_data.u32);
 	new_count = ++fast_pool->count;
 
 	if (unlikely(crng_init == 0)) {
 		if (new_count >= 64 &&
-		    crng_pre_init_inject(fast_pool->pool32, sizeof(fast_pool->pool32),
+		    crng_pre_init_inject(fast_pool->pool, sizeof(fast_pool->pool),
 					 true, true) > 0) {
 			fast_pool->count = 0;
 			fast_pool->last = now;
 			if (spin_trylock(&input_pool.lock)) {
-				_mix_pool_bytes(&fast_pool->pool32, sizeof(fast_pool->pool32));
+				_mix_pool_bytes(&fast_pool->pool, sizeof(fast_pool->pool));
 				spin_unlock(&input_pool.lock);
 			}
 		}