Message ID | 20200205223950.1212394-4-kristen@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | Finer grained kernel address space randomization |
> On Feb 5, 2020, at 2:39 PM, Kristen Carlson Accardi <kristen@linux.intel.com> wrote:
>
> From: Kees Cook <keescook@chromium.org>
>
> Under earlyprintk, each RNG call produces a debug report line. When
> shuffling hundreds of functions, this is not useful information (each
> line is identical and tells us nothing new). Instead, allow for a NULL
> "purpose" to suppress the debug reporting.

Have you counted how many RDRAND calls this causes? RDRAND is exceedingly slow on all CPUs I’ve looked at. The whole “RDRAND has great bandwidth” marketing BS actually means that it has decent bandwidth if all CPUs hammer it at the same time. The latency is abysmal. I have asked Intel to improve this, but the latency of that request will be quadrillions of cycles :)

It wouldn’t shock me if just the RDRAND calls account for a respectable fraction of total time. The RDTSC fallback, on the other hand, may be so predictable as to be useless.

I would suggest adding a little ChaCha20 DRBG or similar to the KASLR environment instead. What crypto primitives are available there?

On Wed, Feb 05, 2020 at 05:08:55PM -0800, Andy Lutomirski wrote:
> > On Feb 5, 2020, at 2:39 PM, Kristen Carlson Accardi <kristen@linux.intel.com> wrote:
> >
> > From: Kees Cook <keescook@chromium.org>
> >
> > Under earlyprintk, each RNG call produces a debug report line. When
> > shuffling hundreds of functions, this is not useful information (each
> > line is identical and tells us nothing new). Instead, allow for a NULL
> > "purpose" to suppress the debug reporting.
>
> Have you counted how many RDRAND calls this causes? RDRAND is
> exceedingly slow on all CPUs I’ve looked at. The whole “RDRAND
> has great bandwidth” marketing BS actually means that it has decent
> bandwidth if all CPUs hammer it at the same time. The latency is abysmal.
> I have asked Intel to improve this, but the latency of that request will
> be quadrillions of cycles :)

In an earlier version of this series, it was called once per function section (so, about 50,000 times). The (lack of) speed was quite measurable.

> I would suggest adding a little ChaCha20 DRBG or similar to the KASLR
> environment instead. What crypto primitives are available there?

Agreed. The simple PRNG in the next patch was mostly just a POC initially, but Kristen kept it due to its debugging properties (specifying an external seed). Pulling in ChaCha20 seems like a good approach.

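To illustrate the point about call counts, here is a minimal sketch of shuffling section indices with a small seeded generator, so that a slow entropy source would be consulted once for the seed rather than once per section. Everything here is an assumption for illustration: the names `sketch_prng`, `sketch_prng_seed`, `sketch_prng_next` and `sketch_shuffle` are hypothetical, and the generator is a plain xorshift64, which is neither the PRNG from the next patch nor the ChaCha20 DRBG suggested above, and is not cryptographically strong.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generator state; the seed would come from a single
 * call to a slow entropy source (or from an external debug seed). */
struct sketch_prng {
	uint64_t state;
};

static void sketch_prng_seed(struct sketch_prng *p, uint64_t seed)
{
	/* xorshift64 must never hold an all-zero state. */
	p->state = seed ? seed : 0x9e3779b97f4a7c15ULL;
}

static uint64_t sketch_prng_next(struct sketch_prng *p)
{
	uint64_t x = p->state;

	assert(x != 0);
	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	p->state = x;
	return x;
}

/* Fisher-Yates shuffle of n section indices. The modulo introduces a
 * small bias, which this sketch ignores. */
static void sketch_shuffle(size_t *idx, size_t n, struct sketch_prng *p)
{
	for (size_t i = n; i > 1; i--) {
		size_t j = (size_t)(sketch_prng_next(p) % i);
		size_t tmp = idx[i - 1];

		idx[i - 1] = idx[j];
		idx[j] = tmp;
	}
}
```

Seeding twice with the same value reproduces the same ordering, which is the debugging property mentioned above for an externally specified seed.
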
On Wed, 2020-02-05 at 17:08 -0800, Andy Lutomirski wrote:
> > On Feb 5, 2020, at 2:39 PM, Kristen Carlson Accardi <kristen@linux.intel.com> wrote:
> >
> > From: Kees Cook <keescook@chromium.org>
> >
> > Under earlyprintk, each RNG call produces a debug report line. When
> > shuffling hundreds of functions, this is not useful information (each
> > line is identical and tells us nothing new). Instead, allow for a NULL
> > "purpose" to suppress the debug reporting.
>
> Have you counted how many RDRAND calls this causes? RDRAND is
> exceedingly slow on all CPUs I’ve looked at. The whole “RDRAND has
> great bandwidth” marketing BS actually means that it has decent
> bandwidth if all CPUs hammer it at the same time. The latency is
> abysmal. I have asked Intel to improve this, but the latency of that
> request will be quadrillions of cycles :)
>
> It wouldn’t shock me if just the RDRAND calls account for a
> respectable fraction of total time. The RDTSC fallback, on the other
> hand, may be so predictable as to be useless.

I think at the moment the calls to rdrand are really not the largest contributor to the latency. The relocations are the real bottleneck - each address must be inspected to see if it is in the list of function sections that have been randomized, and the value at that address must also be inspected to see if it's in the list of function sections. That's a lot of lookups.

That said, I tried to measure the difference between using Kees' prng vs. the rdrand calls and found little to no measurable difference. I think at this point it's in the noise - hopefully we will get to a point where this matters more.

> I would suggest adding a little ChaCha20 DRBG or similar to the KASLR
> environment instead. What crypto primitives are available there?

I will read up on this.

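The per-relocation check described above can be pictured with a short sketch. The message does not say how the series performs the lookup, so this is only one plausible shape, assuming the randomized sections are kept in a table sorted by start address with no overlaps. The type `sketch_section` and the function `sketch_find_section` are hypothetical names, not identifiers from the patch series.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one randomized function section. */
struct sketch_section {
	uint64_t start;	/* original start address */
	uint64_t size;	/* length in bytes */
};

/* Binary search over a table sorted by start address with no overlaps.
 * Returns the section containing addr, or NULL if none does. */
static const struct sketch_section *
sketch_find_section(const struct sketch_section *secs, size_t n, uint64_t addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < secs[mid].start)
			hi = mid;
		else if (addr - secs[mid].start >= secs[mid].size)
			lo = mid + 1;
		else
			return &secs[mid];
	}
	return NULL;
}
```

Under this assumption each relocation would need two such searches, one for the address being patched and one for the value stored there, which gives a sense of why the lookups add up across a whole kernel image.
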
diff --git a/arch/x86/lib/kaslr.c b/arch/x86/lib/kaslr.c
index a53665116458..2b3eb8c948a3 100644
--- a/arch/x86/lib/kaslr.c
+++ b/arch/x86/lib/kaslr.c
@@ -56,11 +56,14 @@ unsigned long kaslr_get_random_long(const char *purpose)
 	unsigned long raw, random = get_boot_seed();
 	bool use_i8254 = true;
 
-	debug_putstr(purpose);
-	debug_putstr(" KASLR using");
+	if (purpose) {
+		debug_putstr(purpose);
+		debug_putstr(" KASLR using");
+	}
 
 	if (has_cpuflag(X86_FEATURE_RDRAND)) {
-		debug_putstr(" RDRAND");
+		if (purpose)
+			debug_putstr(" RDRAND");
 		if (rdrand_long(&raw)) {
 			random ^= raw;
 			use_i8254 = false;
@@ -68,7 +71,8 @@ unsigned long kaslr_get_random_long(const char *purpose)
 	}
 
 	if (has_cpuflag(X86_FEATURE_TSC)) {
-		debug_putstr(" RDTSC");
+		if (purpose)
+			debug_putstr(" RDTSC");
 		raw = rdtsc();
 
 		random ^= raw;
@@ -76,7 +80,8 @@ unsigned long kaslr_get_random_long(const char *purpose)
 	}
 
 	if (use_i8254) {
-		debug_putstr(" i8254");
+		if (purpose)
+			debug_putstr(" i8254");
 		random ^= i8254();
 	}
 
@@ -86,7 +91,8 @@ unsigned long kaslr_get_random_long(const char *purpose)
 		: "a" (random), "rm" (mix_const));
 	random += raw;
 
-	debug_putstr("...\n");
+	if (purpose)
+		debug_putstr("...\n");
 
 	return random;
 }
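
The effect of the NULL check can be seen in a reduced user-space mock. This is a sketch only: `sketch_debug_putstr`, `sketch_log` and `sketch_get_random_long` are hypothetical stand-ins, the returned value is a fixed placeholder rather than real entropy, and the RDRAND, RDTSC and i8254 branches are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical capture buffer standing in for the early console. */
static char sketch_log[128];

static void sketch_debug_putstr(const char *s)
{
	assert(s != NULL);
	strncat(sketch_log, s, sizeof(sketch_log) - strlen(sketch_log) - 1);
}

/* Mirrors only the reporting logic of the patched function. */
static unsigned long sketch_get_random_long(const char *purpose)
{
	unsigned long random = 0x1234UL; /* placeholder, NOT an entropy source */

	if (purpose) {
		sketch_debug_putstr(purpose);
		sketch_debug_putstr(" KASLR using");
	}

	/* Entropy gathering omitted in this mock. */

	if (purpose)
		sketch_debug_putstr("...\n");

	return random;
}
```

Calling `sketch_get_random_long(NULL)` leaves `sketch_log` untouched, while passing a non-NULL label appends one report line, which matches the intent that a caller shuffling many sections can pass NULL to stay quiet.
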