diff mbox series

[RFC,03/11] x86/boot: Allow a "silent" kaslr random byte fetch

Message ID 20200205223950.1212394-4-kristen@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series Finer grained kernel address space randomization | expand

Commit Message

Kristen Carlson Accardi Feb. 5, 2020, 10:39 p.m. UTC
From: Kees Cook <keescook@chromium.org>

Under earlyprintk, each RNG call produces a debug report line. When
shuffling hundreds of functions, this is not useful information (each
line is identical and tells us nothing new). Instead, allow for a NULL
"purpose" to suppress the debug reporting.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
---
 arch/x86/lib/kaslr.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

Comments

Andy Lutomirski Feb. 6, 2020, 1:08 a.m. UTC | #1
> On Feb 5, 2020, at 2:39 PM, Kristen Carlson Accardi <kristen@linux.intel.com> wrote:
> 
> From: Kees Cook <keescook@chromium.org>
> 
> Under earlyprintk, each RNG call produces a debug report line. When
> shuffling hundreds of functions, this is not useful information (each
> line is identical and tells us nothing new). Instead, allow for a NULL
> "purpose" to suppress the debug reporting.

Have you counted how many RDRAND calls this causes?  RDRAND is exceedingly slow on all CPUs I’ve looked at. The whole “RDRAND has great bandwidth” marketing BS actually means that it has decent bandwidth if all CPUs hammer it at the same time. The latency is abysmal.  I have asked Intel to improve this, but the latency of that request will be quadrillions of cycles :)

It wouldn’t shock me if just the RDRAND calls account for a respectable fraction of total time. The RDTSC fallback, on the other hand, may be so predictable as to be useless.

I would suggest adding a little ChaCha20 DRBG or similar to the KASLR environment instead. What crypto primitives are available there?
Kees Cook Feb. 6, 2020, 11:48 a.m. UTC | #2
On Wed, Feb 05, 2020 at 05:08:55PM -0800, Andy Lutomirski wrote:
> 
> 
> > On Feb 5, 2020, at 2:39 PM, Kristen Carlson Accardi <kristen@linux.intel.com> wrote:
> > 
> > From: Kees Cook <keescook@chromium.org>
> > 
> > Under earlyprintk, each RNG call produces a debug report line. When
> > shuffling hundreds of functions, this is not useful information (each
> > line is identical and tells us nothing new). Instead, allow for a NULL
> > "purpose" to suppress the debug reporting.
> 
> Have you counted how many RDRAND calls this causes?  RDRAND is
> exceedingly slow on all CPUs I’ve looked at. The whole “RDRAND
> has great bandwidth” marketing BS actually means that it has decent
> bandwidth if all CPUs hammer it at the same time. The latency is abysmal.
> I have asked Intel to improve this, but the latency of that request will
> be quadrillions of cycles :)

In an earlier version of this series, it was called once per function
section (so, about 50,000 times). The (lack of) speed was quite
measurable.

> I would suggest adding a little ChaCha20 DRBG or similar to the KASLR
> environment instead. What crypto primitives are available there?

Agreed. The simple PRNG in the next patch was most just a POC initially,
but Kristen kept it due to its debugging properties (specifying an
external seed). Pulling in ChaCha20 seems like a good approach.
Kristen Carlson Accardi Feb. 6, 2020, 4:58 p.m. UTC | #3
On Wed, 2020-02-05 at 17:08 -0800, Andy Lutomirski wrote:
> > On Feb 5, 2020, at 2:39 PM, Kristen Carlson Accardi <
> > kristen@linux.intel.com> wrote:
> > 
> > From: Kees Cook <keescook@chromium.org>
> > 
> > Under earlyprintk, each RNG call produces a debug report line. When
> > shuffling hundreds of functions, this is not useful information
> > (each
> > line is identical and tells us nothing new). Instead, allow for a
> > NULL
> > "purpose" to suppress the debug reporting.
> 
> Have you counted how many RDRAND calls this causes?  RDRAND is
> exceedingly slow on all CPUs I’ve looked at. The whole “RDRAND has
> great bandwidth” marketing BS actually means that it has decent
> bandwidth if all CPUs hammer it at the same time. The latency is
> abysmal.  I have asked Intel to improve this, but the latency of that
> request will be quadrillions of cycles :)
> 
> It wouldn’t shock me if just the RDRAND calls account for a
> respectable fraction of total time. The RDTSC fallback, on the other
> hand, may be so predictable as to be useless.

I think at the moment the calls to rdrand are really not the largest
contributor to the latency. The relocations are the real bottleneck -
each address must be inspected to see if it is in the list of function
sections that have been randomized, and the value at that address must
also be inspected to see if it's in the list of function sections.
That's a lot of lookups. That said, I tried to measure the difference
between using Kees' prng vs. the rdrand calls and found little to no
measurable difference. I think at this point it's in the noise -
hopefully we will get to a point where this matters more.

> 
> I would suggest adding a little ChaCha20 DRBG or similar to the KASLR
> environment instead. What crypto primitives are available there?

I will read up on this.
diff mbox series

Patch

diff --git a/arch/x86/lib/kaslr.c b/arch/x86/lib/kaslr.c
index a53665116458..2b3eb8c948a3 100644
--- a/arch/x86/lib/kaslr.c
+++ b/arch/x86/lib/kaslr.c
@@ -56,11 +56,14 @@  unsigned long kaslr_get_random_long(const char *purpose)
 	unsigned long raw, random = get_boot_seed();
 	bool use_i8254 = true;
 
-	debug_putstr(purpose);
-	debug_putstr(" KASLR using");
+	if (purpose) {
+		debug_putstr(purpose);
+		debug_putstr(" KASLR using");
+	}
 
 	if (has_cpuflag(X86_FEATURE_RDRAND)) {
-		debug_putstr(" RDRAND");
+		if (purpose)
+			debug_putstr(" RDRAND");
 		if (rdrand_long(&raw)) {
 			random ^= raw;
 			use_i8254 = false;
@@ -68,7 +71,8 @@  unsigned long kaslr_get_random_long(const char *purpose)
 	}
 
 	if (has_cpuflag(X86_FEATURE_TSC)) {
-		debug_putstr(" RDTSC");
+		if (purpose)
+			debug_putstr(" RDTSC");
 		raw = rdtsc();
 
 		random ^= raw;
@@ -76,7 +80,8 @@  unsigned long kaslr_get_random_long(const char *purpose)
 	}
 
 	if (use_i8254) {
-		debug_putstr(" i8254");
+		if (purpose)
+			debug_putstr(" i8254");
 		random ^= i8254();
 	}
 
@@ -86,7 +91,8 @@  unsigned long kaslr_get_random_long(const char *purpose)
 	    : "a" (random), "rm" (mix_const));
 	random += raw;
 
-	debug_putstr("...\n");
+	if (purpose)
+		debug_putstr("...\n");
 
 	return random;
 }