diff mbox

printk: hash addresses printed with %p

Message ID 1507693696-3777-1-git-send-email-me@tobin.cc (mailing list archive)
State New, archived
Headers show

Commit Message

Tobin Harding Oct. 11, 2017, 3:48 a.m. UTC
Currently there are many places in the kernel where addresses are being
printed using an unadorned %p. Kernel pointers should be printed using
%pK allowing some control via the kptr_restrict sysctl. Exposing addresses
gives attackers sensitive information about the kernel layout in memory.

We can reduce the attack surface by hashing all addresses printed with
%p. This will of course break some users, forcing code printing needed
addresses to be updated.

For what it's worth, usage of unadorned %p can be broken down as follows

    git grep '%p[^KFfSsBRrbMmIiEUVKNhdDgCGO]' | wc -l

arch: 2512
block: 20
crypto: 12
fs: 1221
include: 147
kernel: 109
lib: 77
mm: 120
net: 1516
security: 11
sound: 168
virt: 2
drivers: 8420

Add function ptr_to_id() to map an address to a unique identifier. This
mapping is created by calling ptr_obfuscate() to hash the address. The
hashing algorithm is carried out in two stages. First the address is
xor'd by a random value then we multiply the xor production by a second
random value.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
---

This is version 2 of the series (of which I sent only the cover letter,
failing to send the actual patches)

	[PATCH 0/3] add %pX specifier

Implementing changes as suggested by Linus (in response to the cover
letter). Patch 2 and 3 of the original series dropped.

 include/linux/printk.h | 17 +++++++++++++++++
 lib/vsprintf.c         | 35 +++++++++++++++++++++++++++++++++--
 2 files changed, 50 insertions(+), 2 deletions(-)

Comments

Joe Perches Oct. 11, 2017, 4:06 a.m. UTC | #1
On Wed, 2017-10-11 at 14:48 +1100, Tobin C. Harding wrote:
> Currently there are many places in the kernel where addresses are being
> printed using an unadorned %p. Kernel pointers should be printed using
> %pK allowing some control via the kptr_restrict sysctl. Exposing addresses
> gives attackers sensitive information about the kernel layout in memory.
[]
> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
[]
> @@ -1591,6 +1591,35 @@ char *device_node_string(char *buf, char *end, struct device_node *dn,
>  	return widen_string(buf, buf - buf_start, end, spec);
>  }
>  
> +static long get_random_odd_long(void)
> +{
> +	long val = 0;
> +
> +	while((val & 1) == 0) {
> +		val = get_random_long();
> +	}
> +
> +	return val;
> +}

Perhaps

static long get_random_odd_long(void)
{
	return get_random_long() | 1L;
}
Linus Torvalds Oct. 11, 2017, 4:49 p.m. UTC | #2
On Tue, Oct 10, 2017 at 8:48 PM, Tobin C. Harding <me@tobin.cc> wrote:
>
> Add function ptr_to_id() to map an address to a unique identifier. This
> mapping is created by calling ptr_obfuscate() to hash the address. The
> hashing algorithm is carried out in two stages. First the address is
> xor'd by a random value then we multiply the xor production by a second
> random value.

So that's a fine obfuscation for normal kernel addresses.

It's also fine for testing this feature, and forcing people who
actually care about the real pointer value to look at thei rcode.

However, it's almost certainly no good for hardening.

The reasons is that it's going to be fairly trivial to just reverse
the obfuscation if you have any kind of pattern to the pointers
printed out.

And there's tons of information like that. Some code (think KASAN
reports etc) might print out addresses that increment with a fixed
offset. In other situations, NULL is going to be fairly common. In yet
other cases, you'll know specific bit patterns of the pointers (like
"I know this reports a page address, so the upper 14 bits are all set,
and the lower 12 bits are all zero").

You need just a few of those kinds of things, and you're going to
easily break the obfuscation.

A *normal* user won't bother. A kernel developer won't bother. But
somebody writing an attack would. It's an inconvenience, and maybe it
would push somebody towards an easier attack if one can be found, but
it's not really a particularly _big_ inconvenience.

So I'm absolutely ok with this patch for testing, and for finding the
places that need fixing up, but it should be clear that for real
hardening we'd need something much smarter.

                   Linus
Theodore Ts'o Oct. 11, 2017, 5:48 p.m. UTC | #3
On Wed, Oct 11, 2017 at 02:48:16PM +1100, Tobin C. Harding wrote:
> +/*
> + * Obfuscates pointer (algorithm taken from kptr_obfuscate(). See kernel/kcmp.c)
> + * v is the pointer value, randval is some random value, oddval is some random
> + * odd value.
> + *
> + * The obfuscation is done in two steps. First we xor the kernel pointer with
> + * a random value, which puts pointer into a new position in a reordered space.
> + * Secondly we multiply the xor production with a large odd random number to
> + * permute its bits even more (the odd multiplier guarantees that the product
> + * is unique ever after the high bits are truncated, since any odd number is
> + * relative prime to 2^n).
> + */

Why not just expose kptr_obfusecate() and use it, instead of copying
code?

Also, I'm nervous about the obfuscation.  If the attacker can get a
handful of known "real kernel pointer" and "obfuscated kernel pointer"
values, it wouldn't be that hard for them to be able to reverse
engineer the two secret values.

Perhaps the argument is "if the attacker can get a _single_ real
kernel address, it's all over anyway", which is probably true for
KASLR, but which might not be true for all attacks.

Anyway, if you use kptr_obfuscate in kernel/kcmp.c, then if we later
decide that we should change the obfuscation algorithm to something
stronger, we only need to do it in one place.

					- Ted
Tobin Harding Oct. 12, 2017, 2:24 a.m. UTC | #4
On Tue, Oct 10, 2017 at 09:06:50PM -0700, Joe Perches wrote:
> On Wed, 2017-10-11 at 14:48 +1100, Tobin C. Harding wrote:
> > Currently there are many places in the kernel where addresses are being
> > printed using an unadorned %p. Kernel pointers should be printed using
> > %pK allowing some control via the kptr_restrict sysctl. Exposing addresses
> > gives attackers sensitive information about the kernel layout in memory.
> []
> > diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> []
> > @@ -1591,6 +1591,35 @@ char *device_node_string(char *buf, char *end, struct device_node *dn,
> >  	return widen_string(buf, buf - buf_start, end, spec);
> >  }
> >  
> > +static long get_random_odd_long(void)
> > +{
> > +	long val = 0;
> > +
> > +	while((val & 1) == 0) {
> > +		val = get_random_long();
> > +	}
> > +
> > +	return val;
> > +}
> 
> Perhaps
> 
> static long get_random_odd_long(void)
> {
> 	return get_random_long() | 1L;
> }
> 

Nice.

thanks,
Tobin.
Tobin Harding Oct. 12, 2017, 2:55 a.m. UTC | #5
Removing kvm@vger.kernel.org from the CC list.

On Wed, Oct 11, 2017 at 01:48:58PM -0400, Theodore Ts'o wrote:
> On Wed, Oct 11, 2017 at 02:48:16PM +1100, Tobin C. Harding wrote:
> > +/*
> > + * Obfuscates pointer (algorithm taken from kptr_obfuscate(). See kernel/kcmp.c)
> > + * v is the pointer value, randval is some random value, oddval is some random
> > + * odd value.
> > + *
> > + * The obfuscation is done in two steps. First we xor the kernel pointer with
> > + * a random value, which puts pointer into a new position in a reordered space.
> > + * Secondly we multiply the xor production with a large odd random number to
> > + * permute its bits even more (the odd multiplier guarantees that the product
> > + * is unique ever after the high bits are truncated, since any odd number is
> > + * relative prime to 2^n).
> > + */
> 
> Why not just expose kptr_obfusecate() and use it, instead of copying
> code?
> 
> Also, I'm nervous about the obfuscation.  If the attacker can get a
> handful of known "real kernel pointer" and "obfuscated kernel pointer"
> values, it wouldn't be that hard for them to be able to reverse
> engineer the two secret values.
> 
> Perhaps the argument is "if the attacker can get a _single_ real
> kernel address, it's all over anyway", which is probably true for
> KASLR, but which might not be true for all attacks.
> 
> Anyway, if you use kptr_obfuscate in kernel/kcmp.c, then if we later
> decide that we should change the obfuscation algorithm to something
> stronger, we only need to do it in one place.
> 
> 					- Ted

Thanks Ted, others have misgivings about this method also. The email threads are all a bit mixed up
(thansk to my ineptness at posting patches :).

FYI, in the other threads Jason A. Donenfel and Linus Torvalds have discussed SipHash as a solution.

thanks,
Tobin.
diff mbox

Patch

diff --git a/include/linux/printk.h b/include/linux/printk.h
index e10f27468322..60c3d018efcf 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -41,6 +41,23 @@  static inline const char *printk_skip_headers(const char *buffer)
 	return buffer;
 }
 
+/*
+ * Obfuscates pointer (algorithm taken from kptr_obfuscate(). See kernel/kcmp.c)
+ * v is the pointer value, randval is some random value, oddval is some random
+ * odd value.
+ *
+ * The obfuscation is done in two steps. First we xor the kernel pointer with
+ * a random value, which puts pointer into a new position in a reordered space.
+ * Secondly we multiply the xor production with a large odd random number to
+ * permute its bits even more (the odd multiplier guarantees that the product
+ * is unique ever after the high bits are truncated, since any odd number is
+ * relative prime to 2^n).
+ */
+static inline long ptr_obfuscate(long v, long randval, long oddval)
+{
+	return (v ^ randval) * oddval;
+}
+
 #define CONSOLE_EXT_LOG_MAX	8192
 
 /* printk's without a loglevel use this.. */
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 86c3385b9eb3..399cc090be75 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -1591,6 +1591,35 @@  char *device_node_string(char *buf, char *end, struct device_node *dn,
 	return widen_string(buf, buf - buf_start, end, spec);
 }
 
+static long get_random_odd_long(void)
+{
+	long val = 0;
+
+	while((val & 1) == 0) {
+		val = get_random_long();
+	}
+
+	return val;
+}
+
+/* Maps a pointer to a unique identifier. */
+static char *ptr_to_id(char *buf, char *end, void *ptr, struct printf_spec spec)
+{
+	long hashval;
+	static long randval = 0;
+	static long oddval = 0;
+
+	if (oddval == 0 && randval == 0) {
+		randval = get_random_long();
+		oddval = get_random_odd_long();
+	}
+
+	hashval = ptr_obfuscate((unsigned long)ptr, randval, oddval);
+	spec.base = 16;
+
+	return number(buf, end, hashval, spec);
+}
+
 int kptr_restrict __read_mostly;
 
 /*
@@ -1703,6 +1732,9 @@  int kptr_restrict __read_mostly;
  * Note: The difference between 'S' and 'F' is that on ia64 and ppc64
  * function pointers are really function descriptors, which contain a
  * pointer to the real address.
+ *
+ * Default behaviour (unadorned %p) is to hash the address, rendering it useful
+ * as a unique identifier.
  */
 static noinline_for_stack
 char *pointer(const char *fmt, char *buf, char *end, void *ptr,
@@ -1858,14 +1890,13 @@  char *pointer(const char *fmt, char *buf, char *end, void *ptr,
 			return device_node_string(buf, end, ptr, spec, fmt + 1);
 		}
 	}
-	spec.flags |= SMALL;
 	if (spec.field_width == -1) {
 		spec.field_width = default_width;
 		spec.flags |= ZEROPAD;
 	}
 	spec.base = 16;
 
-	return number(buf, end, (unsigned long) ptr, spec);
+	return ptr_to_id(buf, end, ptr, spec);
 }
 
 /*