
[v8,0/4] KASLR feature to randomize each loadable module

Message ID 20181102192520.4522-1-rick.p.edgecombe@intel.com (mailing list archive)

Message

Rick Edgecombe Nov. 2, 2018, 7:25 p.m. UTC
Hi,

This is V8 of the "KASLR feature to randomize each loadable module" patchset.
The purpose is to increase the randomization and also to randomize the modules
in relation to each other instead of just the base, so that if one module's
location leaks, the locations of the others can't be inferred.

This version gets rid of the more complex (and larger line count) new vmalloc
logic that optimized handling of the lazy free area case, which hopefully makes
this patchset more straightforward. Earlier versions were concerned with
handling that case efficiently, but I have since learned it is actually not
common in real world module loader usage. So instead there are some smaller
tweaks to the existing vmalloc logic that allow an allocation to be tried at an
address without triggering a purge_vmap_area_lazy() and retry when it
encounters a real (non lazy free) area. The kselftest simulations have been
updated to model init sections getting cleaned up as well.
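
As a caller-side sketch of the try-at-an-address tweak mentioned above (the
__vmalloc_node_try_addr() signature below is an assumption made for the
example, not copied from the patch):

/*
 * Illustrative sketch only: try a single candidate address. With the tweaked
 * vmalloc logic, hitting a real (non lazy free) area simply fails this
 * attempt instead of purging the lazy free areas and retrying internally, so
 * the caller can cheaply move on to another candidate address.
 */
static void *try_alloc_at(unsigned long addr, unsigned long size)
{
	return __vmalloc_node_try_addr(addr, size, GFP_KERNEL, PAGE_KERNEL_EXEC,
				       0, NUMA_NO_NODE,
				       __builtin_return_address(0));
}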

There is a small allocation performance degradation versus v7 as a trade-off,
but it is still faster on average than the existing algorithm until >7000 modules.

Changes for V8:
 - Simplify code by removing logic for optimum handling of lazy free areas

Changes for V7:
 - More 0-day build fixes, readability improvements (Kees Cook)

Changes for V6:
 - 0-day build fixes by removing unneeded functional testing, plus more error
   handling

Changes for V5:
 - Add module_alloc test module

Changes for V4:
 - Fix an issue caused by KASAN and kmemleak being provided different
   allocation lengths (padding).
 - Avoid kmalloc in __vmalloc_node_try_addr until it is known to be needed.
 - Fixed issues reported by 0-day.

Changes for V3:
 - Code cleanup based on internal feedback. (thanks to Dave Hansen and Andriy
   Shevchenko)
 - Slight refactor of existing algorithm to more cleanly live along side new
   one.
 - BPF synthetic benchmark

Changes for V2:
 - New implementation of __vmalloc_node_try_addr based on the
   __vmalloc_node_range implementation, that only flushes TLB when needed.
 - Modified module loading algorithm to try to reduce the TLB flushes further.
 - Increase "random area" tries in order to increase the number of modules that
   can get high randomness.
 - Increase "random area" size to 2/3 of module area in order to increase the
   number of modules that can get high randomness.
 - Fix for 0day failures on other architectures.
 - Fix for wrong debugfs permissions. (thanks to Jann Horn)
 - Spelling fix. (thanks to Jann Horn)
 - Data on module_alloc performance and TLB flushes. (brought up by Kees Cook
   and Jann Horn)
 - Data on memory usage. (suggested by Jann)


Rick Edgecombe (4):
  vmalloc: Add __vmalloc_node_try_addr function
  x86/modules: Increase randomization for modules
  vmalloc: Add debugfs modfraginfo
  Kselftest for module text allocation benchmarking

 arch/x86/Kconfig                              |   3 +
 arch/x86/include/asm/kaslr_modules.h          |  38 ++
 arch/x86/include/asm/pgtable_64_types.h       |   7 +
 arch/x86/kernel/module.c                      | 111 ++++--
 include/linux/vmalloc.h                       |   3 +
 lib/Kconfig.debug                             |   9 +
 lib/Makefile                                  |   1 +
 lib/test_mod_alloc.c                          | 343 ++++++++++++++++++
 mm/vmalloc.c                                  | 228 ++++++++++--
 tools/testing/selftests/bpf/test_mod_alloc.sh |  29 ++
 10 files changed, 711 insertions(+), 61 deletions(-)
 create mode 100644 arch/x86/include/asm/kaslr_modules.h
 create mode 100644 lib/test_mod_alloc.c
 create mode 100755 tools/testing/selftests/bpf/test_mod_alloc.sh

Comments

Andrew Morton Nov. 6, 2018, 9:04 p.m. UTC | #1
On Fri,  2 Nov 2018 12:25:16 -0700 Rick Edgecombe <rick.p.edgecombe@intel.com> wrote:

> This is V8 of the "KASLR feature to randomize each loadable module" patchset.
> The purpose is to increase the randomization and also to randomize the modules
> in relation to each other instead of just the base, so that if one module's
> location leaks, the locations of the others can't be inferred.

I'm not seeing any info here which explains why we should add this to
Linux.

What is the end-user value?  What problems does it solve?  Are those
problems real or theoretical?  What are the exploit scenarios and how
realistic are they?  etcetera, etcetera.  How are we to decide to buy
this thing if we aren't given a glossy brochure?

> There is a small allocation performance degradation versus v7 as a
> trade-off, but it is still faster on average than the existing
> algorithm until >7000 modules.

lol.  How did you test 7000 modules?  Using the selftest code?
Rick Edgecombe Nov. 7, 2018, 8:03 p.m. UTC | #2
On Tue, 2018-11-06 at 13:04 -0800, Andrew Morton wrote:
> On Fri,  2 Nov 2018 12:25:16 -0700 Rick Edgecombe <rick.p.edgecombe@intel.com>
> wrote:
> 
> > This is V8 of the "KASLR feature to randomize each loadable module"
> > patchset.
> > The purpose is to increase the randomization and also to randomize the
> > modules in relation to each other instead of just the base, so that if one
> > module's location leaks, the locations of the others can't be inferred.
> 
> I'm not seeing any info here which explains why we should add this to
> Linux.
> 
> What is the end-user value?  What problems does it solve?  Are those
> problems real or theoretical?  What are the exploit scenarios and how
> realistic are they?  etcetera, etcetera.  How are we to decide to buy
> this thing if we aren't given a glossy brochure?
Hi Andrew,

Thanks for taking a look! The first version had a proper write up, but now the
details are spread out over 8 versions. I'll send out another version with it
all in one place.

The short version is that today the RANDOMIZE_BASE feature randomizes the base
address where the module allocations begin with 10 bits of entropy. From there,
a highly deterministic algorithm allocates space for the modules as they are
loaded and unloaded. If an attacker can predict the order and identities of the
modules that will be loaded, then a single text address leak can give the
attacker the locations of all of the modules.

So this is trying to prevent the same class of attacks as the existing KASLR,
like control flow manipulation, and now also making it harder and slower to
find speculative execution gadgets. It increases the number of possible
positions 128X, and provides that amount of randomness per module instead of
once for all modules.
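
As a rough sketch of how per-allocation randomization can use a
try-at-an-address primitive (the retry budget, helper name and the
__vmalloc_node_try_addr() signature below are assumptions for illustration,
not the exact code in the patch):

/*
 * Illustrative sketch only: pick random page-aligned candidate addresses in
 * the randomized part of the module area and give up after a fixed number of
 * tries, letting the caller fall back to the existing deterministic
 * allocator.
 */
static void *module_alloc_random(unsigned long size, unsigned long area_start,
				 unsigned long area_size)
{
	unsigned long nr_starts, addr;
	void *p;
	int i;

	if (size > area_size)
		return NULL;

	/* Page-aligned start addresses that keep the allocation inside the
	 * randomized area */
	nr_starts = ((area_size - size) >> PAGE_SHIFT) + 1;

	for (i = 0; i < 1000; i++) {	/* illustrative retry budget */
		addr = area_start +
		       ((get_random_long() % nr_starts) << PAGE_SHIFT);
		p = __vmalloc_node_try_addr(addr, size, GFP_KERNEL,
					    PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
					    __builtin_return_address(0));
		if (p)
			return p;
	}

	return NULL;	/* caller falls back to the existing algorithm */
}

For scale: compared with the existing scheme's 2^10 possible base offsets
shared by all modules, 128X more positions works out to roughly 2^17 possible
page-aligned start addresses, chosen independently for each allocation.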

> > There is a small allocation performance degradation versus v7 as a
> > trade-off, but it is still faster on average than the existing
> > algorithm until >7000 modules.
> 
> lol.  How did you test 7000 modules?  Using the selftest code?

Yes, this is from simulations using the included kselftest code, with sizes
extracted from the x86_64 in-tree modules. Supporting 7000 kernel modules is
not the intention though; rather it is about accommodating 7000 allocations in
the module space, which also has to hold eBPF JIT, classic BPF socket filter
JIT, and kprobes allocations, etc.
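
For context, the flavor of simulation meant here can be sketched as a toy
userspace model (the sizes and free pattern below are made up; the real
lib/test_mod_alloc.c test uses sizes extracted from the in-tree modules and
exercises the real in-kernel allocators):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SPACE_PAGES	(1UL << 18)	/* model a 1 GB module space in 4 KB pages */
#define NR_ALLOCS	7000		/* simulated module/BPF/kprobe allocations */
#define MAX_PAGES	16		/* made-up per-allocation size cap */

static unsigned char used[SPACE_PAGES];

/* Crude first-fit placement, standing in for the existing allocation path */
static long alloc_first_fit(unsigned long pages)
{
	unsigned long i, run = 0;

	for (i = 0; i < SPACE_PAGES; i++) {
		run = used[i] ? 0 : run + 1;
		if (run == pages) {
			memset(&used[i + 1 - pages], 1, pages);
			return i + 1 - pages;
		}
	}
	return -1;
}

int main(void)
{
	static long start[NR_ALLOCS];
	static unsigned long npages[NR_ALLOCS];
	int i, failed = 0;

	srand(1);
	for (i = 0; i < NR_ALLOCS; i++) {
		npages[i] = 1 + rand() % MAX_PAGES;
		start[i] = alloc_first_fit(npages[i]);
		if (start[i] < 0)
			failed++;

		/* Occasionally free an earlier allocation, standing in for
		 * module unload and init section cleanup */
		if (i && rand() % 4 == 0) {
			int victim = rand() % i;

			if (start[victim] >= 0) {
				memset(&used[start[victim]], 0, npages[victim]);
				start[victim] = -1;
			}
		}
	}

	printf("failed placements: %d of %d\n", failed, NR_ALLOCS);
	return 0;
}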