
[v4,00/26] arm64: refactor boot flow and add support for WXN

Message ID: 20220613144550.3760857-1-ardb@kernel.org

Message

Ard Biesheuvel June 13, 2022, 2:45 p.m. UTC
[ TL;DR this series does the following:
  - move variable definitions and assignments out of early asm code
    where possible, and get rid of explicit cache maintenance;
  - convert initial ID map so it covers the entire loaded image as well
    as the DT blob;
  - create the kernel mapping only once instead of twice (for KASLR),
    and do it with the MMU and caches on;
  - avoid mappings that are both writable and executable entirely;
  - avoid parsing the DT while the kernel text and rodata are still
    mapped writable;
  - allow WXN to be enabled (with an opt-out) so writable mappings are
    never executable. ]

This is a follow-up to a previous series of mine [0][1], and it aims to
streamline the boot flow with respect to cache maintenance and redundant
copying of data in memory, as well as to eliminate writable, executable
mappings at any point during boot or afterwards.

Additionally, this series removes the little dance we do to create a
kernel mapping, relocate the kernel, run the KASLR init code, tear down
the old mapping and create a new one, relocate the kernel again, and
finally enter the kernel proper. Instead, it invokes a minimal C
function 'kaslr_early_init()' while running from the ID map, which
includes a temporary mapping of the FDT. This change represents a
substantial chunk of the diffstat, as it requires some work to
instantiate code that can run safely from an arbitrary load address.
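
For illustration, here is a minimal sketch of what such a
position-independent helper could look like, assuming the seed is read
from the conventional /chosen/kaslr-seed DT property via libfdt; the
actual code in arch/arm64/kernel/pi/kaslr_early.c differs in detail,
and the signature shown here is an assumption:

#include <linux/init.h>
#include <linux/libfdt.h>
#include <linux/types.h>

/* Illustrative sketch only, not the code added by this series */
u64 __init kaslr_early_init(void *fdt)
{
        const fdt64_t *prop;
        int node, len;

        /* /chosen/kaslr-seed is the usual source of randomness */
        node = fdt_path_offset(fdt, "/chosen");
        if (node < 0)
                return 0;

        prop = fdt_getprop(fdt, node, "kaslr-seed", &len);
        if (!prop || len != sizeof(*prop))
                return 0;

        /*
         * No absolute addresses are referenced, so this can run from
         * the ID map regardless of where the image was loaded.
         */
        return fdt64_to_cpu(*prop);
}

The point of the separate pi/ directory is that its objects are built
so that they contain no absolute references, which is what allows them
to run safely from the ID map at an arbitrary load address.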

The WXN support was tested using a Debian bullseye mixed AArch64/armhf
user space, running GNOME Shell, Chromium, Firefox, LibreOffice, etc.
Some minimal tweaks are needed to avoid kernel log entries about user
space attempting to create PROT_EXEC+PROT_WRITE mappings. In most cases
(libffi, for instance), the library in question already carries a
workaround, but does not enable it by default because it only does so
when it detects SELinux as being active (which it was not, in this
case).
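
The workaround in question typically boils down to never needing a
mapping that is writable and executable at the same time: the code is
emitted into a read-write mapping which is then flipped to read-execute.
A generic POSIX sketch of that pattern (not libffi's actual code) might
look like this:

#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Generic W^X-friendly pattern, for illustration only */
void *emit_code(const void *code, size_t len)
{
        size_t sz = (len + 4095) & ~4095UL;     /* assumes 4k pages */
        void *buf;

        /* 1. map the region writable but not executable */
        buf = mmap(NULL, sz, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED)
                return NULL;

        memcpy(buf, code, len);

        /* 2. drop write permission before making it executable */
        if (mprotect(buf, sz, PROT_READ | PROT_EXEC)) {
                munmap(buf, sz);
                return NULL;
        }

        /*
         * On arm64 the caller must also synchronize the I-cache,
         * e.g. via __builtin___clear_cache().
         */
        return buf;
}

Since no single mapping is ever PROT_WRITE and PROT_EXEC at the same
time, this sequence keeps working with WXN in effect.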

Changes since v3:
- drop changes for entering with the MMU enabled for now;
- reject mmap() and mprotect() calls with PROT_WRITE and PROT_EXEC flags
  passed when WXN is in effect; this essentially matches the behavior of
  both SELinux and PaX, and most distros (including Android) can already
  deal with this just fine (a sketch of such a check follows after this
  list);
- defer KASLR initialization to an initcall() to the extent possible.
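
The mmap()/mprotect() rejection mentioned above is conceptually just a
prot-flags check. A hedged sketch follows; the function and variable
names are illustrative rather than the actual hook interface added by
the "mm: add arch hook to validate mmap() prot flags" patch:

#include <linux/mman.h>
#include <linux/types.h>

/* Illustrative only; the real hook name and wiring differ */
static bool wxn_enabled;        /* assumed to reflect the WXN opt-in */

static bool prot_allowed_under_wxn(unsigned long prot)
{
        if (!wxn_enabled)
                return true;

        /* a mapping may never be both writable and executable */
        return (prot & (PROT_WRITE | PROT_EXEC)) !=
               (PROT_WRITE | PROT_EXEC);
}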

Changes since v2:
- create a separate, initial ID map that is discarded after boot, and
  create the permanent ID map from C code using the ordinary memory
  mapping code;
- refactor the extended ID map handling, and along with it, simplify the
  early memory mapping macros, so that we can deal with an extended ID
  map that requires multiple table entries at intermediate levels;
- eliminate all variable assignments with the MMU off from the happy
  flow;
- replace temporary FDT mapping in TTBR1 with a FDT mapping in the
  initial ID map;
- use read-only attributes for all code mappings, so we can boot with
  WXN enabled if we elect to do so.
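
For reference, the architectural control behind booting with WXN is the
SCTLR_EL1.WXN bit, which causes writable mappings to be treated as
execute-never. A hedged sketch of what enabling it amounts to, assuming
the SCTLR_ELx_WXN definition and sysreg helpers already present in the
arm64 tree; this is not the code added by the series, which wires the
bit up with an opt-out as noted in the TL;DR above:

#include <asm/barrier.h>
#include <asm/sysreg.h>

/* Illustrative only; not the code added by this series */
static void enable_wxn(void)
{
        /* writable mappings become execute-never once this is set */
        sysreg_clear_set(sctlr_el1, 0, SCTLR_ELx_WXN);
        isb();  /* make sure the new control is in effect */
}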

Changes since v1:
- Remove the dodgy handling of the KASLR seed, which was necessary to
  avoid doing two iterations of the setup/teardown of the page tables.
  This is now dealt with by creating the TTBR1 page tables while
  executing from TTBR0, and so all memory manipulations are still done
  with the MMU and caches on.
- Only boot from EFI with the MMU and caches on if the image was not
  moved around in memory. Otherwise, we cannot rely on the firmware's ID
  map to have created an executable mapping for the copied code.

[0] https://lore.kernel.org/all/20220304175657.2744400-1-ardb@kernel.org/
[1] https://lore.kernel.org/all/20220330154205.2483167-1-ardb@kernel.org/
[2] https://lore.kernel.org/all/20220411094824.4176877-1-ardb@kernel.org/

Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>

Ard Biesheuvel (26):
  arm64: head: move kimage_vaddr variable into C file
  arm64: mm: make vabits_actual a build time constant if possible
  arm64: head: move assignment of idmap_t0sz to C code
  arm64: head: drop idmap_ptrs_per_pgd
  arm64: head: simplify page table mapping macros (slightly)
  arm64: head: switch to map_memory macro for the extended ID map
  arm64: head: split off idmap creation code
  arm64: kernel: drop unnecessary PoC cache clean+invalidate
  arm64: head: pass ID map root table address to __enable_mmu()
  arm64: mm: provide idmap pointer to cpu_replace_ttbr1()
  arm64: head: add helper function to remap regions in early page tables
  arm64: head: cover entire kernel image in initial ID map
  arm64: head: use relative references to the RELA and RELR tables
  arm64: head: create a temporary FDT mapping in the initial ID map
  arm64: idreg-override: use early FDT mapping in ID map
  arm64: head: factor out TTBR1 assignment into a macro
  arm64: head: populate kernel page tables with MMU and caches on
  arm64: head: record CPU boot mode after enabling the MMU
  arm64: kaslr: defer initialization to late initcall where permitted
  arm64: head: avoid relocating the kernel twice for KASLR
  arm64: setup: drop early FDT pointer helpers
  arm64: mm: move ro_after_init section into the data segment
  arm64: head: remap the kernel text/inittext region read-only
  mm: add arch hook to validate mmap() prot flags
  arm64: mm: add support for WXN memory translation attribute
  arm64: kernel: move ID map out of .text mapping

 arch/arm64/Kconfig                      |  11 +
 arch/arm64/include/asm/assembler.h      |  37 +-
 arch/arm64/include/asm/cpufeature.h     |   8 +
 arch/arm64/include/asm/kernel-pgtable.h |  18 +-
 arch/arm64/include/asm/memory.h         |   4 +
 arch/arm64/include/asm/mman.h           |  36 ++
 arch/arm64/include/asm/mmu_context.h    |  46 +-
 arch/arm64/include/asm/setup.h          |   3 -
 arch/arm64/kernel/Makefile              |   2 +-
 arch/arm64/kernel/cpufeature.c          |   2 +-
 arch/arm64/kernel/head.S                | 536 ++++++++++----------
 arch/arm64/kernel/hyp-stub.S            |   4 +-
 arch/arm64/kernel/idreg-override.c      |  33 +-
 arch/arm64/kernel/image-vars.h          |   4 +
 arch/arm64/kernel/kaslr.c               | 149 +-----
 arch/arm64/kernel/pi/Makefile           |  33 ++
 arch/arm64/kernel/pi/kaslr_early.c      | 112 ++++
 arch/arm64/kernel/setup.c               |  15 -
 arch/arm64/kernel/sleep.S               |   1 +
 arch/arm64/kernel/suspend.c             |   2 +-
 arch/arm64/kernel/vmlinux.lds.S         |  63 ++-
 arch/arm64/mm/kasan_init.c              |   4 +-
 arch/arm64/mm/mmu.c                     |  96 +++-
 arch/arm64/mm/proc.S                    |  29 +-
 include/linux/mman.h                    |  15 +
 mm/mmap.c                               |   3 +
 26 files changed, 763 insertions(+), 503 deletions(-)
 create mode 100644 arch/arm64/kernel/pi/Makefile
 create mode 100644 arch/arm64/kernel/pi/kaslr_early.c

Comments

Will Deacon June 24, 2022, 1:19 p.m. UTC | #1
Hi Ard,

On Mon, Jun 13, 2022 at 04:45:24PM +0200, Ard Biesheuvel wrote:
> [ TL;DR this series does the following:
>   - move variable definitions and assignments out of early asm code
>     where possible, and get rid of explicit cache maintenance;
>   - convert initial ID map so it covers the entire loaded image as well
>     as the DT blob;
>   - create the kernel mapping only once instead of twice (for KASLR),
>     and do it with the MMU and caches on;
>   - avoid mappings that are both writable and executable entirely;
>   - avoid parsing the DT while the kernel text and rodata are still
>     mapped writable;
>   - allow WXN to be enabled (with an opt-out) so writable mappings are
>     never executable. ]

I really like this series -- it removes quite a few ugly warts from our
boot assembly that we've collected over the years and, while functional,
they have never been particularly satisfactory. Thank you for putting it
together.

I've left a handful of minor comments on some of the patches and if you
can address those then I'd like to queue the first 21 patches ASAP to
give them some more exposure before the next merge window.

The remaining patches are the WXN pieces, which I'd like to give others
a chance to chime in on first.

Cheers,

Will
Ard Biesheuvel June 24, 2022, 2:40 p.m. UTC | #2
On Fri, 24 Jun 2022 at 15:20, Will Deacon <will@kernel.org> wrote:
>
> Hi Ard,
>
> On Mon, Jun 13, 2022 at 04:45:24PM +0200, Ard Biesheuvel wrote:
> > [ TL;DR this series does the following:
> >   - move variable definitions and assignments out of early asm code
> >     where possible, and get rid of explicit cache maintenance;
> >   - convert initial ID map so it covers the entire loaded image as well
> >     as the DT blob;
> >   - create the kernel mapping only once instead of twice (for KASLR),
> >     and do it with the MMU and caches on;
> >   - avoid mappings that are both writable and executable entirely;
> >   - avoid parsing the DT while the kernel text and rodata are still
> >     mapped writable;
> >   - allow WXN to be enabled (with an opt-out) so writable mappings are
> >     never executable. ]
>
> I really like this series -- it removes quite a few ugly warts from our
> boot assembly that we've collected over the years and, while functional,
> they have never been particularly satisfactory. Thank you for putting it
> together.
>
> I've left a handful of minor comments on some of the patches and if you
> can address those then I'd like to queue the first 21 patches ASAP to
> give them some more exposure before the next merge window.
>

I'll spin a v5 with just those patches, and we can revisit the
remaining work at a later time.

> The remaining patches are the WXN pieces, which I'd like to give others
> a chance to chime in on first.
>
> Cheers,
>
> Will