mbox series

[V2,00/10] PKS: Add Protection Keys Supervisor (PKS) support

Message ID 20201102205320.1458656-1-ira.weiny@intel.com (mailing list archive)
Headers show
Series PKS: Add Protection Keys Supervisor (PKS) support | expand

Message

Ira Weiny Nov. 2, 2020, 8:53 p.m. UTC
From: Ira Weiny <ira.weiny@intel.com>

Changes from V1
	Rebase to TIP master; resolve conflicts and test
	Clean up some kernel docs updates missed in V1
	Add irqentry_state_t kernel doc for PKRS field
	Removed redundant irq_state->pkrs
		This is only needed when we add the global state and somehow
		ended up in this patch series.  That will come back when we add
		the global functionality in.
	From Thomas Gleixner
		Update commit messages
		Add kernel doc for struct irqentry_state_t
	From Dave Hansen add flags to pks_key_alloc()

Changes from RFC V3[3]
	Rebase to TIP master
	Update test error output
	Standardize on 'irq_state' for state variables
	From Dave Hansen
		Update commit messages
		Add/clean up comments
		Add X86_FEATURE_PKS to disabled-features.h and remove some
			explicit CONFIG checks
		Move saved_pkrs member of thread_struct
		Remove superfluous preempt_disable()
		s/irq_save_pks/irq_save_set_pks/
		Ensure PKRS is not seen in faults if not configured or not
			supported
		s/pks_mknoaccess/pks_mk_noaccess/
		s/pks_mkread/pks_mk_readonly/
		s/pks_mkrdwr/pks_mk_readwrite/
		Change pks_key_alloc return to -EOPNOTSUPP when not supported
	From Peter Zijlstra
		Clean up Attribution
		Remove superfluous preempt_disable()
		Add union to differentiate exit_rcu/lockdep use in
			irqentry_state_t
	From Thomas Gleixner
		Add preliminary clean up patch and adjust series as needed


Introduce a new page protection mechanism for supervisor pages, Protection Key
Supervisor (PKS).

2 use cases for PKS are being developed, trusted keys and PMEM.  Trusted keys
is a newer use case which is still being explored.  PMEM was submitted as part
of the RFC (v2) series[1].  However, since then it was found that some callers
of kmap() require a global implementation of PKS.  Specifically some users of
kmap() expect mappings to be available to all kernel threads.  While global use
of PKS is rare it needs to be included for correctness.  Unfortunately the
kmap() updates required a large patch series to make the needed changes at the
various kmap() call sites so that patch set has been split out.  Because the
global PKS feature is only required for that use case it will be deferred to
that set as well.[2]  This patch set is being submitted as a precursor to both
of the use cases.

For an overview of the entire PKS ecosystem, a git tree including this series
and 2 proposed use cases can be found here:

	https://lore.kernel.org/lkml/20201009195033.3208459-1-ira.weiny@intel.com/
	https://lore.kernel.org/lkml/20201009201410.3209180-1-ira.weiny@intel.com/


PKS enables protections on 'domains' of supervisor pages to limit supervisor
mode access to those pages beyond the normal paging protections.  PKS works in
a similar fashion to user space pkeys, PKU.  As with PKU, supervisor pkeys are
checked in addition to normal paging protections and Access or Writes can be
disabled via a MSR update without TLB flushes when permissions change.  Also
like PKU, a page mapping is assigned to a domain by setting pkey bits in the
page table entry for that mapping.

Access is controlled through a PKRS register which is updated via WRMSR/RDMSR.

XSAVE is not supported for the PKRS MSR.  Therefore the implementation
saves/restores the MSR across context switches and during exceptions.  Nested
exceptions are supported by each exception getting a new PKS state.

For consistent behavior with current paging protections, pkey 0 is reserved and
configured to allow full access via the pkey mechanism, thus preserving the
default paging protections on mappings with the default pkey value of 0.

Other keys, (1-15) are allocated by an allocator which prepares us for key
contention from day one.  Kernel users should be prepared for the allocator to
fail either because of key exhaustion or due to PKS not being supported on the
arch and/or CPU instance.

The following are key attributes of PKS.

   1) Fast switching of permissions
	1a) Prevents access without page table manipulations
	1b) No TLB flushes required
   2) Works on a per thread basis

PKS is available with 4 and 5 level paging.  Like PKRU it consumes 4 bits from
the PTE to store the pkey within the entry.


[1] https://lore.kernel.org/lkml/20200717072056.73134-1-ira.weiny@intel.com/
[2] https://lore.kernel.org/lkml/20201009195033.3208459-2-ira.weiny@intel.com/
[3] https://lore.kernel.org/lkml/20201009194258.3207172-1-ira.weiny@intel.com/


Fenghua Yu (2):
  x86/pks: Enable Protection Keys Supervisor (PKS)
  x86/pks: Add PKS kernel API

Ira Weiny (7):
  x86/pkeys: Create pkeys_common.h
  x86/fpu: Refactor arch_set_user_pkey_access() for PKS support
  x86/pks: Preserve the PKRS MSR on context switch
  x86/entry: Pass irqentry_state_t by reference
  x86/entry: Preserve PKRS MSR across exceptions
  x86/fault: Report the PKRS state on fault
  x86/pks: Add PKS test code

Thomas Gleixner (1):
  x86/entry: Move nmi entry/exit into common code

 Documentation/core-api/protection-keys.rst  | 103 ++-
 arch/x86/Kconfig                            |   1 +
 arch/x86/entry/common.c                     |  64 +-
 arch/x86/include/asm/cpufeatures.h          |   1 +
 arch/x86/include/asm/disabled-features.h    |   8 +-
 arch/x86/include/asm/idtentry.h             |  28 +-
 arch/x86/include/asm/msr-index.h            |   1 +
 arch/x86/include/asm/pgtable.h              |  13 +-
 arch/x86/include/asm/pgtable_types.h        |  12 +
 arch/x86/include/asm/pkeys.h                |  15 +
 arch/x86/include/asm/pkeys_common.h         |  40 ++
 arch/x86/include/asm/processor.h            |  18 +-
 arch/x86/include/uapi/asm/processor-flags.h |   2 +
 arch/x86/kernel/cpu/common.c                |  15 +
 arch/x86/kernel/cpu/mce/core.c              |   6 +-
 arch/x86/kernel/fpu/xstate.c                |  22 +-
 arch/x86/kernel/kvm.c                       |   6 +-
 arch/x86/kernel/nmi.c                       |   6 +-
 arch/x86/kernel/process.c                   |  26 +
 arch/x86/kernel/traps.c                     |  24 +-
 arch/x86/mm/fault.c                         |  87 ++-
 arch/x86/mm/pkeys.c                         | 194 +++++-
 include/linux/entry-common.h                |  64 +-
 include/linux/pgtable.h                     |   4 +
 include/linux/pkeys.h                       |  24 +
 kernel/entry/common.c                       |  62 +-
 lib/Kconfig.debug                           |  12 +
 lib/Makefile                                |   3 +
 lib/pks/Makefile                            |   3 +
 lib/pks/pks_test.c                          | 691 ++++++++++++++++++++
 mm/Kconfig                                  |   2 +
 tools/testing/selftests/x86/Makefile        |   3 +-
 tools/testing/selftests/x86/test_pks.c      |  66 ++
 33 files changed, 1465 insertions(+), 161 deletions(-)
 create mode 100644 arch/x86/include/asm/pkeys_common.h
 create mode 100644 lib/pks/Makefile
 create mode 100644 lib/pks/pks_test.c
 create mode 100644 tools/testing/selftests/x86/test_pks.c

Comments

Thomas Gleixner Nov. 2, 2020, 11:36 p.m. UTC | #1
On Mon, Nov 02 2020 at 12:53, ira weiny wrote:
> Fenghua Yu (2):
>   x86/pks: Enable Protection Keys Supervisor (PKS)
>   x86/pks: Add PKS kernel API
>
> Ira Weiny (7):
>   x86/pkeys: Create pkeys_common.h
>   x86/fpu: Refactor arch_set_user_pkey_access() for PKS support
>   x86/pks: Preserve the PKRS MSR on context switch
>   x86/entry: Pass irqentry_state_t by reference
>   x86/entry: Preserve PKRS MSR across exceptions
>   x86/fault: Report the PKRS state on fault
>   x86/pks: Add PKS test code
>
> Thomas Gleixner (1):
>   x86/entry: Move nmi entry/exit into common code

So the actual patch ordering is:

   x86/pkeys: Create pkeys_common.h
   x86/fpu: Refactor arch_set_user_pkey_access() for PKS support
   x86/pks: Enable Protection Keys Supervisor (PKS)
   x86/pks: Preserve the PKRS MSR on context switch
   x86/pks: Add PKS kernel API

   x86/entry: Move nmi entry/exit into common code
   x86/entry: Pass irqentry_state_t by reference

   x86/entry: Preserve PKRS MSR across exceptions
   x86/fault: Report the PKRS state on fault
   x86/pks: Add PKS test code

This is the wrong ordering, really.

     x86/entry: Move nmi entry/exit into common code

is a general cleanup and has absolutely nothing to do with PKRS.So this
wants to go first.

Also:

    x86/entry: Move nmi entry/exit into common code

is a prerequisite for the rest. So why is it in the middle of the
series?

And then you enable all that muck _before_ it is usable:

   Patch 3/N: x86/pks: Enable Protection Keys Supervisor (PKS)

Bisectability is overrrated, right?

Once again: Read an understand Documentation/process/*

Aside of that using a spell checker is not optional.

Thanks,

        tglx
Thomas Gleixner Nov. 2, 2020, 11:41 p.m. UTC | #2
On Tue, Nov 03 2020 at 00:36, Thomas Gleixner wrote:
> On Mon, Nov 02 2020 at 12:53, ira weiny wrote:
>
> This is the wrong ordering, really.
>
>      x86/entry: Move nmi entry/exit into common code
>
> is a general cleanup and has absolutely nothing to do with PKRS.So this
> wants to go first.
>
> Also:
>
>     x86/entry: Move nmi entry/exit into common code

this should be

     x86/entry: Pass irqentry_state_t by reference

of course. Copy&pasta fail...
Ira Weiny Nov. 4, 2020, 5:46 p.m. UTC | #3
On Tue, Nov 03, 2020 at 12:36:16AM +0100, Thomas Gleixner wrote:
> On Mon, Nov 02 2020 at 12:53, ira weiny wrote:
> > Fenghua Yu (2):
> >   x86/pks: Enable Protection Keys Supervisor (PKS)
> >   x86/pks: Add PKS kernel API
> >
> > Ira Weiny (7):
> >   x86/pkeys: Create pkeys_common.h
> >   x86/fpu: Refactor arch_set_user_pkey_access() for PKS support
> >   x86/pks: Preserve the PKRS MSR on context switch
> >   x86/entry: Pass irqentry_state_t by reference
> >   x86/entry: Preserve PKRS MSR across exceptions
> >   x86/fault: Report the PKRS state on fault
> >   x86/pks: Add PKS test code
> >
> > Thomas Gleixner (1):
> >   x86/entry: Move nmi entry/exit into common code
> 
> So the actual patch ordering is:
> 
>    x86/pkeys: Create pkeys_common.h
>    x86/fpu: Refactor arch_set_user_pkey_access() for PKS support
>    x86/pks: Enable Protection Keys Supervisor (PKS)
>    x86/pks: Preserve the PKRS MSR on context switch
>    x86/pks: Add PKS kernel API
> 
>    x86/entry: Move nmi entry/exit into common code
>    x86/entry: Pass irqentry_state_t by reference
> 
>    x86/entry: Preserve PKRS MSR across exceptions
>    x86/fault: Report the PKRS state on fault
>    x86/pks: Add PKS test code
> 
> This is the wrong ordering, really.
> 
>      x86/entry: Move nmi entry/exit into common code
> 
> is a general cleanup and has absolutely nothing to do with PKRS.So this
> wants to go first.
> 

Sorry, yes this should be a pre-patch.

> Also:
> 
>     x86/entry: Move nmi entry/exit into common code
> [from other email]
>    >      x86/entry: Pass irqentry_state_t by reference
>    >      > 
>    >      >
> 
> is a prerequisite for the rest. So why is it in the middle of the
> series?

It is in the middle because passing by reference is not needed until additional
information is added to irqentry_state_t which is done immediately after this
patch by:

    x86/entry: Preserve PKRS MSR across exceptions

I debated squashing the 2 but it made review harder IMO.  But I thought keeping
them in order together made a lot of sense.

> 
> And then you enable all that muck _before_ it is usable:
> 

Strictly speaking you are correct, sorry.  I will reorder the series.

> 
> Bisectability is overrrated, right?

Agreed, bisectability is important.  I thought I had it covered but I was
wrong.

> 
> Once again: Read an understand Documentation/process/*
> 
> Aside of that using a spell checker is not optional.

Agreed.

In looking closer at the entry code I've found a couple of other instances I'll
add another precursor patch.

I've also found other errors with the series which I should have caught.  My
apologies I made some last minute changes which I should have checked more
thoroughly.

Thanks,
Ira