[RFC,v2,00/19] PKS write protected page tables

Message ID 20210830235927.6443-1-rick.p.edgecombe@intel.com (mailing list archive)

Message

Edgecombe, Rick P Aug. 30, 2021, 11:59 p.m. UTC
Hi,

This is a second RFC for the PKS write protected tables concept. I'm sharing to
show the progress to interested people. I'd also appreciate any comments,
especially on the direct map page table protection solution (patch 17).

Since v1[1], the improvements are:
 - Fully handle direct map page tables, and handle hotplug/unplug path.
 - Create a debug-time checker that scans page tables and verifies
   their protection.
 - Fix odds-and-ends kernel page tables that showed up with the debug
   checker. At this point all of the typical normal page tables should be
   protected.
 - Fix toggling of writability for odds-and-ends page table modifications found
   that don't use the normal helpers.
 - Create atomic context grouped page allocator, after finding some page table
   allocations that are passing GFP_ATOMIC.
 - Create "soft" mode that warns and disables protection on violation instead
   of oopsing.
 - Add boot parameters for disabling PKS tables.
 - Change PageTable set/clear to ctor/dtor (peterz)
 - Remove VM_BUG_ON_PAGE in alloc_table() (Shakeel Butt)
 - PeterZ/Vlastimil had suggested also building a non-PKS mode for use in
   debugging. I skipped it for now because the series was too big.
 - Rebased to latest PKS core v7 [2]
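
For illustration only, the "soft" mode item in the list above could be
sketched in userspace C roughly as follows. This is not code from the series:
the enum, struct and function names are hypothetical, and the real decision
would be made in the kernel fault path rather than in a helper like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical policy modes; the names are illustrative only. */
enum pks_tables_mode {
	PKS_TABLES_HARD,	/* a violation is treated as fatal */
	PKS_TABLES_SOFT,	/* warn, then stop enforcing */
};

enum violation_result {
	VIOLATION_FATAL,	/* caller would oops */
	VIOLATION_RECOVERED,	/* protection dropped, the write may be retried */
};

struct pks_tables_state {
	enum pks_tables_mode mode;
	bool enforcing;		/* is write protection currently active? */
	unsigned int warnings;	/* warnings emitted so far */
};

void pks_tables_init(struct pks_tables_state *st, enum pks_tables_mode mode)
{
	assert(st != NULL);
	st->mode = mode;
	st->enforcing = true;
	st->warnings = 0;
}

/* Decide what happens when a write hits a protected page table. */
enum violation_result pks_tables_violation(struct pks_tables_state *st,
					   unsigned long addr)
{
	assert(st != NULL);

	if (st->mode == PKS_TABLES_HARD)
		return VIOLATION_FATAL;

	/* Soft mode: warn and disable protection instead of oopsing. */
	fprintf(stderr, "pks tables: unexpected write at %#lx, disabling protection\n",
		addr);
	st->warnings++;
	st->enforcing = false;
	return VIOLATION_RECOVERED;
}
```

In this sketch a caller would initialize the state once with the chosen mode
and consult pks_tables_violation() from its (simulated) fault handler. After
a soft-mode violation, enforcing stays false, so no further warnings are
expected.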

Also, Mike Rapoport has been experimenting[3] with this usage to work on how to
share caches of permissioned/broken pages between use cases. This RFCv2 still
uses the "grouped pages" concept, where each usage would maintain its own
cache, but should be able to integrate with a central solution if something is
developed.
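
As a rough illustration of the "grouped pages" idea (not the implementation
in this series), a per-usage cache can be modelled as a large block that is
converted once and then handed out page by page. Everything below is a
userspace toy under stated assumptions: the toy_* names, the 512-page group
size and the perm_changes counter (a stand-in for a direct map permission
change and its shootdown) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE	4096UL
#define TOY_GROUP_PAGES	512	/* assumed group size: 2 MiB of 4 KiB pages */

/* Toy per-usage cache: one block, converted once, handed out per page. */
struct toy_page_cache {
	unsigned char *block;			/* backing block, NULL until first use */
	size_t next_unused;			/* index of the next never-used page */
	void *free_list[TOY_GROUP_PAGES];	/* returned pages, reused first */
	size_t nr_free;
	unsigned long perm_changes;		/* stand-in for permission change + shootdown */
};

void toy_cache_init(struct toy_page_cache *c)
{
	assert(c != NULL);
	c->block = NULL;
	c->next_unused = 0;
	c->nr_free = 0;
	c->perm_changes = 0;
}

void *toy_get_page(struct toy_page_cache *c)
{
	assert(c != NULL);

	/* Prefer pages that were freed back into the cache. */
	if (c->nr_free)
		return c->free_list[--c->nr_free];

	if (!c->block) {
		c->block = aligned_alloc(TOY_PAGE_SIZE,
					 TOY_GROUP_PAGES * TOY_PAGE_SIZE);
		if (!c->block)
			return NULL;
		c->next_unused = 0;
		/* One permission change covers the whole group. */
		c->perm_changes++;
	}

	if (c->next_unused == TOY_GROUP_PAGES)
		return NULL;	/* toy limit; a real cache would refill */

	return c->block + TOY_PAGE_SIZE * c->next_unused++;
}

void toy_free_page(struct toy_page_cache *c, void *page)
{
	assert(c != NULL && c->nr_free < TOY_GROUP_PAGES);
	/* The page stays in the cache with its permissions unchanged. */
	c->free_list[c->nr_free++] = page;
}

void toy_cache_destroy(struct toy_page_cache *c)
{
	assert(c != NULL);
	free(c->block);
	toy_cache_init(c);
}
```

The point of the toy is that allocating and freeing pages repeatedly leaves
perm_changes at 1, whereas converting each page individually would need one
change per allocation.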

Next I was planning to look into characterizing/tuning the performance, although
what page allocation scheme is ultimately used will probably impact that.

This applies on top of the PKS core v7 series[2] and this patch[4]. Testing is
still pretty light.

This RFC has been acked by Dave Hansen.

[1] https://lore.kernel.org/lkml/20210505003032.489164-1-rick.p.edgecombe@intel.com/
[2] https://lore.kernel.org/lkml/20210804043231.2655537-1-ira.weiny@intel.com/
[3] https://lore.kernel.org/lkml/20210823132513.15836-1-rppt@kernel.org/
[4] https://lore.kernel.org/lkml/20210818221026.10794-1-rick.p.edgecombe@intel.com/

Rick Edgecombe (19):
  list: Support getting most recent element in list_lru
  list: Support list head not in object for list_lru
  x86/mm/cpa: Add grouped page allocations
  mm: Explicitly zero page table lock ptr
  x86, mm: Use cache of page tables
  x86/mm/cpa: Add perm callbacks to grouped pages
  x86/cpufeatures: Add feature for pks tables
  x86/mm/cpa: Add get_grouped_page_atomic()
  x86/mm: Support GFP_ATOMIC in alloc_table_node()
  x86/mm: Use alloc_table() for fill_pte(), etc
  mm/sparsemem: Use alloc_table() for table allocations
  x86/mm: Use free_table in unmap path
  mm/debug_vm_page_table: Use setters instead of WRITE_ONCE
  x86/efi: Toggle table protections when copying
  x86/mm/cpa: Add set_memory_pks()
  x86/mm: Protect page tables with PKS
  x86/mm/cpa: PKS protect direct map page tables
  x86/mm: Add PKS table soft mode
  x86/mm: Add PKS table debug checking

 .../admin-guide/kernel-parameters.txt         |   4 +
 arch/x86/boot/compressed/ident_map_64.c       |   5 +
 arch/x86/include/asm/cpufeatures.h            |   2 +-
 arch/x86/include/asm/pgalloc.h                |   6 +-
 arch/x86/include/asm/pgtable.h                |  31 +-
 arch/x86/include/asm/pgtable_64.h             |  33 +-
 arch/x86/include/asm/pkeys_common.h           |   1 -
 arch/x86/include/asm/set_memory.h             |  24 +
 arch/x86/mm/init.c                            |  90 +++
 arch/x86/mm/init_64.c                         |  29 +-
 arch/x86/mm/pat/set_memory.c                  | 527 +++++++++++++++++-
 arch/x86/mm/pgtable.c                         | 183 +++++-
 arch/x86/mm/pkeys.c                           |   4 +
 arch/x86/platform/efi/efi_64.c                |   8 +
 include/asm-generic/pgalloc.h                 |  46 +-
 include/linux/list_lru.h                      |  26 +
 include/linux/mm.h                            |  16 +-
 include/linux/pkeys.h                         |   1 +
 mm/Kconfig                                    |  23 +
 mm/debug_vm_pgtable.c                         |  36 +-
 mm/list_lru.c                                 |  38 +-
 mm/memory.c                                   |   1 +
 mm/sparse-vmemmap.c                           |  22 +-
 mm/swap.c                                     |   6 +
 mm/swap_state.c                               |   5 +
 .../arch/x86/include/asm/disabled-features.h  |   8 +-
 26 files changed, 1123 insertions(+), 52 deletions(-)

Comments

Kees Cook March 14, 2024, 4:27 p.m. UTC | #1
On Mon, Aug 30, 2021 at 04:59:08PM -0700, Rick Edgecombe wrote:
> This is a second RFC for the PKS write protected tables concept. I'm sharing to
> show the progress to interested people. I'd also appreciate any comments,
> especially on the direct map page table protection solution (patch 17).

*thread necromancy*

Hi,

Where does this series stand? I don't think it ever got merged?

-Kees
Edgecombe, Rick P March 14, 2024, 5:10 p.m. UTC | #2
On Thu, 2024-03-14 at 09:27 -0700, Kees Cook wrote:
> On Mon, Aug 30, 2021 at 04:59:08PM -0700, Rick Edgecombe wrote:
> > This is a second RFC for the PKS write protected tables concept.
> > I'm sharing to
> > show the progress to interested people. I'd also appreciate any
> > comments,
> > especially on the direct map page table protection solution (patch
> > 17).
> 
> *thread necromancy*
> 
> Hi,
> 
> Where does this series stand? I don't think it ever got merged?

There are sort of three components to this:
1. Basic PKS support. It was dropped after the main use case was
rejected (pmem stray write protection).
2. Solution for applying direct map permissions efficiently. This
includes avoiding excessive kernel shootdowns, as well as avoiding
direct map fragmentation. rppt continued to look at the fragmentation
part of the problem and ended up arguing that it actually isn't an
issue [0]. Regardless, the shootdown problem remains for usages like
PKS tables that allocate so frequently. There is an attempt to address
both in this series. But given the above, there may be lots of debate
and opinions.
3. The actual protection of the PKS tables (most of this series). It
got paused when I started to work on CET. In the meantime 1 was
dropped, and 2 is still open(?). So there is more to work through now
than when it was dropped.

If anyone wants to pick it up, it is fine by me. I can help with
reviews.


[0] https://lwn.net/Articles/931406/
Ira Weiny March 14, 2024, 6:25 p.m. UTC | #3
Edgecombe, Rick P wrote:
> On Thu, 2024-03-14 at 09:27 -0700, Kees Cook wrote:
> > On Mon, Aug 30, 2021 at 04:59:08PM -0700, Rick Edgecombe wrote:
> > > This is a second RFC for the PKS write protected tables concept.
> > > I'm sharing to
> > > show the progress to interested people. I'd also appreciate any
> > > comments,
> > > especially on the direct map page table protection solution (patch
> > > 17).
> > 
> > *thread necromancy*
> > 
> > Hi,
> > 
> > Where does this series stand? I don't think it ever got merged?
> 
> There are sort of three components to this:
> 1. Basic PKS support. It was dropped after the main use case was
> rejected (pmem stray write protection).

This was the main reason it got dropped.

> 2. Solution for applying direct map permissions efficiently. This
> includes avoiding excessive kernel shootdowns, as well as avoiding
> direct map fragmentation. rppt continued to look at the fragmentation
> part of the problem and ended up arguing that it actually isn't an
> issue [0]. Regardless, the shootdown problem remains for usages like
> PKS tables that allocate so frequently. There is an attempt to address
> both in this series. But given the above, there may be lots of debate
> and opinions.
> 3. The actual protection of the PKS tables (most of this series). It
> got paused when I started to work on CET. In the meantime 1 was
> dropped, and 2 is still open(?). So there is more to work through now
> than when it was dropped.
> 
> If anyone wants to pick it up, it is fine by me. I can help with
> reviews.

I can help with reviews as well,
Ira

> 
> 
> [0] https://lwn.net/Articles/931406/
Boris Lukashev March 14, 2024, 9:02 p.m. UTC | #4
IIRC shoot-downs are one of the reasons for using per-cpu PGDs which would
be a hard sell to some people.
https://forum.osdev.org/viewtopic.php?f=15&t=29661

-Boris

Boris Lukashev March 16, 2024, 3:14 a.m. UTC | #5
IIRC shoot-downs are one of the reasons for using per-cpu PGDs, which
can in turn enable/underpin other hardening functions... presuming the
churn of recent years has softened attitudes toward such core MM
changes.
https://forum.osdev.org/viewtopic.php?f=15&t=29661

-Boris

