[RFCv2,00/13] TDX and guest memory unmapping

Message ID 20210416154106.23721-1-kirill.shutemov@linux.intel.com (mailing list archive)

Message

Kirill A . Shutemov April 16, 2021, 3:40 p.m. UTC
TDX integrity check failures may lead to system shutdown, so the host kernel
must not allow any writes to TD-private memory. This requirement clashes with
KVM design: KVM expects the guest memory to be mapped into host userspace
(e.g. QEMU).

This patchset aims to start discussion on how we can approach the issue.

The core of the change is in the last patch. Please see the more detailed
description of the issue and the proposed solution there.

TODO:
  - THP support
  - Clarify semantics with respect to the page cache

v2:
  - Unpoison page on free
  - Authorize access to the page: only the KVM instance that poisoned the
    page is allowed to touch it
  - FOLL_ALLOW_POISONED is implemented
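
To make the v2 items above concrete, here is a rough userspace model of the
intended ownership rule. It is only a sketch under assumptions, not code from
the series: struct vm, struct page_model and the model_*() helpers are all
hypothetical names, a NULL vm stands in for "host userspace or anyone who is
not a KVM instance", and the real patches track poisoning through hwpoison
entries rather than a per-page owner field.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these names come from the patchset. */
struct vm {
	int id;
};

struct page_model {
	bool poisoned;
	const struct vm *owner;	/* VM that poisoned the page, NULL if none */
};

/* Poison a page on behalf of a VM. Fails if the page is already poisoned. */
static bool model_poison(struct page_model *p, const struct vm *vm)
{
	if (p->poisoned)
		return false;
	p->poisoned = true;
	p->owner = vm;
	return true;
}

/*
 * Assumed access rule: an unpoisoned page is open to everyone, a poisoned
 * page only to the VM that poisoned it.
 */
static bool model_may_access(const struct page_model *p, const struct vm *vm)
{
	return !p->poisoned || p->owner == vm;
}

/* "Unpoison page on free": the next user must get a usable page. */
static void model_free(struct page_model *p)
{
	p->poisoned = false;
	p->owner = NULL;
}
```

In this model, a page poisoned by one VM is rejected for a second VM and for
a NULL caller, and becomes accessible to everyone again once model_free() has
run.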

The patchset can also be found here:

git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git kvm-unmapped-poison

Kirill A. Shutemov (13):
  x86/mm: Move force_dma_unencrypted() to common code
  x86/kvm: Introduce KVM memory protection feature
  x86/kvm: Make DMA pages shared
  x86/kvm: Use bounce buffers for KVM memory protection
  x86/kvmclock: Share hvclock memory with the host
  x86/realmode: Share trampoline area if KVM memory protection enabled
  mm: Add hwpoison_entry_to_pfn() and hwpoison_entry_to_page()
  mm/gup: Add FOLL_ALLOW_POISONED
  shmem: Fail shmem_getpage_gfp() on poisoned pages
  mm: Keep page reference for hwpoison entries
  mm: Replace hwpoison entry with present PTE if page got unpoisoned
  KVM: passdown struct kvm to hva_to_pfn_slow()
  KVM: unmap guest memory using poisoned pages

 arch/powerpc/kvm/book3s_64_mmu_hv.c    |   2 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c |   2 +-
 arch/x86/Kconfig                       |   9 +-
 arch/x86/include/asm/cpufeatures.h     |   1 +
 arch/x86/include/asm/io.h              |   4 +-
 arch/x86/include/asm/kvm_para.h        |   5 +
 arch/x86/include/asm/mem_encrypt.h     |   7 +-
 arch/x86/include/uapi/asm/kvm_para.h   |   3 +-
 arch/x86/kernel/kvm.c                  |  19 +++
 arch/x86/kernel/kvmclock.c             |   2 +-
 arch/x86/kernel/pci-swiotlb.c          |   3 +-
 arch/x86/kvm/Kconfig                   |   1 +
 arch/x86/kvm/cpuid.c                   |   3 +-
 arch/x86/kvm/mmu/mmu.c                 |  20 ++-
 arch/x86/kvm/mmu/paging_tmpl.h         |  10 +-
 arch/x86/kvm/x86.c                     |   6 +
 arch/x86/mm/Makefile                   |   2 +
 arch/x86/mm/mem_encrypt.c              |  74 ---------
 arch/x86/mm/mem_encrypt_common.c       |  87 ++++++++++
 arch/x86/mm/pat/set_memory.c           |  10 ++
 arch/x86/realmode/init.c               |   7 +-
 include/linux/kvm_host.h               |  31 +++-
 include/linux/mm.h                     |   1 +
 include/linux/swapops.h                |  20 +++
 include/uapi/linux/kvm_para.h          |   5 +-
 mm/gup.c                               |  29 ++--
 mm/memory.c                            |  44 ++++-
 mm/page_alloc.c                        |   7 +
 mm/rmap.c                              |   2 +-
 mm/shmem.c                             |   7 +
 virt/Makefile                          |   2 +-
 virt/kvm/Kconfig                       |   4 +
 virt/kvm/Makefile                      |   1 +
 virt/kvm/kvm_main.c                    | 212 +++++++++++++++++++------
 virt/kvm/mem_protected.c               | 110 +++++++++++++
 35 files changed, 593 insertions(+), 159 deletions(-)
 create mode 100644 arch/x86/mm/mem_encrypt_common.c
 create mode 100644 virt/kvm/Makefile
 create mode 100644 virt/kvm/mem_protected.c

Comments

Matthew Wilcox April 16, 2021, 4:46 p.m. UTC | #1
On Fri, Apr 16, 2021 at 06:40:53PM +0300, Kirill A. Shutemov wrote:
> TDX integrity check failures may lead to system shutdown, so the host kernel
> must not allow any writes to TD-private memory. This requirement clashes with
> KVM design: KVM expects the guest memory to be mapped into host userspace
> (e.g. QEMU).
> 
> This patchset aims to start discussion on how we can approach the issue.
> 
> The core of the change is in the last patch. Please see the more detailed
> description of the issue and the proposed solution there.

This seems to have some parallels with s390's arch_make_page_accessible().
Is there any chance to combine the two, so we don't end up with duplicated
hooks all over the MM for this kind of thing?

https://patchwork.kernel.org/project/kvm/cover/20200214222658.12946-1-borntraeger@de.ibm.com/

and recent THP/Folio-related discussion:
https://lore.kernel.org/linux-mm/20210409194059.GW2531743@casper.infradead.org/