mbox series

[v2,00/14] KVM: guest_memfd: lazy preparation of pages + prefault support for SEV-SNP

Message ID 20240726185157.72821-1-pbonzini@redhat.com (mailing list archive)
Headers show
Series KVM: guest_memfd: lazy preparation of pages + prefault support for SEV-SNP | expand

Message

Paolo Bonzini July 26, 2024, 6:51 p.m. UTC
There are a few issues with the implementation of SEV page "preparation"
and KVM_SEV_SNP_LAUNCH_UPDATE:

- doing fallocate() before KVM_SEV_SNP_LAUNCH_UPDATE will cause the latter to fail.

- checking that only private pages are passed to KVM_SEV_SNP_LAUNCH_UPDATE
  is done in the guts of vendor code, as is the check that pages are not
  passed twice.  This goes against the idea of putting as much common
  code as possible in kvm_gmem_populate(), for example it returns -EINVAL
  if the page is already assigned.

- clearing the page is unnecessary if the firmware will overwrite it

- the "prepare" bool argument is kinda gross

The important observation here is that the two paths that initialize the
contents of the folio are mutually exclusive: either the folio is populated
with unencrypted data by userspace, or it is zeroed because userspace did
not do anything beyond fallocate().  But in the latter there's no need to
zero and prepare the page at the time of fallocate(): it can be done instead
when the guest actually uses the page.

Pulling all the zero-and-prepare code into kvm_gmem_get_pfn() separates the
flows more clearly, but how do you recognize folios that haven't been
prepared yet?  The answer is to use the up-to-date flag; there is a
limited attempt to use it right now, but it only concerns itself with
the folio having been cleared.  Instead after this series the flag is set
for folios that went through either kvm_gmem_populate() or that have been
mapped into the guest once (and thus went through kvm_arch_gmem_prepare(),
on architectures where that is a thing).

As a bonus, KVM knows exactly which guest is mapping a page, and thus
the preparation code does not have to iterate through all bound
instances of "struct kvm".

There is an easy way vendor-independent way to obtain the previous
behavior if desired, and that is simply to do KVM_PRE_FAULT_MEMORY after
KVM_SEV_SNP_LAUNCH_FINISH.  Note that KVM_PRE_FAULT_MEMORY *before*
KVM_SEV_SNP_LAUNCH_FINISH would not work, and is therefore explicitly
blocked by the first patch in the series.

The bulk of the changes are in patches 1, 7, 10 and 13.  Everything else is
small preparatory changes; many patches in the first half for example try
to use struct folio more instead of struct page and pfns, which is useful
when the code region before folio_unlock() is extended to the callers
of kvm_gmem_get_folio().  There's also a couple cleanups in the middle,
for example patches 4 and 8.

Paolo



Paolo Bonzini (14):
  KVM: x86: disallow pre-fault for SNP VMs before initialization
  KVM: guest_memfd: return folio from __kvm_gmem_get_pfn()
  KVM: guest_memfd: delay folio_mark_uptodate() until after successful
    preparation
  KVM: guest_memfd: do not go through struct page
  KVM: rename CONFIG_HAVE_KVM_GMEM_* to CONFIG_HAVE_KVM_ARCH_GMEM_*
  KVM: guest_memfd: return locked folio from __kvm_gmem_get_pfn
  KVM: guest_memfd: delay kvm_gmem_prepare_folio() until the memory is
    passed to the guest
  KVM: guest_memfd: make kvm_gmem_prepare_folio() operate on a single
    struct kvm
  KVM: remove kvm_arch_gmem_prepare_needed()
  KVM: guest_memfd: move check for already-populated page to common code
  KVM: cleanup and add shortcuts to kvm_range_has_memory_attributes()
  KVM: extend kvm_range_has_memory_attributes() to check subset of
    attributes
  KVM: guest_memfd: let kvm_gmem_populate() operate only on private gfns
  KVM: guest_memfd: abstract how prepared folios are recorded

v1->v2:
- patch 1: new
- patch 7: point out benign race between page fault and hole-punch
- patch 7: clean up comments
- patch 10: clean up commit message
- patch 11: remove ";;"
- patch 13: fix length argument to kvm_range_has_memory_attributes()
- patch 14: new

 Documentation/virt/kvm/api.rst  |   6 +
 arch/x86/include/asm/kvm_host.h |   1 +
 arch/x86/kvm/Kconfig            |   4 +-
 arch/x86/kvm/mmu/mmu.c          |   5 +-
 arch/x86/kvm/svm/sev.c          |  17 +--
 arch/x86/kvm/svm/svm.c          |   1 +
 arch/x86/kvm/x86.c              |  12 +-
 include/linux/kvm_host.h        |   9 +-
 virt/kvm/Kconfig                |   4 +-
 virt/kvm/guest_memfd.c          | 226 +++++++++++++++++++-------------
 virt/kvm/kvm_main.c             |  73 +++++------
 11 files changed, 206 insertions(+), 152 deletions(-)

Comments

Michael Roth July 29, 2024, 10:06 p.m. UTC | #1
On Fri, Jul 26, 2024 at 02:51:43PM -0400, Paolo Bonzini wrote:
> Paolo Bonzini (14):
>   KVM: x86: disallow pre-fault for SNP VMs before initialization
>   KVM: guest_memfd: return folio from __kvm_gmem_get_pfn()
>   KVM: guest_memfd: delay folio_mark_uptodate() until after successful
>     preparation
>   KVM: guest_memfd: do not go through struct page
>   KVM: rename CONFIG_HAVE_KVM_GMEM_* to CONFIG_HAVE_KVM_ARCH_GMEM_*
>   KVM: guest_memfd: return locked folio from __kvm_gmem_get_pfn
>   KVM: guest_memfd: delay kvm_gmem_prepare_folio() until the memory is
>     passed to the guest
>   KVM: guest_memfd: make kvm_gmem_prepare_folio() operate on a single
>     struct kvm
>   KVM: remove kvm_arch_gmem_prepare_needed()
>   KVM: guest_memfd: move check for already-populated page to common code
>   KVM: cleanup and add shortcuts to kvm_range_has_memory_attributes()
>   KVM: extend kvm_range_has_memory_attributes() to check subset of
>     attributes
>   KVM: guest_memfd: let kvm_gmem_populate() operate only on private gfns
>   KVM: guest_memfd: abstract how prepared folios are recorded

Re-tested series with multiple SNP guests with/without experimental THP
patches on top.

Also re-tested using the following SNP self tests (I believe Pratik has
a newer version he'll be submitting soon):

  https://github.com/mdroth/linux/commits/snp-uptodate2-kst4/
  x86_64/coco_pre_fault_memory_test

Series:

Tested-by: Michael Roth <michael.roth@amd.com>

> 
> v1->v2:
> - patch 1: new
> - patch 7: point out benign race between page fault and hole-punch
> - patch 7: clean up comments
> - patch 10: clean up commit message
> - patch 11: remove ";;"
> - patch 13: fix length argument to kvm_range_has_memory_attributes()
> - patch 14: new
> 
>  Documentation/virt/kvm/api.rst  |   6 +
>  arch/x86/include/asm/kvm_host.h |   1 +
>  arch/x86/kvm/Kconfig            |   4 +-
>  arch/x86/kvm/mmu/mmu.c          |   5 +-
>  arch/x86/kvm/svm/sev.c          |  17 +--
>  arch/x86/kvm/svm/svm.c          |   1 +
>  arch/x86/kvm/x86.c              |  12 +-
>  include/linux/kvm_host.h        |   9 +-
>  virt/kvm/Kconfig                |   4 +-
>  virt/kvm/guest_memfd.c          | 226 +++++++++++++++++++-------------
>  virt/kvm/kvm_main.c             |  73 +++++------
>  11 files changed, 206 insertions(+), 152 deletions(-)
> 
> -- 
> 2.43.0
> 
>