[v2,0/5] KVM: guest_memfd: support for uffd minor

Message ID: 20250402160721.97596-1-kalyazin@amazon.com
Nikita Kalyazin April 2, 2025, 4:07 p.m. UTC
This series is built on top of Fuad's v7 "mapping guest_memfd backed
memory at the host" [1].

With James's KVM userfault [2], it is possible to handle stage-2 faults
in guest_memfd in userspace.  However, KVM itself also triggers faults
in guest_memfd in some cases, for example for PV interfaces like
kvmclock and PV EOI, and in the page table walking code when fetching
the MMIO instruction on x86.  It was agreed in the guest_memfd upstream
call on 23 Jan 2025 [3] that KVM would access those pages via userspace
page tables.  In order for such faults to be handled in userspace,
guest_memfd needs to support userfaultfd.
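
For concreteness, below is a minimal sketch (not part of the series) of
the userspace flow this enables, assuming a mappable guest_memfd per
[1]: reg_va is the mapping registered for minor faults and alias_va is
a second, unregistered mapping of the same guest_memfd used to populate
contents; feature negotiation details and the fault-reading loop are
elided.

#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Register the guest_memfd mapping for minor faults; returns uffd. */
static int gmem_uffd_register(void *reg_va, size_t len)
{
	struct uffdio_api api = { .api = UFFD_API };
	struct uffdio_register reg = {
		.range = { .start = (unsigned long)reg_va, .len = len },
		.mode = UFFDIO_REGISTER_MODE_MINOR,
	};
	int uffd = syscall(__NR_userfaultfd, O_CLOEXEC);

	/* A guest_memfd minor feature bit would presumably be negotiated
	 * via api.features, akin to UFFD_FEATURE_MINOR_SHMEM; elided. */
	if (uffd < 0 || ioctl(uffd, UFFDIO_API, &api) ||
	    ioctl(uffd, UFFDIO_REGISTER, &reg))
		return -1;
	return uffd;
}

/* Resolve one minor fault: fill the folio through the unregistered
 * alias mapping, then install the PTE with UFFDIO_CONTINUE. */
static int gmem_resolve_minor(int uffd, unsigned long fault_addr,
			      void *reg_va, void *alias_va,
			      const void *src, size_t pgsz)
{
	unsigned long off = (fault_addr & ~(pgsz - 1)) -
			    (unsigned long)reg_va;
	struct uffdio_continue cont = {
		.range = { .start = (unsigned long)reg_va + off,
			   .len = pgsz },
	};

	memcpy((char *)alias_va + off, src, pgsz);
	return ioctl(uffd, UFFDIO_CONTINUE, &cont);
}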

Changes since v1 [4]:
 - James, Peter: implement a full minor trap instead of a hybrid
   missing/minor trap
 - James, Peter: to avoid shmem- and guest_memfd-specific code in the
   UFFDIO_CONTINUE implementation, make it generic by calling
   vm_ops->fault() (see the sketch after this list)
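
For illustration, the idea behind the generic path can be sketched as
below.  This is only a sketch, not the patch itself: the vm_fault is
constructed by hand the way other in-tree callers (e.g. khugepaged) do,
and retry/error handling is elided.

static struct folio *uffd_get_folio_via_fault(struct vm_area_struct *vma,
					      unsigned long addr)
{
	struct vm_fault vmf = {
		.vma = vma,
		.address = addr & PAGE_MASK,
		.pgoff = linear_page_index(vma, addr),
		.flags = FAULT_FLAG_USER,
	};
	vm_fault_t ret;

	/* Let the VMA's own fault handler find or allocate the page,
	 * instead of calling shmem_get_folio() or a hugetlbfs lookup
	 * directly.  This works for any mapping whose vm_ops can serve
	 * the fault, guest_memfd included. */
	ret = vma->vm_ops->fault(&vmf);
	if (ret & VM_FAULT_ERROR)
		return ERR_PTR(-EFAULT);

	/* Handlers typically return the page locked (VM_FAULT_LOCKED),
	 * which is what the UFFDIO_CONTINUE path expects before
	 * installing the PTE. */
	return page_folio(vmf.page);
}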

While generalising the UFFDIO_CONTINUE implementation helped avoid
guest_memfd-specific code in mm/userfaultfd, userfaultfd still needs
access to KVM code to be able to verify the VMA type when handling
UFFDIO_REGISTER_MODE_MINOR, so for now I used an approach similar to
what Fuad did [5] (see the sketch below).
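
The registration-time check then reduces to something like the sketch
below.  kvm_gmem_vma_is_gmem() is the helper added in patch 2; its
exact signature and the surrounding function name are assumed here for
illustration only.

bool kvm_gmem_vma_is_gmem(struct vm_area_struct *vma);	/* patch 2 */

/* Illustrative: which VMA types may register for
 * UFFDIO_REGISTER_MODE_MINOR. */
static bool vma_can_userfault_minor(struct vm_area_struct *vma)
{
	return is_vm_hugetlb_page(vma) || vma_is_shmem(vma) ||
	       kvm_gmem_vma_is_gmem(vma);
}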

In v1, Peter mentioned the potential to eliminate taking the folio
lock [6].  I did not implement that, but according to my testing, the
performance of shmem minor fault handling stayed the same after the
migration to calling vm_ops->fault() (tested on x86).

Before:

./demand_paging_test -u MINOR -s shmem
Random seed: 0x6b8b4567
Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
guest physical test memory: [0x3fffbffff000, 0x3ffffffff000)
Finished creating vCPUs and starting uffd threads
Started all vCPUs
All vCPU threads joined
Total guest execution time:	10.979277020s
Per-vcpu demand paging rate:	23876.253375 pgs/sec/vcpu
Overall demand paging rate:	23876.253375 pgs/sec

After:

./demand_paging_test -u MINOR -s shmem
Random seed: 0x6b8b4567
Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
guest physical test memory: [0x3fffbffff000, 0x3ffffffff000)
Finished creating vCPUs and starting uffd threads
Started all vCPUs
All vCPU threads joined
Total guest execution time:	10.978893504s
Per-vcpu demand paging rate:	23877.087423 pgs/sec/vcpu
Overall demand paging rate:	23877.087423 pgs/sec

Nikita

[1] https://lore.kernel.org/kvm/20250318161823.4005529-1-tabba@google.com/T/
[2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthoughton@google.com/T/
[3] https://docs.google.com/document/d/1M6766BzdY1Lhk7LiR5IqVR8B8mG3cr-cxTxOrAosPOk/edit?tab=t.0#heading=h.w1126rgli5e3
[4] https://lore.kernel.org/kvm/20250303133011.44095-1-kalyazin@amazon.com/T/
[5] https://lore.kernel.org/kvm/20250318161823.4005529-1-tabba@google.com/T/#Z2e.:..:20250318161823.4005529-3-tabba::40google.com:1mm:swap.c
[6] https://lore.kernel.org/kvm/20250303133011.44095-1-kalyazin@amazon.com/T/#m8695dc24d2cc633a6a486a8990e3f7d50d4efb79

Nikita Kalyazin (5):
  mm: userfaultfd: generic continue for non hugetlbfs
  KVM: guest_memfd: add kvm_gmem_vma_is_gmem
  mm: userfaultfd: allow to register continue for guest_memfd
  KVM: guest_memfd: add support for userfaultfd minor
  KVM: selftests: test userfaultfd minor for guest_memfd

 include/linux/mm_types.h                      |  3 +
 include/linux/userfaultfd_k.h                 | 13 ++-
 mm/hugetlb.c                                  |  2 +-
 mm/shmem.c                                    |  3 +-
 mm/userfaultfd.c                              | 25 +++--
 .../testing/selftests/kvm/guest_memfd_test.c  | 94 +++++++++++++++++++
 virt/kvm/guest_memfd.c                        | 15 +++
 virt/kvm/kvm_mm.h                             |  1 +
 8 files changed, 146 insertions(+), 10 deletions(-)


base-commit: 3cc51efc17a2c41a480eed36b31c1773936717e0