@@ -8068,7 +8068,7 @@ See KVM_EXIT_MEMORY_FAULT for more information.
7.35 KVM_CAP_EXIT_ON_MISSING
----------------------------
-:Architectures: None
+:Architectures: x86
:Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
The presence of this capability indicates that userspace may set the
@@ -49,6 +49,7 @@ config KVM
select INTERVAL_TREE
select HAVE_KVM_PM_NOTIFIER if PM
select KVM_GENERIC_HARDWARE_ENABLING
+ select HAVE_KVM_EXIT_ON_MISSING
help
Support hosting fully virtualized guest machines using hardware
virtualization extensions. You will need a fairly recent
@@ -3309,6 +3309,10 @@ static int kvm_handle_error_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fa
return RET_PF_RETRY;
}
+ WARN_ON_ONCE(fault->goal_level != PG_LEVEL_4K);
+
+ kvm_prepare_memory_fault_exit(vcpu, gfn_to_gpa(fault->gfn), PAGE_SIZE,
+ fault->write, fault->exec, fault->is_private);
return -EFAULT;
}
@@ -4375,7 +4379,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
async = false;
fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, false, &async,
fault->write, &fault->map_writable,
- false, &fault->hva);
+ true, &fault->hva);
if (!async)
return RET_PF_CONTINUE; /* *pfn has correct page already */
@@ -4397,7 +4401,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
*/
fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, true, NULL,
fault->write, &fault->map_writable,
- false, &fault->hva);
+ true, &fault->hva);
return RET_PF_CONTINUE;
}
Prevent the stage-2 fault handler from faulting in pages when
KVM_MEM_EXIT_ON_MISSING is set by allowing its __gfn_to_pfn_memslot()
calls to check the memslot flag.

To actually make that behavior useful, prepare a KVM_EXIT_MEMORY_FAULT
when the stage-2 handler returns EFAULT, e.g. when it cannot resolve the
pfn. With KVM_MEM_EXIT_ON_MISSING enabled, this causes stage-2 faults to
be delivered as vCPU exits, which userspace can attempt to resolve
without terminating the guest.

Delivering stage-2 faults to userspace in this way sidesteps the
significant scalability issues associated with using userfaultfd for the
same purpose.

Signed-off-by: Anish Moorthy <amoorthy@google.com>
---
 Documentation/virt/kvm/api.rst | 2 +-
 arch/x86/kvm/Kconfig           | 1 +
 arch/x86/kvm/mmu/mmu.c         | 8 ++++++--
 3 files changed, 8 insertions(+), 3 deletions(-)
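
For illustration only, the userspace side of the flow described above might look roughly like the sketch below. It is not part of the patch and does not use the KVM uAPI headers: `struct sketch_memory_fault` is a local stand-in for the `flags`/`gpa`/`size` fields that a KVM_EXIT_MEMORY_FAULT exit reports, and `struct sketch_slot`, `sketch_resolve_fault()` and the `source` buffer are hypothetical names invented for this example. A real VMM would read the fault from the vCPU's run structure, populate the missing range by whatever means it uses to fetch guest memory, and then re-enter the guest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/* Local stand-in for the gpa/size/flags reported with KVM_EXIT_MEMORY_FAULT. */
struct sketch_memory_fault {
	uint64_t flags;
	uint64_t gpa;
	uint64_t size;
};

/* Hypothetical VMM-side view of one memslot with KVM_MEM_EXIT_ON_MISSING set. */
struct sketch_slot {
	uint64_t base_gpa;	/* guest physical address of the slot */
	uint64_t size;		/* slot size in bytes */
	uint8_t *backing;	/* host memory backing the guest range */
	const uint8_t *source;	/* assumed origin of the missing contents */
};

/*
 * Populate the faulting range so the vCPU can be re-entered.
 * Returns 0 on success, -1 if the fault does not fall inside the slot.
 */
static int sketch_resolve_fault(struct sketch_slot *slot,
				const struct sketch_memory_fault *mf)
{
	uint64_t off;

	assert(slot && mf);

	if (mf->size == 0 || mf->gpa < slot->base_gpa)
		return -1;

	off = mf->gpa - slot->base_gpa;
	if (off >= slot->size || mf->size > slot->size - off)
		return -1;

	/* Stand-in for fetching the page contents, e.g. during post-copy. */
	memcpy(slot->backing + off, slot->source + off, (size_t)mf->size);
	return 0;
}
```

With this patch the reported range is a single page (`PAGE_SIZE` at `gfn_to_gpa(fault->gfn)`), so a handler of this shape would be called once per missing 4K page; a fault outside every known slot is treated as fatal by the caller.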