diff mbox series

[v6,09/14] KVM: arm64: Enable KVM_CAP_EXIT_ON_MISSING and annotate an EFAULT from stage-2 fault-handler

Message ID 20231109210325.3806151-10-amoorthy@google.com (mailing list archive)
State New, archived
Headers show
Series Improve KVM + userfaultfd performance via KVM_MEMORY_FAULT_EXITs on stage-2 faults | expand

Commit Message

Anish Moorthy Nov. 9, 2023, 9:03 p.m. UTC
Prevent the stage-2 fault handler from faulting in pages when
KVM_MEM_EXIT_ON_MISSING is set by allowing its  __gfn_to_pfn_memslot()
calls to check the memslot flag.

To actually make that behavior useful, prepare a KVM_EXIT_MEMORY_FAULT
when the stage-2 handler cannot resolve the pfn for a fault. With
KVM_MEM_EXIT_ON_MISSING enabled this effects the delivery of stage-2
faults as vCPU exits, which userspace can attempt to resolve without
terminating the guest.

Delivering stage-2 faults to userspace in this way sidesteps the
significant scalabiliy issues associated with using userfaultfd for the
same purpose.

Signed-off-by: Anish Moorthy <amoorthy@google.com>
---
 Documentation/virt/kvm/api.rst | 2 +-
 arch/arm64/kvm/Kconfig         | 1 +
 arch/arm64/kvm/mmu.c           | 7 +++++--
 3 files changed, 7 insertions(+), 3 deletions(-)

Comments

James Houghton Jan. 30, 2024, 11:58 p.m. UTC | #1
Hi Anish,

Sorry to get back to you so late. :) I was hoping others would provide
more feedback, but I have a little bit to give anyway. Overall the
series looks good to me.

On Thu, Nov 9, 2023 at 1:03 PM Anish Moorthy <amoorthy@google.com> wrote:
>
> Prevent the stage-2 fault handler from faulting in pages when
> KVM_MEM_EXIT_ON_MISSING is set by allowing its  __gfn_to_pfn_memslot()
> calls to check the memslot flag.
>
> To actually make that behavior useful, prepare a KVM_EXIT_MEMORY_FAULT
> when the stage-2 handler cannot resolve the pfn for a fault. With
> KVM_MEM_EXIT_ON_MISSING enabled this effects the delivery of stage-2
> faults as vCPU exits, which userspace can attempt to resolve without
> terminating the guest.
>
> Delivering stage-2 faults to userspace in this way sidesteps the
> significant scalabiliy issues associated with using userfaultfd for the
> same purpose.
>
> Signed-off-by: Anish Moorthy <amoorthy@google.com>
> ---
>  Documentation/virt/kvm/api.rst | 2 +-
>  arch/arm64/kvm/Kconfig         | 1 +
>  arch/arm64/kvm/mmu.c           | 7 +++++--
>  3 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index fd87bbfbfdf2..67fcb9dbe855 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -8068,7 +8068,7 @@ See KVM_EXIT_MEMORY_FAULT for more information.
>  7.35 KVM_CAP_EXIT_ON_MISSING
>  ----------------------------
>
> -:Architectures: x86
> +:Architectures: x86, arm64
>  :Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
>
>  The presence of this capability indicates that userspace may set the
> diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
> index 1a777715199f..d6fae31f7e1a 100644
> --- a/arch/arm64/kvm/Kconfig
> +++ b/arch/arm64/kvm/Kconfig
> @@ -43,6 +43,7 @@ menuconfig KVM
>         select GUEST_PERF_EVENTS if PERF_EVENTS
>         select INTERVAL_TREE
>         select XARRAY_MULTI
> +        select HAVE_KVM_EXIT_ON_MISSING
>         help
>           Support hosting virtualized guest machines.
>
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 13066a6fdfff..3b9fb80672ac 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1486,13 +1486,16 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>         mmap_read_unlock(current->mm);
>
>         pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
> -                                  write_fault, &writable, false, NULL);
> +                                  write_fault, &writable, true, NULL);
>         if (pfn == KVM_PFN_ERR_HWPOISON) {
>                 kvm_send_hwpoison_signal(hva, vma_shift);
>                 return 0;
>         }
> -       if (is_error_noslot_pfn(pfn))
> +       if (is_error_noslot_pfn(pfn)) {
> +               kvm_prepare_memory_fault_exit(vcpu, gfn * PAGE_SIZE, PAGE_SIZE,
> +                                             write_fault, exec_fault, false);

I think that either (1) we move this kvm_prepare_memory_fault_exit
logic into the previous patch[1], or (2) we merge this patch with the
previous one. IIUC, we can only advertise KVM_CAP_MEMORY_FAULT_INFO on
arm64 if this logic is present.

As for the changelog in the previous patch[1], if you leave it
unmerged with this one, something like "Enable
KVM_CAP_MEMORY_FAULT_INFO to make KVM_CAP_EXIT_ON_MISSING useful, as
without it, userspace doesn't know which page(s) of memory it needs to
fix" works for me.

Also, I think we need to update the documentation for
KVM_CAP_MEMORY_FAULT_INFO to say that it is available for arm64 now
(just like you have done for KVM_CAP_EXIT_ON_MISSING).

Thanks!

[1]: https://lore.kernel.org/kvm/20231109210325.3806151-9-amoorthy@google.com/

>                 return -EFAULT;
> +       }
>
>         if (kvm_is_device_pfn(pfn)) {
>                 /*
> --
> 2.42.0.869.gea05f2083d-goog
>
Anish Moorthy Jan. 31, 2024, 10:38 p.m. UTC | #2
On Tue, Jan 30, 2024 at 3:58 PM James Houghton <jthoughton@google.com> wrote:
>
> Hi Anish,
>
> Sorry to get back to you so late. :) I was hoping others would provide
> more feedback, but I have a little bit to give anyway. Overall the
> series looks good to me.

Thanks James. I'm just happy to have some review :D

> On Thu, Nov 9, 2023 at 1:03 PM Anish Moorthy <amoorthy@google.com> wrote:
> >
> > Prevent the stage-2 fault handler from faulting in pages when
> > KVM_MEM_EXIT_ON_MISSING is set by allowing its  __gfn_to_pfn_memslot()
> > calls to check the memslot flag.
> >
> > To actually make that behavior useful, prepare a KVM_EXIT_MEMORY_FAULT
> > when the stage-2 handler cannot resolve the pfn for a fault. With
> > KVM_MEM_EXIT_ON_MISSING enabled this effects the delivery of stage-2
> > faults as vCPU exits, which userspace can attempt to resolve without
> > terminating the guest.
> >
> > Delivering stage-2 faults to userspace in this way sidesteps the
> > significant scalabiliy issues associated with using userfaultfd for the
> > same purpose.
> >
> > Signed-off-by: Anish Moorthy <amoorthy@google.com>
> > ---
> >  Documentation/virt/kvm/api.rst | 2 +-
> >  arch/arm64/kvm/Kconfig         | 1 +
> >  arch/arm64/kvm/mmu.c           | 7 +++++--
> >  3 files changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index fd87bbfbfdf2..67fcb9dbe855 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -8068,7 +8068,7 @@ See KVM_EXIT_MEMORY_FAULT for more information.
> >  7.35 KVM_CAP_EXIT_ON_MISSING
> >  ----------------------------
> >
> > -:Architectures: x86
> > +:Architectures: x86, arm64
> >  :Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
> >
> >  The presence of this capability indicates that userspace may set the
> > diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
> > index 1a777715199f..d6fae31f7e1a 100644
> > --- a/arch/arm64/kvm/Kconfig
> > +++ b/arch/arm64/kvm/Kconfig
> > @@ -43,6 +43,7 @@ menuconfig KVM
> >         select GUEST_PERF_EVENTS if PERF_EVENTS
> >         select INTERVAL_TREE
> >         select XARRAY_MULTI
> > +        select HAVE_KVM_EXIT_ON_MISSING
> >         help
> >           Support hosting virtualized guest machines.
> >
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index 13066a6fdfff..3b9fb80672ac 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -1486,13 +1486,16 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >         mmap_read_unlock(current->mm);
> >
> >         pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
> > -                                  write_fault, &writable, false, NULL);
> > +                                  write_fault, &writable, true, NULL);
> >         if (pfn == KVM_PFN_ERR_HWPOISON) {
> >                 kvm_send_hwpoison_signal(hva, vma_shift);
> >                 return 0;
> >         }
> > -       if (is_error_noslot_pfn(pfn))
> > +       if (is_error_noslot_pfn(pfn)) {
> > +               kvm_prepare_memory_fault_exit(vcpu, gfn * PAGE_SIZE, PAGE_SIZE,
> > +                                             write_fault, exec_fault, false);
>
> I think that either (1) we move this kvm_prepare_memory_fault_exit
> logic into the previous patch[1], or (2) we merge this patch with the
> previous one. IIUC, we can only advertise KVM_CAP_MEMORY_FAULT_INFO on
> arm64 if this logic is present.

Actually (sorry, about-face from our off-list chat), *does* it make
sense to merge these two patches? arm64 could benefit from annotated
EFAULTs even without KVM_CAP_EXIT_ON_MISSING: for instance if there
were spots outside the stage-2 handler if EFAULTs were annotated [a].
And if this patch was for some reason reverted in the future, then we
probably wouldn't want that to entail silencing any other
KVM_EXIT_MEMORY_FAULTs that arm64 might also get.

Sean did also ask me to merge some patches back on v5 [b], but I think
the point there was to enable KVM_CAP_EXIT_ON_MISSING at the same time
as adding the stage-2 annotation, which I'm doing here.

[a] Theoretically, anyways. Atm only x86 code uses
kvm_prepare_memory_fault_exit()
[b] https://lore.kernel.org/kvm/ZR4WzE1JOvq_0dhE@google.com/

> As for the changelog in the previous patch[1], if you leave it
> unmerged with this one, something like "Enable
> KVM_CAP_MEMORY_FAULT_INFO to make KVM_CAP_EXIT_ON_MISSING useful, as
> without it, userspace doesn't know which page(s) of memory it needs to
> fix" works for me.

For that previous patch, I updated the description to the following

> Advertise to arm64 userspaces that KVM_RUN may return annotated EFAULTs
> (see KVM_EXIT_MEMORY_FAULT). In fact KVM_RUN might already be annotating
> some EFAULTs, but this capability is necessary for userspace to know
> that.

Which I think makes more sense than sort of forward-declaring
KVM_CAP_EXIT_ON_MISSING on arm64.

On Tue, Jan 30, 2024 at 3:58 PM James Houghton <jthoughton@google.com> wrote:
>
> Also, I think we need to update the documentation for
> KVM_CAP_MEMORY_FAULT_INFO to say that it is available for arm64 now
> (just like you have done for KVM_CAP_EXIT_ON_MISSING).

Done, ty
Anish Moorthy Feb. 9, 2024, 1:21 a.m. UTC | #3
On Wed, Jan 31, 2024 at 2:38 PM Anish Moorthy <amoorthy@google.com> wrote:
>
> On Tue, Jan 30, 2024 at 3:58 PM James Houghton <jthoughton@google.com> wrote:
> >
> > I think that either (1) we move this kvm_prepare_memory_fault_exit
> > logic into the previous patch[1], or (2) we merge this patch with the
> > previous one. IIUC, we can only advertise KVM_CAP_MEMORY_FAULT_INFO on
> > arm64 if this logic is present.
>
> Actually (sorry, about-face from our off-list chat), *does* it make
> sense to merge these two patches?

As per [1]: yes, it makes sense to move the kvm_prepare_memory_fault_exit().

So for the next version, the description is also going to change a bit to

> KVM: arm64: Implement and advertise KVM_CAP_EXIT_ON_MISSING
>
> Prevent the stage-2 fault handler from faulting in pages when
> KVM_MEM_EXIT_ON_MISSING is set by allowing its  __gfn_to_pfn_memslot()
> call to check the memslot flag. This effects the delivery of stage-2
> faults as vCPU exits (see KVM_CAP_MEMORY_FAULT_INFO), which userspace
> can attempt to resolve without terminating the guest.

> Delivering stage-2 faults to userspace in this way sidesteps the
> significant scalabiliy issues associated with using userfaultfd for the
> same purpose.

[1] https://lore.kernel.org/kvm/ZcP_JHsMJUlvjAs1@linux.dev/#t
diff mbox series

Patch

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index fd87bbfbfdf2..67fcb9dbe855 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8068,7 +8068,7 @@  See KVM_EXIT_MEMORY_FAULT for more information.
 7.35 KVM_CAP_EXIT_ON_MISSING
 ----------------------------
 
-:Architectures: x86
+:Architectures: x86, arm64
 :Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
 
 The presence of this capability indicates that userspace may set the
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 1a777715199f..d6fae31f7e1a 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -43,6 +43,7 @@  menuconfig KVM
 	select GUEST_PERF_EVENTS if PERF_EVENTS
 	select INTERVAL_TREE
 	select XARRAY_MULTI
+        select HAVE_KVM_EXIT_ON_MISSING
 	help
 	  Support hosting virtualized guest machines.
 
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 13066a6fdfff..3b9fb80672ac 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1486,13 +1486,16 @@  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	mmap_read_unlock(current->mm);
 
 	pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
-				   write_fault, &writable, false, NULL);
+				   write_fault, &writable, true, NULL);
 	if (pfn == KVM_PFN_ERR_HWPOISON) {
 		kvm_send_hwpoison_signal(hva, vma_shift);
 		return 0;
 	}
-	if (is_error_noslot_pfn(pfn))
+	if (is_error_noslot_pfn(pfn)) {
+		kvm_prepare_memory_fault_exit(vcpu, gfn * PAGE_SIZE, PAGE_SIZE,
+					      write_fault, exec_fault, false);
 		return -EFAULT;
+	}
 
 	if (kvm_is_device_pfn(pfn)) {
 		/*