[v5,11/17] KVM: x86: Enable KVM_CAP_USERFAULT_ON_MISSING

Message ID	20230908222905.1321305-12-amoorthy@google.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <kvm-owner@vger.kernel.org> Date: Fri, 8 Sep 2023 22:28:58 +0000 In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com> Mime-Version: 1.0 References: <20230908222905.1321305-1-amoorthy@google.com> Message-ID: <20230908222905.1321305-12-amoorthy@google.com> Subject: [PATCH v5 11/17] KVM: x86: Enable KVM_CAP_USERFAULT_ON_MISSING From: Anish Moorthy <amoorthy@google.com> To: seanjc@google.com, oliver.upton@linux.dev, kvm@vger.kernel.org, kvmarm@lists.linux.dev Cc: pbonzini@redhat.com, maz@kernel.org, robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, ricarkol@google.com, axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com Content-Type: text/plain; charset="UTF-8" Precedence: bulk
Series	Improve KVM + userfaultfd live migration via annotated memory faults. \| expand [v5,00/17] Improve KVM + userfaultfd live migration via annotated memory faults. [v5,01/17] KVM: Clarify documentation of hva_to_pfn()'s 'atomic' parameter [v5,02/17] KVM: Add docstrings to __kvm_read/write_guest_page() [v5,03/17] KVM: Simplify error handling in __gfn_to_pfn_memslot() [v5,04/17] KVM: Add KVM_CAP_MEMORY_FAULT_INFO [v5,05/17] KVM: Annotate -EFAULTs from kvm_vcpu_read/write_guest_page() [v5,06/17] KVM: x86: Annotate -EFAULTs from kvm_handle_error_pfn() [v5,07/17] KVM: arm64: Annotate -EFAULT from user_mem_abort() [v5,08/17] KVM: Allow hva_pfn_fast() to resolve read faults [v5,09/17] KVM: Introduce KVM_CAP_USERFAULT_ON_MISSING without implementation [v5,10/17] KVM: Implement KVM_CAP_USERFAULT_ON_MISSING by atomizing __gfn_to_pfn_memslot() calls [v5,11/17] KVM: x86: Enable KVM_CAP_USERFAULT_ON_MISSING [v5,12/17] KVM: arm64: Enable KVM_CAP_USERFAULT_ON_MISSING [v5,13/17] KVM: selftests: Report per-vcpu demand paging rate from demand paging test [v5,14/17] KVM: selftests: Allow many vCPUs and reader threads per UFFD in demand paging test [v5,15/17] KVM: selftests: Use EPOLL in userfaultfd_util reader threads and signal errors via TEST_… [v5,16/17] KVM: selftests: Add memslot_flags parameter to memstress_create_vm() [v5,17/17] KVM: selftests: Handle memory fault exits in demand_paging_test

Message ID

20230908222905.1321305-12-amoorthy@google.com (mailing list archive)

State

New, archived

Headers

Date: Fri,  8 Sep 2023 22:28:58 +0000
In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com>
Mime-Version: 1.0
References: <20230908222905.1321305-1-amoorthy@google.com>
Message-ID: <20230908222905.1321305-12-amoorthy@google.com>
Subject: [PATCH v5 11/17] KVM: x86: Enable KVM_CAP_USERFAULT_ON_MISSING
From: Anish Moorthy <amoorthy@google.com>
To: seanjc@google.com, oliver.upton@linux.dev, kvm@vger.kernel.org,
        kvmarm@lists.linux.dev
Cc: pbonzini@redhat.com, maz@kernel.org, robert.hoo.linux@gmail.com,
        jthoughton@google.com, amoorthy@google.com, ricarkol@google.com,
        axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com,
        isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk

Series

Improve KVM + userfaultfd live migration via annotated memory faults. | expand

Commit Message

Anish Moorthy Sept. 8, 2023, 10:28 p.m. UTC

The relevant __gfn_to_pfn_memslot() calls in __kvm_faultin_pfn()
already use MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE.

Signed-off-by: Anish Moorthy <amoorthy@google.com>
---
 Documentation/virt/kvm/api.rst | 2 +-
 arch/x86/kvm/Kconfig           | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

Comments

Sean Christopherson Oct. 5, 2023, 1:52 a.m. UTC | #1

On Fri, Sep 08, 2023, Anish Moorthy wrote:
> The relevant __gfn_to_pfn_memslot() calls in __kvm_faultin_pfn()
> already use MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE.

--verbose

> Signed-off-by: Anish Moorthy <amoorthy@google.com>
> ---
>  Documentation/virt/kvm/api.rst | 2 +-
>  arch/x86/kvm/Kconfig           | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index c2eaacb6dc63..a74d721a18f6 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -7788,7 +7788,7 @@ error/annotated fault.
>  7.35 KVM_CAP_USERFAULT_ON_MISSING
>  ---------------------------------
>  
> -:Architectures: None
> +:Architectures: x86
>  :Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
>  
>  The presence of this capability indicates that userspace may set the
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index ed90f148140d..11d956f17a9d 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -49,6 +49,7 @@ config KVM
>  	select INTERVAL_TREE
>  	select HAVE_KVM_PM_NOTIFIER if PM
>  	select KVM_GENERIC_HARDWARE_ENABLING
> +        select HAVE_KVM_USERFAULT_ON_MISSING

Hmm, I vote to squash this patch with

  KVM: x86: Annotate -EFAULTs from kvm_handle_error_pfn()

or rather, squash that code into this patch.  Ditto for the ARM patches.

Since we're making KVM_EXIT_MEMORY_FAULT informational-only for flows that KVM
isn't committing to supporting, I think it makes sense to enable the arch specific
paths that *need* to return KVM_EXIT_MEMORY_FAULT at the same time as the feature
that adds the requirement.

E.g. without HAVE_KVM_USERFAULT_ON_MISSING support, per the contract we are creating,
it would be totally fine for KVM to not use KVM_EXIT_MEMORY_FAULT for the page
fault paths.  And that obviously has to be the case since KVM_CAP_MEMORY_FAULT_INFO
is introduced before the arch code is enabled.

But as of this path, KVM *must* return KVM_EXIT_MEMORY_FAULT, so we should make
it impossible to do a straight revert of that dependency.  That should also help
with the changelogs, e.g. it'll give you a prompt for why only kvm_handle_error_pfn()
is getting treatment.

Anish Moorthy Nov. 1, 2023, 10:55 p.m. UTC | #2

On Wed, Oct 4, 2023 at 6:52 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Sep 08, 2023, Anish Moorthy wrote:
> > The relevant __gfn_to_pfn_memslot() calls in __kvm_faultin_pfn()
> > already use MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE.
>
> --verbose

Alright, how about the following?

    KVM: x86: Enable KVM_CAP_EXIT_ON_MISSING

    __gfn_to_pfn_memslot(), used by the stage-2 fault handler, already
    checks the memslot flag to avoid faulting in missing pages as required.
    Therefore, enabling the capability is just a matter of selecting the kconfig
    to allow the flag to actually be set.

> Hmm, I vote to squash this patch with
>
>   KVM: x86: Annotate -EFAULTs from kvm_handle_error_pfn()
>
> or rather, squash that code into this patch.  Ditto for the ARM patches.
>
> Since we're making KVM_EXIT_MEMORY_FAULT informational-only for flows that KVM
> isn't committing to supporting, I think it makes sense to enable the arch specific
> paths that *need* to return KVM_EXIT_MEMORY_FAULT at the same time as the feature
> that adds the requirement.
>
> E.g. without HAVE_KVM_USERFAULT_ON_MISSING support, per the contract we are creating,
> it would be totally fine for KVM to not use KVM_EXIT_MEMORY_FAULT for the page
> fault paths.  And that obviously has to be the case since KVM_CAP_MEMORY_FAULT_INFO
> is introduced before the arch code is enabled.
>
> But as of this path, KVM *must* return KVM_EXIT_MEMORY_FAULT, so we should make
> it impossible to do a straight revert of that dependency.

Should we really be concerned about people reverting the
KVM_CAP_MEMORY_FAULT_INFO commit(s) in this way? I see what you're
saying- but it also seems to me that KVM could add other things akin
to KVM_CAP_EXIT_ON_MISSING in the future, and then we end up in the
exact same situation. Sure the squash might make sense for the
specific commits in the series, but there's a general issue that isn't
really being solved.

Maybe I'm just letting the better be the enemy of the best, but I do
like the current separation/single-focus of the commits. That said,
the squash is nbd and I can go ahead if you're not convinced.

Sean Christopherson Nov. 2, 2023, 2:31 p.m. UTC | #3

On Wed, Nov 01, 2023, Anish Moorthy wrote:
> On Wed, Oct 4, 2023 at 6:52 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Sep 08, 2023, Anish Moorthy wrote:
> > > The relevant __gfn_to_pfn_memslot() calls in __kvm_faultin_pfn()
> > > already use MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE.
> >
> > --verbose
> 
> Alright, how about the following?
> 
>     KVM: x86: Enable KVM_CAP_EXIT_ON_MISSING
> 
>     __gfn_to_pfn_memslot(), used by the stage-2 fault handler, already
>     checks the memslot flag to avoid faulting in missing pages as required.
>     Therefore, enabling the capability is just a matter of selecting the kconfig
>     to allow the flag to actually be set.
> 
> > Hmm, I vote to squash this patch with
> >
> >   KVM: x86: Annotate -EFAULTs from kvm_handle_error_pfn()
> >
> > or rather, squash that code into this patch.  Ditto for the ARM patches.
> >
> > Since we're making KVM_EXIT_MEMORY_FAULT informational-only for flows that KVM
> > isn't committing to supporting, I think it makes sense to enable the arch specific
> > paths that *need* to return KVM_EXIT_MEMORY_FAULT at the same time as the feature
> > that adds the requirement.
> >
> > E.g. without HAVE_KVM_USERFAULT_ON_MISSING support, per the contract we are creating,
> > it would be totally fine for KVM to not use KVM_EXIT_MEMORY_FAULT for the page
> > fault paths.  And that obviously has to be the case since KVM_CAP_MEMORY_FAULT_INFO
> > is introduced before the arch code is enabled.
> >
> > But as of this path, KVM *must* return KVM_EXIT_MEMORY_FAULT, so we should make
> > it impossible to do a straight revert of that dependency.
> 
> Should we really be concerned about people reverting the
> KVM_CAP_MEMORY_FAULT_INFO commit(s) in this way?

Yes.  A revert is highly unlikely, but possible.  Keep in mind that with upstream
KVM, there are a *lot* of end users that don't know the inner workings of KVM
(or the kernel).  When things break, the standard approach for such users is to
bisect to find where things first broke, and then to try reverting the suspected
commit to see if that fixes the problem.

In the (again, highly unlikely) case that filling run->memory_fault breaks something,
an unsuspecting user will bisect to that commit and revert.  Then they're suddenly
in a situation where they've unknowing broken KVM, and likely in a way that won't
immediately fail.

*Nothing* in either changelog gives any clue that there is a hard dependency.
Even the slightly more verbose version above provides almost no indication of the
real dependency, as it primarily talks about __gfn_to_pfn_memslot() and
KVM_CAP_EXIT_ON_MISSING.

And now that we've punted on trying annotate "everything", there's no sane way
for the "Annotate -EFAULTs from kvm_handle_error_pfn()" changelog to communicate
that it will have a hard dependency in the future.  If the changelog says "this
will be used by blah blah blah", then it raises the question of "well why isn't
this added there?".

And the patches are tiny.  Having a final "enable and advertise XYZ" patch is
relatively common for new features, but that's often because the enabling of the
new feature is spread across multiple patches.  E.g. see the LAM support sitting
in "https://github.com/kvm-x86/linux.git lam".  And in those cases, the hard
dependency is always very obvious, e.g. if someone complained that reverting
"Virtualize LAM for user pointer" but not "Advertise and enable LAM (user and
supervisor)" caused problems, we'd be like, "well, yeah".

> I see what you're> saying- but it also seems to me that KVM could add other
> things akin to KVM_CAP_EXIT_ON_MISSING in the future, and then we end up in
> the exact same situation.

No, it's not the exact same situation.

First and foremost, we *can't* solve that problem, because some future feature
doesn't exist yet.

Second, merging features into two different kernel releases creates a very different
situation.  Let's say this goes into kernel 6.9, and then some future feature
lands in kernel 6.11 and has a hard dependency on the annotations.  The odds of
needing to revert a patch from 6.9 while upgrading to 6.11 are significnatly lower
than the odds of needing to revert a patch from 6.9-rc1 between when rc1 is released
and a user upgrades to 6.9.  And users aren't stupid; they might not know the inner
workings of KVM, but bisecting to a patch from 6.9 when upgrading to 6.11 would
give them pause.

Third, with the patches squashed, to revert the annotations, a person would also
be reverting _this_ patch (because they'd be one and the same).  At that point,
they're no longer reverting a seemingly innocous "give userspace more info" commit,
they're reverting something that clearly advertises a feature userspace, which
again provides a clue that a straight revert might not be super safe.

> Sure the squash might make sense for the specific commits in the series, but
> there's a general issue that isn't really being solved.

Patches within a series and future series are two very different things.

> Maybe I'm just letting the better be the enemy of the best,

The "best" isn't even possible, unless we never ship anything, ever.

> but I do like the current separation/single-focus of the commits.

Because you already know what the entire series does.  Someone looking at this
patch without already understanding the big picture is going to have no idea that
these two patches are tightly coupled.

And again, now that we've punted on annotating everything, I see zero reason to
split the patches.  Maybe you could argue it provides a new bisection point, but
again the patches are _tiny_, and that same bisection point is effectively achieved
by running with an "older" userspace, i.e. a userspace that doesn't utilize the
KVM_MEM_EXIT_ON_MISSING, which is literally every userspace in existence right now.

> That said, the squash is nbd and I can go ahead if you're not convinced.

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index c2eaacb6dc63..a74d721a18f6 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7788,7 +7788,7 @@  error/annotated fault.
 7.35 KVM_CAP_USERFAULT_ON_MISSING
 ---------------------------------
 
-:Architectures: None
+:Architectures: x86
 :Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
 
 The presence of this capability indicates that userspace may set the
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index ed90f148140d..11d956f17a9d 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -49,6 +49,7 @@  config KVM
 	select INTERVAL_TREE
 	select HAVE_KVM_PM_NOTIFIER if PM
 	select KVM_GENERIC_HARDWARE_ENABLING
+        select HAVE_KVM_USERFAULT_ON_MISSING
 	help
 	  Support hosting fully virtualized guest machines using hardware
 	  virtualization extensions.  You will need a fairly recent