Message ID | 20230908222905.1321305-12-amoorthy@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Improve KVM + userfaultfd live migration via annotated memory faults. | expand |
On Fri, Sep 08, 2023, Anish Moorthy wrote: > The relevant __gfn_to_pfn_memslot() calls in __kvm_faultin_pfn() > already use MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE. --verbose > Signed-off-by: Anish Moorthy <amoorthy@google.com> > --- > Documentation/virt/kvm/api.rst | 2 +- > arch/x86/kvm/Kconfig | 1 + > 2 files changed, 2 insertions(+), 1 deletion(-) > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst > index c2eaacb6dc63..a74d721a18f6 100644 > --- a/Documentation/virt/kvm/api.rst > +++ b/Documentation/virt/kvm/api.rst > @@ -7788,7 +7788,7 @@ error/annotated fault. > 7.35 KVM_CAP_USERFAULT_ON_MISSING > --------------------------------- > > -:Architectures: None > +:Architectures: x86 > :Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP. > > The presence of this capability indicates that userspace may set the > diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig > index ed90f148140d..11d956f17a9d 100644 > --- a/arch/x86/kvm/Kconfig > +++ b/arch/x86/kvm/Kconfig > @@ -49,6 +49,7 @@ config KVM > select INTERVAL_TREE > select HAVE_KVM_PM_NOTIFIER if PM > select KVM_GENERIC_HARDWARE_ENABLING > + select HAVE_KVM_USERFAULT_ON_MISSING Hmm, I vote to squash this patch with KVM: x86: Annotate -EFAULTs from kvm_handle_error_pfn() or rather, squash that code into this patch. Ditto for the ARM patches. Since we're making KVM_EXIT_MEMORY_FAULT informational-only for flows that KVM isn't committing to supporting, I think it makes sense to enable the arch specific paths that *need* to return KVM_EXIT_MEMORY_FAULT at the same time as the feature that adds the requirement. E.g. without HAVE_KVM_USERFAULT_ON_MISSING support, per the contract we are creating, it would be totally fine for KVM to not use KVM_EXIT_MEMORY_FAULT for the page fault paths. And that obviously has to be the case since KVM_CAP_MEMORY_FAULT_INFO is introduced before the arch code is enabled. But as of this path, KVM *must* return KVM_EXIT_MEMORY_FAULT, so we should make it impossible to do a straight revert of that dependency. That should also help with the changelogs, e.g. it'll give you a prompt for why only kvm_handle_error_pfn() is getting treatment.
On Wed, Oct 4, 2023 at 6:52 PM Sean Christopherson <seanjc@google.com> wrote: > > On Fri, Sep 08, 2023, Anish Moorthy wrote: > > The relevant __gfn_to_pfn_memslot() calls in __kvm_faultin_pfn() > > already use MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE. > > --verbose Alright, how about the following? KVM: x86: Enable KVM_CAP_EXIT_ON_MISSING __gfn_to_pfn_memslot(), used by the stage-2 fault handler, already checks the memslot flag to avoid faulting in missing pages as required. Therefore, enabling the capability is just a matter of selecting the kconfig to allow the flag to actually be set. > Hmm, I vote to squash this patch with > > KVM: x86: Annotate -EFAULTs from kvm_handle_error_pfn() > > or rather, squash that code into this patch. Ditto for the ARM patches. > > Since we're making KVM_EXIT_MEMORY_FAULT informational-only for flows that KVM > isn't committing to supporting, I think it makes sense to enable the arch specific > paths that *need* to return KVM_EXIT_MEMORY_FAULT at the same time as the feature > that adds the requirement. > > E.g. without HAVE_KVM_USERFAULT_ON_MISSING support, per the contract we are creating, > it would be totally fine for KVM to not use KVM_EXIT_MEMORY_FAULT for the page > fault paths. And that obviously has to be the case since KVM_CAP_MEMORY_FAULT_INFO > is introduced before the arch code is enabled. > > But as of this path, KVM *must* return KVM_EXIT_MEMORY_FAULT, so we should make > it impossible to do a straight revert of that dependency. Should we really be concerned about people reverting the KVM_CAP_MEMORY_FAULT_INFO commit(s) in this way? I see what you're saying- but it also seems to me that KVM could add other things akin to KVM_CAP_EXIT_ON_MISSING in the future, and then we end up in the exact same situation. Sure the squash might make sense for the specific commits in the series, but there's a general issue that isn't really being solved. Maybe I'm just letting the better be the enemy of the best, but I do like the current separation/single-focus of the commits. That said, the squash is nbd and I can go ahead if you're not convinced.
On Wed, Nov 01, 2023, Anish Moorthy wrote: > On Wed, Oct 4, 2023 at 6:52 PM Sean Christopherson <seanjc@google.com> wrote: > > > > On Fri, Sep 08, 2023, Anish Moorthy wrote: > > > The relevant __gfn_to_pfn_memslot() calls in __kvm_faultin_pfn() > > > already use MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE. > > > > --verbose > > Alright, how about the following? > > KVM: x86: Enable KVM_CAP_EXIT_ON_MISSING > > __gfn_to_pfn_memslot(), used by the stage-2 fault handler, already > checks the memslot flag to avoid faulting in missing pages as required. > Therefore, enabling the capability is just a matter of selecting the kconfig > to allow the flag to actually be set. > > > Hmm, I vote to squash this patch with > > > > KVM: x86: Annotate -EFAULTs from kvm_handle_error_pfn() > > > > or rather, squash that code into this patch. Ditto for the ARM patches. > > > > Since we're making KVM_EXIT_MEMORY_FAULT informational-only for flows that KVM > > isn't committing to supporting, I think it makes sense to enable the arch specific > > paths that *need* to return KVM_EXIT_MEMORY_FAULT at the same time as the feature > > that adds the requirement. > > > > E.g. without HAVE_KVM_USERFAULT_ON_MISSING support, per the contract we are creating, > > it would be totally fine for KVM to not use KVM_EXIT_MEMORY_FAULT for the page > > fault paths. And that obviously has to be the case since KVM_CAP_MEMORY_FAULT_INFO > > is introduced before the arch code is enabled. > > > > But as of this path, KVM *must* return KVM_EXIT_MEMORY_FAULT, so we should make > > it impossible to do a straight revert of that dependency. > > Should we really be concerned about people reverting the > KVM_CAP_MEMORY_FAULT_INFO commit(s) in this way? Yes. A revert is highly unlikely, but possible. Keep in mind that with upstream KVM, there are a *lot* of end users that don't know the inner workings of KVM (or the kernel). When things break, the standard approach for such users is to bisect to find where things first broke, and then to try reverting the suspected commit to see if that fixes the problem. In the (again, highly unlikely) case that filling run->memory_fault breaks something, an unsuspecting user will bisect to that commit and revert. Then they're suddenly in a situation where they've unknowing broken KVM, and likely in a way that won't immediately fail. *Nothing* in either changelog gives any clue that there is a hard dependency. Even the slightly more verbose version above provides almost no indication of the real dependency, as it primarily talks about __gfn_to_pfn_memslot() and KVM_CAP_EXIT_ON_MISSING. And now that we've punted on trying annotate "everything", there's no sane way for the "Annotate -EFAULTs from kvm_handle_error_pfn()" changelog to communicate that it will have a hard dependency in the future. If the changelog says "this will be used by blah blah blah", then it raises the question of "well why isn't this added there?". And the patches are tiny. Having a final "enable and advertise XYZ" patch is relatively common for new features, but that's often because the enabling of the new feature is spread across multiple patches. E.g. see the LAM support sitting in "https://github.com/kvm-x86/linux.git lam". And in those cases, the hard dependency is always very obvious, e.g. if someone complained that reverting "Virtualize LAM for user pointer" but not "Advertise and enable LAM (user and supervisor)" caused problems, we'd be like, "well, yeah". > I see what you're> saying- but it also seems to me that KVM could add other > things akin to KVM_CAP_EXIT_ON_MISSING in the future, and then we end up in > the exact same situation. No, it's not the exact same situation. First and foremost, we *can't* solve that problem, because some future feature doesn't exist yet. Second, merging features into two different kernel releases creates a very different situation. Let's say this goes into kernel 6.9, and then some future feature lands in kernel 6.11 and has a hard dependency on the annotations. The odds of needing to revert a patch from 6.9 while upgrading to 6.11 are significnatly lower than the odds of needing to revert a patch from 6.9-rc1 between when rc1 is released and a user upgrades to 6.9. And users aren't stupid; they might not know the inner workings of KVM, but bisecting to a patch from 6.9 when upgrading to 6.11 would give them pause. Third, with the patches squashed, to revert the annotations, a person would also be reverting _this_ patch (because they'd be one and the same). At that point, they're no longer reverting a seemingly innocous "give userspace more info" commit, they're reverting something that clearly advertises a feature userspace, which again provides a clue that a straight revert might not be super safe. > Sure the squash might make sense for the specific commits in the series, but > there's a general issue that isn't really being solved. Patches within a series and future series are two very different things. > Maybe I'm just letting the better be the enemy of the best, The "best" isn't even possible, unless we never ship anything, ever. > but I do like the current separation/single-focus of the commits. Because you already know what the entire series does. Someone looking at this patch without already understanding the big picture is going to have no idea that these two patches are tightly coupled. And again, now that we've punted on annotating everything, I see zero reason to split the patches. Maybe you could argue it provides a new bisection point, but again the patches are _tiny_, and that same bisection point is effectively achieved by running with an "older" userspace, i.e. a userspace that doesn't utilize the KVM_MEM_EXIT_ON_MISSING, which is literally every userspace in existence right now. > That said, the squash is nbd and I can go ahead if you're not convinced.
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index c2eaacb6dc63..a74d721a18f6 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7788,7 +7788,7 @@ error/annotated fault. 7.35 KVM_CAP_USERFAULT_ON_MISSING --------------------------------- -:Architectures: None +:Architectures: x86 :Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP. The presence of this capability indicates that userspace may set the diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index ed90f148140d..11d956f17a9d 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -49,6 +49,7 @@ config KVM select INTERVAL_TREE select HAVE_KVM_PM_NOTIFIER if PM select KVM_GENERIC_HARDWARE_ENABLING + select HAVE_KVM_USERFAULT_ON_MISSING help Support hosting fully virtualized guest machines using hardware virtualization extensions. You will need a fairly recent
The relevant __gfn_to_pfn_memslot() calls in __kvm_faultin_pfn() already use MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE. Signed-off-by: Anish Moorthy <amoorthy@google.com> --- Documentation/virt/kvm/api.rst | 2 +- arch/x86/kvm/Kconfig | 1 + 2 files changed, 2 insertions(+), 1 deletion(-)