Message ID | 20240927161657.68110-2-iorlov@amazon.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Handle MMIO during event delivery error on SVM | expand |
"KVM: VMX:" for the scope. See "Shortlog" in Documentation/process/maintainer-kvm-x86.rst On Fri, Sep 27, 2024, Ivan Orlov wrote: > Extract KVM_INTERNAL_ERROR_DELIVERY_EV internal error generation into > the SVM/VMX-agnostic 'kvm_prepare_ev_delivery_failure_exit' function, as > it is done for KVM_INTERNAL_ERROR_EMULATION. Use the changelog to provide a human readable summary of the change. There are definitely situations where calling out functions, variables, defines, etc. by name is necessary, but this isn't one such situation. > The order of internal.data array entries is preserved as is, so it is going > to be the same on VMX platforms (vectoring info, full exit reason, exit > qualification, GPA if error happened due to MMIO and last_vmentry_cpu of the > vcpu). Similar to the above, let the code speak. The "No functional change intended" makes it clear that the intent is to preserve the order and behavior. > Having it as a separate function will help us to avoid code duplication Avoid pronouns as much as possible, and no "we" or "us" as a hard rule. E.g. this can all be distilled down to: -- Extract VMX's code for reporting an unhandleable VM-Exit during event delivery to userspace, so that the boilerplate code can be shared by SVM. No functional change intended. -- > when handling the MMIO during event delivery error on SVM. > > No functional change intended. > > Signed-off-by: Ivan Orlov <iorlov@amazon.com> > --- > arch/x86/include/asm/kvm_host.h | 2 ++ > arch/x86/kvm/vmx/vmx.c | 15 +++------------ > arch/x86/kvm/x86.c | 22 ++++++++++++++++++++++ > 3 files changed, 27 insertions(+), 12 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index 6d9f763a7bb9..348daba424dd 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -2060,6 +2060,8 @@ void __kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu, > u64 *data, u8 ndata); > void kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu); > > +void kvm_prepare_ev_delivery_failure_exit(struct kvm_vcpu *vcpu, gpa_t gpa, bool is_mmio); Please wrap at 80 columns. While checkpatch doesn't complaing until 100, my preference is to default to wrapping at 80, and poking past 80 only when it yields more readable code (which is obviously subjective, but it shouldn't be too hard to figure out KVM x86's preferred style). > void kvm_enable_efer_bits(u64); > bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer); > int kvm_get_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 *data); > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index c67e448c6ebd..afd785e7f3a3 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -6550,19 +6550,10 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath) > exit_reason.basic != EXIT_REASON_APIC_ACCESS && > exit_reason.basic != EXIT_REASON_TASK_SWITCH && > exit_reason.basic != EXIT_REASON_NOTIFY)) { > - int ndata = 3; > + gpa_t gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS); > + bool is_mmio = exit_reason.basic == EXIT_REASON_EPT_MISCONFIG; There's no need for is_mmio, just pass INVALID_GPA when the GPA isn't known. > - vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR; > - vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_DELIVERY_EV; > - vcpu->run->internal.data[0] = vectoring_info; > - vcpu->run->internal.data[1] = exit_reason.full; > - vcpu->run->internal.data[2] = vmx_get_exit_qual(vcpu); > - if (exit_reason.basic == EXIT_REASON_EPT_MISCONFIG) { > - vcpu->run->internal.data[ndata++] = > - vmcs_read64(GUEST_PHYSICAL_ADDRESS); > - } > - vcpu->run->internal.data[ndata++] = vcpu->arch.last_vmentry_cpu; > - vcpu->run->internal.ndata = ndata; > + kvm_prepare_ev_delivery_failure_exit(vcpu, gpa, is_mmio); > return 0; > } > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 83fe0a78146f..8ee67fc23e5d 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -8828,6 +8828,28 @@ void kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu) > } > EXPORT_SYMBOL_GPL(kvm_prepare_emulation_failure_exit); > > +void kvm_prepare_ev_delivery_failure_exit(struct kvm_vcpu *vcpu, gpa_t gpa, bool is_mmio) Hmm, I don't love the name. I really don't like that event is abbreviated, and I suspect many readers will be misinterpret "event delivery failure" to mean that _KVM_ failed to deliver an event. Which is kinda sorta true, but it's more accurate to say that the CPU triggered a VM-Exit when vectoring/delivery an event, and KVM doesn't have code to robustly handle the situation. Maybe kvm_prepare_event_vectoring_exit()? Vectoring is quite specific in Intel terminology. > +{ > + struct kvm_run *run = vcpu->run; > + int ndata = 0; > + u32 reason, intr_info, error_code; > + u64 info1, info2; Reverse fir/x-mas tree for variables. See "Coding Style" in Documentation/process/maintainer-kvm-x86.rst (which will redirect you to Documentation/process/maintainer-tip.rst, specifically "Variable declarations"). > + > + kvm_x86_call(get_exit_info)(vcpu, &reason, &info1, &info2, &intr_info, &error_code); Wrap. Though calling back into vendor code is silly. Pass the necessary info as parameters. E.g. error_code and intr_info are unused, so the above is wasteful and weird. > + > + run->internal.data[ndata++] = info2; > + run->internal.data[ndata++] = reason; > + run->internal.data[ndata++] = info1; > + if (is_mmio) And this is where keying off MMIO gets weird. > + run->internal.data[ndata++] = (u64)gpa; > + run->internal.data[ndata++] = vcpu->arch.last_vmentry_cpu; > + > + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; > + run->internal.suberror = KVM_INTERNAL_ERROR_DELIVERY_EV; > + run->internal.ndata = ndata; > +} > +EXPORT_SYMBOL_GPL(kvm_prepare_ev_delivery_failure_exit); > + > static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type) > { > struct kvm *kvm = vcpu->kvm; > -- > 2.43.0 >
On Fri, Oct 11, 2024, Sean Christopherson wrote: > > + kvm_x86_call(get_exit_info)(vcpu, &reason, &info1, &info2, &intr_info, &error_code); > > Wrap. Though calling back into vendor code is silly. Pass the necessary info > as parameters. E.g. error_code and intr_info are unused, so the above is wasteful > and weird. Ah, but the next patch invokes this from common code, i.e. can't pass in the necessary info. _That_ is definitely worth calling out in the changelog.
Hi Sean, On Fri, Oct 11, 2024 at 04:20:46PM -0700, Sean Christopherson wrote: > "KVM: VMX:" for the scope. See "Shortlog" in Documentation/process/maintainer-kvm-x86.rst > Ah, will update in the next version, thanks! > On Fri, Sep 27, 2024, Ivan Orlov wrote: > > Extract KVM_INTERNAL_ERROR_DELIVERY_EV internal error generation into > > the SVM/VMX-agnostic 'kvm_prepare_ev_delivery_failure_exit' function, as > > it is done for KVM_INTERNAL_ERROR_EMULATION. > > Use the changelog to provide a human readable summary of the change. There are > definitely situations where calling out functions, variables, defines, etc. by > name is necessary, but this isn't one such situation. > > > The order of internal.data array entries is preserved as is, so it is going > > to be the same on VMX platforms (vectoring info, full exit reason, exit > > qualification, GPA if error happened due to MMIO and last_vmentry_cpu of the > > vcpu). > > Similar to the above, let the code speak. The "No functional change intended" > makes it clear that the intent is to preserve the order and behavior. > > > Having it as a separate function will help us to avoid code duplication > > Avoid pronouns as much as possible, and no "we" or "us" as a hard rule. E.g. this > can all be distilled down to: > Yeah, makes sense. Will reformulate the message in the next version to consider all of the changes you suggested. > -- > Extract VMX's code for reporting an unhandleable VM-Exit during event > delivery to userspace, so that the boilerplate code can be shared by SVM. > > No functional change intended. > -- > Awesome, thanks for the example! > > Please wrap at 80 columns. While checkpatch doesn't complaing until 100, my > preference is to default to wrapping at 80, and poking past 80 only when it yields > more readable code (which is obviously subjective, but it shouldn't be too hard > to figure out KVM x86's preferred style). > Alright, will do, thanks! These rules vary from one subsystem to another, and I'll try to keep the style consistent in the future. > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > > index c67e448c6ebd..afd785e7f3a3 100644 > > --- a/arch/x86/kvm/vmx/vmx.c > > +++ b/arch/x86/kvm/vmx/vmx.c > > @@ -6550,19 +6550,10 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath) > > exit_reason.basic != EXIT_REASON_APIC_ACCESS && > > exit_reason.basic != EXIT_REASON_TASK_SWITCH && > > exit_reason.basic != EXIT_REASON_NOTIFY)) { > > - int ndata = 3; > > + gpa_t gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS); > > + bool is_mmio = exit_reason.basic == EXIT_REASON_EPT_MISCONFIG; > > There's no need for is_mmio, just pass INVALID_GPA when the GPA isn't known. > Ah alright, then we definitely don't need an is_mmio field. I assume we can't do MMIO at GPA=0, right? > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index 83fe0a78146f..8ee67fc23e5d 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -8828,6 +8828,28 @@ void kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu) > > } > > EXPORT_SYMBOL_GPL(kvm_prepare_emulation_failure_exit); > > > > +void kvm_prepare_ev_delivery_failure_exit(struct kvm_vcpu *vcpu, gpa_t gpa, bool is_mmio) > > Hmm, I don't love the name. I really don't like that event is abbreviated, and > I suspect many readers will be misinterpret "event delivery failure" to mean that > _KVM_ failed to deliver an event. Which is kinda sorta true, but it's more > accurate to say that the CPU triggered a VM-Exit when vectoring/delivery an event, > and KVM doesn't have code to robustly handle the situation. > > Maybe kvm_prepare_event_vectoring_exit()? Vectoring is quite specific in Intel > terminology. > Yep, sounds good, I like that the name you suggested doesn't contain 'failure' part as essentially it is not a failure but an MMIO exit. Will update in V2. > > +{ > > + struct kvm_run *run = vcpu->run; > > + int ndata = 0; > > + u32 reason, intr_info, error_code; > > + u64 info1, info2; > > Reverse fir/x-mas tree for variables. See "Coding Style" in > Documentation/process/maintainer-kvm-x86.rst (which will redirect you to > Documentation/process/maintainer-tip.rst, specifically "Variable declarations"). > Great, didn't know about that, thanks! > > + > > + kvm_x86_call(get_exit_info)(vcpu, &reason, &info1, &info2, &intr_info, &error_code); > > Wrap. Though calling back into vendor code is silly. Pass the necessary info > as parameters. E.g. error_code and intr_info are unused, so the above is wasteful > and weird. > I use it here as this function gets called from the common for svm/vmx code in the next patch, but as I can see from the next email you've already noticed that :) > > + > > + run->internal.data[ndata++] = info2; > > + run->internal.data[ndata++] = reason; > > + run->internal.data[ndata++] = info1; > > + if (is_mmio) > > And this is where keying off MMIO gets weird. > We still need to exclude one of the data elements when GPA is not known to be backwards compatible, so we can get rid of the `is_mmio` argument, but not from this `if` (unfortunately). Thank you so much for the review! Kind regards, Ivan Orlov
On Tue, Oct 15, 2024, Ivan Orlov wrote: > > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > > > index c67e448c6ebd..afd785e7f3a3 100644 > > > --- a/arch/x86/kvm/vmx/vmx.c > > > +++ b/arch/x86/kvm/vmx/vmx.c > > > @@ -6550,19 +6550,10 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath) > > > exit_reason.basic != EXIT_REASON_APIC_ACCESS && > > > exit_reason.basic != EXIT_REASON_TASK_SWITCH && > > > exit_reason.basic != EXIT_REASON_NOTIFY)) { > > > - int ndata = 3; > > > + gpa_t gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS); > > > + bool is_mmio = exit_reason.basic == EXIT_REASON_EPT_MISCONFIG; > > > > There's no need for is_mmio, just pass INVALID_GPA when the GPA isn't known. > > Ah alright, then we definitely don't need an is_mmio field. I assume we > can't do MMIO at GPA=0, right? Wrong :-) From an architectural perspective, GPA=0 is not special in any way. E.g. prior to L1TF, Linux would happily use the page with PFN=0.
On 10/16/24 22:05, Sean Christopherson wrote: > On Tue, Oct 15, 2024, Ivan Orlov wrote: >>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c >>>> index c67e448c6ebd..afd785e7f3a3 100644 >>>> --- a/arch/x86/kvm/vmx/vmx.c >>>> +++ b/arch/x86/kvm/vmx/vmx.c >>>> @@ -6550,19 +6550,10 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath) >>>> exit_reason.basic != EXIT_REASON_APIC_ACCESS && >>>> exit_reason.basic != EXIT_REASON_TASK_SWITCH && >>>> exit_reason.basic != EXIT_REASON_NOTIFY)) { >>>> - int ndata = 3; >>>> + gpa_t gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS); >>>> + bool is_mmio = exit_reason.basic == EXIT_REASON_EPT_MISCONFIG; >>> >>> There's no need for is_mmio, just pass INVALID_GPA when the GPA isn't known. >> >> Ah alright, then we definitely don't need an is_mmio field. I assume we >> can't do MMIO at GPA=0, right? > > Wrong :-) > Then getting rid of `is_mmio` will make distinguishing between vectoring error due to MMIO with GPA=0 and non-mmio vectoring error quite hard for the error reporti Passing INVALID_GPA into the userspace due to non-mmio vectoring error will change the existing internal.data order, but I can do it if it's fine. Sorry for nitpicking :) > From an architectural perspective, GPA=0 is not special in any way. E.g. prior > to L1TF, Linux would happily use the page with PFN=0. > Cool, didn't know about this vulnerability... Thanks for the explanation!
On Wed, Oct 16, 2024, Ivan Orlov wrote: > On 10/16/24 22:05, Sean Christopherson wrote: > > On Tue, Oct 15, 2024, Ivan Orlov wrote: > > > > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > > > > > index c67e448c6ebd..afd785e7f3a3 100644 > > > > > --- a/arch/x86/kvm/vmx/vmx.c > > > > > +++ b/arch/x86/kvm/vmx/vmx.c > > > > > @@ -6550,19 +6550,10 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath) > > > > > exit_reason.basic != EXIT_REASON_APIC_ACCESS && > > > > > exit_reason.basic != EXIT_REASON_TASK_SWITCH && > > > > > exit_reason.basic != EXIT_REASON_NOTIFY)) { > > > > > - int ndata = 3; > > > > > + gpa_t gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS); > > > > > + bool is_mmio = exit_reason.basic == EXIT_REASON_EPT_MISCONFIG; > > > > > > > > There's no need for is_mmio, just pass INVALID_GPA when the GPA isn't known. > > > > > > Ah alright, then we definitely don't need an is_mmio field. I assume we > > > can't do MMIO at GPA=0, right? > > > > Wrong :-) > > > > Then getting rid of `is_mmio` will make distinguishing between vectoring > error due to MMIO with GPA=0 and non-mmio vectoring error quite hard for the > error reporti > > Passing INVALID_GPA into the userspace due to non-mmio vectoring error will > change the existing internal.data order, but I can do it if it's fine. Sorry > for nitpicking :) KVM's existing ABI is rather awful, though arguably the intent was that there is no ABI, i.e. that KVM is dumping info to try to be helpful. E.g. the existing behavior is that data[3] contains a GPA only for EPT_MISCONFIG, but for everything else, data[3] contains last_vmentry_cpu. And because it's so awful, I doubt any userspace actually has code that acts on the layout of data[]. So, I suspect we can do the simple and sane thing, and fill data[3] with -1ull if the GPA is invalid, and then document that that's the behavior (if we're feeling generous).
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 6d9f763a7bb9..348daba424dd 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -2060,6 +2060,8 @@ void __kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu, u64 *data, u8 ndata); void kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu); +void kvm_prepare_ev_delivery_failure_exit(struct kvm_vcpu *vcpu, gpa_t gpa, bool is_mmio); + void kvm_enable_efer_bits(u64); bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer); int kvm_get_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 *data); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index c67e448c6ebd..afd785e7f3a3 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6550,19 +6550,10 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath) exit_reason.basic != EXIT_REASON_APIC_ACCESS && exit_reason.basic != EXIT_REASON_TASK_SWITCH && exit_reason.basic != EXIT_REASON_NOTIFY)) { - int ndata = 3; + gpa_t gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS); + bool is_mmio = exit_reason.basic == EXIT_REASON_EPT_MISCONFIG; - vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR; - vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_DELIVERY_EV; - vcpu->run->internal.data[0] = vectoring_info; - vcpu->run->internal.data[1] = exit_reason.full; - vcpu->run->internal.data[2] = vmx_get_exit_qual(vcpu); - if (exit_reason.basic == EXIT_REASON_EPT_MISCONFIG) { - vcpu->run->internal.data[ndata++] = - vmcs_read64(GUEST_PHYSICAL_ADDRESS); - } - vcpu->run->internal.data[ndata++] = vcpu->arch.last_vmentry_cpu; - vcpu->run->internal.ndata = ndata; + kvm_prepare_ev_delivery_failure_exit(vcpu, gpa, is_mmio); return 0; } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 83fe0a78146f..8ee67fc23e5d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8828,6 +8828,28 @@ void kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu) } EXPORT_SYMBOL_GPL(kvm_prepare_emulation_failure_exit); +void kvm_prepare_ev_delivery_failure_exit(struct kvm_vcpu *vcpu, gpa_t gpa, bool is_mmio) +{ + struct kvm_run *run = vcpu->run; + int ndata = 0; + u32 reason, intr_info, error_code; + u64 info1, info2; + + kvm_x86_call(get_exit_info)(vcpu, &reason, &info1, &info2, &intr_info, &error_code); + + run->internal.data[ndata++] = info2; + run->internal.data[ndata++] = reason; + run->internal.data[ndata++] = info1; + if (is_mmio) + run->internal.data[ndata++] = (u64)gpa; + run->internal.data[ndata++] = vcpu->arch.last_vmentry_cpu; + + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + run->internal.suberror = KVM_INTERNAL_ERROR_DELIVERY_EV; + run->internal.ndata = ndata; +} +EXPORT_SYMBOL_GPL(kvm_prepare_ev_delivery_failure_exit); + static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type) { struct kvm *kvm = vcpu->kvm;
Extract KVM_INTERNAL_ERROR_DELIVERY_EV internal error generation into the SVM/VMX-agnostic 'kvm_prepare_ev_delivery_failure_exit' function, as it is done for KVM_INTERNAL_ERROR_EMULATION. The order of internal.data array entries is preserved as is, so it is going to be the same on VMX platforms (vectoring info, full exit reason, exit qualification, GPA if error happened due to MMIO and last_vmentry_cpu of the vcpu). Having it as a separate function will help us to avoid code duplication when handling the MMIO during event delivery error on SVM. No functional change intended. Signed-off-by: Ivan Orlov <iorlov@amazon.com> --- arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/kvm/vmx/vmx.c | 15 +++------------ arch/x86/kvm/x86.c | 22 ++++++++++++++++++++++ 3 files changed, 27 insertions(+), 12 deletions(-)