Message ID: 20190920212509.2578-16-aarcange@redhat.com (mailing list archive)
State: New, archived
Series: KVM monolithic v1
Andrea Arcangeli <aarcange@redhat.com> writes:

> It's enough to check the exit value and issue a direct call to avoid
> the retpoline for all the common vmexit reasons.
>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index a6e597025011..9aa73e216df2 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -5866,9 +5866,29 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
>  	}
>
>  	if (exit_reason < kvm_vmx_max_exit_handlers
> -	    && kvm_vmx_exit_handlers[exit_reason])
> +	    && kvm_vmx_exit_handlers[exit_reason]) {
> +#ifdef CONFIG_RETPOLINE
> +		if (exit_reason == EXIT_REASON_MSR_WRITE)
> +			return handle_wrmsr(vcpu);
> +		else if (exit_reason == EXIT_REASON_PREEMPTION_TIMER)
> +			return handle_preemption_timer(vcpu);
> +		else if (exit_reason == EXIT_REASON_PENDING_INTERRUPT)
> +			return handle_interrupt_window(vcpu);
> +		else if (exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
> +			return handle_external_interrupt(vcpu);
> +		else if (exit_reason == EXIT_REASON_HLT)
> +			return handle_halt(vcpu);
> +		else if (exit_reason == EXIT_REASON_PAUSE_INSTRUCTION)
> +			return handle_pause(vcpu);
> +		else if (exit_reason == EXIT_REASON_MSR_READ)
> +			return handle_rdmsr(vcpu);
> +		else if (exit_reason == EXIT_REASON_CPUID)
> +			return handle_cpuid(vcpu);
> +		else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
> +			return handle_ept_misconfig(vcpu);
> +#endif
>  		return kvm_vmx_exit_handlers[exit_reason](vcpu);

I agree with the identified set of most common vmexits; however, this
still looks a bit random. Would it be too much if we get rid of
kvm_vmx_exit_handlers completely, replacing this code with one switch()?

> -	else {
> +	} else {
>  		vcpu_unimpl(vcpu, "vmx: unexpected exit reason 0x%x\n",
>  			    exit_reason);
>  		dump_vmcs();
On 23/09/19 11:31, Vitaly Kuznetsov wrote:
> +#ifdef CONFIG_RETPOLINE
> +	if (exit_reason == EXIT_REASON_MSR_WRITE)
> +		return handle_wrmsr(vcpu);
[...]
> +	else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
> +		return handle_ept_misconfig(vcpu);
> +#endif
> 	return kvm_vmx_exit_handlers[exit_reason](vcpu);

Most of these, while frequent, are already part of slow paths.

I would keep only EXIT_REASON_MSR_WRITE, EXIT_REASON_PREEMPTION_TIMER,
EXIT_REASON_EPT_MISCONFIG and add EXIT_REASON_IO_INSTRUCTION.

If you make kvm_vmx_exit_handlers const, can the compiler substitute for
instance kvm_vmx_exit_handlers[EXIT_REASON_MSR_WRITE] with handle_wrmsr?
Just thinking out loud, not sure if it's an improvement code-wise.

Paolo
Subject should be something like:

  KVM: VMX: Make direct calls to fast path VM-Exit handlers

On Fri, Sep 20, 2019 at 05:25:07PM -0400, Andrea Arcangeli wrote:
> It's enough to check the exit value and issue a direct call to avoid
> the retpoline for all the common vmexit reasons.
>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
[...]
> 	if (exit_reason < kvm_vmx_max_exit_handlers
> -	    && kvm_vmx_exit_handlers[exit_reason])
> +	    && kvm_vmx_exit_handlers[exit_reason]) {
> +#ifdef CONFIG_RETPOLINE

I'd strongly prefer to make any optimization of this type unconditional,
i.e. not dependent on CONFIG_RETPOLINE.  Today, I'm comfortable testing
KVM only with CONFIG_RETPOLINE=y since the only KVM-specific difference
is additive code in vmx_vmexit.  That would no longer be the case if KVM
has non-trivial code differences for retpoline.

> +	if (exit_reason == EXIT_REASON_MSR_WRITE)
> +		return handle_wrmsr(vcpu);
[...]
> +	else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
> +		return handle_ept_misconfig(vcpu);
> +#endif

This can be hoisted above the if statement.

> 	return kvm_vmx_exit_handlers[exit_reason](vcpu);
> -	else {
> +	} else {
> 		vcpu_unimpl(vcpu, "vmx: unexpected exit reason 0x%x\n",
> 			    exit_reason);
> 		dump_vmcs();
On Mon, Sep 23, 2019 at 11:31:58AM +0200, Vitaly Kuznetsov wrote:
> Andrea Arcangeli <aarcange@redhat.com> writes:
>
> > It's enough to check the exit value and issue a direct call to avoid
> > the retpoline for all the common vmexit reasons.
[...]
> > +	else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
> > +		return handle_ept_misconfig(vcpu);
> > +#endif
> > 	return kvm_vmx_exit_handlers[exit_reason](vcpu);
>
> I agree with the identified set of most common vmexits, however, this
> still looks a bit random. Would it be too much if we get rid of
> kvm_vmx_exit_handlers completely replacing this code with one switch()?

Hmm, that'd require redirects for nVMX functions since they are set at
runtime.  That isn't necessarily a bad thing.  The approach could also be
used if Paolo's idea of making kvm_vmx_max_exit_handlers const allows the
compiler to avoid retpoline.  E.g.:

static int handle_vmx_instruction(struct kvm_vcpu *vcpu)
{
	if (nested)
		return nested_vmx_handle_exit(vcpu);

	kvm_queue_exception(vcpu, UD_VECTOR);
	return 1;
}
On 23/09/19 18:37, Sean Christopherson wrote:
>> Would it be too much if we get rid of
>> kvm_vmx_exit_handlers completely replacing this code with one switch()?
> Hmm, that'd require redirects for nVMX functions since they are set at
> runtime.  That isn't necessarily a bad thing.  The approach could also be
> used if Paolo's idea of making kvm_vmx_max_exit_handlers const allows the
> compiler to avoid retpoline.

But aren't switch statements also retpoline-ized if they use a jump table?

Paolo
On Mon, Sep 23, 2019 at 06:53:10PM +0200, Paolo Bonzini wrote:
> But aren't switch statements also retpolin-ized if they use a jump table?

See commit a9d57ef15cbe327fe54416dd194ee0ea66ae53a4.

We disabled jump table generation for retpoline builds, or every switch
statement in the kernel would potentially suffer the same downsides as
the exit handlers dispatched through pointers to functions.

In turn, converting my "if" to a "switch" can't make it run any faster;
at least with the "if" we can deterministically control the order of
the checks by likelihood (an ordering we should also re-review). That's
a secondary effect, though: the important thing is to reduce the
retpolines to zero during normal hrtimer guest runtime.
On Mon, Sep 23, 2019 at 01:42:44PM -0400, Andrea Arcangeli wrote:
> On Mon, Sep 23, 2019 at 06:53:10PM +0200, Paolo Bonzini wrote:
> > But aren't switch statements also retpolin-ized if they use a jump table?
>
> See commit a9d57ef15cbe327fe54416dd194ee0ea66ae53a4.
>
> We disabled that feature or the kernel would potentially suffer the
> downsides of the exit handlers through pointer to functions for every
> switch statement in the kernel.
[...]

On the flip side, using a switch for the fast-path handlers gives the
compiler more flexibility to rearrange and combine checks.  Of course that
doesn't mean the compiler will actually generate faster code for our
purposes :-)

Anyways, getting rid of the retpolines is much more important.
On Mon, Sep 23, 2019 at 11:57:57AM +0200, Paolo Bonzini wrote:
> On 23/09/19 11:31, Vitaly Kuznetsov wrote:
> > +#ifdef CONFIG_RETPOLINE
> > +	if (exit_reason == EXIT_REASON_MSR_WRITE)
> > +		return handle_wrmsr(vcpu);
[...]
> > +#endif
> > 	return kvm_vmx_exit_handlers[exit_reason](vcpu);
>
> Most of these, while frequent, are already part of slow paths.
>
> I would keep only EXIT_REASON_MSR_WRITE, EXIT_REASON_PREEMPTION_TIMER,
> EXIT_REASON_EPT_MISCONFIG and add EXIT_REASON_IO_INSTRUCTION.

Intuition doesn't work great when it comes to CPU speculative
execution runtime. I can however run additional benchmarks to verify
your theory that keeping around frequent retpolines will still perform
ok.

> If you make kvm_vmx_exit_handlers const, can the compiler substitute for
> instance kvm_vmx_exit_handlers[EXIT_REASON_MSR_WRITE] with handle_wrmsr?
> Just thinking out loud, not sure if it's an improvement code-wise.

gcc gets it right if you make it const, it calls kvm_emulate_wrmsr in
fact. However I don't think const will fly with vmx_hardware_setup()...
in fact at runtime testing nested I just got:

BUG: unable to handle page fault for address: ffffffffa00751e0
#PF: supervisor write access in kernel mode
#PF: error_code(0x0003) - permissions violation
PGD 2424067 P4D 2424067 PUD 2425063 PMD 7cc09067 PTE 80000000741cb161
Oops: 0003 [#1] SMP NOPTI
CPU: 1 PID: 4458 Comm: insmod Not tainted 5.3.0+ #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20190711_202441-buildvm-armv7-10.arm.fedoraproject.or4
RIP: 0010:nested_vmx_hardware_setup+0x29a/0x37a [kvm_intel]
Code: 41 ff c5 66 89 2c 85 20 92 0b a0 66 44 89 34 85 22 92 0b a0 49 ff c7 e9 e6 fe ff ff 44 89 2d 28 24 fc ff 48
RSP: 0018:ffffc90000257c18 EFLAGS: 00010246
RAX: ffffffffa001e0b0 RBX: ffffffffa0075140 RCX: 0000000000000000
RDX: ffff888078f60000 RSI: 0000000000002401 RDI: 0000000000000018
RBP: 0000000000006c08 R08: 0000000000001000 R09: 000000000007ffdc
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000006c08
R13: 0000000000000017 R14: 0000000000000268 R15: 0000000000000018
FS:  00007f7fb7ef0b80(0000) GS:ffff88807da40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa00751e0 CR3: 0000000079620001 CR4: 0000000000160ee0
Call Trace:
 hardware_setup+0x4df/0x5b2 [kvm_intel]
 kvm_arch_hardware_setup+0x2f/0x27b [kvm_intel]
 kvm_init+0x5d/0x26d [kvm_intel]
On Mon, Sep 23, 2019 at 11:15:58AM -0700, Sean Christopherson wrote:
> On the flip side, using a switch for the fast-path handlers gives the
> compiler more flexibility to rearrange and combine checks.  Of course that
> doesn't mean the compiler will actually generate faster code for our
> purposes :-)
>
> Anyways, getting rid of the retpolines is much more important.

Precisely because of your last point, if we throw away the deterministic
priority, then we could drop the whole structure like Vitaly suggested
and we'd rely on gcc to add the indirect jump on a non-retpoline build.

Solving the nested case, if we drop the structure and we don't pretend
to make it const, isn't tricky: it only requires one more check if
nested is enabled. The same variable that will have to be checked is
also the variable that needs to be checked in the
kvm_x86_ops->check_nested_events replacement later, to drop the
kvm_x86_ops struct as a whole like kvm_pmu_ops was dropped clean.
On Mon, Sep 23, 2019 at 03:05:14PM -0400, Andrea Arcangeli wrote:
> On Mon, Sep 23, 2019 at 11:57:57AM +0200, Paolo Bonzini wrote:
> > Most of these, while frequent, are already part of slow paths.
> >
> > I would keep only EXIT_REASON_MSR_WRITE, EXIT_REASON_PREEMPTION_TIMER,
> > EXIT_REASON_EPT_MISCONFIG and add EXIT_REASON_IO_INSTRUCTION.
>
> Intuition doesn't work great when it comes to CPU speculative
> execution runtime. I can however run additional benchmarks to verify
> your theory that keeping around frequent retpolines will still perform
> ok.
>
> > If you make kvm_vmx_exit_handlers const, can the compiler substitute for
> > instance kvm_vmx_exit_handlers[EXIT_REASON_MSR_WRITE] with handle_wrmsr?
> > Just thinking out loud, not sure if it's an improvement code-wise.
>
> gcc gets right if you make it const, it calls kvm_emulate_wrmsr in
> fact. However I don't think const will fly with vmx_hardware_setup()...
> in fact at runtime testing nested I just got:
>
> BUG: unable to handle page fault for address: ffffffffa00751e0
> #PF: supervisor write access in kernel mode
> #PF: error_code(0x0003) - permissions violation
[...]
> RIP: 0010:nested_vmx_hardware_setup+0x29a/0x37a [kvm_intel]
[...]
> Call Trace:
>  hardware_setup+0x4df/0x5b2 [kvm_intel]
>  kvm_arch_hardware_setup+0x2f/0x27b [kvm_intel]
>  kvm_init+0x5d/0x26d [kvm_intel]

The attached patch should do the trick.
Hello,
On Mon, Sep 23, 2019 at 01:23:49PM -0700, Sean Christopherson wrote:
> The attached patch should do the trick.
The two most attractive options to me remain what I already have
implemented under #ifdef CONFIG_RETPOLINE with direct calls
(optionally replacing the "if" with a small "switch" still under
CONFIG_RETPOLINE if we give up the prioritization of the checks), or
the replacement of kvm_vmx_exit_handlers with a switch() as suggested
by Vitaly which would cleanup some code.
The intermediate solution that makes "const" work has the drawback of
parsing EXIT_REASON_VMCLEAR and the other vmx exit reasons twice: first
through a pointer to function (or another if or switch statement), then
with a second switch() statement.
If we used a single switch statement per Vitaly's suggestion, the "if
nested" would be more simply implemented as:
	switch (exit_reason) {
	case EXIT_REASON_VMCLEAR:
		if (nested)
			return handle_vmclear(vcpu);
		else
			return handle_vmx_instruction(vcpu);
	case EXIT_REASON_VMLAUNCH:
		if (nested)
	[..]
This also removes the dependency on the compiler auto-inlining
handle_vmclear into the added nested_vmx_handle_vmx_instruction extern
call.
VMREAD/VMWRITE/VMRESUME are the most frequent vmexits in L0 while
nested runs in L2.
Thanks,
Andrea
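To make the switch sketch above concrete, here is a minimal, compilable
userspace mock of the single-switch dispatch with the "if (nested)"
check folded into each VMX-instruction case. The struct, handler bodies
and return values are stand-ins invented for illustration; only the
exit reason values match the real arch/x86/include/uapi/asm/vmx.h:

```c
#include <assert.h>

/* Stand-ins for the KVM types; the exit reason values match vmx.h. */
enum {
	EXIT_REASON_CPUID    = 10,
	EXIT_REASON_VMCLEAR  = 19,
	EXIT_REASON_VMLAUNCH = 20,
};

struct kvm_vcpu { int dummy; };

static int nested = 1;	/* mirrors the nested module parameter */

static int handle_cpuid(struct kvm_vcpu *vcpu)    { (void)vcpu; return 1; }
static int handle_vmclear(struct kvm_vcpu *vcpu)  { (void)vcpu; return 2; }
static int handle_vmlaunch(struct kvm_vcpu *vcpu) { (void)vcpu; return 3; }
/* Without nesting, VMX instructions would raise #UD in the guest. */
static int handle_vmx_instruction(struct kvm_vcpu *vcpu)
{
	(void)vcpu;
	return -1;
}

/* Single-switch dispatch: only direct calls, no function-pointer table,
 * so no retpoline on any path. */
static int vmx_handle_exit_sketch(struct kvm_vcpu *vcpu,
				  unsigned int exit_reason)
{
	switch (exit_reason) {
	case EXIT_REASON_CPUID:
		return handle_cpuid(vcpu);
	case EXIT_REASON_VMCLEAR:
		return nested ? handle_vmclear(vcpu)
			      : handle_vmx_instruction(vcpu);
	case EXIT_REASON_VMLAUNCH:
		return nested ? handle_vmlaunch(vcpu)
			      : handle_vmx_instruction(vcpu);
	default:
		return 0;	/* unexpected exit reason */
	}
}
```

The one extra check per VMX-instruction case is the cost Andrea refers
to; in exchange the handler table and its indirect calls disappear
entirely.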
On Mon, Sep 23, 2019 at 05:08:38PM -0400, Andrea Arcangeli wrote:
> Hello,
>
> On Mon, Sep 23, 2019 at 01:23:49PM -0700, Sean Christopherson wrote:
> > The attached patch should do the trick.
>
> The two most attractive options to me remains what I already have
> implemented under #ifdef CONFIG_RETPOLINE with direct calls
> (optionally replacing the "if" with a small "switch" still under
> CONFIG_RETPOLINE if we give up the prioritization of the checks), or
> the replacement of kvm_vmx_exit_handlers with a switch() as suggested
> by Vitaly which would cleanup some code.
>
> The intermediate solution that makes "const" work, has the cons of
> forcing to parse EXIT_REASON_VMCLEAR and the other vmx exit reasons
> twice, first through a pointer to function (or another if or switch
> statement) then with a second switch() statement.
>
> If we'd use a single switch statement per Vitaly's suggestion, the "if
> nested" would better be more simply implemented as:
>
> 	switch (exit_reason) {
> 	case EXIT_REASON_VMCLEAR:
> 		if (nested)
> 			return handle_vmclear(vcpu);
> 		else
> 			return handle_vmx_instruction(vcpu);
[...]

Combining if/case statements would mitigate this to some degree, e.g.:

	if (exit_reason >= EXIT_REASON_VMCALL &&
	    exit_reason <= EXIT_REASON_VMON)
		return handle_vmx_instruction(vcpu);

or

	case EXIT_REASON_VMCALL ... EXIT_REASON_VMON:
		return handle_vmx_instruction(vcpu);

> This also removes the compiler dependency to auto inline
> handle_vmclear in the added nested_vmx_handle_vmx_instruction extern
> call.

I like the idea of routing through handle_vmx_instruction() because it
allows the individual handlers to be wholly encapsulated in nested.c,
e.g. handle_vmclear() and company don't have to be exposed.  An extra
CALL+RET isn't going to be noticeable, especially on modern hardware as
the high frequency VMWRITE/VMREAD fields should hit the shadow VMCS.

> VMREAD/WRITE/RESUME are the most frequent vmexit in l0 while nested
> runs in l2.
>
> Thanks,
> Andrea
On Mon, Sep 23, 2019 at 02:24:35PM -0700, Sean Christopherson wrote:
> An extra CALL+RET isn't going to be noticeable, especially on modern
> hardware as the high frequency VMWRITE/VMREAD fields should hit the
> shadow VMCS.

In your last email, with regard to the inlining optimizations made
possible by the monolithic KVM model, you said "That'd likely save a
few CALL/RET/JMP instructions", which kind of directly contradicts the
above. I think neither one, if taken at face value, can possibly be
measured. However the above is only relevant for nested KVM, so I'm
fine if there's an agreement that it's better to hide the nested vmx
handlers in nested.c at the cost of some call/ret.

From my part I'm dropping 15/16/17 in the short term; perhaps Vitaly
or you or Paolo, if he has time, want to work on that part in parallel
to the orthogonal KVM monolithic changes?

Thanks,
Andrea
On Mon, Sep 23, 2019 at 07:43:07PM -0400, Andrea Arcangeli wrote:
> On Mon, Sep 23, 2019 at 02:24:35PM -0700, Sean Christopherson wrote:
> > An extra CALL+RET isn't going to be noticeable, especially on modern
> > hardware as the high frequency VMWRITE/VMREAD fields should hit the
> > shadow VMCS.
>
> In your last email with regard to the inlining optimizations made
> possible by the monolithic KVM model you said "That'd likely save a
> few CALL/RET/JMP instructions", that kind of directly contradicts the
> above.
[...]

For the immediate exit case, eliminating the CALL/RET/JMP instructions
is a bonus.  The real goal was to eliminate the oddity of bouncing
through vendor code to invoke a one-line x86 function.  Having a
separate __kvm_request_immediate_exit() made sense when overwriting
kvm_x86_ops, but not so much when using direct calls.
On 23/09/19 22:23, Sean Christopherson wrote:
> +int nested_vmx_handle_vmx_instruction(struct kvm_vcpu *vcpu)
> +{
> +	switch (to_vmx(vcpu)->exit_reason) {
> +	case EXIT_REASON_VMCLEAR:
> +		return handle_vmclear(vcpu);
> +	case EXIT_REASON_VMLAUNCH:
> +		return handle_vmlaunch(vcpu);
> +	case EXIT_REASON_VMPTRLD:
> +		return handle_vmptrld(vcpu);
> +	case EXIT_REASON_VMPTRST:
> +		return handle_vmptrst(vcpu);
> +	case EXIT_REASON_VMREAD:
> +		return handle_vmread(vcpu);
> +	case EXIT_REASON_VMRESUME:
> +		return handle_vmresume(vcpu);
> +	case EXIT_REASON_VMWRITE:
> +		return handle_vmwrite(vcpu);
> +	case EXIT_REASON_VMOFF:
> +		return handle_vmoff(vcpu);
> +	case EXIT_REASON_VMON:
> +		return handle_vmon(vcpu);
> +	case EXIT_REASON_INVEPT:
> +		return handle_invept(vcpu);
> +	case EXIT_REASON_INVVPID:
> +		return handle_invvpid(vcpu);
> +	case EXIT_REASON_VMFUNC:
> +		return handle_vmfunc(vcpu);
> +	}
> +

Do you really need that?  Why couldn't the handle_* functions simply be
exported from nested.c to vmx.c?

Paolo
On 23/09/19 23:08, Andrea Arcangeli wrote:
> The two most attractive options to me remains what I already have
> implemented under #ifdef CONFIG_RETPOLINE with direct calls
> (optionally replacing the "if" with a small "switch" still under
> CONFIG_RETPOLINE if we give up the prioritization of the checks), or
> the replacement of kvm_vmx_exit_handlers with a switch() as suggested
> by Vitaly which would cleanup some code.
>
> The intermediate solution that makes "const" work, has the cons of
> forcing to parse EXIT_REASON_VMCLEAR and the other vmx exit reasons
> twice, first through a pointer to function (or another if or switch
> statement) then with a second switch() statement.

I agree.  I think the way Andrea did it in his patch may not be the
nicest but is (a bit surprisingly) the easiest and most maintainable.

Paolo
On Tue, Sep 24, 2019 at 02:16:36AM +0200, Paolo Bonzini wrote:
> On 23/09/19 23:08, Andrea Arcangeli wrote:
> > The two most attractive options to me remains what I already have
> > implemented under #ifdef CONFIG_RETPOLINE with direct calls
> > (optionally replacing the "if" with a small "switch" still under
> > CONFIG_RETPOLINE if we give up the prioritization of the checks), or
> > the replacement of kvm_vmx_exit_handlers with a switch() as suggested
> > by Vitaly which would cleanup some code.
[...]
> I agree.  I think the way Andrea did it in his patch may not the nicest
> but is (a bit surprisingly) the easiest and most maintainable.

Heh, which patch?  The original patch of special casing the high
priority exits?
On 24/09/19 02:35, Sean Christopherson wrote:
>> I agree.  I think the way Andrea did it in his patch may not the nicest
>> but is (a bit surprisingly) the easiest and most maintainable.
> Heh, which patch?  The original patch of special casing the high
> priority exits?

Yes.

Paolo
Hi Paolo,

On Tue, Sep 24, 2019 at 02:15:39AM +0200, Paolo Bonzini wrote:
> Do you really need that?  Why couldn't the handle_* functions simply be
> exported from nested.c to vmx.c?

I prefer the direct call too, indeed. If Sean doesn't want to export
those generic names to the whole kernel, it would be enough to #include
"nested.c" from vmx.c, and you'd still have it private but with no
additional checks and no additional extern call. It's not like
kvm-intel can be built without linking nested.o anyway.

This overall is a C shortcoming of some sort, if you have to resort to
#include "nested.c" to keep the function hidden in kvm-intel.o despite
it being implemented in two different object files. One should be able
to limit the scope of an extern function declaration to a group of
object files, and to drop the declaration before linking that object
beyond the initial group.

Thanks,
Andrea
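The linkage point above can be sketched in a single file. This is a
hypothetical illustration (the function names and return values are
invented): if vmx.c did `#include "nested.c"`, the nested handlers
could be declared `static`, giving them internal linkage — callable
from the including translation unit but never exported kernel-wide:

```c
#include <assert.h>

/* --- what would come from nested.c: `static` gives internal linkage,
 * so the symbol never appears in the module's exported symbol table --- */
static int handle_vmclear_sketch(void)
{
	return 1;	/* pretend: VMCLEAR emulated for the guest */
}

/* --- the vmx.c side: after the textual #include it can make a plain
 * direct call, with no extern declaration in any shared header --- */
static int vmx_dispatch_sketch(int is_vmclear)
{
	return is_vmclear ? handle_vmclear_sketch() : 0;
}
```

This is exactly the trade Andrea describes: C offers only two scopes
for functions, one translation unit (`static`) or the whole link
(`extern`), with nothing in between for "this group of objects".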
On Tue, Sep 24, 2019 at 02:15:39AM +0200, Paolo Bonzini wrote:
> On 23/09/19 22:23, Sean Christopherson wrote:
> > +int nested_vmx_handle_vmx_instruction(struct kvm_vcpu *vcpu)
> > +{
> > +	switch (to_vmx(vcpu)->exit_reason) {
> > +	case EXIT_REASON_VMCLEAR:
> > +		return handle_vmclear(vcpu);
[...]
> > +	case EXIT_REASON_VMFUNC:
> > +		return handle_vmfunc(vcpu);
> > +	}
> > +
>
> Do you really need that?  Why couldn't the handle_* functions simply be
> exported from nested.c to vmx.c?

Nope, just personal preference to keep the nested code as isolated as
possible.  We use a similar approach for vmx_{g,s}et_vmx_msr().  Though
if we do want to go this route, it'd be better to simply move
handle_vmx_instruction() to nested.c instead of bouncing through that
and nested_vmx_handle_vmx_instruction().
On Mon, Sep 23, 2019 at 03:05:14PM -0400, Andrea Arcangeli wrote:
> On Mon, Sep 23, 2019 at 11:57:57AM +0200, Paolo Bonzini wrote:
> > On 23/09/19 11:31, Vitaly Kuznetsov wrote:
> > > +#ifdef CONFIG_RETPOLINE
> > > +	if (exit_reason == EXIT_REASON_MSR_WRITE)
> > > +		return handle_wrmsr(vcpu);
[...]
> > > +#endif
> > > 	return kvm_vmx_exit_handlers[exit_reason](vcpu);
> >
> > Most of these, while frequent, are already part of slow paths.
> >
> > I would keep only EXIT_REASON_MSR_WRITE, EXIT_REASON_PREEMPTION_TIMER,
> > EXIT_REASON_EPT_MISCONFIG and add EXIT_REASON_IO_INSTRUCTION.
>
> Intuition doesn't work great when it comes to CPU speculative
> execution runtime. I can however run additional benchmarks to verify
> your theory that keeping around frequent retpolines will still perform
> ok.

On one most recent CPU model there's no measurable difference with
your list or my list with a hrtimer workload (no cpuid). It's
challenging to measure any difference below 0.5%.

An artificial cpuid loop takes 1.5% longer to compute if I don't add
CPUID to the list, but that's just the measurement of the cost of
hitting a frequent retpoline in the exit reason handling code.

Thanks,
Andrea
On 24/09/19 23:46, Andrea Arcangeli wrote:
>>> I would keep only EXIT_REASON_MSR_WRITE, EXIT_REASON_PREEMPTION_TIMER,
>>> EXIT_REASON_EPT_MISCONFIG and add EXIT_REASON_IO_INSTRUCTION.
>>
>> Intuition doesn't work great when it comes to CPU speculative
>> execution runtime. I can however run additional benchmarks to verify
>> your theory that keeping around frequent retpolines will still perform
>> ok.
>
> On one most recent CPU model there's no measurable difference with
> your list or my list with a hrtimer workload (no cpuid). It's
> challenging to measure any difference below 0.5%.

Let's keep the short list then.

Paolo
On Wed, Sep 25, 2019 at 01:03:32PM +0200, Christophe de Dinechin wrote: > > > > On 23 Sep 2019, at 11:31, Vitaly Kuznetsov <vkuznets@redhat.com> wrote: > > > > Andrea Arcangeli <aarcange@redhat.com <mailto:aarcange@redhat.com>> writes: > > > >> It's enough to check the exit value and issue a direct call to avoid > >> the retpoline for all the common vmexit reasons. > >> > >> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> > >> --- > >> arch/x86/kvm/vmx/vmx.c | 24 ++++++++++++++++++++++-- > >> 1 file changed, 22 insertions(+), 2 deletions(-) > >> > >> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > >> index a6e597025011..9aa73e216df2 100644 > >> --- a/arch/x86/kvm/vmx/vmx.c > >> +++ b/arch/x86/kvm/vmx/vmx.c > >> @@ -5866,9 +5866,29 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu) > >> } > >> > >> if (exit_reason < kvm_vmx_max_exit_handlers > >> - && kvm_vmx_exit_handlers[exit_reason]) > >> + && kvm_vmx_exit_handlers[exit_reason]) { > >> +#ifdef CONFIG_RETPOLINE > >> + if (exit_reason == EXIT_REASON_MSR_WRITE) > >> + return handle_wrmsr(vcpu); > >> + else if (exit_reason == EXIT_REASON_PREEMPTION_TIMER) > >> + return handle_preemption_timer(vcpu); > >> + else if (exit_reason == EXIT_REASON_PENDING_INTERRUPT) > >> + return handle_interrupt_window(vcpu); > >> + else if (exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT) > >> + return handle_external_interrupt(vcpu); > >> + else if (exit_reason == EXIT_REASON_HLT) > >> + return handle_halt(vcpu); > >> + else if (exit_reason == EXIT_REASON_PAUSE_INSTRUCTION) > >> + return handle_pause(vcpu); > >> + else if (exit_reason == EXIT_REASON_MSR_READ) > >> + return handle_rdmsr(vcpu); > >> + else if (exit_reason == EXIT_REASON_CPUID) > >> + return handle_cpuid(vcpu); > >> + else if (exit_reason == EXIT_REASON_EPT_MISCONFIG) > >> + return handle_ept_misconfig(vcpu); > >> +#endif > >> return kvm_vmx_exit_handlers[exit_reason](vcpu); > > > > I agree with the identified set of most common vmexits, however, this 
> > still looks a bit random. Would it be too much if we get rid of
> > kvm_vmx_exit_handlers completely replacing this code with one switch()?
> 
> Not sure, but if you do that, won’t the compiler generate a table and
> bring you back to square one? Or is there a reason why the mitigation
> is not needed for tables and indirect branches generated from switch
> statements?

When the kernel is built with retpolines, the compiler is forbidden
from using a jump table for any switch. I pointed out the relevant
commit earlier in this thread.

Instead, the compiler will still bisect exit_reason, trying to make
the cost roughly equal across all exit reasons and to reduce the
number of checks; but we know which exits are most likely, so it is
better to prioritize the most frequent exit reasons.

Thanks,
Andrea