diff mbox

[2/4] KVM: nSVM: do not forward NMI window singlestep VM exits to L1

Message ID CABdb734PVjgvJr9qSLraF1W+3goJ09ku9tgC=BEnBckWJW11Dg@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ladi Prosek June 19, 2017, 12:50 p.m. UTC
On Fri, Jun 16, 2017 at 3:26 PM, Radim Krčmář <rkrcmar@redhat.com> wrote:
> 2017-06-15 13:20+0200, Ladi Prosek:
>> Nested hypervisor should not see singlestep VM exits if singlestepping
>> was enabled internally by KVM. Windows is particularly sensitive to this
>> and known to bluescreen on unexpected VM exits.
>>
>> Signed-off-by: Ladi Prosek <lprosek@redhat.com>
>> ---
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> @@ -966,9 +967,13 @@ static void svm_disable_lbrv(struct vcpu_svm *svm)
>>  static void disable_nmi_singlestep(struct vcpu_svm *svm)
>>  {
>>       svm->nmi_singlestep = false;
>> -     if (!(svm->vcpu.guest_debug & KVM_GUESTDBG_SINGLESTEP))
>> -             svm->vmcb->save.rflags &=
>> -                     ~(X86_EFLAGS_TF | X86_EFLAGS_RF);
>> +     if (!(svm->vcpu.guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
>> +             /* Clear our flags if they were not set by the guest */
>> +             if (!(svm->nmi_singlestep_guest_rflags & X86_EFLAGS_TF))
>> +                     svm->vmcb->save.rflags &= ~X86_EFLAGS_TF;
>> +             if (!(svm->nmi_singlestep_guest_rflags & X86_EFLAGS_RF))
>> +                     svm->vmcb->save.rflags &= ~X86_EFLAGS_RF;
>
> IIUC, we intercept/fault on IRET, disable the interception, set TF+RF
> and enter again, the CPU executes IRET and then we get a #DB exit.
>
> IRET pops EFLAGS from before the NMI -- doesn't the CPU properly restore
> EFLAGS, so we do not need this part here?

My test VM doesn't work without this part, even with the change that
Paolo suggested in 0/4:


1. The very first time we enable NMI singlestep is when running
nested. The nested guest's rip is just after 'pause' in a 'pause; cmp;
jne' loop. svm_nmi_allowed returns false because nested_svm_nmi
returns false - we don't want to inject NMI now because
svm->nested.intercept has the INTERCEPT_NMI bit set.

2. Then we find ourselves in L1 with the rip just after 'vmrun'.
svm_nmi_allowed returns false because gif_set returns false (hflags ==
HF_HIF_MASK | HF_VINTR_MASK). So we, again, enable NMI singlestep
(without having disabled it yet).

3. We singlestep over the instruction immediately following 'vmrun'
(it's a 'mov rax, [rsp+0x20]' in this trace) and finally disable NMI
singlestep on this vcpu. We inject the NMI when the guest executes
stgi.


I'll see if I can short this out. Setting the trap flag to step over
an instruction which has no other significance than following a
'vmrun' is indeed unnecessary.

Thanks!
Ladi

Comments

Ladi Prosek June 19, 2017, 1:05 p.m. UTC | #1
On Mon, Jun 19, 2017 at 2:50 PM, Ladi Prosek <lprosek@redhat.com> wrote:
> On Fri, Jun 16, 2017 at 3:26 PM, Radim Krčmář <rkrcmar@redhat.com> wrote:
>> 2017-06-15 13:20+0200, Ladi Prosek:
>>> Nested hypervisor should not see singlestep VM exits if singlestepping
>>> was enabled internally by KVM. Windows is particularly sensitive to this
>>> and known to bluescreen on unexpected VM exits.
>>>
>>> Signed-off-by: Ladi Prosek <lprosek@redhat.com>
>>> ---
>>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>>> @@ -966,9 +967,13 @@ static void svm_disable_lbrv(struct vcpu_svm *svm)
>>>  static void disable_nmi_singlestep(struct vcpu_svm *svm)
>>>  {
>>>       svm->nmi_singlestep = false;
>>> -     if (!(svm->vcpu.guest_debug & KVM_GUESTDBG_SINGLESTEP))
>>> -             svm->vmcb->save.rflags &=
>>> -                     ~(X86_EFLAGS_TF | X86_EFLAGS_RF);
>>> +     if (!(svm->vcpu.guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
>>> +             /* Clear our flags if they were not set by the guest */
>>> +             if (!(svm->nmi_singlestep_guest_rflags & X86_EFLAGS_TF))
>>> +                     svm->vmcb->save.rflags &= ~X86_EFLAGS_TF;
>>> +             if (!(svm->nmi_singlestep_guest_rflags & X86_EFLAGS_RF))
>>> +                     svm->vmcb->save.rflags &= ~X86_EFLAGS_RF;
>>
>> IIUC, we intercept/fault on IRET, disable the interception, set TF+RF
>> and enter again, the CPU executes IRET and then we get a #DB exit.
>>
>> IRET pops EFLAGS from before the NMI -- doesn't the CPU properly restore
>> EFLAGS, so we do not need this part here?
>
> My test VM doesn't work without this part, even with the change that
> Paolo suggested in 0/4:
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index d1efe2c62b3f..15a2f7f8e539 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -4622,6 +4622,9 @@ static void enable_nmi_window(struct kvm_vcpu *vcpu)
>         if ((svm->vcpu.arch.hflags & (HF_NMI_MASK | HF_IRET_MASK))
>             == HF_NMI_MASK)
>                 return; /* IRET will cause a vm exit */
> +       if ((svm->vcpu.arch.hflags & (HF_NMI_MASK | HF_GIF_MASK))
> +           == HF_NMI_MASK)
> +               return; /* STGI will cause a vm exit */
>
>
> Let's see what we're singlestepping over (Windows Server 2016 with Hyper-V).
>
> 1. The very first time we enable NMI singlestep is when running
> nested. The nested guest's rip is just after 'pause' in a 'pause; cmp;
> jne' loop. svm_nmi_allowed returns false because nested_svm_nmi
> returns false - we don't want to inject NMI now because
> svm->nested.intercept has the INTERCEPT_NMI bit set.
>
> 2. Then we find ourselves in L1 with the rip just after 'vmrun'.
> svm_nmi_allowed returns false because gif_set returns false (hflags ==
> HF_HIF_MASK | HF_VINTR_MASK). So we, again, enable NMI singlestep
> (without having disabled it yet).
>
> 3. We singlestep over the instruction immediately following 'vmrun'
> (it's a 'mov rax, [rsp+0x20]' in this trace) and finally disable NMI
> singlestep on this vcpu. We inject the NMI when the guest executes
> stgi.
>
>
> I'll see if I can short this out. Setting the trap flag to step over
> an instruction which has no other significance than following a
> 'vmrun' is indeed unnecessary.

Ok, I think I understand it.

First, Paolo's GIF change should be checking only HF_GIF_MASK. Whether
we're currently in an NMI or not does not make a difference, NMI is
not allowed and we have interception in place for when GIF is set so
we don't need to singlestep.

Second, enable_nmi_window can also do nothing if
svm->nested.exit_required. We're not really going to run the guest in
this case so no need to singlestep.

With these two tweaks my test VM does not generate any DB exits. We
still occasionally set TF if we have an NMI and interrupt pending at
the same time (see inject_pending_event where we check
vcpu->arch.interrupt.pending, then later vcpu->arch.nmi_pending) but
we clear it immediately in the new code added in patch 4.

What a complex state machine! I'll prepare v2.
Paolo Bonzini June 19, 2017, 1:52 p.m. UTC | #2
On 19/06/2017 15:05, Ladi Prosek wrote:
> With these two tweaks my test VM does not generate any DB exits. We
> still occasionally set TF if we have an NMI and interrupt pending at
> the same time (see inject_pending_event where we check
> vcpu->arch.interrupt.pending, then later vcpu->arch.nmi_pending) but
> we clear it immediately in the new code added in patch 4.

Good, please check that eventinj.flat still works on non-nested and
doesn't regress on nested (I'm not sure if it works right now).  I'll
check that too before applying, though.

Paolo
diff mbox

Patch

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d1efe2c62b3f..15a2f7f8e539 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -4622,6 +4622,9 @@  static void enable_nmi_window(struct kvm_vcpu *vcpu)
        if ((svm->vcpu.arch.hflags & (HF_NMI_MASK | HF_IRET_MASK))
            == HF_NMI_MASK)
                return; /* IRET will cause a vm exit */
+       if ((svm->vcpu.arch.hflags & (HF_NMI_MASK | HF_GIF_MASK))
+           == HF_NMI_MASK)
+               return; /* STGI will cause a vm exit */


Let's see what we're singlestepping over (Windows Server 2016 with Hyper-V).