From patchwork Mon Jun 19 16:17:41 2017
X-Patchwork-Submitter: Radim Krčmář
X-Patchwork-Id: 9796709
Date: Mon, 19 Jun 2017 18:17:41 +0200
From: Radim Krčmář <rkrcmar@redhat.com>
To: Ladi Prosek
Cc: KVM list
Subject: Re: [PATCH 2/4] KVM: nSVM: do not forward NMI window singlestep VM exits to L1
Message-ID: <20170619161740.GA13549@potion>
References: <20170615112032.15812-1-lprosek@redhat.com> <20170615112032.15812-3-lprosek@redhat.com> <20170616132648.GF2224@potion>
In-Reply-To: 
X-Mailing-List: kvm@vger.kernel.org

2017-06-19 15:05+0200, Ladi Prosek:
> On Mon, Jun 19, 2017 at 2:50 PM, Ladi Prosek wrote:
> > On Fri, Jun 16, 2017 at 3:26 PM, Radim Krčmář wrote:
> >> 2017-06-15 13:20+0200, Ladi Prosek:
> >>> A nested hypervisor should not see singlestep VM exits if singlestepping
> >>> was enabled internally by KVM. Windows is particularly sensitive to this
> >>> and known to bluescreen on unexpected VM exits.
> >>>
> >>> Signed-off-by: Ladi Prosek
> >>> ---
> >>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> >>> @@ -966,9 +967,13 @@ static void svm_disable_lbrv(struct vcpu_svm *svm)
> >>>  static void disable_nmi_singlestep(struct vcpu_svm *svm)
> >>>  {
> >>>  	svm->nmi_singlestep = false;
> >>> -	if (!(svm->vcpu.guest_debug & KVM_GUESTDBG_SINGLESTEP))
> >>> -		svm->vmcb->save.rflags &=
> >>> -			~(X86_EFLAGS_TF | X86_EFLAGS_RF);
> >>> +	if (!(svm->vcpu.guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
> >>> +		/* Clear our flags if they were not set by the guest */
> >>> +		if (!(svm->nmi_singlestep_guest_rflags & X86_EFLAGS_TF))
> >>> +			svm->vmcb->save.rflags &= ~X86_EFLAGS_TF;
> >>> +		if (!(svm->nmi_singlestep_guest_rflags & X86_EFLAGS_RF))
> >>> +			svm->vmcb->save.rflags &= ~X86_EFLAGS_RF;
> >>
> >> IIUC, we intercept/fault on IRET, disable the interception, set TF+RF,
> >> and enter again; the CPU executes IRET and then we get a #DB exit.
> >>
> >> IRET pops the EFLAGS from before the NMI -- doesn't the CPU properly
> >> restore EFLAGS, so that we do not need this part here?
> >
> > My test VM doesn't work without this part, even with the change that
> > Paolo suggested in 0/4:
> >
> > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> > index d1efe2c62b3f..15a2f7f8e539 100644
> > --- a/arch/x86/kvm/svm.c
> > +++ b/arch/x86/kvm/svm.c
> > @@ -4622,6 +4622,9 @@ static void enable_nmi_window(struct kvm_vcpu *vcpu)
> >  	if ((svm->vcpu.arch.hflags & (HF_NMI_MASK | HF_IRET_MASK))
> >  	    == HF_NMI_MASK)
> >  		return; /* IRET will cause a vm exit */
> > +	if ((svm->vcpu.arch.hflags & (HF_NMI_MASK | HF_GIF_MASK))
> > +	    == HF_NMI_MASK)
> > +		return; /* STGI will cause a vm exit */
> >
> > Let's see what we're singlestepping over (Windows Server 2016 with Hyper-V).
> >
> > 1. The very first time we enable NMI singlestep is when running
> > nested. The nested guest's rip is just after 'pause' in a 'pause; cmp;
> > jne' loop.
> > svm_nmi_allowed returns false because nested_svm_nmi
> > returns false - we don't want to inject the NMI now because
> > svm->nested.intercept has the INTERCEPT_NMI bit set.
> >
> > 2. Then we find ourselves in L1 with the rip just after 'vmrun'.
> > svm_nmi_allowed returns false because gif_set returns false (hflags ==
> > HF_HIF_MASK | HF_VINTR_MASK). So we, again, enable NMI singlestep
> > (without having disabled it yet).
> >
> > 3. We singlestep over the instruction immediately following 'vmrun'
> > (it's a 'mov rax, [rsp+0x20]' in this trace) and finally disable NMI
> > singlestep on this vcpu. We inject the NMI when the guest executes
> > stgi.
> >
> > I'll see if I can sort this out. Setting the trap flag to step over
> > an instruction which has no other significance than following a
> > 'vmrun' is indeed unnecessary.
>
> Ok, I think I understand it.
>
> First, Paolo's GIF change should be checking only HF_GIF_MASK. Whether
> we're currently in an NMI or not does not make a difference: the NMI is
> not allowed, and we have interception in place for when GIF is set, so
> we don't need to singlestep.
>
> Second, enable_nmi_window can also do nothing if
> svm->nested.exit_required. We're not really going to run the guest in
> this case, so there is no need to singlestep.

Makes sense.

> With these two tweaks my test VM does not generate any DB exits. We
> still occasionally set TF if we have an NMI and an interrupt pending at
> the same time (see inject_pending_event, where we check
> vcpu->arch.interrupt.pending first and only later
> vcpu->arch.nmi_pending), but we clear it immediately in the new code
> added in patch 4.

Right, we only need the single step over IRET and the interrupt shadow.

Btw., instead of single-stepping over IRET/interrupt shadow, could we set
INTERRUPT_SHADOW in the VMCB, inject the NMI, and let it execute?

This mechanism would explain why AMD didn't provide a trap for IRET ...
APM 15.20 says "Injected events are treated in every way as though they
had occurred normally in the guest", which makes me think that
INTERRUPT_SHADOW blocks them, if it blocks NMIs at all on AMD.

e.g. (Of course we'd want to refactor it for the final patch. :])

> What a complex state machine! I'll prepare v2.

Yes. :/ Looking forward to v2, thanks.

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 6e3095d1bad4..b564613b4e65 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -4624,14 +4624,16 @@ static void enable_nmi_window(struct kvm_vcpu *vcpu)
 	if ((svm->vcpu.arch.hflags & (HF_NMI_MASK | HF_IRET_MASK))
 	    == HF_NMI_MASK)
-		return; /* IRET will cause a vm exit */
+		return 0; /* IRET will cause a vm exit */
 
-	/*
-	 * Something prevents NMI from been injected. Single step over possible
-	 * problem (IRET or exception injection or interrupt shadow)
-	 */
-	svm->nmi_singlestep = true;
-	svm->vmcb->save.rflags |= (X86_EFLAGS_TF | X86_EFLAGS_RF);
+	if (svm->vmcb->control.event_inj)
+		return 1; /* will request VM exit immediately after injection */
+
+	--vcpu->arch.nmi_pending;
+	svm_inject_nmi(vcpu);
+	svm->vmcb->control.int_state |= SVM_INTERRUPT_SHADOW_MASK;
+
+	return 0;
 }
 
 static int svm_set_tss_addr(struct kvm *kvm, unsigned int addr)