From patchwork Mon Jul 24 14:20:15 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 9859587
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Jim Mattson, Wanpeng Li
Subject: [RFC/RFT PATCH] KVM: nVMX: fixes to nested virt interrupt injection
Date: Mon, 24 Jul 2017 16:20:15 +0200
Message-Id: <1500906015-31784-1-git-send-email-pbonzini@redhat.com>
X-Mailer: git-send-email 1.8.3.1
X-Mailing-List: kvm@vger.kernel.org

There are three issues in nested_vmx_check_exception:

1) it is not taking PFEC_MATCH/PFEC_MASK into account, as reported
by Wanpeng Li;

2) it should rebuild the interruption info and exit qualification fields
from scratch, as reported by Jim Mattson, because the values from the
L2->L0 vmexit may be invalid (e.g. if an emulated instruction causes
a page fault, the EPT misconfig's exit qualification is incorrect);

3) CR2 and DR6 should not be written for exception intercept vmexits
(CR2 only for AMD).

This patch fixes the first two and adds a comment about the last,
outlining the fix.

Cc: Jim Mattson
Cc: Wanpeng Li
Signed-off-by: Paolo Bonzini
---
	Wanpeng, can you test this on the testcases you had for commit
	d4912215d103 ("KVM: nVMX: Fix exception injection", 2017-06-05)?
	Also, do you have a testcase for PFEC matching?

 arch/x86/kvm/svm.c | 10 ++++++++
 arch/x86/kvm/vmx.c | 71 +++++++++++++++++++++++++++++++++++++++++++-----------
 2 files changed, 67 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 4d8141e533c3..1107626938cc 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2430,6 +2430,16 @@ static int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr,
 	svm->vmcb->control.exit_code = SVM_EXIT_EXCP_BASE + nr;
 	svm->vmcb->control.exit_code_hi = 0;
 	svm->vmcb->control.exit_info_1 = error_code;
+
+	/*
+	 * FIXME: we should not write CR2 when L1 intercepts an L2 #PF exception.
+	 * The fix is to add the ancillary datum (CR2 or DR6) to structs
+	 * kvm_queued_exception and kvm_vcpu_events, so that CR2 and DR6 can be
+	 * written only when inject_pending_event runs (DR6 would be written here
+	 * too).  This should be conditional on a new capability---if the
+	 * capability is disabled, kvm_multiple_exception would write the
+	 * ancillary information to CR2 or DR6, for backwards ABI-compatibility.
+	 */
 	if (svm->vcpu.arch.exception.nested_apf)
 		svm->vmcb->control.exit_info_2 = svm->vcpu.arch.apf.nested_apf_token;
 	else
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d04092f821b6..b520614f9d46 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -927,6 +927,10 @@ static void vmx_get_segment(struct kvm_vcpu *vcpu,
 static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx);
 static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx);
 static int alloc_identity_pagetable(struct kvm *kvm);
+static bool vmx_get_nmi_mask(struct kvm_vcpu *vcpu);
+static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
+static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
+					    u16 error_code);
 
 static DEFINE_PER_CPU(struct vmcs *, vmxarea);
 static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@@ -2432,28 +2436,67 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
  * KVM wants to inject page-faults which it got to the guest. This function
  * checks whether in a nested guest, we need to inject them to L1 or L2.
  */
+static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu,
+					       unsigned long exit_qual)
+{
+	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+	unsigned int nr = vcpu->arch.exception.nr;
+	u32 intr_info = nr | INTR_INFO_VALID_MASK;
+
+	if (vcpu->arch.exception.has_error_code) {
+		vmcs_write32(VM_EXIT_INTR_ERROR_CODE, vcpu->arch.exception.error_code);
+		intr_info |= INTR_INFO_DELIVER_CODE_MASK;
+	}
+
+	if (kvm_exception_is_soft(nr))
+		intr_info |= INTR_TYPE_SOFT_EXCEPTION;
+	else
+		intr_info |= INTR_TYPE_HARD_EXCEPTION;
+
+	if (!(vmcs12->idt_vectoring_info_field & VECTORING_INFO_VALID_MASK) &&
+	    vmx_get_nmi_mask(vcpu))
+		intr_info |= INTR_INFO_UNBLOCK_NMI;
+
+	nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI, intr_info, exit_qual);
+}
+
 static int nested_vmx_check_exception(struct kvm_vcpu *vcpu)
 {
 	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
 	unsigned int nr = vcpu->arch.exception.nr;
 
-	if (!((vmcs12->exception_bitmap & (1u << nr)) ||
-	      (nr == PF_VECTOR && vcpu->arch.exception.nested_apf)))
-		return 0;
+	if (nr == PF_VECTOR) {
+		if (vcpu->arch.exception.nested_apf) {
+			nested_vmx_inject_exception_vmexit(vcpu,
+							   vcpu->arch.apf.nested_apf_token);
+			return 1;
+		}
+		/*
+		 * FIXME: we must not write CR2 when L1 intercepts an L2 #PF exception.
+		 * The fix is to add the ancillary datum (CR2 or DR6) to structs
+		 * kvm_queued_exception and kvm_vcpu_events, so that CR2 and DR6
+		 * can be written only when inject_pending_event runs.  This should be
+		 * conditional on a new capability---if the capability is disabled,
+		 * kvm_multiple_exception would write the ancillary information to
+		 * CR2 or DR6, for backwards ABI-compatibility.
+		 */
+		if (nested_vmx_is_page_fault_vmexit(vmcs12,
+						    vcpu->arch.exception.error_code)) {
+			nested_vmx_inject_exception_vmexit(vcpu, vcpu->arch.cr2);
+			return 1;
+		}
+	} else {
+		unsigned long exit_qual = 0;
+		if (nr == DB_VECTOR)
+			exit_qual = vcpu->arch.dr6;
 
-	if (vcpu->arch.exception.nested_apf) {
-		vmcs_write32(VM_EXIT_INTR_ERROR_CODE, vcpu->arch.exception.error_code);
-		nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
-			PF_VECTOR | INTR_TYPE_HARD_EXCEPTION |
-			INTR_INFO_DELIVER_CODE_MASK | INTR_INFO_VALID_MASK,
-			vcpu->arch.apf.nested_apf_token);
-		return 1;
+		if (vmcs12->exception_bitmap & (1u << nr)) {
+			nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
+			return 1;
+		}
 	}
 
-	nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
-			  vmcs_read32(VM_EXIT_INTR_INFO),
-			  vmcs_readl(EXIT_QUALIFICATION));
-	return 1;
+	return 0;
 }
 
 static void vmx_queue_exception(struct kvm_vcpu *vcpu)
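
Note on item 1 (PFEC matching): the diff only forward-declares
nested_vmx_is_page_fault_vmexit, so its body is not shown in this patch.
As a rough standalone illustration of the architectural rule it is expected
to apply, and not the kernel's code: bit 14 of the exception bitmap decides
whether a #PF exits when (error_code & PFEC_MASK) == PFEC_MATCH, and the
meaning of that bit is inverted when the comparison fails. The names
struct pf_intercept_cfg, pf_causes_vmexit and pf_selftest below are
hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PF_VECTOR 14	/* #PF is exception vector 14 */

/* Hypothetical stand-in for the three vmcs12 fields that matter here. */
struct pf_intercept_cfg {
	uint32_t exception_bitmap;
	uint32_t pfec_mask;	/* PFEC_MASK */
	uint32_t pfec_match;	/* PFEC_MATCH */
};

/*
 * Return true if an L2 #PF with this error code should exit to L1.
 * When the masked error code equals PFEC_MATCH, bit 14 of the exception
 * bitmap is used as is; when it differs, the bit's meaning is reversed.
 */
static bool pf_causes_vmexit(const struct pf_intercept_cfg *cfg,
			     uint16_t error_code)
{
	bool bit = (cfg->exception_bitmap & (1u << PF_VECTOR)) != 0;
	bool mismatch = (error_code & cfg->pfec_mask) != cfg->pfec_match;

	return bit != mismatch;
}

/* A few combinations, checked with assert(). */
static void pf_selftest(void)
{
	/* Bit 14 set, mask/match 0: every #PF matches, so every #PF exits. */
	struct pf_intercept_cfg all = { 1u << PF_VECTOR, 0, 0 };
	/* Bit 14 clear, mask/match 0: every #PF matches, none exits. */
	struct pf_intercept_cfg none = { 0, 0, 0 };
	/* Bit 14 set, mask = match = 1: only faults with P=1 exit. */
	struct pf_intercept_cfg prot = { 1u << PF_VECTOR, 0x1, 0x1 };
	/* Bit 14 clear, same mask/match: only faults with P=0 exit. */
	struct pf_intercept_cfg notpresent = { 0, 0x1, 0x1 };

	assert(pf_causes_vmexit(&all, 0x2));
	assert(!pf_causes_vmexit(&none, 0x2));
	assert(pf_causes_vmexit(&prot, 0x3));
	assert(!pf_causes_vmexit(&prot, 0x2));
	assert(!pf_causes_vmexit(&notpresent, 0x3));
	assert(pf_causes_vmexit(&notpresent, 0x2));
}
```

Calling pf_selftest() from a test harness exercises the cases listed in its
comments. The old check in nested_vmx_check_exception looked only at the
exception bitmap, which corresponds to the "all"/"none" configurations; the
"prot"/"notpresent" cases are the ones it could not express.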