From patchwork Tue Jul 31 15:32:00 2018
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 10550949
From: Sean Christopherson
To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com
Cc: jmattson@google.com, Sean Christopherson
Subject: [PATCH 01/16] KVM: nVMX: move host EFER consistency checks to VMFail path
Date: Tue, 31 Jul 2018 08:32:00 -0700
Message-Id: <20180731153215.31794-2-sean.j.christopherson@intel.com>

Invalid host state related to loading EFER on VMExit causes a
VMFail (VMXERR_ENTRY_INVALID_HOST_STATE_FIELD), not a VMExit.
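To make the semantics concrete, a standalone model of the check being
moved (helper names here are hypothetical; kvm_valid_efer() is reduced
to a reserved-bits mask, and the constants mirror the architectural
encodings):

#include <stdbool.h>
#include <stdint.h>

#define EFER_SCE                        (1ULL << 0)
#define EFER_LME                        (1ULL << 8)
#define EFER_LMA                        (1ULL << 10)
#define EFER_NX                         (1ULL << 11)
#define VM_EXIT_HOST_ADDR_SPACE_SIZE    0x00000200
#define VM_EXIT_LOAD_IA32_EFER          0x00200000

/* Hypothetical stand-in for kvm_valid_efer(): reject reserved bits. */
static bool efer_reserved_bits_clear(uint64_t efer)
{
        const uint64_t known = EFER_SCE | EFER_LME | EFER_LMA | EFER_NX;

        return (efer & ~known) == 0;
}

/* True if host state is inconsistent, i.e. would VMFail before any VMEnter. */
static bool host_efer_check_fails(uint32_t vm_exit_controls, uint64_t host_efer)
{
        bool ia32e = (vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE) != 0;

        if (!(vm_exit_controls & VM_EXIT_LOAD_IA32_EFER))
                return false;   /* check only applies when EFER is loaded */

        return !efer_reserved_bits_clear(host_efer) ||
               ia32e != !!(host_efer & EFER_LMA) ||
               ia32e != !!(host_efer & EFER_LME);
}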
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 1689f433f3a0..f60965c94503 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11564,6 +11564,7 @@ static int nested_vmx_check_nmi_controls(struct vmcs12 *vmcs12)
 static int check_vmentry_prereqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 {
        struct vcpu_vmx *vmx = to_vmx(vcpu);
+       bool ia32e;

        if (vmcs12->guest_activity_state != GUEST_ACTIVITY_ACTIVE &&
            vmcs12->guest_activity_state != GUEST_ACTIVITY_HLT)
@@ -11631,6 +11632,21 @@ static int check_vmentry_prereqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
            !nested_cr3_valid(vcpu, vmcs12->host_cr3))
                return VMXERR_ENTRY_INVALID_HOST_STATE_FIELD;

+       /*
+        * If the load IA32_EFER VM-exit control is 1, bits reserved in the
+        * IA32_EFER MSR must be 0 in the field for that register. In addition,
+        * the values of the LMA and LME bits in the field must each be that of
+        * the host address-space size VM-exit control.
+        */
+       if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_EFER) {
+               ia32e = (vmcs12->vm_exit_controls &
+                        VM_EXIT_HOST_ADDR_SPACE_SIZE) != 0;
+               if (!kvm_valid_efer(vcpu, vmcs12->host_ia32_efer) ||
+                   ia32e != !!(vmcs12->host_ia32_efer & EFER_LMA) ||
+                   ia32e != !!(vmcs12->host_ia32_efer & EFER_LME))
+                       return VMXERR_ENTRY_INVALID_HOST_STATE_FIELD;
+       }
+
        /*
         * From the Intel SDM, volume 3:
         * Fields relevant to VM-entry event injection must be set properly.
@@ -11726,21 +11742,6 @@ static int check_vmentry_postreqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
                return 1;
        }

-       /*
-        * If the load IA32_EFER VM-exit control is 1, bits reserved in the
-        * IA32_EFER MSR must be 0 in the field for that register. In addition,
-        * the values of the LMA and LME bits in the field must each be that of
-        * the host address-space size VM-exit control.
-        */
-       if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_EFER) {
-               ia32e = (vmcs12->vm_exit_controls &
-                        VM_EXIT_HOST_ADDR_SPACE_SIZE) != 0;
-               if (!kvm_valid_efer(vcpu, vmcs12->host_ia32_efer) ||
-                   ia32e != !!(vmcs12->host_ia32_efer & EFER_LMA) ||
-                   ia32e != !!(vmcs12->host_ia32_efer & EFER_LME))
-                       return 1;
-       }
-
        if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS) &&
            (is_noncanonical_address(vmcs12->guest_bndcfgs & PAGE_MASK, vcpu) ||
             (vmcs12->guest_bndcfgs & MSR_IA32_BNDCFGS_RSVD)))

From patchwork Tue Jul 31 15:32:01 2018
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 10550953
From: Sean Christopherson
To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com
Cc: jmattson@google.com, Sean Christopherson
Subject: [PATCH 02/16] KVM: nVMX: move vmcs12 EPTP consistency check to check_vmentry_prereqs()
Date: Tue, 31 Jul 2018 08:32:01 -0700
Message-Id: <20180731153215.31794-3-sean.j.christopherson@intel.com>

An invalid EPTP causes a VMFail (VMXERR_ENTRY_INVALID_CONTROL_FIELD),
not a VMExit.
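A standalone sketch of what an EPTP validity check of this sort looks
like (simplified relative to KVM's valid_ept_address(): it assumes a
4-level EPT walk, write-back memory type, and a fixed 48-bit
MAXPHYADDR):

#include <stdbool.h>
#include <stdint.h>

#define VMX_EPTP_MT_WB          0x6ULL          /* memory type: write-back */
#define VMX_EPTP_PWL_4          0x18ULL         /* page-walk length 4 (bits 5:3 = 3) */

static bool eptp_is_valid(uint64_t eptp)
{
        const uint64_t above_maxphyaddr = ~((1ULL << 48) - 1);

        if ((eptp & 0x7) != VMX_EPTP_MT_WB)     /* bits 2:0, memory type */
                return false;
        if ((eptp & 0x38) != VMX_EPTP_PWL_4)    /* bits 5:3, walk length */
                return false;
        if (eptp & above_maxphyaddr)            /* address reserved bits */
                return false;
        return true;
}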
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f60965c94503..2bea814b347a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -10577,11 +10577,9 @@ static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu)
        return get_vmcs12(vcpu)->ept_pointer;
 }

-static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
+static void nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
 {
        WARN_ON(mmu_is_nested(vcpu));
-       if (!valid_ept_address(vcpu, nested_ept_get_cr3(vcpu)))
-               return 1;

        kvm_mmu_unload(vcpu);
        kvm_init_shadow_ept_mmu(vcpu,
@@ -10593,7 +10591,6 @@ static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
        vcpu->arch.mmu.inject_page_fault = nested_ept_inject_page_fault;
        vcpu->arch.walk_mmu = &vcpu->arch.nested_mmu;
-       return 0;
 }

 static void nested_ept_uninit_mmu_context(struct kvm_vcpu *vcpu)
@@ -11491,15 +11488,11 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
                vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1);
        }

-       if (nested_cpu_has_ept(vmcs12)) {
-               if (nested_ept_init_mmu_context(vcpu)) {
-                       *entry_failure_code = ENTRY_FAIL_DEFAULT;
-                       return 1;
-               }
-       } else if (nested_cpu_has2(vmcs12,
-                                  SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)) {
+       if (nested_cpu_has_ept(vmcs12))
+               nested_ept_init_mmu_context(vcpu);
+       else if (nested_cpu_has2(vmcs12,
+                                SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES))
                vmx_flush_tlb(vcpu, true);
-       }

        /*
         * This sets GUEST_CR0 to vmcs12->guest_cr0, possibly modifying those
@@ -11703,6 +11696,10 @@ static int check_vmentry_prereqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
                }
        }

+       if (nested_cpu_has_ept(vmcs12) &&
+           !valid_ept_address(vcpu, vmcs12->ept_pointer))
+               return VMXERR_ENTRY_INVALID_CONTROL_FIELD;
+
        return 0;
 }

From patchwork Tue Jul 31 15:32:02 2018
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 10550923
From: Sean Christopherson
To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com
Cc: jmattson@google.com, Sean Christopherson
Subject: [PATCH 03/16] KVM: nVMX: use vm_exit_controls_init() to write exit controls for vmcs02
Date: Tue, 31 Jul 2018 08:32:02 -0700
Message-Id: <20180731153215.31794-4-sean.j.christopherson@intel.com>

Write VM_EXIT_CONTROLS using vm_exit_controls_init() when configuring
vmcs02, otherwise vm_exit_controls_shadow will be stale.  EFER in
particular can be corrupted if VM_EXIT_LOAD_IA32_EFER is not updated
due to an incorrect shadow optimization, which can crash L0 due to
EFER not being loaded on exit.  This does not occur with the current
code base simply because update_transition_efer() unconditionally
clears VM_EXIT_LOAD_IA32_EFER before conditionally setting it, and
because a nested guest always starts with VM_EXIT_LOAD_IA32_EFER
clear, i.e. we'll only ever unnecessarily clear the bit.  That is,
until someone optimizes update_transition_efer()...

Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 2bea814b347a..449d1124cd1a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11434,7 +11434,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
         * we should use its exit controls. Note that VM_EXIT_LOAD_IA32_EFER
         * bits are further modified by vmx_set_efer() below.
         */
-       vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
+       vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl);

        /* vmcs12's VM_ENTRY_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE are
         * emulated by vmx_set_efer(), below.
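For reference, a minimal model of the controls-shadow pattern this and
the next patch are about: the cached value lets KVM skip redundant
VMWRITEs, which is exactly why writing the VMCS field directly leaves
the cache stale, and why the cache must be (re)initialized whenever
vmcs02's field is written behind its back.  Names mimic the KVM
helpers; the VMWRITE itself is stubbed out:

#include <stdint.h>

#define VM_EXIT_CONTROLS 0x400c

struct controls_shadow {
        uint32_t val;
};

static void vmcs_write32_hw(uint32_t field, uint32_t val)
{
        (void)field; (void)val;         /* stands in for a real VMWRITE */
}

/* init: write hardware and the cache together (use when loading vmcs02) */
static void vm_exit_controls_init(struct controls_shadow *s, uint32_t val)
{
        s->val = val;
        vmcs_write32_hw(VM_EXIT_CONTROLS, val);
}

/* set: skip the VMWRITE if the cache says nothing changed */
static void vm_exit_controls_set(struct controls_shadow *s, uint32_t val)
{
        if (s->val != val)
                vm_exit_controls_init(s, val);
}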
From patchwork Tue Jul 31 15:32:03 2018
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 10550935
From: Sean Christopherson
To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com
Cc: jmattson@google.com, Sean Christopherson
Subject: [PATCH 04/16] KVM: nVMX: reset cache/shadows on nested consistency check VMExit
Date: Tue, 31 Jul 2018 08:32:03 -0700
Message-Id: <20180731153215.31794-5-sean.j.christopherson@intel.com>

Reset the vm_{entry,exit}_controls_shadow variables as well as the
segment cache on consistency check VMExit.  The shadow values in
particular can lead to missed updates, as a stale shadow makes a
needed VMWRITE look redundant and causes the setters to skip it.
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 449d1124cd1a..200da73d5c42 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -12522,12 +12522,18 @@ static void nested_vmx_entry_failure(struct kvm_vcpu *vcpu,
                        struct vmcs12 *vmcs12,
                        u32 reason, unsigned long qualification)
 {
+       struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+       vm_entry_controls_reset_shadow(vmx);
+       vm_exit_controls_reset_shadow(vmx);
+       vmx_segment_cache_clear(vmx);
+
        load_vmcs12_host_state(vcpu, vmcs12);
        vmcs12->vm_exit_reason = reason | VMX_EXIT_REASONS_FAILED_VMENTRY;
        vmcs12->exit_qualification = qualification;
        nested_vmx_succeed(vcpu);
        if (enable_shadow_vmcs)
-               to_vmx(vcpu)->nested.sync_shadow_vmcs = true;
+               vmx->nested.sync_shadow_vmcs = true;
 }

 static int vmx_check_intercept(struct kvm_vcpu *vcpu,

From patchwork Tue Jul 31 15:32:04 2018
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 10550943
From: Sean Christopherson
To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com
Cc: jmattson@google.com, Sean Christopherson
Subject: [PATCH 05/16] KVM: vmx: do not unconditionally clear EFER switching
Date: Tue, 31 Jul 2018 08:32:04 -0700
Message-Id: <20180731153215.31794-6-sean.j.christopherson@intel.com>

Do not unconditionally call clear_atomic_switch_msr() when updating
EFER.  The unconditional clear adds up to four unnecessary VMWrites
in the case where guest_efer != host_efer, e.g. if the
load_on_{entry,exit} bits were already set.
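A simplified model of the atomic switch (autoload) MSR lists, showing
why updating an existing entry in place beats clear-then-add.  This is
an array-based sketch, not KVM's implementation; the real code also
manages dedicated VMCS controls for MSRs such as EFER:

#include <stdint.h>

#define NR_AUTOLOAD_MSRS 8
#define MSR_EFER 0xc0000080

struct msr_autoload_entry {
        uint32_t index;
        uint64_t value;
};

struct msr_autoload {
        unsigned int nr;
        struct msr_autoload_entry guest[NR_AUTOLOAD_MSRS];
        struct msr_autoload_entry host[NR_AUTOLOAD_MSRS];
};

static int find_msr(struct msr_autoload *m, uint32_t msr)
{
        for (unsigned int i = 0; i < m->nr; i++)
                if (m->guest[i].index == msr)
                        return (int)i;
        return -1;
}

static void clear_atomic_switch_msr(struct msr_autoload *m, uint32_t msr)
{
        int i = find_msr(m, msr);

        if (i < 0)
                return;                 /* nothing to clear, no VMWRITE needed */
        --m->nr;
        m->guest[i] = m->guest[m->nr];  /* fill the hole with the last entry */
        m->host[i] = m->host[m->nr];
        /* real code: vmcs_write32(VM_{ENTRY,EXIT}_MSR_LOAD_COUNT, m->nr) */
}

static void add_atomic_switch_msr(struct msr_autoload *m, uint32_t msr,
                                  uint64_t guest_val, uint64_t host_val)
{
        int i = find_msr(m, msr);

        if (i < 0) {
                i = (int)m->nr++;
                m->guest[i].index = m->host[i].index = msr;
                /* real code: update the MSR_LOAD_COUNT fields here */
        }
        m->guest[i].value = guest_val;  /* updating in place avoids the  */
        m->host[i].value = host_val;    /* clear-then-add VMWRITE churn  */
}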
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 200da73d5c42..51726276e71d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2511,8 +2511,6 @@ static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
        ignore_bits &= ~(u64)EFER_SCE;
 #endif

-       clear_atomic_switch_msr(vmx, MSR_EFER);
-
        /*
         * On EPT, we can't emulate NX, so we must switch EFER atomically.
         * On CPUs that support "load IA32_EFER", always switch EFER
@@ -2525,8 +2523,12 @@ static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
                if (guest_efer != host_efer)
                        add_atomic_switch_msr(vmx, MSR_EFER,
                                              guest_efer, host_efer);
+               else
+                       clear_atomic_switch_msr(vmx, MSR_EFER);
                return false;
        } else {
+               clear_atomic_switch_msr(vmx, MSR_EFER);
+
                guest_efer &= ~ignore_bits;
                guest_efer |= host_efer & ignore_bits;

From patchwork Tue Jul 31 15:32:05 2018
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 10550941
From: Sean Christopherson
To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com
Cc: jmattson@google.com, Sean Christopherson
Subject: [PATCH 06/16] KVM: nVMX: try to set EFER bits correctly when init'ing entry controls
Date: Tue, 31 Jul 2018 08:32:05 -0700
Message-Id: <20180731153215.31794-7-sean.j.christopherson@intel.com>
VM_ENTRY_IA32E_MODE and VM_{ENTRY,EXIT}_LOAD_IA32_EFER will be
explicitly set/cleared as needed by vmx_set_efer(), but attempt to
get the bits set correctly when initializing the control fields.
Setting the value correctly can avoid multiple VMWrites.

Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx.c | 44 ++++++++++++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 51726276e71d..24a25e7d4b40 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11146,6 +11146,17 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
        return 0;
 }

+static u64 nested_vmx_calc_efer(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
+{
+       if (vmx->nested.nested_run_pending &&
+           (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_EFER))
+               return vmcs12->guest_ia32_efer;
+       else if (vmcs12->vm_entry_controls & VM_ENTRY_IA32E_MODE)
+               return vmx->vcpu.arch.efer | (EFER_LMA | EFER_LME);
+       else
+               return vmx->vcpu.arch.efer & ~(EFER_LMA | EFER_LME);
+}
+
 static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 {
        struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -11284,6 +11295,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
 {
        struct vcpu_vmx *vmx = to_vmx(vcpu);
        u32 exec_control, vmcs12_exec_ctrl;
+       u64 guest_efer;

        if (vmx->nested.dirty_vmcs12) {
                prepare_vmcs02_full(vcpu, vmcs12);
@@ -11438,13 +11450,23 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
         */
        vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl);

-       /* vmcs12's VM_ENTRY_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE are
-        * emulated by vmx_set_efer(), below.
+       /*
+        * vmcs12's VM_{ENTRY,EXIT}_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE
+        * are emulated by vmx_set_efer(), below, but speculate on the
+        * related bits (if supported by the CPU) in the hope that we can
+        * avoid VMWrites during vmx_set_efer().
         */
-       vm_entry_controls_init(vmx,
-               (vmcs12->vm_entry_controls & ~VM_ENTRY_LOAD_IA32_EFER &
-                       ~VM_ENTRY_IA32E_MODE) |
-               (vmcs_config.vmentry_ctrl & ~VM_ENTRY_IA32E_MODE));
+       exec_control = (vmcs12->vm_entry_controls | vmcs_config.vmentry_ctrl) &
+                      ~VM_ENTRY_IA32E_MODE & ~VM_ENTRY_LOAD_IA32_EFER;
+
+       guest_efer = nested_vmx_calc_efer(vmx, vmcs12);
+       if (cpu_has_load_ia32_efer) {
+               if (guest_efer & EFER_LMA)
+                       exec_control |= VM_ENTRY_IA32E_MODE;
+               if (guest_efer != host_efer)
+                       exec_control |= VM_ENTRY_LOAD_IA32_EFER;
+       }
+       vm_entry_controls_init(vmx, exec_control);

        if (vmx->nested.nested_run_pending &&
            (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PAT)) {
@@ -11510,14 +11532,8 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
        vmx_set_cr4(vcpu, vmcs12->guest_cr4);
        vmcs_writel(CR4_READ_SHADOW, nested_read_cr4(vmcs12));

-       if (vmx->nested.nested_run_pending &&
-           (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_EFER))
-               vcpu->arch.efer = vmcs12->guest_ia32_efer;
-       else if (vmcs12->vm_entry_controls & VM_ENTRY_IA32E_MODE)
-               vcpu->arch.efer |= (EFER_LMA | EFER_LME);
-       else
-               vcpu->arch.efer &= ~(EFER_LMA | EFER_LME);
-       /* Note: modifies VM_ENTRY/EXIT_CONTROLS and GUEST/HOST_IA32_EFER */
+       vcpu->arch.efer = guest_efer;
+       /* Note: may modify VM_ENTRY/EXIT_CONTROLS and GUEST/HOST_IA32_EFER */
        vmx_set_efer(vcpu, vcpu->arch.efer);

        /*

From patchwork Tue Jul 31 15:32:06 2018
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 10550955
From: Sean Christopherson
To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com
Cc: jmattson@google.com, Sean Christopherson
Subject: [PATCH 07/16] KVM: nVMX: rename enter_vmx_non_root_mode to nested_vmx_enter_non_root_mode
Date: Tue, 31 Jul 2018 08:32:06 -0700
Message-Id: <20180731153215.31794-8-sean.j.christopherson@intel.com>

...to be more consistent with the nested VMX nomenclature.

Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 24a25e7d4b40..f3b8cc803025 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11765,7 +11765,7 @@ static int check_vmentry_postreqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
        return 0;
 }

-static int enter_vmx_non_root_mode(struct kvm_vcpu *vcpu)
+static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu)
 {
        struct vcpu_vmx *vmx = to_vmx(vcpu);
        struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
@@ -11888,7 +11888,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
         */
        vmx->nested.nested_run_pending = 1;

-       ret = enter_vmx_non_root_mode(vcpu);
+       ret = nested_vmx_enter_non_root_mode(vcpu);
        if (ret) {
                vmx->nested.nested_run_pending = 0;
                return ret;
@@ -12983,7 +12983,7 @@ static int vmx_pre_leave_smm(struct kvm_vcpu *vcpu, u64 smbase)

        if (vmx->nested.smm.guest_mode) {
                vcpu->arch.hflags &= ~HF_SMM_MASK;
-               ret = enter_vmx_non_root_mode(vcpu);
+               ret = nested_vmx_enter_non_root_mode(vcpu);
                vcpu->arch.hflags |= HF_SMM_MASK;
                if (ret)
                        return ret;

From patchwork Tue Jul 31 15:32:07 2018
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 10550929
From: Sean Christopherson
To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com
Cc: jmattson@google.com, Sean Christopherson
Subject: [PATCH 08/16] KVM: nVMX: move check_vmentry_postreqs() call to nested_vmx_enter_non_root_mode()
Date: Tue, 31 Jul 2018 08:32:07 -0700
Message-Id: <20180731153215.31794-9-sean.j.christopherson@intel.com>

In preparation for supporting checkpoint/restore for nested state,
commit ca0bde28f2ed ("kvm: nVMX: Split VMCS checks from
nested_vmx_run()") modified check_vmentry_postreqs() to only perform
the guest EFER consistency checks when nested_run_pending is true.
But, in the normal nested VMEntry flow, nested_run_pending is only
set after check_vmentry_postreqs(), i.e. the consistency check is
being skipped.

Alternatively, nested_run_pending could be set prior to calling
check_vmentry_postreqs() in nested_vmx_run(), but placing the
consistency checks in nested_vmx_enter_non_root_mode() allows us to
split prepare_vmcs02() and interleave the preparation with the
consistency checks without having to change the call sites of
nested_vmx_enter_non_root_mode().  In other words, the rest of the
consistency check code in nested_vmx_run() will be joining the
postreqs checks in future patches.

Fixes: ca0bde28f2ed ("kvm: nVMX: Split VMCS checks from nested_vmx_run()")
Signed-off-by: Sean Christopherson
Cc: Jim Mattson
---
 arch/x86/kvm/vmx.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f3b8cc803025..c01ae6e7c323 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11769,10 +11769,22 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu)
 {
        struct vcpu_vmx *vmx = to_vmx(vcpu);
        struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+       bool is_vmentry = vmx->nested.nested_run_pending;
        u32 msr_entry_idx;
        u32 exit_qual;
        int r;

+       if (!is_vmentry)
+               goto enter_non_root_mode;
+
+       r = check_vmentry_postreqs(vcpu, vmcs12, &exit_qual);
+       if (r) {
+               nested_vmx_entry_failure(vcpu, vmcs12,
+                                        EXIT_REASON_INVALID_STATE, exit_qual);
+               return 1;
+       }
+
+enter_non_root_mode:
        enter_guest_mode(vcpu);

        if (!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
@@ -11823,7 +11835,6 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
        struct vmcs12 *vmcs12;
        struct vcpu_vmx *vmx = to_vmx(vcpu);
        u32 interrupt_shadow = vmx_get_interrupt_shadow(vcpu);
-       u32 exit_qual;
        int ret;

        if (!nested_vmx_check_permission(vcpu))
@@ -11875,13 +11886,6 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
         */
        skip_emulated_instruction(vcpu);

-       ret = check_vmentry_postreqs(vcpu, vmcs12, &exit_qual);
-       if (ret) {
-               nested_vmx_entry_failure(vcpu, vmcs12,
-                                        EXIT_REASON_INVALID_STATE, exit_qual);
-               return 1;
-       }
-
        /*
         * We're finally done with prerequisite checking, and can start with
         * the nested entry.
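The resulting flow from the patch above, sketched with simplified
types and return conventions (nested_run_pending distinguishes a real
VMLAUNCH/VMRESUME from a restore of checkpointed nested state, which
must not re-run the checks):

#include <stdbool.h>

struct vcpu {
        bool nested_run_pending;        /* true for a real VMLAUNCH/VMRESUME */
};

/* Stand-ins for the real helpers; bodies elided.  0 means success. */
static int check_vmentry_postreqs(struct vcpu *v) { (void)v; return 0; }
static void build_vmcs02(struct vcpu *v) { (void)v; }

static int enter_non_root_mode(struct vcpu *v)
{
        /* guest-state checks run only on a real VMEnter */
        if (v->nested_run_pending && check_vmentry_postreqs(v))
                return 1;       /* synthesize a consistency-check VMExit */

        build_vmcs02(v);
        return 0;
}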
From patchwork Tue Jul 31 15:32:08 2018
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 10550951
From: Sean Christopherson
To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com
Cc: jmattson@google.com, Sean Christopherson
Subject: [PATCH 09/16] KVM: nVMX: assimilate nested_vmx_entry_failure() into nested_vmx_enter_non_root_mode()
Date: Tue, 31 Jul 2018 08:32:08 -0700
Message-Id: <20180731153215.31794-10-sean.j.christopherson@intel.com>

In addition to consolidating code, removing nested_vmx_entry_failure()
eliminates a confusing function name and label.  For a VMEntry, "fail"
and its derivatives have a very specific meaning due to the different
behavior of a VMEnter VMFail versus VMExit, i.e. a more appropriate
name for nested_vmx_entry_failure() would have been something like
nested_vmx_entry_consistency_check_vmexit().
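For reference, the exit-reason encoding used on this path: a
consistency-check VMExit reports the basic exit reason with bit 31
("VM-entry failure") set.  A minimal model:

#include <stdint.h>

#define VMX_EXIT_REASONS_FAILED_VMENTRY 0x80000000u
#define EXIT_REASON_INVALID_STATE       33
#define EXIT_REASON_MSR_LOAD_FAIL       34

static uint32_t failed_vmentry_reason(uint32_t basic_reason)
{
        /* e.g. failed_vmentry_reason(EXIT_REASON_INVALID_STATE) */
        return basic_reason | VMX_EXIT_REASONS_FAILED_VMENTRY;
}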
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx.c | 63 ++++++++++++++++++----------------------------
 1 file changed, 25 insertions(+), 38 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c01ae6e7c323..52c29d12d0e8 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1805,9 +1805,6 @@ static inline bool is_nmi(u32 intr_info)
 static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
                              u32 exit_intr_info,
                              unsigned long exit_qualification);
-static void nested_vmx_entry_failure(struct kvm_vcpu *vcpu,
-                                    struct vmcs12 *vmcs12,
-                                    u32 reason, unsigned long qualification);

 static int __find_msr_index(struct vcpu_vmx *vmx, u32 msr)
 {
@@ -11765,24 +11762,23 @@ static int check_vmentry_postreqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
        return 0;
 }

+static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
+                                  struct vmcs12 *vmcs12);
+
 static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu)
 {
        struct vcpu_vmx *vmx = to_vmx(vcpu);
        struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
        bool is_vmentry = vmx->nested.nested_run_pending;
+       u32 exit_reason = EXIT_REASON_INVALID_STATE;
        u32 msr_entry_idx;
        u32 exit_qual;
-       int r;

        if (!is_vmentry)
                goto enter_non_root_mode;

-       r = check_vmentry_postreqs(vcpu, vmcs12, &exit_qual);
-       if (r) {
-               nested_vmx_entry_failure(vcpu, vmcs12,
-                                        EXIT_REASON_INVALID_STATE, exit_qual);
-               return 1;
-       }
+       if (check_vmentry_postreqs(vcpu, vmcs12, &exit_qual))
+               goto consistency_check_vmexit;

 enter_non_root_mode:
        enter_guest_mode(vcpu);
@@ -11796,13 +11792,12 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu)
        if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING)
                vcpu->arch.tsc_offset += vmcs12->tsc_offset;

-       r = EXIT_REASON_INVALID_STATE;
        if (prepare_vmcs02(vcpu, vmcs12, &exit_qual))
                goto fail;

        nested_get_vmcs12_pages(vcpu, vmcs12);

-       r = EXIT_REASON_MSR_LOAD_FAIL;
+       exit_reason = EXIT_REASON_MSR_LOAD_FAIL;
        msr_entry_idx = nested_vmx_load_msr(vcpu,
                                            vmcs12->vm_entry_msr_load_addr,
                                            vmcs12->vm_entry_msr_load_count);
@@ -11822,7 +11817,24 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu)
                vcpu->arch.tsc_offset -= vmcs12->tsc_offset;
        leave_guest_mode(vcpu);
        vmx_switch_vmcs(vcpu, &vmx->vmcs01);
-       nested_vmx_entry_failure(vcpu, vmcs12, r, exit_qual);
+
+       /*
+        * A consistency check VMExit during L1's VMEnter to L2 is a subset
+        * of a normal VMexit, as explained in 23.7 "VM-entry failures during
+        * or after loading guest state" (this also lists the acceptable exit-
+        * reason and exit-qualification parameters).
+        */
+consistency_check_vmexit:
+       vm_entry_controls_reset_shadow(vmx);
+       vm_exit_controls_reset_shadow(vmx);
+       vmx_segment_cache_clear(vmx);
+
+       load_vmcs12_host_state(vcpu, vmcs12);
+       vmcs12->vm_exit_reason = exit_reason | VMX_EXIT_REASONS_FAILED_VMENTRY;
+       vmcs12->exit_qualification = exit_qual;
+       nested_vmx_succeed(vcpu);
+       if (enable_shadow_vmcs)
+               vmx->nested.sync_shadow_vmcs = true;
        return 1;
 }

@@ -12533,31 +12545,6 @@ static void vmx_leave_nested(struct kvm_vcpu *vcpu)
        free_nested(to_vmx(vcpu));
 }

-/*
- * L1's failure to enter L2 is a subset of a normal exit, as explained in
- * 23.7 "VM-entry failures during or after loading guest state" (this also
- * lists the acceptable exit-reason and exit-qualification parameters).
- * It should only be called before L2 actually succeeded to run, and when
- * vmcs01 is current (it doesn't leave_guest_mode() or switch vmcss).
- */
-static void nested_vmx_entry_failure(struct kvm_vcpu *vcpu,
-                                    struct vmcs12 *vmcs12,
-                                    u32 reason, unsigned long qualification)
-{
-       struct vcpu_vmx *vmx = to_vmx(vcpu);
-
-       vm_entry_controls_reset_shadow(vmx);
-       vm_exit_controls_reset_shadow(vmx);
-       vmx_segment_cache_clear(vmx);
-
-       load_vmcs12_host_state(vcpu, vmcs12);
-       vmcs12->vm_exit_reason = reason | VMX_EXIT_REASONS_FAILED_VMENTRY;
-       vmcs12->exit_qualification = qualification;
-       nested_vmx_succeed(vcpu);
-       if (enable_shadow_vmcs)
-               vmx->nested.sync_shadow_vmcs = true;
-}
-
 static int vmx_check_intercept(struct kvm_vcpu *vcpu,
                               struct x86_instruction_info *info,
                               enum x86_intercept_stage stage)

From patchwork Tue Jul 31 15:32:09 2018
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 10550927
From: Sean Christopherson
To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com
Cc: jmattson@google.com, Sean Christopherson
Subject: [PATCH 10/16] KVM: nVMX: split pieces of prepare_vmcs02() to prepare_vmcs02_early()
Date: Tue, 31 Jul 2018 08:32:09 -0700
Message-Id: <20180731153215.31794-11-sean.j.christopherson@intel.com>

Add prepare_vmcs02_early() and move pieces of prepare_vmcs02() to the
new function.  prepare_vmcs02_early() writes the bits of vmcs02 that
a) must be in place to pass the VMFail consistency checks (assuming
vmcs12 is valid) and b) are needed to recover from a VMExit, e.g.
host state that is loaded on VMExit.

Splitting the functionality will enable KVM to leverage hardware to
do VMFail consistency checks via a dry run of VMEnter and recover
from a potential VMExit without having to fully initialize vmcs02.

Add prepare_vmcs02_first_run() to handle writing vmcs02 state that
comes from vmcs01 and never changes, i.e. we don't need to rewrite
any of the vmcs02 that is effectively constant once defined.
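A sketch of the intended split, with signatures simplified relative
to the patch and bodies reduced to stubs:

struct vcpu_vmx;        /* opaque for this sketch */
struct vmcs12;

static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
{
        (void)vmx; (void)vmcs12;        /* controls + host state, see patch */
}

static int prepare_vmcs02(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
{
        (void)vmx; (void)vmcs12;        /* the bulk of guest state */
        return 0;
}

static int enter_non_root_mode(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
{
        prepare_vmcs02_early(vmx, vmcs12);  /* enough for hardware's VMFail checks
                                               and to survive a VMExit back to L1 */
        /* ...a future dry-run VMEnter could be attempted here... */
        return prepare_vmcs02(vmx, vmcs12); /* fill in the rest of vmcs02 */
}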
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx.c | 402 ++++++++++++++++++++++++---------------------
 1 file changed, 219 insertions(+), 183 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 52c29d12d0e8..6e7ee3637f12 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11154,10 +11154,223 @@ static u64 nested_vmx_calc_efer(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
        return vmx->vcpu.arch.efer & ~(EFER_LMA | EFER_LME);
 }

-static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
+static void prepare_vmcs02_first_run(struct vcpu_vmx *vmx)
 {
-       struct vcpu_vmx *vmx = to_vmx(vcpu);
+       /*
+        * If we have never launched vmcs02, set the constant vmcs02 state
+        * according to L0's settings (vmcs12 is irrelevant here).  Host
+        * fields that come from L0 and are not constant, e.g. HOST_CR3,
+        * will be set as needed prior to VMLAUNCH/VMRESUME.
+        */
+       if (vmx->nested.vmcs02.launched)
+               return;

+       /* All VMFUNCs are currently emulated through L0 vmexits. */
+       if (cpu_has_vmx_vmfunc())
+               vmcs_write64(VM_FUNCTION_CONTROL, 0);
+
+       if (cpu_has_vmx_posted_intr())
+               vmcs_write16(POSTED_INTR_NV, POSTED_INTR_NESTED_VECTOR);
+
+       if (cpu_has_vmx_msr_bitmap())
+               vmcs_write64(MSR_BITMAP, __pa(vmx->nested.vmcs02.msr_bitmap));
+
+       if (enable_pml)
+               vmcs_write64(PML_ADDRESS, page_to_phys(vmx->pml_pg));
+
+       /*
+        * Set the MSR load/store lists to match L0's settings.  Only the
+        * addresses are constant (for vmcs02), the counts can change based
+        * on L2's behavior, e.g. switching to/from long mode.
+        */
+       vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
+       vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host));
+       vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest));
+
+       vmx_set_constant_host_state(vmx);
+}
+
+static void prepare_vmcs02_early_full(struct vcpu_vmx *vmx,
+                                     struct vmcs12 *vmcs12)
+{
+       prepare_vmcs02_first_run(vmx);
+
+       vmcs_write64(VMCS_LINK_POINTER, -1ull);
+
+       if (enable_vpid) {
+               if (nested_cpu_has_vpid(vmcs12) && vmx->nested.vpid02)
+                       vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->nested.vpid02);
+               else
+                       vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
+       }
+}
+
+static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
+{
+       u32 exec_control, vmcs12_exec_ctrl;
+       u64 guest_efer = nested_vmx_calc_efer(vmx, vmcs12);
+
+       if (vmx->nested.dirty_vmcs12)
+               prepare_vmcs02_early_full(vmx, vmcs12);
+
+       /*
+        * HOST_RSP is normally set correctly in vmx_vcpu_run() just before
+        * entry, but only if the current (host) sp changed from the value
+        * we wrote last (vmx->host_rsp).  This cache is no longer relevant
+        * if we switch vmcs, and rather than hold a separate cache per vmcs,
+        * here we just force the write to happen on entry.
+        */
+       vmx->host_rsp = 0;
+
+       /*
+        * PIN CONTROLS
+        */
+       exec_control = vmcs12->pin_based_vm_exec_control;
+
+       /* Preemption timer setting is only taken from vmcs01. */
+       exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
+       exec_control |= vmcs_config.pin_based_exec_ctrl;
+       if (vmx->hv_deadline_tsc == -1)
+               exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
+       /* Posted interrupts setting is only taken from vmcs12. */
+       if (nested_cpu_has_posted_intr(vmcs12)) {
+               vmx->nested.posted_intr_nv = vmcs12->posted_intr_nv;
+               vmx->nested.pi_pending = false;
+       } else {
+               exec_control &= ~PIN_BASED_POSTED_INTR;
+       }
+       vmcs_write32(PIN_BASED_VM_EXEC_CONTROL, exec_control);
+
+       /*
+        * EXEC CONTROLS
+        */
+       exec_control = vmx_exec_control(vmx); /* L0's desires */
+       exec_control &= ~CPU_BASED_VIRTUAL_INTR_PENDING;
+       exec_control &= ~CPU_BASED_VIRTUAL_NMI_PENDING;
+       exec_control &= ~CPU_BASED_TPR_SHADOW;
+       exec_control |= vmcs12->cpu_based_vm_exec_control;
+
+       /*
+        * Write an illegal value to VIRTUAL_APIC_PAGE_ADDR.  Later, if
+        * nested_get_vmcs12_pages can't fix it up, the illegal value
+        * will result in a VM entry failure.
+        */
+       if (exec_control & CPU_BASED_TPR_SHADOW) {
+               vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, -1ull);
+               vmcs_write32(TPR_THRESHOLD, vmcs12->tpr_threshold);
+       } else {
+#ifdef CONFIG_X86_64
+               exec_control |= CPU_BASED_CR8_LOAD_EXITING |
+                               CPU_BASED_CR8_STORE_EXITING;
+#endif
+       }
+
+       /*
+        * A vmexit (to either L1 hypervisor or L0 userspace) is always needed
+        * for I/O port accesses.
+        */
+       exec_control &= ~CPU_BASED_USE_IO_BITMAPS;
+       exec_control |= CPU_BASED_UNCOND_IO_EXITING;
+       vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, exec_control);
+
+       /*
+        * SECONDARY EXEC CONTROLS
+        */
+       if (cpu_has_secondary_exec_ctrls()) {
+               exec_control = vmx->secondary_exec_control;
+
+               /* Take the following fields only from vmcs12 */
+               exec_control &= ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
+                                 SECONDARY_EXEC_ENABLE_INVPCID |
+                                 SECONDARY_EXEC_RDTSCP |
+                                 SECONDARY_EXEC_XSAVES |
+                                 SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
+                                 SECONDARY_EXEC_APIC_REGISTER_VIRT |
+                                 SECONDARY_EXEC_ENABLE_VMFUNC);
+               if (nested_cpu_has(vmcs12,
+                                  CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)) {
+                       vmcs12_exec_ctrl = vmcs12->secondary_vm_exec_control &
+                               ~SECONDARY_EXEC_ENABLE_PML;
+                       exec_control |= vmcs12_exec_ctrl;
+               }
+
+               if (exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY)
+                       vmcs_write16(GUEST_INTR_STATUS,
+                                    vmcs12->guest_intr_status);
+
+               /*
+                * Write an illegal value to APIC_ACCESS_ADDR. Later,
+                * nested_get_vmcs12_pages will either fix it up or
+                * remove the VM execution control.
+                */
+               if (exec_control & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)
+                       vmcs_write64(APIC_ACCESS_ADDR, -1ull);
+
+               vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
+       }
+
+       /*
+        * ENTRY CONTROLS
+        *
+        * vmcs12's VM_{ENTRY,EXIT}_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE
+        * are emulated by vmx_set_efer() in prepare_vmcs02(), but speculate
+        * on the related bits (if supported by the CPU) in the hope that
+        * we can avoid VMWrites during vmx_set_efer().
+        */
+       exec_control = (vmcs12->vm_entry_controls | vmcs_config.vmentry_ctrl) &
+                      ~VM_ENTRY_IA32E_MODE & ~VM_ENTRY_LOAD_IA32_EFER;
+       if (cpu_has_load_ia32_efer) {
+               if (guest_efer & EFER_LMA)
+                       exec_control |= VM_ENTRY_IA32E_MODE;
+               if (guest_efer != host_efer)
+                       exec_control |= VM_ENTRY_LOAD_IA32_EFER;
+       }
+       vm_entry_controls_init(vmx, exec_control);
+
+       /*
+        * EXIT CONTROLS
+        *
+        * L2->L1 exit controls are emulated - the hardware exit is to L0 so
+        * we should use its exit controls. Note that VM_EXIT_LOAD_IA32_EFER
+        * bits may be modified by vmx_set_efer() in prepare_vmcs02().
+        */
+       exec_control = vmcs_config.vmexit_ctrl;
+       if (cpu_has_load_ia32_efer && guest_efer != host_efer)
+               exec_control |= VM_EXIT_LOAD_IA32_EFER;
+       vm_exit_controls_init(vmx, exec_control);
+
+       /*
+        * Conceptually we want to copy the PML address and index from
+        * vmcs01 here, and then back to vmcs01 on nested vmexit.  But,
+        * since we always flush the log on each vmexit and never change
+        * the PML address (once set), this happens to be equivalent to
+        * simply resetting the index in vmcs02.
+        */
+       if (enable_pml)
+               vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1);
+
+       /*
+        * Interrupt/Exception Fields
+        */
+       if (vmx->nested.nested_run_pending) {
+               vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
+                            vmcs12->vm_entry_intr_info_field);
+               vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE,
+                            vmcs12->vm_entry_exception_error_code);
+               vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
+                            vmcs12->vm_entry_instruction_len);
+               vmcs_write32(GUEST_INTERRUPTIBILITY_INFO,
+                            vmcs12->guest_interruptibility_info);
+               vmx->loaded_vmcs->nmi_known_unmasked =
+                       !(vmcs12->guest_interruptibility_info & GUEST_INTR_STATE_NMI);
+       } else {
+               vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
+       }
+}
+
+static void prepare_vmcs02_full(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
+{
        vmcs_write16(GUEST_ES_SELECTOR, vmcs12->guest_es_selector);
        vmcs_write16(GUEST_SS_SELECTOR, vmcs12->guest_ss_selector);
        vmcs_write16(GUEST_DS_SELECTOR, vmcs12->guest_ds_selector);
@@ -11198,10 +11411,6 @@ static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
        if (nested_cpu_has_xsaves(vmcs12))
                vmcs_write64(XSS_EXIT_BITMAP, vmcs12->xss_exit_bitmap);

-       vmcs_write64(VMCS_LINK_POINTER, -1ull);
-
-       if (cpu_has_vmx_posted_intr())
-               vmcs_write16(POSTED_INTR_NV, POSTED_INTR_NESTED_VECTOR);

        /*
         * Whether page-faults are trapped is determined by a combination of
@@ -11222,10 +11431,6 @@ static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
        vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH,
                     enable_ept ? vmcs12->page_fault_error_code_match : 0);

-       /* All VMFUNCs are currently emulated through L0 vmexits. */
-       if (cpu_has_vmx_vmfunc())
-               vmcs_write64(VM_FUNCTION_CONTROL, 0);
-
        if (cpu_has_vmx_apicv()) {
                vmcs_write64(EOI_EXIT_BITMAP0, vmcs12->eoi_exit_bitmap0);
                vmcs_write64(EOI_EXIT_BITMAP1, vmcs12->eoi_exit_bitmap1);
@@ -11233,35 +11438,14 @@ static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
                vmcs_write64(EOI_EXIT_BITMAP3, vmcs12->eoi_exit_bitmap3);
        }

-       /*
-        * Set host-state according to L0's settings (vmcs12 is irrelevant here)
-        * Some constant fields are set here by vmx_set_constant_host_state().
-        * Other fields are different per CPU, and will be set later when
-        * vmx_vcpu_load() is called, and when vmx_save_host_state() is called.
-        */
-       vmx_set_constant_host_state(vmx);
-
-       /*
-        * Set the MSR load/store lists to match L0's settings.
-        */
-       vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
        vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.nr);
-       vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host));
        vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.nr);
-       vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest));

        set_cr4_guest_host_mask(vmx);

        if (vmx_mpx_supported())
                vmcs_write64(GUEST_BNDCFGS, vmcs12->guest_bndcfgs);

-       if (enable_vpid) {
-               if (nested_cpu_has_vpid(vmcs12) && vmx->nested.vpid02)
-                       vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->nested.vpid02);
-               else
-                       vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
-       }
-
        /*
         * L1 may access the L2's PDPTR, so save them to construct vmcs12
         */
@@ -11271,9 +11455,6 @@ static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
                vmcs_write64(GUEST_PDPTR2, vmcs12->guest_pdptr2);
                vmcs_write64(GUEST_PDPTR3, vmcs12->guest_pdptr3);
        }
-
-       if (cpu_has_vmx_msr_bitmap())
-               vmcs_write64(MSR_BITMAP, __pa(vmx->nested.vmcs02.msr_bitmap));
 }

 /*
@@ -11291,11 +11472,9 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
                          u32 *entry_failure_code)
 {
        struct vcpu_vmx *vmx = to_vmx(vcpu);
-       u32 exec_control, vmcs12_exec_ctrl;
-       u64 guest_efer;

        if (vmx->nested.dirty_vmcs12) {
-               prepare_vmcs02_full(vcpu, vmcs12);
+               prepare_vmcs02_full(vmx, vmcs12);
                vmx->nested.dirty_vmcs12 = false;
        }

@@ -11310,11 +11489,6 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
        vmcs_writel(GUEST_ES_BASE, vmcs12->guest_es_base);
        vmcs_writel(GUEST_CS_BASE, vmcs12->guest_cs_base);

-       /*
-        * Not in vmcs02: GUEST_PML_INDEX, HOST_FS_SELECTOR, HOST_GS_SELECTOR,
-        * HOST_FS_BASE, HOST_GS_BASE.
-        */
-
        if (vmx->nested.nested_run_pending &&
            (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)) {
                kvm_set_dr(vcpu, 7, vmcs12->guest_dr7);
@@ -11323,116 +11497,12 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
                kvm_set_dr(vcpu, 7, vcpu->arch.dr7);
                vmcs_write64(GUEST_IA32_DEBUGCTL, vmx->nested.vmcs01_debugctl);
        }
-       if (vmx->nested.nested_run_pending) {
-               vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
-                            vmcs12->vm_entry_intr_info_field);
-               vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE,
-                            vmcs12->vm_entry_exception_error_code);
-               vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
-                            vmcs12->vm_entry_instruction_len);
-               vmcs_write32(GUEST_INTERRUPTIBILITY_INFO,
-                            vmcs12->guest_interruptibility_info);
-               vmx->loaded_vmcs->nmi_known_unmasked =
-                       !(vmcs12->guest_interruptibility_info & GUEST_INTR_STATE_NMI);
-       } else {
-               vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
-       }
        vmx_set_rflags(vcpu, vmcs12->guest_rflags);

-       exec_control = vmcs12->pin_based_vm_exec_control;
-
-       /* Preemption timer setting is only taken from vmcs01. */
-       exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
-       exec_control |= vmcs_config.pin_based_exec_ctrl;
-       if (vmx->hv_deadline_tsc == -1)
-               exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
-
-       /* Posted interrupts setting is only taken from vmcs12. */
*/ - if (nested_cpu_has_posted_intr(vmcs12)) { - vmx->nested.posted_intr_nv = vmcs12->posted_intr_nv; - vmx->nested.pi_pending = false; - } else { - exec_control &= ~PIN_BASED_POSTED_INTR; - } - - vmcs_write32(PIN_BASED_VM_EXEC_CONTROL, exec_control); - vmx->nested.preemption_timer_expired = false; if (nested_cpu_has_preemption_timer(vmcs12)) vmx_start_preemption_timer(vcpu); - if (cpu_has_secondary_exec_ctrls()) { - exec_control = vmx->secondary_exec_control; - - /* Take the following fields only from vmcs12 */ - exec_control &= ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES | - SECONDARY_EXEC_ENABLE_INVPCID | - SECONDARY_EXEC_RDTSCP | - SECONDARY_EXEC_XSAVES | - SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY | - SECONDARY_EXEC_APIC_REGISTER_VIRT | - SECONDARY_EXEC_ENABLE_VMFUNC); - if (nested_cpu_has(vmcs12, - CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)) { - vmcs12_exec_ctrl = vmcs12->secondary_vm_exec_control & - ~SECONDARY_EXEC_ENABLE_PML; - exec_control |= vmcs12_exec_ctrl; - } - - if (exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) - vmcs_write16(GUEST_INTR_STATUS, - vmcs12->guest_intr_status); - - /* - * Write an illegal value to APIC_ACCESS_ADDR. Later, - * nested_get_vmcs12_pages will either fix it up or - * remove the VM execution control. - */ - if (exec_control & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES) - vmcs_write64(APIC_ACCESS_ADDR, -1ull); - - vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control); - } - - /* - * HOST_RSP is normally set correctly in vmx_vcpu_run() just before - * entry, but only if the current (host) sp changed from the value - * we wrote last (vmx->host_rsp). This cache is no longer relevant - * if we switch vmcs, and rather than hold a separate cache per vmcs, - * here we just force the write to happen on entry. - */ - vmx->host_rsp = 0; - - exec_control = vmx_exec_control(vmx); /* L0's desires */ - exec_control &= ~CPU_BASED_VIRTUAL_INTR_PENDING; - exec_control &= ~CPU_BASED_VIRTUAL_NMI_PENDING; - exec_control &= ~CPU_BASED_TPR_SHADOW; - exec_control |= vmcs12->cpu_based_vm_exec_control; - - /* - * Write an illegal value to VIRTUAL_APIC_PAGE_ADDR. Later, if - * nested_get_vmcs12_pages can't fix it up, the illegal value - * will result in a VM entry failure. - */ - if (exec_control & CPU_BASED_TPR_SHADOW) { - vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, -1ull); - vmcs_write32(TPR_THRESHOLD, vmcs12->tpr_threshold); - } else { -#ifdef CONFIG_X86_64 - exec_control |= CPU_BASED_CR8_LOAD_EXITING | - CPU_BASED_CR8_STORE_EXITING; -#endif - } - - /* - * A vmexit (to either L1 hypervisor or L0 userspace) is always needed - * for I/O port accesses. - */ - exec_control &= ~CPU_BASED_USE_IO_BITMAPS; - exec_control |= CPU_BASED_UNCOND_IO_EXITING; - - vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, exec_control); - /* EXCEPTION_BITMAP and CR0_GUEST_HOST_MASK should basically be the * bitwise-or of what L1 wants to trap for L2, and what we want to * trap. Note that CR0.TS also needs updating - we do this later. @@ -11441,30 +11511,6 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, vcpu->arch.cr0_guest_owned_bits &= ~vmcs12->cr0_guest_host_mask; vmcs_writel(CR0_GUEST_HOST_MASK, ~vcpu->arch.cr0_guest_owned_bits); - /* L2->L1 exit controls are emulated - the hardware exit is to L0 so - * we should use its exit controls. Note that VM_EXIT_LOAD_IA32_EFER - * bits are further modified by vmx_set_efer() below. 
- */ - vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl); - - /* - * vmcs12's VM_{ENTRY,EXIT}_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE - * are emulated by vmx_set_efer(), below, but speculate on the - * related bits (if supported by the CPU) in the hope that we can - * avoid VMWrites during vmx_set_efer(). - */ - exec_control = (vmcs12->vm_entry_controls | vmcs_config.vmentry_ctrl) & - ~VM_ENTRY_IA32E_MODE & ~VM_ENTRY_LOAD_IA32_EFER; - - guest_efer = nested_vmx_calc_efer(vmx, vmcs12); - if (cpu_has_load_ia32_efer) { - if (guest_efer & EFER_LMA) - exec_control |= VM_ENTRY_IA32E_MODE; - if (guest_efer != host_efer) - exec_control |= VM_ENTRY_LOAD_IA32_EFER; - } - vm_entry_controls_init(vmx, exec_control); - if (vmx->nested.nested_run_pending && (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PAT)) { vmcs_write64(GUEST_IA32_PAT, vmcs12->guest_ia32_pat); @@ -11497,18 +11543,6 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, } } - if (enable_pml) { - /* - * Conceptually we want to copy the PML address and index from - * vmcs01 here, and then back to vmcs01 on nested vmexit. But, - * since we always flush the log on each vmexit, this happens - * to be equivalent to simply resetting the fields in vmcs02. - */ - ASSERT(vmx->pml_pg); - vmcs_write64(PML_ADDRESS, page_to_phys(vmx->pml_pg)); - vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1); - } - if (nested_cpu_has_ept(vmcs12)) nested_ept_init_mmu_context(vcpu); else if (nested_cpu_has2(vmcs12, @@ -11529,7 +11563,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, vmx_set_cr4(vcpu, vmcs12->guest_cr4); vmcs_writel(CR4_READ_SHADOW, nested_read_cr4(vmcs12)); - vcpu->arch.efer = guest_efer; + vcpu->arch.efer = nested_vmx_calc_efer(vmx, vmcs12); /* Note: may modify VM_ENTRY/EXIT_CONTROLS and GUEST/HOST_IA32_EFER */ vmx_set_efer(vcpu, vcpu->arch.efer); @@ -11792,6 +11826,8 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu) if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) vcpu->arch.tsc_offset += vmcs12->tsc_offset; + prepare_vmcs02_early(vmx, vmcs12); + if (prepare_vmcs02(vcpu, vmcs12, &exit_qual)) goto fail; From patchwork Tue Jul 31 15:32:10 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 10550931 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 57148A748 for ; Tue, 31 Jul 2018 15:32:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 44BB32B181 for ; Tue, 31 Jul 2018 15:32:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 380A82B1A2; Tue, 31 Jul 2018 15:32:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C6BE92B181 for ; Tue, 31 Jul 2018 15:32:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732461AbeGaRN2 (ORCPT ); Tue, 31 Jul 2018 13:13:28 -0400 Received: from mga03.intel.com ([134.134.136.65]:23684 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org 
with ESMTP id S1732442AbeGaRN2 (ORCPT ); Tue, 31 Jul 2018 13:13:28 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 31 Jul 2018 08:32:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,427,1526367600"; d="scan'208";a="79342640" Received: from sjchrist-coffee.jf.intel.com ([10.54.74.132]) by orsmga002.jf.intel.com with ESMTP; 31 Jul 2018 08:32:20 -0700 From: Sean Christopherson To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com Cc: jmattson@google.com, Sean Christopherson Subject: [PATCH 11/16] KVM: nVMX: do early preparation of vmcs02 before check_vmentry_postreqs() Date: Tue, 31 Jul 2018 08:32:10 -0700 Message-Id: <20180731153215.31794-12-sean.j.christopherson@intel.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180731153215.31794-1-sean.j.christopherson@intel.com> References: <20180731153215.31794-1-sean.j.christopherson@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP In anticipation of using vmcs02 to do early consistency checks, move the early preparation of vmcs02 prior to checking the postreqs. The downside of this approach is that we'll unnecessarily load vmcs02 in the case that check_vmentry_postreqs() fails, but that is essentially our slow path anyway (not actually slow, but it's the path we don't really care about optimizing). Signed-off-by: Sean Christopherson --- arch/x86/kvm/vmx.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 6e7ee3637f12..d5328518b1c6 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -11808,6 +11808,15 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu) u32 msr_entry_idx; u32 exit_qual; + if (!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)) + vmx->nested.vmcs01_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL); + + vmx_switch_vmcs(vcpu, &vmx->nested.vmcs02); + + prepare_vmcs02_early(vmx, vmcs12); + + nested_get_vmcs12_pages(vcpu, vmcs12); + if (!is_vmentry) goto enter_non_root_mode; @@ -11817,22 +11826,14 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu) enter_non_root_mode: enter_guest_mode(vcpu); - if (!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)) - vmx->nested.vmcs01_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL); - - vmx_switch_vmcs(vcpu, &vmx->nested.vmcs02); vmx_segment_cache_clear(vmx); if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) vcpu->arch.tsc_offset += vmcs12->tsc_offset; - prepare_vmcs02_early(vmx, vmcs12); - if (prepare_vmcs02(vcpu, vmcs12, &exit_qual)) goto fail; - nested_get_vmcs12_pages(vcpu, vmcs12); - exit_reason = EXIT_REASON_MSR_LOAD_FAIL; msr_entry_idx = nested_vmx_load_msr(vcpu, vmcs12->vm_entry_msr_load_addr, @@ -11852,7 +11853,6 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu) if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) vcpu->arch.tsc_offset -= vmcs12->tsc_offset; leave_guest_mode(vcpu); - vmx_switch_vmcs(vcpu, &vmx->vmcs01); /* * A consistency check VMExit during L1's VMEnter to L2 is a subset * of a normal VMexit, as explained in 23.7 "VM-entry failures during @@ -11861,6 +11861,7 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu) * reason and exit-qualification parameters).
*/ consistency_check_vmexit: + vmx_switch_vmcs(vcpu, &vmx->vmcs01); vm_entry_controls_reset_shadow(vmx); vm_exit_controls_reset_shadow(vmx); vmx_segment_cache_clear(vmx); From patchwork Tue Jul 31 15:32:11 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 10550945 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 636E913B8 for ; Tue, 31 Jul 2018 15:32:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 528DC28F03 for ; Tue, 31 Jul 2018 15:32:54 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4622E295EC; Tue, 31 Jul 2018 15:32:54 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C331528F03 for ; Tue, 31 Jul 2018 15:32:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732499AbeGaRNn (ORCPT ); Tue, 31 Jul 2018 13:13:43 -0400 Received: from mga03.intel.com ([134.134.136.65]:23684 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732308AbeGaRN1 (ORCPT ); Tue, 31 Jul 2018 13:13:27 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 31 Jul 2018 08:32:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,427,1526367600"; d="scan'208";a="79342641" Received: from sjchrist-coffee.jf.intel.com ([10.54.74.132]) by orsmga002.jf.intel.com with ESMTP; 31 Jul 2018 08:32:20 -0700 From: Sean Christopherson To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com Cc: jmattson@google.com, Sean Christopherson Subject: [PATCH 12/16] KVM: nVMX: refactor inner section of nested_vmx_enter_non_root_mode() Date: Tue, 31 Jul 2018 08:32:11 -0700 Message-Id: <20180731153215.31794-13-sean.j.christopherson@intel.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180731153215.31794-1-sean.j.christopherson@intel.com> References: <20180731153215.31794-1-sean.j.christopherson@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Move nested_vmx_load_msr() into prepare_vmcs02() and remove the "fail" label now that the inner section of nested_vmx_enter_non_root_mode() has a single error path. Move the call to vmx_segment_cache_clear() above the call to enter_guest_mode() so that it is explicitly clear that the state being restored in the prepare_vmcs02() error path is the state that is set immediately before the call to prepare_vmcs02. Eliminating the "fail" label resolves the oddity of having multiple entry points into the consistency check VMExit handling in nested_vmx_enter_non_root_mode() and reduces the probability of breaking the related code (speaking from experience). Pass exit_reason to check_vmentry_postreqs() and prepare_vmcs02(), and unconditionally set exit_reason and exit_qual in both functions. 
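For illustration, a condensed sketch of the resulting error-reporting contract (the real prepare_vmcs02() also prepares all of vmcs02's guest state, elided here; see the diff below):

static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
                          u32 *exit_reason, u32 *entry_failure_code)
{
        /* Written unconditionally so no error path can leak stale values. */
        *exit_reason = EXIT_REASON_INVALID_STATE;
        *entry_failure_code = ENTRY_FAIL_DEFAULT;

        /* ... prepare vmcs02 guest state, returning 1 on failure ... */

        if (nested_vmx_load_msr(vcpu, vmcs12->vm_entry_msr_load_addr,
                                vmcs12->vm_entry_msr_load_count)) {
                *exit_reason = EXIT_REASON_MSR_LOAD_FAIL;
                return 1;
        }
        return 0;
}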
This fixes a subtle bug where exit_qual would be used uninitialized when vmx->nested.nested_run_pending is false and the MSR load fails, and hopefully prevents such issues in the future, e.g. a similar bug from the past: https://www.spinics.net/lists/kvm/msg165988.html. Signed-off-by: Sean Christopherson --- arch/x86/kvm/vmx.c | 49 ++++++++++++++++++++++------------------------ 1 file changed, 23 insertions(+), 26 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index d5328518b1c6..7c3bdd9deb63 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -11469,10 +11469,13 @@ static void prepare_vmcs02_full(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12) * is assigned to entry_failure_code on failure. */ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, - u32 *entry_failure_code) + u32 *exit_reason, u32 *entry_failure_code) { struct vcpu_vmx *vmx = to_vmx(vcpu); + *exit_reason = EXIT_REASON_INVALID_STATE; + *entry_failure_code = ENTRY_FAIL_DEFAULT; + if (vmx->nested.dirty_vmcs12) { prepare_vmcs02_full(vmx, vmcs12); vmx->nested.dirty_vmcs12 = false; @@ -11572,10 +11575,8 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, * which means L1 attempted VMEntry to L2 with invalid state. * Fail the VMEntry. */ - if (vmx->emulation_required) { - *entry_failure_code = ENTRY_FAIL_DEFAULT; + if (vmx->emulation_required) return 1; - } /* Shadow page tables on either EPT or shadow page tables. */ if (nested_vmx_load_cr3(vcpu, vmcs12->guest_cr3, nested_cpu_has_ept(vmcs12), @@ -11587,6 +11588,12 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, kvm_register_write(vcpu, VCPU_REGS_RSP, vmcs12->guest_rsp); kvm_register_write(vcpu, VCPU_REGS_RIP, vmcs12->guest_rip); + + if (nested_vmx_load_msr(vcpu, vmcs12->vm_entry_msr_load_addr, + vmcs12->vm_entry_msr_load_count)) { + *exit_reason = EXIT_REASON_MSR_LOAD_FAIL; + return 1; + } return 0; } @@ -11753,10 +11760,11 @@ static int check_vmentry_prereqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12) } static int check_vmentry_postreqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, - u32 *exit_qual) + u32 *exit_reason, u32 *exit_qual) { bool ia32e; + *exit_reason = EXIT_REASON_INVALID_STATE; *exit_qual = ENTRY_FAIL_DEFAULT; if (!nested_guest_cr0_valid(vcpu, vmcs12->guest_cr0) || @@ -11804,9 +11812,7 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu) struct vcpu_vmx *vmx = to_vmx(vcpu); struct vmcs12 *vmcs12 = get_vmcs12(vcpu); bool is_vmentry = vmx->nested.nested_run_pending; - u32 exit_reason = EXIT_REASON_INVALID_STATE; - u32 msr_entry_idx; - u32 exit_qual; + u32 exit_qual, exit_reason; if (!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)) vmx->nested.vmcs01_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL); @@ -11820,26 +11826,22 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu) if (!is_vmentry) goto enter_non_root_mode; - if (check_vmentry_postreqs(vcpu, vmcs12, &exit_qual)) + if (check_vmentry_postreqs(vcpu, vmcs12, &exit_reason, &exit_qual)) goto consistency_check_vmexit; enter_non_root_mode: + vmx_segment_cache_clear(vmx); + enter_guest_mode(vcpu); - - vmx_segment_cache_clear(vmx); - if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) vcpu->arch.tsc_offset += vmcs12->tsc_offset; - if (prepare_vmcs02(vcpu, vmcs12, &exit_qual)) - goto fail; - - exit_reason = EXIT_REASON_MSR_LOAD_FAIL; - msr_entry_idx = nested_vmx_load_msr(vcpu, - vmcs12->vm_entry_msr_load_addr, - vmcs12->vm_entry_msr_load_count); - if (msr_entry_idx) - goto 
fail; + if (prepare_vmcs02(vcpu, vmcs12, &exit_reason, &exit_qual)) { + if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) + vcpu->arch.tsc_offset -= vmcs12->tsc_offset; + leave_guest_mode(vcpu); + goto consistency_check_vmexit; + } /* * Note no nested_vmx_succeed or nested_vmx_fail here. At this point @@ -11849,11 +11851,6 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu) */ return 0; -fail: - if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) - vcpu->arch.tsc_offset -= vmcs12->tsc_offset; - leave_guest_mode(vcpu); - /* * A consistency check VMExit during L1's VMEnter to L2 is a subset * of a normal VMexit, as explained in 23.7 "VM-entry failures during From patchwork Tue Jul 31 15:32:12 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 10550947 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D861796FA for ; Tue, 31 Jul 2018 15:32:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C876F28F03 for ; Tue, 31 Jul 2018 15:32:54 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BBFDE293CB; Tue, 31 Jul 2018 15:32:54 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5FD91295FA for ; Tue, 31 Jul 2018 15:32:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732496AbeGaRNn (ORCPT ); Tue, 31 Jul 2018 13:13:43 -0400 Received: from mga03.intel.com ([134.134.136.65]:23683 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732432AbeGaRN1 (ORCPT ); Tue, 31 Jul 2018 13:13:27 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 31 Jul 2018 08:32:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,427,1526367600"; d="scan'208";a="79342643" Received: from sjchrist-coffee.jf.intel.com ([10.54.74.132]) by orsmga002.jf.intel.com with ESMTP; 31 Jul 2018 08:32:20 -0700 From: Sean Christopherson To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com Cc: jmattson@google.com, Sean Christopherson Subject: [PATCH 13/16] KVM: nVMX: do not skip VMEnter instruction that succeeds Date: Tue, 31 Jul 2018 08:32:12 -0700 Message-Id: <20180731153215.31794-14-sean.j.christopherson@intel.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180731153215.31794-1-sean.j.christopherson@intel.com> References: <20180731153215.31794-1-sean.j.christopherson@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP A successful VMEnter is essentially a fancy indirect branch that pulls the target RIP from the VMCS. 
Skipping the instruction is unnecessary (RIP will get overwritten by the VMExit handler) and is problematic because it can incorrectly suppress a #DB due to EFLAGS.TF when a VMFail is detected by hardware (happens after we skip the instruction). Now that nested_vmx_run() is not prematurely skipping the instruction, use the full kvm_skip_emulated_instruction() in the VMFail path of nested_vmx_vmexit(). We also need to explicitly update GUEST_INTERRUPTIBILITY_INFO when loading vmcs12 host state. Signed-off-by: Sean Christopherson --- arch/x86/kvm/vmx.c | 21 +++++----------------- 1 file changed, 5 insertions(+), 16 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 7c3bdd9deb63..278d259dbc3e 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -11923,15 +11923,6 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) goto out; } - /* - * After this point, the trap flag no longer triggers a singlestep trap - * on the vm entry instructions; don't call kvm_skip_emulated_instruction. - * This is not 100% correct; for performance reasons, we delegate most - * of the checks on host state to the processor. If those fail, - * the singlestep trap is missed. - */ - skip_emulated_instruction(vcpu); - /* * We're finally done with prerequisite checking, and can start with * the nested entry. @@ -12308,6 +12299,8 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu, kvm_register_write(vcpu, VCPU_REGS_RSP, vmcs12->host_rsp); kvm_register_write(vcpu, VCPU_REGS_RIP, vmcs12->host_rip); vmx_set_rflags(vcpu, X86_EFLAGS_FIXED); + vmx_set_interrupt_shadow(vcpu, 0); + /* * Note that calling vmx_set_cr0 is important, even if cr0 hasn't * actually changed, because vmx_set_cr0 refers to efer set above. @@ -12552,18 +12545,14 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason, * in L1 which thinks it just finished a VMLAUNCH or * VMRESUME instruction, so we need to set the failure * flag and the VM-instruction error field of the VMCS - * accordingly. + * accordingly, and skip the emulated instruction. */ nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD); + kvm_skip_emulated_instruction(vcpu); + load_vmcs12_mmu_host_state(vcpu, vmcs12); - /* - * The emulated instruction was already skipped in - * nested_vmx_run, but the updated RIP was never - * written back to the vmcs01.
- */ - skip_emulated_instruction(vcpu); vmx->fail = 0; } From patchwork Tue Jul 31 15:32:13 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 10550933 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2C9D713B8 for ; Tue, 31 Jul 2018 15:32:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1BE2A2B19E for ; Tue, 31 Jul 2018 15:32:40 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0FF392B1A8; Tue, 31 Jul 2018 15:32:40 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 515142B19A for ; Tue, 31 Jul 2018 15:32:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732446AbeGaRN1 (ORCPT ); Tue, 31 Jul 2018 13:13:27 -0400 Received: from mga03.intel.com ([134.134.136.65]:23685 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732425AbeGaRN1 (ORCPT ); Tue, 31 Jul 2018 13:13:27 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 31 Jul 2018 08:32:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,427,1526367600"; d="scan'208";a="79342646" Received: from sjchrist-coffee.jf.intel.com ([10.54.74.132]) by orsmga002.jf.intel.com with ESMTP; 31 Jul 2018 08:32:20 -0700 From: Sean Christopherson To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com Cc: jmattson@google.com, Sean Christopherson Subject: [PATCH 14/16] KVM: vmx: write HOST_IA32_EFER in vmx_set_constant_host_state() Date: Tue, 31 Jul 2018 08:32:13 -0700 Message-Id: <20180731153215.31794-15-sean.j.christopherson@intel.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180731153215.31794-1-sean.j.christopherson@intel.com> References: <20180731153215.31794-1-sean.j.christopherson@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP EFER is constant in the host and writing it once during setup means we can skip writing the host value in add_atomic_switch_msr_special(). 
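For illustration, a minimal sketch of the idea, condensed from the diff below (host_efer is the host EFER value KVM caches at setup):

static void vmx_set_constant_host_state(struct vcpu_vmx *vmx)
{
        /* ... other constant host fields elided ... */

        /*
         * Host EFER never changes at runtime, so write it once per VMCS
         * rather than on every atomic MSR switch.
         */
        if (cpu_has_load_ia32_efer)
                vmcs_write64(HOST_IA32_EFER, host_efer);
}

With the constant value already in place, add_atomic_switch_msr_special() can simply skip the VMWRITE when the target field is HOST_IA32_EFER.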
Signed-off-by: Sean Christopherson --- arch/x86/kvm/vmx.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 278d259dbc3e..af65477fe317 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2417,7 +2417,8 @@ static void add_atomic_switch_msr_special(struct vcpu_vmx *vmx, u64 guest_val, u64 host_val) { vmcs_write64(guest_val_vmcs, guest_val); - vmcs_write64(host_val_vmcs, host_val); + if (host_val_vmcs != HOST_IA32_EFER) + vmcs_write64(host_val_vmcs, host_val); vm_entry_controls_setbit(vmx, entry); vm_exit_controls_setbit(vmx, exit); } @@ -5948,6 +5949,9 @@ static void vmx_set_constant_host_state(struct vcpu_vmx *vmx) rdmsr(MSR_IA32_CR_PAT, low32, high32); vmcs_write64(HOST_IA32_PAT, low32 | ((u64) high32 << 32)); } + + if (cpu_has_load_ia32_efer) + vmcs_write64(HOST_IA32_EFER, host_efer); } static void set_cr4_guest_host_mask(struct vcpu_vmx *vmx) From patchwork Tue Jul 31 15:32:14 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 10550939 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 56BA014E2 for ; Tue, 31 Jul 2018 15:32:43 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 451992B19A for ; Tue, 31 Jul 2018 15:32:43 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 398032B1A2; Tue, 31 Jul 2018 15:32:43 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7E9772B19A for ; Tue, 31 Jul 2018 15:32:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732464AbeGaRNc (ORCPT ); Tue, 31 Jul 2018 13:13:32 -0400 Received: from mga03.intel.com ([134.134.136.65]:23683 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732448AbeGaRN2 (ORCPT ); Tue, 31 Jul 2018 13:13:28 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 31 Jul 2018 08:32:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,427,1526367600"; d="scan'208";a="79342648" Received: from sjchrist-coffee.jf.intel.com ([10.54.74.132]) by orsmga002.jf.intel.com with ESMTP; 31 Jul 2018 08:32:20 -0700 From: Sean Christopherson To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com Cc: jmattson@google.com, Sean Christopherson Subject: [PATCH 15/16] KVM: nVMX: add option to perform early consistency checks via H/W Date: Tue, 31 Jul 2018 08:32:14 -0700 Message-Id: <20180731153215.31794-16-sean.j.christopherson@intel.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180731153215.31794-1-sean.j.christopherson@intel.com> References: <20180731153215.31794-1-sean.j.christopherson@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP For performance reasons, KVM defers many consistency checks to the CPU, including 
checks that result in VMFail (as opposed to VMExit). This behavior may be undesirable for some users since KVM detects certain classes of VMFail only after it has processed guest state, e.g. emulated MSR load-on-entry. Because there is a strict ordering between checks that cause VMFail and those that cause VMExit, i.e. all VMFail checks are performed before any checks that cause VMExit, we can detect all VMFail conditions via a dry run of sorts. After preparing vmcs02 with all state needed to pass the VMFail consistency checks, optionally do a "test" VMEnter with an invalid GUEST_RIP. If the VMEnter results in a VMExit (due to bad guest state), then we can safely say that the nested VMEnter should not VMFail, i.e. any VMFail encountered in nested_vmx_vmexit() must be due to an L0 bug. Add a module param, early_consistency_checks, to provide control over whether or not VMX performs the early consistency checks. In addition to standard on/off behavior, the param accepts a value of -1, which is essentially an "auto" setting whereby KVM does the early checks only when it thinks it's running on bare metal. When running nested, doing early checks is of dubious value since the resulting behavior is heavily dependent on L0. Unfortunately, since the "passing" case causes a VMExit, KVM must be extra diligent to ensure that host state is restored, e.g. DR7 and RFLAGS are reset on VMExit. Failure to restore RFLAGS.IF is particularly fatal. And of course the extra VMEnter and VMExit impact performance. The raw overhead of the early consistency checks is ~6% on modern hardware (though this could easily vary based on configuration), while the added latency observed from the L1 VMM is ~10%. The early consistency checks do not occur in a vacuum, e.g. spending more time in L0 can lead to more interrupts being serviced while emulating VMEnter, thereby increasing the latency observed by L1. Signed-off-by: Sean Christopherson --- arch/x86/kvm/vmx.c | 152 +++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 146 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index af65477fe317..92baa2573825 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -53,6 +53,7 @@ #include #include #include +#include <asm/hypervisor.h> #include "trace.h" #include "pmu.h" @@ -109,6 +110,13 @@ module_param_named(enable_shadow_vmcs, enable_shadow_vmcs, bool, S_IRUGO); static bool __read_mostly nested = 0; module_param(nested, bool, S_IRUGO); +/* + * early_consistency_checks is a tri-state: -1 == "native only", + * 0 == "never" and 1 == "always". + */ +static int __read_mostly early_consistency_checks = -1; +module_param(early_consistency_checks, int, S_IRUGO); + static u64 __read_mostly host_xss; static bool __read_mostly enable_pml = 1; @@ -187,6 +195,7 @@ static unsigned int ple_window_max = KVM_VMX_DEFAULT_PLE_WINDOW_MAX; module_param(ple_window_max, uint, 0444); extern const ulong vmx_return; +extern const ulong vmx_early_consistency_check_return; struct kvm_vmx { struct kvm kvm; @@ -7527,6 +7536,10 @@ static __init int hardware_setup(void) if (!cpu_has_virtual_nmis()) enable_vnmi = 0; + if (early_consistency_checks < 0 && + !hypervisor_is_type(X86_HYPER_NATIVE)) + early_consistency_checks = 0; + /* * set_apic_access_page_addr() is used to reload apic access * page upon invalidation.
No need to do anything if not @@ -11169,6 +11182,14 @@ static void prepare_vmcs02_first_run(struct vcpu_vmx *vmx) if (vmx->nested.vmcs02.launched) return; + /* + * We don't care what the EPTP value is, we just need to guarantee + * it's valid so we don't get a false positive when doing early + * consistency checks. + */ + if (enable_ept && early_consistency_checks) + vmcs_write64(EPT_POINTER, construct_eptp(&vmx->vcpu, 0)); + /* All VMFUNCs are currently emulated through L0 vmexits. */ if (cpu_has_vmx_vmfunc()) vmcs_write64(VM_FUNCTION_CONTROL, 0); @@ -11222,7 +11243,9 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12) * entry, but only if the current (host) sp changed from the value * we wrote last (vmx->host_rsp). This cache is no longer relevant * if we switch vmcs, and rather than hold a separate cache per vmcs, - * here we just force the write to happen on entry. + * here we just force the write to happen on entry. host_rsp will + * also be written unconditionally by nested_vmx_check_vmentry_hw() + * if we are doing early consistency checks via hardware. */ vmx->host_rsp = 0; @@ -11808,9 +11831,120 @@ static int check_vmentry_postreqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, return 0; } +static int __noclone nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu) +{ + struct vcpu_vmx *vmx = to_vmx(vcpu); + unsigned long cr3, cr4; + + if (!early_consistency_checks) + return 0; + + if (vmx->msr_autoload.nr) { + vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, 0); + vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, 0); + } + + preempt_disable(); + + vmx_save_host_state(vcpu); + + /* + * prepare_vmcs02() writes GUEST_RIP unconditionally, no need + * to save/restore or set dirty bits. + */ + vmcs_writel(GUEST_RIP, 0xf0f0ULL << 48); + + vmcs_writel(HOST_RIP, vmx_early_consistency_check_return); + + cr3 = __get_current_cr3_fast(); + if (unlikely(cr3 != vmx->loaded_vmcs->vmcs_host_cr3)) { + vmcs_writel(HOST_CR3, cr3); + vmx->loaded_vmcs->vmcs_host_cr3 = cr3; + } + + cr4 = cr4_read_shadow(); + if (unlikely(cr4 != vmx->loaded_vmcs->vmcs_host_cr4)) { + vmcs_writel(HOST_CR4, cr4); + vmx->loaded_vmcs->vmcs_host_cr4 = cr4; + } + + vmx->__launched = vmx->loaded_vmcs->launched; + + asm( + /* Set HOST_RSP */ + __ex(ASM_VMX_VMWRITE_RSP_RDX) "\n\t" + "mov %%" _ASM_SP ", %c[host_rsp](%0)\n\t" + + /* Check if vmlaunch or vmresume is needed */ + "cmpl $0, %c[launched](%0)\n\t" + "je 1f\n\t" + __ex(ASM_VMX_VMRESUME) "\n\t" + "jmp 2f\n\t" + "1: " __ex(ASM_VMX_VMLAUNCH) "\n\t" + "jmp 2f\n\t" + "2: " + + /* Set vmx->fail accordingly */ + "setbe %c[fail](%0)\n\t" + + ".pushsection .rodata\n\t" + ".global vmx_early_consistency_check_return\n\t" + "vmx_early_consistency_check_return: " _ASM_PTR " 2b\n\t" + ".popsection" + : + : "c"(vmx), "d"((unsigned long)HOST_RSP), + [launched]"i"(offsetof(struct vcpu_vmx, __launched)), + [fail]"i"(offsetof(struct vcpu_vmx, fail)), + [host_rsp]"i"(offsetof(struct vcpu_vmx, host_rsp)) + : "rax", "cc", "memory" + ); + + vmcs_writel(HOST_RIP, vmx_return); + + preempt_enable(); + + if (vmx->msr_autoload.nr) { + vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.nr); + vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.nr); + } + + if (vmx->fail) { + WARN_ON_ONCE(vmcs_read32(VM_INSTRUCTION_ERROR) != + VMXERR_ENTRY_INVALID_CONTROL_FIELD); + vmx->fail = 0; + return 1; + } + + /* + * VMExit clears RFLAGS.IF and DR7, even on a consistency check.
+ */ + local_irq_enable(); + if (hw_breakpoint_active()) + set_debugreg(__this_cpu_read(cpu_dr7), 7); + + /* + * A non-failing VMEntry means we somehow entered guest mode with + * an illegal RIP, and that's just the tip of the iceberg. There + * is no telling what memory has been modified or what state has + * been exposed to unknown code. Hitting this all but guarantees + * a (very critical) hardware issue. + */ + BUG_ON(!(vmcs_read32(VM_EXIT_REASON) & + VMX_EXIT_REASONS_FAILED_VMENTRY)); + + return 0; +} +STACK_FRAME_NON_STANDARD(nested_vmx_check_vmentry_hw); + static void load_vmcs12_host_state(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12); +/* + * Return values: + * 0 - success, i.e. proceed with actual VMEnter + * 1 - consistency check VMExit + * -1 - consistency check VMFail + */ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -11830,6 +11964,11 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu) if (!is_vmentry) goto enter_non_root_mode; + if (nested_vmx_check_vmentry_hw(vcpu)) { + vmx_switch_vmcs(vcpu, &vmx->vmcs01); + return -1; + } + if (check_vmentry_postreqs(vcpu, vmcs12, &exit_reason, &exit_qual)) goto consistency_check_vmexit; @@ -11931,13 +12070,14 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) * We're finally done with prerequisite checking, and can start with * the nested entry. */ - vmx->nested.nested_run_pending = 1; ret = nested_vmx_enter_non_root_mode(vcpu); - if (ret) { - vmx->nested.nested_run_pending = 0; - return ret; - } + vmx->nested.nested_run_pending = !ret; + if (ret < 0) { + nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD); + goto out; + } else if (ret) + return 1; /* * If we're entering a halted L2 vcpu and the L2 vcpu won't be woken From patchwork Tue Jul 31 15:32:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 10550937 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 39B1814E2 for ; Tue, 31 Jul 2018 15:32:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2A4B12B19A for ; Tue, 31 Jul 2018 15:32:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1E8412B1A2; Tue, 31 Jul 2018 15:32:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C2CC82B19A for ; Tue, 31 Jul 2018 15:32:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732472AbeGaRNa (ORCPT ); Tue, 31 Jul 2018 13:13:30 -0400 Received: from mga03.intel.com ([134.134.136.65]:23685 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732449AbeGaRN2 (ORCPT ); Tue, 31 Jul 2018 13:13:28 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 31 Jul 2018 08:32:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,427,1526367600"; 
d="scan'208";a="79342649" Received: from sjchrist-coffee.jf.intel.com ([10.54.74.132]) by orsmga002.jf.intel.com with ESMTP; 31 Jul 2018 08:32:20 -0700 From: Sean Christopherson To: kvm@vger.kernel.org, pbonzini@redhat.com, rkrcmar@redhat.com Cc: jmattson@google.com, Sean Christopherson Subject: [PATCH 16/16] KVM: nVMX: WARN if nested run hits VMFail with early consistency checks enabled Date: Tue, 31 Jul 2018 08:32:15 -0700 Message-Id: <20180731153215.31794-17-sean.j.christopherson@intel.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180731153215.31794-1-sean.j.christopherson@intel.com> References: <20180731153215.31794-1-sean.j.christopherson@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When early consistency checks are enabled, all VMFail conditions should be caught by nested_vmx_check_vmentry_hw(). Signed-off-by: Sean Christopherson --- arch/x86/kvm/vmx.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 92baa2573825..cc513c62f622 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -12570,14 +12570,6 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason, /* trying to cancel vmlaunch/vmresume is a bug */ WARN_ON_ONCE(vmx->nested.nested_run_pending); - /* - * The only expected VM-instruction error is "VM entry with - * invalid control field(s)." Anything else indicates a - * problem with L0. - */ - WARN_ON_ONCE(vmx->fail && (vmcs_read32(VM_INSTRUCTION_ERROR) != - VMXERR_ENTRY_INVALID_CONTROL_FIELD)); - leave_guest_mode(vcpu); if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) @@ -12593,6 +12585,16 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason, if (nested_vmx_store_msr(vcpu, vmcs12->vm_exit_msr_store_addr, vmcs12->vm_exit_msr_store_count)) nested_vmx_abort(vcpu, VMX_ABORT_SAVE_GUEST_MSR_FAIL); + } else { + /* + * The only expected VM-instruction error is "VM entry with + * invalid control field(s)." Anything else indicates a + * problem with L0. And we should never get here with a + * VMFail of any type if early consistency checks are enabled. + */ + WARN_ON_ONCE(vmcs_read32(VM_INSTRUCTION_ERROR) != + VMXERR_ENTRY_INVALID_CONTROL_FIELD); + WARN_ON_ONCE(early_consistency_checks); } vmx_switch_vmcs(vcpu, &vmx->vmcs01);