From patchwork Tue Aug 28 16:04:42 2018
From: Sean Christopherson
To: Paolo Bonzini, Radim Krčmář
Cc: kvm@vger.kernel.org, Jim Mattson, Sean Christopherson
Subject: [PATCH v2 01/18] KVM: nVMX: move host EFER consistency checks to VMFail path
Date: Tue, 28 Aug 2018 09:04:42 -0700
Message-Id: <20180828160459.14093-2-sean.j.christopherson@intel.com>
In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com>

Invalid host state related to loading EFER on VMExit causes a
VMFail(VMXERR_ENTRY_INVALID_HOST_STATE_FIELD), not a VMExit.
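For reference, the host-state rule quoted in the comment being moved
reduces to the following standalone check (a sketch only: the EFER bit
positions match the architectural definition, the helper itself is
illustrative and not KVM code):

#include <stdbool.h>
#include <stdint.h>

#define EFER_LME (1ULL << 8)   /* IA-32e Mode Enable */
#define EFER_LMA (1ULL << 10)  /* IA-32e Mode Active */

/*
 * The "host address-space size" VM-exit control must agree with both
 * LMA and LME in HOST_IA32_EFER; a mismatch is a VMFail, not a VMExit.
 */
static bool host_efer_consistent(uint64_t host_efer, bool host_64bit)
{
    return host_64bit == !!(host_efer & EFER_LMA) &&
           host_64bit == !!(host_efer & EFER_LME);
}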
Signed-off-by: Sean Christopherson
Reviewed-by: Jim Mattson
Reviewed-by: Krish Sadhukhan
---
 arch/x86/kvm/vmx.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 8dae47e7267a..b217614de7ac 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -12316,6 +12316,7 @@ static int nested_vmx_check_nmi_controls(struct vmcs12 *vmcs12)
 static int check_vmentry_prereqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 {
     struct vcpu_vmx *vmx = to_vmx(vcpu);
+    bool ia32e;
 
     if (vmcs12->guest_activity_state != GUEST_ACTIVITY_ACTIVE &&
         vmcs12->guest_activity_state != GUEST_ACTIVITY_HLT)
@@ -12386,6 +12387,21 @@ static int check_vmentry_prereqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
         !nested_cr3_valid(vcpu, vmcs12->host_cr3))
         return VMXERR_ENTRY_INVALID_HOST_STATE_FIELD;
 
+    /*
+     * If the load IA32_EFER VM-exit control is 1, bits reserved in the
+     * IA32_EFER MSR must be 0 in the field for that register. In addition,
+     * the values of the LMA and LME bits in the field must each be that of
+     * the host address-space size VM-exit control.
+     */
+    if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_EFER) {
+        ia32e = (vmcs12->vm_exit_controls &
+                 VM_EXIT_HOST_ADDR_SPACE_SIZE) != 0;
+        if (!kvm_valid_efer(vcpu, vmcs12->host_ia32_efer) ||
+            ia32e != !!(vmcs12->host_ia32_efer & EFER_LMA) ||
+            ia32e != !!(vmcs12->host_ia32_efer & EFER_LME))
+            return VMXERR_ENTRY_INVALID_HOST_STATE_FIELD;
+    }
+
     /*
      * From the Intel SDM, volume 3:
      * Fields relevant to VM-entry event injection must be set properly.
@@ -12507,21 +12523,6 @@ static int check_vmentry_postreqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
         return 1;
     }
 
-    /*
-     * If the load IA32_EFER VM-exit control is 1, bits reserved in the
-     * IA32_EFER MSR must be 0 in the field for that register. In addition,
-     * the values of the LMA and LME bits in the field must each be that of
-     * the host address-space size VM-exit control.
-     */
-    if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_EFER) {
-        ia32e = (vmcs12->vm_exit_controls &
-                 VM_EXIT_HOST_ADDR_SPACE_SIZE) != 0;
-        if (!kvm_valid_efer(vcpu, vmcs12->host_ia32_efer) ||
-            ia32e != !!(vmcs12->host_ia32_efer & EFER_LMA) ||
-            ia32e != !!(vmcs12->host_ia32_efer & EFER_LME))
-            return 1;
-    }
-
     if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS) &&
         (is_noncanonical_address(vmcs12->guest_bndcfgs & PAGE_MASK, vcpu) ||
          (vmcs12->guest_bndcfgs & MSR_IA32_BNDCFGS_RSVD)))
From patchwork Tue Aug 28 16:04:43 2018
From: Sean Christopherson
To: Paolo Bonzini, Radim Krčmář
Cc: kvm@vger.kernel.org, Jim Mattson, Sean Christopherson
Subject: [PATCH v2 02/18] KVM: nVMX: move vmcs12 EPTP consistency check to check_vmentry_prereqs()
Date: Tue, 28 Aug 2018 09:04:43 -0700
Message-Id: <20180828160459.14093-3-sean.j.christopherson@intel.com>
In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com>

An invalid EPTP causes a VMFail(VMXERR_ENTRY_INVALID_CONTROL_FIELD),
not a VMExit.
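The EPTP rules that valid_ept_address() enforces come from the SDM's
EPTP field layout; a simplified standalone sketch of that kind of check
(constants spelled out here for illustration, capability-MSR checks
omitted):

#include <stdbool.h>
#include <stdint.h>

#define EPTP_MEMTYPE_MASK   0x7ULL         /* bits 2:0 */
#define EPTP_MEMTYPE_WB     6ULL
#define EPTP_WALK_MASK      (0x7ULL << 3)  /* bits 5:3 = walk length - 1 */
#define EPTP_WALK_4LEVEL    (3ULL << 3)
#define EPTP_AD_ENABLE      (1ULL << 6)
#define EPTP_RSVD_MASK      (0x1FULL << 7) /* bits 11:7 must be zero */

static bool eptp_looks_valid(uint64_t eptp, bool ad_supported)
{
    if ((eptp & EPTP_MEMTYPE_MASK) != EPTP_MEMTYPE_WB)
        return false;                 /* assume only WB is advertised */
    if ((eptp & EPTP_WALK_MASK) != EPTP_WALK_4LEVEL)
        return false;                 /* 4-level page walk required */
    if ((eptp & EPTP_AD_ENABLE) && !ad_supported)
        return false;                 /* A/D flags need CPU support */
    return !(eptp & EPTP_RSVD_MASK);  /* reserved bits 11:7 clear */
}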
Signed-off-by: Sean Christopherson
Reviewed-by: Jim Mattson
Reviewed-by: Krish Sadhukhan
---
 arch/x86/kvm/vmx.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b217614de7ac..6451e63847d9 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11277,11 +11277,9 @@ static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu)
     return get_vmcs12(vcpu)->ept_pointer;
 }
 
-static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
+static void nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
 {
     WARN_ON(mmu_is_nested(vcpu));
-    if (!valid_ept_address(vcpu, nested_ept_get_cr3(vcpu)))
-        return 1;
 
     kvm_init_shadow_ept_mmu(vcpu,
             to_vmx(vcpu)->nested.msrs.ept_caps &
@@ -11293,7 +11291,6 @@ static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
     vcpu->arch.mmu.inject_page_fault = nested_ept_inject_page_fault;
     vcpu->arch.walk_mmu = &vcpu->arch.nested_mmu;
-    return 0;
 }
 
@@ -12243,15 +12240,11 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
         vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1);
     }
 
-    if (nested_cpu_has_ept(vmcs12)) {
-        if (nested_ept_init_mmu_context(vcpu)) {
-            *entry_failure_code = ENTRY_FAIL_DEFAULT;
-            return 1;
-        }
-    } else if (nested_cpu_has2(vmcs12,
-                   SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)) {
+    if (nested_cpu_has_ept(vmcs12))
+        nested_ept_init_mmu_context(vcpu);
+    else if (nested_cpu_has2(vmcs12,
+                 SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES))
         vmx_flush_tlb(vcpu, true);
-    }
 
     /*
      * This sets GUEST_CR0 to vmcs12->guest_cr0, possibly modifying those
@@ -12458,6 +12451,10 @@ static int check_vmentry_prereqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
         }
     }
 
+    if (nested_cpu_has_ept(vmcs12) &&
+        !valid_ept_address(vcpu, vmcs12->ept_pointer))
+        return VMXERR_ENTRY_INVALID_CONTROL_FIELD;
+
     return 0;
 }
From patchwork Tue Aug 28 16:04:44 2018
From: Sean Christopherson
To: Paolo Bonzini, Radim Krčmář
Cc: kvm@vger.kernel.org, Jim Mattson, Sean Christopherson
Subject: [PATCH v2 03/18] KVM: nVMX: use vm_exit_controls_init() to write exit controls for vmcs02
Date: Tue, 28 Aug 2018 09:04:44 -0700
Message-Id: <20180828160459.14093-4-sean.j.christopherson@intel.com>
In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com>

Write VM_EXIT_CONTROLS using vm_exit_controls_init() when configuring
vmcs02, otherwise vm_exit_controls_shadow will be stale. EFER in
particular can be corrupted if VM_EXIT_LOAD_IA32_EFER is not updated
due to an incorrect shadow optimization, which can crash L0 due to EFER
not being loaded on exit. This does not occur with the current code
base simply because update_transition_efer() unconditionally clears
VM_EXIT_LOAD_IA32_EFER before conditionally setting it, and because a
nested guest always starts with VM_EXIT_LOAD_IA32_EFER clear, i.e.
we'll only ever unnecessarily clear the bit. That is, until someone
optimizes update_transition_efer()...

Signed-off-by: Sean Christopherson
Reviewed-by: Jim Mattson
---
 arch/x86/kvm/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6451e63847d9..b7aca0edeb59 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -12186,7 +12186,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
      * we should use its exit controls. Note that VM_EXIT_LOAD_IA32_EFER
      * bits are further modified by vmx_set_efer() below.
      */
-    vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
+    vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl);
 
     /* vmcs12's VM_ENTRY_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE are
      * emulated by vmx_set_efer(), below.
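For context, the vm_{entry,exit}_controls_* helpers pair each VMWrite
with a cached shadow so that later updates can skip redundant VMWrites;
writing the field with vmcs_write32() directly leaves that shadow
stale. Roughly (simplified from the helpers in vmx.c, not the verbatim
kernel code):

/* the shadow mirrors the last value VMWritten to VM_EXIT_CONTROLS */
static void vm_exit_controls_init(struct vcpu_vmx *vmx, u32 val)
{
    vmcs_write32(VM_EXIT_CONTROLS, val);   /* the real VMWrite */
    vmx->vm_exit_controls_shadow = val;    /* keep the cache coherent */
}

static void vm_exit_controls_set(struct vcpu_vmx *vmx, u32 val)
{
    if (vmx->vm_exit_controls_shadow != val)  /* elide redundant VMWrite */
        vm_exit_controls_init(vmx, val);
}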
From patchwork Tue Aug 28 16:04:45 2018
From: Sean Christopherson
To: Paolo Bonzini, Radim Krčmář
Cc: kvm@vger.kernel.org, Jim Mattson, Sean Christopherson
Subject: [PATCH v2 04/18] KVM: nVMX: reset cache/shadows on nested consistency check VMExit
Date: Tue, 28 Aug 2018 09:04:45 -0700
Message-Id: <20180828160459.14093-5-sean.j.christopherson@intel.com>
In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com>

Reset the vm_{entry,exit}_controls_shadow variables as well as the
segment cache on consistency check VMExit. The shadow values in
particular can lead to missed updates due to stale shadows.
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b7aca0edeb59..6097d0115056 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -13355,12 +13355,18 @@ static void nested_vmx_entry_failure(struct kvm_vcpu *vcpu,
             struct vmcs12 *vmcs12,
             u32 reason, unsigned long qualification)
 {
+    struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+    vm_entry_controls_reset_shadow(vmx);
+    vm_exit_controls_reset_shadow(vmx);
+    vmx_segment_cache_clear(vmx);
+
     load_vmcs12_host_state(vcpu, vmcs12);
     vmcs12->vm_exit_reason = reason | VMX_EXIT_REASONS_FAILED_VMENTRY;
     vmcs12->exit_qualification = qualification;
     nested_vmx_succeed(vcpu);
     if (enable_shadow_vmcs)
-        to_vmx(vcpu)->nested.sync_shadow_vmcs = true;
+        vmx->nested.sync_shadow_vmcs = true;
 }
 
 static int vmx_check_intercept(struct kvm_vcpu *vcpu,
From patchwork Tue Aug 28 16:04:46 2018
From: Sean Christopherson
To: Paolo Bonzini, Radim Krčmář
Cc: kvm@vger.kernel.org, Jim Mattson, Sean Christopherson
Subject: [PATCH v2 05/18] KVM: vmx: do not unconditionally clear EFER switching
Date: Tue, 28 Aug 2018 09:04:46 -0700
Message-Id: <20180828160459.14093-6-sean.j.christopherson@intel.com>
In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com>

Do not unconditionally call clear_atomic_switch_msr() when updating
EFER. This adds up to four unnecessary VMWrites in the case where
guest_efer != host_efer, e.g. if the load_on_{entry,exit} bits were
already set.

Signed-off-by: Sean Christopherson
Reviewed-by: Jim Mattson
---
 arch/x86/kvm/vmx.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6097d0115056..1fcf374a1475 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2798,8 +2798,6 @@ static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
     ignore_bits &= ~(u64)EFER_SCE;
 #endif
 
-    clear_atomic_switch_msr(vmx, MSR_EFER);
-
     /*
      * On EPT, we can't emulate NX, so we must switch EFER atomically.
      * On CPUs that support "load IA32_EFER", always switch EFER
@@ -2812,8 +2810,12 @@ static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
         if (guest_efer != host_efer)
             add_atomic_switch_msr(vmx, MSR_EFER,
                                   guest_efer, host_efer, false);
+        else
+            clear_atomic_switch_msr(vmx, MSR_EFER);
         return false;
     } else {
+        clear_atomic_switch_msr(vmx, MSR_EFER);
+
         guest_efer &= ~ignore_bits;
         guest_efer |= host_efer & ignore_bits;
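The VMWrites being saved come from the MSR autoload lists: removing an
entry rewrites the VM-entry and VM-exit count fields in the VMCS, so
clearing unconditionally and immediately re-adding the same MSR wastes
those writes. A minimal model of the clear path (the list types here
are illustrative; only vmcs_write32() and the count field are real):

struct msr_autoload_entry { u32 index; u64 value; };

struct msr_autoload_list {
    struct msr_autoload_entry val[8];
    unsigned int nr;
};

/* drop 'msr' from the list; the count VMWrite happens only on a hit */
static void autoload_list_clear(struct msr_autoload_list *list, u32 msr)
{
    unsigned int i;

    for (i = 0; i < list->nr; i++) {
        if (list->val[i].index == msr) {
            list->val[i] = list->val[--list->nr]; /* swap with last */
            vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, list->nr);
            return;
        }
    }
    /* miss: nothing to undo, no VMWrite - the case the patch exploits */
}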
From patchwork Tue Aug 28 16:04:47 2018
From: Sean Christopherson
To: Paolo Bonzini, Radim Krčmář
Cc: kvm@vger.kernel.org, Jim Mattson, Sean Christopherson
Subject: [PATCH v2 06/18] KVM: nVMX: try to set EFER bits correctly when init'ing entry controls
Date: Tue, 28 Aug 2018 09:04:47 -0700
Message-Id: <20180828160459.14093-7-sean.j.christopherson@intel.com>
In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com>

VM_ENTRY_IA32E_MODE and VM_{ENTRY,EXIT}_LOAD_IA32_EFER will be
explicitly set/cleared as needed by vmx_set_efer(), but attempt to get
the bits set correctly when initializing the control fields. Setting
the value correctly can avoid multiple VMWrites.

Signed-off-by: Sean Christopherson
Reviewed-by: Jim Mattson
---
 arch/x86/kvm/vmx.c | 44 ++++++++++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 1fcf374a1475..e58dd3a66abf 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11896,6 +11896,17 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
     return 0;
 }
 
+static u64 nested_vmx_calc_efer(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
+{
+    if (vmx->nested.nested_run_pending &&
+        (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_EFER))
+        return vmcs12->guest_ia32_efer;
+    else if (vmcs12->vm_entry_controls & VM_ENTRY_IA32E_MODE)
+        return vmx->vcpu.arch.efer | (EFER_LMA | EFER_LME);
+    else
+        return vmx->vcpu.arch.efer & ~(EFER_LMA | EFER_LME);
+}
+
 static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 {
     struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -12035,6 +12046,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
 {
     struct vcpu_vmx *vmx = to_vmx(vcpu);
     u32 exec_control, vmcs12_exec_ctrl;
+    u64 guest_efer;
 
     if (vmx->nested.dirty_vmcs12) {
         prepare_vmcs02_full(vcpu, vmcs12);
@@ -12190,13 +12202,23 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
      */
     vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl);
 
-    /* vmcs12's VM_ENTRY_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE are
-     * emulated by vmx_set_efer(), below.
+    /*
+     * vmcs12's VM_{ENTRY,EXIT}_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE
+     * are emulated by vmx_set_efer(), below, but speculate on the
+     * related bits (if supported by the CPU) in the hope that we can
+     * avoid VMWrites during vmx_set_efer().
      */
-    vm_entry_controls_init(vmx,
-        (vmcs12->vm_entry_controls & ~VM_ENTRY_LOAD_IA32_EFER &
-            ~VM_ENTRY_IA32E_MODE) |
-        (vmcs_config.vmentry_ctrl & ~VM_ENTRY_IA32E_MODE));
+    exec_control = (vmcs12->vm_entry_controls | vmcs_config.vmentry_ctrl) &
+                   ~VM_ENTRY_IA32E_MODE & ~VM_ENTRY_LOAD_IA32_EFER;
+
+    guest_efer = nested_vmx_calc_efer(vmx, vmcs12);
+    if (cpu_has_load_ia32_efer) {
+        if (guest_efer & EFER_LMA)
+            exec_control |= VM_ENTRY_IA32E_MODE;
+        if (guest_efer != host_efer)
+            exec_control |= VM_ENTRY_LOAD_IA32_EFER;
+    }
+    vm_entry_controls_init(vmx, exec_control);
 
     if (vmx->nested.nested_run_pending &&
         (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PAT)) {
@@ -12262,14 +12284,8 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
     vmx_set_cr4(vcpu, vmcs12->guest_cr4);
     vmcs_writel(CR4_READ_SHADOW, nested_read_cr4(vmcs12));
 
-    if (vmx->nested.nested_run_pending &&
-        (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_EFER))
-        vcpu->arch.efer = vmcs12->guest_ia32_efer;
-    else if (vmcs12->vm_entry_controls & VM_ENTRY_IA32E_MODE)
-        vcpu->arch.efer |= (EFER_LMA | EFER_LME);
-    else
-        vcpu->arch.efer &= ~(EFER_LMA | EFER_LME);
-    /* Note: modifies VM_ENTRY/EXIT_CONTROLS and GUEST/HOST_IA32_EFER */
+    vcpu->arch.efer = guest_efer;
+    /* Note: may modify VM_ENTRY/EXIT_CONTROLS and GUEST/HOST_IA32_EFER */
     vmx_set_efer(vcpu, vcpu->arch.efer);
 
     /*
From patchwork Tue Aug 28 16:04:48 2018
From: Sean Christopherson
To: Paolo Bonzini, Radim Krčmář
Cc: kvm@vger.kernel.org, Jim Mattson, Sean Christopherson
Subject: [PATCH v2 07/18] KVM: nVMX: rename enter_vmx_non_root_mode to nested_vmx_enter_non_root_mode
Date: Tue, 28 Aug 2018 09:04:48 -0700
Message-Id: <20180828160459.14093-8-sean.j.christopherson@intel.com>
In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com>

...to be more consistent with the nested VMX nomenclature.

Signed-off-by: Sean Christopherson
Reviewed-by: Jim Mattson
Reviewed-by: Krish Sadhukhan
---
 arch/x86/kvm/vmx.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e58dd3a66abf..5fe44462f713 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -12550,7 +12550,7 @@ static int check_vmentry_postreqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
  * If exit_qual is NULL, this is being called from state restore (either RSM
  * or KVM_SET_NESTED_STATE). Otherwise it's called from vmlaunch/vmresume.
  */
-static int enter_vmx_non_root_mode(struct kvm_vcpu *vcpu, u32 *exit_qual)
+static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, u32 *exit_qual)
 {
     struct vcpu_vmx *vmx = to_vmx(vcpu);
     struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
@@ -12694,7 +12694,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
      */
     vmx->nested.nested_run_pending = 1;
 
-    ret = enter_vmx_non_root_mode(vcpu, &exit_qual);
+    ret = nested_vmx_enter_non_root_mode(vcpu, &exit_qual);
     if (ret) {
         nested_vmx_entry_failure(vcpu, vmcs12, ret, exit_qual);
         vmx->nested.nested_run_pending = 0;
@@ -12705,7 +12705,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
     vmx->vcpu.arch.l1tf_flush_l1d = true;
 
     /*
-     * Must happen outside of enter_vmx_non_root_mode() as it will
+     * Must happen outside of nested_vmx_enter_non_root_mode() as it will
      * also be used as part of restoring nVMX state for
      * snapshot restore (migration).
      *
@@ -13816,7 +13816,7 @@ static int vmx_pre_leave_smm(struct kvm_vcpu *vcpu, u64 smbase)
 
     if (vmx->nested.smm.guest_mode) {
         vcpu->arch.hflags &= ~HF_SMM_MASK;
-        ret = enter_vmx_non_root_mode(vcpu, NULL);
+        ret = nested_vmx_enter_non_root_mode(vcpu, NULL);
         vcpu->arch.hflags |= HF_SMM_MASK;
         if (ret)
             return ret;
@@ -14017,7 +14017,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
     vmx->nested.nested_run_pending = 1;
 
     vmx->nested.dirty_vmcs12 = true;
-    ret = enter_vmx_non_root_mode(vcpu, NULL);
+    ret = nested_vmx_enter_non_root_mode(vcpu, NULL);
     if (ret)
         return -EINVAL;

From patchwork Tue Aug 28 16:04:49 2018
From: Sean Christopherson
To: Paolo Bonzini, Radim Krčmář
Cc: kvm@vger.kernel.org, Jim Mattson, Sean Christopherson
Subject: [PATCH v2 08/18] KVM: nVMX: move check_vmentry_postreqs() call to nested_vmx_enter_non_root_mode()
Date: Tue, 28 Aug 2018 09:04:49 -0700
Message-Id: <20180828160459.14093-9-sean.j.christopherson@intel.com>
In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com>

In preparation for supporting checkpoint/restore for nested state,
commit ca0bde28f2ed ("kvm: nVMX: Split VMCS checks from
nested_vmx_run()") modified check_vmentry_postreqs() to only perform
the guest EFER consistency checks when nested_run_pending is true.
But, in the normal nested VMEntry flow, nested_run_pending is only set
after check_vmentry_postreqs(), i.e. the consistency check is being
skipped.
Alternatively, nested_run_pending could be set prior to calling
check_vmentry_postreqs() in nested_vmx_run(), but placing the
consistency checks in nested_vmx_enter_non_root_mode() allows us to
split prepare_vmcs02() and interleave the preparation with the
consistency checks without having to change the call sites of
nested_vmx_enter_non_root_mode(). In other words, the rest of the
consistency check code in nested_vmx_run() will be joining the postreqs
checks in future patches.

Fixes: ca0bde28f2ed ("kvm: nVMX: Split VMCS checks from nested_vmx_run()")
Signed-off-by: Sean Christopherson
Cc: Jim Mattson
Reviewed-by: Jim Mattson
---
 arch/x86/kvm/vmx.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5fe44462f713..43e87a2e172e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -12556,7 +12556,16 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, u32 *exit_qual)
     struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
     bool from_vmentry = !!exit_qual;
     u32 dummy_exit_qual;
-    int r = 0;
+    int r;
+
+    if (from_vmentry) {
+        r = check_vmentry_postreqs(vcpu, vmcs12, exit_qual);
+        if (r) {
+            nested_vmx_entry_failure(vcpu, vmcs12,
+                                     EXIT_REASON_INVALID_STATE, *exit_qual);
+            return 1;
+        }
+    }
 
     enter_guest_mode(vcpu);
 
@@ -12681,13 +12690,6 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
      */
     skip_emulated_instruction(vcpu);
 
-    ret = check_vmentry_postreqs(vcpu, vmcs12, &exit_qual);
-    if (ret) {
-        nested_vmx_entry_failure(vcpu, vmcs12,
-                                 EXIT_REASON_INVALID_STATE, exit_qual);
-        return 1;
-    }
-
     /*
      * We're finally done with prerequisite checking, and can start with
      * the nested entry.
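Reduced to its control flow, the fix looks like this (trivial stand-in
stubs, not the kernel functions, to show where nested_run_pending is
set relative to the postreqs checks):

#include <stdbool.h>

static bool nested_run_pending;

static int check_vmentry_postreqs(void) { return 0; }       /* 0 = pass */
static int consistency_check_vmexit(void) { return 1; }
static int do_enter_non_root_mode(void) { return 0; }

static int nested_vmx_enter_non_root_mode(bool from_vmentry)
{
    /*
     * Runs with nested_run_pending already set, so the EFER
     * consistency checks inside postreqs are no longer skipped.
     */
    if (from_vmentry && check_vmentry_postreqs())
        return consistency_check_vmexit();
    return do_enter_non_root_mode();
}

static int nested_vmx_run(void)
{
    nested_run_pending = true;          /* set *before* the postreqs */
    return nested_vmx_enter_non_root_mode(true);
}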
From patchwork Tue Aug 28 16:04:50 2018
From: Sean Christopherson
To: Paolo Bonzini, Radim Krčmář
Cc: kvm@vger.kernel.org, Jim Mattson, Sean Christopherson
Subject: [PATCH v2 09/18] KVM: nVMX: assimilate nested_vmx_entry_failure() into nested_vmx_enter_non_root_mode()
Date: Tue, 28 Aug 2018 09:04:50 -0700
Message-Id: <20180828160459.14093-10-sean.j.christopherson@intel.com>
In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com>

Handling consistency check VMExits in nested_vmx_enter_non_root_mode()
consolidates all relevant code into a single location, and removing
nested_vmx_entry_failure() eliminates a confusing function name and
label. For a VMEntry, "fail" and its derivatives have a very specific
meaning due to the different behavior of a VMEnter VMFail versus
VMExit, i.e. a more appropriate name for nested_vmx_entry_failure()
would have been nested_vmx_entry_consistency_check_vmexit().

Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx.c | 90 ++++++++++++++++++++--------------------
 1 file changed, 39 insertions(+), 51 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 43e87a2e172e..cb8df73e9b49 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2056,9 +2056,6 @@ static inline bool is_nmi(u32 intr_info)
 static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
                               u32 exit_intr_info,
                               unsigned long exit_qualification);
-static void nested_vmx_entry_failure(struct kvm_vcpu *vcpu,
-                                     struct vmcs12 *vmcs12,
-                                     u32 reason, unsigned long qualification);
 
 static int __find_msr_index(struct vcpu_vmx *vmx, u32 msr)
 {
@@ -12546,25 +12543,23 @@ static int check_vmentry_postreqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
     return 0;
 }
 
+static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
+                                   struct vmcs12 *vmcs12);
+
 /*
  * If exit_qual is NULL, this is being called from state restore (either RSM
  * or KVM_SET_NESTED_STATE). Otherwise it's called from vmlaunch/vmresume.
  */
-static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, u32 *exit_qual)
+static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
+                                          bool from_vmentry)
 {
     struct vcpu_vmx *vmx = to_vmx(vcpu);
     struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
-    bool from_vmentry = !!exit_qual;
-    u32 dummy_exit_qual;
-    int r;
+    u32 exit_reason = EXIT_REASON_INVALID_STATE;
+    u32 exit_qual;
 
     if (from_vmentry) {
-        r = check_vmentry_postreqs(vcpu, vmcs12, exit_qual);
-        if (r) {
-            nested_vmx_entry_failure(vcpu, vmcs12,
-                                     EXIT_REASON_INVALID_STATE, *exit_qual);
-            return 1;
-        }
+        if (check_vmentry_postreqs(vcpu, vmcs12, &exit_qual))
+            goto consistency_check_vmexit;
     }
 
     enter_guest_mode(vcpu);
@@ -12578,18 +12573,17 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, u32 *exit_qual)
     if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING)
         vcpu->arch.tsc_offset += vmcs12->tsc_offset;
 
-    r = EXIT_REASON_INVALID_STATE;
-    if (prepare_vmcs02(vcpu, vmcs12, from_vmentry ? exit_qual : &dummy_exit_qual))
+    if (prepare_vmcs02(vcpu, vmcs12, &exit_qual))
         goto fail;
 
     if (from_vmentry) {
         nested_get_vmcs12_pages(vcpu);
 
-        r = EXIT_REASON_MSR_LOAD_FAIL;
-        *exit_qual = nested_vmx_load_msr(vcpu,
-                                         vmcs12->vm_entry_msr_load_addr,
-                                         vmcs12->vm_entry_msr_load_count);
-        if (*exit_qual)
+        exit_reason = EXIT_REASON_MSR_LOAD_FAIL;
+        exit_qual = nested_vmx_load_msr(vcpu,
+                                        vmcs12->vm_entry_msr_load_addr,
+                                        vmcs12->vm_entry_msr_load_count);
+        if (exit_qual)
             goto fail;
     } else {
         /*
@@ -12615,7 +12609,28 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, u32 *exit_qual)
     vcpu->arch.tsc_offset -= vmcs12->tsc_offset;
     leave_guest_mode(vcpu);
     vmx_switch_vmcs(vcpu, &vmx->vmcs01);
-    return r;
+
+    /*
+     * A consistency check VMExit during L1's VMEnter to L2 is a subset
+     * of a normal VMexit, as explained in 23.7 "VM-entry failures during
+     * or after loading guest state" (this also lists the acceptable exit-
+     * reason and exit-qualification parameters).
+     */
+consistency_check_vmexit:
+    vm_entry_controls_reset_shadow(vmx);
+    vm_exit_controls_reset_shadow(vmx);
+    vmx_segment_cache_clear(vmx);
+
+    if (!from_vmentry)
+        return 1;
+
+    load_vmcs12_host_state(vcpu, vmcs12);
+    vmcs12->vm_exit_reason = exit_reason | VMX_EXIT_REASONS_FAILED_VMENTRY;
+    vmcs12->exit_qualification = exit_qual;
+    nested_vmx_succeed(vcpu);
+    if (enable_shadow_vmcs)
+        vmx->nested.sync_shadow_vmcs = true;
+    return 1;
 }
 
 /*
@@ -12627,7 +12642,6 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
     struct vmcs12 *vmcs12;
     struct vcpu_vmx *vmx = to_vmx(vcpu);
     u32 interrupt_shadow = vmx_get_interrupt_shadow(vcpu);
-    u32 exit_qual;
     int ret;
 
     if (!nested_vmx_check_permission(vcpu))
@@ -12696,9 +12710,8 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
      */
     vmx->nested.nested_run_pending = 1;
 
-    ret = nested_vmx_enter_non_root_mode(vcpu, &exit_qual);
+    ret = nested_vmx_enter_non_root_mode(vcpu, true);
     if (ret) {
-        nested_vmx_entry_failure(vcpu, vmcs12, ret, exit_qual);
         vmx->nested.nested_run_pending = 0;
         return 1;
     }
@@ -13364,31 +13377,6 @@ static void vmx_leave_nested(struct kvm_vcpu *vcpu)
     free_nested(to_vmx(vcpu));
 }
 
-/*
- * L1's failure to enter L2 is a subset of a normal exit, as explained in
- * 23.7 "VM-entry failures during or after loading guest state" (this also
- * lists the acceptable exit-reason and exit-qualification parameters).
- * It should only be called before L2 actually succeeded to run, and when
- * vmcs01 is current (it doesn't leave_guest_mode() or switch vmcss).
- */
-static void nested_vmx_entry_failure(struct kvm_vcpu *vcpu,
-                                     struct vmcs12 *vmcs12,
-                                     u32 reason, unsigned long qualification)
-{
-    struct vcpu_vmx *vmx = to_vmx(vcpu);
-
-    vm_entry_controls_reset_shadow(vmx);
-    vm_exit_controls_reset_shadow(vmx);
-    vmx_segment_cache_clear(vmx);
-
-    load_vmcs12_host_state(vcpu, vmcs12);
-    vmcs12->vm_exit_reason = reason | VMX_EXIT_REASONS_FAILED_VMENTRY;
-    vmcs12->exit_qualification = qualification;
-    nested_vmx_succeed(vcpu);
-    if (enable_shadow_vmcs)
-        vmx->nested.sync_shadow_vmcs = true;
-}
-
 static int vmx_check_intercept(struct kvm_vcpu *vcpu,
                                struct x86_instruction_info *info,
                                enum x86_intercept_stage stage)
@@ -13818,7 +13806,7 @@ static int vmx_pre_leave_smm(struct kvm_vcpu *vcpu, u64 smbase)
 
     if (vmx->nested.smm.guest_mode) {
         vcpu->arch.hflags &= ~HF_SMM_MASK;
-        ret = nested_vmx_enter_non_root_mode(vcpu, NULL);
+        ret = nested_vmx_enter_non_root_mode(vcpu, false);
         vcpu->arch.hflags |= HF_SMM_MASK;
         if (ret)
             return ret;
@@ -14019,7 +14007,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
     vmx->nested.nested_run_pending = 1;
 
     vmx->nested.dirty_vmcs12 = true;
-    ret = nested_vmx_enter_non_root_mode(vcpu, NULL);
+    ret = nested_vmx_enter_non_root_mode(vcpu, false);
     if (ret)
         return -EINVAL;
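On these consistency-check VMExit paths, the exit reason handed back to
L1 is simply the basic exit reason with the "failed VMEntry" bit set; a
tiny illustration of how the fields are composed (the constants below
mirror arch/x86/include/uapi/asm/vmx.h):

#include <stdint.h>

#define EXIT_REASON_INVALID_STATE       33
#define EXIT_REASON_MSR_LOAD_FAIL       34
#define VMX_EXIT_REASONS_FAILED_VMENTRY 0x80000000U  /* bit 31 */

static uint32_t failed_vmentry_reason(uint32_t basic_reason)
{
    /* e.g. failed_vmentry_reason(EXIT_REASON_INVALID_STATE) */
    return basic_reason | VMX_EXIT_REASONS_FAILED_VMENTRY;
}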
From patchwork Tue Aug 28 16:04:51 2018
From: Sean Christopherson
To: Paolo Bonzini, Radim Krčmář
Cc: kvm@vger.kernel.org, Jim Mattson, Sean Christopherson
Subject: [PATCH v2 10/18] KVM: nVMX: split pieces of prepare_vmcs02() to prepare_vmcs02_early()
Date: Tue, 28 Aug 2018 09:04:51 -0700
Message-Id: <20180828160459.14093-11-sean.j.christopherson@intel.com>
In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com>

Add prepare_vmcs02_early() and move pieces of prepare_vmcs02() to the
new function. prepare_vmcs02_early() writes the bits of vmcs02 that
a) must be in place to pass the VMFail consistency checks (assuming
vmcs12 is valid) and b) are needed to recover from a VMExit, e.g. host
state that is loaded on VMExit. Splitting the functionality will
enable KVM to leverage hardware to do VMFail consistency checks via a
dry run of VMEnter and recover from a potential VMExit without having
to fully initialize vmcs02.

Add prepare_vmcs02_first_run() to handle writing vmcs02 state that
comes from vmcs01 and never changes, i.e. we don't need to rewrite any
of the vmcs02 that is effectively constant once defined.

Signed-off-by: Sean Christopherson
Reviewed-by: Jim Mattson
---
 arch/x86/kvm/vmx.c | 410 +++++++++++++++++++++++++--------------------
 1 file changed, 225 insertions(+), 185 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index cb8df73e9b49..492dc154c31e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11904,10 +11904,229 @@ static u64 nested_vmx_calc_efer(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
     return vmx->vcpu.arch.efer & ~(EFER_LMA | EFER_LME);
 }
 
-static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
+static void prepare_vmcs02_first_run(struct vcpu_vmx *vmx)
 {
-    struct vcpu_vmx *vmx = to_vmx(vcpu);
+    /*
+     * If we have never launched vmcs02, set the constant vmcs02 state
+     * according to L0's settings (vmcs12 is irrelevant here). Host
+     * fields that come from L0 and are not constant, e.g. HOST_CR3,
+     * will be set as needed prior to VMLAUNCH/VMRESUME.
+     */
+    if (vmx->nested.vmcs02.launched)
+        return;
 
+    /* All VMFUNCs are currently emulated through L0 vmexits. */
+    if (cpu_has_vmx_vmfunc())
+        vmcs_write64(VM_FUNCTION_CONTROL, 0);
+
+    if (cpu_has_vmx_posted_intr())
+        vmcs_write16(POSTED_INTR_NV, POSTED_INTR_NESTED_VECTOR);
+
+    if (cpu_has_vmx_msr_bitmap())
+        vmcs_write64(MSR_BITMAP, __pa(vmx->nested.vmcs02.msr_bitmap));
+
+    if (enable_pml)
+        vmcs_write64(PML_ADDRESS, page_to_phys(vmx->pml_pg));
+
+    /*
+     * Set the MSR load/store lists to match L0's settings. Only the
+     * addresses are constant (for vmcs02), the counts can change based
+     * on L2's behavior, e.g. switching to/from long mode.
+     */
+    vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
+    vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host.val));
+    vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest.val));
+
+    vmx_set_constant_host_state(vmx);
+}
+
+static void prepare_vmcs02_early_full(struct vcpu_vmx *vmx,
+                                      struct vmcs12 *vmcs12)
+{
+    prepare_vmcs02_first_run(vmx);
+
+    vmcs_write64(VMCS_LINK_POINTER, -1ull);
+
+    if (enable_vpid) {
+        if (nested_cpu_has_vpid(vmcs12) && vmx->nested.vpid02)
+            vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->nested.vpid02);
+        else
+            vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
+    }
+}
+
+static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
+{
+    u32 exec_control, vmcs12_exec_ctrl;
+    u64 guest_efer = nested_vmx_calc_efer(vmx, vmcs12);
+
+    if (vmx->nested.dirty_vmcs12)
+        prepare_vmcs02_early_full(vmx, vmcs12);
+
+    /*
+     * HOST_RSP is normally set correctly in vmx_vcpu_run() just before
+     * entry, but only if the current (host) sp changed from the value
+     * we wrote last (vmx->host_rsp). This cache is no longer relevant
+     * if we switch vmcs, and rather than hold a separate cache per vmcs,
+     * here we just force the write to happen on entry.
+     */
+    vmx->host_rsp = 0;
+
+    /*
+     * PIN CONTROLS
+     */
+    exec_control = vmcs12->pin_based_vm_exec_control;
+
+    /* Preemption timer setting is only taken from vmcs01. */
+    exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
+    exec_control |= vmcs_config.pin_based_exec_ctrl;
+    if (vmx->hv_deadline_tsc == -1)
+        exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
+
+    /* Posted interrupts setting is only taken from vmcs12. */
+    if (nested_cpu_has_posted_intr(vmcs12)) {
+        vmx->nested.posted_intr_nv = vmcs12->posted_intr_nv;
+        vmx->nested.pi_pending = false;
+    } else {
+        exec_control &= ~PIN_BASED_POSTED_INTR;
+    }
+    vmcs_write32(PIN_BASED_VM_EXEC_CONTROL, exec_control);
+
+    /*
+     * EXEC CONTROLS
+     */
+    exec_control = vmx_exec_control(vmx); /* L0's desires */
+    exec_control &= ~CPU_BASED_VIRTUAL_INTR_PENDING;
+    exec_control &= ~CPU_BASED_VIRTUAL_NMI_PENDING;
+    exec_control &= ~CPU_BASED_TPR_SHADOW;
+    exec_control |= vmcs12->cpu_based_vm_exec_control;
+
+    /*
+     * Write an illegal value to VIRTUAL_APIC_PAGE_ADDR. Later, if
+     * nested_get_vmcs12_pages can't fix it up, the illegal value
+     * will result in a VM entry failure.
+     */
+    if (exec_control & CPU_BASED_TPR_SHADOW) {
+        vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, -1ull);
+        vmcs_write32(TPR_THRESHOLD, vmcs12->tpr_threshold);
+    } else {
+#ifdef CONFIG_X86_64
+        exec_control |= CPU_BASED_CR8_LOAD_EXITING |
+                        CPU_BASED_CR8_STORE_EXITING;
+#endif
+    }
+
+    /*
+     * A vmexit (to either L1 hypervisor or L0 userspace) is always needed
+     * for I/O port accesses.
+     */
+    exec_control &= ~CPU_BASED_USE_IO_BITMAPS;
+    exec_control |= CPU_BASED_UNCOND_IO_EXITING;
+    vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, exec_control);
+
+    /*
+     * SECONDARY EXEC CONTROLS
+     */
+    if (cpu_has_secondary_exec_ctrls()) {
+        exec_control = vmx->secondary_exec_control;
+
+        /* Take the following fields only from vmcs12 */
+        exec_control &= ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
+                          SECONDARY_EXEC_ENABLE_INVPCID |
+                          SECONDARY_EXEC_RDTSCP |
+                          SECONDARY_EXEC_XSAVES |
+                          SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
+                          SECONDARY_EXEC_APIC_REGISTER_VIRT |
+                          SECONDARY_EXEC_ENABLE_VMFUNC);
+        if (nested_cpu_has(vmcs12,
+                           CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)) {
+            vmcs12_exec_ctrl = vmcs12->secondary_vm_exec_control &
+                               ~SECONDARY_EXEC_ENABLE_PML;
+            exec_control |= vmcs12_exec_ctrl;
+        }
+
+        /* VMCS shadowing for L2 is emulated for now */
+        exec_control &= ~SECONDARY_EXEC_SHADOW_VMCS;
+
+        if (exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY)
+            vmcs_write16(GUEST_INTR_STATUS,
+                         vmcs12->guest_intr_status);
+
+        /*
+         * Write an illegal value to APIC_ACCESS_ADDR. Later,
+         * nested_get_vmcs12_pages will either fix it up or
+         * remove the VM execution control.
+         */
+        if (exec_control & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)
+            vmcs_write64(APIC_ACCESS_ADDR, -1ull);
+
+        if (exec_control & SECONDARY_EXEC_ENCLS_EXITING)
+            vmcs_write64(ENCLS_EXITING_BITMAP, -1ull);
+
+        vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
+    }
+
+    /*
+     * ENTRY CONTROLS
+     *
+     * vmcs12's VM_{ENTRY,EXIT}_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE
+     * are emulated by vmx_set_efer() in prepare_vmcs02(), but speculate
+     * on the related bits (if supported by the CPU) in the hope that
+     * we can avoid VMWrites during vmx_set_efer().
+     */
+    exec_control = (vmcs12->vm_entry_controls | vmcs_config.vmentry_ctrl) &
+                   ~VM_ENTRY_IA32E_MODE & ~VM_ENTRY_LOAD_IA32_EFER;
+    if (cpu_has_load_ia32_efer) {
+        if (guest_efer & EFER_LMA)
+            exec_control |= VM_ENTRY_IA32E_MODE;
+        if (guest_efer != host_efer)
+            exec_control |= VM_ENTRY_LOAD_IA32_EFER;
+    }
+    vm_entry_controls_init(vmx, exec_control);
+
+    /*
+     * EXIT CONTROLS
+     *
+     * L2->L1 exit controls are emulated - the hardware exit is to L0 so
+     * we should use its exit controls. Note that VM_EXIT_LOAD_IA32_EFER
+     * bits may be modified by vmx_set_efer() in prepare_vmcs02().
+     */
+    exec_control = vmcs_config.vmexit_ctrl;
+    if (cpu_has_load_ia32_efer && guest_efer != host_efer)
+        exec_control |= VM_EXIT_LOAD_IA32_EFER;
+    vm_exit_controls_init(vmx, exec_control);
+
+    /*
+     * Conceptually we want to copy the PML address and index from
+     * vmcs01 here, and then back to vmcs01 on nested vmexit. But,
+     * since we always flush the log on each vmexit and never change
+     * the PML address (once set), this happens to be equivalent to
+     * simply resetting the index in vmcs02.
+     */
+    if (enable_pml)
+        vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1);
+
+    /*
+     * Interrupt/Exception Fields
+     */
+    if (vmx->nested.nested_run_pending) {
+        vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
+                     vmcs12->vm_entry_intr_info_field);
+        vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE,
+                     vmcs12->vm_entry_exception_error_code);
+        vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
+                     vmcs12->vm_entry_instruction_len);
+        vmcs_write32(GUEST_INTERRUPTIBILITY_INFO,
+                     vmcs12->guest_interruptibility_info);
+        vmx->loaded_vmcs->nmi_known_unmasked =
+            !(vmcs12->guest_interruptibility_info & GUEST_INTR_STATE_NMI);
+    } else {
+        vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
+    }
+}
+
+static void prepare_vmcs02_full(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
+{
     vmcs_write16(GUEST_ES_SELECTOR, vmcs12->guest_es_selector);
     vmcs_write16(GUEST_SS_SELECTOR, vmcs12->guest_ss_selector);
     vmcs_write16(GUEST_DS_SELECTOR, vmcs12->guest_ds_selector);
@@ -11948,10 +12167,6 @@ static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
     if (nested_cpu_has_xsaves(vmcs12))
         vmcs_write64(XSS_EXIT_BITMAP, vmcs12->xss_exit_bitmap);
 
-    vmcs_write64(VMCS_LINK_POINTER, -1ull);
-
-    if (cpu_has_vmx_posted_intr())
-        vmcs_write16(POSTED_INTR_NV, POSTED_INTR_NESTED_VECTOR);
 
     /*
      * Whether page-faults are trapped is determined by a combination of
@@ -11972,10 +12187,6 @@ static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
         vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH,
                      enable_ept ? vmcs12->page_fault_error_code_match : 0);
 
-    /* All VMFUNCs are currently emulated through L0 vmexits. */
-    if (cpu_has_vmx_vmfunc())
-        vmcs_write64(VM_FUNCTION_CONTROL, 0);
-
     if (cpu_has_vmx_apicv()) {
         vmcs_write64(EOI_EXIT_BITMAP0, vmcs12->eoi_exit_bitmap0);
         vmcs_write64(EOI_EXIT_BITMAP1, vmcs12->eoi_exit_bitmap1);
@@ -11983,36 +12194,14 @@ static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
         vmcs_write64(EOI_EXIT_BITMAP3, vmcs12->eoi_exit_bitmap3);
     }
 
-    /*
-     * Set host-state according to L0's settings (vmcs12 is irrelevant here)
-     * Some constant fields are set here by vmx_set_constant_host_state().
-     * Other fields are different per CPU, and will be set later when
-     * vmx_vcpu_load() is called, and when vmx_prepare_switch_to_guest()
-     * is called.
-     */
-    vmx_set_constant_host_state(vmx);
-
-    /*
-     * Set the MSR load/store lists to match L0's settings.
-     */
-    vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
     vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
-    vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host.val));
     vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);
-    vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest.val));
 
     set_cr4_guest_host_mask(vmx);
 
     if (vmx_mpx_supported())
         vmcs_write64(GUEST_BNDCFGS, vmcs12->guest_bndcfgs);
 
-    if (enable_vpid) {
-        if (nested_cpu_has_vpid(vmcs12) && vmx->nested.vpid02)
-            vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->nested.vpid02);
-        else
-            vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
-    }
-
     /*
      * L1 may access the L2's PDPTR, so save them to construct vmcs12
      */
@@ -12022,9 +12211,6 @@ static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
         vmcs_write64(GUEST_PDPTR2, vmcs12->guest_pdptr2);
         vmcs_write64(GUEST_PDPTR3, vmcs12->guest_pdptr3);
     }
-
-    if (cpu_has_vmx_msr_bitmap())
-        vmcs_write64(MSR_BITMAP, __pa(vmx->nested.vmcs02.msr_bitmap));
 }
 
 /*
@@ -12042,11 +12228,9 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
                           u32 *entry_failure_code)
 {
     struct vcpu_vmx *vmx = to_vmx(vcpu);
-    u32 exec_control, vmcs12_exec_ctrl;
-    u64 guest_efer;
 
     if (vmx->nested.dirty_vmcs12) {
-        prepare_vmcs02_full(vcpu, vmcs12);
+        prepare_vmcs02_full(vmx, vmcs12);
         vmx->nested.dirty_vmcs12 = false;
     }
 
@@ -12069,122 +12253,12 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
         kvm_set_dr(vcpu, 7, vcpu->arch.dr7);
         vmcs_write64(GUEST_IA32_DEBUGCTL, vmx->nested.vmcs01_debugctl);
     }
-    if (vmx->nested.nested_run_pending) {
-        vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
-                     vmcs12->vm_entry_intr_info_field);
-        vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE,
-                     vmcs12->vm_entry_exception_error_code);
-        vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
-                     vmcs12->vm_entry_instruction_len);
-        vmcs_write32(GUEST_INTERRUPTIBILITY_INFO,
-                     vmcs12->guest_interruptibility_info);
-        vmx->loaded_vmcs->nmi_known_unmasked =
-            !(vmcs12->guest_interruptibility_info & GUEST_INTR_STATE_NMI);
-    } else {
-        vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
-    }
     vmx_set_rflags(vcpu, vmcs12->guest_rflags);
 
-    exec_control = vmcs12->pin_based_vm_exec_control;
-
-    /* Preemption timer setting is only taken from vmcs01. */
-    exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
-    exec_control |= vmcs_config.pin_based_exec_ctrl;
-    if (vmx->hv_deadline_tsc == -1)
-        exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
-
-    /* Posted interrupts setting is only taken from vmcs12. */
-    if (nested_cpu_has_posted_intr(vmcs12)) {
-        vmx->nested.posted_intr_nv = vmcs12->posted_intr_nv;
-        vmx->nested.pi_pending = false;
-    } else {
-        exec_control &= ~PIN_BASED_POSTED_INTR;
-    }
-
-    vmcs_write32(PIN_BASED_VM_EXEC_CONTROL, exec_control);
-
     vmx->nested.preemption_timer_expired = false;
     if (nested_cpu_has_preemption_timer(vmcs12))
         vmx_start_preemption_timer(vcpu);
 
-    if (cpu_has_secondary_exec_ctrls()) {
-        exec_control = vmx->secondary_exec_control;
-
-        /* Take the following fields only from vmcs12 */
-        exec_control &= ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
-                          SECONDARY_EXEC_ENABLE_INVPCID |
-                          SECONDARY_EXEC_RDTSCP |
-                          SECONDARY_EXEC_XSAVES |
-                          SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
-                          SECONDARY_EXEC_APIC_REGISTER_VIRT |
-                          SECONDARY_EXEC_ENABLE_VMFUNC);
-        if (nested_cpu_has(vmcs12,
-                           CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)) {
-            vmcs12_exec_ctrl = vmcs12->secondary_vm_exec_control &
-                               ~SECONDARY_EXEC_ENABLE_PML;
-            exec_control |= vmcs12_exec_ctrl;
-        }
-
-        /* VMCS shadowing for L2 is emulated for now */
-        exec_control &= ~SECONDARY_EXEC_SHADOW_VMCS;
-
-        if (exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY)
-            vmcs_write16(GUEST_INTR_STATUS,
-                         vmcs12->guest_intr_status);
-
-        /*
-         * Write an illegal value to APIC_ACCESS_ADDR. Later,
-         * nested_get_vmcs12_pages will either fix it up or
-         * remove the VM execution control.
-         */
-        if (exec_control & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)
-            vmcs_write64(APIC_ACCESS_ADDR, -1ull);
-
-        if (exec_control & SECONDARY_EXEC_ENCLS_EXITING)
-            vmcs_write64(ENCLS_EXITING_BITMAP, -1ull);
-
-        vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
-    }
-
-    /*
-     * HOST_RSP is normally set correctly in vmx_vcpu_run() just before
-     * entry, but only if the current (host) sp changed from the value
-     * we wrote last (vmx->host_rsp). This cache is no longer relevant
-     * if we switch vmcs, and rather than hold a separate cache per vmcs,
-     * here we just force the write to happen on entry.
-     */
-    vmx->host_rsp = 0;
-
-    exec_control = vmx_exec_control(vmx); /* L0's desires */
-    exec_control &= ~CPU_BASED_VIRTUAL_INTR_PENDING;
-    exec_control &= ~CPU_BASED_VIRTUAL_NMI_PENDING;
-    exec_control &= ~CPU_BASED_TPR_SHADOW;
-    exec_control |= vmcs12->cpu_based_vm_exec_control;
-
-    /*
-     * Write an illegal value to VIRTUAL_APIC_PAGE_ADDR. Later, if
-     * nested_get_vmcs12_pages can't fix it up, the illegal value
-     * will result in a VM entry failure.
-     */
-    if (exec_control & CPU_BASED_TPR_SHADOW) {
-        vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, -1ull);
-        vmcs_write32(TPR_THRESHOLD, vmcs12->tpr_threshold);
-    } else {
-#ifdef CONFIG_X86_64
-        exec_control |= CPU_BASED_CR8_LOAD_EXITING |
-                        CPU_BASED_CR8_STORE_EXITING;
-#endif
-    }
-
-    /*
-     * A vmexit (to either L1 hypervisor or L0 userspace) is always needed
-     * for I/O port accesses.
-     */
-    exec_control &= ~CPU_BASED_USE_IO_BITMAPS;
-    exec_control |= CPU_BASED_UNCOND_IO_EXITING;
-
-    vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, exec_control);
-
     /* EXCEPTION_BITMAP and CR0_GUEST_HOST_MASK should basically be the
      * bitwise-or of what L1 wants to trap for L2, and what we want to
      * trap. Note that CR0.TS also needs updating - we do this later.
@@ -12193,30 +12267,6 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
     vcpu->arch.cr0_guest_owned_bits &= ~vmcs12->cr0_guest_host_mask;
     vmcs_writel(CR0_GUEST_HOST_MASK, ~vcpu->arch.cr0_guest_owned_bits);
 
-    /* L2->L1 exit controls are emulated - the hardware exit is to L0 so
-     * we should use its exit controls.
*/ - if (nested_cpu_has_posted_intr(vmcs12)) { - vmx->nested.posted_intr_nv = vmcs12->posted_intr_nv; - vmx->nested.pi_pending = false; - } else { - exec_control &= ~PIN_BASED_POSTED_INTR; - } - - vmcs_write32(PIN_BASED_VM_EXEC_CONTROL, exec_control); - vmx->nested.preemption_timer_expired = false; if (nested_cpu_has_preemption_timer(vmcs12)) vmx_start_preemption_timer(vcpu); - if (cpu_has_secondary_exec_ctrls()) { - exec_control = vmx->secondary_exec_control; - - /* Take the following fields only from vmcs12 */ - exec_control &= ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES | - SECONDARY_EXEC_ENABLE_INVPCID | - SECONDARY_EXEC_RDTSCP | - SECONDARY_EXEC_XSAVES | - SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY | - SECONDARY_EXEC_APIC_REGISTER_VIRT | - SECONDARY_EXEC_ENABLE_VMFUNC); - if (nested_cpu_has(vmcs12, - CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)) { - vmcs12_exec_ctrl = vmcs12->secondary_vm_exec_control & - ~SECONDARY_EXEC_ENABLE_PML; - exec_control |= vmcs12_exec_ctrl; - } - - /* VMCS shadowing for L2 is emulated for now */ - exec_control &= ~SECONDARY_EXEC_SHADOW_VMCS; - - if (exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) - vmcs_write16(GUEST_INTR_STATUS, - vmcs12->guest_intr_status); - - /* - * Write an illegal value to APIC_ACCESS_ADDR. Later, - * nested_get_vmcs12_pages will either fix it up or - * remove the VM execution control. - */ - if (exec_control & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES) - vmcs_write64(APIC_ACCESS_ADDR, -1ull); - - if (exec_control & SECONDARY_EXEC_ENCLS_EXITING) - vmcs_write64(ENCLS_EXITING_BITMAP, -1ull); - - vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control); - } - - /* - * HOST_RSP is normally set correctly in vmx_vcpu_run() just before - * entry, but only if the current (host) sp changed from the value - * we wrote last (vmx->host_rsp). This cache is no longer relevant - * if we switch vmcs, and rather than hold a separate cache per vmcs, - * here we just force the write to happen on entry. - */ - vmx->host_rsp = 0; - - exec_control = vmx_exec_control(vmx); /* L0's desires */ - exec_control &= ~CPU_BASED_VIRTUAL_INTR_PENDING; - exec_control &= ~CPU_BASED_VIRTUAL_NMI_PENDING; - exec_control &= ~CPU_BASED_TPR_SHADOW; - exec_control |= vmcs12->cpu_based_vm_exec_control; - - /* - * Write an illegal value to VIRTUAL_APIC_PAGE_ADDR. Later, if - * nested_get_vmcs12_pages can't fix it up, the illegal value - * will result in a VM entry failure. - */ - if (exec_control & CPU_BASED_TPR_SHADOW) { - vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, -1ull); - vmcs_write32(TPR_THRESHOLD, vmcs12->tpr_threshold); - } else { -#ifdef CONFIG_X86_64 - exec_control |= CPU_BASED_CR8_LOAD_EXITING | - CPU_BASED_CR8_STORE_EXITING; -#endif - } - - /* - * A vmexit (to either L1 hypervisor or L0 userspace) is always needed - * for I/O port accesses. - */ - exec_control &= ~CPU_BASED_USE_IO_BITMAPS; - exec_control |= CPU_BASED_UNCOND_IO_EXITING; - - vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, exec_control); - /* EXCEPTION_BITMAP and CR0_GUEST_HOST_MASK should basically be the * bitwise-or of what L1 wants to trap for L2, and what we want to * trap. Note that CR0.TS also needs updating - we do this later. @@ -12193,30 +12267,6 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, vcpu->arch.cr0_guest_owned_bits &= ~vmcs12->cr0_guest_host_mask; vmcs_writel(CR0_GUEST_HOST_MASK, ~vcpu->arch.cr0_guest_owned_bits); - /* L2->L1 exit controls are emulated - the hardware exit is to L0 so - * we should use its exit controls. 
Note that VM_EXIT_LOAD_IA32_EFER - * bits are further modified by vmx_set_efer() below. - */ - vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl); - - /* - * vmcs12's VM_{ENTRY,EXIT}_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE - * are emulated by vmx_set_efer(), below, but speculate on the - * related bits (if supported by the CPU) in the hope that we can - * avoid VMWrites during vmx_set_efer(). - */ - exec_control = (vmcs12->vm_entry_controls | vmcs_config.vmentry_ctrl) & - ~VM_ENTRY_IA32E_MODE & ~VM_ENTRY_LOAD_IA32_EFER; - - guest_efer = nested_vmx_calc_efer(vmx, vmcs12); - if (cpu_has_load_ia32_efer) { - if (guest_efer & EFER_LMA) - exec_control |= VM_ENTRY_IA32E_MODE; - if (guest_efer != host_efer) - exec_control |= VM_ENTRY_LOAD_IA32_EFER; - } - vm_entry_controls_init(vmx, exec_control); - if (vmx->nested.nested_run_pending && (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PAT)) { vmcs_write64(GUEST_IA32_PAT, vmcs12->guest_ia32_pat); @@ -12249,18 +12299,6 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, } } - if (enable_pml) { - /* - * Conceptually we want to copy the PML address and index from - * vmcs01 here, and then back to vmcs01 on nested vmexit. But, - * since we always flush the log on each vmexit, this happens - * to be equivalent to simply resetting the fields in vmcs02. - */ - ASSERT(vmx->pml_pg); - vmcs_write64(PML_ADDRESS, page_to_phys(vmx->pml_pg)); - vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1); - } - if (nested_cpu_has_ept(vmcs12)) nested_ept_init_mmu_context(vcpu); else if (nested_cpu_has2(vmcs12, @@ -12281,7 +12319,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, vmx_set_cr4(vcpu, vmcs12->guest_cr4); vmcs_writel(CR4_READ_SHADOW, nested_read_cr4(vmcs12)); - vcpu->arch.efer = guest_efer; + vcpu->arch.efer = nested_vmx_calc_efer(vmx, vmcs12); /* Note: may modify VM_ENTRY/EXIT_CONTROLS and GUEST/HOST_IA32_EFER */ vmx_set_efer(vcpu, vcpu->arch.efer); @@ -12573,6 +12611,8 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) vcpu->arch.tsc_offset += vmcs12->tsc_offset; + prepare_vmcs02_early(vmx, vmcs12); + if (prepare_vmcs02(vcpu, vmcs12, &exit_qual)) goto fail; From patchwork Tue Aug 28 16:04:52 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 10578791 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6A810139B for ; Tue, 28 Aug 2018 16:05:16 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5A20C29566 for ; Tue, 28 Aug 2018 16:05:16 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4E9E92A662; Tue, 28 Aug 2018 16:05:16 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id ED38C29566 for ; Tue, 28 Aug 2018 16:05:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727490AbeH1T5f (ORCPT ); Tue, 28 Aug 2018 15:57:35 -0400 Received: from mga01.intel.com 
([192.55.52.88]:33258 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727333AbeH1T5e (ORCPT ); Tue, 28 Aug 2018 15:57:34 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 28 Aug 2018 09:05:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,300,1531810800"; d="scan'208";a="65826899" Received: from sjchrist-coffee.jf.intel.com ([10.54.74.9]) by fmsmga007.fm.intel.com with ESMTP; 28 Aug 2018 09:05:06 -0700 From: Sean Christopherson To: Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= Cc: kvm@vger.kernel.org, Jim Mattson , Sean Christopherson Subject: [PATCH v2 11/18] KVM: nVMX: do early preparation of vmcs02 before check_vmentry_postreqs() Date: Tue, 28 Aug 2018 09:04:52 -0700 Message-Id: <20180828160459.14093-12-sean.j.christopherson@intel.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com> References: <20180828160459.14093-1-sean.j.christopherson@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP In anticipation of using vmcs02 to do early consistency checks, move the early preparation of vmcs02 prior to checking the postreqs. The downside of this approach is that we'll unnecessarily load vmcs02 in the case that check_vmentry_postreqs() fails, but that is essentially our slow path anyway (not actually slow, but it's the path we don't really care about optimizing). Signed-off-by: Sean Christopherson Reviewed-by: Jim Mattson --- arch/x86/kvm/vmx.c | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 492dc154c31e..ed0f9de50ff7 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -12595,30 +12595,30 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, u32 exit_reason = EXIT_REASON_INVALID_STATE; u32 exit_qual; - if (from_vmentry) { - if (check_vmentry_postreqs(vcpu, vmcs12, &exit_qual)) - goto consistency_check_vmexit; - } - - enter_guest_mode(vcpu); - if (!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)) vmx->nested.vmcs01_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL); vmx_switch_vmcs(vcpu, &vmx->nested.vmcs02); - vmx_segment_cache_clear(vmx); - - if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) - vcpu->arch.tsc_offset += vmcs12->tsc_offset; prepare_vmcs02_early(vmx, vmcs12); - if (prepare_vmcs02(vcpu, vmcs12, &exit_qual)) - goto fail; - if (from_vmentry) { nested_get_vmcs12_pages(vcpu); + if (check_vmentry_postreqs(vcpu, vmcs12, &exit_qual)) + goto consistency_check_vmexit; + } + + vmx_segment_cache_clear(vmx); + + enter_guest_mode(vcpu); + if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) + vcpu->arch.tsc_offset += vmcs12->tsc_offset; + + if (prepare_vmcs02(vcpu, vmcs12, &exit_qual)) + goto fail; + + if (from_vmentry) { exit_reason = EXIT_REASON_MSR_LOAD_FAIL; exit_qual = nested_vmx_load_msr(vcpu, vmcs12->vm_entry_msr_load_addr, @@ -12648,7 +12648,6 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) vcpu->arch.tsc_offset -= vmcs12->tsc_offset; leave_guest_mode(vcpu); - vmx_switch_vmcs(vcpu, &vmx->vmcs01); /* * A consistency check VMExit during L1's VMEnter to L2 is a subset @@ -12657,6 +12656,7 @@ static int
nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, * reason and exit-qualification parameters). */ consistency_check_vmexit: + vmx_switch_vmcs(vcpu, &vmx->vmcs01); vm_entry_controls_reset_shadow(vmx); vm_exit_controls_reset_shadow(vmx); vmx_segment_cache_clear(vmx); From patchwork Tue Aug 28 16:04:53 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 10578801 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A9A25139B for ; Tue, 28 Aug 2018 16:05:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 98E0129566 for ; Tue, 28 Aug 2018 16:05:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8D8FA2A662; Tue, 28 Aug 2018 16:05:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 402B229566 for ; Tue, 28 Aug 2018 16:05:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727507AbeH1T5l (ORCPT ); Tue, 28 Aug 2018 15:57:41 -0400 Received: from mga01.intel.com ([192.55.52.88]:33260 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727398AbeH1T5e (ORCPT ); Tue, 28 Aug 2018 15:57:34 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 28 Aug 2018 09:05:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,300,1531810800"; d="scan'208";a="65826902" Received: from sjchrist-coffee.jf.intel.com ([10.54.74.9]) by fmsmga007.fm.intel.com with ESMTP; 28 Aug 2018 09:05:06 -0700 From: Sean Christopherson To: Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= Cc: kvm@vger.kernel.org, Jim Mattson , Sean Christopherson Subject: [PATCH v2 12/18] KVM: nVMX: rename label for post-enter_guest_mode consistency check Date: Tue, 28 Aug 2018 09:04:53 -0700 Message-Id: <20180828160459.14093-13-sean.j.christopherson@intel.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com> References: <20180828160459.14093-1-sean.j.christopherson@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Rename 'fail' to 'consistency_check_vmexit_guest_mode' to make it more obvious that it's simply a different entry point to the consistency check VMExit path, whose purpose is to unwind the updates done prior to calling prepare_vmcs02.
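For orientation, a rough skeleton of the resulting control flow (simplified from the diffs in this and the previous patch; unrelated code elided):

	if (prepare_vmcs02(vcpu, vmcs12, &exit_qual))
		goto consistency_check_vmexit_guest_mode;
	...
consistency_check_vmexit_guest_mode:
	/* undo the TSC offset update, leave_guest_mode(vcpu), ... */
consistency_check_vmexit:
	vmx_switch_vmcs(vcpu, &vmx->vmcs01);
	/* ... emulate the consistency check VMExit to L1 ... */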
Signed-off-by: Sean Christopherson --- arch/x86/kvm/vmx.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index ed0f9de50ff7..db4751c626ce 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -12616,7 +12616,7 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, vcpu->arch.tsc_offset += vmcs12->tsc_offset; if (prepare_vmcs02(vcpu, vmcs12, &exit_qual)) - goto fail; + goto consistency_check_vmexit_guest_mode; if (from_vmentry) { exit_reason = EXIT_REASON_MSR_LOAD_FAIL; @@ -12624,7 +12624,7 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, vmcs12->vm_entry_msr_load_addr, vmcs12->vm_entry_msr_load_count); if (exit_qual) - goto fail; + goto consistency_check_vmexit_guest_mode; } else { /* * The MMU is not initialized to point at the right entities yet and @@ -12644,7 +12644,7 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, */ return 0; -fail: +consistency_check_vmexit_guest_mode: if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING) vcpu->arch.tsc_offset -= vmcs12->tsc_offset; leave_guest_mode(vcpu); From patchwork Tue Aug 28 16:04:54 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 10578799 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CA139139B for ; Tue, 28 Aug 2018 16:05:20 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B9D572A603 for ; Tue, 28 Aug 2018 16:05:20 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AE13F2A662; Tue, 28 Aug 2018 16:05:20 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 51F5E2A603 for ; Tue, 28 Aug 2018 16:05:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727466AbeH1T5e (ORCPT ); Tue, 28 Aug 2018 15:57:34 -0400 Received: from mga01.intel.com ([192.55.52.88]:33258 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727111AbeH1T5d (ORCPT ); Tue, 28 Aug 2018 15:57:33 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 28 Aug 2018 09:05:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,300,1531810800"; d="scan'208";a="65826903" Received: from sjchrist-coffee.jf.intel.com ([10.54.74.9]) by fmsmga007.fm.intel.com with ESMTP; 28 Aug 2018 09:05:06 -0700 From: Sean Christopherson To: Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= Cc: kvm@vger.kernel.org, Jim Mattson , Sean Christopherson Subject: [PATCH v2 13/18] KVM: nVMX: do not skip VMEnter instruction that succeeds Date: Tue, 28 Aug 2018 09:04:54 -0700 Message-Id: <20180828160459.14093-14-sean.j.christopherson@intel.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com> References: <20180828160459.14093-1-sean.j.christopherson@intel.com> Sender: 
kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP A successful VMEnter is essentially a fancy indirect branch that pulls the target RIP from the VMCS. Skipping the instruction is unnecessary (RIP will get overwritten by the VMExit handler) and is problematic because it can incorrectly suppress a #DB due to EFLAGS.TF when a VMFail is detected by hardware (happens after we skip the instruction). Now that nested_vmx_run() is not prematurely skipping the instruction, use the full kvm_skip_emulated_instruction() in the VMFail path of nested_vmx_vmexit(). We also need to explicitly update the GUEST_INTERRUPTIBILITY_INFO when loading vmcs12 host state. Signed-off-by: Sean Christopherson Reviewed-by: Jim Mattson --- arch/x86/kvm/vmx.c | 21 +++++---------------- 1 file changed, 5 insertions(+), 16 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index db4751c626ce..c94d600baefe 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -12735,15 +12735,6 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) goto out; } - /* - * After this point, the trap flag no longer triggers a singlestep trap - * on the vm entry instructions; don't call kvm_skip_emulated_instruction. - * This is not 100% correct; for performance reasons, we delegate most - * of the checks on host state to the processor. If those fail, - * the singlestep trap is missed. - */ - skip_emulated_instruction(vcpu); - /* * We're finally done with prerequisite checking, and can start with * the nested entry. @@ -13135,6 +13126,8 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu, kvm_register_write(vcpu, VCPU_REGS_RSP, vmcs12->host_rsp); kvm_register_write(vcpu, VCPU_REGS_RIP, vmcs12->host_rip); vmx_set_rflags(vcpu, X86_EFLAGS_FIXED); + vmx_set_interrupt_shadow(vcpu, 0); + /* * Note that calling vmx_set_cr0 is important, even if cr0 hasn't * actually changed, because vmx_set_cr0 refers to efer set above. @@ -13390,18 +13383,14 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason, * in L1 which thinks it just finished a VMLAUNCH or * VMRESUME instruction, so we need to set the failure * flag and the VM-instruction error field of the VMCS - * accordingly. + * accordingly, and skip the emulated instruction. */ nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD); + kvm_skip_emulated_instruction(vcpu); + load_vmcs12_mmu_host_state(vcpu, vmcs12); - /* - * The emulated instruction was already skipped in - * nested_vmx_run, but the updated RIP was never - * written back to the vmcs01.
- */ - skip_emulated_instruction(vcpu); vmx->fail = 0; } From patchwork Tue Aug 28 16:04:55 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 10578789 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B6B2B139B for ; Tue, 28 Aug 2018 16:05:15 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A708129566 for ; Tue, 28 Aug 2018 16:05:15 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9B1E42A662; Tue, 28 Aug 2018 16:05:15 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4F6A929566 for ; Tue, 28 Aug 2018 16:05:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727471AbeH1T5e (ORCPT ); Tue, 28 Aug 2018 15:57:34 -0400 Received: from mga01.intel.com ([192.55.52.88]:33262 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727334AbeH1T5e (ORCPT ); Tue, 28 Aug 2018 15:57:34 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 28 Aug 2018 09:05:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,300,1531810800"; d="scan'208";a="65826904" Received: from sjchrist-coffee.jf.intel.com ([10.54.74.9]) by fmsmga007.fm.intel.com with ESMTP; 28 Aug 2018 09:05:06 -0700 From: Sean Christopherson To: Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= Cc: kvm@vger.kernel.org, Jim Mattson , Sean Christopherson Subject: [PATCH v2 14/18] KVM: nVMX: do not call nested_vmx_succeed() for consistency check VMExit Date: Tue, 28 Aug 2018 09:04:55 -0700 Message-Id: <20180828160459.14093-15-sean.j.christopherson@intel.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com> References: <20180828160459.14093-1-sean.j.christopherson@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP EFLAGS is set to a fixed value on VMExit, so calling nested_vmx_succeed() is unnecessary and wrong.
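For context, the emulated VMExit path (see load_vmcs12_host_state() in the previous patch) already performs the architectural RFLAGS load; a simplified excerpt:

	/* VMExit loads RFLAGS with the fixed value (only bit 1 set), which
	 * already clears CF/PF/AF/ZF/SF/OF, the same flags that
	 * nested_vmx_succeed() would clear, so the extra call is redundant
	 * at best, and signalling "success" on a failed VMEnter is wrong.
	 */
	vmx_set_rflags(vcpu, X86_EFLAGS_FIXED);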
Signed-off-by: Sean Christopherson Reviewed-by: Jim Mattson --- arch/x86/kvm/vmx.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index c94d600baefe..8ab028fb1f5f 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -12667,7 +12667,6 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, load_vmcs12_host_state(vcpu, vmcs12); vmcs12->vm_exit_reason = exit_reason | VMX_EXIT_REASONS_FAILED_VMENTRY; vmcs12->exit_qualification = exit_qual; - nested_vmx_succeed(vcpu); if (enable_shadow_vmcs) vmx->nested.sync_shadow_vmcs = true; return 1; From patchwork Tue Aug 28 16:04:56 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 10578795 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C108A920 for ; Tue, 28 Aug 2018 16:05:18 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AF49129566 for ; Tue, 28 Aug 2018 16:05:18 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A3CEB2A656; Tue, 28 Aug 2018 16:05:18 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9DE6E29566 for ; Tue, 28 Aug 2018 16:05:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727496AbeH1T5g (ORCPT ); Tue, 28 Aug 2018 15:57:36 -0400 Received: from mga01.intel.com ([192.55.52.88]:33258 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727334AbeH1T5f (ORCPT ); Tue, 28 Aug 2018 15:57:35 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 28 Aug 2018 09:05:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,300,1531810800"; d="scan'208";a="65826906" Received: from sjchrist-coffee.jf.intel.com ([10.54.74.9]) by fmsmga007.fm.intel.com with ESMTP; 28 Aug 2018 09:05:07 -0700 From: Sean Christopherson To: Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= Cc: kvm@vger.kernel.org, Jim Mattson , Sean Christopherson Subject: [PATCH v2 15/18] KVM: nVMX: call kvm_skip_emulated_instruction in nested_vmx_{fail,succeed} Date: Tue, 28 Aug 2018 09:04:56 -0700 Message-Id: <20180828160459.14093-16-sean.j.christopherson@intel.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com> References: <20180828160459.14093-1-sean.j.christopherson@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP ... as every invocation of nested_vmx_{fail,succeed} is immediately followed by a call to kvm_skip_emulated_instruction(). This saves a bit of code and eliminates some silly paths, e.g. nested_vmx_run() ended up with a goto label purely used to call and return kvm_skip_emulated_instruction(). 
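The mechanical effect on callers is that the two-statement pattern

	nested_vmx_failValid(vcpu, VMXERR_...);
	return kvm_skip_emulated_instruction(vcpu);

collapses to

	return nested_vmx_failValid(vcpu, VMXERR_...);

with nested_vmx_succeed() and nested_vmx_failInvalid() treated the same way.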
Signed-off-by: Sean Christopherson Reviewed-by: Jim Mattson --- arch/x86/kvm/vmx.c | 199 +++++++++++++++++---------------------------- 1 file changed, 76 insertions(+), 123 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 8ab028fb1f5f..1edf82632832 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -8054,35 +8054,37 @@ static int handle_monitor(struct kvm_vcpu *vcpu) /* * The following 3 functions, nested_vmx_succeed()/failValid()/failInvalid(), - * set the success or error code of an emulated VMX instruction, as specified - * by Vol 2B, VMX Instruction Reference, "Conventions". + * set the success or error code of an emulated VMX instruction (as specified + * by Vol 2B, VMX Instruction Reference, "Conventions"), and skip the emulated + * instruction. */ -static void nested_vmx_succeed(struct kvm_vcpu *vcpu) +static int nested_vmx_succeed(struct kvm_vcpu *vcpu) { vmx_set_rflags(vcpu, vmx_get_rflags(vcpu) & ~(X86_EFLAGS_CF | X86_EFLAGS_PF | X86_EFLAGS_AF | X86_EFLAGS_ZF | X86_EFLAGS_SF | X86_EFLAGS_OF)); + return kvm_skip_emulated_instruction(vcpu); } -static void nested_vmx_failInvalid(struct kvm_vcpu *vcpu) +static int nested_vmx_failInvalid(struct kvm_vcpu *vcpu) { vmx_set_rflags(vcpu, (vmx_get_rflags(vcpu) & ~(X86_EFLAGS_PF | X86_EFLAGS_AF | X86_EFLAGS_ZF | X86_EFLAGS_SF | X86_EFLAGS_OF)) | X86_EFLAGS_CF); + return kvm_skip_emulated_instruction(vcpu); } -static void nested_vmx_failValid(struct kvm_vcpu *vcpu, +static int nested_vmx_failValid(struct kvm_vcpu *vcpu, u32 vm_instruction_error) { - if (to_vmx(vcpu)->nested.current_vmptr == -1ull) { - /* - * failValid writes the error number to the current VMCS, which - * can't be done there isn't a current VMCS. - */ - nested_vmx_failInvalid(vcpu); - return; - } + /* + * failValid writes the error number to the current VMCS, which + * can't be done if there isn't a current VMCS. 
+ */ + if (to_vmx(vcpu)->nested.current_vmptr == -1ull) + return nested_vmx_failInvalid(vcpu); + vmx_set_rflags(vcpu, (vmx_get_rflags(vcpu) & ~(X86_EFLAGS_CF | X86_EFLAGS_PF | X86_EFLAGS_AF | X86_EFLAGS_SF | X86_EFLAGS_OF)) @@ -8092,6 +8094,7 @@ static void nested_vmx_failValid(struct kvm_vcpu *vcpu, * We don't need to force a shadow sync because * VM_INSTRUCTION_ERROR is not shadowed */ + return kvm_skip_emulated_instruction(vcpu); } static void nested_vmx_abort(struct kvm_vcpu *vcpu, u32 indicator) @@ -8333,8 +8336,8 @@ static int handle_vmon(struct kvm_vcpu *vcpu) } if (vmx->nested.vmxon) { - nested_vmx_failValid(vcpu, VMXERR_VMXON_IN_VMX_ROOT_OPERATION); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_failValid(vcpu, + VMXERR_VMXON_IN_VMX_ROOT_OPERATION); } if ((vmx->msr_ia32_feature_control & VMXON_NEEDED_FEATURES) @@ -8354,21 +8357,17 @@ static int handle_vmon(struct kvm_vcpu *vcpu) * Note - IA32_VMX_BASIC[48] will never be 1 for the nested case; * which replaces physical address width with 32 */ - if (!PAGE_ALIGNED(vmptr) || (vmptr >> cpuid_maxphyaddr(vcpu))) { - nested_vmx_failInvalid(vcpu); - return kvm_skip_emulated_instruction(vcpu); - } + if (!PAGE_ALIGNED(vmptr) || (vmptr >> cpuid_maxphyaddr(vcpu))) + return nested_vmx_failInvalid(vcpu); page = kvm_vcpu_gpa_to_page(vcpu, vmptr); - if (is_error_page(page)) { - nested_vmx_failInvalid(vcpu); - return kvm_skip_emulated_instruction(vcpu); - } + if (is_error_page(page)) + return nested_vmx_failInvalid(vcpu); + if (*(u32 *)kmap(page) != VMCS12_REVISION) { kunmap(page); kvm_release_page_clean(page); - nested_vmx_failInvalid(vcpu); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_failInvalid(vcpu); } kunmap(page); kvm_release_page_clean(page); @@ -8378,8 +8377,7 @@ static int handle_vmon(struct kvm_vcpu *vcpu) if (ret) return ret; - nested_vmx_succeed(vcpu); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_succeed(vcpu); } /* @@ -8479,8 +8477,7 @@ static int handle_vmoff(struct kvm_vcpu *vcpu) if (!nested_vmx_check_permission(vcpu)) return 1; free_nested(to_vmx(vcpu)); - nested_vmx_succeed(vcpu); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_succeed(vcpu); } /* Emulate the VMCLEAR instruction */ @@ -8497,13 +8494,13 @@ static int handle_vmclear(struct kvm_vcpu *vcpu) return 1; if (!PAGE_ALIGNED(vmptr) || (vmptr >> cpuid_maxphyaddr(vcpu))) { - nested_vmx_failValid(vcpu, VMXERR_VMCLEAR_INVALID_ADDRESS); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_failValid(vcpu, + VMXERR_VMCLEAR_INVALID_ADDRESS); } if (vmptr == vmx->nested.vmxon_ptr) { - nested_vmx_failValid(vcpu, VMXERR_VMCLEAR_VMXON_POINTER); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_failValid(vcpu, + VMXERR_VMCLEAR_VMXON_POINTER); } if (vmptr == vmx->nested.current_vmptr) @@ -8513,8 +8510,7 @@ static int handle_vmclear(struct kvm_vcpu *vcpu) vmptr + offsetof(struct vmcs12, launch_state), &zero, sizeof(zero)); - nested_vmx_succeed(vcpu); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_succeed(vcpu); } static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch); @@ -8670,20 +8666,6 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx) vmcs_load(vmx->loaded_vmcs->vmcs); } -/* - * VMX instructions which assume a current vmcs12 (i.e., that VMPTRLD was - * used before) all generate the same failure when it is missing. 
- */ -static int nested_vmx_check_vmcs12(struct kvm_vcpu *vcpu) -{ - struct vcpu_vmx *vmx = to_vmx(vcpu); - if (vmx->nested.current_vmptr == -1ull) { - nested_vmx_failInvalid(vcpu); - return 0; - } - return 1; -} - static int handle_vmread(struct kvm_vcpu *vcpu) { unsigned long field; @@ -8696,8 +8678,8 @@ static int handle_vmread(struct kvm_vcpu *vcpu) if (!nested_vmx_check_permission(vcpu)) return 1; - if (!nested_vmx_check_vmcs12(vcpu)) - return kvm_skip_emulated_instruction(vcpu); + if (to_vmx(vcpu)->nested.current_vmptr == -1ull) + return nested_vmx_failInvalid(vcpu); if (!is_guest_mode(vcpu)) vmcs12 = get_vmcs12(vcpu); @@ -8706,10 +8688,8 @@ static int handle_vmread(struct kvm_vcpu *vcpu) * When vmcs->vmcs_link_pointer is -1ull, any VMREAD * to shadowed-field sets the ALU flags for VMfailInvalid. */ - if (get_vmcs12(vcpu)->vmcs_link_pointer == -1ull) { - nested_vmx_failInvalid(vcpu); - return kvm_skip_emulated_instruction(vcpu); - } + if (get_vmcs12(vcpu)->vmcs_link_pointer == -1ull) + return nested_vmx_failInvalid(vcpu); vmcs12 = get_shadow_vmcs12(vcpu); } @@ -8717,9 +8697,10 @@ static int handle_vmread(struct kvm_vcpu *vcpu) field = kvm_register_readl(vcpu, (((vmx_instruction_info) >> 28) & 0xf)); /* Read the field, zero-extended to a u64 field_value */ if (vmcs12_read_any(vmcs12, field, &field_value) < 0) { - nested_vmx_failValid(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_failValid(vcpu, + VMXERR_UNSUPPORTED_VMCS_COMPONENT); } + /* * Now copy part of this value to register or memory, as requested. * Note that the number of bits actually copied is 32 or 64 depending @@ -8737,8 +8718,7 @@ static int handle_vmread(struct kvm_vcpu *vcpu) (is_long_mode(vcpu) ? 8 : 4), NULL); } - nested_vmx_succeed(vcpu); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_succeed(vcpu); } @@ -8763,8 +8743,8 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu) if (!nested_vmx_check_permission(vcpu)) return 1; - if (!nested_vmx_check_vmcs12(vcpu)) - return kvm_skip_emulated_instruction(vcpu); + if (vmx->nested.current_vmptr == -1ull) + return nested_vmx_failInvalid(vcpu); if (vmx_instruction_info & (1u << 10)) field_value = kvm_register_readl(vcpu, @@ -8788,9 +8768,8 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu) */ if (vmcs_field_readonly(field) && !nested_cpu_has_vmwrite_any_field(vcpu)) { - nested_vmx_failValid(vcpu, + return nested_vmx_failValid(vcpu, VMXERR_VMWRITE_READ_ONLY_VMCS_COMPONENT); - return kvm_skip_emulated_instruction(vcpu); } if (!is_guest_mode(vcpu)) @@ -8800,17 +8779,14 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu) * When vmcs->vmcs_link_pointer is -1ull, any VMWRITE * to shadowed-field sets the ALU flags for VMfailInvalid. 
*/ - if (get_vmcs12(vcpu)->vmcs_link_pointer == -1ull) { - nested_vmx_failInvalid(vcpu); - return kvm_skip_emulated_instruction(vcpu); - } + if (get_vmcs12(vcpu)->vmcs_link_pointer == -1ull) + return nested_vmx_failInvalid(vcpu); vmcs12 = get_shadow_vmcs12(vcpu); - } if (vmcs12_write_any(vmcs12, field, field_value) < 0) { - nested_vmx_failValid(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_failValid(vcpu, + VMXERR_UNSUPPORTED_VMCS_COMPONENT); } /* @@ -8833,8 +8809,7 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu) } } - nested_vmx_succeed(vcpu); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_succeed(vcpu); } static void set_current_vmptr(struct vcpu_vmx *vmx, gpa_t vmptr) @@ -8863,32 +8838,30 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu) return 1; if (!PAGE_ALIGNED(vmptr) || (vmptr >> cpuid_maxphyaddr(vcpu))) { - nested_vmx_failValid(vcpu, VMXERR_VMPTRLD_INVALID_ADDRESS); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_failValid(vcpu, + VMXERR_VMPTRLD_INVALID_ADDRESS); } if (vmptr == vmx->nested.vmxon_ptr) { - nested_vmx_failValid(vcpu, VMXERR_VMPTRLD_VMXON_POINTER); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_failValid(vcpu, + VMXERR_VMPTRLD_VMXON_POINTER); } if (vmx->nested.current_vmptr != vmptr) { struct vmcs12 *new_vmcs12; struct page *page; page = kvm_vcpu_gpa_to_page(vcpu, vmptr); - if (is_error_page(page)) { - nested_vmx_failInvalid(vcpu); - return kvm_skip_emulated_instruction(vcpu); - } + if (is_error_page(page)) + return nested_vmx_failInvalid(vcpu); + new_vmcs12 = kmap(page); if (new_vmcs12->hdr.revision_id != VMCS12_REVISION || (new_vmcs12->hdr.shadow_vmcs && !nested_cpu_has_vmx_shadow_vmcs(vcpu))) { kunmap(page); kvm_release_page_clean(page); - nested_vmx_failValid(vcpu, + return nested_vmx_failValid(vcpu, VMXERR_VMPTRLD_INCORRECT_VMCS_REVISION_ID); - return kvm_skip_emulated_instruction(vcpu); } nested_release_vmcs12(vmx); @@ -8903,8 +8876,7 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu) set_current_vmptr(vmx, vmptr); } - nested_vmx_succeed(vcpu); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_succeed(vcpu); } /* Emulate the VMPTRST instruction */ @@ -8927,8 +8899,7 @@ static int handle_vmptrst(struct kvm_vcpu *vcpu) kvm_inject_page_fault(vcpu, &e); return 1; } - nested_vmx_succeed(vcpu); - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_succeed(vcpu); } /* Emulate the INVEPT instruction */ @@ -8959,9 +8930,8 @@ static int handle_invept(struct kvm_vcpu *vcpu) types = (vmx->nested.msrs.ept_caps >> VMX_EPT_EXTENT_SHIFT) & 6; if (type >= 32 || !(types & (1 << type))) { - nested_vmx_failValid(vcpu, + return nested_vmx_failValid(vcpu, VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID); - return kvm_skip_emulated_instruction(vcpu); } /* According to the Intel VMX instruction reference, the memory @@ -8984,14 +8954,13 @@ static int handle_invept(struct kvm_vcpu *vcpu) case VMX_EPT_EXTENT_CONTEXT: kvm_mmu_sync_roots(vcpu); kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu); - nested_vmx_succeed(vcpu); break; default: BUG_ON(1); break; } - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_succeed(vcpu); } static int handle_invvpid(struct kvm_vcpu *vcpu) @@ -9023,9 +8992,8 @@ static int handle_invvpid(struct kvm_vcpu *vcpu) VMX_VPID_EXTENT_SUPPORTED_MASK) >> 8; if (type >= 32 || !(types & (1 << type))) { - nested_vmx_failValid(vcpu, + return nested_vmx_failValid(vcpu, VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID); - return 
kvm_skip_emulated_instruction(vcpu); } /* according to the intel vmx instruction reference, the memory @@ -9039,19 +9007,18 @@ static int handle_invvpid(struct kvm_vcpu *vcpu) return 1; } if (operand.vpid >> 16) { - nested_vmx_failValid(vcpu, + return nested_vmx_failValid(vcpu, VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID); - return kvm_skip_emulated_instruction(vcpu); } switch (type) { case VMX_VPID_EXTENT_INDIVIDUAL_ADDR: if (!operand.vpid || is_noncanonical_address(operand.gla, vcpu)) { - nested_vmx_failValid(vcpu, + return nested_vmx_failValid(vcpu, VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID); - return kvm_skip_emulated_instruction(vcpu); } + if (cpu_has_vmx_invvpid_individual_addr() && vmx->nested.vpid02) { __invvpid(VMX_VPID_EXTENT_INDIVIDUAL_ADDR, @@ -9062,9 +9029,8 @@ static int handle_invvpid(struct kvm_vcpu *vcpu) case VMX_VPID_EXTENT_SINGLE_CONTEXT: case VMX_VPID_EXTENT_SINGLE_NON_GLOBAL: if (!operand.vpid) { - nested_vmx_failValid(vcpu, + return nested_vmx_failValid(vcpu, VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID); - return kvm_skip_emulated_instruction(vcpu); } __vmx_flush_tlb(vcpu, vmx->nested.vpid02, true); break; @@ -9076,9 +9042,7 @@ static int handle_invvpid(struct kvm_vcpu *vcpu) return kvm_skip_emulated_instruction(vcpu); } - nested_vmx_succeed(vcpu); - - return kvm_skip_emulated_instruction(vcpu); + return nested_vmx_succeed(vcpu); } static int handle_invpcid(struct kvm_vcpu *vcpu) @@ -12686,8 +12650,8 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) if (!nested_vmx_check_permission(vcpu)) return 1; - if (!nested_vmx_check_vmcs12(vcpu)) - goto out; + if (vmx->nested.current_vmptr == -1ull) + return nested_vmx_failInvalid(vcpu); vmcs12 = get_vmcs12(vcpu); @@ -12697,10 +12661,8 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) * rather than RFLAGS.ZF, and no error number is stored to the * VM-instruction error field. */ - if (vmcs12->hdr.shadow_vmcs) { - nested_vmx_failInvalid(vcpu); - goto out; - } + if (vmcs12->hdr.shadow_vmcs) + return nested_vmx_failInvalid(vcpu); if (enable_shadow_vmcs) copy_shadow_to_vmcs12(vmx); @@ -12716,23 +12678,19 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) * when using the merged vmcs02. */ if (interrupt_shadow & KVM_X86_SHADOW_INT_MOV_SS) { - nested_vmx_failValid(vcpu, - VMXERR_ENTRY_EVENTS_BLOCKED_BY_MOV_SS); - goto out; + return nested_vmx_failValid(vcpu, + VMXERR_ENTRY_EVENTS_BLOCKED_BY_MOV_SS); } if (vmcs12->launch_state == launch) { - nested_vmx_failValid(vcpu, + return nested_vmx_failValid(vcpu, launch ? VMXERR_VMLAUNCH_NONCLEAR_VMCS : VMXERR_VMRESUME_NONLAUNCHED_VMCS); - goto out; } ret = check_vmentry_prereqs(vcpu, vmcs12); - if (ret) { - nested_vmx_failValid(vcpu, ret); - goto out; - } + if (ret) + return nested_vmx_failValid(vcpu, ret); /* * We're finally done with prerequisite checking, and can start with @@ -12771,9 +12729,6 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) return kvm_vcpu_halt(vcpu); } return 1; - -out: - return kvm_skip_emulated_instruction(vcpu); } /* @@ -13376,7 +13331,7 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason, return; } - + /* * After an early L2 VM-entry failure, we're now back * in L1 which thinks it just finished a VMLAUNCH or @@ -13384,9 +13339,7 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason, * flag and the VM-instruction error field of the VMCS * accordingly, and skip the emulated instruction. 
*/ - nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD); - - kvm_skip_emulated_instruction(vcpu); + (void)nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD); load_vmcs12_mmu_host_state(vcpu, vmcs12); From patchwork Tue Aug 28 16:04:57 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 10578811 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 915F2920 for ; Tue, 28 Aug 2018 16:05:26 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 80EFC29566 for ; Tue, 28 Aug 2018 16:05:26 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 756FB2A656; Tue, 28 Aug 2018 16:05:26 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 30CB72A603 for ; Tue, 28 Aug 2018 16:05:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727526AbeH1T5p (ORCPT ); Tue, 28 Aug 2018 15:57:45 -0400 Received: from mga01.intel.com ([192.55.52.88]:33260 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727336AbeH1T5e (ORCPT ); Tue, 28 Aug 2018 15:57:34 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 28 Aug 2018 09:05:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,300,1531810800"; d="scan'208";a="65826909" Received: from sjchrist-coffee.jf.intel.com ([10.54.74.9]) by fmsmga007.fm.intel.com with ESMTP; 28 Aug 2018 09:05:07 -0700 From: Sean Christopherson To: Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= Cc: kvm@vger.kernel.org, Jim Mattson , Sean Christopherson Subject: [PATCH v2 16/18] KVM: vmx: write HOST_IA32_EFER in vmx_set_constant_host_state() Date: Tue, 28 Aug 2018 09:04:57 -0700 Message-Id: <20180828160459.14093-17-sean.j.christopherson@intel.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com> References: <20180828160459.14093-1-sean.j.christopherson@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP EFER is constant in the host and writing it once during setup means we can skip writing the host value in add_atomic_switch_msr_special(). 
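In code form, mirroring the diff below, the one-time write lands in vmx_set_constant_host_state():

	/* host_efer never changes after setup, so write it once per VMCS */
	if (cpu_has_load_ia32_efer)
		vmcs_write64(HOST_IA32_EFER, host_efer);

after which add_atomic_switch_msr_special() can skip the VMWRITE when host_val_vmcs is HOST_IA32_EFER.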
Signed-off-by: Sean Christopherson Reviewed-by: Jim Mattson --- arch/x86/kvm/vmx.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 1edf82632832..3d826550cc9d 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2697,7 +2697,8 @@ static void add_atomic_switch_msr_special(struct vcpu_vmx *vmx, u64 guest_val, u64 host_val) { vmcs_write64(guest_val_vmcs, guest_val); - vmcs_write64(host_val_vmcs, host_val); + if (host_val_vmcs != HOST_IA32_EFER) + vmcs_write64(host_val_vmcs, host_val); vm_entry_controls_setbit(vmx, entry); vm_exit_controls_setbit(vmx, exit); } @@ -6329,6 +6330,9 @@ static void vmx_set_constant_host_state(struct vcpu_vmx *vmx) rdmsr(MSR_IA32_CR_PAT, low32, high32); vmcs_write64(HOST_IA32_PAT, low32 | ((u64) high32 << 32)); } + + if (cpu_has_load_ia32_efer) + vmcs_write64(HOST_IA32_EFER, host_efer); } static void set_cr4_guest_host_mask(struct vcpu_vmx *vmx) From patchwork Tue Aug 28 16:04:58 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 10578807 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AE50E920 for ; Tue, 28 Aug 2018 16:05:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9E7412A603 for ; Tue, 28 Aug 2018 16:05:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9263C29566; Tue, 28 Aug 2018 16:05:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C867D29566 for ; Tue, 28 Aug 2018 16:05:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727518AbeH1T5o (ORCPT ); Tue, 28 Aug 2018 15:57:44 -0400 Received: from mga01.intel.com ([192.55.52.88]:33262 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727379AbeH1T5e (ORCPT ); Tue, 28 Aug 2018 15:57:34 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 28 Aug 2018 09:05:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.53,300,1531810800"; d="scan'208";a="65826911" Received: from sjchrist-coffee.jf.intel.com ([10.54.74.9]) by fmsmga007.fm.intel.com with ESMTP; 28 Aug 2018 09:05:07 -0700 From: Sean Christopherson To: Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= Cc: kvm@vger.kernel.org, Jim Mattson , Sean Christopherson Subject: [PATCH v2 17/18] KVM: nVMX: add option to perform early consistency checks via H/W Date: Tue, 28 Aug 2018 09:04:58 -0700 Message-Id: <20180828160459.14093-18-sean.j.christopherson@intel.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com> References: <20180828160459.14093-1-sean.j.christopherson@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP KVM defers many VMX consistency checks to the CPU, 
ostensibly for performance reasons[1], including checks that result in VMFail (as opposed to VMExit). This behavior may be undesirable for some users since this means KVM detects certain classes of VMFail only after it has processed guest state, e.g. emulated MSR load-on-entry. Because there is a strict ordering between checks that cause VMFail and those that cause VMExit, i.e. all VMFail checks are performed before any checks that cause VMExit, we can detect all VMFail conditions via a dry run of sorts. After preparing vmcs02 with all state needed to pass the VMFail consistency checks, optionally do a "test" VMEnter with an invalid GUEST_RIP. If the VMEnter results in a VMExit (due to bad guest state), then we can safely say that the nested VMEnter should not VMFail, i.e. any VMFail encountered in nested_vmx_vmexit() must be due to an L0 bug. Unfortunately, since the "passing" case causes a VMExit, KVM must be extra diligent to ensure that host state is restored, e.g. DR7 and RFLAGS are reset on VMExit. Failure to restore RFLAGS.IF is particularly fatal. And of course the extra VMEnter and VMExit impact performance. The raw overhead of the early consistency checks is ~6% on modern hardware (though this could easily vary based on configuration), while the added latency observed from the L1 VMM is ~10%. The early consistency checks do not occur in a vacuum, e.g. spending more time in L0 can lead to more interrupts being serviced while emulating VMEnter, thereby increasing the latency observed by L1. Add a module param, early_consistency_checks, to provide control over whether or not VMX performs the early consistency checks. In addition to standard on/off behavior, the param accepts a value of -1, which is essentially an "auto" setting whereby KVM does the early checks only when it thinks it's running on bare metal. When running nested, doing early checks is of dubious value since the resulting behavior is heavily dependent on L0. In the future, the "auto" setting could also be used to default to skipping the early hardware checks for certain configurations/platforms if KVM reaches a state where it has 100% coverage of VMFail conditions. [1] To my knowledge no one has implemented and tested full software emulation of the VMFail consistency checks. Until that happens, one can only speculate about the actual performance overhead of doing all VMFail consistency checks in software. Obviously any code is slower than no code, but in the grand scheme of nested virtualization it's entirely possible the overhead is negligible. Signed-off-by: Sean Christopherson --- arch/x86/kvm/vmx.c | 151 +++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 146 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 3d826550cc9d..311cd32ab584 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -54,6 +54,7 @@ #include #include #include +#include #include "trace.h" #include "pmu.h" @@ -110,6 +111,15 @@ module_param_named(enable_shadow_vmcs, enable_shadow_vmcs, bool, S_IRUGO); static bool __read_mostly nested = 0; module_param(nested, bool, S_IRUGO); +/* + * early_consistency_checks is a tri-state: + * -1 == "auto" + * 0 == "never" + * 1 == "always".
+ */ +static int __read_mostly early_consistency_checks = -1; +module_param(early_consistency_checks, int, 0444); + static u64 __read_mostly host_xss; static bool __read_mostly enable_pml = 1; @@ -188,6 +198,7 @@ static unsigned int ple_window_max = KVM_VMX_DEFAULT_PLE_WINDOW_MAX; module_param(ple_window_max, uint, 0444); extern const ulong vmx_return; +extern const ulong vmx_early_consistency_check_return; static DEFINE_STATIC_KEY_FALSE(vmx_l1d_should_flush); static DEFINE_STATIC_KEY_FALSE(vmx_l1d_flush_cond); @@ -7904,6 +7915,10 @@ static __init int hardware_setup(void) if (!cpu_has_virtual_nmis()) enable_vnmi = 0; + if (early_consistency_checks < 0 && + !hypervisor_is_type(X86_HYPER_NATIVE)) + early_consistency_checks = 0; + /* * set_apic_access_page_addr() is used to reload apic access * page upon invalidation. No need to do anything if not @@ -11883,6 +11898,14 @@ static void prepare_vmcs02_first_run(struct vcpu_vmx *vmx) if (vmx->nested.vmcs02.launched) return; + /* + * We don't care what the EPTP value is we just need to guarantee + * it's valid so we don't get a false positive when doing early + * consistency checks. + */ + if (enable_ept && early_consistency_checks) + vmcs_write64(EPT_POINTER, construct_eptp(&vmx->vcpu, 0)); + /* All VMFUNCs are currently emulated through L0 vmexits. */ if (cpu_has_vmx_vmfunc()) vmcs_write64(VM_FUNCTION_CONTROL, 0); @@ -11936,7 +11959,9 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12) * entry, but only if the current (host) sp changed from the value * we wrote last (vmx->host_rsp). This cache is no longer relevant * if we switch vmcs, and rather than hold a separate cache per vmcs, - * here we just force the write to happen on entry. + * here we just force the write to happen on entry. host_rsp will + * also be written unconditionally by nested_vmx_check_vmentry_hw() + * if we are doing early consistency checks via hardware. */ vmx->host_rsp = 0; @@ -12549,11 +12574,121 @@ static int check_vmentry_postreqs(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, return 0; } +static int __noclone nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu) +{ + struct vcpu_vmx *vmx = to_vmx(vcpu); + unsigned long cr3, cr4; + + if (!early_consistency_checks) + return 0; + + if (vmx->msr_autoload.host.nr) + vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, 0); + if (vmx->msr_autoload.guest.nr) + vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, 0); + + preempt_disable(); + + vmx_prepare_switch_to_guest(vcpu); + + /* + * prepare_vmcs02() writes GUEST_RIP unconditionally, no need + * to save/restore or set dirty bits. 
+ */ + vmcs_writel(GUEST_RIP, 0xf0f0ULL << 48); + + vmcs_writel(HOST_RIP, vmx_early_consistency_check_return); + + cr3 = __get_current_cr3_fast(); + if (unlikely(cr3 != vmx->loaded_vmcs->host_state.cr3)) { + vmcs_writel(HOST_CR3, cr3); + vmx->loaded_vmcs->host_state.cr3 = cr3; + } + + cr4 = cr4_read_shadow(); + if (unlikely(cr4 != vmx->loaded_vmcs->host_state.cr4)) { + vmcs_writel(HOST_CR4, cr4); + vmx->loaded_vmcs->host_state.cr4 = cr4; + } + + vmx->__launched = vmx->loaded_vmcs->launched; + + asm( + /* Set HOST_RSP */ + __ex(ASM_VMX_VMWRITE_RSP_RDX) "\n\t" + "mov %%" _ASM_SP ", %c[host_rsp](%0)\n\t" + + /* Check if vmlaunch of vmresume is needed */ + "cmpl $0, %c[launched](%0)\n\t" + "je 1f\n\t" + __ex(ASM_VMX_VMRESUME) "\n\t" + "jmp 2f\n\t" + "1: " __ex(ASM_VMX_VMLAUNCH) "\n\t" + "jmp 2f\n\t" + "2: " + + /* Set vmx->fail accordingly */ + "setbe %c[fail](%0)\n\t" + + ".pushsection .rodata\n\t" + ".global vmx_early_consistency_check_return\n\t" + "vmx_early_consistency_check_return: " _ASM_PTR " 2b\n\t" + ".popsection" + : + : "c"(vmx), "d"((unsigned long)HOST_RSP), + [launched]"i"(offsetof(struct vcpu_vmx, __launched)), + [fail]"i"(offsetof(struct vcpu_vmx, fail)), + [host_rsp]"i"(offsetof(struct vcpu_vmx, host_rsp)) + : "rax", "cc", "memory" + ); + + vmcs_writel(HOST_RIP, vmx_return); + + preempt_enable(); + + if (vmx->msr_autoload.host.nr) + vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr); + if (vmx->msr_autoload.guest.nr) + vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr); + + if (vmx->fail) { + WARN_ON_ONCE(vmcs_read32(VM_INSTRUCTION_ERROR) != + VMXERR_ENTRY_INVALID_CONTROL_FIELD); + vmx->fail = 0; + return 1; + } + + /* + * VMExit clears RFLAGS.IF and DR7, even on a consistency check. + */ + local_irq_enable(); + if (hw_breakpoint_active()) + set_debugreg(__this_cpu_read(cpu_dr7), 7); + + /* + * A non-failing VMEntry means we somehow entered guest mode with + * an illegal RIP, and that's just the tip of the iceberg. There + * is no telling what memory has been modified or what state has + * been exposed to unknown code. Hitting this all but guarantees + * a (very critical) hardware issue. + */ + WARN_ON(!(vmcs_read32(VM_EXIT_REASON) & + VMX_EXIT_REASONS_FAILED_VMENTRY)); + + return 0; +} +STACK_FRAME_NON_STANDARD(nested_vmx_check_vmentry_hw); + static void load_vmcs12_host_state(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12); /* * If exit_qual is NULL, this is being called from state restore (either RSM * or KVM_SET_NESTED_STATE). Otherwise it's called from vmlaunch/vmresume. + * + * Returns: + * 0 - success, i.e. proceed with actual VMEnter + * 1 - consistency check VMExit + * -1 - consistency check VMFail */ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, bool from_vmentry) @@ -12573,6 +12708,11 @@ static int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, if (from_vmentry) { nested_get_vmcs12_pages(vcpu); + if (nested_vmx_check_vmentry_hw(vcpu)) { + vmx_switch_vmcs(vcpu, &vmx->vmcs01); + return -1; + } + if (check_vmentry_postreqs(vcpu, vmcs12, &exit_qual)) goto consistency_check_vmexit; } @@ -12700,13 +12840,14 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) * We're finally done with prerequisite checking, and can start with * the nested entry. 
 	 */
-	vmx->nested.nested_run_pending = 1;
 	ret = nested_vmx_enter_non_root_mode(vcpu, true);
-	if (ret) {
-		vmx->nested.nested_run_pending = 0;
+	vmx->nested.nested_run_pending = !ret;
+	if (ret > 0)
 		return 1;
-	}
+	else if (ret)
+		return nested_vmx_failValid(vcpu,
+					    VMXERR_ENTRY_INVALID_CONTROL_FIELD);
 
 	/* Hide L1D cache contents from the nested guest. */
 	vmx->vcpu.arch.l1tf_flush_l1d = true;

From patchwork Tue Aug 28 16:04:59 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 10578797
From: Sean Christopherson
To: Paolo Bonzini, Radim Krčmář
Cc: kvm@vger.kernel.org, Jim Mattson, Sean Christopherson
Subject: [PATCH v2 18/18] KVM: nVMX: WARN if nested run hits VMFail with early consistency checks enabled
Date: Tue, 28 Aug 2018 09:04:59 -0700
Message-Id: <20180828160459.14093-19-sean.j.christopherson@intel.com>
X-Mailer: git-send-email 2.18.0
In-Reply-To: <20180828160459.14093-1-sean.j.christopherson@intel.com>
References: <20180828160459.14093-1-sean.j.christopherson@intel.com>
X-Mailing-List: kvm@vger.kernel.org

When early consistency checks are enabled, all VMFail conditions should be
caught by nested_vmx_check_vmentry_hw().
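Concretely, the expected outcomes in nested_vmx_vmexit() can be tabulated
as follows (an illustrative sketch, not part of the diff; the names match
those used throughout this series):

	/*
	 * vmx->fail  early checks  expected outcome
	 * ---------  ------------  ------------------------------------------
	 * false      on or off     normal nested VMExit handling, no WARN
	 * true       off           VM_INSTRUCTION_ERROR must be
	 *                          VMXERR_ENTRY_INVALID_CONTROL_FIELD
	 * true       on            should be impossible: the VMFail should
	 *                          have been caught beforehand by
	 *                          nested_vmx_check_vmentry_hw()
	 */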
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 311cd32ab584..6d0040b175e6 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -13352,14 +13352,6 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 	/* trying to cancel vmlaunch/vmresume is a bug */
 	WARN_ON_ONCE(vmx->nested.nested_run_pending);
 
-	/*
-	 * The only expected VM-instruction error is "VM entry with
-	 * invalid control field(s)."  Anything else indicates a
-	 * problem with L0.
-	 */
-	WARN_ON_ONCE(vmx->fail && (vmcs_read32(VM_INSTRUCTION_ERROR) !=
-				   VMXERR_ENTRY_INVALID_CONTROL_FIELD));
-
 	leave_guest_mode(vcpu);
 
 	if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING)
@@ -13386,6 +13378,16 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 		if (nested_vmx_store_msr(vcpu, vmcs12->vm_exit_msr_store_addr,
 					 vmcs12->vm_exit_msr_store_count))
 			nested_vmx_abort(vcpu, VMX_ABORT_SAVE_GUEST_MSR_FAIL);
+	} else {
+		/*
+		 * The only expected VM-instruction error is "VM entry with
+		 * invalid control field(s)."  Anything else indicates a
+		 * problem with L0.  And we should never get here with a
+		 * VMFail of any type if early consistency checks are enabled.
+		 */
+		WARN_ON_ONCE(vmcs_read32(VM_INSTRUCTION_ERROR) !=
+			     VMXERR_ENTRY_INVALID_CONTROL_FIELD);
+		WARN_ON_ONCE(early_consistency_checks);
 	}
 
 	vmx_switch_vmcs(vcpu, &vmx->vmcs01);
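
For readers working through the series, the resulting nested-run flow can be
condensed as below.  This is an illustrative sketch only, not actual kernel
code: nested_vmx_run_sketch() is a hypothetical condensation of
nested_vmx_run() that reuses the series' function names and elides everything
else.  Note that early_consistency_checks is a 0444 module parameter, so it
can only be set when the module is loaded, e.g.
"modprobe kvm_intel early_consistency_checks=1" (assuming the standard
kvm_intel module name; -1 = auto, 0 = off, non-zero = on).

	/*
	 * Hypothetical condensation of nested_vmx_run() after this series;
	 * illustrative only, not actual kernel code.
	 */
	static int nested_vmx_run_sketch(struct kvm_vcpu *vcpu)
	{
		struct vcpu_vmx *vmx = to_vmx(vcpu);
		int ret;

		/* Software prereq checks (elided) emulate VMFail first. */

		ret = nested_vmx_enter_non_root_mode(vcpu, true);
		vmx->nested.nested_run_pending = !ret;

		if (ret > 0)	/* consistency check VMExit, already
				 * reflected into vmcs12 for L1 */
			return 1;
		if (ret < 0)	/* hardware caught a VMFail early */
			return nested_vmx_failValid(vcpu,
					VMXERR_ENTRY_INVALID_CONTROL_FIELD);

		vmx->vcpu.arch.l1tf_flush_l1d = true;	/* hide L1D from L2 */
		return 1;	/* ret == 0: proceed with the actual VMEnter */
	}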