From patchwork Wed Sep 23 18:44:46 2020
From: Sean Christopherson
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    Dan Cross, Peter Shier
Subject: [PATCH v2 1/7] KVM: nVMX: Reset the segment cache when stuffing guest segs
Date: Wed, 23 Sep 2020 11:44:46 -0700
Message-Id: <20200923184452.980-2-sean.j.christopherson@intel.com>
In-Reply-To: <20200923184452.980-1-sean.j.christopherson@intel.com>

Explicitly reset the segment cache after stuffing guest segment regs in
prepare_vmcs02_rare().  Although the cache is reset when switching to
vmcs02, there is nothing that prevents KVM from re-populating the cache
prior to writing vmcs02 with vmcs12's values.  E.g. if the vCPU is
preempted after switching to vmcs02 but before prepare_vmcs02_rare(),
kvm_arch_vcpu_put() will dereference GUEST_SS_AR_BYTES via .get_cpl()
and cache the stale vmcs02 value.  While the current code base only
caches stale data in the preemption case, it's theoretically possible
future code could read a segment register during the nested flow itself,
i.e. this isn't technically illegal behavior in kvm_arch_vcpu_put(),
although it did introduce the bug.

This manifests as an unexpected nested VM-Enter failure when running
with unrestricted guest disabled if the above preemption case coincides
with L1 switching L2's CPL, e.g. when switching from an L2 vCPU at CPL3
to an L2 vCPU at CPL0.  stack_segment_valid() will see the new SS_SEL
but the old SS_AR_BYTES and incorrectly mark the guest state as invalid
due to SS.dpl != SS.rpl.

Don't bother updating the cache even though prepare_vmcs02_rare() writes
every segment.  With unrestricted guest, guest segments are almost never
read, let alone L2 guest segments.  On the other hand, populating the
cache requires a large number of memory writes, i.e. it's unlikely to be
a net win.  Updating the cache would be a win when unrestricted guest is
not supported, as guest_state_valid() will immediately cache all segment
registers.  But, nested virtualization without unrestricted guest is
dirt slow, saving some VMREADs won't change that, and every CPU
manufactured in the last decade supports unrestricted guest.  In other
words, the extra (minor) complexity isn't worth the trouble.

Note, kvm_arch_vcpu_put() may see stale data when querying guest CPL
depending on when preemption occurs.  This is "ok" in that the usage is
imperfect by nature, i.e. it's used heuristically to improve performance
but doesn't affect functionality.  kvm_arch_vcpu_put() could be "fixed"
by also disabling preemption while loading segments, but that's pointless
and misleading as reading state from kvm_sched_{in,out}() is guaranteed
to see stale data in one form or another.  E.g. even if all the usage of
regs_avail is fixed to call kvm_register_mark_available() after the
associated state is set, the individual state might still be stale with
respect to the overall vCPU state.  I.e. making functional decisions in
an asynchronous hook is doomed from the get-go.  Thankfully KVM doesn't
do that.

Fixes: de63ad4cf4973 ("KVM: X86: implement the logic for spinlock optimization")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/nested.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b6ce9ce91029..04441663a631 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2408,6 +2408,8 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
 		vmcs_writel(GUEST_TR_BASE, vmcs12->guest_tr_base);
 		vmcs_writel(GUEST_GDTR_BASE, vmcs12->guest_gdtr_base);
 		vmcs_writel(GUEST_IDTR_BASE, vmcs12->guest_idtr_base);
+
+		vmx->segment_cache.bitmask = 0;
 	}
 
 	if (!hv_evmcs || !(hv_evmcs->hv_clean_fields &
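For context on the one-liner above: vcpu_vmx caches segment-register
VMREAD results behind a per-field validity bitmask, so clearing the
bitmask invalidates every cached field in a single store.  A minimal
self-contained sketch of that pattern follows; the names
(seg_cache_test_set, read_seg_ar, etc.) only loosely model vmx.c and
are illustrative, not the kernel code:

#include <stdbool.h>
#include <stdint.h>

#define SEG_FIELD_SEL	0
#define SEG_FIELD_BASE	1
#define SEG_FIELD_LIMIT	2
#define SEG_FIELD_AR	3
#define SEG_FIELD_NR	4
#define NR_SEGMENTS	8

struct segment_cache {
	uint32_t bitmask;		/* one valid bit per (segment, field) */
	uint32_t ar[NR_SEGMENTS];	/* cached access-rights bytes */
};

/* Returns true on a cache hit; marks the field valid either way. */
static bool seg_cache_test_set(struct segment_cache *c, unsigned int seg,
			       unsigned int field)
{
	uint32_t mask = 1u << (seg * SEG_FIELD_NR + field);
	bool hit = c->bitmask & mask;

	c->bitmask |= mask;
	return hit;
}

static uint32_t read_seg_ar(struct segment_cache *c, unsigned int seg)
{
	if (!seg_cache_test_set(c, seg, SEG_FIELD_AR))
		c->ar[seg] = 0;	/* stand-in for vmcs_read32(GUEST_*_AR_BYTES) */
	return c->ar[seg];
}

/*
 * The fix in miniature: after blasting vmcs12's values into vmcs02 with
 * vmcs_write*(), drop every cached field so a reader can't be handed a
 * value cached before the writes (e.g. SS_AR_BYTES cached by .get_cpl()
 * when the vCPU was preempted mid-flow).
 */
static void seg_cache_clear(struct segment_cache *c)
{
	c->bitmask = 0;
}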
From patchwork Wed Sep 23 18:44:47 2020
From: Sean Christopherson
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    Dan Cross, Peter Shier
Subject: [PATCH v2 2/7] KVM: nVMX: Reload vmcs01 if getting vmcs12's pages fails
Date: Wed, 23 Sep 2020 11:44:47 -0700
Message-Id: <20200923184452.980-3-sean.j.christopherson@intel.com>
In-Reply-To: <20200923184452.980-1-sean.j.christopherson@intel.com>

Reload vmcs01 when bailing from nested_vmx_enter_non_root_mode() as KVM
expects vmcs01 to be loaded when is_guest_mode() is false.

Fixes: 671ddc700fd08 ("KVM: nVMX: Don't leak L1 MMIO regions to L2")
Cc: stable@vger.kernel.org
Cc: Dan Cross
Cc: Jim Mattson
Cc: Peter Shier
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/nested.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 04441663a631..171e34286908 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3346,8 +3346,10 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
 	prepare_vmcs02_early(vmx, vmcs12);
 
 	if (from_vmentry) {
-		if (unlikely(!nested_get_vmcs12_pages(vcpu)))
+		if (unlikely(!nested_get_vmcs12_pages(vcpu))) {
+			vmx_switch_vmcs(vcpu, &vmx->vmcs01);
 			return NVMX_VMENTRY_KVM_INTERNAL_ERROR;
+		}
 
 		if (nested_vmx_check_vmentry_hw(vcpu)) {
 			vmx_switch_vmcs(vcpu, &vmx->vmcs01);
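The invariant these failure paths must restore, restated as a comment
(paraphrasing the commit message above, not a verbatim excerpt from
nested.c):

/*
 * Invariant assumed throughout KVM's VMX code: whenever is_guest_mode()
 * is false, vmx->loaded_vmcs must point at vmcs01, not vmcs02.
 *
 * nested_vmx_enter_non_root_mode() switches to vmcs02 before it can fail
 * in nested_get_vmcs12_pages(), so bailing without switching back leaves
 * vmcs02 loaded after guest mode has been abandoned, and any subsequent
 * vmcs_write*() intended for vmcs01 silently lands in vmcs02 instead.
 */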
From patchwork Wed Sep 23 18:44:48 2020
From: Sean Christopherson
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    Dan Cross, Peter Shier
Subject: [PATCH v2 3/7] KVM: nVMX: Explicitly check for valid guest state for !unrestricted guest
Date: Wed, 23 Sep 2020 11:44:48 -0700
Message-Id: <20200923184452.980-4-sean.j.christopherson@intel.com>
In-Reply-To: <20200923184452.980-1-sean.j.christopherson@intel.com>

Call guest_state_valid() directly instead of querying emulation_required
when checking if L1 is attempting VM-Enter with invalid guest state.  If
emulate_invalid_guest_state is false, KVM will fixup segment regs to
avoid emulation and will never set emulation_required, i.e. KVM will
incorrectly miss the associated consistency checks because the nested
path stuffs segments directly into vmcs02.

Opportunistically add Consistency Check tracing to make future debug
suck a little less.

Fixes: 2bb8cafea80bf ("KVM: vVMX: signal failure for nested VMEntry if emulation_required")
Fixes: 3184a995f782c ("KVM: nVMX: fix vmentry failure code when L2 state would require emulation")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/nested.c | 2 +-
 arch/x86/kvm/vmx/vmx.c    | 8 ++------
 arch/x86/kvm/vmx/vmx.h    | 9 +++++++++
 3 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 171e34286908..a50714a86dde 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2573,7 +2573,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
 	 * which means L1 attempted VMEntry to L2 with invalid state.
 	 * Fail the VMEntry.
 	 */
-	if (vmx->emulation_required) {
+	if (CC(!vmx_guest_state_valid(vcpu))) {
 		*entry_failure_code = ENTRY_FAIL_DEFAULT;
 		return -EINVAL;
 	}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6f9a0c6d5dc5..e8480dbef881 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -341,7 +341,6 @@ static const struct kernel_param_ops vmentry_l1d_flush_ops = {
 };
 module_param_cb(vmentry_l1d_flush, &vmentry_l1d_flush_ops, NULL, 0644);
 
-static bool guest_state_valid(struct kvm_vcpu *vcpu);
 static u32 vmx_segment_access_rights(struct kvm_segment *var);
 static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
 							  u32 msr, int type);
@@ -1415,7 +1414,7 @@ static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
 
 static bool emulation_required(struct kvm_vcpu *vcpu)
 {
-	return emulate_invalid_guest_state && !guest_state_valid(vcpu);
+	return emulate_invalid_guest_state && !vmx_guest_state_valid(vcpu);
 }
 
 unsigned long vmx_get_rflags(struct kvm_vcpu *vcpu)
@@ -3471,11 +3470,8 @@ static bool cs_ss_rpl_check(struct kvm_vcpu *vcpu)
  * not.
  * We assume that registers are always usable
  */
-static bool guest_state_valid(struct kvm_vcpu *vcpu)
+bool __vmx_guest_state_valid(struct kvm_vcpu *vcpu)
 {
-	if (enable_unrestricted_guest)
-		return true;
-
 	/* real mode guest state checks */
 	if (!is_protmode(vcpu) || (vmx_get_rflags(vcpu) & X86_EFLAGS_VM)) {
 		if (!rmode_segment_valid(vcpu, VCPU_SREG_CS))
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index d7ec66db5eb8..e147f180350f 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -343,6 +343,15 @@ void vmx_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg);
 void vmx_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg);
 u64 construct_eptp(struct kvm_vcpu *vcpu, unsigned long root_hpa, int root_level);
+
+bool __vmx_guest_state_valid(struct kvm_vcpu *vcpu);
+static inline bool vmx_guest_state_valid(struct kvm_vcpu *vcpu)
+{
+	if (enable_unrestricted_guest)
+		return true;
+
+	return __vmx_guest_state_valid(vcpu);
+}
 void update_exception_bitmap(struct kvm_vcpu *vcpu);
 void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
 bool vmx_nmi_blocked(struct kvm_vcpu *vcpu);
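A note on the CC() usage added in prepare_vmcs02() above: CC() is
nested.c's consistency-check tracing helper.  It evaluates the check
and, if the check fails, emits the stringified expression via the
kvm_nested_vmenter_failed tracepoint, so a failed nested VM-Enter names
the exact check that tripped.  Its definition is approximately the
following; see arch/x86/kvm/vmx/nested.c for the authoritative version:

#define CC(consistency_check)						\
({									\
	bool failed = (consistency_check);				\
	if (failed)							\
		trace_kvm_nested_vmenter_failed(#consistency_check, 0);	\
	failed;								\
})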
From patchwork Wed Sep 23 18:44:49 2020
From: Sean Christopherson
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    Dan Cross, Peter Shier
Subject: [PATCH v2 4/7] KVM: nVMX: Move free_nested() below vmx_switch_vmcs()
Date: Wed, 23 Sep 2020 11:44:49 -0700
Message-Id: <20200923184452.980-5-sean.j.christopherson@intel.com>
In-Reply-To: <20200923184452.980-1-sean.j.christopherson@intel.com>

Move free_nested() down below vmx_switch_vmcs() so that a future patch
can do an "emergency" invocation of vmx_switch_vmcs() if vmcs01 is not
the loaded VMCS when freeing nested resources.

No functional change intended.

Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/nested.c | 88 +++++++++++++++++++--------------------
 1 file changed, 44 insertions(+), 44 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index a50714a86dde..03dddf1b6009 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -233,50 +233,6 @@ static inline void nested_release_evmcs(struct kvm_vcpu *vcpu)
 	vmx->nested.hv_evmcs = NULL;
 }
 
-/*
- * Free whatever needs to be freed from vmx->nested when L1 goes down, or
- * just stops using VMX.
- */
-static void free_nested(struct kvm_vcpu *vcpu)
-{
-	struct vcpu_vmx *vmx = to_vmx(vcpu);
-
-	if (!vmx->nested.vmxon && !vmx->nested.smm.vmxon)
-		return;
-
-	kvm_clear_request(KVM_REQ_GET_VMCS12_PAGES, vcpu);
-
-	vmx->nested.vmxon = false;
-	vmx->nested.smm.vmxon = false;
-	free_vpid(vmx->nested.vpid02);
-	vmx->nested.posted_intr_nv = -1;
-	vmx->nested.current_vmptr = -1ull;
-	if (enable_shadow_vmcs) {
-		vmx_disable_shadow_vmcs(vmx);
-		vmcs_clear(vmx->vmcs01.shadow_vmcs);
-		free_vmcs(vmx->vmcs01.shadow_vmcs);
-		vmx->vmcs01.shadow_vmcs = NULL;
-	}
-	kfree(vmx->nested.cached_vmcs12);
-	vmx->nested.cached_vmcs12 = NULL;
-	kfree(vmx->nested.cached_shadow_vmcs12);
-	vmx->nested.cached_shadow_vmcs12 = NULL;
-	/* Unpin physical memory we referred to in the vmcs02 */
-	if (vmx->nested.apic_access_page) {
-		kvm_release_page_clean(vmx->nested.apic_access_page);
-		vmx->nested.apic_access_page = NULL;
-	}
-	kvm_vcpu_unmap(vcpu, &vmx->nested.virtual_apic_map, true);
-	kvm_vcpu_unmap(vcpu, &vmx->nested.pi_desc_map, true);
-	vmx->nested.pi_desc = NULL;
-
-	kvm_mmu_free_roots(vcpu, &vcpu->arch.guest_mmu, KVM_MMU_ROOTS_ALL);
-
-	nested_release_evmcs(vcpu);
-
-	free_loaded_vmcs(&vmx->nested.vmcs02);
-}
-
 static void vmx_sync_vmcs_host_state(struct vcpu_vmx *vmx,
 				     struct loaded_vmcs *prev)
 {
@@ -315,6 +271,50 @@ static void vmx_switch_vmcs(struct kvm_vcpu *vcpu, struct loaded_vmcs *vmcs)
 	vmx_register_cache_reset(vcpu);
 }
 
+/*
+ * Free whatever needs to be freed from vmx->nested when L1 goes down, or
+ * just stops using VMX.
+ */
+static void free_nested(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	if (!vmx->nested.vmxon && !vmx->nested.smm.vmxon)
+		return;
+
+	kvm_clear_request(KVM_REQ_GET_VMCS12_PAGES, vcpu);
+
+	vmx->nested.vmxon = false;
+	vmx->nested.smm.vmxon = false;
+	free_vpid(vmx->nested.vpid02);
+	vmx->nested.posted_intr_nv = -1;
+	vmx->nested.current_vmptr = -1ull;
+	if (enable_shadow_vmcs) {
+		vmx_disable_shadow_vmcs(vmx);
+		vmcs_clear(vmx->vmcs01.shadow_vmcs);
+		free_vmcs(vmx->vmcs01.shadow_vmcs);
+		vmx->vmcs01.shadow_vmcs = NULL;
+	}
+	kfree(vmx->nested.cached_vmcs12);
+	vmx->nested.cached_vmcs12 = NULL;
+	kfree(vmx->nested.cached_shadow_vmcs12);
+	vmx->nested.cached_shadow_vmcs12 = NULL;
+	/* Unpin physical memory we referred to in the vmcs02 */
+	if (vmx->nested.apic_access_page) {
+		kvm_release_page_clean(vmx->nested.apic_access_page);
+		vmx->nested.apic_access_page = NULL;
+	}
+	kvm_vcpu_unmap(vcpu, &vmx->nested.virtual_apic_map, true);
+	kvm_vcpu_unmap(vcpu, &vmx->nested.pi_desc_map, true);
+	vmx->nested.pi_desc = NULL;
+
+	kvm_mmu_free_roots(vcpu, &vcpu->arch.guest_mmu, KVM_MMU_ROOTS_ALL);
+
+	nested_release_evmcs(vcpu);
+
+	free_loaded_vmcs(&vmx->nested.vmcs02);
+}
+
 /*
  * Ensure that the current vmcs of the logical processor is the
  * vmcs01 of the vcpu before calling free_nested().
From patchwork Wed Sep 23 18:44:50 2020
From: Sean Christopherson
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    Dan Cross, Peter Shier
Subject: [PATCH v2 5/7] KVM: nVMX: Ensure vmcs01 is the loaded VMCS when freeing nested state
Date: Wed, 23 Sep 2020 11:44:50 -0700
Message-Id: <20200923184452.980-6-sean.j.christopherson@intel.com>
In-Reply-To: <20200923184452.980-1-sean.j.christopherson@intel.com>

Add a WARN in free_nested() to ensure vmcs01 is loaded prior to freeing
vmcs02 and friends, and explicitly switch to vmcs01 if it's not.  KVM is
supposed to keep is_guest_mode() and loaded_vmcs==vmcs02 synchronized,
but bugs happen and freeing vmcs02 while it's in use will escalate a KVM
error to a use-after-free and potentially crash the kernel.

Do the WARN and switch even in the !vmxon case to help detect latent
bugs.  free_nested() is not a hot path, and the check is cheap.
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/nested.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 03dddf1b6009..3e6cc0d7090e 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -279,6 +279,9 @@ static void free_nested(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
+	if (WARN_ON_ONCE(vmx->loaded_vmcs != &vmx->vmcs01))
+		vmx_switch_vmcs(vcpu, &vmx->vmcs01);
+
 	if (!vmx->nested.vmxon && !vmx->nested.smm.vmxon)
 		return;
 

From patchwork Wed Sep 23 18:44:51 2020
From: Sean Christopherson
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    Dan Cross, Peter Shier
Subject: [PATCH v2 6/7] KVM: nVMX: Drop redundant VMCS switch and free_nested() call
Date: Wed, 23 Sep 2020 11:44:51 -0700
Message-Id: <20200923184452.980-7-sean.j.christopherson@intel.com>
In-Reply-To: <20200923184452.980-1-sean.j.christopherson@intel.com>

Remove the explicit switch to vmcs01 and the call to free_nested() in
nested_vmx_free_vcpu().  free_nested(), which is called unconditionally
by vmx_leave_nested(), ensures vmcs01 is loaded prior to freeing vmcs02
and friends.

Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/nested.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 3e6cc0d7090e..63550dcf6b9f 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -326,8 +326,6 @@ void nested_vmx_free_vcpu(struct kvm_vcpu *vcpu)
 {
 	vcpu_load(vcpu);
 	vmx_leave_nested(vcpu);
-	vmx_switch_vmcs(vcpu, &to_vmx(vcpu)->vmcs01);
-	free_nested(vcpu);
 	vcpu_put(vcpu);
 }
 
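For reference, the call chain the message relies on: vmx_leave_nested()
ends in free_nested() unconditionally, and free_nested() now (as of
patch 5) forces vmcs01 to be loaded itself.  Roughly, paraphrased rather
than quoted from nested.c:

void vmx_leave_nested(struct kvm_vcpu *vcpu)
{
	if (is_guest_mode(vcpu)) {
		to_vmx(vcpu)->nested.nested_run_pending = 0;
		nested_vmx_vmexit(vcpu, -1, 0, 0);	/* loads vmcs01 */
	}
	free_nested(vcpu);	/* WARNs and switches if vmcs01 isn't loaded */
}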
From patchwork Wed Sep 23 18:44:52 2020
From: Sean Christopherson
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    Dan Cross, Peter Shier
Subject: [PATCH v2 7/7] KVM: nVMX: WARN on attempt to switch the currently loaded VMCS
Date: Wed, 23 Sep 2020 11:44:52 -0700
Message-Id: <20200923184452.980-8-sean.j.christopherson@intel.com>
In-Reply-To: <20200923184452.980-1-sean.j.christopherson@intel.com>

WARN if KVM attempts to switch to the currently loaded VMCS.  Now that
nested_vmx_free_vcpu() doesn't blindly call vmx_switch_vmcs(), all paths
that lead to vmx_switch_vmcs() are implicitly guarded by guest vs. host
mode, e.g. KVM should never emulate VMX instructions when guest mode is
active, and nested_vmx_vmexit() should never be called when host mode is
active.
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/nested.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 63550dcf6b9f..4bddda078370 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -258,7 +258,7 @@ static void vmx_switch_vmcs(struct kvm_vcpu *vcpu, struct loaded_vmcs *vmcs)
 	struct loaded_vmcs *prev;
 	int cpu;
 
-	if (vmx->loaded_vmcs == vmcs)
+	if (WARN_ON_ONCE(vmx->loaded_vmcs == vmcs))
 		return;
 
 	cpu = get_cpu();
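One detail worth calling out: WARN_ON_ONCE() evaluates to its condition,
so wrapping the existing comparison preserves the early return exactly;
the only behavioral change is a one-time stack trace if the "impossible"
case is ever hit.  The resulting pattern:

	/* Before: silent early return.  After: same return, plus one splat. */
	if (WARN_ON_ONCE(vmx->loaded_vmcs == vmcs))
		return;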