From patchwork Wed Jul 1 08:04:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11635605 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 259ED17D1 for ; Wed, 1 Jul 2020 08:06:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0C71C20775 for ; Wed, 1 Jul 2020 08:06:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728545AbgGAIEZ (ORCPT ); Wed, 1 Jul 2020 04:04:25 -0400 Received: from mga14.intel.com ([192.55.52.115]:2413 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728420AbgGAIEY (ORCPT ); Wed, 1 Jul 2020 04:04:24 -0400 IronPort-SDR: EQuRO9IEuwjQqllaAo0u+uUjjoZfwcgp52vqoPoVMX8oqhbrYe79sNt8mksF++NHyJgVo8EYI5 WmnqunQQTcGw== X-IronPort-AV: E=McAfee;i="6000,8403,9668"; a="145581826" X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="145581826" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jul 2020 01:04:14 -0700 IronPort-SDR: mzze7toC3jy/njNgD0C5r2G7i7KN51Ncg/FUIQJ/VGZLtwgf3uyD8GtUFiz6aPiEgjP8/S09q3 KHo7FeAbZ1zA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="455010225" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga005.jf.intel.com with ESMTP; 01 Jul 2020 01:04:12 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [PATCH v13 01/11] KVM: x86: Include CET definitions for KVM test purpose Date: Wed, 1 Jul 2020 16:04:01 +0800 Message-Id: <20200701080411.5802-2-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200701080411.5802-1-weijiang.yang@intel.com> References: <20200701080411.5802-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org These definitions are added by CET kernel patch and referenced by KVM, if the CET KVM patches are tested without CET kernel patches, this patch should be included. Signed-off-by: Yang Weijiang --- include/linux/kvm_host.h | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 01276e3d01b9..20e0fe70d3f7 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -35,6 +35,38 @@ #include +#ifndef CONFIG_X86_INTEL_CET +#define XFEATURE_CET_USER 11 +#define XFEATURE_CET_KERNEL 12 + +#define XFEATURE_MASK_CET_USER (1 << XFEATURE_CET_USER) +#define XFEATURE_MASK_CET_KERNEL (1 << XFEATURE_CET_KERNEL) + +/* Control-flow Enforcement Technology MSRs */ +#define MSR_IA32_U_CET 0x6a0 /* user mode cet setting */ +#define MSR_IA32_S_CET 0x6a2 /* kernel mode cet setting */ +#define MSR_IA32_PL0_SSP 0x6a4 /* kernel shstk pointer */ +#define MSR_IA32_PL1_SSP 0x6a5 /* ring-1 shstk pointer */ +#define MSR_IA32_PL2_SSP 0x6a6 /* ring-2 shstk pointer */ +#define MSR_IA32_PL3_SSP 0x6a7 /* user shstk pointer */ +#define MSR_IA32_INT_SSP_TAB 0x6a8 /* exception shstk table */ + +#define X86_CR4_CET_BIT 23 /* enable Control-flow Enforcement */ +#define X86_CR4_CET _BITUL(X86_CR4_CET_BIT) + +#define X86_FEATURE_SHSTK (16*32+ 7) /* Shadow Stack */ +#define X86_FEATURE_IBT (18*32+20) /* Indirect Branch Tracking */ + +/* MSR_IA32_U_CET and MSR_IA32_S_CET bits */ +#define MSR_IA32_CET_SHSTK_EN 0x0000000000000001ULL +#define MSR_IA32_CET_WRSS_EN 0x0000000000000002ULL +#define MSR_IA32_CET_ENDBR_EN 0x0000000000000004ULL +#define MSR_IA32_CET_LEG_IW_EN 0x0000000000000008ULL +#define MSR_IA32_CET_NO_TRACK_EN 0x0000000000000010ULL +#define MSR_IA32_CET_WAIT_ENDBR 0x00000000000000800UL +#define MSR_IA32_CET_BITMAP_MASK 0xfffffffffffff000ULL +#endif + #ifndef KVM_MAX_VCPU_ID #define KVM_MAX_VCPU_ID KVM_MAX_VCPUS #endif From patchwork Wed Jul 1 08:04:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11635613 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 40C4914E3 for ; Wed, 1 Jul 2020 08:06:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2F9002073E for ; Wed, 1 Jul 2020 08:06:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728842AbgGAIGZ (ORCPT ); Wed, 1 Jul 2020 04:06:25 -0400 Received: from mga14.intel.com ([192.55.52.115]:2415 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728525AbgGAIEY (ORCPT ); Wed, 1 Jul 2020 04:04:24 -0400 IronPort-SDR: YCsLNhQHuGybz99u3w35Mvdyntp3dU3qmXSyI1OcOKzc8w3cJawZm8ysEhK5cFM9eDUz1xMExc G2IXNSamGu3A== X-IronPort-AV: E=McAfee;i="6000,8403,9668"; a="145581839" X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="145581839" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jul 2020 01:04:16 -0700 IronPort-SDR: Uwe4DyF0X1ZZyS+BQvguySEHK0wMqgqu+e4evwVYX6fgkPYEZ2P9H1sZGikYtaz0524b1+ETtj CiF80+/ncJyw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="455010243" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga005.jf.intel.com with ESMTP; 01 Jul 2020 01:04:14 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [PATCH v13 02/11] KVM: VMX: Introduce CET VMCS fields and flags Date: Wed, 1 Jul 2020 16:04:02 +0800 Message-Id: <20200701080411.5802-3-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200701080411.5802-1-weijiang.yang@intel.com> References: <20200701080411.5802-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org CET(Control-flow Enforcement Technology) is a CPU feature used to prevent Return/Jump-Oriented Programming(ROP/JOP) attacks. It provides the following sub-features to defend against ROP/JOP style control-flow subversion attacks: Shadow Stack (SHSTK): A second stack for program which is used exclusively for control transfer operations. Indirect Branch Tracking (IBT): Code branching protection to defend against jump/call oriented programming. Several new CET MSRs are defined in kernel to support CET: MSR_IA32_{U,S}_CET: Controls the CET settings for user mode and kernel mode respectively. MSR_IA32_PL{0,1,2,3}_SSP: Stores shadow stack pointers for CPL-0,1,2,3 protection respectively. MSR_IA32_INT_SSP_TAB: Stores base address of shadow stack pointer table. Two XSAVES state bits are introduced for CET: IA32_XSS:[bit 11]: Control saving/restoring user mode CET states IA32_XSS:[bit 12]: Control saving/restoring kernel mode CET states. Six VMCS fields are introduced for CET: {HOST,GUEST}_S_CET: Stores CET settings for kernel mode. {HOST,GUEST}_SSP: Stores shadow stack pointer of current task/thread. {HOST,GUEST}_INTR_SSP_TABLE: Stores base address of shadow stack pointer table. If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host CET states are restored from below VMCS fields at VM-Exit: HOST_S_CET HOST_SSP HOST_INTR_SSP_TABLE If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest CET states are loaded from below VMCS fields at VM-Entry: GUEST_S_CET GUEST_SSP GUEST_INTR_SSP_TABLE Co-developed-by: Zhang Yi Z Signed-off-by: Zhang Yi Z Signed-off-by: Yang Weijiang --- arch/x86/include/asm/vmx.h | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index 5e090d1f03f8..f301def9125a 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -94,6 +94,7 @@ #define VM_EXIT_CLEAR_BNDCFGS 0x00800000 #define VM_EXIT_PT_CONCEAL_PIP 0x01000000 #define VM_EXIT_CLEAR_IA32_RTIT_CTL 0x02000000 +#define VM_EXIT_LOAD_CET_STATE 0x10000000 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR 0x00036dff @@ -107,6 +108,7 @@ #define VM_ENTRY_LOAD_BNDCFGS 0x00010000 #define VM_ENTRY_PT_CONCEAL_PIP 0x00020000 #define VM_ENTRY_LOAD_IA32_RTIT_CTL 0x00040000 +#define VM_ENTRY_LOAD_CET_STATE 0x00100000 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x000011ff @@ -328,6 +330,9 @@ enum vmcs_field { GUEST_PENDING_DBG_EXCEPTIONS = 0x00006822, GUEST_SYSENTER_ESP = 0x00006824, GUEST_SYSENTER_EIP = 0x00006826, + GUEST_S_CET = 0x00006828, + GUEST_SSP = 0x0000682a, + GUEST_INTR_SSP_TABLE = 0x0000682c, HOST_CR0 = 0x00006c00, HOST_CR3 = 0x00006c02, HOST_CR4 = 0x00006c04, @@ -340,6 +345,9 @@ enum vmcs_field { HOST_IA32_SYSENTER_EIP = 0x00006c12, HOST_RSP = 0x00006c14, HOST_RIP = 0x00006c16, + HOST_S_CET = 0x00006c18, + HOST_SSP = 0x00006c1a, + HOST_INTR_SSP_TABLE = 0x00006c1c }; /* From patchwork Wed Jul 1 08:04:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11635609 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 55DBD14E3 for ; Wed, 1 Jul 2020 08:06:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4215420775 for ; Wed, 1 Jul 2020 08:06:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728579AbgGAIGP (ORCPT ); Wed, 1 Jul 2020 04:06:15 -0400 Received: from mga14.intel.com ([192.55.52.115]:2413 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728539AbgGAIEZ (ORCPT ); Wed, 1 Jul 2020 04:04:25 -0400 IronPort-SDR: D1/kF3CPEZz69m98BBC7EVJnrP9v8BhbF7FT8wwJe+uYDLUOuuS5Ilc+gu8mYxe/L3+B6FNw7U Cu/oB2vIrMvA== X-IronPort-AV: E=McAfee;i="6000,8403,9668"; a="145581852" X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="145581852" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jul 2020 01:04:19 -0700 IronPort-SDR: T/46yqb6MTgFgR70xah2N11zDNTNCFFc8QPVsWnK6209N82B025OUsyNv0ikGSSmNw642oeLP/ /whbcHrD4/6w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="455010254" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga005.jf.intel.com with ESMTP; 01 Jul 2020 01:04:16 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [PATCH v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration Date: Wed, 1 Jul 2020 16:04:03 +0800 Message-Id: <20200701080411.5802-4-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200701080411.5802-1-weijiang.yang@intel.com> References: <20200701080411.5802-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org CET MSRs pass through guest directly to enhance performance. CET runtime control settings are stored in MSR_IA32_{U,S}_CET, Shadow Stack Pointer(SSP) are stored in MSR_IA32_PL{0,1,2,3}_SSP, SSP table base address is stored in MSR_IA32_INT_SSP_TAB, these MSRs are defined in kernel and re-used here. MSR_IA32_U_CET and MSR_IA32_PL3_SSP are used for user-mode protection,the MSR contents are switched between threads during scheduling, it makes sense to pass through them so that the guest kernel can use xsaves/xrstors to operate them efficiently. Other MSRs are used for non-user mode protection. See SDM for detailed info. The difference between CET VMCS fields and CET MSRs is that,the former are used during VMEnter/VMExit, whereas the latter are used for CET state storage between task/thread scheduling. Co-developed-by: Zhang Yi Z Signed-off-by: Zhang Yi Z Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/vmx.c | 46 ++++++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/x86.c | 3 +++ 2 files changed, 49 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index d52d470e36b1..97e766875a7e 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3020,6 +3020,13 @@ void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long cr3) vmcs_writel(GUEST_CR3, guest_cr3); } +static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_states) +{ + return ((supported_xss & xss_states) && + (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) || + guest_cpuid_has(vcpu, X86_FEATURE_IBT))); +} + int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -7098,6 +7105,42 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu) vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4)); } +static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu) +{ + struct vcpu_vmx *vmx = to_vmx(vcpu); + unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap; + bool incpt; + + incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER); + /* + * U_CET is required for USER CET, and U_CET, PL3_SPP are bound as + * one component and controlled by IA32_XSS[bit 11]. + */ + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW, + incpt); + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW, + incpt); + + incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_KERNEL); + /* + * S_CET is required for KERNEL CET, and PL0_SSP ... PL2_SSP are + * bound as one component and controlled by IA32_XSS[bit 12]. + */ + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW, + incpt); + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW, + incpt); + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL1_SSP, MSR_TYPE_RW, + incpt); + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL2_SSP, MSR_TYPE_RW, + incpt); + + incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK); + /* SSP_TAB is only available for KERNEL SHSTK.*/ + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW, + incpt); +} + static void vmx_cpuid_update(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -7136,6 +7179,9 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu) vmx_set_guest_msr(vmx, msr, enabled ? 0 : TSX_CTRL_RTM_DISABLE); } } + + if (supported_xss & (XFEATURE_MASK_CET_KERNEL | XFEATURE_MASK_CET_USER)) + vmx_update_intercept_for_cet_msr(vcpu); } static __init void vmx_set_cpu_caps(void) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c5835f9cb9ad..6390b62c12ed 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -186,6 +186,9 @@ static struct kvm_shared_msrs __percpu *shared_msrs; | XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \ | XFEATURE_MASK_PKRU) +#define KVM_SUPPORTED_XSS (XFEATURE_MASK_CET_USER | \ + XFEATURE_MASK_CET_KERNEL) + u64 __read_mostly host_efer; EXPORT_SYMBOL_GPL(host_efer); From patchwork Wed Jul 1 08:04:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11635593 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 51812174A for ; Wed, 1 Jul 2020 08:05:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 41F1220775 for ; Wed, 1 Jul 2020 08:05:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728604AbgGAIEc (ORCPT ); Wed, 1 Jul 2020 04:04:32 -0400 Received: from mga14.intel.com ([192.55.52.115]:2418 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728542AbgGAIE1 (ORCPT ); Wed, 1 Jul 2020 04:04:27 -0400 IronPort-SDR: kCyYez8SbF4ojfKV+nMExKTnloaEq/pvC1DRhBlwoJkPIJF71ZfjH7X7Jy2vcZzvoVhl25+vpc r1bxNnsMxe+A== X-IronPort-AV: E=McAfee;i="6000,8403,9668"; a="145581877" X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="145581877" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jul 2020 01:04:21 -0700 IronPort-SDR: IEZK2Ncsyc/n/I85fAQUdmYa0DgrawWG4GM+EFbYPGkBn4qge3sPE7cOL/ddvGyRTPIslE6w2y OK1xV3pFoNVQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="455010267" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga005.jf.intel.com with ESMTP; 01 Jul 2020 01:04:19 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [PATCH v13 04/11] KVM: VMX: Configure CET settings upon guest CR0/4 changing Date: Wed, 1 Jul 2020 16:04:04 +0800 Message-Id: <20200701080411.5802-5-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200701080411.5802-1-weijiang.yang@intel.com> References: <20200701080411.5802-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org CR4.CET is master control bit for CET function. There're mutual constrains between CR0.WP and CR4.CET, so need to check the dependent bit while changing the control registers. The processor does not allow CR4.CET to be set if CR0.WP = 0,similarly, it does not allow CR0.WP to be cleared while CR4.CET = 1. In either case, KVM would inject #GP to guest. CET state load bit is set/cleared along with CR4.CET bit set/clear. Note: SHSTK and IBT features share one control MSR: MSR_IA32_{U,S}_CET, which means it's difficult to hide one feature from another in the case of SHSTK != IBT, after discussed in community, it's agreed to allow guest control two features independently as it won't introduce security hole. Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/capabilities.h | 5 +++++ arch/x86/kvm/vmx/vmx.c | 30 ++++++++++++++++++++++++++++-- arch/x86/kvm/x86.c | 3 +++ 3 files changed, 36 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h index 8903475f751e..52223f7d31d8 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -101,6 +101,11 @@ static inline bool cpu_has_load_perf_global_ctrl(void) (vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL); } +static inline bool cpu_has_load_cet_ctrl(void) +{ + return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_CET_STATE) && + (vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_CET_STATE); +} static inline bool cpu_has_vmx_mpx(void) { return (vmcs_config.vmexit_ctrl & VM_EXIT_CLEAR_BNDCFGS) && diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 97e766875a7e..7137e252ab38 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2440,7 +2440,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf, VM_EXIT_LOAD_IA32_EFER | VM_EXIT_CLEAR_BNDCFGS | VM_EXIT_PT_CONCEAL_PIP | - VM_EXIT_CLEAR_IA32_RTIT_CTL; + VM_EXIT_CLEAR_IA32_RTIT_CTL | + VM_EXIT_LOAD_CET_STATE; if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS, &_vmexit_control) < 0) return -EIO; @@ -2464,7 +2465,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf, VM_ENTRY_LOAD_IA32_EFER | VM_ENTRY_LOAD_BNDCFGS | VM_ENTRY_PT_CONCEAL_PIP | - VM_ENTRY_LOAD_IA32_RTIT_CTL; + VM_ENTRY_LOAD_IA32_RTIT_CTL | + VM_ENTRY_LOAD_CET_STATE; if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS, &_vmentry_control) < 0) return -EIO; @@ -3027,6 +3029,12 @@ static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_states) guest_cpuid_has(vcpu, X86_FEATURE_IBT))); } +static bool is_cet_supported(struct kvm_vcpu *vcpu) +{ + return is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER | + XFEATURE_MASK_CET_KERNEL); +} + int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -3067,6 +3075,10 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) return 1; } + if ((cr4 & X86_CR4_CET) && (!is_cet_supported(vcpu) || + !(kvm_read_cr0(vcpu) & X86_CR0_WP))) + return 1; + if (vmx->nested.vmxon && !nested_cr4_valid(vcpu, cr4)) return 1; @@ -3097,6 +3109,20 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) hw_cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE); } + if (cpu_has_load_cet_ctrl()) { + if ((hw_cr4 & X86_CR4_CET) && is_cet_supported(vcpu)) { + vm_entry_controls_setbit(to_vmx(vcpu), + VM_ENTRY_LOAD_CET_STATE); + vm_exit_controls_setbit(to_vmx(vcpu), + VM_EXIT_LOAD_CET_STATE); + } else { + vm_entry_controls_clearbit(to_vmx(vcpu), + VM_ENTRY_LOAD_CET_STATE); + vm_exit_controls_clearbit(to_vmx(vcpu), + VM_EXIT_LOAD_CET_STATE); + } + } + vmcs_writel(CR4_READ_SHADOW, cr4); vmcs_writel(GUEST_CR4, hw_cr4); return 0; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 6390b62c12ed..b63727318da1 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -803,6 +803,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) if (!(cr0 & X86_CR0_PG) && kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE)) return 1; + if (!(cr0 & X86_CR0_WP) && kvm_read_cr4_bits(vcpu, X86_CR4_CET)) + return 1; + kvm_x86_ops.set_cr0(vcpu, cr0); if ((cr0 ^ old_cr0) & X86_CR0_PG) { From patchwork Wed Jul 1 08:04:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11635597 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3850814E3 for ; Wed, 1 Jul 2020 08:06:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 27F3C2073E for ; Wed, 1 Jul 2020 08:06:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728573AbgGAIE1 (ORCPT ); Wed, 1 Jul 2020 04:04:27 -0400 Received: from mga14.intel.com ([192.55.52.115]:2419 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728258AbgGAIE0 (ORCPT ); Wed, 1 Jul 2020 04:04:26 -0400 IronPort-SDR: yp0F7uIO61AbnUpHIyRxZawPdnfA8s2sU0Zl9mmHd/pd+vtEsCy/Ou/och4ufEhQKA6/Q0n1eL UFhMGFfzRvUg== X-IronPort-AV: E=McAfee;i="6000,8403,9668"; a="145581891" X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="145581891" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jul 2020 01:04:23 -0700 IronPort-SDR: pj8xvA+ZoUi2T0+JcFPPirZCBI887UqkrWc9L95otJRwGJ2sUUne1syM9zY+r6ZUtR07XGkNq5 dGZIBuwAEdyA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="455010276" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga005.jf.intel.com with ESMTP; 01 Jul 2020 01:04:21 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [PATCH v13 05/11] KVM: x86: Refresh CPUID once guest changes XSS bits Date: Wed, 1 Jul 2020 16:04:05 +0800 Message-Id: <20200701080411.5802-6-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200701080411.5802-1-weijiang.yang@intel.com> References: <20200701080411.5802-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org CPUID(0xd, 1) reports the current required storage size of XCR0 | XSS, when guest updates the XSS, it's necessary to update the CPUID leaf, otherwise guest will fetch old state size, and results to some WARN traces during guest running. supported_xss is initialized to host_xss & KVM_SUPPORTED_XSS to indicate current MSR_IA32_XSS bits supported in KVM, but actual XSS bits seen in guest depends on the setting of CPUID(0xd,1).{ECX, EDX} for guest. Co-developed-by: Zhang Yi Z Signed-off-by: Zhang Yi Z Signed-off-by: Yang Weijiang --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/cpuid.c | 23 +++++++++++++++++++---- arch/x86/kvm/x86.c | 12 ++++++++---- 3 files changed, 28 insertions(+), 8 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 42a2d0d3984a..f68c825e94ad 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -649,6 +649,7 @@ struct kvm_vcpu_arch { u64 xcr0; u64 guest_supported_xcr0; + u64 guest_supported_xss; u32 guest_xstate_size; struct kvm_pio_request pio; diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 901cd1fdecd9..984ab2b395b3 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -89,15 +89,30 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu) vcpu->arch.guest_xstate_size = XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET; } else { vcpu->arch.guest_supported_xcr0 = - (best->eax | ((u64)best->edx << 32)) & supported_xcr0; + (((u64)best->edx << 32) | best->eax) & supported_xcr0; vcpu->arch.guest_xstate_size = best->ebx = xstate_required_size(vcpu->arch.xcr0, false); } best = kvm_find_cpuid_entry(vcpu, 0xD, 1); - if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) || - cpuid_entry_has(best, X86_FEATURE_XSAVEC))) - best->ebx = xstate_required_size(vcpu->arch.xcr0, true); + if (best) { + if (cpuid_entry_has(best, X86_FEATURE_XSAVES) || + cpuid_entry_has(best, X86_FEATURE_XSAVEC)) { + u64 xstate = vcpu->arch.xcr0 | vcpu->arch.ia32_xss; + + best->ebx = xstate_required_size(xstate, true); + } + + if (!cpuid_entry_has(best, X86_FEATURE_XSAVES)) { + best->ecx = 0; + best->edx = 0; + } + vcpu->arch.guest_supported_xss = + (((u64)best->edx << 32) | best->ecx) & supported_xss; + + } else { + vcpu->arch.guest_supported_xss = 0; + } /* * The existing code assumes virtual address is 48-bit or 57-bit in the diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index b63727318da1..c866087ed0ef 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2843,9 +2843,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) * IA32_XSS[bit 8]. Guests have to use RDMSR/WRMSR rather than * XSAVES/XRSTORS to save/restore PT MSRs. */ - if (data & ~supported_xss) + if (data & ~vcpu->arch.guest_supported_xss) return 1; - vcpu->arch.ia32_xss = data; + if (vcpu->arch.ia32_xss != data) { + vcpu->arch.ia32_xss = data; + kvm_update_cpuid(vcpu); + } break; case MSR_SMI_COUNT: if (!msr_info->host_initiated) @@ -9678,8 +9681,9 @@ int kvm_arch_hardware_setup(void *opaque) memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops)); - if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES)) - supported_xss = 0; + supported_xss = 0; + if (kvm_cpu_cap_has(X86_FEATURE_XSAVES)) + supported_xss = host_xss & KVM_SUPPORTED_XSS; cr4_reserved_bits = kvm_host_cr4_reserved_bits(&boot_cpu_data); From patchwork Wed Jul 1 08:04:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11635583 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1A9BE92A for ; Wed, 1 Jul 2020 08:04:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 043762078A for ; Wed, 1 Jul 2020 08:04:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728584AbgGAIEa (ORCPT ); Wed, 1 Jul 2020 04:04:30 -0400 Received: from mga14.intel.com ([192.55.52.115]:2420 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728559AbgGAIE0 (ORCPT ); Wed, 1 Jul 2020 04:04:26 -0400 IronPort-SDR: fH2f/k7R8Ta23mI+JVt069IAwHoHHK70hHG2jsA8dpp1zorZT4bc7QnoPei8MExtibQmPohORS R33OzXsrZQyg== X-IronPort-AV: E=McAfee;i="6000,8403,9668"; a="145581902" X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="145581902" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jul 2020 01:04:25 -0700 IronPort-SDR: h6kF0ie/3Tu/eBacfEyB1wdARMI1pHv/z8552Qwwq1s8MR5Z27v1TthsHE0UNVdcbiIyREjGej nwPsUvkPqmww== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="455010296" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga005.jf.intel.com with ESMTP; 01 Jul 2020 01:04:23 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [PATCH v13 06/11] KVM: x86: Load guest fpu state when access MSRs managed by XSAVES Date: Wed, 1 Jul 2020 16:04:06 +0800 Message-Id: <20200701080411.5802-7-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200701080411.5802-1-weijiang.yang@intel.com> References: <20200701080411.5802-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Sean Christopherson A handful of CET MSRs are not context switched through "traditional" methods, e.g. VMCS or manual switching, but rather are passed through to the guest and are saved and restored by XSAVES/XRSTORS, i.e. in the guest's FPU state. Load the guest's FPU state if userspace is accessing MSRs whose values are managed by XSAVES so that the MSR helper, e.g. vmx_{get,set}_msr(), can simply do {RD,WR}MSR to access the guest's value. Note that guest_cpuid_has() is not queried as host userspace is allowed to access MSRs that have not been exposed to the guest, e.g. it might do KVM_SET_MSRS prior to KVM_SET_CPUID2. Signed-off-by: Sean Christopherson Co-developed-by: Yang Weijiang Signed-off-by: Yang Weijiang --- arch/x86/kvm/x86.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c866087ed0ef..50f80dcab3a9 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -109,6 +109,8 @@ static void enter_smm(struct kvm_vcpu *vcpu); static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags); static void store_regs(struct kvm_vcpu *vcpu); static int sync_regs(struct kvm_vcpu *vcpu); +static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu); +static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu); struct kvm_x86_ops kvm_x86_ops __read_mostly; EXPORT_SYMBOL_GPL(kvm_x86_ops); @@ -3267,6 +3269,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) } EXPORT_SYMBOL_GPL(kvm_get_msr_common); +static bool is_xsaves_msr(u32 index) +{ + return index == MSR_IA32_U_CET || + (index >= MSR_IA32_PL0_SSP && index <= MSR_IA32_PL3_SSP); +} + /* * Read or write a bunch of msrs. All parameters are kernel addresses. * @@ -3277,11 +3285,20 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs, int (*do_msr)(struct kvm_vcpu *vcpu, unsigned index, u64 *data)) { + bool fpu_loaded = false; int i; - for (i = 0; i < msrs->nmsrs; ++i) + for (i = 0; i < msrs->nmsrs; ++i) { + if (vcpu && !fpu_loaded && supported_xss && + is_xsaves_msr(entries[i].index)) { + kvm_load_guest_fpu(vcpu); + fpu_loaded = true; + } if (do_msr(vcpu, entries[i].index, &entries[i].data)) break; + } + if (fpu_loaded) + kvm_put_guest_fpu(vcpu); return i; } From patchwork Wed Jul 1 08:04:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11635591 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 69C1F14E3 for ; Wed, 1 Jul 2020 08:05:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5121320775 for ; Wed, 1 Jul 2020 08:05:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728612AbgGAIEe (ORCPT ); Wed, 1 Jul 2020 04:04:34 -0400 Received: from mga14.intel.com ([192.55.52.115]:2426 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728420AbgGAIE2 (ORCPT ); Wed, 1 Jul 2020 04:04:28 -0400 IronPort-SDR: VAGoSOlpaH3GlSDzIl9bRBk52CYIku/MDOk/DMv7AscMIM7NgEs5Rc+/8ep5IEwp+xFMZo6FRa iddW9E/Eld9g== X-IronPort-AV: E=McAfee;i="6000,8403,9668"; a="145581915" X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="145581915" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jul 2020 01:04:27 -0700 IronPort-SDR: VtelblvfoNFveprPZP8/kAP3n/IgO3JPLKC3aeuX6e53J9vOvOqyHRKsaxDjRq87667Y9Cdl8T Tx//YPRQIi9w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="455010309" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga005.jf.intel.com with ESMTP; 01 Jul 2020 01:04:25 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [PATCH v13 07/11] KVM: x86: Add userspace access interface for CET MSRs Date: Wed, 1 Jul 2020 16:04:07 +0800 Message-Id: <20200701080411.5802-8-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200701080411.5802-1-weijiang.yang@intel.com> References: <20200701080411.5802-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org There're two different places storing Guest CET states, states managed with XSAVES/XRSTORS, as restored/saved in previous patch, can be read/write directly from/to the MSRs. For those stored in VMCS fields, they're access via vmcs_read/vmcs_write. To correctly read/write the CET MSRs, it's necessary to check whether the kernel FPU context switch happened and reload guest FPU context if needed. Suggested-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/include/uapi/asm/kvm_para.h | 7 +- arch/x86/kvm/vmx/vmx.c | 148 +++++++++++++++++++++++++++ arch/x86/kvm/x86.c | 4 + 3 files changed, 156 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h index 2a8e0b6b9805..211bba6f7d8a 100644 --- a/arch/x86/include/uapi/asm/kvm_para.h +++ b/arch/x86/include/uapi/asm/kvm_para.h @@ -46,10 +46,11 @@ /* Custom MSRs falls in the range 0x4b564d00-0x4b564dff */ #define MSR_KVM_WALL_CLOCK_NEW 0x4b564d00 #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01 -#define MSR_KVM_ASYNC_PF_EN 0x4b564d02 -#define MSR_KVM_STEAL_TIME 0x4b564d03 -#define MSR_KVM_PV_EOI_EN 0x4b564d04 +#define MSR_KVM_ASYNC_PF_EN 0x4b564d02 +#define MSR_KVM_STEAL_TIME 0x4b564d03 +#define MSR_KVM_PV_EOI_EN 0x4b564d04 #define MSR_KVM_POLL_CONTROL 0x4b564d05 +#define MSR_KVM_GUEST_SSP 0x4b564d06 struct kvm_steal_time { __u64 steal; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 7137e252ab38..7f3a65ee64c5 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1777,6 +1777,94 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr) } } +static void vmx_get_xsave_msr(struct msr_data *msr_info) +{ + local_irq_disable(); + if (test_thread_flag(TIF_NEED_FPU_LOAD)) + switch_fpu_return(); + rdmsrl(msr_info->index, msr_info->data); + local_irq_enable(); +} + +static void vmx_set_xsave_msr(struct msr_data *msr_info) +{ + local_irq_disable(); + if (test_thread_flag(TIF_NEED_FPU_LOAD)) + switch_fpu_return(); + wrmsrl(msr_info->index, msr_info->data); + local_irq_enable(); +} + +#define CET_MSR_RSVD_BITS_1 GENMASK(2, 0) +#define CET_MSR_RSVD_BITS_2 GENMASK(9, 6) + +static bool cet_check_msr_valid(struct kvm_vcpu *vcpu, + struct msr_data *msr, u64 rsvd_bits) +{ + u64 data = msr->data; + u32 index = msr->index; + + if ((index == MSR_IA32_PL0_SSP || index == MSR_IA32_PL1_SSP || + index == MSR_IA32_PL2_SSP || index == MSR_IA32_PL3_SSP || + index == MSR_IA32_INT_SSP_TAB || index == MSR_KVM_GUEST_SSP) && + is_noncanonical_address(data, vcpu)) + return false; + + if ((index == MSR_IA32_S_CET || index == MSR_IA32_U_CET) && + data & MSR_IA32_CET_ENDBR_EN) { + u64 bitmap_base = data >> 12; + + if (is_noncanonical_address(bitmap_base, vcpu)) + return false; + } + + return !(data & rsvd_bits); +} + +static bool cet_check_ssp_msr_accessible(struct kvm_vcpu *vcpu, + struct msr_data *msr) +{ + u32 index = msr->index; + + if (!boot_cpu_has(X86_FEATURE_SHSTK)) + return false; + + if (!msr->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK)) + return false; + + if (index == MSR_KVM_GUEST_SSP) + return msr->host_initiated && + guest_cpuid_has(vcpu, X86_FEATURE_SHSTK); + + if (index == MSR_IA32_INT_SSP_TAB) + return true; + + if (index == MSR_IA32_PL3_SSP) + return supported_xss & XFEATURE_MASK_CET_USER; + + return supported_xss & XFEATURE_MASK_CET_KERNEL; +} + +static bool cet_check_ctl_msr_accessible(struct kvm_vcpu *vcpu, + struct msr_data *msr) +{ + u32 index = msr->index; + + if (!boot_cpu_has(X86_FEATURE_SHSTK) && + !boot_cpu_has(X86_FEATURE_IBT)) + return false; + + if (!msr->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) && + !guest_cpuid_has(vcpu, X86_FEATURE_IBT)) + return false; + + if (index == MSR_IA32_U_CET) + return supported_xss & XFEATURE_MASK_CET_USER; + + return supported_xss & XFEATURE_MASK_CET_KERNEL; +} /* * Reads an msr value (of 'msr_index') into 'pdata'. * Returns 0 on success, non-0 otherwise. @@ -1909,6 +1997,31 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) else msr_info->data = vmx->pt_desc.guest.addr_a[index / 2]; break; + case MSR_KVM_GUEST_SSP: + if (!cet_check_ssp_msr_accessible(vcpu, msr_info)) + return 1; + msr_info->data = vmcs_readl(GUEST_SSP); + break; + case MSR_IA32_S_CET: + if (!cet_check_ctl_msr_accessible(vcpu, msr_info)) + return 1; + msr_info->data = vmcs_readl(GUEST_S_CET); + break; + case MSR_IA32_INT_SSP_TAB: + if (!cet_check_ssp_msr_accessible(vcpu, msr_info)) + return 1; + msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE); + break; + case MSR_IA32_U_CET: + if (!cet_check_ctl_msr_accessible(vcpu, msr_info)) + return 1; + vmx_get_xsave_msr(msr_info); + break; + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + if (!cet_check_ssp_msr_accessible(vcpu, msr_info)) + return 1; + vmx_get_xsave_msr(msr_info); + break; case MSR_TSC_AUX: if (!msr_info->host_initiated && !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP)) @@ -2165,6 +2278,41 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) else vmx->pt_desc.guest.addr_a[index / 2] = data; break; + case MSR_KVM_GUEST_SSP: + if (!cet_check_ssp_msr_accessible(vcpu, msr_info)) + return 1; + if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_1)) + return 1; + vmcs_writel(GUEST_SSP, data); + break; + case MSR_IA32_S_CET: + if (!cet_check_ctl_msr_accessible(vcpu, msr_info)) + return 1; + if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_2)) + return 1; + vmcs_writel(GUEST_S_CET, data); + break; + case MSR_IA32_INT_SSP_TAB: + if (!cet_check_ctl_msr_accessible(vcpu, msr_info)) + return 1; + if (!cet_check_msr_valid(vcpu, msr_info, 0)) + return 1; + vmcs_writel(GUEST_INTR_SSP_TABLE, data); + break; + case MSR_IA32_U_CET: + if (!cet_check_ctl_msr_accessible(vcpu, msr_info)) + return 1; + if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_2)) + return 1; + vmx_set_xsave_msr(msr_info); + break; + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + if (!cet_check_ssp_msr_accessible(vcpu, msr_info)) + return 1; + if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_1)) + return 1; + vmx_set_xsave_msr(msr_info); + break; case MSR_TSC_AUX: if (!msr_info->host_initiated && !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP)) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 50f80dcab3a9..9c16ce65fe74 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1228,6 +1228,10 @@ static const u32 msrs_to_save_all[] = { MSR_ARCH_PERFMON_EVENTSEL0 + 12, MSR_ARCH_PERFMON_EVENTSEL0 + 13, MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15, MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17, + + MSR_IA32_XSS, MSR_IA32_U_CET, MSR_IA32_S_CET, + MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, MSR_IA32_PL2_SSP, + MSR_IA32_PL3_SSP, MSR_IA32_INT_SSP_TAB, MSR_KVM_GUEST_SSP, }; static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)]; From patchwork Wed Jul 1 08:04:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11635595 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A34B6161F for ; Wed, 1 Jul 2020 08:06:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 887702078A for ; Wed, 1 Jul 2020 08:06:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728804AbgGAIFz (ORCPT ); Wed, 1 Jul 2020 04:05:55 -0400 Received: from mga14.intel.com ([192.55.52.115]:2439 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728587AbgGAIEb (ORCPT ); Wed, 1 Jul 2020 04:04:31 -0400 IronPort-SDR: JlLtcbXu53bTWac2fVG4UBoftG3sjYev1OOXcXhUDbQt1+vfGb90y/zDayT5REvanxNvTDRhmN 6ML1J0sz/gyQ== X-IronPort-AV: E=McAfee;i="6000,8403,9668"; a="145581934" X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="145581934" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jul 2020 01:04:30 -0700 IronPort-SDR: O+1tP+AKmFozbc9QTV2l7zvDvVm54Wq8wTf5EFw+9TWmZ++qSUqTTz1rEqVhzD3crm3ABLFmZP ROLnT2UG6slg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="455010329" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga005.jf.intel.com with ESMTP; 01 Jul 2020 01:04:27 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [PATCH v13 08/11] KVM: VMX: Enable CET support for nested VM Date: Wed, 1 Jul 2020 16:04:08 +0800 Message-Id: <20200701080411.5802-9-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200701080411.5802-1-weijiang.yang@intel.com> References: <20200701080411.5802-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org CET MSRs pass through guests for performance consideration. Configure the MSRs to match L0/L1 settings so that nested VM is able to run with CET. Add assertions for vmcs12 offset table initialization, these assertions can detect the mismatch of VMCS field encoding and data type at compiling time. Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/nested.c | 34 +++++ arch/x86/kvm/vmx/vmcs12.c | 275 ++++++++++++++++++++++---------------- arch/x86/kvm/vmx/vmcs12.h | 14 +- arch/x86/kvm/vmx/vmx.c | 10 ++ 4 files changed, 220 insertions(+), 113 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index fd78ffbde644..ce29475226b6 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -555,6 +555,18 @@ static inline void enable_x2apic_msr_intercepts(unsigned long *msr_bitmap) } } +static void nested_vmx_update_intercept_for_msr(struct kvm_vcpu *vcpu, + u32 msr, + unsigned long *msr_bitmap_l1, + unsigned long *msr_bitmap_l0, + int type) +{ + if (!msr_write_intercepted_l01(vcpu, msr)) + nested_vmx_disable_intercept_for_msr(msr_bitmap_l1, + msr_bitmap_l0, + msr, type); +} + /* * Merge L0's and L1's MSR bitmap, return false to indicate that * we do not use the hardware. @@ -626,6 +638,28 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu, nested_vmx_disable_intercept_for_msr(msr_bitmap_l1, msr_bitmap_l0, MSR_KERNEL_GS_BASE, MSR_TYPE_RW); + /* Pass CET MSRs to nested VM if L0 and L1 are set to pass-through. */ + nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_U_CET, + msr_bitmap_l1, msr_bitmap_l0, + MSR_TYPE_RW); + nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, + msr_bitmap_l1, msr_bitmap_l0, + MSR_TYPE_RW); + nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_S_CET, + msr_bitmap_l1, msr_bitmap_l0, + MSR_TYPE_RW); + nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, + msr_bitmap_l1, msr_bitmap_l0, + MSR_TYPE_RW); + nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP, + msr_bitmap_l1, msr_bitmap_l0, + MSR_TYPE_RW); + nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP, + msr_bitmap_l1, msr_bitmap_l0, + MSR_TYPE_RW); + nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB, + msr_bitmap_l1, msr_bitmap_l0, + MSR_TYPE_RW); /* * Checking the L0->L1 bitmap is trying to verify two things: * diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c index 53dfb401316d..f68ac66b1170 100644 --- a/arch/x86/kvm/vmx/vmcs12.c +++ b/arch/x86/kvm/vmx/vmcs12.c @@ -4,31 +4,76 @@ #define ROL16(val, n) ((u16)(((u16)(val) << (n)) | ((u16)(val) >> (16 - (n))))) #define VMCS12_OFFSET(x) offsetof(struct vmcs12, x) -#define FIELD(number, name) [ROL16(number, 6)] = VMCS12_OFFSET(name) -#define FIELD64(number, name) \ - FIELD(number, name), \ - [ROL16(number##_HIGH, 6)] = VMCS12_OFFSET(name) + sizeof(u32) + +#define VMCS_CHECK_SIZE16(number) \ + (BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6001) == 0x2000) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6001) == 0x2001) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0x4000) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0x6000)) + +#define VMCS_CHECK_SIZE32(number) \ + (BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0x6000)) + +#define VMCS_CHECK_SIZE64(number) \ + (BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6001) == 0x2001) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0x4000) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0x6000)) + +#define VMCS_CHECK_SIZE_N(number) \ + (BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6001) == 0x2000) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6001) == 0x2001) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0x4000)) + +#define FIELD16(number, name) \ + [ROL16(number, 6)] = VMCS_CHECK_SIZE16(number) + VMCS12_OFFSET(name) + +#define FIELD32(number, name) \ + [ROL16(number, 6)] = VMCS_CHECK_SIZE32(number) + VMCS12_OFFSET(name) + +#define FIELD64(number, name) \ + FIELD32(number, name), \ + [ROL16(number##_HIGH, 6)] = VMCS_CHECK_SIZE32(number) + \ + VMCS12_OFFSET(name) + sizeof(u32) +#define FIELDN(number, name) \ + [ROL16(number, 6)] = VMCS_CHECK_SIZE_N(number) + VMCS12_OFFSET(name) const unsigned short vmcs_field_to_offset_table[] = { - FIELD(VIRTUAL_PROCESSOR_ID, virtual_processor_id), - FIELD(POSTED_INTR_NV, posted_intr_nv), - FIELD(GUEST_ES_SELECTOR, guest_es_selector), - FIELD(GUEST_CS_SELECTOR, guest_cs_selector), - FIELD(GUEST_SS_SELECTOR, guest_ss_selector), - FIELD(GUEST_DS_SELECTOR, guest_ds_selector), - FIELD(GUEST_FS_SELECTOR, guest_fs_selector), - FIELD(GUEST_GS_SELECTOR, guest_gs_selector), - FIELD(GUEST_LDTR_SELECTOR, guest_ldtr_selector), - FIELD(GUEST_TR_SELECTOR, guest_tr_selector), - FIELD(GUEST_INTR_STATUS, guest_intr_status), - FIELD(GUEST_PML_INDEX, guest_pml_index), - FIELD(HOST_ES_SELECTOR, host_es_selector), - FIELD(HOST_CS_SELECTOR, host_cs_selector), - FIELD(HOST_SS_SELECTOR, host_ss_selector), - FIELD(HOST_DS_SELECTOR, host_ds_selector), - FIELD(HOST_FS_SELECTOR, host_fs_selector), - FIELD(HOST_GS_SELECTOR, host_gs_selector), - FIELD(HOST_TR_SELECTOR, host_tr_selector), + FIELD16(VIRTUAL_PROCESSOR_ID, virtual_processor_id), + FIELD16(POSTED_INTR_NV, posted_intr_nv), + FIELD16(GUEST_ES_SELECTOR, guest_es_selector), + FIELD16(GUEST_CS_SELECTOR, guest_cs_selector), + FIELD16(GUEST_SS_SELECTOR, guest_ss_selector), + FIELD16(GUEST_DS_SELECTOR, guest_ds_selector), + FIELD16(GUEST_FS_SELECTOR, guest_fs_selector), + FIELD16(GUEST_GS_SELECTOR, guest_gs_selector), + FIELD16(GUEST_LDTR_SELECTOR, guest_ldtr_selector), + FIELD16(GUEST_TR_SELECTOR, guest_tr_selector), + FIELD16(GUEST_INTR_STATUS, guest_intr_status), + FIELD16(GUEST_PML_INDEX, guest_pml_index), + FIELD16(HOST_ES_SELECTOR, host_es_selector), + FIELD16(HOST_CS_SELECTOR, host_cs_selector), + FIELD16(HOST_SS_SELECTOR, host_ss_selector), + FIELD16(HOST_DS_SELECTOR, host_ds_selector), + FIELD16(HOST_FS_SELECTOR, host_fs_selector), + FIELD16(HOST_GS_SELECTOR, host_gs_selector), + FIELD16(HOST_TR_SELECTOR, host_tr_selector), FIELD64(IO_BITMAP_A, io_bitmap_a), FIELD64(IO_BITMAP_B, io_bitmap_b), FIELD64(MSR_BITMAP, msr_bitmap), @@ -64,94 +109,100 @@ const unsigned short vmcs_field_to_offset_table[] = { FIELD64(HOST_IA32_PAT, host_ia32_pat), FIELD64(HOST_IA32_EFER, host_ia32_efer), FIELD64(HOST_IA32_PERF_GLOBAL_CTRL, host_ia32_perf_global_ctrl), - FIELD(PIN_BASED_VM_EXEC_CONTROL, pin_based_vm_exec_control), - FIELD(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control), - FIELD(EXCEPTION_BITMAP, exception_bitmap), - FIELD(PAGE_FAULT_ERROR_CODE_MASK, page_fault_error_code_mask), - FIELD(PAGE_FAULT_ERROR_CODE_MATCH, page_fault_error_code_match), - FIELD(CR3_TARGET_COUNT, cr3_target_count), - FIELD(VM_EXIT_CONTROLS, vm_exit_controls), - FIELD(VM_EXIT_MSR_STORE_COUNT, vm_exit_msr_store_count), - FIELD(VM_EXIT_MSR_LOAD_COUNT, vm_exit_msr_load_count), - FIELD(VM_ENTRY_CONTROLS, vm_entry_controls), - FIELD(VM_ENTRY_MSR_LOAD_COUNT, vm_entry_msr_load_count), - FIELD(VM_ENTRY_INTR_INFO_FIELD, vm_entry_intr_info_field), - FIELD(VM_ENTRY_EXCEPTION_ERROR_CODE, vm_entry_exception_error_code), - FIELD(VM_ENTRY_INSTRUCTION_LEN, vm_entry_instruction_len), - FIELD(TPR_THRESHOLD, tpr_threshold), - FIELD(SECONDARY_VM_EXEC_CONTROL, secondary_vm_exec_control), - FIELD(VM_INSTRUCTION_ERROR, vm_instruction_error), - FIELD(VM_EXIT_REASON, vm_exit_reason), - FIELD(VM_EXIT_INTR_INFO, vm_exit_intr_info), - FIELD(VM_EXIT_INTR_ERROR_CODE, vm_exit_intr_error_code), - FIELD(IDT_VECTORING_INFO_FIELD, idt_vectoring_info_field), - FIELD(IDT_VECTORING_ERROR_CODE, idt_vectoring_error_code), - FIELD(VM_EXIT_INSTRUCTION_LEN, vm_exit_instruction_len), - FIELD(VMX_INSTRUCTION_INFO, vmx_instruction_info), - FIELD(GUEST_ES_LIMIT, guest_es_limit), - FIELD(GUEST_CS_LIMIT, guest_cs_limit), - FIELD(GUEST_SS_LIMIT, guest_ss_limit), - FIELD(GUEST_DS_LIMIT, guest_ds_limit), - FIELD(GUEST_FS_LIMIT, guest_fs_limit), - FIELD(GUEST_GS_LIMIT, guest_gs_limit), - FIELD(GUEST_LDTR_LIMIT, guest_ldtr_limit), - FIELD(GUEST_TR_LIMIT, guest_tr_limit), - FIELD(GUEST_GDTR_LIMIT, guest_gdtr_limit), - FIELD(GUEST_IDTR_LIMIT, guest_idtr_limit), - FIELD(GUEST_ES_AR_BYTES, guest_es_ar_bytes), - FIELD(GUEST_CS_AR_BYTES, guest_cs_ar_bytes), - FIELD(GUEST_SS_AR_BYTES, guest_ss_ar_bytes), - FIELD(GUEST_DS_AR_BYTES, guest_ds_ar_bytes), - FIELD(GUEST_FS_AR_BYTES, guest_fs_ar_bytes), - FIELD(GUEST_GS_AR_BYTES, guest_gs_ar_bytes), - FIELD(GUEST_LDTR_AR_BYTES, guest_ldtr_ar_bytes), - FIELD(GUEST_TR_AR_BYTES, guest_tr_ar_bytes), - FIELD(GUEST_INTERRUPTIBILITY_INFO, guest_interruptibility_info), - FIELD(GUEST_ACTIVITY_STATE, guest_activity_state), - FIELD(GUEST_SYSENTER_CS, guest_sysenter_cs), - FIELD(HOST_IA32_SYSENTER_CS, host_ia32_sysenter_cs), - FIELD(VMX_PREEMPTION_TIMER_VALUE, vmx_preemption_timer_value), - FIELD(CR0_GUEST_HOST_MASK, cr0_guest_host_mask), - FIELD(CR4_GUEST_HOST_MASK, cr4_guest_host_mask), - FIELD(CR0_READ_SHADOW, cr0_read_shadow), - FIELD(CR4_READ_SHADOW, cr4_read_shadow), - FIELD(CR3_TARGET_VALUE0, cr3_target_value0), - FIELD(CR3_TARGET_VALUE1, cr3_target_value1), - FIELD(CR3_TARGET_VALUE2, cr3_target_value2), - FIELD(CR3_TARGET_VALUE3, cr3_target_value3), - FIELD(EXIT_QUALIFICATION, exit_qualification), - FIELD(GUEST_LINEAR_ADDRESS, guest_linear_address), - FIELD(GUEST_CR0, guest_cr0), - FIELD(GUEST_CR3, guest_cr3), - FIELD(GUEST_CR4, guest_cr4), - FIELD(GUEST_ES_BASE, guest_es_base), - FIELD(GUEST_CS_BASE, guest_cs_base), - FIELD(GUEST_SS_BASE, guest_ss_base), - FIELD(GUEST_DS_BASE, guest_ds_base), - FIELD(GUEST_FS_BASE, guest_fs_base), - FIELD(GUEST_GS_BASE, guest_gs_base), - FIELD(GUEST_LDTR_BASE, guest_ldtr_base), - FIELD(GUEST_TR_BASE, guest_tr_base), - FIELD(GUEST_GDTR_BASE, guest_gdtr_base), - FIELD(GUEST_IDTR_BASE, guest_idtr_base), - FIELD(GUEST_DR7, guest_dr7), - FIELD(GUEST_RSP, guest_rsp), - FIELD(GUEST_RIP, guest_rip), - FIELD(GUEST_RFLAGS, guest_rflags), - FIELD(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions), - FIELD(GUEST_SYSENTER_ESP, guest_sysenter_esp), - FIELD(GUEST_SYSENTER_EIP, guest_sysenter_eip), - FIELD(HOST_CR0, host_cr0), - FIELD(HOST_CR3, host_cr3), - FIELD(HOST_CR4, host_cr4), - FIELD(HOST_FS_BASE, host_fs_base), - FIELD(HOST_GS_BASE, host_gs_base), - FIELD(HOST_TR_BASE, host_tr_base), - FIELD(HOST_GDTR_BASE, host_gdtr_base), - FIELD(HOST_IDTR_BASE, host_idtr_base), - FIELD(HOST_IA32_SYSENTER_ESP, host_ia32_sysenter_esp), - FIELD(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip), - FIELD(HOST_RSP, host_rsp), - FIELD(HOST_RIP, host_rip), + FIELD32(PIN_BASED_VM_EXEC_CONTROL, pin_based_vm_exec_control), + FIELD32(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control), + FIELD32(EXCEPTION_BITMAP, exception_bitmap), + FIELD32(PAGE_FAULT_ERROR_CODE_MASK, page_fault_error_code_mask), + FIELD32(PAGE_FAULT_ERROR_CODE_MATCH, page_fault_error_code_match), + FIELD32(CR3_TARGET_COUNT, cr3_target_count), + FIELD32(VM_EXIT_CONTROLS, vm_exit_controls), + FIELD32(VM_EXIT_MSR_STORE_COUNT, vm_exit_msr_store_count), + FIELD32(VM_EXIT_MSR_LOAD_COUNT, vm_exit_msr_load_count), + FIELD32(VM_ENTRY_CONTROLS, vm_entry_controls), + FIELD32(VM_ENTRY_MSR_LOAD_COUNT, vm_entry_msr_load_count), + FIELD32(VM_ENTRY_INTR_INFO_FIELD, vm_entry_intr_info_field), + FIELD32(VM_ENTRY_EXCEPTION_ERROR_CODE, vm_entry_exception_error_code), + FIELD32(VM_ENTRY_INSTRUCTION_LEN, vm_entry_instruction_len), + FIELD32(TPR_THRESHOLD, tpr_threshold), + FIELD32(SECONDARY_VM_EXEC_CONTROL, secondary_vm_exec_control), + FIELD32(VM_INSTRUCTION_ERROR, vm_instruction_error), + FIELD32(VM_EXIT_REASON, vm_exit_reason), + FIELD32(VM_EXIT_INTR_INFO, vm_exit_intr_info), + FIELD32(VM_EXIT_INTR_ERROR_CODE, vm_exit_intr_error_code), + FIELD32(IDT_VECTORING_INFO_FIELD, idt_vectoring_info_field), + FIELD32(IDT_VECTORING_ERROR_CODE, idt_vectoring_error_code), + FIELD32(VM_EXIT_INSTRUCTION_LEN, vm_exit_instruction_len), + FIELD32(VMX_INSTRUCTION_INFO, vmx_instruction_info), + FIELD32(GUEST_ES_LIMIT, guest_es_limit), + FIELD32(GUEST_CS_LIMIT, guest_cs_limit), + FIELD32(GUEST_SS_LIMIT, guest_ss_limit), + FIELD32(GUEST_DS_LIMIT, guest_ds_limit), + FIELD32(GUEST_FS_LIMIT, guest_fs_limit), + FIELD32(GUEST_GS_LIMIT, guest_gs_limit), + FIELD32(GUEST_LDTR_LIMIT, guest_ldtr_limit), + FIELD32(GUEST_TR_LIMIT, guest_tr_limit), + FIELD32(GUEST_GDTR_LIMIT, guest_gdtr_limit), + FIELD32(GUEST_IDTR_LIMIT, guest_idtr_limit), + FIELD32(GUEST_ES_AR_BYTES, guest_es_ar_bytes), + FIELD32(GUEST_CS_AR_BYTES, guest_cs_ar_bytes), + FIELD32(GUEST_SS_AR_BYTES, guest_ss_ar_bytes), + FIELD32(GUEST_DS_AR_BYTES, guest_ds_ar_bytes), + FIELD32(GUEST_FS_AR_BYTES, guest_fs_ar_bytes), + FIELD32(GUEST_GS_AR_BYTES, guest_gs_ar_bytes), + FIELD32(GUEST_LDTR_AR_BYTES, guest_ldtr_ar_bytes), + FIELD32(GUEST_TR_AR_BYTES, guest_tr_ar_bytes), + FIELD32(GUEST_INTERRUPTIBILITY_INFO, guest_interruptibility_info), + FIELD32(GUEST_ACTIVITY_STATE, guest_activity_state), + FIELD32(GUEST_SYSENTER_CS, guest_sysenter_cs), + FIELD32(HOST_IA32_SYSENTER_CS, host_ia32_sysenter_cs), + FIELD32(VMX_PREEMPTION_TIMER_VALUE, vmx_preemption_timer_value), + FIELDN(CR0_GUEST_HOST_MASK, cr0_guest_host_mask), + FIELDN(CR4_GUEST_HOST_MASK, cr4_guest_host_mask), + FIELDN(CR0_READ_SHADOW, cr0_read_shadow), + FIELDN(CR4_READ_SHADOW, cr4_read_shadow), + FIELDN(CR3_TARGET_VALUE0, cr3_target_value0), + FIELDN(CR3_TARGET_VALUE1, cr3_target_value1), + FIELDN(CR3_TARGET_VALUE2, cr3_target_value2), + FIELDN(CR3_TARGET_VALUE3, cr3_target_value3), + FIELDN(EXIT_QUALIFICATION, exit_qualification), + FIELDN(GUEST_LINEAR_ADDRESS, guest_linear_address), + FIELDN(GUEST_CR0, guest_cr0), + FIELDN(GUEST_CR3, guest_cr3), + FIELDN(GUEST_CR4, guest_cr4), + FIELDN(GUEST_ES_BASE, guest_es_base), + FIELDN(GUEST_CS_BASE, guest_cs_base), + FIELDN(GUEST_SS_BASE, guest_ss_base), + FIELDN(GUEST_DS_BASE, guest_ds_base), + FIELDN(GUEST_FS_BASE, guest_fs_base), + FIELDN(GUEST_GS_BASE, guest_gs_base), + FIELDN(GUEST_LDTR_BASE, guest_ldtr_base), + FIELDN(GUEST_TR_BASE, guest_tr_base), + FIELDN(GUEST_GDTR_BASE, guest_gdtr_base), + FIELDN(GUEST_IDTR_BASE, guest_idtr_base), + FIELDN(GUEST_DR7, guest_dr7), + FIELDN(GUEST_RSP, guest_rsp), + FIELDN(GUEST_RIP, guest_rip), + FIELDN(GUEST_RFLAGS, guest_rflags), + FIELDN(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions), + FIELDN(GUEST_SYSENTER_ESP, guest_sysenter_esp), + FIELDN(GUEST_SYSENTER_EIP, guest_sysenter_eip), + FIELDN(GUEST_S_CET, guest_s_cet), + FIELDN(GUEST_SSP, guest_ssp), + FIELDN(GUEST_INTR_SSP_TABLE, guest_ssp_tbl), + FIELDN(HOST_CR0, host_cr0), + FIELDN(HOST_CR3, host_cr3), + FIELDN(HOST_CR4, host_cr4), + FIELDN(HOST_FS_BASE, host_fs_base), + FIELDN(HOST_GS_BASE, host_gs_base), + FIELDN(HOST_TR_BASE, host_tr_base), + FIELDN(HOST_GDTR_BASE, host_gdtr_base), + FIELDN(HOST_IDTR_BASE, host_idtr_base), + FIELDN(HOST_IA32_SYSENTER_ESP, host_ia32_sysenter_esp), + FIELDN(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip), + FIELDN(HOST_RSP, host_rsp), + FIELDN(HOST_RIP, host_rip), + FIELDN(HOST_S_CET, host_s_cet), + FIELDN(HOST_SSP, host_ssp), + FIELDN(HOST_INTR_SSP_TABLE, host_ssp_tbl), }; const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs_field_to_offset_table); diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h index d0c6df373f67..62b7be68f05c 100644 --- a/arch/x86/kvm/vmx/vmcs12.h +++ b/arch/x86/kvm/vmx/vmcs12.h @@ -118,7 +118,13 @@ struct __packed vmcs12 { natural_width host_ia32_sysenter_eip; natural_width host_rsp; natural_width host_rip; - natural_width paddingl[8]; /* room for future expansion */ + natural_width host_s_cet; + natural_width host_ssp; + natural_width host_ssp_tbl; + natural_width guest_s_cet; + natural_width guest_ssp; + natural_width guest_ssp_tbl; + natural_width paddingl[2]; /* room for future expansion */ u32 pin_based_vm_exec_control; u32 cpu_based_vm_exec_control; u32 exception_bitmap; @@ -301,6 +307,12 @@ static inline void vmx_check_vmcs12_offsets(void) CHECK_OFFSET(host_ia32_sysenter_eip, 656); CHECK_OFFSET(host_rsp, 664); CHECK_OFFSET(host_rip, 672); + CHECK_OFFSET(host_s_cet, 680); + CHECK_OFFSET(host_ssp, 688); + CHECK_OFFSET(host_ssp_tbl, 696); + CHECK_OFFSET(guest_s_cet, 704); + CHECK_OFFSET(guest_ssp, 712); + CHECK_OFFSET(guest_ssp_tbl, 720); CHECK_OFFSET(pin_based_vm_exec_control, 744); CHECK_OFFSET(cpu_based_vm_exec_control, 748); CHECK_OFFSET(exception_bitmap, 752); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 7f3a65ee64c5..32893573b630 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7189,6 +7189,7 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu) cr4_fixed1_update(X86_CR4_PKE, ecx, feature_bit(PKU)); cr4_fixed1_update(X86_CR4_UMIP, ecx, feature_bit(UMIP)); cr4_fixed1_update(X86_CR4_LA57, ecx, feature_bit(LA57)); + cr4_fixed1_update(X86_CR4_CET, ecx, feature_bit(SHSTK)); #undef cr4_fixed1_update } @@ -7208,6 +7209,15 @@ static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu) vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_CLEAR_BNDCFGS; } } + + if (is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER | + XFEATURE_MASK_CET_KERNEL)) { + vmx->nested.msrs.entry_ctls_high |= VM_ENTRY_LOAD_CET_STATE; + vmx->nested.msrs.exit_ctls_high |= VM_EXIT_LOAD_CET_STATE; + } else { + vmx->nested.msrs.entry_ctls_high &= ~VM_ENTRY_LOAD_CET_STATE; + vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_LOAD_CET_STATE; + } } static void update_intel_pt_cfg(struct kvm_vcpu *vcpu) From patchwork Wed Jul 1 08:04:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11635585 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3F8FE913 for ; Wed, 1 Jul 2020 08:04:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 23F922078B for ; Wed, 1 Jul 2020 08:04:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728626AbgGAIEf (ORCPT ); Wed, 1 Jul 2020 04:04:35 -0400 Received: from mga14.intel.com ([192.55.52.115]:2448 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728605AbgGAIEd (ORCPT ); Wed, 1 Jul 2020 04:04:33 -0400 IronPort-SDR: eC2CcBf7HtzLKGjjrDvrp/XJaStblirT3DHke7OGqNAxIAWHjvpmLjLJWpwoVEL+ypRTjqH4O8 LRdg64tmC5qQ== X-IronPort-AV: E=McAfee;i="6000,8403,9668"; a="145581952" X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="145581952" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jul 2020 01:04:32 -0700 IronPort-SDR: XPNqiTnSJykft4Um8Mi/OIjM3UP+K5u3J3gRCoXuzGJFGq/+KcQ+dkFGPbTNmT/i+y/p+bgOsV KolUFuXLtCEA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="455010340" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga005.jf.intel.com with ESMTP; 01 Jul 2020 01:04:30 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [PATCH v13 09/11] KVM: VMX: Add VMCS dump and sanity check for CET states Date: Wed, 1 Jul 2020 16:04:09 +0800 Message-Id: <20200701080411.5802-10-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200701080411.5802-1-weijiang.yang@intel.com> References: <20200701080411.5802-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Dump CET VMCS states for debug purpose. Since CET kernel protection is not enabled, if related MSRs in host are filled by mistake, warn once on detecting it. Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/vmx.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 32893573b630..70cb2d4a1391 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -5941,6 +5941,12 @@ void dump_vmcs(void) pr_err("InterruptStatus = %04x\n", vmcs_read16(GUEST_INTR_STATUS)); + if (vmentry_ctl & VM_ENTRY_LOAD_CET_STATE) { + pr_err("S_CET = 0x%016lx\n", vmcs_readl(GUEST_S_CET)); + pr_err("SSP = 0x%016lx\n", vmcs_readl(GUEST_SSP)); + pr_err("SSP TABLE = 0x%016lx\n", + vmcs_readl(GUEST_INTR_SSP_TABLE)); + } pr_err("*** Host State ***\n"); pr_err("RIP = 0x%016lx RSP = 0x%016lx\n", vmcs_readl(HOST_RIP), vmcs_readl(HOST_RSP)); @@ -6023,6 +6029,12 @@ void dump_vmcs(void) if (secondary_exec_control & SECONDARY_EXEC_ENABLE_VPID) pr_err("Virtual processor ID = 0x%04x\n", vmcs_read16(VIRTUAL_PROCESSOR_ID)); + if (vmexit_ctl & VM_EXIT_LOAD_CET_STATE) { + pr_err("S_CET = 0x%016lx\n", vmcs_readl(HOST_S_CET)); + pr_err("SSP = 0x%016lx\n", vmcs_readl(HOST_SSP)); + pr_err("SSP TABLE = 0x%016lx\n", + vmcs_readl(HOST_INTR_SSP_TABLE)); + } } /* @@ -8075,6 +8087,7 @@ static __init int hardware_setup(void) unsigned long host_bndcfgs; struct desc_ptr dt; int r, i, ept_lpage_level; + u64 cet_msr; store_idt(&dt); host_idt_base = dt.address; @@ -8236,6 +8249,16 @@ static __init int hardware_setup(void) return r; } + if (boot_cpu_has(X86_FEATURE_IBT) || boot_cpu_has(X86_FEATURE_SHSTK)) { + rdmsrl(MSR_IA32_S_CET, cet_msr); + WARN_ONCE(cet_msr, "KVM: CET S_CET in host will be lost!\n"); + } + + if (boot_cpu_has(X86_FEATURE_SHSTK)) { + rdmsrl(MSR_IA32_PL0_SSP, cet_msr); + WARN_ONCE(cet_msr, "KVM: CET PL0_SSP in host will be lost!\n"); + } + vmx_set_cpu_caps(); r = alloc_kvm_area(); From patchwork Wed Jul 1 08:04:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11635589 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CA1F717D1 for ; Wed, 1 Jul 2020 08:05:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BC6462078B for ; Wed, 1 Jul 2020 08:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728792AbgGAIFk (ORCPT ); Wed, 1 Jul 2020 04:05:40 -0400 Received: from mga14.intel.com ([192.55.52.115]:2452 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728624AbgGAIEf (ORCPT ); Wed, 1 Jul 2020 04:04:35 -0400 IronPort-SDR: 0MGT5X/ti1ZumtLNeXNg1u+1aOCtRAxAML1JZ2zf3Jp3kpVEi6VVFdzTDWEBONeHtetCEaxUNv zkw7tsP+wx9Q== X-IronPort-AV: E=McAfee;i="6000,8403,9668"; a="145581967" X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="145581967" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jul 2020 01:04:34 -0700 IronPort-SDR: NzHo0OsUtr3zWdFDSv5PoarSpUL/OePkL+T+sYHxbw1DanwxECqECpEudzIZtggQGdOAZGj5mO egF4Bk6/Cy9w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="455010360" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga005.jf.intel.com with ESMTP; 01 Jul 2020 01:04:32 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [PATCH v13 10/11] KVM: x86: Add #CP support in guest exception dispatch Date: Wed, 1 Jul 2020 16:04:10 +0800 Message-Id: <20200701080411.5802-11-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200701080411.5802-1-weijiang.yang@intel.com> References: <20200701080411.5802-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org CPU defined #CP(21) to handle CET induced exception, it's accompanied with several error codes corresponding to different CET violation cases, see SDM for detailed description. The exception is classified as a contibutory exception w.r.t #DF. Signed-off-by: Yang Weijiang --- arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/x86.c | 1 + arch/x86/kvm/x86.h | 2 +- 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 3f3f780c8c65..78e5c4266270 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -31,6 +31,7 @@ #define MC_VECTOR 18 #define XM_VECTOR 19 #define VE_VECTOR 20 +#define CP_VECTOR 21 /* Select x86 specific features in */ #define __KVM_HAVE_PIT diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9c16ce65fe74..94ca5b56d233 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -407,6 +407,7 @@ static int exception_class(int vector) case NP_VECTOR: case SS_VECTOR: case GP_VECTOR: + case CP_VECTOR: return EXCPT_CONTRIBUTORY; default: break; diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index b968acc0516f..7374e77c91d8 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -115,7 +115,7 @@ static inline bool x86_exception_has_error_code(unsigned int vector) { static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) | BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) | - BIT(PF_VECTOR) | BIT(AC_VECTOR); + BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR); return (1U << vector) & exception_has_error_code; } From patchwork Wed Jul 1 08:04:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11635587 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F3ED792A for ; Wed, 1 Jul 2020 08:05:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E47442073E for ; Wed, 1 Jul 2020 08:05:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728683AbgGAIEy (ORCPT ); Wed, 1 Jul 2020 04:04:54 -0400 Received: from mga14.intel.com ([192.55.52.115]:2458 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728643AbgGAIEh (ORCPT ); Wed, 1 Jul 2020 04:04:37 -0400 IronPort-SDR: zx6TwUhaiKdaLIxfxEZe1mU4TzDmrfkmsq7sRakjMYMna8uw9Lb7alEw/3GVMVOzkO73bGIOsp ue5E3EefwiRw== X-IronPort-AV: E=McAfee;i="6000,8403,9668"; a="145581978" X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="145581978" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Jul 2020 01:04:36 -0700 IronPort-SDR: Q9FhKmkDmusvJ3qohagPB+NSn5Zy1xo2na6n0+6+Mac5BoRqWSbk7wWMT4+8MHngdC9AofVBA6 2ME9duFZtpXg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,299,1589266800"; d="scan'208";a="455010371" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga005.jf.intel.com with ESMTP; 01 Jul 2020 01:04:34 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [PATCH v13 11/11] KVM: x86: Enable CET virtualization and advertise CET to userspace Date: Wed, 1 Jul 2020 16:04:11 +0800 Message-Id: <20200701080411.5802-12-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200701080411.5802-1-weijiang.yang@intel.com> References: <20200701080411.5802-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Set the feature bits so that CET capabilities can be seen in guest via CPUID enumeration. Add CR4.CET bit support in order to allow guest set CET master control bit(CR4.CET). Disable KVM CET feature once unrestricted_guest is turned off because KVM cannot emulate guest CET behavior well in this case. Signed-off-by: Yang Weijiang --- arch/x86/include/asm/kvm_host.h | 3 ++- arch/x86/kvm/cpuid.c | 5 +++-- arch/x86/kvm/vmx/vmx.c | 5 +++++ arch/x86/kvm/x86.c | 5 +++++ 4 files changed, 15 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index f68c825e94ad..21f3c89d8c70 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -95,7 +95,8 @@ | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \ | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \ | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \ - | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP)) + | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \ + | X86_CR4_CET)) #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 984ab2b395b3..333a9e0d7cdf 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -344,7 +344,8 @@ void kvm_set_cpu_caps(void) F(AVX512VBMI) | F(LA57) | 0 /*PKU*/ | 0 /*OSPKE*/ | F(RDPID) | F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) | F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) | - F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ + F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ | + F(SHSTK) ); /* Set LA57 based on hardware capability. */ if (cpuid_ecx(7) & F(LA57)) @@ -353,7 +354,7 @@ void kvm_set_cpu_caps(void) kvm_cpu_cap_mask(CPUID_7_EDX, F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) | F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) | - F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) + F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) | F(IBT) ); /* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */ diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 70cb2d4a1391..7dac5747adc8 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7411,6 +7411,11 @@ static __init void vmx_set_cpu_caps(void) /* CPUID 0x80000001 */ if (!cpu_has_vmx_rdtscp()) kvm_cpu_cap_clear(X86_FEATURE_RDTSCP); + + if (!enable_unrestricted_guest) { + kvm_cpu_cap_clear(X86_FEATURE_SHSTK); + kvm_cpu_cap_clear(X86_FEATURE_IBT); + } } static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 94ca5b56d233..a4cf5f3211f3 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9707,6 +9707,11 @@ int kvm_arch_hardware_setup(void *opaque) if (kvm_cpu_cap_has(X86_FEATURE_XSAVES)) supported_xss = host_xss & KVM_SUPPORTED_XSS; + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) && + !kvm_cpu_cap_has(X86_FEATURE_IBT)) + supported_xss &= ~(XFEATURE_MASK_CET_USER | + XFEATURE_MASK_CET_KERNEL); + cr4_reserved_bits = kvm_host_cr4_reserved_bits(&boot_cpu_data); if (kvm_has_tsc_control) {