From patchwork Thu Jul 16 03:16:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11666567 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6B42F722 for ; Thu, 16 Jul 2020 03:17:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4D6E120775 for ; Thu, 16 Jul 2020 03:17:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728199AbgGPDQ7 (ORCPT ); Wed, 15 Jul 2020 23:16:59 -0400 Received: from mga06.intel.com ([134.134.136.31]:8147 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726770AbgGPDQ5 (ORCPT ); Wed, 15 Jul 2020 23:16:57 -0400 IronPort-SDR: sgQybs7i+MF/9gGXs6AlB2uLUlkxnLrZq1je4JpvisQxg2b1bIiC6NYCrDsV/Df+fezVtcu7dS owPl2ZVdnqUg== X-IronPort-AV: E=McAfee;i="6000,8403,9683"; a="210844831" X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="210844831" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Jul 2020 20:16:57 -0700 IronPort-SDR: bTOUrYvWWf/ceKqf+FwF8rnI090LMkr1t4DERq+nqOYbHvnvacPhaBW+Uyy5H01tvck1+gY4QK 6+W9weU20zsg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="360910412" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga001.jf.intel.com with ESMTP; 15 Jul 2020 20:16:55 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [RESEND v13 01/11] KVM: x86: Include CET definitions for KVM test purpose Date: Thu, 16 Jul 2020 11:16:17 +0800 Message-Id: <20200716031627.11492-2-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200716031627.11492-1-weijiang.yang@intel.com> References: <20200716031627.11492-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org These definitions are added by CET kernel patch and referenced by KVM, if the CET KVM patches are tested without CET kernel patches, this patch should be included. Signed-off-by: Yang Weijiang --- include/linux/kvm_host.h | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index d564855243d8..aae1775a29ac 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -35,6 +35,38 @@ #include +#ifndef CONFIG_X86_INTEL_CET +#define XFEATURE_CET_USER 11 +#define XFEATURE_CET_KERNEL 12 + +#define XFEATURE_MASK_CET_USER (1 << XFEATURE_CET_USER) +#define XFEATURE_MASK_CET_KERNEL (1 << XFEATURE_CET_KERNEL) + +/* Control-flow Enforcement Technology MSRs */ +#define MSR_IA32_U_CET 0x6a0 /* user mode cet setting */ +#define MSR_IA32_S_CET 0x6a2 /* kernel mode cet setting */ +#define MSR_IA32_PL0_SSP 0x6a4 /* kernel shstk pointer */ +#define MSR_IA32_PL1_SSP 0x6a5 /* ring-1 shstk pointer */ +#define MSR_IA32_PL2_SSP 0x6a6 /* ring-2 shstk pointer */ +#define MSR_IA32_PL3_SSP 0x6a7 /* user shstk pointer */ +#define MSR_IA32_INT_SSP_TAB 0x6a8 /* exception shstk table */ + +#define X86_CR4_CET_BIT 23 /* enable Control-flow Enforcement */ +#define X86_CR4_CET _BITUL(X86_CR4_CET_BIT) + +#define X86_FEATURE_SHSTK (16*32+ 7) /* Shadow Stack */ +#define X86_FEATURE_IBT (18*32+20) /* Indirect Branch Tracking */ + +/* MSR_IA32_U_CET and MSR_IA32_S_CET bits */ +#define MSR_IA32_CET_SHSTK_EN 0x0000000000000001ULL +#define MSR_IA32_CET_WRSS_EN 0x0000000000000002ULL +#define MSR_IA32_CET_ENDBR_EN 0x0000000000000004ULL +#define MSR_IA32_CET_LEG_IW_EN 0x0000000000000008ULL +#define MSR_IA32_CET_NO_TRACK_EN 0x0000000000000010ULL +#define MSR_IA32_CET_WAIT_ENDBR 0x00000000000000800UL +#define MSR_IA32_CET_BITMAP_MASK 0xfffffffffffff000ULL +#endif + #ifndef KVM_MAX_VCPU_ID #define KVM_MAX_VCPU_ID KVM_MAX_VCPUS #endif From patchwork Thu Jul 16 03:16:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11666587 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 337FD722 for ; Thu, 16 Jul 2020 03:18:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2488720791 for ; Thu, 16 Jul 2020 03:18:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728215AbgGPDRB (ORCPT ); Wed, 15 Jul 2020 23:17:01 -0400 Received: from mga06.intel.com ([134.134.136.31]:8148 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728202AbgGPDRA (ORCPT ); Wed, 15 Jul 2020 23:17:00 -0400 IronPort-SDR: WZdZG2LjeAexcXXOUVciHy2vTPfUvi0KyXmviCcXP+jgBem06kSh69i29uLYIvWswaSqYKtv6k VCoQPRHnqhug== X-IronPort-AV: E=McAfee;i="6000,8403,9683"; a="210844839" X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="210844839" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Jul 2020 20:16:59 -0700 IronPort-SDR: a7aNfgQLxQCdNKvyXw9yDgMY2ItXXqXasHSRR7WvgO7UhFkl1qMV8tS970+FmOE9Ym0WORcgeK XvrenmOUg8Ug== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="360910423" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga001.jf.intel.com with ESMTP; 15 Jul 2020 20:16:57 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [RESEND v13 02/11] KVM: VMX: Introduce CET VMCS fields and flags Date: Thu, 16 Jul 2020 11:16:18 +0800 Message-Id: <20200716031627.11492-3-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200716031627.11492-1-weijiang.yang@intel.com> References: <20200716031627.11492-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org CET(Control-flow Enforcement Technology) is a CPU feature used to prevent Return/Jump-Oriented Programming(ROP/JOP) attacks. It provides the following sub-features to defend against ROP/JOP style control-flow subversion attacks: Shadow Stack (SHSTK): A second stack for program which is used exclusively for control transfer operations. Indirect Branch Tracking (IBT): Code branching protection to defend against jump/call oriented programming. Several new CET MSRs are defined in kernel to support CET: MSR_IA32_{U,S}_CET: Controls the CET settings for user mode and kernel mode respectively. MSR_IA32_PL{0,1,2,3}_SSP: Stores shadow stack pointers for CPL-0,1,2,3 protection respectively. MSR_IA32_INT_SSP_TAB: Stores base address of shadow stack pointer table. Two XSAVES state bits are introduced for CET: IA32_XSS:[bit 11]: Control saving/restoring user mode CET states IA32_XSS:[bit 12]: Control saving/restoring kernel mode CET states. Six VMCS fields are introduced for CET: {HOST,GUEST}_S_CET: Stores CET settings for kernel mode. {HOST,GUEST}_SSP: Stores shadow stack pointer of current task/thread. {HOST,GUEST}_INTR_SSP_TABLE: Stores base address of shadow stack pointer table. If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host CET states are restored from below VMCS fields at VM-Exit: HOST_S_CET HOST_SSP HOST_INTR_SSP_TABLE If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest CET states are loaded from below VMCS fields at VM-Entry: GUEST_S_CET GUEST_SSP GUEST_INTR_SSP_TABLE Co-developed-by: Zhang Yi Z Signed-off-by: Zhang Yi Z Signed-off-by: Yang Weijiang --- arch/x86/include/asm/vmx.h | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index cd7de4b401fe..879c57ff2dc5 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -94,6 +94,7 @@ #define VM_EXIT_CLEAR_BNDCFGS 0x00800000 #define VM_EXIT_PT_CONCEAL_PIP 0x01000000 #define VM_EXIT_CLEAR_IA32_RTIT_CTL 0x02000000 +#define VM_EXIT_LOAD_CET_STATE 0x10000000 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR 0x00036dff @@ -107,6 +108,7 @@ #define VM_ENTRY_LOAD_BNDCFGS 0x00010000 #define VM_ENTRY_PT_CONCEAL_PIP 0x00020000 #define VM_ENTRY_LOAD_IA32_RTIT_CTL 0x00040000 +#define VM_ENTRY_LOAD_CET_STATE 0x00100000 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x000011ff @@ -328,6 +330,9 @@ enum vmcs_field { GUEST_PENDING_DBG_EXCEPTIONS = 0x00006822, GUEST_SYSENTER_ESP = 0x00006824, GUEST_SYSENTER_EIP = 0x00006826, + GUEST_S_CET = 0x00006828, + GUEST_SSP = 0x0000682a, + GUEST_INTR_SSP_TABLE = 0x0000682c, HOST_CR0 = 0x00006c00, HOST_CR3 = 0x00006c02, HOST_CR4 = 0x00006c04, @@ -340,6 +345,9 @@ enum vmcs_field { HOST_IA32_SYSENTER_EIP = 0x00006c12, HOST_RSP = 0x00006c14, HOST_RIP = 0x00006c16, + HOST_S_CET = 0x00006c18, + HOST_SSP = 0x00006c1a, + HOST_INTR_SSP_TABLE = 0x00006c1c }; /* From patchwork Thu Jul 16 03:16:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11666585 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 318FE138C for ; Thu, 16 Jul 2020 03:17:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 215AD20775 for ; Thu, 16 Jul 2020 03:17:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728381AbgGPDRy (ORCPT ); Wed, 15 Jul 2020 23:17:54 -0400 Received: from mga06.intel.com ([134.134.136.31]:8148 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728214AbgGPDRD (ORCPT ); Wed, 15 Jul 2020 23:17:03 -0400 IronPort-SDR: 21+yi4nCdZLwI0xWpKZXKrd/3PnERlK3UF7ynDZ3wxXZ7TDOmz6dfSTX7NQ5zBKMEEuL7kqmgX HClmlru12oLg== X-IronPort-AV: E=McAfee;i="6000,8403,9683"; a="210844846" X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="210844846" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Jul 2020 20:17:01 -0700 IronPort-SDR: nps9WVQV5xIg+UyKTTq75xibVbW6wJLd+so25eSlMl5Dk9ZxhFhb1GkAHWInAVezz8HbE65Kkg lAXGE6/1rYSA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="360910429" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga001.jf.intel.com with ESMTP; 15 Jul 2020 20:16:59 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [RESEND v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration Date: Thu, 16 Jul 2020 11:16:19 +0800 Message-Id: <20200716031627.11492-4-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200716031627.11492-1-weijiang.yang@intel.com> References: <20200716031627.11492-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org CET MSRs pass through guest directly to enhance performance. CET runtime control settings are stored in MSR_IA32_{U,S}_CET, Shadow Stack Pointer(SSP) are stored in MSR_IA32_PL{0,1,2,3}_SSP, SSP table base address is stored in MSR_IA32_INT_SSP_TAB, these MSRs are defined in kernel and re-used here. MSR_IA32_U_CET and MSR_IA32_PL3_SSP are used for user-mode protection,the MSR contents are switched between threads during scheduling, it makes sense to pass through them so that the guest kernel can use xsaves/xrstors to operate them efficiently. Other MSRs are used for non-user mode protection. See SDM for detailed info. The difference between CET VMCS fields and CET MSRs is that,the former are used during VMEnter/VMExit, whereas the latter are used for CET state storage between task/thread scheduling. Co-developed-by: Zhang Yi Z Signed-off-by: Zhang Yi Z Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/vmx.c | 46 ++++++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/x86.c | 3 +++ 2 files changed, 49 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 13745f2a5ecd..a9f135c52cbc 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3126,6 +3126,13 @@ void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd) vmcs_writel(GUEST_CR3, guest_cr3); } +static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_states) +{ + return ((supported_xss & xss_states) && + (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) || + guest_cpuid_has(vcpu, X86_FEATURE_IBT))); +} + int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -7230,6 +7237,42 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu) vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4)); } +static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu) +{ + struct vcpu_vmx *vmx = to_vmx(vcpu); + unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap; + bool incpt; + + incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER); + /* + * U_CET is required for USER CET, and U_CET, PL3_SPP are bound as + * one component and controlled by IA32_XSS[bit 11]. + */ + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW, + incpt); + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW, + incpt); + + incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_KERNEL); + /* + * S_CET is required for KERNEL CET, and PL0_SSP ... PL2_SSP are + * bound as one component and controlled by IA32_XSS[bit 12]. + */ + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW, + incpt); + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW, + incpt); + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL1_SSP, MSR_TYPE_RW, + incpt); + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL2_SSP, MSR_TYPE_RW, + incpt); + + incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK); + /* SSP_TAB is only available for KERNEL SHSTK.*/ + vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW, + incpt); +} + static void vmx_cpuid_update(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -7268,6 +7311,9 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu) vmx_set_guest_msr(vmx, msr, enabled ? 0 : TSX_CTRL_RTM_DISABLE); } } + + if (supported_xss & (XFEATURE_MASK_CET_KERNEL | XFEATURE_MASK_CET_USER)) + vmx_update_intercept_for_cet_msr(vcpu); } static __init void vmx_set_cpu_caps(void) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 88c593f83b28..ea8a9dc9fbad 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -184,6 +184,9 @@ static struct kvm_shared_msrs __percpu *shared_msrs; | XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \ | XFEATURE_MASK_PKRU) +#define KVM_SUPPORTED_XSS (XFEATURE_MASK_CET_USER | \ + XFEATURE_MASK_CET_KERNEL) + u64 __read_mostly host_efer; EXPORT_SYMBOL_GPL(host_efer); From patchwork Thu Jul 16 03:16:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11666569 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8F954722 for ; Thu, 16 Jul 2020 03:17:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 81B3920775 for ; Thu, 16 Jul 2020 03:17:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728236AbgGPDRF (ORCPT ); Wed, 15 Jul 2020 23:17:05 -0400 Received: from mga06.intel.com ([134.134.136.31]:8148 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728230AbgGPDRE (ORCPT ); Wed, 15 Jul 2020 23:17:04 -0400 IronPort-SDR: ChyNNk7yQntGanN2rjJTM8gzJbbPNyz9WCCh8pBdtkdHGYCzRK5OSbLB7KQEg9z/Zpto8/qJNG 5huMYpV4Cc+Q== X-IronPort-AV: E=McAfee;i="6000,8403,9683"; a="210844851" X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="210844851" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Jul 2020 20:17:03 -0700 IronPort-SDR: AtgFEO+Y19gOCmjaM1cJXjz5Bw/OabRMHZWeZ3Hlub8IMKfHzBUipcl1NZG6Uk1sQtjVk0DPsl 9QXbm0nqmpfA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="360910445" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga001.jf.intel.com with ESMTP; 15 Jul 2020 20:17:01 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [RESEND v13 04/11] KVM: VMX: Configure CET settings upon guest CR0/4 changing Date: Thu, 16 Jul 2020 11:16:20 +0800 Message-Id: <20200716031627.11492-5-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200716031627.11492-1-weijiang.yang@intel.com> References: <20200716031627.11492-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org CR4.CET is master control bit for CET function. There're mutual constrains between CR0.WP and CR4.CET, so need to check the dependent bit while changing the control registers. The processor does not allow CR4.CET to be set if CR0.WP = 0,similarly, it does not allow CR0.WP to be cleared while CR4.CET = 1. In either case, KVM would inject #GP to guest. CET state load bit is set/cleared along with CR4.CET bit set/clear. Note: SHSTK and IBT features share one control MSR: MSR_IA32_{U,S}_CET, which means it's difficult to hide one feature from another in the case of SHSTK != IBT, after discussed in community, it's agreed to allow guest control two features independently as it won't introduce security hole. Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/capabilities.h | 5 +++++ arch/x86/kvm/vmx/vmx.c | 30 ++++++++++++++++++++++++++++-- arch/x86/kvm/x86.c | 3 +++ 3 files changed, 36 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h index 4bbd8b448d22..dbc87c5997cc 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -103,6 +103,11 @@ static inline bool cpu_has_load_perf_global_ctrl(void) (vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL); } +static inline bool cpu_has_load_cet_ctrl(void) +{ + return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_CET_STATE) && + (vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_CET_STATE); +} static inline bool cpu_has_vmx_mpx(void) { return (vmcs_config.vmexit_ctrl & VM_EXIT_CLEAR_BNDCFGS) && diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index a9f135c52cbc..0089943fbb31 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2510,7 +2510,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf, VM_EXIT_LOAD_IA32_EFER | VM_EXIT_CLEAR_BNDCFGS | VM_EXIT_PT_CONCEAL_PIP | - VM_EXIT_CLEAR_IA32_RTIT_CTL; + VM_EXIT_CLEAR_IA32_RTIT_CTL | + VM_EXIT_LOAD_CET_STATE; if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS, &_vmexit_control) < 0) return -EIO; @@ -2534,7 +2535,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf, VM_ENTRY_LOAD_IA32_EFER | VM_ENTRY_LOAD_BNDCFGS | VM_ENTRY_PT_CONCEAL_PIP | - VM_ENTRY_LOAD_IA32_RTIT_CTL; + VM_ENTRY_LOAD_IA32_RTIT_CTL | + VM_ENTRY_LOAD_CET_STATE; if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS, &_vmentry_control) < 0) return -EIO; @@ -3133,6 +3135,12 @@ static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_states) guest_cpuid_has(vcpu, X86_FEATURE_IBT))); } +static bool is_cet_supported(struct kvm_vcpu *vcpu) +{ + return is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER | + XFEATURE_MASK_CET_KERNEL); +} + int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -3173,6 +3181,10 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) return 1; } + if ((cr4 & X86_CR4_CET) && (!is_cet_supported(vcpu) || + !(kvm_read_cr0(vcpu) & X86_CR0_WP))) + return 1; + if (vmx->nested.vmxon && !nested_cr4_valid(vcpu, cr4)) return 1; @@ -3204,6 +3216,20 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) hw_cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE); } + if (cpu_has_load_cet_ctrl()) { + if ((hw_cr4 & X86_CR4_CET) && is_cet_supported(vcpu)) { + vm_entry_controls_setbit(to_vmx(vcpu), + VM_ENTRY_LOAD_CET_STATE); + vm_exit_controls_setbit(to_vmx(vcpu), + VM_EXIT_LOAD_CET_STATE); + } else { + vm_entry_controls_clearbit(to_vmx(vcpu), + VM_ENTRY_LOAD_CET_STATE); + vm_exit_controls_clearbit(to_vmx(vcpu), + VM_EXIT_LOAD_CET_STATE); + } + } + vmcs_writel(CR4_READ_SHADOW, cr4); vmcs_writel(GUEST_CR4, hw_cr4); return 0; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ea8a9dc9fbad..906e07039d59 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -815,6 +815,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) if (!(cr0 & X86_CR0_PG) && kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE)) return 1; + if (!(cr0 & X86_CR0_WP) && kvm_read_cr4_bits(vcpu, X86_CR4_CET)) + return 1; + kvm_x86_ops.set_cr0(vcpu, cr0); if ((cr0 ^ old_cr0) & X86_CR0_PG) { From patchwork Thu Jul 16 03:16:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11666571 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A59CB722 for ; Thu, 16 Jul 2020 03:17:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 97D5420791 for ; Thu, 16 Jul 2020 03:17:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728252AbgGPDRK (ORCPT ); Wed, 15 Jul 2020 23:17:10 -0400 Received: from mga06.intel.com ([134.134.136.31]:8148 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728238AbgGPDRG (ORCPT ); Wed, 15 Jul 2020 23:17:06 -0400 IronPort-SDR: /W1/v4ee3af8ZAyWtPZhVMDXUfrp/fhA+QO/kOUKNKYcDns8EhuVeBPkrMFrJ+UYiZBNVLDQK3 WPpb/sa95rgw== X-IronPort-AV: E=McAfee;i="6000,8403,9683"; a="210844855" X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="210844855" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Jul 2020 20:17:05 -0700 IronPort-SDR: sBbk95E8CgFg+2aBOH0e5p1KT6FjtbJKGU0/G1lD+eGSi0PsqlpeS7SUykkiauNfRnMhMQubjz 6+wTJfj1M2Tw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="360910452" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga001.jf.intel.com with ESMTP; 15 Jul 2020 20:17:03 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [RESEND v13 05/11] KVM: x86: Refresh CPUID once guest changes XSS bits Date: Thu, 16 Jul 2020 11:16:21 +0800 Message-Id: <20200716031627.11492-6-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200716031627.11492-1-weijiang.yang@intel.com> References: <20200716031627.11492-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org CPUID(0xd, 1) reports the current required storage size of XCR0 | XSS, when guest updates the XSS, it's necessary to update the CPUID leaf, otherwise guest will fetch old state size, and results to some WARN traces during guest running. supported_xss is initialized to host_xss & KVM_SUPPORTED_XSS to indicate current MSR_IA32_XSS bits supported in KVM, but actual XSS bits seen in guest depends on the setting of CPUID(0xd,1).{ECX, EDX} for guest. Co-developed-by: Zhang Yi Z Signed-off-by: Zhang Yi Z Signed-off-by: Yang Weijiang --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/cpuid.c | 23 +++++++++++++++++++---- arch/x86/kvm/x86.c | 12 ++++++++---- 3 files changed, 28 insertions(+), 8 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index be5363b21540..e8c749596ba2 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -654,6 +654,7 @@ struct kvm_vcpu_arch { u64 xcr0; u64 guest_supported_xcr0; + u64 guest_supported_xss; struct kvm_pio_request pio; void *pio_data; diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 8a294f9747aa..d97b2a6e8a8c 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -88,14 +88,29 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu) vcpu->arch.guest_supported_xcr0 = 0; } else { vcpu->arch.guest_supported_xcr0 = - (best->eax | ((u64)best->edx << 32)) & supported_xcr0; + (((u64)best->edx << 32) | best->eax) & supported_xcr0; best->ebx = xstate_required_size(vcpu->arch.xcr0, false); } best = kvm_find_cpuid_entry(vcpu, 0xD, 1); - if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) || - cpuid_entry_has(best, X86_FEATURE_XSAVEC))) - best->ebx = xstate_required_size(vcpu->arch.xcr0, true); + if (best) { + if (cpuid_entry_has(best, X86_FEATURE_XSAVES) || + cpuid_entry_has(best, X86_FEATURE_XSAVEC)) { + u64 xstate = vcpu->arch.xcr0 | vcpu->arch.ia32_xss; + + best->ebx = xstate_required_size(xstate, true); + } + + if (!cpuid_entry_has(best, X86_FEATURE_XSAVES)) { + best->ecx = 0; + best->edx = 0; + } + vcpu->arch.guest_supported_xss = + (((u64)best->edx << 32) | best->ecx) & supported_xss; + + } else { + vcpu->arch.guest_supported_xss = 0; + } /* * The existing code assumes virtual address is 48-bit or 57-bit in the diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 906e07039d59..8aed32ff9c0c 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2912,9 +2912,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) * IA32_XSS[bit 8]. Guests have to use RDMSR/WRMSR rather than * XSAVES/XRSTORS to save/restore PT MSRs. */ - if (data & ~supported_xss) + if (data & ~vcpu->arch.guest_supported_xss) return 1; - vcpu->arch.ia32_xss = data; + if (vcpu->arch.ia32_xss != data) { + vcpu->arch.ia32_xss = data; + kvm_update_cpuid(vcpu); + } break; case MSR_SMI_COUNT: if (!msr_info->host_initiated) @@ -9779,8 +9782,9 @@ int kvm_arch_hardware_setup(void *opaque) memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops)); - if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES)) - supported_xss = 0; + supported_xss = 0; + if (kvm_cpu_cap_has(X86_FEATURE_XSAVES)) + supported_xss = host_xss & KVM_SUPPORTED_XSS; #define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f) cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_); From patchwork Thu Jul 16 03:16:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11666583 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5FBC9138C for ; Thu, 16 Jul 2020 03:17:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5059D207E8 for ; Thu, 16 Jul 2020 03:17:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728354AbgGPDRq (ORCPT ); Wed, 15 Jul 2020 23:17:46 -0400 Received: from mga06.intel.com ([134.134.136.31]:8148 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728248AbgGPDRI (ORCPT ); Wed, 15 Jul 2020 23:17:08 -0400 IronPort-SDR: zMQUJhGJfoPkGBBjXbUOHcNmdY2OH5THH0qS6xqVqTgCIjizHLsDM8DCcEfhFSLjBLvengRX8Y yg8gy1zsDkmg== X-IronPort-AV: E=McAfee;i="6000,8403,9683"; a="210844857" X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="210844857" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Jul 2020 20:17:07 -0700 IronPort-SDR: US4WisnG2Zr98tq6JL3gDlbeQdJrC9ygahEMWmD/J2CGNunAV7Mv0fm5vqY2+QeevkkPUC8fpb 3geOCdJiNY4Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="360910458" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga001.jf.intel.com with ESMTP; 15 Jul 2020 20:17:06 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [RESEND v13 06/11] KVM: x86: Load guest fpu state when access MSRs managed by XSAVES Date: Thu, 16 Jul 2020 11:16:22 +0800 Message-Id: <20200716031627.11492-7-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200716031627.11492-1-weijiang.yang@intel.com> References: <20200716031627.11492-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Sean Christopherson A handful of CET MSRs are not context switched through "traditional" methods, e.g. VMCS or manual switching, but rather are passed through to the guest and are saved and restored by XSAVES/XRSTORS, i.e. in the guest's FPU state. Load the guest's FPU state if userspace is accessing MSRs whose values are managed by XSAVES so that the MSR helper, e.g. vmx_{get,set}_msr(), can simply do {RD,WR}MSR to access the guest's value. Note that guest_cpuid_has() is not queried as host userspace is allowed to access MSRs that have not been exposed to the guest, e.g. it might do KVM_SET_MSRS prior to KVM_SET_CPUID2. Signed-off-by: Sean Christopherson Co-developed-by: Yang Weijiang Signed-off-by: Yang Weijiang --- arch/x86/kvm/x86.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 8aed32ff9c0c..c437ddc22ad6 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -107,6 +107,8 @@ static void enter_smm(struct kvm_vcpu *vcpu); static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags); static void store_regs(struct kvm_vcpu *vcpu); static int sync_regs(struct kvm_vcpu *vcpu); +static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu); +static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu); struct kvm_x86_ops kvm_x86_ops __read_mostly; EXPORT_SYMBOL_GPL(kvm_x86_ops); @@ -3356,6 +3358,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) } EXPORT_SYMBOL_GPL(kvm_get_msr_common); +static bool is_xsaves_msr(u32 index) +{ + return index == MSR_IA32_U_CET || + (index >= MSR_IA32_PL0_SSP && index <= MSR_IA32_PL3_SSP); +} + /* * Read or write a bunch of msrs. All parameters are kernel addresses. * @@ -3366,11 +3374,20 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs, int (*do_msr)(struct kvm_vcpu *vcpu, unsigned index, u64 *data)) { + bool fpu_loaded = false; int i; - for (i = 0; i < msrs->nmsrs; ++i) + for (i = 0; i < msrs->nmsrs; ++i) { + if (vcpu && !fpu_loaded && supported_xss && + is_xsaves_msr(entries[i].index)) { + kvm_load_guest_fpu(vcpu); + fpu_loaded = true; + } if (do_msr(vcpu, entries[i].index, &entries[i].data)) break; + } + if (fpu_loaded) + kvm_put_guest_fpu(vcpu); return i; } From patchwork Thu Jul 16 03:16:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11666577 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 538A8138C for ; Thu, 16 Jul 2020 03:17:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3FFC9207E8 for ; Thu, 16 Jul 2020 03:17:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728335AbgGPDRa (ORCPT ); Wed, 15 Jul 2020 23:17:30 -0400 Received: from mga06.intel.com ([134.134.136.31]:8154 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728254AbgGPDRR (ORCPT ); Wed, 15 Jul 2020 23:17:17 -0400 IronPort-SDR: zBDExVvQI0chJUNDa5qLq5+UblZGjlp3JClFGz60dU31hl3jKPPIjo7a0AmP2reLXf5hRD2r6L 9TTZ+5lEE90w== X-IronPort-AV: E=McAfee;i="6000,8403,9683"; a="210844858" X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="210844858" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Jul 2020 20:17:10 -0700 IronPort-SDR: vIFzIvX3Gj0Ca9twvyDoMRqAZ4G2Jdc9X7ITqDhIpgx73WwfFVE667vboyW9rYktHfGTlGQgUm AgqsEg2dRgaw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="360910472" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga001.jf.intel.com with ESMTP; 15 Jul 2020 20:17:07 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [RESEND v13 07/11] KVM: x86: Add userspace access interface for CET MSRs Date: Thu, 16 Jul 2020 11:16:23 +0800 Message-Id: <20200716031627.11492-8-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200716031627.11492-1-weijiang.yang@intel.com> References: <20200716031627.11492-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org There're two different places storing Guest CET states, states managed with XSAVES/XRSTORS, as restored/saved in previous patch, can be read/write directly from/to the MSRs. For those stored in VMCS fields, they're access via vmcs_read/vmcs_write. To correctly read/write the CET MSRs, it's necessary to check whether the kernel FPU context switch happened and reload guest FPU context if needed. Suggested-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/include/uapi/asm/kvm_para.h | 7 +- arch/x86/kvm/vmx/vmx.c | 148 +++++++++++++++++++++++++++ arch/x86/kvm/x86.c | 4 + 3 files changed, 156 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h index 812e9b4c1114..2d3422dc4c81 100644 --- a/arch/x86/include/uapi/asm/kvm_para.h +++ b/arch/x86/include/uapi/asm/kvm_para.h @@ -47,12 +47,13 @@ /* Custom MSRs falls in the range 0x4b564d00-0x4b564dff */ #define MSR_KVM_WALL_CLOCK_NEW 0x4b564d00 #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01 -#define MSR_KVM_ASYNC_PF_EN 0x4b564d02 -#define MSR_KVM_STEAL_TIME 0x4b564d03 -#define MSR_KVM_PV_EOI_EN 0x4b564d04 +#define MSR_KVM_ASYNC_PF_EN 0x4b564d02 +#define MSR_KVM_STEAL_TIME 0x4b564d03 +#define MSR_KVM_PV_EOI_EN 0x4b564d04 #define MSR_KVM_POLL_CONTROL 0x4b564d05 #define MSR_KVM_ASYNC_PF_INT 0x4b564d06 #define MSR_KVM_ASYNC_PF_ACK 0x4b564d07 +#define MSR_KVM_GUEST_SSP 0x4b564d08 struct kvm_steal_time { __u64 steal; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 0089943fbb31..4ce61427ed49 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1819,6 +1819,94 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr) } } +static void vmx_get_xsave_msr(struct msr_data *msr_info) +{ + local_irq_disable(); + if (test_thread_flag(TIF_NEED_FPU_LOAD)) + switch_fpu_return(); + rdmsrl(msr_info->index, msr_info->data); + local_irq_enable(); +} + +static void vmx_set_xsave_msr(struct msr_data *msr_info) +{ + local_irq_disable(); + if (test_thread_flag(TIF_NEED_FPU_LOAD)) + switch_fpu_return(); + wrmsrl(msr_info->index, msr_info->data); + local_irq_enable(); +} + +#define CET_MSR_RSVD_BITS_1 GENMASK(2, 0) +#define CET_MSR_RSVD_BITS_2 GENMASK(9, 6) + +static bool cet_check_msr_valid(struct kvm_vcpu *vcpu, + struct msr_data *msr, u64 rsvd_bits) +{ + u64 data = msr->data; + u32 index = msr->index; + + if ((index == MSR_IA32_PL0_SSP || index == MSR_IA32_PL1_SSP || + index == MSR_IA32_PL2_SSP || index == MSR_IA32_PL3_SSP || + index == MSR_IA32_INT_SSP_TAB || index == MSR_KVM_GUEST_SSP) && + is_noncanonical_address(data, vcpu)) + return false; + + if ((index == MSR_IA32_S_CET || index == MSR_IA32_U_CET) && + data & MSR_IA32_CET_ENDBR_EN) { + u64 bitmap_base = data >> 12; + + if (is_noncanonical_address(bitmap_base, vcpu)) + return false; + } + + return !(data & rsvd_bits); +} + +static bool cet_check_ssp_msr_accessible(struct kvm_vcpu *vcpu, + struct msr_data *msr) +{ + u32 index = msr->index; + + if (!boot_cpu_has(X86_FEATURE_SHSTK)) + return false; + + if (!msr->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK)) + return false; + + if (index == MSR_KVM_GUEST_SSP) + return msr->host_initiated && + guest_cpuid_has(vcpu, X86_FEATURE_SHSTK); + + if (index == MSR_IA32_INT_SSP_TAB) + return true; + + if (index == MSR_IA32_PL3_SSP) + return supported_xss & XFEATURE_MASK_CET_USER; + + return supported_xss & XFEATURE_MASK_CET_KERNEL; +} + +static bool cet_check_ctl_msr_accessible(struct kvm_vcpu *vcpu, + struct msr_data *msr) +{ + u32 index = msr->index; + + if (!boot_cpu_has(X86_FEATURE_SHSTK) && + !boot_cpu_has(X86_FEATURE_IBT)) + return false; + + if (!msr->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) && + !guest_cpuid_has(vcpu, X86_FEATURE_IBT)) + return false; + + if (index == MSR_IA32_U_CET) + return supported_xss & XFEATURE_MASK_CET_USER; + + return supported_xss & XFEATURE_MASK_CET_KERNEL; +} /* * Reads an msr value (of 'msr_index') into 'pdata'. * Returns 0 on success, non-0 otherwise. @@ -1951,6 +2039,31 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) else msr_info->data = vmx->pt_desc.guest.addr_a[index / 2]; break; + case MSR_KVM_GUEST_SSP: + if (!cet_check_ssp_msr_accessible(vcpu, msr_info)) + return 1; + msr_info->data = vmcs_readl(GUEST_SSP); + break; + case MSR_IA32_S_CET: + if (!cet_check_ctl_msr_accessible(vcpu, msr_info)) + return 1; + msr_info->data = vmcs_readl(GUEST_S_CET); + break; + case MSR_IA32_INT_SSP_TAB: + if (!cet_check_ssp_msr_accessible(vcpu, msr_info)) + return 1; + msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE); + break; + case MSR_IA32_U_CET: + if (!cet_check_ctl_msr_accessible(vcpu, msr_info)) + return 1; + vmx_get_xsave_msr(msr_info); + break; + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + if (!cet_check_ssp_msr_accessible(vcpu, msr_info)) + return 1; + vmx_get_xsave_msr(msr_info); + break; case MSR_TSC_AUX: if (!msr_info->host_initiated && !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP)) @@ -2221,6 +2334,41 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) else vmx->pt_desc.guest.addr_a[index / 2] = data; break; + case MSR_KVM_GUEST_SSP: + if (!cet_check_ssp_msr_accessible(vcpu, msr_info)) + return 1; + if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_1)) + return 1; + vmcs_writel(GUEST_SSP, data); + break; + case MSR_IA32_S_CET: + if (!cet_check_ctl_msr_accessible(vcpu, msr_info)) + return 1; + if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_2)) + return 1; + vmcs_writel(GUEST_S_CET, data); + break; + case MSR_IA32_INT_SSP_TAB: + if (!cet_check_ctl_msr_accessible(vcpu, msr_info)) + return 1; + if (!cet_check_msr_valid(vcpu, msr_info, 0)) + return 1; + vmcs_writel(GUEST_INTR_SSP_TABLE, data); + break; + case MSR_IA32_U_CET: + if (!cet_check_ctl_msr_accessible(vcpu, msr_info)) + return 1; + if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_2)) + return 1; + vmx_set_xsave_msr(msr_info); + break; + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + if (!cet_check_ssp_msr_accessible(vcpu, msr_info)) + return 1; + if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_1)) + return 1; + vmx_set_xsave_msr(msr_info); + break; case MSR_TSC_AUX: if (!msr_info->host_initiated && !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP)) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c437ddc22ad6..c71a9ceac05e 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1234,6 +1234,10 @@ static const u32 msrs_to_save_all[] = { MSR_ARCH_PERFMON_EVENTSEL0 + 12, MSR_ARCH_PERFMON_EVENTSEL0 + 13, MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15, MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17, + + MSR_IA32_XSS, MSR_IA32_U_CET, MSR_IA32_S_CET, + MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, MSR_IA32_PL2_SSP, + MSR_IA32_PL3_SSP, MSR_IA32_INT_SSP_TAB, MSR_KVM_GUEST_SSP, }; static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)]; From patchwork Thu Jul 16 03:16:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11666581 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 35FB5722 for ; Thu, 16 Jul 2020 03:17:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1753D2080D for ; Thu, 16 Jul 2020 03:17:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728328AbgGPDRi (ORCPT ); Wed, 15 Jul 2020 23:17:38 -0400 Received: from mga06.intel.com ([134.134.136.31]:8148 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728263AbgGPDRO (ORCPT ); Wed, 15 Jul 2020 23:17:14 -0400 IronPort-SDR: PWFTnT7xPJABkLjSgcB5tKjCTzWu32XsTv0TQ6YB73ef/3E07g1VSiCFSBg7oE8tt/6JQVLJwl 88k3pSbVtNNQ== X-IronPort-AV: E=McAfee;i="6000,8403,9683"; a="210844862" X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="210844862" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Jul 2020 20:17:12 -0700 IronPort-SDR: kLXt/Zd1rgrlxS4PoAfjYsv8K+b/LnLlP/s/o5QaFQM692BVIap3jwo1qGm04Zj5Avq1ebN7ud I/DaQCKmA75w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="360910476" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga001.jf.intel.com with ESMTP; 15 Jul 2020 20:17:10 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [RESEND v13 08/11] KVM: VMX: Enable CET support for nested VM Date: Thu, 16 Jul 2020 11:16:24 +0800 Message-Id: <20200716031627.11492-9-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200716031627.11492-1-weijiang.yang@intel.com> References: <20200716031627.11492-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org CET MSRs pass through guests for performance consideration. Configure the MSRs to match L0/L1 settings so that nested VM is able to run with CET. Add assertions for vmcs12 offset table initialization, these assertions can detect the mismatch of VMCS field encoding and data type at compiling time. Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/nested.c | 34 +++++ arch/x86/kvm/vmx/vmcs12.c | 267 +++++++++++++++++++++++--------------- arch/x86/kvm/vmx/vmcs12.h | 14 +- arch/x86/kvm/vmx/vmx.c | 10 ++ 4 files changed, 216 insertions(+), 109 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index d4a4cec034d0..ddb1a69ce947 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -550,6 +550,18 @@ static inline void enable_x2apic_msr_intercepts(unsigned long *msr_bitmap) } } +static void nested_vmx_update_intercept_for_msr(struct kvm_vcpu *vcpu, + u32 msr, + unsigned long *msr_bitmap_l1, + unsigned long *msr_bitmap_l0, + int type) +{ + if (!msr_write_intercepted_l01(vcpu, msr)) + nested_vmx_disable_intercept_for_msr(msr_bitmap_l1, + msr_bitmap_l0, + msr, type); +} + /* * Merge L0's and L1's MSR bitmap, return false to indicate that * we do not use the hardware. @@ -621,6 +633,28 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu, nested_vmx_disable_intercept_for_msr(msr_bitmap_l1, msr_bitmap_l0, MSR_KERNEL_GS_BASE, MSR_TYPE_RW); + /* Pass CET MSRs to nested VM if L0 and L1 are set to pass-through. */ + nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_U_CET, + msr_bitmap_l1, msr_bitmap_l0, + MSR_TYPE_RW); + nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, + msr_bitmap_l1, msr_bitmap_l0, + MSR_TYPE_RW); + nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_S_CET, + msr_bitmap_l1, msr_bitmap_l0, + MSR_TYPE_RW); + nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, + msr_bitmap_l1, msr_bitmap_l0, + MSR_TYPE_RW); + nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP, + msr_bitmap_l1, msr_bitmap_l0, + MSR_TYPE_RW); + nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP, + msr_bitmap_l1, msr_bitmap_l0, + MSR_TYPE_RW); + nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB, + msr_bitmap_l1, msr_bitmap_l0, + MSR_TYPE_RW); /* * Checking the L0->L1 bitmap is trying to verify two things: * diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c index c8e51c004f78..147e0d8eeab2 100644 --- a/arch/x86/kvm/vmx/vmcs12.c +++ b/arch/x86/kvm/vmx/vmcs12.c @@ -4,31 +4,76 @@ #define ROL16(val, n) ((u16)(((u16)(val) << (n)) | ((u16)(val) >> (16 - (n))))) #define VMCS12_OFFSET(x) offsetof(struct vmcs12, x) -#define FIELD(number, name) [ROL16(number, 6)] = VMCS12_OFFSET(name) -#define FIELD64(number, name) \ - FIELD(number, name), \ - [ROL16(number##_HIGH, 6)] = VMCS12_OFFSET(name) + sizeof(u32) + +#define VMCS_CHECK_SIZE16(number) \ + (BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6001) == 0x2000) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6001) == 0x2001) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0x4000) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0x6000)) + +#define VMCS_CHECK_SIZE32(number) \ + (BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0x6000)) + +#define VMCS_CHECK_SIZE64(number) \ + (BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6001) == 0x2001) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0x4000) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0x6000)) + +#define VMCS_CHECK_SIZE_N(number) \ + (BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6001) == 0x2000) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6001) == 0x2001) + \ + BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \ + ((number) & 0x6000) == 0x4000)) + +#define FIELD16(number, name) \ + [ROL16(number, 6)] = VMCS_CHECK_SIZE16(number) + VMCS12_OFFSET(name) + +#define FIELD32(number, name) \ + [ROL16(number, 6)] = VMCS_CHECK_SIZE32(number) + VMCS12_OFFSET(name) + +#define FIELD64(number, name) \ + FIELD32(number, name), \ + [ROL16(number##_HIGH, 6)] = VMCS_CHECK_SIZE32(number) + \ + VMCS12_OFFSET(name) + sizeof(u32) +#define FIELDN(number, name) \ + [ROL16(number, 6)] = VMCS_CHECK_SIZE_N(number) + VMCS12_OFFSET(name) const unsigned short vmcs_field_to_offset_table[] = { - FIELD(VIRTUAL_PROCESSOR_ID, virtual_processor_id), - FIELD(POSTED_INTR_NV, posted_intr_nv), - FIELD(GUEST_ES_SELECTOR, guest_es_selector), - FIELD(GUEST_CS_SELECTOR, guest_cs_selector), - FIELD(GUEST_SS_SELECTOR, guest_ss_selector), - FIELD(GUEST_DS_SELECTOR, guest_ds_selector), - FIELD(GUEST_FS_SELECTOR, guest_fs_selector), - FIELD(GUEST_GS_SELECTOR, guest_gs_selector), - FIELD(GUEST_LDTR_SELECTOR, guest_ldtr_selector), - FIELD(GUEST_TR_SELECTOR, guest_tr_selector), - FIELD(GUEST_INTR_STATUS, guest_intr_status), - FIELD(GUEST_PML_INDEX, guest_pml_index), - FIELD(HOST_ES_SELECTOR, host_es_selector), - FIELD(HOST_CS_SELECTOR, host_cs_selector), - FIELD(HOST_SS_SELECTOR, host_ss_selector), - FIELD(HOST_DS_SELECTOR, host_ds_selector), - FIELD(HOST_FS_SELECTOR, host_fs_selector), - FIELD(HOST_GS_SELECTOR, host_gs_selector), - FIELD(HOST_TR_SELECTOR, host_tr_selector), + FIELD16(VIRTUAL_PROCESSOR_ID, virtual_processor_id), + FIELD16(POSTED_INTR_NV, posted_intr_nv), + FIELD16(GUEST_ES_SELECTOR, guest_es_selector), + FIELD16(GUEST_CS_SELECTOR, guest_cs_selector), + FIELD16(GUEST_SS_SELECTOR, guest_ss_selector), + FIELD16(GUEST_DS_SELECTOR, guest_ds_selector), + FIELD16(GUEST_FS_SELECTOR, guest_fs_selector), + FIELD16(GUEST_GS_SELECTOR, guest_gs_selector), + FIELD16(GUEST_LDTR_SELECTOR, guest_ldtr_selector), + FIELD16(GUEST_TR_SELECTOR, guest_tr_selector), + FIELD16(GUEST_INTR_STATUS, guest_intr_status), + FIELD16(GUEST_PML_INDEX, guest_pml_index), + FIELD16(HOST_ES_SELECTOR, host_es_selector), + FIELD16(HOST_CS_SELECTOR, host_cs_selector), + FIELD16(HOST_SS_SELECTOR, host_ss_selector), + FIELD16(HOST_DS_SELECTOR, host_ds_selector), + FIELD16(HOST_FS_SELECTOR, host_fs_selector), + FIELD16(HOST_GS_SELECTOR, host_gs_selector), + FIELD16(HOST_TR_SELECTOR, host_tr_selector), FIELD64(IO_BITMAP_A, io_bitmap_a), FIELD64(IO_BITMAP_B, io_bitmap_b), FIELD64(MSR_BITMAP, msr_bitmap), @@ -64,90 +109,96 @@ const unsigned short vmcs_field_to_offset_table[] = { FIELD64(HOST_IA32_PAT, host_ia32_pat), FIELD64(HOST_IA32_EFER, host_ia32_efer), FIELD64(HOST_IA32_PERF_GLOBAL_CTRL, host_ia32_perf_global_ctrl), - FIELD(PIN_BASED_VM_EXEC_CONTROL, pin_based_vm_exec_control), - FIELD(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control), - FIELD(EXCEPTION_BITMAP, exception_bitmap), - FIELD(PAGE_FAULT_ERROR_CODE_MASK, page_fault_error_code_mask), - FIELD(PAGE_FAULT_ERROR_CODE_MATCH, page_fault_error_code_match), - FIELD(CR3_TARGET_COUNT, cr3_target_count), - FIELD(VM_EXIT_CONTROLS, vm_exit_controls), - FIELD(VM_EXIT_MSR_STORE_COUNT, vm_exit_msr_store_count), - FIELD(VM_EXIT_MSR_LOAD_COUNT, vm_exit_msr_load_count), - FIELD(VM_ENTRY_CONTROLS, vm_entry_controls), - FIELD(VM_ENTRY_MSR_LOAD_COUNT, vm_entry_msr_load_count), - FIELD(VM_ENTRY_INTR_INFO_FIELD, vm_entry_intr_info_field), - FIELD(VM_ENTRY_EXCEPTION_ERROR_CODE, vm_entry_exception_error_code), - FIELD(VM_ENTRY_INSTRUCTION_LEN, vm_entry_instruction_len), - FIELD(TPR_THRESHOLD, tpr_threshold), - FIELD(SECONDARY_VM_EXEC_CONTROL, secondary_vm_exec_control), - FIELD(VM_INSTRUCTION_ERROR, vm_instruction_error), - FIELD(VM_EXIT_REASON, vm_exit_reason), - FIELD(VM_EXIT_INTR_INFO, vm_exit_intr_info), - FIELD(VM_EXIT_INTR_ERROR_CODE, vm_exit_intr_error_code), - FIELD(IDT_VECTORING_INFO_FIELD, idt_vectoring_info_field), - FIELD(IDT_VECTORING_ERROR_CODE, idt_vectoring_error_code), - FIELD(VM_EXIT_INSTRUCTION_LEN, vm_exit_instruction_len), - FIELD(VMX_INSTRUCTION_INFO, vmx_instruction_info), - FIELD(GUEST_ES_LIMIT, guest_es_limit), - FIELD(GUEST_CS_LIMIT, guest_cs_limit), - FIELD(GUEST_SS_LIMIT, guest_ss_limit), - FIELD(GUEST_DS_LIMIT, guest_ds_limit), - FIELD(GUEST_FS_LIMIT, guest_fs_limit), - FIELD(GUEST_GS_LIMIT, guest_gs_limit), - FIELD(GUEST_LDTR_LIMIT, guest_ldtr_limit), - FIELD(GUEST_TR_LIMIT, guest_tr_limit), - FIELD(GUEST_GDTR_LIMIT, guest_gdtr_limit), - FIELD(GUEST_IDTR_LIMIT, guest_idtr_limit), - FIELD(GUEST_ES_AR_BYTES, guest_es_ar_bytes), - FIELD(GUEST_CS_AR_BYTES, guest_cs_ar_bytes), - FIELD(GUEST_SS_AR_BYTES, guest_ss_ar_bytes), - FIELD(GUEST_DS_AR_BYTES, guest_ds_ar_bytes), - FIELD(GUEST_FS_AR_BYTES, guest_fs_ar_bytes), - FIELD(GUEST_GS_AR_BYTES, guest_gs_ar_bytes), - FIELD(GUEST_LDTR_AR_BYTES, guest_ldtr_ar_bytes), - FIELD(GUEST_TR_AR_BYTES, guest_tr_ar_bytes), - FIELD(GUEST_INTERRUPTIBILITY_INFO, guest_interruptibility_info), - FIELD(GUEST_ACTIVITY_STATE, guest_activity_state), - FIELD(GUEST_SYSENTER_CS, guest_sysenter_cs), - FIELD(HOST_IA32_SYSENTER_CS, host_ia32_sysenter_cs), - FIELD(VMX_PREEMPTION_TIMER_VALUE, vmx_preemption_timer_value), - FIELD(CR0_GUEST_HOST_MASK, cr0_guest_host_mask), - FIELD(CR4_GUEST_HOST_MASK, cr4_guest_host_mask), - FIELD(CR0_READ_SHADOW, cr0_read_shadow), - FIELD(CR4_READ_SHADOW, cr4_read_shadow), - FIELD(EXIT_QUALIFICATION, exit_qualification), - FIELD(GUEST_LINEAR_ADDRESS, guest_linear_address), - FIELD(GUEST_CR0, guest_cr0), - FIELD(GUEST_CR3, guest_cr3), - FIELD(GUEST_CR4, guest_cr4), - FIELD(GUEST_ES_BASE, guest_es_base), - FIELD(GUEST_CS_BASE, guest_cs_base), - FIELD(GUEST_SS_BASE, guest_ss_base), - FIELD(GUEST_DS_BASE, guest_ds_base), - FIELD(GUEST_FS_BASE, guest_fs_base), - FIELD(GUEST_GS_BASE, guest_gs_base), - FIELD(GUEST_LDTR_BASE, guest_ldtr_base), - FIELD(GUEST_TR_BASE, guest_tr_base), - FIELD(GUEST_GDTR_BASE, guest_gdtr_base), - FIELD(GUEST_IDTR_BASE, guest_idtr_base), - FIELD(GUEST_DR7, guest_dr7), - FIELD(GUEST_RSP, guest_rsp), - FIELD(GUEST_RIP, guest_rip), - FIELD(GUEST_RFLAGS, guest_rflags), - FIELD(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions), - FIELD(GUEST_SYSENTER_ESP, guest_sysenter_esp), - FIELD(GUEST_SYSENTER_EIP, guest_sysenter_eip), - FIELD(HOST_CR0, host_cr0), - FIELD(HOST_CR3, host_cr3), - FIELD(HOST_CR4, host_cr4), - FIELD(HOST_FS_BASE, host_fs_base), - FIELD(HOST_GS_BASE, host_gs_base), - FIELD(HOST_TR_BASE, host_tr_base), - FIELD(HOST_GDTR_BASE, host_gdtr_base), - FIELD(HOST_IDTR_BASE, host_idtr_base), - FIELD(HOST_IA32_SYSENTER_ESP, host_ia32_sysenter_esp), - FIELD(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip), - FIELD(HOST_RSP, host_rsp), - FIELD(HOST_RIP, host_rip), + FIELD32(PIN_BASED_VM_EXEC_CONTROL, pin_based_vm_exec_control), + FIELD32(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control), + FIELD32(EXCEPTION_BITMAP, exception_bitmap), + FIELD32(PAGE_FAULT_ERROR_CODE_MASK, page_fault_error_code_mask), + FIELD32(PAGE_FAULT_ERROR_CODE_MATCH, page_fault_error_code_match), + FIELD32(CR3_TARGET_COUNT, cr3_target_count), + FIELD32(VM_EXIT_CONTROLS, vm_exit_controls), + FIELD32(VM_EXIT_MSR_STORE_COUNT, vm_exit_msr_store_count), + FIELD32(VM_EXIT_MSR_LOAD_COUNT, vm_exit_msr_load_count), + FIELD32(VM_ENTRY_CONTROLS, vm_entry_controls), + FIELD32(VM_ENTRY_MSR_LOAD_COUNT, vm_entry_msr_load_count), + FIELD32(VM_ENTRY_INTR_INFO_FIELD, vm_entry_intr_info_field), + FIELD32(VM_ENTRY_EXCEPTION_ERROR_CODE, vm_entry_exception_error_code), + FIELD32(VM_ENTRY_INSTRUCTION_LEN, vm_entry_instruction_len), + FIELD32(TPR_THRESHOLD, tpr_threshold), + FIELD32(SECONDARY_VM_EXEC_CONTROL, secondary_vm_exec_control), + FIELD32(VM_INSTRUCTION_ERROR, vm_instruction_error), + FIELD32(VM_EXIT_REASON, vm_exit_reason), + FIELD32(VM_EXIT_INTR_INFO, vm_exit_intr_info), + FIELD32(VM_EXIT_INTR_ERROR_CODE, vm_exit_intr_error_code), + FIELD32(IDT_VECTORING_INFO_FIELD, idt_vectoring_info_field), + FIELD32(IDT_VECTORING_ERROR_CODE, idt_vectoring_error_code), + FIELD32(VM_EXIT_INSTRUCTION_LEN, vm_exit_instruction_len), + FIELD32(VMX_INSTRUCTION_INFO, vmx_instruction_info), + FIELD32(GUEST_ES_LIMIT, guest_es_limit), + FIELD32(GUEST_CS_LIMIT, guest_cs_limit), + FIELD32(GUEST_SS_LIMIT, guest_ss_limit), + FIELD32(GUEST_DS_LIMIT, guest_ds_limit), + FIELD32(GUEST_FS_LIMIT, guest_fs_limit), + FIELD32(GUEST_GS_LIMIT, guest_gs_limit), + FIELD32(GUEST_LDTR_LIMIT, guest_ldtr_limit), + FIELD32(GUEST_TR_LIMIT, guest_tr_limit), + FIELD32(GUEST_GDTR_LIMIT, guest_gdtr_limit), + FIELD32(GUEST_IDTR_LIMIT, guest_idtr_limit), + FIELD32(GUEST_ES_AR_BYTES, guest_es_ar_bytes), + FIELD32(GUEST_CS_AR_BYTES, guest_cs_ar_bytes), + FIELD32(GUEST_SS_AR_BYTES, guest_ss_ar_bytes), + FIELD32(GUEST_DS_AR_BYTES, guest_ds_ar_bytes), + FIELD32(GUEST_FS_AR_BYTES, guest_fs_ar_bytes), + FIELD32(GUEST_GS_AR_BYTES, guest_gs_ar_bytes), + FIELD32(GUEST_LDTR_AR_BYTES, guest_ldtr_ar_bytes), + FIELD32(GUEST_TR_AR_BYTES, guest_tr_ar_bytes), + FIELD32(GUEST_INTERRUPTIBILITY_INFO, guest_interruptibility_info), + FIELD32(GUEST_ACTIVITY_STATE, guest_activity_state), + FIELD32(GUEST_SYSENTER_CS, guest_sysenter_cs), + FIELD32(HOST_IA32_SYSENTER_CS, host_ia32_sysenter_cs), + FIELD32(VMX_PREEMPTION_TIMER_VALUE, vmx_preemption_timer_value), + FIELDN(CR0_GUEST_HOST_MASK, cr0_guest_host_mask), + FIELDN(CR4_GUEST_HOST_MASK, cr4_guest_host_mask), + FIELDN(CR0_READ_SHADOW, cr0_read_shadow), + FIELDN(CR4_READ_SHADOW, cr4_read_shadow), + FIELDN(EXIT_QUALIFICATION, exit_qualification), + FIELDN(GUEST_LINEAR_ADDRESS, guest_linear_address), + FIELDN(GUEST_CR0, guest_cr0), + FIELDN(GUEST_CR3, guest_cr3), + FIELDN(GUEST_CR4, guest_cr4), + FIELDN(GUEST_ES_BASE, guest_es_base), + FIELDN(GUEST_CS_BASE, guest_cs_base), + FIELDN(GUEST_SS_BASE, guest_ss_base), + FIELDN(GUEST_DS_BASE, guest_ds_base), + FIELDN(GUEST_FS_BASE, guest_fs_base), + FIELDN(GUEST_GS_BASE, guest_gs_base), + FIELDN(GUEST_LDTR_BASE, guest_ldtr_base), + FIELDN(GUEST_TR_BASE, guest_tr_base), + FIELDN(GUEST_GDTR_BASE, guest_gdtr_base), + FIELDN(GUEST_IDTR_BASE, guest_idtr_base), + FIELDN(GUEST_DR7, guest_dr7), + FIELDN(GUEST_RSP, guest_rsp), + FIELDN(GUEST_RIP, guest_rip), + FIELDN(GUEST_RFLAGS, guest_rflags), + FIELDN(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions), + FIELDN(GUEST_SYSENTER_ESP, guest_sysenter_esp), + FIELDN(GUEST_SYSENTER_EIP, guest_sysenter_eip), + FIELDN(GUEST_S_CET, guest_s_cet), + FIELDN(GUEST_SSP, guest_ssp), + FIELDN(GUEST_INTR_SSP_TABLE, guest_ssp_tbl), + FIELDN(HOST_CR0, host_cr0), + FIELDN(HOST_CR3, host_cr3), + FIELDN(HOST_CR4, host_cr4), + FIELDN(HOST_FS_BASE, host_fs_base), + FIELDN(HOST_GS_BASE, host_gs_base), + FIELDN(HOST_TR_BASE, host_tr_base), + FIELDN(HOST_GDTR_BASE, host_gdtr_base), + FIELDN(HOST_IDTR_BASE, host_idtr_base), + FIELDN(HOST_IA32_SYSENTER_ESP, host_ia32_sysenter_esp), + FIELDN(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip), + FIELDN(HOST_RSP, host_rsp), + FIELDN(HOST_RIP, host_rip), + FIELDN(HOST_S_CET, host_s_cet), + FIELDN(HOST_SSP, host_ssp), + FIELDN(HOST_INTR_SSP_TABLE, host_ssp_tbl), }; const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs_field_to_offset_table); diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h index 80232daf00ff..016896c9e701 100644 --- a/arch/x86/kvm/vmx/vmcs12.h +++ b/arch/x86/kvm/vmx/vmcs12.h @@ -115,7 +115,13 @@ struct __packed vmcs12 { natural_width host_ia32_sysenter_eip; natural_width host_rsp; natural_width host_rip; - natural_width paddingl[8]; /* room for future expansion */ + natural_width host_s_cet; + natural_width host_ssp; + natural_width host_ssp_tbl; + natural_width guest_s_cet; + natural_width guest_ssp; + natural_width guest_ssp_tbl; + natural_width paddingl[2]; /* room for future expansion */ u32 pin_based_vm_exec_control; u32 cpu_based_vm_exec_control; u32 exception_bitmap; @@ -295,6 +301,12 @@ static inline void vmx_check_vmcs12_offsets(void) CHECK_OFFSET(host_ia32_sysenter_eip, 656); CHECK_OFFSET(host_rsp, 664); CHECK_OFFSET(host_rip, 672); + CHECK_OFFSET(host_s_cet, 680); + CHECK_OFFSET(host_ssp, 688); + CHECK_OFFSET(host_ssp_tbl, 696); + CHECK_OFFSET(guest_s_cet, 704); + CHECK_OFFSET(guest_ssp, 712); + CHECK_OFFSET(guest_ssp_tbl, 720); CHECK_OFFSET(pin_based_vm_exec_control, 744); CHECK_OFFSET(cpu_based_vm_exec_control, 748); CHECK_OFFSET(exception_bitmap, 752); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 4ce61427ed49..d465ff990094 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7321,6 +7321,7 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu) cr4_fixed1_update(X86_CR4_PKE, ecx, feature_bit(PKU)); cr4_fixed1_update(X86_CR4_UMIP, ecx, feature_bit(UMIP)); cr4_fixed1_update(X86_CR4_LA57, ecx, feature_bit(LA57)); + cr4_fixed1_update(X86_CR4_CET, ecx, feature_bit(SHSTK)); #undef cr4_fixed1_update } @@ -7340,6 +7341,15 @@ static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu) vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_CLEAR_BNDCFGS; } } + + if (is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER | + XFEATURE_MASK_CET_KERNEL)) { + vmx->nested.msrs.entry_ctls_high |= VM_ENTRY_LOAD_CET_STATE; + vmx->nested.msrs.exit_ctls_high |= VM_EXIT_LOAD_CET_STATE; + } else { + vmx->nested.msrs.entry_ctls_high &= ~VM_ENTRY_LOAD_CET_STATE; + vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_LOAD_CET_STATE; + } } static void update_intel_pt_cfg(struct kvm_vcpu *vcpu) From patchwork Thu Jul 16 03:16:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11666579 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BB631722 for ; Thu, 16 Jul 2020 03:17:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A07F5207CB for ; Thu, 16 Jul 2020 03:17:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728268AbgGPDRQ (ORCPT ); Wed, 15 Jul 2020 23:17:16 -0400 Received: from mga06.intel.com ([134.134.136.31]:8158 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728271AbgGPDRO (ORCPT ); Wed, 15 Jul 2020 23:17:14 -0400 IronPort-SDR: yvXKX+Eq0ty5Bnj0MsI+h8F8Kp93IFkBOJn51tn/um13Kro6Rw8oD72duSMWP+jZio+vXoQqmp /iwj8RDS+GzQ== X-IronPort-AV: E=McAfee;i="6000,8403,9683"; a="210844864" X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="210844864" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Jul 2020 20:17:14 -0700 IronPort-SDR: SrJQS4EtKr8PXj1FJSSsKS1x7Kelif0kMOiFZn3LoOaUORCMiNNUfCCW78Hk9LJe1YJbrFmJf1 iZokZpt7s+ig== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="360910481" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga001.jf.intel.com with ESMTP; 15 Jul 2020 20:17:12 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [RESEND v13 09/11] KVM: VMX: Add VMCS dump and sanity check for CET states Date: Thu, 16 Jul 2020 11:16:25 +0800 Message-Id: <20200716031627.11492-10-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200716031627.11492-1-weijiang.yang@intel.com> References: <20200716031627.11492-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Dump CET VMCS states for debug purpose. Since CET kernel protection is not enabled, if related MSRs in host are filled by mistake, warn once on detecting it. Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/vmx.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index d465ff990094..5d4250b9dec8 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6056,6 +6056,12 @@ void dump_vmcs(void) pr_err("InterruptStatus = %04x\n", vmcs_read16(GUEST_INTR_STATUS)); + if (vmentry_ctl & VM_ENTRY_LOAD_CET_STATE) { + pr_err("S_CET = 0x%016lx\n", vmcs_readl(GUEST_S_CET)); + pr_err("SSP = 0x%016lx\n", vmcs_readl(GUEST_SSP)); + pr_err("SSP TABLE = 0x%016lx\n", + vmcs_readl(GUEST_INTR_SSP_TABLE)); + } pr_err("*** Host State ***\n"); pr_err("RIP = 0x%016lx RSP = 0x%016lx\n", vmcs_readl(HOST_RIP), vmcs_readl(HOST_RSP)); @@ -6130,6 +6136,12 @@ void dump_vmcs(void) if (secondary_exec_control & SECONDARY_EXEC_ENABLE_VPID) pr_err("Virtual processor ID = 0x%04x\n", vmcs_read16(VIRTUAL_PROCESSOR_ID)); + if (vmexit_ctl & VM_EXIT_LOAD_CET_STATE) { + pr_err("S_CET = 0x%016lx\n", vmcs_readl(HOST_S_CET)); + pr_err("SSP = 0x%016lx\n", vmcs_readl(HOST_SSP)); + pr_err("SSP TABLE = 0x%016lx\n", + vmcs_readl(HOST_INTR_SSP_TABLE)); + } } /* @@ -8205,6 +8217,7 @@ static __init int hardware_setup(void) unsigned long host_bndcfgs; struct desc_ptr dt; int r, i, ept_lpage_level; + u64 cet_msr; store_idt(&dt); host_idt_base = dt.address; @@ -8365,6 +8378,16 @@ static __init int hardware_setup(void) return r; } + if (boot_cpu_has(X86_FEATURE_IBT) || boot_cpu_has(X86_FEATURE_SHSTK)) { + rdmsrl(MSR_IA32_S_CET, cet_msr); + WARN_ONCE(cet_msr, "KVM: CET S_CET in host will be lost!\n"); + } + + if (boot_cpu_has(X86_FEATURE_SHSTK)) { + rdmsrl(MSR_IA32_PL0_SSP, cet_msr); + WARN_ONCE(cet_msr, "KVM: CET PL0_SSP in host will be lost!\n"); + } + vmx_set_cpu_caps(); r = alloc_kvm_area(); From patchwork Thu Jul 16 03:16:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11666573 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D8FBF722 for ; Thu, 16 Jul 2020 03:17:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CA49B207E8 for ; Thu, 16 Jul 2020 03:17:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728289AbgGPDRS (ORCPT ); Wed, 15 Jul 2020 23:17:18 -0400 Received: from mga06.intel.com ([134.134.136.31]:8160 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728275AbgGPDRR (ORCPT ); Wed, 15 Jul 2020 23:17:17 -0400 IronPort-SDR: 6tJ9OI36CziBtFvFD2RY87QznhbHv+NDFhc9NN2TMgjNQx4rqfQvclBkBjY9FeNDI1Snf8XpRA 1H/IXE/Q8aPg== X-IronPort-AV: E=McAfee;i="6000,8403,9683"; a="210844866" X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="210844866" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Jul 2020 20:17:16 -0700 IronPort-SDR: DoNhtr9jlm5GbwGimNXBb0DZIFTwoHK/vmP6i5asBBydbTu3ZCFuXKRcHh/1DaXL18WPdz8h72 54ZjVgTMengg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="360910490" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga001.jf.intel.com with ESMTP; 15 Jul 2020 20:17:14 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [RESEND v13 10/11] KVM: x86: Add #CP support in guest exception dispatch Date: Thu, 16 Jul 2020 11:16:26 +0800 Message-Id: <20200716031627.11492-11-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200716031627.11492-1-weijiang.yang@intel.com> References: <20200716031627.11492-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org CPU defined #CP(21) to handle CET induced exception, it's accompanied with several error codes corresponding to different CET violation cases, see SDM for detailed description. The exception is classified as a contibutory exception w.r.t #DF. Signed-off-by: Yang Weijiang --- arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/x86.c | 1 + arch/x86/kvm/x86.h | 2 +- 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 0780f97c1850..fb33cacc8935 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -31,6 +31,7 @@ #define MC_VECTOR 18 #define XM_VECTOR 19 #define VE_VECTOR 20 +#define CP_VECTOR 21 /* Select x86 specific features in */ #define __KVM_HAVE_PIT diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c71a9ceac05e..76892fb0b0a0 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -405,6 +405,7 @@ static int exception_class(int vector) case NP_VECTOR: case SS_VECTOR: case GP_VECTOR: + case CP_VECTOR: return EXCPT_CONTRIBUTORY; default: break; diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 6eb62e97e59f..53a92fc5f065 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -115,7 +115,7 @@ static inline bool x86_exception_has_error_code(unsigned int vector) { static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) | BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) | - BIT(PF_VECTOR) | BIT(AC_VECTOR); + BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR); return (1U << vector) & exception_has_error_code; } From patchwork Thu Jul 16 03:16:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 11666575 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BEF47138C for ; Thu, 16 Jul 2020 03:17:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AF5EF2076C for ; Thu, 16 Jul 2020 03:17:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728315AbgGPDRZ (ORCPT ); Wed, 15 Jul 2020 23:17:25 -0400 Received: from mga06.intel.com ([134.134.136.31]:8164 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728291AbgGPDRT (ORCPT ); Wed, 15 Jul 2020 23:17:19 -0400 IronPort-SDR: OdnQU85mYcQfWJiMkKli+PVyooRpjEYBiF+mX7UegCHpa3NBHLgk/rTt+0hI+TIT7JSoVvJxk9 VCkXWeXFaUKg== X-IronPort-AV: E=McAfee;i="6000,8403,9683"; a="210844869" X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="210844869" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Jul 2020 20:17:18 -0700 IronPort-SDR: oPH15PgiG93QSLB7jnt4dL1RUrJBgcCH09jfpQLd8HhDCezAW4bvbu8vUIRuK1SMlahR1s/dGR KkkQy03e5BXg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,357,1589266800"; d="scan'208";a="360910494" Received: from unknown (HELO local-michael-cet-test.sh.intel.com) ([10.239.159.128]) by orsmga001.jf.intel.com with ESMTP; 15 Jul 2020 20:17:16 -0700 From: Yang Weijiang To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com Cc: yu.c.zhang@linux.intel.com, Yang Weijiang Subject: [RESEND v13 11/11] KVM: x86: Enable CET virtualization and advertise CET to userspace Date: Thu, 16 Jul 2020 11:16:27 +0800 Message-Id: <20200716031627.11492-12-weijiang.yang@intel.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20200716031627.11492-1-weijiang.yang@intel.com> References: <20200716031627.11492-1-weijiang.yang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Set the feature bits so that CET capabilities can be seen in guest via CPUID enumeration. Add CR4.CET bit support in order to allow guest set CET master control bit(CR4.CET). Disable KVM CET feature once unrestricted_guest is turned off because KVM cannot emulate guest CET behavior well in this case. Don't expose CET feature if dependent CET bits are cleared in host XSS. Signed-off-by: Yang Weijiang --- arch/x86/include/asm/kvm_host.h | 3 ++- arch/x86/kvm/cpuid.c | 5 +++-- arch/x86/kvm/vmx/vmx.c | 5 +++++ arch/x86/kvm/x86.c | 11 +++++++++++ 4 files changed, 21 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index e8c749596ba2..c4c82db68b6a 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -99,7 +99,8 @@ | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \ | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \ | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \ - | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP)) + | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \ + | X86_CR4_CET)) #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index d97b2a6e8a8c..a085b8c57f34 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -340,7 +340,8 @@ void kvm_set_cpu_caps(void) F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) | F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) | F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) | - F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ + F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ | + F(SHSTK) ); /* Set LA57 based on hardware capability. */ if (cpuid_ecx(7) & F(LA57)) @@ -356,7 +357,7 @@ void kvm_set_cpu_caps(void) kvm_cpu_cap_mask(CPUID_7_EDX, F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) | F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) | - F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) + F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) | F(IBT) ); /* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */ diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 5d4250b9dec8..31593339b6fe 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7542,6 +7542,11 @@ static __init void vmx_set_cpu_caps(void) if (vmx_waitpkg_supported()) kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG); + + if (!enable_unrestricted_guest) { + kvm_cpu_cap_clear(X86_FEATURE_SHSTK); + kvm_cpu_cap_clear(X86_FEATURE_IBT); + } } static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 76892fb0b0a0..c7393d62ad72 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9808,10 +9808,21 @@ int kvm_arch_hardware_setup(void *opaque) if (kvm_cpu_cap_has(X86_FEATURE_XSAVES)) supported_xss = host_xss & KVM_SUPPORTED_XSS; + if (!(supported_xss & (XFEATURE_MASK_CET_USER | + XFEATURE_MASK_CET_KERNEL))) { + kvm_cpu_cap_clear(X86_FEATURE_SHSTK); + kvm_cpu_cap_clear(X86_FEATURE_IBT); + } + #define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f) cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_); #undef __kvm_cpu_cap_has + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) && + !kvm_cpu_cap_has(X86_FEATURE_IBT)) + supported_xss &= ~(XFEATURE_MASK_CET_USER | + XFEATURE_MASK_CET_KERNEL); + if (kvm_has_tsc_control) { /* * Make sure the user can only configure tsc_khz values that