From patchwork Thu Sep 14 06:33:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384903 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88B53EDE99F for ; Thu, 14 Sep 2023 09:38:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237238AbjINJiW (ORCPT ); Thu, 14 Sep 2023 05:38:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40060 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235825AbjINJiU (ORCPT ); Thu, 14 Sep 2023 05:38:20 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 14FDE1BEF; Thu, 14 Sep 2023 02:38:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684296; x=1726220296; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0H4/DsmRdq6ZAqAGPRiCyRQ3eI7Btuj1gad9cmvRYIo=; b=FHzUbTRatf9MQriLjNnfG0O8qD+rH/kpYfWsRys1cg5LkHPeJ1fmN5N/ f+1j/bWGSvw1vTuBo+vSe8MWPY0cy0UcdaD3HPEUaEqCCFRPTWwXgQplB FNpCa3Q/mKeYEmBYeYdwktRIoes82ZHExpNAJg+gmLGL/Jn/ivI5TVV35 M/fp7egW/8zwrLoX5cVe4FG9zHkvy+h+uByaREa8LQqFBtWNKfjNLzHMM saGuaf5VUwZ3K/tCAXsaiM1+JIXLE9QhRF8Pzu973JJh7lBqEIyw59OfK QrNUsEcCGwIub7ihsD90eaLuEznhLhJWc9bWLl6EBzzq7uh+JZIdyvV9s w==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857303" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857303" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:15 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656209" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656209" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:15 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 01/25] x86/fpu/xstate: Manually check and add XFEATURE_CET_USER xstate bit Date: Thu, 14 Sep 2023 02:33:01 -0400 Message-Id: <20230914063325.85503-2-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Remove XFEATURE_CET_USER entry from dependency array as the entry doesn't reflect true dependency between CET features and the xstate bit, instead manually check and add the bit back if either SHSTK or IBT is supported. Both user mode shadow stack and indirect branch tracking features depend on XFEATURE_CET_USER bit in XSS to automatically save/restore user mode xstate registers, i.e., IA32_U_CET and IA32_PL3_SSP whenever necessary. Although in real world a platform with IBT but no SHSTK is rare, but in virtualization world it's common, guest SHSTK and IBT can be controlled independently via userspace app. Signed-off-by: Yang Weijiang Reviewed-by: Rick Edgecombe Tested-by: Rick Edgecombe --- arch/x86/kernel/fpu/xstate.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index cadf68737e6b..12c8cb278346 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -73,7 +73,6 @@ static unsigned short xsave_cpuid_features[] __initdata = { [XFEATURE_PT_UNIMPLEMENTED_SO_FAR] = X86_FEATURE_INTEL_PT, [XFEATURE_PKRU] = X86_FEATURE_OSPKE, [XFEATURE_PASID] = X86_FEATURE_ENQCMD, - [XFEATURE_CET_USER] = X86_FEATURE_SHSTK, [XFEATURE_XTILE_CFG] = X86_FEATURE_AMX_TILE, [XFEATURE_XTILE_DATA] = X86_FEATURE_AMX_TILE, }; @@ -798,6 +797,14 @@ void __init fpu__init_system_xstate(unsigned int legacy_size) fpu_kernel_cfg.max_features &= ~BIT_ULL(i); } + /* + * Manually add CET user mode xstate bit if either SHSTK or IBT is + * available. Both features depend on the xstate bit to save/restore + * CET user mode state. + */ + if (boot_cpu_has(X86_FEATURE_SHSTK) || boot_cpu_has(X86_FEATURE_IBT)) + fpu_kernel_cfg.max_features |= BIT_ULL(XFEATURE_CET_USER); + if (!cpu_feature_enabled(X86_FEATURE_XFD)) fpu_kernel_cfg.max_features &= ~XFEATURE_MASK_USER_DYNAMIC; From patchwork Thu Sep 14 06:33:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384904 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E37F7EDE99C for ; Thu, 14 Sep 2023 09:38:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235913AbjINJiX (ORCPT ); Thu, 14 Sep 2023 05:38:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40120 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235604AbjINJiU (ORCPT ); Thu, 14 Sep 2023 05:38:20 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 769FB83; Thu, 14 Sep 2023 02:38:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684296; x=1726220296; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GCpk9GKZRRCMIQFL0D0NzM82+1bNcf1kwTMzwzthcgA=; b=cDIdwRKuoPTGI/WjCaaxW/Olk3zhTefEYBrEPNYRlQZmO1e5XXntAdO9 gHXyXLX6rc1YCtV3lxQ3IiCvRiaYpZHA8R8/W516ufKvtDabcKJWo5pmy 7w/3RuhjH/JWG54Wj7qKuxLliNvQv1s/5DdxwOh1VZbOCq/QnaCx5CGEa sLKynkOMKRtX1SdkltkQowdEpZ4EDg+pzO1idyiJSdfoZaNucm5fNUxh1 TMYsIpcIa74EYC+riVLEfX+1ulBUTY+OBN0X5isXBOsxr4lGe03tr1/FR Zh+cwwTJgb0DIeKoJC/Rcly/cpfoI1dgmQB76OqzaTJ7Xq2DbkP/B9gBy Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857313" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857313" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:16 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656212" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656212" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:15 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 02/25] x86/fpu/xstate: Fix guest fpstate allocation size calculation Date: Thu, 14 Sep 2023 02:33:02 -0400 Message-Id: <20230914063325.85503-3-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Fix guest xsave area allocation size from fpu_user_cfg.default_size to fpu_kernel_cfg.default_size so that the xsave area size is consistent with fpstate->size set in __fpstate_reset(). With the fix, guest fpstate size is sufficient for KVM supported guest xfeatures. Signed-off-by: Yang Weijiang --- arch/x86/kernel/fpu/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index a86d37052a64..a42d8ad26ce6 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -220,7 +220,9 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu) struct fpstate *fpstate; unsigned int size; - size = fpu_user_cfg.default_size + ALIGN(offsetof(struct fpstate, regs), 64); + size = fpu_kernel_cfg.default_size + + ALIGN(offsetof(struct fpstate, regs), 64); + fpstate = vzalloc(size); if (!fpstate) return false; From patchwork Thu Sep 14 06:33:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384905 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB5D8EDE99C for ; Thu, 14 Sep 2023 09:38:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237404AbjINJi2 (ORCPT ); Thu, 14 Sep 2023 05:38:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40134 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237139AbjINJiV (ORCPT ); Thu, 14 Sep 2023 05:38:21 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E571B1BF9; Thu, 14 Sep 2023 02:38:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684296; x=1726220296; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DH/kwoW6eRQtxK0yZo8tCx3soKPTztRafZKZYlHzFUA=; b=DLXeEsQuY2bUJhHhoWPhTqwDe3ENNA7Aa++gb2KZjSAVZ7ka4t/bMDRj GGT4SNXLy58fhmcnzse3yNCgDtxuyoF9qvyA5SY6ycYXF081oScV1We5g 11iocSNWjFR+D5r5fN3yqbt9sla6djjpQx/8Tg7BfqJk/V9LxCP34686w l0vrYgOVFIKEBfOAzhF11kn0T95hTIlhU7bXNgMHMPCWkC1aq2jj/6KyP akyoUhxDKUzSQH37TMr2EdDdKh4iUsN7Pd9cShx7MGKC5EeptdxRLjbYq tpQ+drZnAhZMNqIqYmssq7PBzl0WXF5BQ5oedijemTSgTd7eUpyhXgRQm w==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857318" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857318" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:16 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656216" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656216" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:16 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 03/25] x86/fpu/xstate: Add CET supervisor mode state support Date: Thu, 14 Sep 2023 02:33:03 -0400 Message-Id: <20230914063325.85503-4-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add supervisor mode state support within FPU xstate management framework. Although supervisor shadow stack is not enabled/used today in kernel,KVM requires the support because when KVM advertises shadow stack feature to guest, architechturally it claims the support for both user and supervisor modes for Linux and non-Linux guest OSes. With the xstate support, guest supervisor mode shadow stack state can be properly saved/restored when 1) guest/host FPU context is swapped 2) vCPU thread is sched out/in. The alternative is to enable it in KVM domain, but KVM maintainers NAKed the solution. The external discussion can be found at [*], it ended up with adding the support in kernel instead of KVM domain. Note, in KVM case, guest CET supervisor state i.e., IA32_PL{0,1,2}_MSRs, are preserved after VM-Exit until host/guest fpstates are swapped, but since host supervisor shadow stack is disabled, the preserved MSRs won't hurt host. [*]: https://lore.kernel.org/all/806e26c2-8d21-9cc9-a0b7-7787dd231729@intel.com/ Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/include/asm/fpu/types.h | 14 ++++++++++++-- arch/x86/include/asm/fpu/xstate.h | 6 +++--- arch/x86/kernel/fpu/xstate.c | 6 +++++- 3 files changed, 20 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h index eb810074f1e7..c6fd13a17205 100644 --- a/arch/x86/include/asm/fpu/types.h +++ b/arch/x86/include/asm/fpu/types.h @@ -116,7 +116,7 @@ enum xfeature { XFEATURE_PKRU, XFEATURE_PASID, XFEATURE_CET_USER, - XFEATURE_CET_KERNEL_UNUSED, + XFEATURE_CET_KERNEL, XFEATURE_RSRVD_COMP_13, XFEATURE_RSRVD_COMP_14, XFEATURE_LBR, @@ -139,7 +139,7 @@ enum xfeature { #define XFEATURE_MASK_PKRU (1 << XFEATURE_PKRU) #define XFEATURE_MASK_PASID (1 << XFEATURE_PASID) #define XFEATURE_MASK_CET_USER (1 << XFEATURE_CET_USER) -#define XFEATURE_MASK_CET_KERNEL (1 << XFEATURE_CET_KERNEL_UNUSED) +#define XFEATURE_MASK_CET_KERNEL (1 << XFEATURE_CET_KERNEL) #define XFEATURE_MASK_LBR (1 << XFEATURE_LBR) #define XFEATURE_MASK_XTILE_CFG (1 << XFEATURE_XTILE_CFG) #define XFEATURE_MASK_XTILE_DATA (1 << XFEATURE_XTILE_DATA) @@ -264,6 +264,16 @@ struct cet_user_state { u64 user_ssp; }; +/* + * State component 12 is Control-flow Enforcement supervisor states + */ +struct cet_supervisor_state { + /* supervisor ssp pointers */ + u64 pl0_ssp; + u64 pl1_ssp; + u64 pl2_ssp; +}; + /* * State component 15: Architectural LBR configuration state. * The size of Arch LBR state depends on the number of LBRs (lbr_depth). diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h index d4427b88ee12..3b4a038d3c57 100644 --- a/arch/x86/include/asm/fpu/xstate.h +++ b/arch/x86/include/asm/fpu/xstate.h @@ -51,7 +51,8 @@ /* All currently supported supervisor features */ #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \ - XFEATURE_MASK_CET_USER) + XFEATURE_MASK_CET_USER | \ + XFEATURE_MASK_CET_KERNEL) /* * A supervisor state component may not always contain valuable information, @@ -78,8 +79,7 @@ * Unsupported supervisor features. When a supervisor feature in this mask is * supported in the future, move it to the supported supervisor feature mask. */ -#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT | \ - XFEATURE_MASK_CET_KERNEL) +#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT) /* All supervisor states including supported and unsupported states. */ #define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \ diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 12c8cb278346..c3ed86732d33 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -51,7 +51,7 @@ static const char *xfeature_names[] = "Protection Keys User registers", "PASID state", "Control-flow User registers", - "Control-flow Kernel registers (unused)", + "Control-flow Kernel registers", "unknown xstate feature", "unknown xstate feature", "unknown xstate feature", @@ -73,6 +73,7 @@ static unsigned short xsave_cpuid_features[] __initdata = { [XFEATURE_PT_UNIMPLEMENTED_SO_FAR] = X86_FEATURE_INTEL_PT, [XFEATURE_PKRU] = X86_FEATURE_OSPKE, [XFEATURE_PASID] = X86_FEATURE_ENQCMD, + [XFEATURE_CET_KERNEL] = X86_FEATURE_SHSTK, [XFEATURE_XTILE_CFG] = X86_FEATURE_AMX_TILE, [XFEATURE_XTILE_DATA] = X86_FEATURE_AMX_TILE, }; @@ -277,6 +278,7 @@ static void __init print_xstate_features(void) print_xstate_feature(XFEATURE_MASK_PKRU); print_xstate_feature(XFEATURE_MASK_PASID); print_xstate_feature(XFEATURE_MASK_CET_USER); + print_xstate_feature(XFEATURE_MASK_CET_KERNEL); print_xstate_feature(XFEATURE_MASK_XTILE_CFG); print_xstate_feature(XFEATURE_MASK_XTILE_DATA); } @@ -346,6 +348,7 @@ static __init void os_xrstor_booting(struct xregs_state *xstate) XFEATURE_MASK_BNDCSR | \ XFEATURE_MASK_PASID | \ XFEATURE_MASK_CET_USER | \ + XFEATURE_MASK_CET_KERNEL | \ XFEATURE_MASK_XTILE) /* @@ -546,6 +549,7 @@ static bool __init check_xstate_against_struct(int nr) case XFEATURE_PASID: return XCHECK_SZ(sz, nr, struct ia32_pasid_state); case XFEATURE_XTILE_CFG: return XCHECK_SZ(sz, nr, struct xtile_cfg); case XFEATURE_CET_USER: return XCHECK_SZ(sz, nr, struct cet_user_state); + case XFEATURE_CET_KERNEL: return XCHECK_SZ(sz, nr, struct cet_supervisor_state); case XFEATURE_XTILE_DATA: check_xtile_data_against_struct(sz); return true; default: XSTATE_WARN_ON(1, "No structure for xstate: %d\n", nr); From patchwork Thu Sep 14 06:33:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384906 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 03F54EDE99F for ; Thu, 14 Sep 2023 09:38:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237430AbjINJi3 (ORCPT ); Thu, 14 Sep 2023 05:38:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40138 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230413AbjINJiV (ORCPT ); Thu, 14 Sep 2023 05:38:21 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4541783; Thu, 14 Sep 2023 02:38:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684297; x=1726220297; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=RiRpO69NN4+D9BiPv1b8kqQ9mDIXR5b5ZIvKZMvuZW0=; b=Ip7QVM/FkJrJHMMsv18V7r/amDdzk2zjr6TL2+txRHNxr1Qfpaji8jGp ozOU1RHSoB2OtWrFCWQbpZAwc+j5KnX4q9GhcvGZ0++DhYsWQ/qXs4Rii 7gTLkxuMUIwKCkY8KCvIQdQjf99xKvP4s81lGCs++llFM0Bkas2tE6uia try/Oqho+3WkPG5h0oA0eXRZzAa2qwjljKMNLEP/bhJK1YXnWdKPRU1hO 8PtJaBuUz/8UbIHwQ7J8UgQenrIAJg+hcyZz1pBTEl9kCe+hGzWHEi9R5 HP061Mg9fVqGBqrgY+wNuU1Wo0P8uzt4ZvH/vVKtFu/txAI/KTIdiAzrw A==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857323" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857323" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:17 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656219" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656219" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:16 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 04/25] x86/fpu/xstate: Introduce kernel dynamic xfeature set Date: Thu, 14 Sep 2023 02:33:04 -0400 Message-Id: <20230914063325.85503-5-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Define a new kernel xfeature set including the features can be dynamically enabled, i.e., the relevant feature is enabled on demand. The xfeature set is currently used by KVM to configure __guest__ fpstate, i.e., calculating the xfeature and fpstate storage size etc. The xfeature set is initialized once and used whenever it's referenced to avoid repeat calculation. Currently it's used when 1) guest fpstate __state_size is calculated while guest permits are configured 2) guest vCPU is created and its fpstate is initialized. Suggested-by: Dave Hansen Signed-off-by: Yang Weijiang --- arch/x86/kernel/fpu/xstate.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index c3ed86732d33..eaec05bc1b3c 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -84,6 +84,8 @@ static unsigned int xstate_sizes[XFEATURE_MAX] __ro_after_init = { [ 0 ... XFEATURE_MAX - 1] = -1}; static unsigned int xstate_flags[XFEATURE_MAX] __ro_after_init; +u64 fpu_kernel_dynamic_xfeatures __ro_after_init; + #define XSTATE_FLAG_SUPERVISOR BIT(0) #define XSTATE_FLAG_ALIGNED64 BIT(1) @@ -740,6 +742,23 @@ static void __init fpu__init_disable_system_xstate(unsigned int legacy_size) fpstate_reset(¤t->thread.fpu); } +static unsigned short xsave_kernel_dynamic_xfeatures[] = { + [XFEATURE_CET_KERNEL] = X86_FEATURE_SHSTK, +}; + +static void __init init_kernel_dynamic_xfeatures(void) +{ + unsigned short cid; + int i; + + for (i = 0; i < ARRAY_SIZE(xsave_kernel_dynamic_xfeatures); i++) { + cid = xsave_kernel_dynamic_xfeatures[i]; + + if (cid && boot_cpu_has(cid)) + fpu_kernel_dynamic_xfeatures |= BIT_ULL(i); + } +} + /* * Enable and initialize the xsave feature. * Called once per system bootup. @@ -809,6 +828,8 @@ void __init fpu__init_system_xstate(unsigned int legacy_size) if (boot_cpu_has(X86_FEATURE_SHSTK) || boot_cpu_has(X86_FEATURE_IBT)) fpu_kernel_cfg.max_features |= BIT_ULL(XFEATURE_CET_USER); + init_kernel_dynamic_xfeatures(); + if (!cpu_feature_enabled(X86_FEATURE_XFD)) fpu_kernel_cfg.max_features &= ~XFEATURE_MASK_USER_DYNAMIC; From patchwork Thu Sep 14 06:33:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384907 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 763DFEDE99E for ; Thu, 14 Sep 2023 09:38:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237382AbjINJib (ORCPT ); Thu, 14 Sep 2023 05:38:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40142 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236923AbjINJiV (ORCPT ); Thu, 14 Sep 2023 05:38:21 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B73C71BEF; Thu, 14 Sep 2023 02:38:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684297; x=1726220297; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7cLERNq1vvTBAZ52sQIMEUGAduSmp4Knqxddo4MpnMo=; b=Q+uN2PFZ/YM49u3guS8f3wncn9EMDc7BCIpgsnE3evehRTkaNepQ0zmH f34zdTgtYxzFtHv0Y8T/GWvGSGVvT1ovgPoFXVGY7HKxPjLcFbKcWRlUu tlMNJmrTcUHfY0puW2k6Y4jM/8S8eMWY0pXt0zdnQBCatoS34nrrOYrIF uIEw3SNKvIWYDdfR9soCrRlXnG8s9jjsP3FBJshzRgyrPnOt7I6DTlhg2 VgvkKTwUyJyFcCULM6MwzlZtLhBgxMwf/9UYgkFWlc2klZ4HJ7eMl53qv nXzxhvtgedBuLfp3zWc3vxYCstuTlemyS6flDUf5W8HAjC6Ap+dijOrAG Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857328" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857328" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:17 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656226" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656226" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:16 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 05/25] x86/fpu/xstate: Remove kernel dynamic xfeatures from kernel default_features Date: Thu, 14 Sep 2023 02:33:05 -0400 Message-Id: <20230914063325.85503-6-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The kernel dynamic xfeatures are supported by host, i.e., they're enabled in xsaves/xrstors operating xfeature set (XCR0 | XSS), but the corresponding CPU features are disabled for the time-being in host kernel so the bits are not necessarily set by default. Remove the bits from fpu_kernel_cfg.default_features so that the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors can be optimized by HW for normal fpstate. Signed-off-by: Yang Weijiang --- arch/x86/kernel/fpu/xstate.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index eaec05bc1b3c..4753c677e2e1 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -845,6 +845,7 @@ void __init fpu__init_system_xstate(unsigned int legacy_size) /* Clean out dynamic features from default */ fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features; fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC; + fpu_kernel_cfg.default_features &= ~fpu_kernel_dynamic_xfeatures; fpu_user_cfg.default_features = fpu_user_cfg.max_features; fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC; From patchwork Thu Sep 14 06:33:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384908 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DCBA4EDE9A1 for ; Thu, 14 Sep 2023 09:38:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237299AbjINJid (ORCPT ); Thu, 14 Sep 2023 05:38:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40156 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237227AbjINJiW (ORCPT ); Thu, 14 Sep 2023 05:38:22 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EAA1183; Thu, 14 Sep 2023 02:38:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684297; x=1726220297; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=kg4jzy9ndQH5j3Qk6jXnrqWtPLjorVO+dQ5iT2oQlDE=; b=j0grAlD2pHYOmxYjCnQHRCMwsVtmyClUbLyGPRyq61a+SP2H/yjGsE/e Iq947Qgze9ZvHh7VOZR2AYmwKK9tSiGTtfXJFyfKOCpeai4H9a73RaLox aWhiF4ZrlWjwb0DMC9tgR4b8Q+wqPwbFuKciTy3dB/lfP9uwd8hBO4UqO LrjUO8yXGijYUU33rCi6tTXalk6rX8f2jW37LfALgg3/q6rRDFKsPl43f UhyQPpevr3nSg3AV9uIVUfWshaB0+Yq9ULiDcafit3O6ihJObR79VPah/ HJPLbhaaH+hYgJDGCE+s7TYryHBdC9DUTmZLoVXhsWBBqBIFZLaYyY9zg w==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857335" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857335" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:17 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656229" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656229" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:17 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 06/25] x86/fpu/xstate: Opt-in kernel dynamic bits when calculate guest xstate size Date: Thu, 14 Sep 2023 02:33:06 -0400 Message-Id: <20230914063325.85503-7-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When user space requests guest xstate permits, the sufficient xstate size is calculated from permitted mask. Currently the max guest permits are set to fpu_kernel_cfg.default_features, and the latter doesn't include kernel dynamic xfeatures, so add them back for correct guest fpstate size. If guest dynamic xfeatures are enabled, KVM re-allocates guest fpstate area with above resulting size before launches VM. Signed-off-by: Yang Weijiang Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/kernel/fpu/xstate.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 4753c677e2e1..c5d903b4df4d 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -1636,9 +1636,17 @@ static int __xstate_request_perm(u64 permitted, u64 requested, bool guest) /* Calculate the resulting kernel state size */ mask = permitted | requested; - /* Take supervisor states into account on the host */ + /* + * Take supervisor states into account on the host. And add + * kernel dynamic xfeatures to guest since guest kernel may + * enable corresponding CPU feaures and the xstate registers + * need to be saved/restored properly. + */ if (!guest) mask |= xfeatures_mask_supervisor(); + else + mask |= fpu_kernel_dynamic_xfeatures; + ksize = xstate_calculate_size(mask, compacted); /* Calculate the resulting user state size */ From patchwork Thu Sep 14 06:33:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384910 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4B847EDE9A0 for ; Thu, 14 Sep 2023 09:38:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235410AbjINJii (ORCPT ); Thu, 14 Sep 2023 05:38:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49788 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237261AbjINJiW (ORCPT ); Thu, 14 Sep 2023 05:38:22 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A671283; Thu, 14 Sep 2023 02:38:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684298; x=1726220298; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Qn5wUWy4Vkg3CghO5UGfFtXHTUkaS9WcgUAH/F4Vglo=; b=BbCng3w8S/y86ljpQZbVzEfEltJG3Qwgv4wtOQIqDktnowRq3eCkem6Q 8zaMjN4DBDELD+nFyIG3qyCZ8oBEwltOYJjg9upHoZE0RqZQeNQ2VO+FT FLzsvUb3SEV/jvODZDhAh8iby2Dv5YJO9SD4j7Bmm32uyTJfGwYXDfuKh 5ZSFk3Ya2dY7y5DvdAXKLBAYP8cHRAa7ad7A7Njr7pLdoHYaUxpXqzak3 IfGNbc+EP1l1bv7CfttxviEh8MFXFzkZCOiJc8aLUX5DftSUCBpOkofZ6 fhsogzrwFJTMco8F9HcYNzb7Qb0VsmHf7urQZmp7Wb8HSGrq9Po/gP8qB w==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857341" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857341" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:18 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656232" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656232" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:17 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 07/25] x86/fpu/xstate: Tweak guest fpstate to support kernel dynamic xfeatures Date: Thu, 14 Sep 2023 02:33:07 -0400 Message-Id: <20230914063325.85503-8-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The guest fpstate is sized with fpu_kernel_cfg.default_size (by preceding fix) and the kernel dynamic xfeatures are not taken into account, so add the support and tweak fpstate xfeatures and size accordingly. Below configuration steps are currently enforced to get guest fpstate: 1) User space sets thread group xstate permits via arch_prctl(). 2) User space creates vcpu thread. 3) User space enables guest dynamic xfeatures. In #1, guest fpstate size (i.e., __state_size [1]) is induced from (fpu_kernel_cfg.default_features | user dynamic xfeatures) [2]. In #2, guest fpstate size is calculated with fpu_kernel_cfg.default_size and fpstate->size is set to the same. fpstate->xfeatures is set to fpu_kernel_cfg.default_features. In #3, guest fpstate is re-allocated as [1] and fpstate->xfeatures is set to [2]. By adding kernel dynamic xfeatures in above #1 and #2, guest xstate area size is expanded to hold (fpu_kernel_cfg.default_features | kernel dynamic _xfeatures | user dynamic xfeatures)[3], and guest fpstate->xfeatures is set to [3]. Then host xsaves/xrstors can act on all guest xfeatures. The user_* fields remain unchanged for compatibility of non-compacted KVM uAPIs. Signed-off-by: Yang Weijiang --- arch/x86/kernel/fpu/core.c | 56 +++++++++++++++++++++++++++++------- arch/x86/kernel/fpu/xstate.c | 2 +- arch/x86/kernel/fpu/xstate.h | 2 ++ 3 files changed, 49 insertions(+), 11 deletions(-) diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index a42d8ad26ce6..e5819b38545a 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -33,6 +33,8 @@ DEFINE_STATIC_KEY_FALSE(__fpu_state_size_dynamic); DEFINE_PER_CPU(u64, xfd_state); #endif +extern unsigned int xstate_calculate_size(u64 xfeatures, bool compacted); + /* The FPU state configuration data for kernel and user space */ struct fpu_state_config fpu_kernel_cfg __ro_after_init; struct fpu_state_config fpu_user_cfg __ro_after_init; @@ -193,8 +195,6 @@ void fpu_reset_from_exception_fixup(void) } #if IS_ENABLED(CONFIG_KVM) -static void __fpstate_reset(struct fpstate *fpstate, u64 xfd); - static void fpu_init_guest_permissions(struct fpu_guest *gfpu) { struct fpu_state_perm *fpuperm; @@ -215,28 +215,64 @@ static void fpu_init_guest_permissions(struct fpu_guest *gfpu) gfpu->perm = perm & ~FPU_GUEST_PERM_LOCKED; } -bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu) +static struct fpstate *__fpu_alloc_init_guest_fpstate(struct fpu_guest *gfpu) { + bool compacted = cpu_feature_enabled(X86_FEATURE_XCOMPACTED); + unsigned int gfpstate_size, size; struct fpstate *fpstate; - unsigned int size; + u64 xfeatures; + + /* + * fpu_kernel_cfg.default_features includes all enabled xfeatures + * except those dynamic xfeatures. Compared with user dynamic + * xfeatures, the kernel dynamic ones are enabled for guest by + * default, so add the kernel dynamic xfeatures back when calculate + * guest fpstate size. + * + * If the user dynamic xfeatures are enabled, the guest fpstate will + * be re-allocated to hold all guest enabled xfeatures, so omit user + * dynamic xfeatures here. + */ + xfeatures = fpu_kernel_cfg.default_features | + fpu_kernel_dynamic_xfeatures; + + gfpstate_size = xstate_calculate_size(xfeatures, compacted); - size = fpu_kernel_cfg.default_size + - ALIGN(offsetof(struct fpstate, regs), 64); + size = gfpstate_size + ALIGN(offsetof(struct fpstate, regs), 64); fpstate = vzalloc(size); if (!fpstate) - return false; + return NULL; + /* + * Initialize sizes and feature masks, use fpu_user_cfg.* + * for user_* settings for compatibility of exiting uAPIs. + */ + fpstate->size = gfpstate_size; + fpstate->xfeatures = xfeatures; + fpstate->user_size = fpu_user_cfg.default_size; + fpstate->user_xfeatures = fpu_user_cfg.default_features; + fpstate->xfd = 0; - /* Leave xfd to 0 (the reset value defined by spec) */ - __fpstate_reset(fpstate, 0); fpstate_init_user(fpstate); fpstate->is_valloc = true; fpstate->is_guest = true; gfpu->fpstate = fpstate; - gfpu->xfeatures = fpu_user_cfg.default_features; + gfpu->xfeatures = xfeatures; gfpu->perm = fpu_user_cfg.default_features; + return fpstate; +} + +bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu) +{ + struct fpstate *fpstate; + + fpstate = __fpu_alloc_init_guest_fpstate(gfpu); + + if (!fpstate) + return false; + /* * KVM sets the FP+SSE bits in the XSAVE header when copying FPU state * to userspace, even when XSAVE is unsupported, so that restoring FPU diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index c5d903b4df4d..87149aba6f11 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -561,7 +561,7 @@ static bool __init check_xstate_against_struct(int nr) return true; } -static unsigned int xstate_calculate_size(u64 xfeatures, bool compacted) +unsigned int xstate_calculate_size(u64 xfeatures, bool compacted) { unsigned int topmost = fls64(xfeatures) - 1; unsigned int offset = xstate_offsets[topmost]; diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h index a4ecb04d8d64..9c6e3ca05c5c 100644 --- a/arch/x86/kernel/fpu/xstate.h +++ b/arch/x86/kernel/fpu/xstate.h @@ -10,6 +10,8 @@ DECLARE_PER_CPU(u64, xfd_state); #endif +extern u64 fpu_kernel_dynamic_xfeatures; + static inline void xstate_init_xcomp_bv(struct xregs_state *xsave, u64 mask) { /* From patchwork Thu Sep 14 06:33:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384909 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EBFC9EDE99F for ; Thu, 14 Sep 2023 09:38:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237267AbjINJie (ORCPT ); Thu, 14 Sep 2023 05:38:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49796 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237264AbjINJiW (ORCPT ); Thu, 14 Sep 2023 05:38:22 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BBB931BF2; Thu, 14 Sep 2023 02:38:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684298; x=1726220298; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Mo0J7nmC7/MpnJ3d9RwhENdZi4KZZ89Pbi3lSWxz018=; b=gfsfPXzRKPscqyalcDfTKP10vg4ulbrXvfC9cSe/vgEgerL3h1RMhN3/ +WIX3/LvPAyyCAKJ4M5OZh2anH0LHTsB1/b2KZKkIQnov9ONS6mgtPqze uc3/hZRD7VY8DWGR1Ffui2ppVgH9Ky3LuZ7TJtLjqE56diaJ/fut2lVmn 8ZAcbhzfs+Tf5ZpcJDfHxEDDUoxVNEeIFK8XeJAvY7MoyTSXDi10Rkwav RsbJ3yg7L/+R0JG77FZfqkqdeK1HpUNsWRedPW2Tzc/H5YPJ6vW2q9Sll l8JNmiLQWShyZdeN/iHmObsNvXtKyB9vfi/6grwA8jXKpPRGhFtneT2Kw A==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857346" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857346" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:18 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656237" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656237" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:18 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 08/25] x86/fpu/xstate: WARN if normal fpstate contains kernel dynamic xfeatures Date: Thu, 14 Sep 2023 02:33:08 -0400 Message-Id: <20230914063325.85503-9-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org fpu_kernel_dynamic_xfeatures now are __ONLY__ enabled by guest kernel and used for guest fpstate, i.e., none for normal fpstate. The bits are added when guest fpstate is allocated and fpstate->is_guest set to %true. For normal fpstate, the bits should have been removed when init system FPU settings, WARN_ONCE() if normal fpstate contains kernel dynamic xfeatures before xsaves is executed. Signed-off-by: Yang Weijiang --- arch/x86/kernel/fpu/xstate.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h index 9c6e3ca05c5c..c2b33a5db53d 100644 --- a/arch/x86/kernel/fpu/xstate.h +++ b/arch/x86/kernel/fpu/xstate.h @@ -186,6 +186,9 @@ static inline void os_xsave(struct fpstate *fpstate) WARN_ON_FPU(!alternatives_patched); xfd_validate_state(fpstate, mask, false); + WARN_ON_FPU(!fpstate->is_guest && + (mask & fpu_kernel_dynamic_xfeatures)); + XSTATE_XSAVE(&fpstate->regs.xsave, lmask, hmask, err); /* We should never fault when copying to a kernel buffer: */ From patchwork Thu Sep 14 06:33:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384911 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A1D63EDE99F for ; Thu, 14 Sep 2023 09:38:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237544AbjINJij (ORCPT ); Thu, 14 Sep 2023 05:38:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49822 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237270AbjINJiX (ORCPT ); Thu, 14 Sep 2023 05:38:23 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 423791BFF; Thu, 14 Sep 2023 02:38:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684299; x=1726220299; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=fmKaQuGnM9SpdEUHMKDlC8GkY5+N9nPibVd+P2rGn30=; b=KfGyQVvH5gvUg9pKIOpTEgtYI2AmdhU9qnyMvEGGL1yPEehDE3BIOFVo DPdZukVNeY86oYz61N/+ucSZpkvHM5DIs6cIu2CtrIpb6ApkpUnRRexiz Tp99laEBJsP3C5POiodv2GfLQyd41XpklYSfyBvffXEYuBmvrncSnHdfo 3SOBvJL1eJwt7e+yZeIqZ+zztponf7dpyTRXo6C6NP2XosJEfiU5ifokQ FCGf46ZfltqaAlA6dwamv53p8O9HPko8ERZwJGUHgwTiYIj+yjlLjF+qc VTFtwJReA9YxfqfgNcsZTb3yzl+w7ZAF0VhUL6/V2HplQS0VcIXjAJY0f Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857351" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857351" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:18 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656240" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656240" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:18 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 09/25] KVM: x86: Rework cpuid_get_supported_xcr0() to operate on vCPU data Date: Thu, 14 Sep 2023 02:33:09 -0400 Message-Id: <20230914063325.85503-10-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Sean Christopherson Rework and rename cpuid_get_supported_xcr0() to explicitly operate on vCPU state, i.e. on a vCPU's CPUID state. Prior to commit 275a87244ec8 ("KVM: x86: Don't adjust guest's CPUID.0x12.1 (allowed SGX enclave XFRM)"), KVM incorrectly fudged guest CPUID at runtime, which in turn necessitated massaging the incoming CPUID state for KVM_SET_CPUID{2} so as not to run afoul of kvm_cpuid_check_equal(). Opportunistically move the helper below kvm_update_cpuid_runtime() to make it harder to repeat the mistake of querying supported XCR0 for runtime updates. No functional change intended. Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/kvm/cpuid.c | 33 ++++++++++++++++----------------- 1 file changed, 16 insertions(+), 17 deletions(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 0544e30b4946..7c3e4a550ca7 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -247,21 +247,6 @@ void kvm_update_pv_runtime(struct kvm_vcpu *vcpu) vcpu->arch.pv_cpuid.features = best->eax; } -/* - * Calculate guest's supported XCR0 taking into account guest CPUID data and - * KVM's supported XCR0 (comprised of host's XCR0 and KVM_SUPPORTED_XCR0). - */ -static u64 cpuid_get_supported_xcr0(struct kvm_cpuid_entry2 *entries, int nent) -{ - struct kvm_cpuid_entry2 *best; - - best = cpuid_entry2_find(entries, nent, 0xd, 0); - if (!best) - return 0; - - return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0; -} - static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries, int nent) { @@ -312,6 +297,21 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu) } EXPORT_SYMBOL_GPL(kvm_update_cpuid_runtime); +/* + * Calculate guest's supported XCR0 taking into account guest CPUID data and + * KVM's supported XCR0 (comprised of host's XCR0 and KVM_SUPPORTED_XCR0). + */ +static u64 vcpu_get_supported_xcr0(struct kvm_vcpu *vcpu) +{ + struct kvm_cpuid_entry2 *best; + + best = kvm_find_cpuid_entry_index(vcpu, 0xd, 0); + if (!best) + return 0; + + return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0; +} + static bool kvm_cpuid_has_hyperv(struct kvm_cpuid_entry2 *entries, int nent) { struct kvm_cpuid_entry2 *entry; @@ -357,8 +357,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) kvm_apic_set_version(vcpu); } - vcpu->arch.guest_supported_xcr0 = - cpuid_get_supported_xcr0(vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent); + vcpu->arch.guest_supported_xcr0 = vcpu_get_supported_xcr0(vcpu); /* * FP+SSE can always be saved/restored via KVM_{G,S}ET_XSAVE, even if From patchwork Thu Sep 14 06:33:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384912 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 26AA9EDE9A1 for ; Thu, 14 Sep 2023 09:38:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237563AbjINJik (ORCPT ); Thu, 14 Sep 2023 05:38:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49824 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237286AbjINJiX (ORCPT ); Thu, 14 Sep 2023 05:38:23 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7CC9FCCD; Thu, 14 Sep 2023 02:38:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684299; x=1726220299; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=joiI9l6YnLOfTvcebVDodp/6FgyILV4huxPWCPMd3T8=; b=CL2wMSn3I+HM850gOAz2L4CkGL6NJPI56Ne6/giq7HeMlaScMwP/bkUj EdPPe1OjuA6KBxVEkrMMyzRkGRsPf1A3o3HT6xtHxYQxJuPJaCx68HA56 XZwqY3piMi3ycKQzMHi0xEsdgd2i2sBC9e8CLSmhm3xHb2mwSz03RDxNm YGv/qPmM/kCwR7ue8dBY/NJQVGsr4ugR/5ELDQ86kBz+vy6KDDVUa8Dqo 6PBpjA4MdZHAzfjqUmfGgmYemeiNHRYiOvDjaMuzJrG88YIfM4rlg9uzw mkktM2qUlAr+EUmt0BhyOyAM/bkrxWJL4rrmKnmVDi9fYJrunAgjd1kNn g==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857358" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857358" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:19 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656243" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656243" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:18 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 10/25] KVM: x86: Add kvm_msr_{read,write}() helpers Date: Thu, 14 Sep 2023 02:33:10 -0400 Message-Id: <20230914063325.85503-11-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Wrap __kvm_{get,set}_msr() into two new helpers for KVM usage and use the helpers to replace existing usage of the raw functions. kvm_msr_{read,write}() are KVM-internal helpers, i.e. used when KVM needs to get/set a MSR value for emulating CPU behavior. Suggested-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/include/asm/kvm_host.h | 4 +++- arch/x86/kvm/cpuid.c | 2 +- arch/x86/kvm/x86.c | 16 +++++++++++++--- 3 files changed, 17 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 1a4def36d5bb..0fc5e6312e93 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1956,7 +1956,9 @@ void kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu); void kvm_enable_efer_bits(u64); bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer); -int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, bool host_initiated); + +int kvm_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data); +int kvm_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data); int kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data); int kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data); int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 7c3e4a550ca7..1f206caec559 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -1531,7 +1531,7 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx, *edx = entry->edx; if (function == 7 && index == 0) { u64 data; - if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) && + if (!kvm_msr_read(vcpu, MSR_IA32_TSX_CTRL, &data) && (data & TSX_CTRL_CPUID_CLEAR)) *ebx &= ~(F(RTM) | F(HLE)); } else if (function == 0x80000007) { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 6c9c81e82e65..e0b55c043dab 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1917,8 +1917,8 @@ static int kvm_set_msr_ignored_check(struct kvm_vcpu *vcpu, * Returns 0 on success, non-0 otherwise. * Assumes vcpu_load() was already called. */ -int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, - bool host_initiated) +static int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, + bool host_initiated) { struct msr_data msr; int ret; @@ -1944,6 +1944,16 @@ int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, return ret; } +int kvm_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data) +{ + return __kvm_set_msr(vcpu, index, data, true); +} + +int kvm_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data) +{ + return __kvm_get_msr(vcpu, index, data, true); +} + static int kvm_get_msr_ignored_check(struct kvm_vcpu *vcpu, u32 index, u64 *data, bool host_initiated) { @@ -12082,7 +12092,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) MSR_IA32_MISC_ENABLE_BTS_UNAVAIL; __kvm_set_xcr(vcpu, 0, XFEATURE_MASK_FP); - __kvm_set_msr(vcpu, MSR_IA32_XSS, 0, true); + kvm_msr_write(vcpu, MSR_IA32_XSS, 0); } /* All GPRs except RDX (handled below) are zeroed on RESET/INIT. */ From patchwork Thu Sep 14 06:33:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384914 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4C83FEDE99E for ; Thu, 14 Sep 2023 09:38:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237603AbjINJil (ORCPT ); Thu, 14 Sep 2023 05:38:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49858 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237308AbjINJiY (ORCPT ); Thu, 14 Sep 2023 05:38:24 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3D6211BFF; Thu, 14 Sep 2023 02:38:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684300; x=1726220300; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=faimbtNjTig8ktvwEdQFWKOpWka/XGI0e5+YlNc3lTQ=; b=Ge1ePs9NdamCIEJc8g9sVnh5FWpbq1OianxLrjLI64n+mLOzr8RE2Vf8 cqynR8qaTy7SeapKsU0w+Wk8LaDW5v/bS0jb/mweXFBL/XwAGJ5C4nv+R 8jeuDD1uwypS73LPLfhy6+QwfQmI+nFmGXGb9RXv/GlQLkYQFvpMEqrEq qv/iH22Q7KhqkccuvGEW51pf/931cBDry/vUZBJJMBDXUC7brj/z6RRaX BsVFzwg5vWGdEF+ZXDScZKdBwtsQLdRqrGMtCpbgayyCs27I1x5iJGE6t BOpTcBruVmdarA0l8JlqxS5ZzwZYr0cVAeU7wAXIkaNSyHGn/WOGflzp2 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857363" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857363" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:19 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656246" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656246" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:19 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 11/25] KVM: x86: Report XSS as to-be-saved if there are supported features Date: Thu, 14 Sep 2023 02:33:11 -0400 Message-Id: <20230914063325.85503-12-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Sean Christopherson Add MSR_IA32_XSS to list of MSRs reported to userspace if supported_xss is non-zero, i.e. KVM supports at least one XSS based feature. Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/kvm/x86.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e0b55c043dab..1258d1d6dd52 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1464,6 +1464,7 @@ static const u32 msrs_to_save_base[] = { MSR_IA32_UMWAIT_CONTROL, MSR_IA32_XFD, MSR_IA32_XFD_ERR, + MSR_IA32_XSS, }; static const u32 msrs_to_save_pmu[] = { @@ -7195,6 +7196,10 @@ static void kvm_probe_msr_to_save(u32 msr_index) if (!(kvm_get_arch_capabilities() & ARCH_CAP_TSX_CTRL_MSR)) return; break; + case MSR_IA32_XSS: + if (!kvm_caps.supported_xss) + return; + break; default: break; } From patchwork Thu Sep 14 06:33:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384913 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D2B80EDE9A2 for ; Thu, 14 Sep 2023 09:38:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237573AbjINJil (ORCPT ); Thu, 14 Sep 2023 05:38:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49870 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237307AbjINJiY (ORCPT ); Thu, 14 Sep 2023 05:38:24 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 42BA81FC0; Thu, 14 Sep 2023 02:38:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684300; x=1726220300; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SgNlcd5ktSWMn2hdoXAHgtutH/xRPayw7k0UwTE2UoM=; b=R0jpc+kkwyUnD4vj0n5qK1M8z37Rw40Dnf/udaVAiWOsWK2eCysg7rtK vUIi4OqMn97Ilru9fx4+2O1OslChFVM6/KWyU9MDo3TzRyl/lw6/GQiAs Ey0T9JZ8+EzRuXIg93xvOSk4d3BsA8Jhp/Z8PACdNcrqoC65c5oIeFxtA KgUuvw7FfrMEimuF1I0Di3HOGQkruxeLtIdjBgSAOSWcfJPjpK/KhuB8F cMz213jdlLgC3sFBP7nSlx2Lg5EEpTX1IczxGvu09MmeqIuE3xQOZiM9t z343GFk2jLxEhAV0kmYynjlFSmQiOwOMiqZZOhbBR5nQMTBcUAEkh0ALZ Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857365" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857365" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:20 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656249" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656249" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:19 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Zhang Yi Z Subject: [PATCH v6 12/25] KVM: x86: Refresh CPUID on write to guest MSR_IA32_XSS Date: Thu, 14 Sep 2023 02:33:12 -0400 Message-Id: <20230914063325.85503-13-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Update CPUID.(EAX=0DH,ECX=1).EBX to reflect current required xstate size due to XSS MSR modification. CPUID(EAX=0DH,ECX=1).EBX reports the required storage size of all enabled xstate features in (XCR0 | IA32_XSS). The CPUID value can be used by guest before allocate sufficient xsave buffer. Note, KVM does not yet support any XSS based features, i.e. supported_xss is guaranteed to be zero at this time. Opportunistically modify XSS write access logic as: if !guest_cpuid_has(), write initiated from host is allowed iff the write is reset operaiton, i.e., data == 0, reject host_initiated non-reset write and any guest write. Suggested-by: Sean Christopherson Co-developed-by: Zhang Yi Z Signed-off-by: Zhang Yi Z Signed-off-by: Yang Weijiang --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/cpuid.c | 15 ++++++++++++++- arch/x86/kvm/x86.c | 13 +++++++++---- 3 files changed, 24 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 0fc5e6312e93..d77b030e996c 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -803,6 +803,7 @@ struct kvm_vcpu_arch { u64 xcr0; u64 guest_supported_xcr0; + u64 guest_supported_xss; struct kvm_pio_request pio; void *pio_data; diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 1f206caec559..4e7a820cba62 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -275,7 +275,8 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e best = cpuid_entry2_find(entries, nent, 0xD, 1); if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) || cpuid_entry_has(best, X86_FEATURE_XSAVEC))) - best->ebx = xstate_required_size(vcpu->arch.xcr0, true); + best->ebx = xstate_required_size(vcpu->arch.xcr0 | + vcpu->arch.ia32_xss, true); best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent); if (kvm_hlt_in_guest(vcpu->kvm) && best && @@ -312,6 +313,17 @@ static u64 vcpu_get_supported_xcr0(struct kvm_vcpu *vcpu) return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0; } +static u64 vcpu_get_supported_xss(struct kvm_vcpu *vcpu) +{ + struct kvm_cpuid_entry2 *best; + + best = kvm_find_cpuid_entry_index(vcpu, 0xd, 1); + if (!best) + return 0; + + return (best->ecx | ((u64)best->edx << 32)) & kvm_caps.supported_xss; +} + static bool kvm_cpuid_has_hyperv(struct kvm_cpuid_entry2 *entries, int nent) { struct kvm_cpuid_entry2 *entry; @@ -358,6 +370,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) } vcpu->arch.guest_supported_xcr0 = vcpu_get_supported_xcr0(vcpu); + vcpu->arch.guest_supported_xss = vcpu_get_supported_xss(vcpu); /* * FP+SSE can always be saved/restored via KVM_{G,S}ET_XSAVE, even if diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 1258d1d6dd52..9a616d84bd39 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3795,20 +3795,25 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) vcpu->arch.ia32_tsc_adjust_msr += adj; } break; - case MSR_IA32_XSS: - if (!msr_info->host_initiated && - !guest_cpuid_has(vcpu, X86_FEATURE_XSAVES)) + case MSR_IA32_XSS: { + bool host_msr_reset = msr_info->host_initiated && data == 0; + + if (!guest_cpuid_has(vcpu, X86_FEATURE_XSAVES) && + (!host_msr_reset || !msr_info->host_initiated)) return 1; /* * KVM supports exposing PT to the guest, but does not support * IA32_XSS[bit 8]. Guests have to use RDMSR/WRMSR rather than * XSAVES/XRSTORS to save/restore PT MSRs. */ - if (data & ~kvm_caps.supported_xss) + if (data & ~vcpu->arch.guest_supported_xss) return 1; + if (vcpu->arch.ia32_xss == data) + break; vcpu->arch.ia32_xss = data; kvm_update_cpuid_runtime(vcpu); break; + } case MSR_SMI_COUNT: if (!msr_info->host_initiated) return 1; From patchwork Thu Sep 14 06:33:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384915 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A1B6EDE99F for ; Thu, 14 Sep 2023 09:38:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237611AbjINJin (ORCPT ); Thu, 14 Sep 2023 05:38:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49872 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237340AbjINJiY (ORCPT ); Thu, 14 Sep 2023 05:38:24 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8A09C1FC2; Thu, 14 Sep 2023 02:38:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684300; x=1726220300; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=H3+t7eF715WT9lWgIVo67gkWhRRXLUl+sRA4/HxOaqw=; b=YYYj2j8DPOh8K5Dm6C9vFbGk5IL2uogv1psLVfXl6wT0Fv1b9XdKlpZ2 fuAyFxiLwx+ruYgj0TMiEbQGUUSyzNQehu37wIaIvgXFZgD+RQruuj5Tg oJ5wLv3ClUUY74dftn2WWdQcU+9PZtCeg3mQNji0/DRT1nTvR+7mRHwJF s+YTFD2EhOMnyvAKNaC89f42jdmqsUgVEDCH8ZVwZ6s/pF89vkzt8vhUO pSmc9DE6H+ZGVuEhGjYi3abANzvINzNadBstFf6kdRPLpfVyUy0Gy+dSb RAaBDkzCXAVEKcRUvg5BdfcDAgA8swV7+MUz75twW9KLIcD+93Nq0/MxL Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857379" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857379" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:20 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656254" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656254" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:19 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 13/25] KVM: x86: Initialize kvm_caps.supported_xss Date: Thu, 14 Sep 2023 02:33:13 -0400 Message-Id: <20230914063325.85503-14-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Set original kvm_caps.supported_xss to (host_xss & KVM_SUPPORTED_XSS) if XSAVES is supported. host_xss contains the host supported xstate feature bits for thread FPU context switch, KVM_SUPPORTED_XSS includes all KVM enabled XSS feature bits, the resulting value represents the supervisor xstates that are available to guest and are backed by host FPU framework for swapping {guest,host} XSAVE-managed registers/MSRs. Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/kvm/x86.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9a616d84bd39..66edbed25db8 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -226,6 +226,8 @@ static struct kvm_user_return_msrs __percpu *user_return_msrs; | XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \ | XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE) +#define KVM_SUPPORTED_XSS 0 + u64 __read_mostly host_efer; EXPORT_SYMBOL_GPL(host_efer); @@ -9515,12 +9517,13 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) host_xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK); kvm_caps.supported_xcr0 = host_xcr0 & KVM_SUPPORTED_XCR0; } + if (boot_cpu_has(X86_FEATURE_XSAVES)) { + rdmsrl(MSR_IA32_XSS, host_xss); + kvm_caps.supported_xss = host_xss & KVM_SUPPORTED_XSS; + } rdmsrl_safe(MSR_EFER, &host_efer); - if (boot_cpu_has(X86_FEATURE_XSAVES)) - rdmsrl(MSR_IA32_XSS, host_xss); - kvm_init_pmu_capability(ops->pmu_ops); if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES)) From patchwork Thu Sep 14 06:33:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384916 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D4134EDE99C for ; Thu, 14 Sep 2023 09:38:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237616AbjINJip (ORCPT ); Thu, 14 Sep 2023 05:38:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49886 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237369AbjINJi1 (ORCPT ); Thu, 14 Sep 2023 05:38:27 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 323691FC6; Thu, 14 Sep 2023 02:38:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684301; x=1726220301; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=qlHnWyHRSXxxRhVYMCd0RGmIeUUDEFAPvADSMDz/aBU=; b=fpU2R+va+0unNcE5frrR2g6O+qIlBIG9i4RYfEq15BkcmR/Z89oUs3iJ apkzLFCbBRnTJBjuPEH3m5DjGAQI3WAzbSpHyzFJiom1ux8EEPCb7lkuu KFImlQnT5SBM+1komwCLnYsOHKoEO+o1dpbER6xGRn8NMY3uDCmuGvYE2 673uV+fhOOZAOsF5U+eGtX0ByR7CGFr2ZXK0NR618vjajN/e15R6NyEQ6 zNB4JS02LMJRlo7/LmqLHTAkRTE6gMtQLZmqfnnejXXl4DI3tWKKNRyE1 ErLoMdeHZL2gJ5BiZ6t1JjQ9asG7hCar25TKhGnp9Z/mqFrn015LxPMQY A==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857385" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857385" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:20 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656260" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656260" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:20 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 14/25] KVM: x86: Load guest FPU state when access XSAVE-managed MSRs Date: Thu, 14 Sep 2023 02:33:14 -0400 Message-Id: <20230914063325.85503-15-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Sean Christopherson Load the guest's FPU state if userspace is accessing MSRs whose values are managed by XSAVES. Introduce two helpers, kvm_{get,set}_xstate_msr(), to facilitate access to such kind of MSRs. If MSRs supported in kvm_caps.supported_xss are passed through to guest, the guest MSRs are swapped with host's before vCPU exits to userspace and after it re-enters kernel before next VM-entry. Because the modified code is also used for the KVM_GET_MSRS device ioctl(), explicitly check @vcpu is non-null before attempting to load guest state. The XSS supporting MSRs cannot be retrieved via the device ioctl() without loading guest FPU state (which doesn't exist). Note that guest_cpuid_has() is not queried as host userspace is allowed to access MSRs that have not been exposed to the guest, e.g. it might do KVM_SET_MSRS prior to KVM_SET_CPUID2. Signed-off-by: Sean Christopherson Co-developed-by: Yang Weijiang Signed-off-by: Yang Weijiang --- arch/x86/kvm/x86.c | 30 +++++++++++++++++++++++++++++- arch/x86/kvm/x86.h | 24 ++++++++++++++++++++++++ 2 files changed, 53 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 66edbed25db8..a091764bf1d2 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -133,6 +133,9 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2); static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2); static DEFINE_MUTEX(vendor_module_lock); +static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu); +static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu); + struct kvm_x86_ops kvm_x86_ops __read_mostly; #define KVM_X86_OP(func) \ @@ -4372,6 +4375,22 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) } EXPORT_SYMBOL_GPL(kvm_get_msr_common); +static const u32 xstate_msrs[] = { + MSR_IA32_U_CET, MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, + MSR_IA32_PL2_SSP, MSR_IA32_PL3_SSP, +}; + +static bool is_xstate_msr(u32 index) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(xstate_msrs); i++) { + if (index == xstate_msrs[i]) + return true; + } + return false; +} + /* * Read or write a bunch of msrs. All parameters are kernel addresses. * @@ -4382,11 +4401,20 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs, int (*do_msr)(struct kvm_vcpu *vcpu, unsigned index, u64 *data)) { + bool fpu_loaded = false; int i; - for (i = 0; i < msrs->nmsrs; ++i) + for (i = 0; i < msrs->nmsrs; ++i) { + if (vcpu && !fpu_loaded && kvm_caps.supported_xss && + is_xstate_msr(entries[i].index)) { + kvm_load_guest_fpu(vcpu); + fpu_loaded = true; + } if (do_msr(vcpu, entries[i].index, &entries[i].data)) break; + } + if (fpu_loaded) + kvm_put_guest_fpu(vcpu); return i; } diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 1e7be1f6ab29..9a8e3a84eaf4 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -540,4 +540,28 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size, unsigned int port, void *data, unsigned int count, int in); +/* + * Lock and/or reload guest FPU and access xstate MSRs. For accesses initiated + * by host, guest FPU is loaded in __msr_io(). For accesses initiated by guest, + * guest FPU should have been loaded already. + */ + +static inline void kvm_get_xstate_msr(struct kvm_vcpu *vcpu, + struct msr_data *msr_info) +{ + KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm); + kvm_fpu_get(); + rdmsrl(msr_info->index, msr_info->data); + kvm_fpu_put(); +} + +static inline void kvm_set_xstate_msr(struct kvm_vcpu *vcpu, + struct msr_data *msr_info) +{ + KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm); + kvm_fpu_get(); + wrmsrl(msr_info->index, msr_info->data); + kvm_fpu_put(); +} + #endif From patchwork Thu Sep 14 06:33:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384917 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D5134EDE99E for ; Thu, 14 Sep 2023 09:38:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237631AbjINJiq (ORCPT ); Thu, 14 Sep 2023 05:38:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49836 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237379AbjINJi1 (ORCPT ); Thu, 14 Sep 2023 05:38:27 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 82A1A1FC8; Thu, 14 Sep 2023 02:38:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684301; x=1726220301; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=eOwVTkRE0fn4PxlTlhArnb0PaHsbEIKS1k0cy/K5iE4=; b=Igb7a9z+s0YwBvnrASfWT91tui2prdVT/lZsYqqNXonmC0KrdqfKp3vL xM1L6Yhgznb2uC/9WHCyeXe9Oi/R0eASlr48BX3rRNTL9bY9wrZDq33/v dVKVbVGYecyilH/RwdrSvyXbS5tM1ybnAAG+eRletcqt8L+tC7zMwGQBY gKnZLLZjVlUKVP2jJ2f5dvMe357Keuw6JahmykFZrUin5GvTO5FJISL7+ r4JLqOfWkgeuIdtRxVf7WDkTLOoazlk50w/6YkNddBBkZF602yqCje/hU j3c2laltFZg6XbWfDaNStIwrCTVxD+9lBMm/2NQbFs0xoWLzQOl6WXHju g==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857391" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857391" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:21 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656265" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656265" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:20 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 15/25] KVM: x86: Add fault checks for guest CR4.CET setting Date: Thu, 14 Sep 2023 02:33:15 -0400 Message-Id: <20230914063325.85503-16-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Check potential faults for CR4.CET setting per Intel SDM requirements. CET can be enabled if and only if CR0.WP == 1, i.e. setting CR4.CET == 1 faults if CR0.WP == 0 and setting CR0.WP == 0 fails if CR4.CET == 1. Reviewed-by: Chao Gao Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/kvm/x86.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index a091764bf1d2..dda9c7141ea1 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1006,6 +1006,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) (is_64_bit_mode(vcpu) || kvm_is_cr4_bit_set(vcpu, X86_CR4_PCIDE))) return 1; + if (!(cr0 & X86_CR0_WP) && kvm_is_cr4_bit_set(vcpu, X86_CR4_CET)) + return 1; + static_call(kvm_x86_set_cr0)(vcpu, cr0); kvm_post_set_cr0(vcpu, old_cr0, cr0); @@ -1217,6 +1220,9 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) return 1; } + if ((cr4 & X86_CR4_CET) && !kvm_is_cr0_bit_set(vcpu, X86_CR0_WP)) + return 1; + static_call(kvm_x86_set_cr4)(vcpu, cr4); kvm_post_set_cr4(vcpu, old_cr4, cr4); From patchwork Thu Sep 14 06:33:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384919 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0C40BEDE9A5 for ; Thu, 14 Sep 2023 09:38:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237657AbjINJis (ORCPT ); Thu, 14 Sep 2023 05:38:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49848 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237398AbjINJi2 (ORCPT ); Thu, 14 Sep 2023 05:38:28 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C14EE1FCA; Thu, 14 Sep 2023 02:38:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684301; x=1726220301; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VPJQf4U39E7DZwum/HGUJ0vCeET4v76K5tsmsMkVRMg=; b=lGwzX0Qb5ZuZQ70Fj6+m3l6vg6oZMquG+OQbxq5MMbRRGUzekiD1FVOU OGrUVh4BflXRTYbhMmYw/dNZQVf2VjBboDzpJoAPxrKXBqTSyxK/o1wIU FCd1J+wNKsxPxSRqmewM+Bykz/66eXoTjZk/sluMQyWbCVI0JzAdsyWnB ZLDy5SJa2MShWJii4+u+DbhT+C6RR9JxnnYzyceaigz1jPH81Qnk5o5w5 LiJ5/sauFu7PspVGFFlyh4KP4mM6G87N8j5nao87OfonYy1GfpJDfRo4q HC4TiHs1cfaEUlbVXmiBR8I6pYuL0uFgrQEAUNiNnnIUuXKYPWRqVfQ7Y w==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857396" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857396" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:21 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656270" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656270" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:21 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 16/25] KVM: x86: Report KVM supported CET MSRs as to-be-saved Date: Thu, 14 Sep 2023 02:33:16 -0400 Message-Id: <20230914063325.85503-17-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add CET MSRs to the list of MSRs reported to userspace if the feature, i.e. IBT or SHSTK, associated with the MSRs is supported by KVM. SSP can only be read via RDSSP. Writing even requires destructive and potentially faulting operations such as SAVEPREVSSP/RSTORSSP or SETSSBSY/CLRSSBSY. Let the host use a pseudo-MSR that is just a wrapper for the GUEST_SSP field of the VMCS. Suggested-by: Chao Gao Signed-off-by: Yang Weijiang --- arch/x86/include/uapi/asm/kvm_para.h | 1 + arch/x86/kvm/vmx/vmx.c | 2 ++ arch/x86/kvm/x86.c | 18 ++++++++++++++++++ 3 files changed, 21 insertions(+) diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h index 6e64b27b2c1e..9864bbcf2470 100644 --- a/arch/x86/include/uapi/asm/kvm_para.h +++ b/arch/x86/include/uapi/asm/kvm_para.h @@ -58,6 +58,7 @@ #define MSR_KVM_ASYNC_PF_INT 0x4b564d06 #define MSR_KVM_ASYNC_PF_ACK 0x4b564d07 #define MSR_KVM_MIGRATION_CONTROL 0x4b564d08 +#define MSR_KVM_SSP 0x4b564d09 struct kvm_steal_time { __u64 steal; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 72e3943f3693..9409753f45b0 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7009,6 +7009,8 @@ static bool vmx_has_emulated_msr(struct kvm *kvm, u32 index) case MSR_AMD64_TSC_RATIO: /* This is AMD only. */ return false; + case MSR_KVM_SSP: + return kvm_cpu_cap_has(X86_FEATURE_SHSTK); default: return true; } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index dda9c7141ea1..73b45351c0fc 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1476,6 +1476,9 @@ static const u32 msrs_to_save_base[] = { MSR_IA32_XFD, MSR_IA32_XFD_ERR, MSR_IA32_XSS, + MSR_IA32_U_CET, MSR_IA32_S_CET, + MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, MSR_IA32_PL2_SSP, + MSR_IA32_PL3_SSP, MSR_IA32_INT_SSP_TAB, }; static const u32 msrs_to_save_pmu[] = { @@ -1576,6 +1579,7 @@ static const u32 emulated_msrs_all[] = { MSR_K7_HWCR, MSR_KVM_POLL_CONTROL, + MSR_KVM_SSP, }; static u32 emulated_msrs[ARRAY_SIZE(emulated_msrs_all)]; @@ -7241,6 +7245,20 @@ static void kvm_probe_msr_to_save(u32 msr_index) if (!kvm_caps.supported_xss) return; break; + case MSR_IA32_U_CET: + case MSR_IA32_S_CET: + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) && + !kvm_cpu_cap_has(X86_FEATURE_IBT)) + return; + break; + case MSR_IA32_INT_SSP_TAB: + if (!kvm_cpu_cap_has(X86_FEATURE_LM)) + return; + fallthrough; + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK)) + return; + break; default: break; } From patchwork Thu Sep 14 06:33:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384921 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 57B2FEDE99C for ; Thu, 14 Sep 2023 09:38:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237408AbjINJiv (ORCPT ); Thu, 14 Sep 2023 05:38:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49870 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237397AbjINJi2 (ORCPT ); Thu, 14 Sep 2023 05:38:28 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 345EA1FCC; Thu, 14 Sep 2023 02:38:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684302; x=1726220302; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TS/VKDE1O/rn+ikFL14LmVI7YbiOfvQZpzAeNZ2MVYI=; b=GdteVxjvqstWaf0NMEnBtwNaZgsI5DHcD6LbIGTz1qCGlT1hzU9KMJb3 06NcRcUS3RA4U40d+PkWekR9KOR0DoGf6UMnO0cKV1h4eaHwVAAFPwCpz 1Z7hyfVK+TAnZJKi8vzdAIRFKWuIhnSpK/bhug5q3OD40rV1mZOeiK+hv mpBneFdJL74NHE0Sar4lfRAm6qsZdayiJKzZn9rZmfepGM/vfv7OQQebQ 8ppYq0C15Yj8FwsAejyA82s2bbjSOMhA5UGmyocUdZdqIEQaCyY0ViAoH 7SAEwu/uC9agCtQGor8E3942CpLaKtQtpQX3gdHdvkoVgWqDxtk/TagL3 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857398" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857398" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:21 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656273" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656273" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:21 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Zhang Yi Z Subject: [PATCH v6 17/25] KVM: VMX: Introduce CET VMCS fields and control bits Date: Thu, 14 Sep 2023 02:33:17 -0400 Message-Id: <20230914063325.85503-18-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Control-flow Enforcement Technology (CET) is a kind of CPU feature used to prevent Return/CALL/Jump-Oriented Programming (ROP/COP/JOP) attacks. It provides two sub-features(SHSTK,IBT) to defend against ROP/COP/JOP style control-flow subversion attacks. Shadow Stack (SHSTK): A shadow stack is a second stack used exclusively for control transfer operations. The shadow stack is separate from the data/normal stack and can be enabled individually in user and kernel mode. When shadow stack is enabled, CALL pushes the return address on both the data and shadow stack. RET pops the return address from both stacks and compares them. If the return addresses from the two stacks do not match, the processor generates a #CP. Indirect Branch Tracking (IBT): IBT introduces instruction(ENDBRANCH)to mark valid target addresses of indirect branches (CALL, JMP etc...). If an indirect branch is executed and the next instruction is _not_ an ENDBRANCH, the processor generates a #CP. These instruction behaves as a NOP on platforms that have no CET. Several new CET MSRs are defined to support CET: MSR_IA32_{U,S}_CET: CET settings for {user,supervisor} CET respectively. MSR_IA32_PL{0,1,2,3}_SSP: SHSTK pointer linear address for CPL{0,1,2,3}. MSR_IA32_INT_SSP_TAB: Linear address of SHSTK pointer table, whose entry is indexed by IST of interrupt gate desc. Two XSAVES state bits are introduced for CET: IA32_XSS:[bit 11]: Control saving/restoring user mode CET states IA32_XSS:[bit 12]: Control saving/restoring supervisor mode CET states. Six VMCS fields are introduced for CET: {HOST,GUEST}_S_CET: Stores CET settings for kernel mode. {HOST,GUEST}_SSP: Stores current active SSP. {HOST,GUEST}_INTR_SSP_TABLE: Stores current active MSR_IA32_INT_SSP_TAB. On Intel platforms, two additional bits are defined in VM_EXIT and VM_ENTRY control fields: If VM_EXIT_LOAD_CET_STATE = 1, host CET states are loaded from following VMCS fields at VM-Exit: HOST_S_CET HOST_SSP HOST_INTR_SSP_TABLE If VM_ENTRY_LOAD_CET_STATE = 1, guest CET states are loaded from following VMCS fields at VM-Entry: GUEST_S_CET GUEST_SSP GUEST_INTR_SSP_TABLE Reviewed-by: Chao Gao Co-developed-by: Zhang Yi Z Signed-off-by: Zhang Yi Z Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/include/asm/vmx.h | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index 0e73616b82f3..451fd4f4fedc 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -104,6 +104,7 @@ #define VM_EXIT_CLEAR_BNDCFGS 0x00800000 #define VM_EXIT_PT_CONCEAL_PIP 0x01000000 #define VM_EXIT_CLEAR_IA32_RTIT_CTL 0x02000000 +#define VM_EXIT_LOAD_CET_STATE 0x10000000 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR 0x00036dff @@ -117,6 +118,7 @@ #define VM_ENTRY_LOAD_BNDCFGS 0x00010000 #define VM_ENTRY_PT_CONCEAL_PIP 0x00020000 #define VM_ENTRY_LOAD_IA32_RTIT_CTL 0x00040000 +#define VM_ENTRY_LOAD_CET_STATE 0x00100000 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x000011ff @@ -345,6 +347,9 @@ enum vmcs_field { GUEST_PENDING_DBG_EXCEPTIONS = 0x00006822, GUEST_SYSENTER_ESP = 0x00006824, GUEST_SYSENTER_EIP = 0x00006826, + GUEST_S_CET = 0x00006828, + GUEST_SSP = 0x0000682a, + GUEST_INTR_SSP_TABLE = 0x0000682c, HOST_CR0 = 0x00006c00, HOST_CR3 = 0x00006c02, HOST_CR4 = 0x00006c04, @@ -357,6 +362,9 @@ enum vmcs_field { HOST_IA32_SYSENTER_EIP = 0x00006c12, HOST_RSP = 0x00006c14, HOST_RIP = 0x00006c16, + HOST_S_CET = 0x00006c18, + HOST_SSP = 0x00006c1a, + HOST_INTR_SSP_TABLE = 0x00006c1c }; /* From patchwork Thu Sep 14 06:33:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384920 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8149FEDE99F for ; Thu, 14 Sep 2023 09:38:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237669AbjINJit (ORCPT ); Thu, 14 Sep 2023 05:38:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49914 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237403AbjINJi2 (ORCPT ); Thu, 14 Sep 2023 05:38:28 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 76EA31FCD; Thu, 14 Sep 2023 02:38:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684302; x=1726220302; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+nX7XbfVJDmuajtuURcflfcN8G9QEQqgxp6V39Gi2/4=; b=E0BwAGNpX3ZGaRcX3ukmz3L68KkLiByzH4Uk5e/pIHyqAc+0vPRahE8r YXrYMYsDHUnRSurguhEP38+V+rTSBaleENKqAZlX3AEGLQdPn0sSR1p/X KLitecvwT0b65iHHROUaAqVZjuSr/yaaoyctYECmszOo1MthqIm8xgcGu OXvpjCJMe7XZXbUMCsmzmLeRGitm7tv5oTE0I2zOxoOoW941VMByN52pY YWjPR4SKkJC7a7TL9oW+P0vTVcIrNoPUTyN9n8nAJMKLuX473GQ4iTH75 J0WF88st1/xsYbqiuAbABxH1yQTsBb84/titEFpiX2ZT/jyDSiB4AR5CJ A==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857407" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857407" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:22 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656279" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656279" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:21 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 18/25] KVM: x86: Use KVM-governed feature framework to track "SHSTK/IBT enabled" Date: Thu, 14 Sep 2023 02:33:18 -0400 Message-Id: <20230914063325.85503-19-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Use the governed feature framework to track whether X86_FEATURE_SHSTK and X86_FEATURE_IBT features can be used by userspace and guest, i.e., the features can be used iff both KVM and guest CPUID can support them. Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/kvm/governed_features.h | 2 ++ arch/x86/kvm/vmx/vmx.c | 2 ++ 2 files changed, 4 insertions(+) diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h index 423a73395c10..db7e21c5ecc2 100644 --- a/arch/x86/kvm/governed_features.h +++ b/arch/x86/kvm/governed_features.h @@ -16,6 +16,8 @@ KVM_GOVERNED_X86_FEATURE(PAUSEFILTER) KVM_GOVERNED_X86_FEATURE(PFTHRESHOLD) KVM_GOVERNED_X86_FEATURE(VGIF) KVM_GOVERNED_X86_FEATURE(VNMI) +KVM_GOVERNED_X86_FEATURE(SHSTK) +KVM_GOVERNED_X86_FEATURE(IBT) #undef KVM_GOVERNED_X86_FEATURE #undef KVM_GOVERNED_FEATURE diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 9409753f45b0..fd5893b3a2c8 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7765,6 +7765,8 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_XSAVES); kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VMX); + kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_SHSTK); + kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_IBT); vmx_setup_uret_msrs(vmx); From patchwork Thu Sep 14 06:33:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384918 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D6596EDE9A0 for ; Thu, 14 Sep 2023 09:38:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237638AbjINJis (ORCPT ); Thu, 14 Sep 2023 05:38:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49926 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237402AbjINJi2 (ORCPT ); Thu, 14 Sep 2023 05:38:28 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D3DE91FCE; Thu, 14 Sep 2023 02:38:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684302; x=1726220302; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=n4Ly4tJ3v4uXUm/GZNY8ToGEs2H/HmEPxplAta3n6CA=; b=Gatj0u3pOfTkNpfzV0MGbxhNz8pik+P4VVTyHqLnndNj71eVUkYx/PS2 AAul1XA87xFWgp1WZbHGwgVmnoKyQOeYpi8sV5UwAEsKkmxQTaF+UoISd lvSFo39FXriizeoKt5xDklIp1DcpP8wCxAMGXyM4HsHaqJ8xjBTk4E/b9 k6+weMVoz6a3Gk0usvW84V3A6bIRvL2/FJz+q8MbKAHXNJN93vnjgD1fk L1btzBmQ+QBZ+UZFafoHkTOmMiqkMpBZVwXmKDF/e81XZeNbvd4B/usPh TMPbL09zcUZsGfIHFSWvrNUu4QENtjohpUzz48ZIafNeTPJ1KutdWi+9H A==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857413" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857413" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:22 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656283" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656283" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:22 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 19/25] KVM: VMX: Emulate read and write to CET MSRs Date: Thu, 14 Sep 2023 02:33:19 -0400 Message-Id: <20230914063325.85503-20-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add emulation interface for CET MSR access. The emulation code is split into common part and vendor specific part. The former does common check for MSRs and reads/writes directly from/to XSAVE-managed MSRs via the helpers while the latter accesses the MSRs linked to VMCS fields. Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/vmx.c | 18 +++++++++++ arch/x86/kvm/x86.c | 71 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 89 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index fd5893b3a2c8..9f4b56337251 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2111,6 +2111,15 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) else msr_info->data = vmx->pt_desc.guest.addr_a[index / 2]; break; + case MSR_IA32_S_CET: + msr_info->data = vmcs_readl(GUEST_S_CET); + break; + case MSR_KVM_SSP: + msr_info->data = vmcs_readl(GUEST_SSP); + break; + case MSR_IA32_INT_SSP_TAB: + msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE); + break; case MSR_IA32_DEBUGCTLMSR: msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL); break; @@ -2420,6 +2429,15 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) else vmx->pt_desc.guest.addr_a[index / 2] = data; break; + case MSR_IA32_S_CET: + vmcs_writel(GUEST_S_CET, data); + break; + case MSR_KVM_SSP: + vmcs_writel(GUEST_SSP, data); + break; + case MSR_IA32_INT_SSP_TAB: + vmcs_writel(GUEST_INTR_SSP_TABLE, data); + break; case MSR_IA32_PERF_CAPABILITIES: if (data && !vcpu_to_pmu(vcpu)->version) return 1; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 73b45351c0fc..c85ee42ab4f1 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1847,6 +1847,11 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type) } EXPORT_SYMBOL_GPL(kvm_msr_allowed); +#define CET_US_RESERVED_BITS GENMASK(9, 6) +#define CET_US_SHSTK_MASK_BITS GENMASK(1, 0) +#define CET_US_IBT_MASK_BITS (GENMASK_ULL(5, 2) | GENMASK_ULL(63, 10)) +#define CET_US_LEGACY_BITMAP_BASE(data) ((data) >> 12) + /* * Write @data into the MSR specified by @index. Select MSR specific fault * checks are bypassed if @host_initiated is %true. @@ -1856,6 +1861,7 @@ EXPORT_SYMBOL_GPL(kvm_msr_allowed); static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data, bool host_initiated) { + bool host_msr_reset = host_initiated && data == 0; struct msr_data msr; switch (index) { @@ -1906,6 +1912,46 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data, data = (u32)data; break; + case MSR_IA32_U_CET: + case MSR_IA32_S_CET: + if (host_msr_reset && (kvm_cpu_cap_has(X86_FEATURE_SHSTK) || + kvm_cpu_cap_has(X86_FEATURE_IBT))) + break; + if (!guest_can_use(vcpu, X86_FEATURE_SHSTK) && + !guest_can_use(vcpu, X86_FEATURE_IBT)) + return 1; + if (data & CET_US_RESERVED_BITS) + return 1; + if (!guest_can_use(vcpu, X86_FEATURE_SHSTK) && + (data & CET_US_SHSTK_MASK_BITS)) + return 1; + if (!guest_can_use(vcpu, X86_FEATURE_IBT) && + (data & CET_US_IBT_MASK_BITS)) + return 1; + if (!IS_ALIGNED(CET_US_LEGACY_BITMAP_BASE(data), 4)) + return 1; + + /* IBT can be suppressed iff the TRACKER isn't WAIT_ENDBR. */ + if ((data & CET_SUPPRESS) && (data & CET_WAIT_ENDBR)) + return 1; + break; + case MSR_IA32_INT_SSP_TAB: + if (!guest_cpuid_has(vcpu, X86_FEATURE_LM)) + return 1; + fallthrough; + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + case MSR_KVM_SSP: + if (host_msr_reset && kvm_cpu_cap_has(X86_FEATURE_SHSTK)) + break; + if (!guest_can_use(vcpu, X86_FEATURE_SHSTK)) + return 1; + if (index == MSR_KVM_SSP && !host_initiated) + return 1; + if (is_noncanonical_address(data, vcpu)) + return 1; + if (index != MSR_IA32_INT_SSP_TAB && !IS_ALIGNED(data, 4)) + return 1; + break; } msr.data = data; @@ -1949,6 +1995,23 @@ static int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, !guest_cpuid_has(vcpu, X86_FEATURE_RDPID)) return 1; break; + case MSR_IA32_U_CET: + case MSR_IA32_S_CET: + if (!guest_can_use(vcpu, X86_FEATURE_IBT) && + !guest_can_use(vcpu, X86_FEATURE_SHSTK)) + return 1; + break; + case MSR_IA32_INT_SSP_TAB: + if (!guest_cpuid_has(vcpu, X86_FEATURE_LM)) + return 1; + fallthrough; + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + case MSR_KVM_SSP: + if (!guest_can_use(vcpu, X86_FEATURE_SHSTK)) + return 1; + if (index == MSR_KVM_SSP && !host_initiated) + return 1; + break; } msr.index = index; @@ -4009,6 +4072,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) vcpu->arch.guest_fpu.xfd_err = data; break; #endif + case MSR_IA32_U_CET: + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + kvm_set_xstate_msr(vcpu, msr_info); + break; default: if (kvm_pmu_is_valid_msr(vcpu, msr)) return kvm_pmu_set_msr(vcpu, msr_info); @@ -4365,6 +4432,10 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) msr_info->data = vcpu->arch.guest_fpu.xfd_err; break; #endif + case MSR_IA32_U_CET: + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + kvm_get_xstate_msr(vcpu, msr_info); + break; default: if (kvm_pmu_is_valid_msr(vcpu, msr_info->index)) return kvm_pmu_get_msr(vcpu, msr_info); From patchwork Thu Sep 14 06:33:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384922 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 703CAEDE9A1 for ; Thu, 14 Sep 2023 09:38:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237449AbjINJiw (ORCPT ); Thu, 14 Sep 2023 05:38:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49956 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237415AbjINJi2 (ORCPT ); Thu, 14 Sep 2023 05:38:28 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 611C61FD6; Thu, 14 Sep 2023 02:38:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684303; x=1726220303; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=FIh7tQ4a/6RDY7CSVW6DQiHz8WUvFOSxCbxKQrBiXtw=; b=RTNOUhVPT+fzWnrT8nSyUgNgjU6ZmZz5xlZKATnOpd9HbQUz6JO8zHRi B86Buypd0R6nvRSyznsc9AFGWdzltMlcAwhqEzrdqemZHiE4xESYzhTbD 0kQoqUtTiUPT4JRXJ+pKomTEPggB54eAGCMKBr5YDpD+u+5E2vmCmjgz3 3ENOIYD7rGrcwPgPR3kMO5fM1bq811YaD7RJX34RYPR1Hr78k1Wa6aXnF S/1BqvtIs38XDcmjtdvuYxxf4KXuEYj8JjHmlSlReRoj4M0ByH3QuPuir lnheiA0m/ynSuzU58g1Irru35LGBZ5h4viOUIiv8S0ZtUsgYLMKMqq49i Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857421" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857421" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:23 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656287" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656287" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:22 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 20/25] KVM: x86: Save and reload SSP to/from SMRAM Date: Thu, 14 Sep 2023 02:33:20 -0400 Message-Id: <20230914063325.85503-21-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Save CET SSP to SMRAM on SMI and reload it on RSM. KVM emulates HW arch behavior when guest enters/leaves SMM mode,i.e., save registers to SMRAM at the entry of SMM and reload them at the exit to SMM. Per SDM, SSP is one of such registers on 64bit Arch, so add the support for SSP. Suggested-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/kvm/smm.c | 8 ++++++++ arch/x86/kvm/smm.h | 2 +- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/smm.c b/arch/x86/kvm/smm.c index b42111a24cc2..235fca95f103 100644 --- a/arch/x86/kvm/smm.c +++ b/arch/x86/kvm/smm.c @@ -275,6 +275,10 @@ static void enter_smm_save_state_64(struct kvm_vcpu *vcpu, enter_smm_save_seg_64(vcpu, &smram->gs, VCPU_SREG_GS); smram->int_shadow = static_call(kvm_x86_get_interrupt_shadow)(vcpu); + + if (guest_can_use(vcpu, X86_FEATURE_SHSTK)) + KVM_BUG_ON(kvm_msr_read(vcpu, MSR_KVM_SSP, &smram->ssp), + vcpu->kvm); } #endif @@ -565,6 +569,10 @@ static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt, static_call(kvm_x86_set_interrupt_shadow)(vcpu, 0); ctxt->interruptibility = (u8)smstate->int_shadow; + if (guest_can_use(vcpu, X86_FEATURE_SHSTK)) + KVM_BUG_ON(kvm_msr_write(vcpu, MSR_KVM_SSP, smstate->ssp), + vcpu->kvm); + return X86EMUL_CONTINUE; } #endif diff --git a/arch/x86/kvm/smm.h b/arch/x86/kvm/smm.h index a1cf2ac5bd78..1e2a3e18207f 100644 --- a/arch/x86/kvm/smm.h +++ b/arch/x86/kvm/smm.h @@ -116,8 +116,8 @@ struct kvm_smram_state_64 { u32 smbase; u32 reserved4[5]; - /* ssp and svm_* fields below are not implemented by KVM */ u64 ssp; + /* svm_* fields below are not implemented by KVM */ u64 svm_guest_pat; u64 svm_host_efer; u64 svm_host_cr4; From patchwork Thu Sep 14 06:33:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384923 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 37E38EDE9A0 for ; Thu, 14 Sep 2023 09:38:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237270AbjINJix (ORCPT ); Thu, 14 Sep 2023 05:38:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49960 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237432AbjINJia (ORCPT ); Thu, 14 Sep 2023 05:38:30 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D4A761FD8; Thu, 14 Sep 2023 02:38:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684303; x=1726220303; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/Fcn/JsDDhKgNCQIEeFUvEKU2jF55FyYBOBg3kc7yEI=; b=QDwor15hm0/MuaSP+8JnrRhIdpXo6dFgoHBO5h6ERNmZEF0MNt+C20Lh hM6owtQOE8SXcu3UdsszsLlcfduhFB24EqcpOkkD8dxnvbdvzEHO0hFq7 L9OABgrkyAE8ClQAfL/ziNMDP9prScwI+BoyJrcHAKg+SubuIySVQbN8i 3Z5LJvFr8GDPaV4Kwepb00c5rBxZHL7pmhVirC5aJwKmdzMYo4+bjXW2X 5+ly0GusX8she9A9PwlwjVQC+ScIOZetHYfhZ0z0wPY7jtFGL6mRmKtUh VD2R1/j6NOt7umX+4xzvg1ncja/tec0Wk3UGjDO2lR3iNt66gDG1GSBLf A==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857426" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857426" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:23 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656292" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656292" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:23 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 21/25] KVM: VMX: Set up interception for CET MSRs Date: Thu, 14 Sep 2023 02:33:21 -0400 Message-Id: <20230914063325.85503-22-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Enable/disable CET MSRs interception per associated feature configuration. Shadow Stack feature requires all CET MSRs passed through to guest to make it supported in user and supervisor mode while IBT feature only depends on MSR_IA32_{U,S}_CETS_CET to enable user and supervisor IBT. Note, this MSR design introduced an architectual limitation of SHSTK and IBT control for guest, i.e., when SHSTK is exposed, IBT is also available to guest from architectual perspective since IBT relies on subset of SHSTK relevant MSRs. Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/vmx.c | 42 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 9f4b56337251..30373258573d 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -699,6 +699,10 @@ static bool is_valid_passthrough_msr(u32 msr) case MSR_LBR_CORE_TO ... MSR_LBR_CORE_TO + 8: /* LBR MSRs. These are handled in vmx_update_intercept_for_lbr_msrs() */ return true; + case MSR_IA32_U_CET: + case MSR_IA32_S_CET: + case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB: + return true; } r = possible_passthrough_msr_slot(msr) != -ENOENT; @@ -7769,6 +7773,42 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu) vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4)); } +static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu) +{ + bool incpt; + + if (kvm_cpu_cap_has(X86_FEATURE_SHSTK)) { + incpt = !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK); + + vmx_set_intercept_for_msr(vcpu, MSR_IA32_U_CET, + MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, + MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, + MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP, + MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP, + MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, + MSR_TYPE_RW, incpt); + if (guest_cpuid_has(vcpu, X86_FEATURE_LM)) + vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB, + MSR_TYPE_RW, incpt); + if (!incpt) + return; + } + + if (kvm_cpu_cap_has(X86_FEATURE_IBT)) { + incpt = !guest_cpuid_has(vcpu, X86_FEATURE_IBT); + + vmx_set_intercept_for_msr(vcpu, MSR_IA32_U_CET, + MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, + MSR_TYPE_RW, incpt); + } +} + static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -7846,6 +7886,8 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) /* Refresh #PF interception to account for MAXPHYADDR changes. */ vmx_update_exception_bitmap(vcpu); + + vmx_update_intercept_for_cet_msr(vcpu); } static u64 vmx_get_perf_capabilities(void) From patchwork Thu Sep 14 06:33:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384924 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 36F9DEDE99E for ; Thu, 14 Sep 2023 09:38:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237777AbjINJiy (ORCPT ); Thu, 14 Sep 2023 05:38:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49974 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237436AbjINJia (ORCPT ); Thu, 14 Sep 2023 05:38:30 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1DE641FD9; Thu, 14 Sep 2023 02:38:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684304; x=1726220304; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4oZsvhTZUqUAAXS788VUv+8I+dApR/h26znaChNPNp0=; b=cQTNe76/BkfbNE/+EKRb5QzW9+7h5snZr0Tu9DC8IrQEvqUedr5KeXgl QKAQy9+E8GCMK7zu5iZx4ZR2a2eqG4tqteViNI75yloygTxDiEHjIh6aS S4JabrfXAb2nadsVWWl821qO++I0WMlLRGieeeKQH9yV2NI/rhfsdkMDJ PPeeMeogVi+4T7iIVe/swexL/Q8G1iAXEMXIh6fS/UfsnmWVpYNzqqcIx DcjD0m6bpmzrZcS6PDiHKBh5BnQyokcJg7LQL3NlSp6xRGkLbNHI1X6/+ epDFX72dathoghmh+KYhZiOVSkzzgCxtNa4U3f4/VenYr0WHHXubt12P+ A==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857431" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857431" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:23 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656297" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656297" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:23 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 22/25] KVM: VMX: Set host constant supervisor states to VMCS fields Date: Thu, 14 Sep 2023 02:33:22 -0400 Message-Id: <20230914063325.85503-23-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Save constant values to HOST_{S_CET,SSP,INTR_SSP_TABLE} field explicitly. Kernel IBT is supported and the setting in MSR_IA32_S_CET is static after post-boot(The exception is BIOS call case but vCPU thread never across it) and KVM doesn't need to refresh HOST_S_CET field before every VM-Enter/ VM-Exit sequence. Host supervisor shadow stack is not enabled now and SSP is not accessible to kernel mode, thus it's safe to set host IA32_INT_SSP_TAB/SSP VMCS field to 0s. When shadow stack is enabled for CPL3, SSP is reloaded from PL3_SSP before it exits to userspace. Check SDM Vol 2A/B Chapter 3/4 for SYSCALL/ SYSRET/SYSENTER SYSEXIT/RDSSP/CALL etc. Prevent KVM module loading if host supervisor shadow stack SHSTK_EN is set in MSR_IA32_S_CET as KVM cannot co-exit with it correctly. Suggested-by: Sean Christopherson Suggested-by: Chao Gao Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/kvm/vmx/capabilities.h | 4 ++++ arch/x86/kvm/vmx/vmx.c | 15 +++++++++++++++ arch/x86/kvm/x86.c | 14 ++++++++++++++ arch/x86/kvm/x86.h | 1 + 4 files changed, 34 insertions(+) diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h index 41a4533f9989..ee8938818c8a 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -106,6 +106,10 @@ static inline bool cpu_has_load_perf_global_ctrl(void) return vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL; } +static inline bool cpu_has_load_cet_ctrl(void) +{ + return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_CET_STATE); +} static inline bool cpu_has_vmx_mpx(void) { return vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_BNDCFGS; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 30373258573d..9ccc2c552f55 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -4375,6 +4375,21 @@ void vmx_set_constant_host_state(struct vcpu_vmx *vmx) if (cpu_has_load_ia32_efer()) vmcs_write64(HOST_IA32_EFER, host_efer); + + /* + * Supervisor shadow stack is not enabled on host side, i.e., + * host IA32_S_CET.SHSTK_EN bit is guaranteed to 0 now, per SDM + * description(RDSSP instruction), SSP is not readable in CPL0, + * so resetting the two registers to 0s at VM-Exit does no harm + * to kernel execution. When execution flow exits to userspace, + * SSP is reloaded from IA32_PL3_SSP. Check SDM Vol.2A/B Chapter + * 3 and 4 for details. + */ + if (cpu_has_load_cet_ctrl()) { + vmcs_writel(HOST_S_CET, host_s_cet); + vmcs_writel(HOST_SSP, 0); + vmcs_writel(HOST_INTR_SSP_TABLE, 0); + } } void set_cr4_guest_host_mask(struct vcpu_vmx *vmx) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c85ee42ab4f1..231d4a7b6f3d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -114,6 +114,8 @@ static u64 __read_mostly efer_reserved_bits = ~((u64)EFER_SCE); #endif static u64 __read_mostly cr4_reserved_bits = CR4_RESERVED_BITS; +u64 __read_mostly host_s_cet; +EXPORT_SYMBOL_GPL(host_s_cet); #define KVM_EXIT_HYPERCALL_VALID_MASK (1 << KVM_HC_MAP_GPA_RANGE) @@ -9618,6 +9620,18 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) return -EIO; } + if (boot_cpu_has(X86_FEATURE_SHSTK)) { + rdmsrl(MSR_IA32_S_CET, host_s_cet); + /* + * Linux doesn't yet support supervisor shadow stacks (SSS), so + * KVM doesn't save/restore the associated MSRs, i.e. KVM may + * clobber the host values. Yell and refuse to load if SSS is + * unexpectedly enabled, e.g. to avoid crashing the host. + */ + if (WARN_ON_ONCE(host_s_cet & CET_SHSTK_EN)) + return -EIO; + } + x86_emulator_cache = kvm_alloc_emulator_cache(); if (!x86_emulator_cache) { pr_err("failed to allocate cache for x86 emulator\n"); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 9a8e3a84eaf4..0d5f673338dd 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -324,6 +324,7 @@ fastpath_t handle_fastpath_set_msr_irqoff(struct kvm_vcpu *vcpu); extern u64 host_xcr0; extern u64 host_xss; extern u64 host_arch_capabilities; +extern u64 host_s_cet; extern struct kvm_caps kvm_caps; From patchwork Thu Sep 14 06:33:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384925 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2662CEDE9A2 for ; Thu, 14 Sep 2023 09:38:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237811AbjINJiy (ORCPT ); Thu, 14 Sep 2023 05:38:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49982 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237440AbjINJia (ORCPT ); Thu, 14 Sep 2023 05:38:30 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 607E91BF9; Thu, 14 Sep 2023 02:38:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684304; x=1726220304; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5n02LuDiDUjCrYr4wB6bg9rxnnNbdlTBdRZXUE+MxOw=; b=BuiJhux/C4jSjYvLMMzKyy5GJqdIlj1DMWEiSfXB9iRUY2dRjKT2fGcW IXahfoob2PypRn2gR5xUVxf4MxNoM1UjdCOjCBwiSzmY6mz7Z7Gj9zN7T T5ZSX1GXnNNd/a2Nd5zvXSRUtBz04SV1LEWgSbWxB/UMhZzgsfX/HgGP7 swjdlU+HfFeEy7wTcNzQd1iHW2sgx2E4f5xSJB/UuXfHaf3GwEVXUGZNZ Uuu5bPx/ojR/ZN9chmgtB/uzeXeJcL8aN0bjFGfjZuu6MqnGIt8JznyQn 6eQSYPK+1l0SwMBFSJgsCB/kgPfjosJkpzmJMqZadLIA0FWyI70sfy6Uv g==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857436" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857436" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:24 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656302" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656302" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:23 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 23/25] KVM: x86: Enable CET virtualization for VMX and advertise to userspace Date: Thu, 14 Sep 2023 02:33:23 -0400 Message-Id: <20230914063325.85503-24-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Expose CET features to guest if KVM/host can support them, clear CPUID feature bits if KVM/host cannot support. Set CPUID feature bits so that CET features are available in guest CPUID. Add CR4.CET bit support in order to allow guest set CET master control bit. Disable KVM CET feature if unrestricted_guest is unsupported/disabled as KVM does not support emulating CET. Don't expose CET feature if either of {U,S}_CET xstate bits is cleared in host XSS or if XSAVES isn't supported. The CET load-bits in VM_ENTRY/VM_EXIT control fields should be set to make guest CET xstates isolated from host's. And all platforms that support CET enumerate VMX_BASIC[bit56] as 1, clear CET feature bits if the bit doesn't read 1. Regarding the CET MSR contents after Reset/INIT, SDM doesn't mention the default values, neither can I get the answer internally so far, will fill the gap once it's clear. Signed-off-by: Yang Weijiang --- arch/x86/include/asm/kvm_host.h | 3 ++- arch/x86/include/asm/msr-index.h | 1 + arch/x86/kvm/cpuid.c | 12 ++++++++++-- arch/x86/kvm/vmx/capabilities.h | 6 ++++++ arch/x86/kvm/vmx/vmx.c | 23 ++++++++++++++++++++++- arch/x86/kvm/vmx/vmx.h | 6 ++++-- arch/x86/kvm/x86.c | 12 +++++++++++- arch/x86/kvm/x86.h | 3 +++ 8 files changed, 59 insertions(+), 7 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index d77b030e996c..db0010fa3363 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -125,7 +125,8 @@ | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \ | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \ | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \ - | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP)) + | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \ + | X86_CR4_CET)) #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 1d111350197f..1f8dc04da468 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -1091,6 +1091,7 @@ #define VMX_BASIC_MEM_TYPE_MASK 0x003c000000000000LLU #define VMX_BASIC_MEM_TYPE_WB 6LLU #define VMX_BASIC_INOUT 0x0040000000000000LLU +#define VMX_BASIC_NO_HW_ERROR_CODE_CC 0x0100000000000000LLU /* Resctrl MSRs: */ /* - Intel: */ diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 4e7a820cba62..d787a506746a 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -654,7 +654,7 @@ void kvm_set_cpu_caps(void) F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) | F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) | F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ | - F(SGX_LC) | F(BUS_LOCK_DETECT) + F(SGX_LC) | F(BUS_LOCK_DETECT) | F(SHSTK) ); /* Set LA57 based on hardware capability. */ if (cpuid_ecx(7) & F(LA57)) @@ -672,7 +672,8 @@ void kvm_set_cpu_caps(void) F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) | F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) | F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16) | - F(AMX_TILE) | F(AMX_INT8) | F(AMX_BF16) | F(FLUSH_L1D) + F(AMX_TILE) | F(AMX_INT8) | F(AMX_BF16) | F(FLUSH_L1D) | + F(IBT) ); /* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */ @@ -685,6 +686,13 @@ void kvm_set_cpu_caps(void) kvm_cpu_cap_set(X86_FEATURE_INTEL_STIBP); if (boot_cpu_has(X86_FEATURE_AMD_SSBD)) kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD); + /* + * The feature bit in boot_cpu_data.x86_capability could have been + * cleared due to ibt=off cmdline option, then add it back if CPU + * supports IBT. + */ + if (cpuid_edx(7) & F(IBT)) + kvm_cpu_cap_set(X86_FEATURE_IBT); kvm_cpu_cap_mask(CPUID_7_1_EAX, F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) | diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h index ee8938818c8a..e12bc233d88b 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -79,6 +79,12 @@ static inline bool cpu_has_vmx_basic_inout(void) return (((u64)vmcs_config.basic_cap << 32) & VMX_BASIC_INOUT); } +static inline bool cpu_has_vmx_basic_no_hw_errcode(void) +{ + return ((u64)vmcs_config.basic_cap << 32) & + VMX_BASIC_NO_HW_ERROR_CODE_CC; +} + static inline bool cpu_has_virtual_nmis(void) { return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS && diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 9ccc2c552f55..f0dea8ecd0c6 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2614,6 +2614,7 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf, { VM_ENTRY_LOAD_IA32_EFER, VM_EXIT_LOAD_IA32_EFER }, { VM_ENTRY_LOAD_BNDCFGS, VM_EXIT_CLEAR_BNDCFGS }, { VM_ENTRY_LOAD_IA32_RTIT_CTL, VM_EXIT_CLEAR_IA32_RTIT_CTL }, + { VM_ENTRY_LOAD_CET_STATE, VM_EXIT_LOAD_CET_STATE }, }; memset(vmcs_conf, 0, sizeof(*vmcs_conf)); @@ -4934,6 +4935,9 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) vmcs_write64(GUEST_BNDCFGS, 0); vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0); /* 22.2.1 */ + vmcs_writel(GUEST_SSP, 0); + vmcs_writel(GUEST_S_CET, 0); + vmcs_writel(GUEST_INTR_SSP_TABLE, 0); kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu); @@ -6354,6 +6358,12 @@ void dump_vmcs(struct kvm_vcpu *vcpu) if (vmcs_read32(VM_EXIT_MSR_STORE_COUNT) > 0) vmx_dump_msrs("guest autostore", &vmx->msr_autostore.guest); + if (vmentry_ctl & VM_ENTRY_LOAD_CET_STATE) { + pr_err("S_CET = 0x%016lx\n", vmcs_readl(GUEST_S_CET)); + pr_err("SSP = 0x%016lx\n", vmcs_readl(GUEST_SSP)); + pr_err("INTR SSP TABLE = 0x%016lx\n", + vmcs_readl(GUEST_INTR_SSP_TABLE)); + } pr_err("*** Host State ***\n"); pr_err("RIP = 0x%016lx RSP = 0x%016lx\n", vmcs_readl(HOST_RIP), vmcs_readl(HOST_RSP)); @@ -6431,6 +6441,12 @@ void dump_vmcs(struct kvm_vcpu *vcpu) if (secondary_exec_control & SECONDARY_EXEC_ENABLE_VPID) pr_err("Virtual processor ID = 0x%04x\n", vmcs_read16(VIRTUAL_PROCESSOR_ID)); + if (vmexit_ctl & VM_EXIT_LOAD_CET_STATE) { + pr_err("S_CET = 0x%016lx\n", vmcs_readl(HOST_S_CET)); + pr_err("SSP = 0x%016lx\n", vmcs_readl(HOST_SSP)); + pr_err("INTR SSP TABLE = 0x%016lx\n", + vmcs_readl(HOST_INTR_SSP_TABLE)); + } } /* @@ -7967,7 +7983,6 @@ static __init void vmx_set_cpu_caps(void) kvm_cpu_cap_set(X86_FEATURE_UMIP); /* CPUID 0xD.1 */ - kvm_caps.supported_xss = 0; if (!cpu_has_vmx_xsaves()) kvm_cpu_cap_clear(X86_FEATURE_XSAVES); @@ -7979,6 +7994,12 @@ static __init void vmx_set_cpu_caps(void) if (cpu_has_vmx_waitpkg()) kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG); + + if (!cpu_has_load_cet_ctrl() || !enable_unrestricted_guest || + !cpu_has_vmx_basic_no_hw_errcode()) { + kvm_cpu_cap_clear(X86_FEATURE_SHSTK); + kvm_cpu_cap_clear(X86_FEATURE_IBT); + } } static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index c2130d2c8e24..fb72819fbb41 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -480,7 +480,8 @@ static inline u8 vmx_get_rvi(void) VM_ENTRY_LOAD_IA32_EFER | \ VM_ENTRY_LOAD_BNDCFGS | \ VM_ENTRY_PT_CONCEAL_PIP | \ - VM_ENTRY_LOAD_IA32_RTIT_CTL) + VM_ENTRY_LOAD_IA32_RTIT_CTL | \ + VM_ENTRY_LOAD_CET_STATE) #define __KVM_REQUIRED_VMX_VM_EXIT_CONTROLS \ (VM_EXIT_SAVE_DEBUG_CONTROLS | \ @@ -502,7 +503,8 @@ static inline u8 vmx_get_rvi(void) VM_EXIT_LOAD_IA32_EFER | \ VM_EXIT_CLEAR_BNDCFGS | \ VM_EXIT_PT_CONCEAL_PIP | \ - VM_EXIT_CLEAR_IA32_RTIT_CTL) + VM_EXIT_CLEAR_IA32_RTIT_CTL | \ + VM_EXIT_LOAD_CET_STATE) #define KVM_REQUIRED_VMX_PIN_BASED_VM_EXEC_CONTROL \ (PIN_BASED_EXT_INTR_MASK | \ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 231d4a7b6f3d..b7d1ac6b8d75 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -231,7 +231,8 @@ static struct kvm_user_return_msrs __percpu *user_return_msrs; | XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \ | XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE) -#define KVM_SUPPORTED_XSS 0 +#define KVM_SUPPORTED_XSS (XFEATURE_MASK_CET_USER | \ + XFEATURE_MASK_CET_KERNEL) u64 __read_mostly host_efer; EXPORT_SYMBOL_GPL(host_efer); @@ -9699,6 +9700,15 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES)) kvm_caps.supported_xss = 0; + if ((kvm_caps.supported_xss & (XFEATURE_MASK_CET_USER | + XFEATURE_MASK_CET_KERNEL)) != + (XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL)) { + kvm_cpu_cap_clear(X86_FEATURE_SHSTK); + kvm_cpu_cap_clear(X86_FEATURE_IBT); + kvm_caps.supported_xss &= ~XFEATURE_CET_USER; + kvm_caps.supported_xss &= ~XFEATURE_CET_KERNEL; + } + #define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f) cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_); #undef __kvm_cpu_cap_has diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 0d5f673338dd..665a7f91d04f 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -530,6 +530,9 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type); __reserved_bits |= X86_CR4_VMXE; \ if (!__cpu_has(__c, X86_FEATURE_PCID)) \ __reserved_bits |= X86_CR4_PCIDE; \ + if (!__cpu_has(__c, X86_FEATURE_SHSTK) && \ + !__cpu_has(__c, X86_FEATURE_IBT)) \ + __reserved_bits |= X86_CR4_CET; \ __reserved_bits; \ }) From patchwork Thu Sep 14 06:33:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384926 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB730EDE99C for ; Thu, 14 Sep 2023 09:38:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237526AbjINJi4 (ORCPT ); Thu, 14 Sep 2023 05:38:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49836 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237446AbjINJia (ORCPT ); Thu, 14 Sep 2023 05:38:30 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B97201FC0; Thu, 14 Sep 2023 02:38:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684304; x=1726220304; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Z0ux6e/uG0dkaGdbqkAWPBXX8t5ysVuvO2txMAe3nKQ=; b=fJbI0Wz17iRogtXpnj1IJ+15nQVIs9/iNezsmhKdvKeV/5lyZq15n04I s7/pnWNFUEHfrU6SKwXo4avuFxROW0BG8I2FTDBHFZIzPDbmWZP9eaFHb eMZe8ly6+gBxuNcXOJiJLpHGfCC6lCr0ZDnauaUnrCIOKZpqKgVLXgzWb jYkyU6UEkZz70mZKBOATwzNHSvn8Q8M/WmDFhcB9uVT7sBCbeSfDto5iB To560UoYchTdUbcYK7oNAM64yddj5G7MUOT4gXD+g5srOZTYgoPVUxfib qKqOeIExWM9ndwvl1aHi/oNBpN9OAnaeckWQN4spnVz3Ur5NnjuOBFYFd Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857441" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857441" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:24 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656306" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656306" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:24 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 24/25] KVM: nVMX: Introduce new VMX_BASIC bit for event error_code delivery to L1 Date: Thu, 14 Sep 2023 02:33:24 -0400 Message-Id: <20230914063325.85503-25-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Per SDM description(Vol.3D, Appendix A.1): "If bit 56 is read as 1, software can use VM entry to deliver a hardware exception with or without an error code, regardless of vector" Modify has_error_code check before inject events to nested guest. Only enforce the check when guest is in real mode, the exception is not hard exception and the platform doesn't enumerate bit56 in VMX_BASIC, in all other case ignore the check to make the logic consistent with SDM. Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/kvm/vmx/nested.c | 22 ++++++++++++++-------- arch/x86/kvm/vmx/nested.h | 5 +++++ 2 files changed, 19 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index c5ec0ef51ff7..78a3be394d00 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -1205,9 +1205,9 @@ static int vmx_restore_vmx_basic(struct vcpu_vmx *vmx, u64 data) { const u64 feature_and_reserved = /* feature (except bit 48; see below) */ - BIT_ULL(49) | BIT_ULL(54) | BIT_ULL(55) | + BIT_ULL(49) | BIT_ULL(54) | BIT_ULL(55) | BIT_ULL(56) | /* reserved */ - BIT_ULL(31) | GENMASK_ULL(47, 45) | GENMASK_ULL(63, 56); + BIT_ULL(31) | GENMASK_ULL(47, 45) | GENMASK_ULL(63, 57); u64 vmx_basic = vmcs_config.nested.basic; if (!is_bitwise_subset(vmx_basic, data, feature_and_reserved)) @@ -2846,12 +2846,16 @@ static int nested_check_vm_entry_controls(struct kvm_vcpu *vcpu, CC(intr_type == INTR_TYPE_OTHER_EVENT && vector != 0)) return -EINVAL; - /* VM-entry interruption-info field: deliver error code */ - should_have_error_code = - intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode && - x86_exception_has_error_code(vector); - if (CC(has_error_code != should_have_error_code)) - return -EINVAL; + if (!prot_mode || intr_type != INTR_TYPE_HARD_EXCEPTION || + !nested_cpu_has_no_hw_errcode_cc(vcpu)) { + /* VM-entry interruption-info field: deliver error code */ + should_have_error_code = + intr_type == INTR_TYPE_HARD_EXCEPTION && + prot_mode && + x86_exception_has_error_code(vector); + if (CC(has_error_code != should_have_error_code)) + return -EINVAL; + } /* VM-entry exception error code */ if (CC(has_error_code && @@ -6968,6 +6972,8 @@ static void nested_vmx_setup_basic(struct nested_vmx_msrs *msrs) if (cpu_has_vmx_basic_inout()) msrs->basic |= VMX_BASIC_INOUT; + if (cpu_has_vmx_basic_no_hw_errcode()) + msrs->basic |= VMX_BASIC_NO_HW_ERROR_CODE_CC; } static void nested_vmx_setup_cr_fixed(struct nested_vmx_msrs *msrs) diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h index b4b9d51438c6..26842da6857d 100644 --- a/arch/x86/kvm/vmx/nested.h +++ b/arch/x86/kvm/vmx/nested.h @@ -284,6 +284,11 @@ static inline bool nested_cr4_valid(struct kvm_vcpu *vcpu, unsigned long val) __kvm_is_valid_cr4(vcpu, val); } +static inline bool nested_cpu_has_no_hw_errcode_cc(struct kvm_vcpu *vcpu) +{ + return to_vmx(vcpu)->nested.msrs.basic & VMX_BASIC_NO_HW_ERROR_CODE_CC; +} + /* No difference in the restrictions on guest and host CR4 in VMX operation. */ #define nested_guest_cr4_valid nested_cr4_valid #define nested_host_cr4_valid nested_cr4_valid From patchwork Thu Sep 14 06:33:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13384927 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B94A9EDE99E for ; Thu, 14 Sep 2023 09:38:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237170AbjINJi7 (ORCPT ); Thu, 14 Sep 2023 05:38:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49902 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237444AbjINJia (ORCPT ); Thu, 14 Sep 2023 05:38:30 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 216D61FC3; Thu, 14 Sep 2023 02:38:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694684305; x=1726220305; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=iR94rXH2VOkdWOb6yC2IM46f2G3Vcs7YPjw5gLl1bog=; b=TG5iqx88BLHRwNEDmPR70SnRsu6m7YbK1dG+z3zfM1xhOlq4wqzIskY2 hNqfJphjk0sSKxWJ9+//XqAPGn4WfQ/FdboDyQVmXI1XNRSU4Z8BjqGno ac9GCZ9O1WH0vMu3C7/Uh64y+joh11lmrP/qK348ReV+EpLfJ0HWhS+g8 q6yh8F7sOl90+3HFf34kDX4u99Z8Uffu6iWKxXyEeu17iYywQQ1s+FsLs I4XYfANxdS/7bbR6iQQLC6+0tPw+OqTpQbMf6xYd3qa4JNtxgJjfZqaQm FikNefObkkwKN6U5gKJhHEX0rqGBqCQjBRj8Pu5CquYyq+8adPZWGtmAN A==; X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="409857447" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="409857447" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:24 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10832"; a="747656310" X-IronPort-AV: E=Sophos;i="6.02,145,1688454000"; d="scan'208";a="747656310" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Sep 2023 02:38:24 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v6 25/25] KVM: nVMX: Enable CET support for nested guest Date: Thu, 14 Sep 2023 02:33:25 -0400 Message-Id: <20230914063325.85503-26-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com> References: <20230914063325.85503-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Set up CET MSRs, related VM_ENTRY/EXIT control bits and fixed CR4 setting to enable CET for nested VM. Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/kvm/vmx/nested.c | 27 +++++++++++++++++++++++++-- arch/x86/kvm/vmx/vmcs12.c | 6 ++++++ arch/x86/kvm/vmx/vmcs12.h | 14 +++++++++++++- arch/x86/kvm/vmx/vmx.c | 2 ++ 4 files changed, 46 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 78a3be394d00..2c4ff13fddb0 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -660,6 +660,28 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu, nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, MSR_IA32_FLUSH_CMD, MSR_TYPE_W); + /* Pass CET MSRs to nested VM if L0 and L1 are set to pass-through. */ + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_U_CET, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_S_CET, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL0_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL1_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL2_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL3_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW); + kvm_vcpu_unmap(vcpu, &vmx->nested.msr_bitmap_map, false); vmx->nested.force_msr_bitmap_recalc = false; @@ -6794,7 +6816,7 @@ static void nested_vmx_setup_exit_ctls(struct vmcs_config *vmcs_conf, VM_EXIT_HOST_ADDR_SPACE_SIZE | #endif VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT | - VM_EXIT_CLEAR_BNDCFGS; + VM_EXIT_CLEAR_BNDCFGS | VM_EXIT_LOAD_CET_STATE; msrs->exit_ctls_high |= VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR | VM_EXIT_LOAD_IA32_EFER | VM_EXIT_SAVE_IA32_EFER | @@ -6816,7 +6838,8 @@ static void nested_vmx_setup_entry_ctls(struct vmcs_config *vmcs_conf, #ifdef CONFIG_X86_64 VM_ENTRY_IA32E_MODE | #endif - VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS; + VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS | + VM_ENTRY_LOAD_CET_STATE; msrs->entry_ctls_high |= (VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR | VM_ENTRY_LOAD_IA32_EFER | VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL); diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c index 106a72c923ca..4233b5ca9461 100644 --- a/arch/x86/kvm/vmx/vmcs12.c +++ b/arch/x86/kvm/vmx/vmcs12.c @@ -139,6 +139,9 @@ const unsigned short vmcs12_field_offsets[] = { FIELD(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions), FIELD(GUEST_SYSENTER_ESP, guest_sysenter_esp), FIELD(GUEST_SYSENTER_EIP, guest_sysenter_eip), + FIELD(GUEST_S_CET, guest_s_cet), + FIELD(GUEST_SSP, guest_ssp), + FIELD(GUEST_INTR_SSP_TABLE, guest_ssp_tbl), FIELD(HOST_CR0, host_cr0), FIELD(HOST_CR3, host_cr3), FIELD(HOST_CR4, host_cr4), @@ -151,5 +154,8 @@ const unsigned short vmcs12_field_offsets[] = { FIELD(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip), FIELD(HOST_RSP, host_rsp), FIELD(HOST_RIP, host_rip), + FIELD(HOST_S_CET, host_s_cet), + FIELD(HOST_SSP, host_ssp), + FIELD(HOST_INTR_SSP_TABLE, host_ssp_tbl), }; const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs12_field_offsets); diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h index 01936013428b..3884489e7f7e 100644 --- a/arch/x86/kvm/vmx/vmcs12.h +++ b/arch/x86/kvm/vmx/vmcs12.h @@ -117,7 +117,13 @@ struct __packed vmcs12 { natural_width host_ia32_sysenter_eip; natural_width host_rsp; natural_width host_rip; - natural_width paddingl[8]; /* room for future expansion */ + natural_width host_s_cet; + natural_width host_ssp; + natural_width host_ssp_tbl; + natural_width guest_s_cet; + natural_width guest_ssp; + natural_width guest_ssp_tbl; + natural_width paddingl[2]; /* room for future expansion */ u32 pin_based_vm_exec_control; u32 cpu_based_vm_exec_control; u32 exception_bitmap; @@ -292,6 +298,12 @@ static inline void vmx_check_vmcs12_offsets(void) CHECK_OFFSET(host_ia32_sysenter_eip, 656); CHECK_OFFSET(host_rsp, 664); CHECK_OFFSET(host_rip, 672); + CHECK_OFFSET(host_s_cet, 680); + CHECK_OFFSET(host_ssp, 688); + CHECK_OFFSET(host_ssp_tbl, 696); + CHECK_OFFSET(guest_s_cet, 704); + CHECK_OFFSET(guest_ssp, 712); + CHECK_OFFSET(guest_ssp_tbl, 720); CHECK_OFFSET(pin_based_vm_exec_control, 744); CHECK_OFFSET(cpu_based_vm_exec_control, 748); CHECK_OFFSET(exception_bitmap, 752); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index f0dea8ecd0c6..2c43f1088d77 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7731,6 +7731,8 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu) cr4_fixed1_update(X86_CR4_PKE, ecx, feature_bit(PKU)); cr4_fixed1_update(X86_CR4_UMIP, ecx, feature_bit(UMIP)); cr4_fixed1_update(X86_CR4_LA57, ecx, feature_bit(LA57)); + cr4_fixed1_update(X86_CR4_CET, ecx, feature_bit(SHSTK)); + cr4_fixed1_update(X86_CR4_CET, edx, feature_bit(IBT)); #undef cr4_fixed1_update }