From patchwork Mon Feb 19 07:47:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562292 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 01/27] x86/fpu/xstate: Always preserve non-user xfeatures/flags in __state_perm Date: Sun, 18 Feb 2024 23:47:07 -0800 Message-ID: <20240219074733.122080-2-weijiang.yang@intel.com>
X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Sean Christopherson
When granting userspace or a KVM guest access to an xfeature, preserve the entity's existing supervisor and software-defined permissions as tracked by __state_perm, i.e. use __state_perm to track *all* permissions even though all supported supervisor xfeatures are granted to all FPUs and FPU_GUEST_PERM_LOCKED disallows changing permissions. Effectively clobbering supervisor permissions results in inconsistent behavior, as xstate_get_group_perm() will report supervisor features for processes that do NOT request access to dynamic user xfeatures, whereas any and all supervisor features will be absent from the set of permissions for any process that is granted access to one or more dynamic xfeatures (which right now means AMX). The inconsistency isn't problematic because fpu_xstate_prctl() already strips out everything except user xfeatures: case ARCH_GET_XCOMP_PERM: /* * Lockless snapshot as it can also change right after the * dropping the lock. */ permitted = xstate_get_host_group_perm(); permitted &= XFEATURE_MASK_USER_SUPPORTED; return put_user(permitted, uptr); case ARCH_GET_XCOMP_GUEST_PERM: permitted = xstate_get_guest_group_perm(); permitted &= XFEATURE_MASK_USER_SUPPORTED; return put_user(permitted, uptr); and similarly KVM doesn't apply the __state_perm to supervisor states (kvm_get_filtered_xcr0() incorporates xstate_get_guest_group_perm()): case 0xd: { u64 permitted_xcr0 = kvm_get_filtered_xcr0(); u64 permitted_xss = kvm_caps.supported_xss; But if KVM in particular were to ever change, dropping supervisor permissions would result in subtle bugs in KVM's reporting of supported CPUID settings. And the above behavior also means that having supervisor xfeatures in __state_perm is correctly handled by all users. Dropping supervisor permissions also creates another landmine for KVM. If more dynamic user xfeatures are ever added, requesting access to multiple xfeatures in separate ARCH_REQ_XCOMP_GUEST_PERM calls will result in the second invocation of __xstate_request_perm() computing the wrong ksize, as the mask passed to xstate_calculate_size() would not contain *any* supervisor features. Commit 781c64bfcb73 ("x86/fpu/xstate: Handle supervisor states in XSTATE permissions") fudged around the size issue for userspace FPUs, but for reasons unknown skipped guest FPUs. Lack of a fix for KVM "works" only because KVM doesn't yet support virtualizing features that have supervisor xfeatures, i.e. as of today, KVM guest FPUs will never need the relevant xfeatures. Simply extending the hack-a-fix for guests would temporarily solve the ksize issue, but wouldn't address the inconsistency issue and would leave another lurking pitfall for KVM. KVM support for virtualizing CET will likely add CET_KERNEL as a guest-only xfeature, i.e. CET_KERNEL will not be set in xfeatures_mask_supervisor() and would again be dropped when granting access to dynamic xfeatures. Note, the existing clobbering behavior is rather subtle.
The @permitted parameter to __xstate_request_perm() comes from: permitted = xstate_get_group_perm(guest); which is either fpu->guest_perm.__state_perm or fpu->perm.__state_perm, where __state_perm is initialized to: fpu->perm.__state_perm = fpu_kernel_cfg.default_features; and copied to the guest side of things: /* Same defaults for guests */ fpu->guest_perm = fpu->perm; fpu_kernel_cfg.default_features contains everything except the dynamic xfeatures, i.e. everything except XFEATURE_MASK_XTILE_DATA: fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features; fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC; When __xstate_request_perm() restricts the local "mask" variable to compute the user state size: mask &= XFEATURE_MASK_USER_SUPPORTED; usize = xstate_calculate_size(mask, false); it subtly overwrites the target __state_perm with "mask" containing only user xfeatures: perm = guest ? &fpu->guest_perm : &fpu->perm; /* Pairs with the READ_ONCE() in xstate_get_group_perm() */ WRITE_ONCE(perm->__state_perm, mask); Cc: Maxim Levitsky Cc: Weijiang Yang Cc: Dave Hansen Cc: Paolo Bonzini Cc: Peter Zijlstra Cc: Chao Gao Cc: Rick Edgecombe Cc: John Allen Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/all/ZTqgzZl-reO1m01I@google.com Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Reviewed-by: Rick Edgecombe --- arch/x86/kernel/fpu/xstate.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 117e74c44e75..07911532b108 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -1601,16 +1601,20 @@ static int __xstate_request_perm(u64 permitted, u64 requested, bool guest) if ((permitted & requested) == requested) return 0; - /* Calculate the resulting kernel state size */ + /* + * Calculate the resulting kernel state size. Note, @permitted also + * contains supervisor xfeatures even though supervisor are always + * permitted for kernel and guest FPUs, and never permitted for user + * FPUs. + */ mask = permitted | requested; - /* Take supervisor states into account on the host */ - if (!guest) - mask |= xfeatures_mask_supervisor(); ksize = xstate_calculate_size(mask, compacted); - /* Calculate the resulting user state size */ - mask &= XFEATURE_MASK_USER_SUPPORTED; - usize = xstate_calculate_size(mask, false); + /* + * Calculate the resulting user state size. Take care not to clobber + * the supervisor xfeatures in the new mask! 
+ */ + usize = xstate_calculate_size(mask & XFEATURE_MASK_USER_SUPPORTED, false); if (!guest) { ret = validate_sigaltstack(usize); From patchwork Mon Feb 19 07:47:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562291 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 90F8A20DEA; Mon, 19 Feb 2024 07:47:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328867; cv=none; b=L2aM9XUvk8/A9MKk8GQN1h1xOhqm6LOf56rvRP6e0TQuVJYtd9L/LzNdgrWSVE0Nb+JkUXVq+zGPTKT6f9AoXS+i9eiHt3/JcxMXldLXOo27jhY1PegNfmKbEp2P7T+rEOv+lztGxguQ26j5LcheVD1lmygoiqLZjLYjgn5FLrc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328867; c=relaxed/simple; bh=Qcjmmmende+5tsFshQjIf5GC3RiO8JOjs03Rr6rpZfY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=PRYLwhTEaIwAun1bpxZ6yQCfOROFVhQIt9ux4iu/D8Z5yAVgLPBA3x5sqEwZ2p0WK2KaGcsiIlUrscHGdP71D2R/1g+lFcz7NIuduoPnpOl3XTxUNxQjSQgR7tBaeQZp2Ps4RqQEhnkcSO2ZAkxXVp28tUxYVMrNo45iKZMnvCw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=nYILxYVy; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="nYILxYVy" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328865; x=1739864865; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Qcjmmmende+5tsFshQjIf5GC3RiO8JOjs03Rr6rpZfY=; b=nYILxYVyhhZHgnRO8bR6J+F3oinEOlE3BuQKlWqWnvmVoldB+KxHEbvG KOS0sh5rYs3WUC5aGZsxC1kIZ1TCo8gLuRUeE4b4tg2RMHe+z/UHu717T TyBGa2F3nxMAeexfr1VtswWeW51POq+/jAfuk3pQVOKrVWsVuEYC1DgMv S1flUq4QfMlsJQbIFiFOo3bVnvMa7DrjwybwbUQAE/Sir6+IfhD9nZsPx bbcWEt7tNyswIeNMYaAx36U9BI6eT62FbMw1OV+WKN82fqXRC+MV3rRvE XSxqj3gdq6KPAprLBQYhGL+cw74/KP/8sBDWpbDgugjCfqkD/gtkP3yuV A==; X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="2535011" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="2535011" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:43 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="826966063" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="826966063" Received: from jf.jf.intel.com (HELO jf.intel.com) ([10.165.9.183]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:43 -0800 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 02/27] x86/fpu/xstate: Refine CET user xstate bit 
enabling Date: Sun, 18 Feb 2024 23:47:08 -0800 Message-ID: <20240219074733.122080-3-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0
Remove the XFEATURE_CET_USER entry from the dependency array as the entry doesn't reflect the true dependency between CET features and the user xstate bit. Instead, enable the bit in fpu_kernel_cfg.max_features when either SHSTK or IBT is available. Both the user mode shadow stack and indirect branch tracking features depend on the XFEATURE_CET_USER bit in XSS to automatically save/restore the user mode xstate registers, i.e., IA32_U_CET and IA32_PL3_SSP, whenever necessary. Note, the problematic case, i.e., CPUID enumerating IBT but not SHSTK, results from the CET KVM series, which synthesizes guest CPUIDs based on userspace settings; in the real world the case is rare. In other words, the existing dependency check is correct when only user mode SHSTK is available.
Signed-off-by: Yang Weijiang Reviewed-by: Rick Edgecombe Tested-by: Rick Edgecombe --- arch/x86/kernel/fpu/xstate.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 07911532b108..f6b98693da59 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -73,7 +73,6 @@ static unsigned short xsave_cpuid_features[] __initdata = { [XFEATURE_PT_UNIMPLEMENTED_SO_FAR] = X86_FEATURE_INTEL_PT, [XFEATURE_PKRU] = X86_FEATURE_OSPKE, [XFEATURE_PASID] = X86_FEATURE_ENQCMD, - [XFEATURE_CET_USER] = X86_FEATURE_SHSTK, [XFEATURE_XTILE_CFG] = X86_FEATURE_AMX_TILE, [XFEATURE_XTILE_DATA] = X86_FEATURE_AMX_TILE, }; @@ -798,6 +797,14 @@ void __init fpu__init_system_xstate(unsigned int legacy_size) fpu_kernel_cfg.max_features &= ~BIT_ULL(i); } + /* + * CET user mode xstate bit has been cleared by above sanity check. + * Now pick it up if either SHSTK or IBT is available. Either feature + * depends on the xstate bit to save/restore user mode states.
+ */ + if (boot_cpu_has(X86_FEATURE_SHSTK) || boot_cpu_has(X86_FEATURE_IBT)) + fpu_kernel_cfg.max_features |= BIT_ULL(XFEATURE_CET_USER); + if (!cpu_feature_enabled(X86_FEATURE_XFD)) fpu_kernel_cfg.max_features &= ~XFEATURE_MASK_USER_DYNAMIC; From patchwork Mon Feb 19 07:47:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562293 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 539DF22F02; Mon, 19 Feb 2024 07:47:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328869; cv=none; b=aYYgMqOV+8AKBm1DF3SLvlcRyP28B7yTtcJ9E0OgG9AlMzqG8TIbeAvzpFSu20rj6T6G7aaNEkDm6paS8jmbuShGKJcxhqueSkHBhB6vPv3ToXFvzIRaZGfsojZ1qInX//kNm1tGfApwF6BAAzIn6VmxZeTL6L0lr6h9I1YFK3A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328869; c=relaxed/simple; bh=zGkrHzajqrsHUc2oTBuKUFhIGS/Sq5/+bBPPvT6IyEE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=J4UHYD/WNjtDm/13x/iVkrRRPrkzxTm5BrOsFfBnokCV24QKjhbU+YmwmHiUJGqEqmS4Jdu1QZcUdpXabyq8SK9TWVLQCyUqcH39dwSGULrFtgvt0+/JVw6VdsRwXCECRgDd/5kIKN/p2lP85QBOUDyUETKFYUC7f6kg2TVeO+U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=gzhmLEFm; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="gzhmLEFm" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328867; x=1739864867; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zGkrHzajqrsHUc2oTBuKUFhIGS/Sq5/+bBPPvT6IyEE=; b=gzhmLEFmRU1ZftBQ4RJ0/8uN8ghPXp28NIEkAWFuuwFouSdGmmYczDW5 zztO5JofNnxyvnbNgjzyi8kc0g8PZ9UX74BVttcdmuJWeuBVVJ6XutOiO H2xgFalwYJrL1Ymr2BF9hBqSO5SRe4NYBfVxD2mAQTCrxel9/zPTrblLR zNQp4nGcCvhnzlhPNttT5hLaxW0uQzgIYFiX30jmuqhIJw9r4WLI55YWu 9XvN178MYNuGGVPfNxq5SkBb44PmV421wEKNdPnD/Z7wpzbvqdFNvNJGs Uzi0DMLpSRZqoDdMfsw1+phVgwrdlCjpmIUhmAwoa07yYoL5imUvZ35fz g==; X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="2535023" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="2535023" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:43 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="826966066" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="826966066" Received: from jf.jf.intel.com (HELO jf.intel.com) ([10.165.9.183]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:43 -0800 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, 
john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 03/27] x86/fpu/xstate: Add CET supervisor mode state support Date: Sun, 18 Feb 2024 23:47:09 -0800 Message-ID: <20240219074733.122080-4-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0
Add supervisor mode state support within the FPU xstate management framework. Although supervisor shadow stack is not enabled/used in the kernel today, KVM requires the support because when KVM advertises the shadow stack feature to a guest, it architecturally claims support for both user and supervisor modes for guest OSes (Linux or non-Linux). The CET supervisor state includes not only the PL{0,1,2}_SSP registers but also the IA32_S_CET MSR; the latter, however, is not xsave-managed. In the virtualization world, guest IA32_S_CET is saved to/restored from the VM control structure instead. With supervisor xstate support, guest supervisor mode shadow stack state can be properly saved/restored when 1) guest/host FPU context is swapped and 2) the vCPU thread is scheduled out/in. The alternative is to handle the state entirely within KVM, but the KVM maintainers NAKed that solution. The external discussion can be found at [*]; it ended up with adding the support in the kernel instead of in KVM. Note, in the KVM case, the guest CET supervisor state, i.e., the IA32_PL{0,1,2}_SSP MSRs, is preserved after VM-Exit until the host/guest fpstates are swapped, but since host supervisor shadow stack is disabled, the preserved MSRs won't hurt the host. [*]: https://lore.kernel.org/all/806e26c2-8d21-9cc9-a0b7-7787dd231729@intel.com/
Signed-off-by: Yang Weijiang Reviewed-by: Rick Edgecombe Reviewed-by: Maxim Levitsky --- arch/x86/include/asm/fpu/types.h | 14 ++++++++++++-- arch/x86/include/asm/fpu/xstate.h | 6 +++--- arch/x86/kernel/fpu/xstate.c | 6 +++++- 3 files changed, 20 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h index ace9aa3b78a3..fe12724c50cc 100644 --- a/arch/x86/include/asm/fpu/types.h +++ b/arch/x86/include/asm/fpu/types.h @@ -118,7 +118,7 @@ enum xfeature { XFEATURE_PKRU, XFEATURE_PASID, XFEATURE_CET_USER, - XFEATURE_CET_KERNEL_UNUSED, + XFEATURE_CET_KERNEL, XFEATURE_RSRVD_COMP_13, XFEATURE_RSRVD_COMP_14, XFEATURE_LBR, @@ -141,7 +141,7 @@ enum xfeature { #define XFEATURE_MASK_PKRU (1 << XFEATURE_PKRU) #define XFEATURE_MASK_PASID (1 << XFEATURE_PASID) #define XFEATURE_MASK_CET_USER (1 << XFEATURE_CET_USER) -#define XFEATURE_MASK_CET_KERNEL (1 << XFEATURE_CET_KERNEL_UNUSED) +#define XFEATURE_MASK_CET_KERNEL (1 << XFEATURE_CET_KERNEL) #define XFEATURE_MASK_LBR (1 << XFEATURE_LBR) #define XFEATURE_MASK_XTILE_CFG (1 << XFEATURE_XTILE_CFG) #define XFEATURE_MASK_XTILE_DATA (1 << XFEATURE_XTILE_DATA) @@ -266,6 +266,16 @@ struct cet_user_state { u64 user_ssp; }; +/* + * State component 12 is Control-flow Enforcement supervisor states + */ +struct cet_supervisor_state { + /* supervisor ssp pointers */ + u64 pl0_ssp; + u64 pl1_ssp; + u64 pl2_ssp; +}; + /* * State component 15: Architectural LBR configuration state. The size of Arch LBR state depends on the number of LBRs (lbr_depth).
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h index d4427b88ee12..3b4a038d3c57 100644 --- a/arch/x86/include/asm/fpu/xstate.h +++ b/arch/x86/include/asm/fpu/xstate.h @@ -51,7 +51,8 @@ /* All currently supported supervisor features */ #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \ - XFEATURE_MASK_CET_USER) + XFEATURE_MASK_CET_USER | \ + XFEATURE_MASK_CET_KERNEL) /* * A supervisor state component may not always contain valuable information, @@ -78,8 +79,7 @@ * Unsupported supervisor features. When a supervisor feature in this mask is * supported in the future, move it to the supported supervisor feature mask. */ -#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT | \ - XFEATURE_MASK_CET_KERNEL) +#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT) /* All supervisor states including supported and unsupported states. */ #define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \ diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index f6b98693da59..03e166a87d61 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -51,7 +51,7 @@ static const char *xfeature_names[] = "Protection Keys User registers", "PASID state", "Control-flow User registers", - "Control-flow Kernel registers (unused)", + "Control-flow Kernel registers", "unknown xstate feature", "unknown xstate feature", "unknown xstate feature", @@ -73,6 +73,7 @@ static unsigned short xsave_cpuid_features[] __initdata = { [XFEATURE_PT_UNIMPLEMENTED_SO_FAR] = X86_FEATURE_INTEL_PT, [XFEATURE_PKRU] = X86_FEATURE_OSPKE, [XFEATURE_PASID] = X86_FEATURE_ENQCMD, + [XFEATURE_CET_KERNEL] = X86_FEATURE_SHSTK, [XFEATURE_XTILE_CFG] = X86_FEATURE_AMX_TILE, [XFEATURE_XTILE_DATA] = X86_FEATURE_AMX_TILE, }; @@ -277,6 +278,7 @@ static void __init print_xstate_features(void) print_xstate_feature(XFEATURE_MASK_PKRU); print_xstate_feature(XFEATURE_MASK_PASID); print_xstate_feature(XFEATURE_MASK_CET_USER); + print_xstate_feature(XFEATURE_MASK_CET_KERNEL); print_xstate_feature(XFEATURE_MASK_XTILE_CFG); print_xstate_feature(XFEATURE_MASK_XTILE_DATA); } @@ -346,6 +348,7 @@ static __init void os_xrstor_booting(struct xregs_state *xstate) XFEATURE_MASK_BNDCSR | \ XFEATURE_MASK_PASID | \ XFEATURE_MASK_CET_USER | \ + XFEATURE_MASK_CET_KERNEL | \ XFEATURE_MASK_XTILE) /* @@ -546,6 +549,7 @@ static bool __init check_xstate_against_struct(int nr) case XFEATURE_PASID: return XCHECK_SZ(sz, nr, struct ia32_pasid_state); case XFEATURE_XTILE_CFG: return XCHECK_SZ(sz, nr, struct xtile_cfg); case XFEATURE_CET_USER: return XCHECK_SZ(sz, nr, struct cet_user_state); + case XFEATURE_CET_KERNEL: return XCHECK_SZ(sz, nr, struct cet_supervisor_state); case XFEATURE_XTILE_DATA: check_xtile_data_against_struct(sz); return true; default: XSTATE_WARN_ON(1, "No structure for xstate: %d\n", nr); From patchwork Mon Feb 19 07:47:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562294 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 29DE721101; Mon, 19 Feb 2024 07:47:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328869; cv=none; 
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 04/27] x86/fpu/xstate: Introduce XFEATURE_MASK_KERNEL_DYNAMIC xfeature set Date: Sun, 18 Feb 2024 23:47:10 -0800 Message-ID: <20240219074733.122080-5-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0
Define a new XFEATURE_MASK_KERNEL_DYNAMIC mask to specify the features that can be optionally enabled by kernel components. This is similar to XFEATURE_MASK_USER_DYNAMIC in that it contains optional xfeatures that allow the FPU buffer to be dynamically sized.
The difference is that the KERNEL variant contains supervisor features and will be enabled by kernel components that need them, not directly by the user. Currently it is used by KVM to configure the guest's dedicated fpstate, e.g., for calculating the xfeature mask and the fpstate storage size. The kernel dynamic xfeatures currently contain only XFEATURE_CET_KERNEL: it is supported by the host because it is enabled in the kernel's XSS MSR setting, but the relevant CPU feature, i.e., supervisor shadow stack, is not enabled in the host kernel, so the bit can be omitted from normal fpstate by default. Remove the kernel dynamic features from fpu_kernel_cfg.default_features so that the corresponding bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors can be optimized by hardware for normal fpstate.
Suggested-by: Dave Hansen Signed-off-by: Yang Weijiang Reviewed-by: Rick Edgecombe --- arch/x86/include/asm/fpu/xstate.h | 5 ++++- arch/x86/kernel/fpu/xstate.c | 1 + 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h index 3b4a038d3c57..a212d3851429 100644 --- a/arch/x86/include/asm/fpu/xstate.h +++ b/arch/x86/include/asm/fpu/xstate.h @@ -46,9 +46,12 @@ #define XFEATURE_MASK_USER_RESTORE \ (XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU) -/* Features which are dynamically enabled for a process on request */ +/* Features which are dynamically enabled per userspace request */ #define XFEATURE_MASK_USER_DYNAMIC XFEATURE_MASK_XTILE_DATA +/* Features which are dynamically enabled per kernel side request */ +#define XFEATURE_MASK_KERNEL_DYNAMIC XFEATURE_MASK_CET_KERNEL + /* All currently supported supervisor features */ #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \ XFEATURE_MASK_CET_USER | \ diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 03e166a87d61..ca4b83c142eb 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -824,6 +824,7 @@ void __init fpu__init_system_xstate(unsigned int legacy_size) /* Clean out dynamic features from default */ fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features; fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC; + fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_KERNEL_DYNAMIC; fpu_user_cfg.default_features = fpu_user_cfg.max_features; fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC; From patchwork Mon Feb 19 07:47:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562296
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 05/27] x86/fpu/xstate: Introduce fpu_guest_cfg for guest FPU configuration Date: Sun, 18 Feb 2024 23:47:11 -0800 Message-ID: <20240219074733.122080-6-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0
Define a new fpu_guest_cfg to hold all guest FPU settings so that they can differ from the generic kernel FPU settings, e.g., the CET supervisor xstate is enabled by default for guest fpstate while it remains disabled in the kernel FPU config. The kernel dynamic xfeatures are now used only by guest fpstate, so add the mask for guest fpstate such that guest_perm.__state_perm == (fpu_kernel_cfg.default_features | XFEATURE_MASK_KERNEL_DYNAMIC). And if the guest fpstate is re-allocated to hold user dynamic xfeatures, the resulting permissions are consumed before calculating the new guest fpstate size. With the new guest FPU config added, there are three categories of FPU configs in the kernel; their usages and key fields are recapped as below.
kernel FPU config: @fpu_kernel_cfg.max_features - all known and CPU supported user and supervisor features except independent kernel features @fpu_kernel_cfg.default_features - all known and CPU supported user and supervisor features except dynamic kernel features, independent kernel features and dynamic userspace features. @fpu_kernel_cfg.max_size - size of compacted buffer with 'fpu_kernel_cfg.max_features' @fpu_kernel_cfg.default_size - size of compacted buffer with 'fpu_kernel_cfg.default_features' user FPU config: @fpu_user_cfg.max_features - all known and CPU supported user features @fpu_user_cfg.default_features - all known and CPU supported user features except dynamic userspace features. @fpu_user_cfg.max_size - size of non-compacted buffer with 'fpu_user_cfg.max_features' @fpu_user_cfg.default_size - size of non-compacted buffer with 'fpu_user_cfg.default_features' guest FPU config: @fpu_guest_cfg.max_features - all known and CPU supported user and supervisor features except independent kernel features. @fpu_guest_cfg.default_features - all known and CPU supported user and supervisor features except independent kernel features and dynamic userspace features. @fpu_guest_cfg.max_size - size of compacted buffer with 'fpu_guest_cfg.max_features' @fpu_guest_cfg.default_size - size of compacted buffer with 'fpu_guest_cfg.default_features' Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Reviewed-by: Rick Edgecombe --- arch/x86/include/asm/fpu/types.h | 2 +- arch/x86/kernel/fpu/core.c | 14 +++++++++++--- arch/x86/kernel/fpu/xstate.c | 10 ++++++++++ 3 files changed, 22 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h index fe12724c50cc..aa00a9617832 100644 --- a/arch/x86/include/asm/fpu/types.h +++ b/arch/x86/include/asm/fpu/types.h @@ -604,6 +604,6 @@ struct fpu_state_config { }; /* FPU state configuration information */ -extern struct fpu_state_config fpu_kernel_cfg, fpu_user_cfg; +extern struct fpu_state_config fpu_kernel_cfg, fpu_user_cfg, fpu_guest_cfg; #endif /* _ASM_X86_FPU_H */ diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index 520deb411a70..e8205e261a24 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -33,9 +33,10 @@ DEFINE_STATIC_KEY_FALSE(__fpu_state_size_dynamic); DEFINE_PER_CPU(u64, xfd_state); #endif -/* The FPU state configuration data for kernel and user space */ +/* The FPU state configuration data for kernel, user space and guest. */ struct fpu_state_config fpu_kernel_cfg __ro_after_init; struct fpu_state_config fpu_user_cfg __ro_after_init; +struct fpu_state_config fpu_guest_cfg __ro_after_init; /* * Represents the initial FPU state. It's mostly (but not completely) zeroes, @@ -536,8 +537,15 @@ void fpstate_reset(struct fpu *fpu) fpu->perm.__state_perm = fpu_kernel_cfg.default_features; fpu->perm.__state_size = fpu_kernel_cfg.default_size; fpu->perm.__user_state_size = fpu_user_cfg.default_size; - /* Same defaults for guests */ - fpu->guest_perm = fpu->perm; + + /* Guest permission settings */ + fpu->guest_perm.__state_perm = fpu_guest_cfg.default_features; + fpu->guest_perm.__state_size = fpu_guest_cfg.default_size; + /* + * Set guest's __user_state_size to fpu_user_cfg.default_size so that + * existing uAPIs can still work. 
+ */ + fpu->guest_perm.__user_state_size = fpu_user_cfg.default_size; } static inline void fpu_inherit_perms(struct fpu *dst_fpu) diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index ca4b83c142eb..9cbdc83d1eab 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -681,6 +681,7 @@ static int __init init_xstate_size(void) { /* Recompute the context size for enabled features: */ unsigned int user_size, kernel_size, kernel_default_size; + unsigned int guest_default_size; bool compacted = cpu_feature_enabled(X86_FEATURE_XCOMPACTED); /* Uncompacted user space size */ @@ -702,13 +703,18 @@ static int __init init_xstate_size(void) kernel_default_size = xstate_calculate_size(fpu_kernel_cfg.default_features, compacted); + guest_default_size = + xstate_calculate_size(fpu_guest_cfg.default_features, compacted); + if (!paranoid_xstate_size_valid(kernel_size)) return -EINVAL; fpu_kernel_cfg.max_size = kernel_size; fpu_user_cfg.max_size = user_size; + fpu_guest_cfg.max_size = kernel_size; fpu_kernel_cfg.default_size = kernel_default_size; + fpu_guest_cfg.default_size = guest_default_size; fpu_user_cfg.default_size = xstate_calculate_size(fpu_user_cfg.default_features, false); @@ -829,6 +835,10 @@ void __init fpu__init_system_xstate(unsigned int legacy_size) fpu_user_cfg.default_features = fpu_user_cfg.max_features; fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC; + fpu_guest_cfg.max_features = fpu_kernel_cfg.max_features; + fpu_guest_cfg.default_features = fpu_guest_cfg.max_features; + fpu_guest_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC; + /* Store it for paranoia check at the end */ xfeatures = fpu_kernel_cfg.max_features; From patchwork Mon Feb 19 07:47:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562295 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D4C8423750; Mon, 19 Feb 2024 07:47:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328869; cv=none; b=POkMNes07+nB9SWkWl+DHLIww5MU+z5dLeggB+Z8kjLY/+/gqpAyHyPKaS/IxEopZIcv3Weo4ZC2ZCv0OwKAhcjYENGXIa9qgSy2l8sjo/VOS+rjt4kJH87wTCGYawmT+tPPoUmobXiIVFbAoOAQAFvF9XmfsyLKXvRc4AGTFbc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328869; c=relaxed/simple; bh=SkFizbcLL2iXx+O1d7csJZlnwkmoUAktnTokiB88u0A=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pUC+M4iNwmh7qQ5Y0eHRPhPU3uwQS0hiUvOAsrPQpwV2W72i0XjnrFM+WgtmNQbPgmOHI/vIwiY9IE3nD4u831kRDIfcmLZ7GhHOBYYXTRT8WmktSI0jJhwTwXepYmts2PTp+YA/DaHHee9nCnWdzuK6ndTXhL3+AjUp0qwuJzc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=EDshXo79; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="EDshXo79" DKIM-Signature: 
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 06/27] x86/fpu/xstate: Create guest fpstate with guest specific config Date: Sun, 18 Feb 2024 23:47:12 -0800 Message-ID: <20240219074733.122080-7-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0
Use fpu_guest_cfg to calculate guest fpstate settings, and open-code the relevant parts of __fpstate_reset() to avoid using the kernel FPU config. The following configuration steps are currently enforced to get a guest fpstate (a userspace sketch of steps 2 and 4 follows this description): 1) The kernel sets up guest FPU settings in fpu__init_system_xstate(). 2) User space sets the vCPU thread group xstate permissions via arch_prctl(). 3) User space creates the guest fpstate via __fpu_alloc_init_guest_fpstate() for the vCPU thread. 4) User space enables guest dynamic xfeatures and the guest fpstate is re-allocated. By adding the kernel dynamic xfeatures in #1 and #2 above, the guest xstate area size is expanded to hold (fpu_kernel_cfg.default_features | kernel dynamic xfeatures | user dynamic xfeatures), so that host xsaves/xrstors can operate on all guest xfeatures. The user_* fields remain unchanged for compatibility with KVM uAPIs.
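As an aside, a minimal userspace sketch of steps 2) and 4) above, assuming the uAPI constants from <asm/prctl.h> (ARCH_REQ_XCOMP_GUEST_PERM is 0x1025) and the AMX tile data xfeature number (18); the helper name is made up for the illustration, error handling is omitted, and this is not part of the patch:

  #include <sys/syscall.h>
  #include <unistd.h>

  #ifndef ARCH_REQ_XCOMP_GUEST_PERM
  #define ARCH_REQ_XCOMP_GUEST_PERM 0x1025   /* uAPI value from <asm/prctl.h> */
  #endif
  #define XFEATURE_XTILE_DATA 18             /* the only user-dynamic xfeature today (AMX) */

  /* Step 2): grant the vCPU thread group permission to use XTILE_DATA in guests,
   * before any vCPU is created. */
  static int request_guest_amx_permission(void)
  {
          return syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_GUEST_PERM, XFEATURE_XTILE_DATA);
  }

Once the permission is granted and the guest's CPUID actually enables the dynamic feature, KVM drives the re-allocation of the guest fpstate in step 4); the guest-only kernel dynamic xfeatures come along automatically because they are already part of fpu_guest_cfg.default_features.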
Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Reviewed-by: Rick Edgecombe --- arch/x86/kernel/fpu/core.c | 39 +++++++++++++++++++++++++++++--------- 1 file changed, 30 insertions(+), 9 deletions(-) diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index e8205e261a24..dc2d2641fda7 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -194,8 +194,6 @@ void fpu_reset_from_exception_fixup(void) } #if IS_ENABLED(CONFIG_KVM) -static void __fpstate_reset(struct fpstate *fpstate, u64 xfd); - static void fpu_init_guest_permissions(struct fpu_guest *gfpu) { struct fpu_state_perm *fpuperm; @@ -216,25 +214,48 @@ static void fpu_init_guest_permissions(struct fpu_guest *gfpu) gfpu->perm = perm & ~FPU_GUEST_PERM_LOCKED; } -bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu) +static struct fpstate *__fpu_alloc_init_guest_fpstate(struct fpu_guest *gfpu) { struct fpstate *fpstate; unsigned int size; - size = fpu_user_cfg.default_size + ALIGN(offsetof(struct fpstate, regs), 64); + /* + * fpu_guest_cfg.default_size is initialized to hold all enabled + * xfeatures except the user dynamic xfeatures. If the user dynamic + * xfeatures are enabled, the guest fpstate will be re-allocated to + * hold all guest enabled xfeatures, so omit user dynamic xfeatures + * here. + */ + size = fpu_guest_cfg.default_size + ALIGN(offsetof(struct fpstate, regs), 64); + fpstate = vzalloc(size); if (!fpstate) - return false; + return NULL; + /* + * Initialize sizes and feature masks, use fpu_user_cfg.* + * for user_* settings for compatibility of exiting uAPIs. + */ + fpstate->size = fpu_guest_cfg.default_size; + fpstate->xfeatures = fpu_guest_cfg.default_features; + fpstate->user_size = fpu_user_cfg.default_size; + fpstate->user_xfeatures = fpu_user_cfg.default_features; + fpstate->xfd = 0; - /* Leave xfd to 0 (the reset value defined by spec) */ - __fpstate_reset(fpstate, 0); fpstate_init_user(fpstate); fpstate->is_valloc = true; fpstate->is_guest = true; gfpu->fpstate = fpstate; - gfpu->xfeatures = fpu_user_cfg.default_features; - gfpu->perm = fpu_user_cfg.default_features; + gfpu->xfeatures = fpu_guest_cfg.default_features; + gfpu->perm = fpu_guest_cfg.default_features; + + return fpstate; +} + +bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu) +{ + if (!__fpu_alloc_init_guest_fpstate(gfpu)) + return false; /* * KVM sets the FP+SSE bits in the XSAVE header when copying FPU state From patchwork Mon Feb 19 07:47:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562297 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6133124B47; Mon, 19 Feb 2024 07:47:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328871; cv=none; b=WTvBXr6F/osXu7x8/gdcd+djA6QyRGQpG0omnoGNJmEHoC3GJVRM2m0SG4klZyFbWZ997fQ9dpY/dwoSiKDQe1M5xn/xpicULd91osnCpaCcXZhVzdr/IcD9V4YTsic1bokfeAiPp5afKcS7q95UdTE8UK3FdYmfWg5r6mhTTBM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328871; c=relaxed/simple; bh=CJ0Jw/rZGxVuVyRHLbKDpIzAQO9DG/ph6iOAbNKFjzE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 07/27] x86/fpu/xstate: Warn if kernel dynamic xfeatures detected in normal fpstate Date: Sun, 18 Feb 2024 23:47:13 -0800 Message-ID: <20240219074733.122080-8-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0
Kernel dynamic xfeatures are now enabled __ONLY__ for guest fpstate, i.e., never for normal kernel fpstate. The bits are added when the guest FPU config is initialized, and guest fpstate is allocated with fpstate->is_guest set to %true. For normal fpstate, the bits should have been removed when the kernel FPU config settings were initialized, so WARN_ONCE() if the kernel detects that a normal fpstate's xfeatures contain kernel dynamic xfeatures before it executes xsaves.
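To restate the invariant that the warning below enforces, a condensed sketch using names introduced earlier in the series (the helper is hypothetical, for illustration only, and is not part of the patch):

  /*
   * Kernel dynamic xfeatures (currently only XFEATURE_MASK_CET_KERNEL,
   * grouped under XFEATURE_MASK_KERNEL_DYNAMIC) may appear in an XSAVES
   * mask only when the fpstate was allocated for a KVM guest.
   */
  static inline bool xsave_mask_is_valid(const struct fpstate *fpstate, u64 mask)
  {
          return fpstate->is_guest || !(mask & XFEATURE_MASK_KERNEL_DYNAMIC);
  }

os_xsave() warns rather than fails when this does not hold, since hitting the condition would indicate a bug in how the normal kernel FPU config was set up.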
Signed-off-by: Yang Weijiang Reviewed-by: Rick Edgecombe Reviewed-by: Maxim Levitsky --- arch/x86/kernel/fpu/xstate.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h index 3518fb26d06b..83ebf1e1cbb4 100644 --- a/arch/x86/kernel/fpu/xstate.h +++ b/arch/x86/kernel/fpu/xstate.h @@ -185,6 +185,9 @@ static inline void os_xsave(struct fpstate *fpstate) WARN_ON_FPU(!alternatives_patched); xfd_validate_state(fpstate, mask, false); + WARN_ON_FPU(!fpstate->is_guest && + (mask & XFEATURE_MASK_KERNEL_DYNAMIC)); + XSTATE_XSAVE(&fpstate->regs.xsave, lmask, hmask, err); /* We should never fault when copying to a kernel buffer: */ From patchwork Mon Feb 19 07:47:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562298 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B72ED20B28; Mon, 19 Feb 2024 07:47:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328871; cv=none; b=Vffat+IDOhyJq+W8vDTC/poDGfPJ8YkmFVvoa1WKpZ848yuGJpVHNe7jYFz+T8aTrjAP7rjWdUmNHDNZYSubW7Jh+wh/DdBFjiGdbdwDfM79qSWsWfbkFtA+U3RLoayO6fFoFztEGNbPISQXTosbKosWf5PYXw6Xp2LRTOtDylQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328871; c=relaxed/simple; bh=iFPryYlo0x9QbRdEVDhSTPJscWPbWs0HWCWU5SaS5aI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Nu1U1CjUbYB1X1QN7/kLQOyCys4NaNXi1Rl1UEVNxCzyr76Qvru4X2l9nDzJdK4kZolGIedjYUYw5nUAvPwWFT26xTNWEY3vzalOohHlB+eLnMBWuXWHcF4POPxLjOScjSFyOs/DkBqO1NpiQWQ5cXiMvaVmoO9zWpxeBDYQkck= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=ElChzFnt; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="ElChzFnt" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328869; x=1739864869; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=iFPryYlo0x9QbRdEVDhSTPJscWPbWs0HWCWU5SaS5aI=; b=ElChzFntSW8JIgRMagpDXFfz68uUhmLpaoNhDhZ6daODrzQuIueRspVG yQfcacagbk3DHalM6g4atYJTNNvxQQ9EiVFDqw3YG38DkRyjtDiZ1lwk+ cqvdFHBVVuLHAJ2C9ZuGd6pwUzVUiq5eGpIGM0CiunHmNdJwo4XniiBES 3AVtD6XO14MyKsfIP7KfBKtKYKTAgFWqWIsrMtmfhJoFkBI9WbvvmSAbZ rN/8KQWKDi4pm0TLkcbXPOGbbg1PNcJQPGXKmfHjkTVEJl423h/7zLMKx 0e+J0AL/0e6jX6axeh2/ybOkNSPuRPUi7VNjsqVUuuIa+C94653Js+sOR Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="2535068" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="2535068" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="826966081" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; 
d="scan'208";a="826966081" Received: from jf.jf.intel.com (HELO jf.intel.com) ([10.165.9.183]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:43 -0800 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 08/27] KVM: x86: Rework cpuid_get_supported_xcr0() to operate on vCPU data Date: Sun, 18 Feb 2024 23:47:14 -0800 Message-ID: <20240219074733.122080-9-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Sean Christopherson Rework and rename cpuid_get_supported_xcr0() to explicitly operate on vCPU state, i.e. on a vCPU's CPUID state, now that the only usage of the helper is to retrieve a vCPU's already-set CPUID. Prior to commit 275a87244ec8 ("KVM: x86: Don't adjust guest's CPUID.0x12.1 (allowed SGX enclave XFRM)"), KVM incorrectly fudged guest CPUID at runtime, which in turn necessitated massaging the incoming CPUID state for KVM_SET_CPUID{2} so as not to run afoul of kvm_cpuid_check_equal(). I.e. KVM also invoked cpuid_get_supported_xcr0() with the incoming CPUID state, and thus without an explicit vCPU object. Opportunistically move the helper below kvm_update_cpuid_runtime() to make it harder to repeat the mistake of querying supported XCR0 for runtime updates. No functional change intended. Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Reviewed-by: Chao Gao --- arch/x86/kvm/cpuid.c | 33 ++++++++++++++++----------------- 1 file changed, 16 insertions(+), 17 deletions(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index adba49afb5fe..d57a6255b19f 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -247,21 +247,6 @@ void kvm_update_pv_runtime(struct kvm_vcpu *vcpu) vcpu->arch.pv_cpuid.features = best->eax; } -/* - * Calculate guest's supported XCR0 taking into account guest CPUID data and - * KVM's supported XCR0 (comprised of host's XCR0 and KVM_SUPPORTED_XCR0). - */ -static u64 cpuid_get_supported_xcr0(struct kvm_cpuid_entry2 *entries, int nent) -{ - struct kvm_cpuid_entry2 *best; - - best = cpuid_entry2_find(entries, nent, 0xd, 0); - if (!best) - return 0; - - return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0; -} - static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries, int nent) { @@ -312,6 +297,21 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu) } EXPORT_SYMBOL_GPL(kvm_update_cpuid_runtime); +/* + * Calculate guest's supported XCR0 taking into account guest CPUID data and + * KVM's supported XCR0 (comprised of host's XCR0 and KVM_SUPPORTED_XCR0). 
+ */ +static u64 vcpu_get_supported_xcr0(struct kvm_vcpu *vcpu) +{ + struct kvm_cpuid_entry2 *best; + + best = kvm_find_cpuid_entry_index(vcpu, 0xd, 0); + if (!best) + return 0; + + return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0; +} + static bool kvm_cpuid_has_hyperv(struct kvm_cpuid_entry2 *entries, int nent) { #ifdef CONFIG_KVM_HYPERV @@ -361,8 +361,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) kvm_apic_set_version(vcpu); } - vcpu->arch.guest_supported_xcr0 = - cpuid_get_supported_xcr0(vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent); + vcpu->arch.guest_supported_xcr0 = vcpu_get_supported_xcr0(vcpu); kvm_update_pv_runtime(vcpu); From patchwork Mon Feb 19 07:47:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562300 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8C2EB250FD; Mon, 19 Feb 2024 07:47:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328872; cv=none; b=rcZhonj/yMDZic6cUQkx4YOfKXGjo3rCFbmmEvyK9HYzqtEc0RacldVS1EK1GFESk3HdxXfI1YKSHQkU8Dumc0bMd9GhF8cHYy+M7mh7IOS0p79Cunre+eHrN07mD+IVutKEsZBfj+gpbS8BeM+5pnODojamPLQ/I91kRDdN7pA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328872; c=relaxed/simple; bh=EY9YbZeGN754w7L4zCbG18SzzY67lprG5C22PB8bmD0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ohPMUiZ5kDnZt/CsPeWl/W+NZQwSYgroWucYOrC4nY7kZJXX7BRsIRr3bYZVMV0yp9yr6jMluzMEz0veykjCH7tJ9uh5G4t2733BjJG2fNIpqwqkRhVkQRBMJyrplcFegICj4kKXclfDMZZ5g7PEmOFvHSexD6pLRupJRTSnmsQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Ae3tbhQ7; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Ae3tbhQ7" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328869; x=1739864869; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=EY9YbZeGN754w7L4zCbG18SzzY67lprG5C22PB8bmD0=; b=Ae3tbhQ7iMRHkpAqAH+RyMgSzGDxY/JvgBsQpwbx14ihrlyhMfynFowI jYlBx3navEVnkb6jagCG6uJsT4DfbkAmhXu1fgeqSyZnqoiKqNnOg5PiA C6VwK2E4gL2yXi63OKQUkM+fgOlVwNU4GfBpclGq0ZjjV2SDOrnJSaPzm isSKm+OheTEZl9uAes0+QmUCygEgoholt7Pn+ULicZAvNDJheX6459emB 5SyuPopQqr4UCCdLd6/AtEQhxpFrF0kpeTB53bXlJrL5XsaCutm36qkVr fgQ8LAIQmBlvuACxsUVJqFbKscHMAmFM7Ps/lSLDoyXxb2lQxA10a4euc Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="2535059" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="2535059" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="826966083" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; 
d="scan'208";a="826966083" Received: from jf.jf.intel.com (HELO jf.intel.com) ([10.165.9.183]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:43 -0800 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 09/27] KVM: x86: Rename kvm_{g,s}et_msr()* to menifest emulation operations Date: Sun, 18 Feb 2024 23:47:15 -0800 Message-ID: <20240219074733.122080-10-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Rename kvm_{g,s}et_msr()* to kvm_emulate_msr_{read,write}()* to make it more obvious that KVM uses these helpers to emulate guest behaviors, i.e., host_initiated == false in these helpers. Suggested-by: Sean Christopherson Suggested-by: Chao Gao Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Reviewed-by: Chao Gao --- arch/x86/include/asm/kvm_host.h | 4 ++-- arch/x86/kvm/smm.c | 4 ++-- arch/x86/kvm/vmx/nested.c | 13 +++++++------ arch/x86/kvm/x86.c | 24 +++++++++++++----------- 4 files changed, 24 insertions(+), 21 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index aaf5a25ea7ed..5ab122f8843e 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -2016,8 +2016,8 @@ void kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu); void kvm_enable_efer_bits(u64); bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer); int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, bool host_initiated); -int kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data); -int kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data); +int kvm_emulate_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data); +int kvm_emulate_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data); int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu); int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu); int kvm_emulate_as_nop(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/smm.c b/arch/x86/kvm/smm.c index dc3d95fdca7d..45c855389ea7 100644 --- a/arch/x86/kvm/smm.c +++ b/arch/x86/kvm/smm.c @@ -535,7 +535,7 @@ static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt, vcpu->arch.smbase = smstate->smbase; - if (kvm_set_msr(vcpu, MSR_EFER, smstate->efer & ~EFER_LMA)) + if (kvm_emulate_msr_write(vcpu, MSR_EFER, smstate->efer & ~EFER_LMA)) return X86EMUL_UNHANDLEABLE; rsm_load_seg_64(vcpu, &smstate->tr, VCPU_SREG_TR); @@ -626,7 +626,7 @@ int emulator_leave_smm(struct x86_emulate_ctxt *ctxt) /* And finally go back to 32-bit mode. 
*/ efer = 0; - kvm_set_msr(vcpu, MSR_EFER, efer); + kvm_emulate_msr_write(vcpu, MSR_EFER, efer); } #endif diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 994e014f8a50..4be0078ca713 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -958,7 +958,7 @@ static u32 nested_vmx_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count) __func__, i, e.index, e.reserved); goto fail; } - if (kvm_set_msr(vcpu, e.index, e.value)) { + if (kvm_emulate_msr_write(vcpu, e.index, e.value)) { pr_debug_ratelimited( "%s cannot write MSR (%u, 0x%x, 0x%llx)\n", __func__, i, e.index, e.value); @@ -994,7 +994,7 @@ static bool nested_vmx_get_vmexit_msr_value(struct kvm_vcpu *vcpu, } } - if (kvm_get_msr(vcpu, msr_index, data)) { + if (kvm_emulate_msr_read(vcpu, msr_index, data)) { pr_debug_ratelimited("%s cannot read MSR (0x%x)\n", __func__, msr_index); return false; @@ -2686,7 +2686,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL) && kvm_pmu_has_perf_global_ctrl(vcpu_to_pmu(vcpu)) && - WARN_ON_ONCE(kvm_set_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL, + WARN_ON_ONCE(kvm_emulate_msr_write(vcpu, MSR_CORE_PERF_GLOBAL_CTRL, vmcs12->guest_ia32_perf_global_ctrl))) { *entry_failure_code = ENTRY_FAIL_DEFAULT; return -EINVAL; @@ -4568,8 +4568,9 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu, } if ((vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL) && kvm_pmu_has_perf_global_ctrl(vcpu_to_pmu(vcpu))) - WARN_ON_ONCE(kvm_set_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL, - vmcs12->host_ia32_perf_global_ctrl)); + WARN_ON_ONCE(kvm_emulate_msr_write(vcpu, + MSR_CORE_PERF_GLOBAL_CTRL, + vmcs12->host_ia32_perf_global_ctrl)); /* Set L1 segment info according to Intel SDM 27.5.2 Loading Host Segment and Descriptor-Table Registers */ @@ -4744,7 +4745,7 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu) goto vmabort; } - if (kvm_set_msr(vcpu, h.index, h.value)) { + if (kvm_emulate_msr_write(vcpu, h.index, h.value)) { pr_debug_ratelimited( "%s WRMSR failed (%u, 0x%x, 0x%llx)\n", __func__, j, h.index, h.value); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index bcd3258c7ece..10847e1cc413 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1961,31 +1961,33 @@ static int kvm_get_msr_ignored_check(struct kvm_vcpu *vcpu, return ret; } -static int kvm_get_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 *data) +static int kvm_emulate_msr_read_with_filter(struct kvm_vcpu *vcpu, u32 index, + u64 *data) { if (!kvm_msr_allowed(vcpu, index, KVM_MSR_FILTER_READ)) return KVM_MSR_RET_FILTERED; return kvm_get_msr_ignored_check(vcpu, index, data, false); } -static int kvm_set_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 data) +static int kvm_emulate_msr_write_with_filter(struct kvm_vcpu *vcpu, u32 index, + u64 data) { if (!kvm_msr_allowed(vcpu, index, KVM_MSR_FILTER_WRITE)) return KVM_MSR_RET_FILTERED; return kvm_set_msr_ignored_check(vcpu, index, data, false); } -int kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data) +int kvm_emulate_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data) { return kvm_get_msr_ignored_check(vcpu, index, data, false); } -EXPORT_SYMBOL_GPL(kvm_get_msr); +EXPORT_SYMBOL_GPL(kvm_emulate_msr_read); -int kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data) +int kvm_emulate_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data) { return kvm_set_msr_ignored_check(vcpu, index, data, false); } -EXPORT_SYMBOL_GPL(kvm_set_msr); 
+EXPORT_SYMBOL_GPL(kvm_emulate_msr_write); static void complete_userspace_rdmsr(struct kvm_vcpu *vcpu) { @@ -2057,7 +2059,7 @@ int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu) u64 data; int r; - r = kvm_get_msr_with_filter(vcpu, ecx, &data); + r = kvm_emulate_msr_read_with_filter(vcpu, ecx, &data); if (!r) { trace_kvm_msr_read(ecx, data); @@ -2082,7 +2084,7 @@ int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu) u64 data = kvm_read_edx_eax(vcpu); int r; - r = kvm_set_msr_with_filter(vcpu, ecx, data); + r = kvm_emulate_msr_write_with_filter(vcpu, ecx, data); if (!r) { trace_kvm_msr_write(ecx, data); @@ -8365,7 +8367,7 @@ static int emulator_get_msr_with_filter(struct x86_emulate_ctxt *ctxt, struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt); int r; - r = kvm_get_msr_with_filter(vcpu, msr_index, pdata); + r = kvm_emulate_msr_read_with_filter(vcpu, msr_index, pdata); if (r < 0) return X86EMUL_UNHANDLEABLE; @@ -8388,7 +8390,7 @@ static int emulator_set_msr_with_filter(struct x86_emulate_ctxt *ctxt, struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt); int r; - r = kvm_set_msr_with_filter(vcpu, msr_index, data); + r = kvm_emulate_msr_write_with_filter(vcpu, msr_index, data); if (r < 0) return X86EMUL_UNHANDLEABLE; @@ -8408,7 +8410,7 @@ static int emulator_set_msr_with_filter(struct x86_emulate_ctxt *ctxt, static int emulator_get_msr(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 *pdata) { - return kvm_get_msr(emul_to_vcpu(ctxt), msr_index, pdata); + return kvm_emulate_msr_read(emul_to_vcpu(ctxt), msr_index, pdata); } static int emulator_check_rdpmc_early(struct x86_emulate_ctxt *ctxt, u32 pmc) From patchwork Mon Feb 19 07:47:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562299 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5C1BC2C6A5; Mon, 19 Feb 2024 07:47:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328872; cv=none; b=Rjepv+AHHVSXUJ1jN4a5Z4WI2jiVfugb/IoaSeUHswUecBahEG/Qu7o0cLsTRnC2bGFaAY6dAo1cSupRXa3r/UNIu+Qy2kVWxAx145qPo7AGdUh5U6DCF/tWnWkyPLftdpFYn1rS4RY65aKxpPQM9UKlYfj0aI8VzYAE81fLXvs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328872; c=relaxed/simple; bh=LLhrBOymq6NYTEUsoYOVmixfi9tT5OVBKgVwc5iax14=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=vB/NL/NSV1CuM3GfzeCBfIL9wvtWUHtSfGBerFgXPCeYqiGM5uZ/N9p/77Mvqf7nmnRhwh9u8zmIwQdD+4vF61pwduFe1vTclHz6rg+SV32+zxo//zyLHvuq6jtuPlWFM28vtOlrQre9p8tAdoq4ti+dvUvC93tGXMaftYcxwzM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=erLHhcPE; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="erLHhcPE" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328870; x=1739864870; 
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=LLhrBOymq6NYTEUsoYOVmixfi9tT5OVBKgVwc5iax14=; b=erLHhcPE0ozar76TYW+++JNk/DEeIdbmnd1R/Dh0IOuSYm74NeT6VqK2 An3VBmbx/mk5S1S0zX8BDvqrDB9e5nx7oom8F8nBui+SZx7psI7IjLREE LjXRjT0M6A3/dsH0QFHAn69x1wu1S1xBX58YcjGR99lKsMwtkKiUxhDAW jFhOgvYSVs0PPNj8eGKByCXoZzoLh8/EzzSjYMT6DMicBrz8p2HUCHuAC Hyb6abbAA4cXLMYxaz/JeBgUHXrETAMcA65dSRqk6M7HMScbr9EPRX1+i 5gIGl8UKhY609SeymFVL+ENcYgkVymseNLSdlhxxYrW7hpe58pO0jjQ9a A==; X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="2535077" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="2535077" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="826966087" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="826966087" Received: from jf.jf.intel.com (HELO jf.intel.com) ([10.165.9.183]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:43 -0800 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 10/27] KVM: x86: Refine xsave-managed guest register/MSR reset handling Date: Sun, 18 Feb 2024 23:47:16 -0800 Message-ID: <20240219074733.122080-11-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Tweak the code a bit to facilitate resetting more xstate components in the future, e.g., CET's xstate-managed MSRs. No functional change intended. Suggested-by: Chao Gao Signed-off-by: Yang Weijiang Reviewed-by: Chao Gao --- arch/x86/kvm/x86.c | 30 +++++++++++++++++++++++++++--- 1 file changed, 27 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 10847e1cc413..5a9c07751c0e 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -12217,11 +12217,27 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) static_branch_dec(&kvm_has_noapic_vcpu); } +#define XSTATE_NEED_RESET_MASK (XFEATURE_MASK_BNDREGS | \ + XFEATURE_MASK_BNDCSR) + +static bool kvm_vcpu_has_xstate(unsigned long xfeature) +{ + switch (xfeature) { + case XFEATURE_MASK_BNDREGS: + case XFEATURE_MASK_BNDCSR: + return kvm_cpu_cap_has(X86_FEATURE_MPX); + default: + return false; + } +} + void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) { struct kvm_cpuid_entry2 *cpuid_0x1; unsigned long old_cr0 = kvm_read_cr0(vcpu); + DECLARE_BITMAP(reset_mask, 64); unsigned long new_cr0; + unsigned int i; /* * Several of the "set" flows, e.g. 
->set_cr0(), read other registers @@ -12274,7 +12290,12 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) kvm_async_pf_hash_reset(vcpu); vcpu->arch.apf.halted = false; - if (vcpu->arch.guest_fpu.fpstate && kvm_mpx_supported()) { + bitmap_from_u64(reset_mask, (kvm_caps.supported_xcr0 | + kvm_caps.supported_xss) & + XSTATE_NEED_RESET_MASK); + + if (vcpu->arch.guest_fpu.fpstate && + !bitmap_empty(reset_mask, XFEATURE_MAX)) { struct fpstate *fpstate = vcpu->arch.guest_fpu.fpstate; /* @@ -12284,8 +12305,11 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) if (init_event) kvm_put_guest_fpu(vcpu); - fpstate_clear_xstate_component(fpstate, XFEATURE_BNDREGS); - fpstate_clear_xstate_component(fpstate, XFEATURE_BNDCSR); + for_each_set_bit(i, reset_mask, XFEATURE_MAX) { + if (!kvm_vcpu_has_xstate(i)) + continue; + fpstate_clear_xstate_component(fpstate, i); + } if (init_event) kvm_load_guest_fpu(vcpu); From patchwork Mon Feb 19 07:47:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562303 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 38AC52E631; Mon, 19 Feb 2024 07:47:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328874; cv=none; b=TgZ4jxezj7MoTXOI0Waib3vkmkpHSF+9aA1+6hJsruhqAVXLE/+iO43eKMRB6oTg5jbDvPCLEN3UPU1U6GZ2WEeeqPK8Le/Jgn71mOITYtOFYvsMZHVkQaSVLoMtT/9rcKxN469k20313CPrNV/YI/qoJ9SFEYj3/2GORYxYjuA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328874; c=relaxed/simple; bh=FWbGGdaxtUvQxwfS23YYfiUKdvCt6zHtKkYDzi8ptYg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=hx5LAu6nISnORmq/5KBZ9jKxzBq/eAK6JtsOWAUm/kegmQCoUJvkn2bQviRTtxRIwdP4Pufk2Mm9jKJQPBUSbnSpJbIn+FU1k994yXZzk0Y3exFYCDxYVTXLEg6EbdxJshz2Mn6/WWIAkc9JrlIdrBApfZ4mJseqOnPjJBKkgm8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=hPXf6oYZ; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="hPXf6oYZ" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328872; x=1739864872; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=FWbGGdaxtUvQxwfS23YYfiUKdvCt6zHtKkYDzi8ptYg=; b=hPXf6oYZhrDP83/kM81EO1tumnevzbUDkNzC7KTpu3LuU/IBqEUanfTD 6zgoCZpHgc1prR1NcFn6N4yqrTYOh/LFpE2Tic7HvgblwgT3gbGGywxXL KIfyiMtEm88Q2bERA5pHlamltig4KTMMy9WWgoORwfITXK6l7Ne08vNnC hsjmCyMw46CnqD3MQQvYJDwzrHheBNfi+lGLjR9LY28bPasTS3MFEAj6t 0xPjaVNbqxG3zuyqkjslAamstf5UF1zASIL72NNBgsV50fCWoCdxpPKol j3OXi7fVz1uhe6hAOYvbZuLxGzfjc+u4TcSkxH/G1wi+oRggIaSpv/KRS A==; X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="2535078" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="2535078" Received: from orsmga001.jf.intel.com 
([10.7.209.18]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="826966090" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="826966090" Received: from jf.jf.intel.com (HELO jf.intel.com) ([10.165.9.183]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:43 -0800 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 11/27] KVM: x86: Add kvm_msr_{read,write}() helpers Date: Sun, 18 Feb 2024 23:47:17 -0800 Message-ID: <20240219074733.122080-12-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Wrap __kvm_{get,set}_msr() into two new helpers for KVM usage and use the helpers to replace existing usage of the raw functions. kvm_msr_{read,write}() are KVM-internal helpers, i.e. used when KVM needs to get/set a MSR value for emulating CPU behavior, i.e., host_initiated == %true in the helpers. Suggested-by: Sean Christopherson Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/include/asm/kvm_host.h | 3 ++- arch/x86/kvm/cpuid.c | 2 +- arch/x86/kvm/x86.c | 16 +++++++++++++--- 3 files changed, 16 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 5ab122f8843e..f95e93975242 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -2015,9 +2015,10 @@ void kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu); void kvm_enable_efer_bits(u64); bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer); -int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, bool host_initiated); int kvm_emulate_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data); int kvm_emulate_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data); +int kvm_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data); +int kvm_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data); int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu); int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu); int kvm_emulate_as_nop(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index d57a6255b19f..39529e14ae59 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -1548,7 +1548,7 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx, *edx = entry->edx; if (function == 7 && index == 0) { u64 data; - if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) && + if (!kvm_msr_read(vcpu, MSR_IA32_TSX_CTRL, &data) && (data & TSX_CTRL_CPUID_CLEAR)) *ebx &= ~(F(RTM) | F(HLE)); } else if (function == 0x80000007) { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 5a9c07751c0e..cbd44f904ba8 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1919,8 +1919,8 @@ static int kvm_set_msr_ignored_check(struct kvm_vcpu *vcpu, * Returns 0 on success, non-0 otherwise. * Assumes vcpu_load() was already called. 
*/ -int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, - bool host_initiated) +static int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, + bool host_initiated) { struct msr_data msr; int ret; @@ -1946,6 +1946,16 @@ int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, return ret; } +int kvm_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data) +{ + return __kvm_set_msr(vcpu, index, data, true); +} + +int kvm_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data) +{ + return __kvm_get_msr(vcpu, index, data, true); +} + static int kvm_get_msr_ignored_check(struct kvm_vcpu *vcpu, u32 index, u64 *data, bool host_initiated) { @@ -12323,7 +12333,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) MSR_IA32_MISC_ENABLE_BTS_UNAVAIL; __kvm_set_xcr(vcpu, 0, XFEATURE_MASK_FP); - __kvm_set_msr(vcpu, MSR_IA32_XSS, 0, true); + kvm_msr_write(vcpu, MSR_IA32_XSS, 0); } /* All GPRs except RDX (handled below) are zeroed on RESET/INIT. */ From patchwork Mon Feb 19 07:47:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562301 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 20FB52D043; Mon, 19 Feb 2024 07:47:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328873; cv=none; b=XXcuN/9/2XncxFWqz4tgWRaft6VMVikSgPHTynrHCdf2ZUfNCUmbedMp6PHuwJQnmCSBnwlro8MA0LGnMW1yJ3aGTASpSgMeygJQ8SLkuUlgEP2DT01ogzmj/tpK3oov4Kw36llaiNHthiXOXnsOJjz+pm8XOSQy2fb+4IKjnMY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328873; c=relaxed/simple; bh=zsZWtSrIsMEG2dQWduuhWlHPy9Ug6FFiV6Z1WKrBC90=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=RJL1TvKZOawukuIYQ7KASiL8qjSBK659VV30GmkVKswomv+OQpup8H1R3H2fw18RbRUTxUQ/DEmoY4eGZC58mP7QpdVtziHE0ZtNrnGa9ywVtKO3YM4c8Sw7IFLc9/IIeXri8qj4OpVWtAY9WisMt2U665YfVKuhoAj1VvYNXkE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=jA3xbpzU; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="jA3xbpzU" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328871; x=1739864871; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zsZWtSrIsMEG2dQWduuhWlHPy9Ug6FFiV6Z1WKrBC90=; b=jA3xbpzUmS+PRuFNyIQoD6+R1sIadpvmfEbsDKehGjJTNmyZ5y5+QT/K tKaTHdGgtPQ5Ct1Ofnyu7L8ctAJdyU73p/wwvYSKt+tWRKUF7tsEqQprd 4iiRh+P9rRQo/CcVzoAbgXJ54TnczmQ/XMmfiguvVjo3bvlgazyxathrz Gk1zzgM/FvaNzasxTRip9vrDbKpDbq6umav1+o++/4KAvwOPj1vaAQoh5 fwpSLSCJ1GZLQ9KYvhkQFGRLHWKf+aBe+CYoKCompiHTYqYdFyhdl8Mf+ 8UUmeJKbcpXAOaoJxl3hpVn6Ah/LtMoK7YzPKmqJ6n0Ot2JZ4KawkivED g==; X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="2535092" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; 
d="scan'208";a="2535092" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="826966092" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="826966092" Received: from jf.jf.intel.com (HELO jf.intel.com) ([10.165.9.183]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:43 -0800 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 12/27] KVM: x86: Report XSS as to-be-saved if there are supported features Date: Sun, 18 Feb 2024 23:47:18 -0800 Message-ID: <20240219074733.122080-13-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Sean Christopherson Add MSR_IA32_XSS to list of MSRs reported to userspace if supported_xss is non-zero, i.e. KVM supports at least one XSS based feature. Before enabling CET virtualization series, guest IA32_MSR_XSS is guaranteed to be 0, i.e., XSAVES/XRSTORS is executed in non-root mode with XSS == 0, which equals to the effect of XSAVE/XRSTOR. Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Reviewed-by: Chao Gao --- arch/x86/kvm/x86.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index cbd44f904ba8..9eb5c8dbd4fb 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1464,6 +1464,7 @@ static const u32 msrs_to_save_base[] = { MSR_IA32_UMWAIT_CONTROL, MSR_IA32_XFD, MSR_IA32_XFD_ERR, + MSR_IA32_XSS, }; static const u32 msrs_to_save_pmu[] = { @@ -7388,6 +7389,10 @@ static void kvm_probe_msr_to_save(u32 msr_index) if (!(kvm_get_arch_capabilities() & ARCH_CAP_TSX_CTRL_MSR)) return; break; + case MSR_IA32_XSS: + if (!kvm_caps.supported_xss) + return; + break; default: break; } From patchwork Mon Feb 19 07:47:19 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562304 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7FA012E847; Mon, 19 Feb 2024 07:47:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328874; cv=none; b=diZQJPQXTYSZueC2+xJde8vs6+hhpqDD4CTUEzqcZtUa+GSYUW/rppxtA/g+v7t8P0gwqgrH+ADvXVon9xtXZjRhz1LomSq/+OLad7rrYgJHTZVl3a8Vtvu1S0RJxsTduYo9LR8SOwETT+g8Z45H3Db0u5agvNYGVo85wLgGk/4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328874; c=relaxed/simple; bh=Peu6irBRC3X95i8b8F09rCkxIvnUBqshGjJ19ps9d68=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=Aj10AoUQnzy2kLPPux93mw5a5XcWp0uCTjlPE1LdQc5ZGTrpGAlR695gIxxX4scfVj8cnqWSZCeK3u3SyeSKsJjOHw9dCd9Oiv6o7Vdb8CtU9xztCGKKFpQ+Uu9ZzhOZ4jCMhxy4W8AgkRqOSrBkClQVZrOEB44XbBYk30Uwi+E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=dy/7D8Uh; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="dy/7D8Uh" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328872; x=1739864872; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Peu6irBRC3X95i8b8F09rCkxIvnUBqshGjJ19ps9d68=; b=dy/7D8Uh+STgQ9PC67WLeNO4e1RNquPNQTvkkCLUhIPlOZzjEIKvKNhC cb82MhQvT6Gy9mxxfWZ5/nqeg0shEIQiXIfb/cjgA8N9w2jR3ARTg3z/M ahGXp13HWaBrkgDxLyHZvHAtks4i3j6w0dPuvup3bLEotBLuHap/eHuRh STEFdRLluicfceU+oQSVBU3/d+KIWazYMXeJvHdbD5gQGkBFwaNYptlnX OtvEbAVSIiFjhaftRL8sfA6bmqdsb1gUMW9KiVq92wy7alFs94Jr9U53h yeaKOKPG1pjIBUOTUmsnVlt/WL9C7Ql+ZY29u0EBF+0Kvfi6l1vySpqtE Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="2535089" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="2535089" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="826966095" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="826966095" Received: from jf.jf.intel.com (HELO jf.intel.com) ([10.165.9.183]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:43 -0800 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com, Zhang Yi Z Subject: [PATCH v10 13/27] KVM: x86: Refresh CPUID on write to guest MSR_IA32_XSS Date: Sun, 18 Feb 2024 23:47:19 -0800 Message-ID: <20240219074733.122080-14-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Update CPUID.(EAX=0DH,ECX=1).EBX to reflect current required xstate size due to XSS MSR modification. CPUID(EAX=0DH,ECX=1).EBX reports the required storage size of all enabled xstate features in (XCR0 | IA32_XSS). The CPUID value can be used by guest before allocate sufficient xsave buffer. Note, KVM does not yet support any XSS based features, i.e. supported_xss is guaranteed to be zero at this time. Opportunistically modify XSS write access logic as: If XSAVES is not enabled in the guest CPUID, forbid setting IA32_XSS msr to anything but 0, even if the write is host initiated. 
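For reference, here is a minimal, self-contained sketch (not KVM code) of the IA32_XSS write-acceptance policy described above; the function and parameter names are hypothetical stand-ins for the checks kvm_set_msr_common() performs in the diff below:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative only: mirrors the policy described in the changelog, not the
 * in-tree implementation.
 */
static bool xss_write_allowed(bool guest_has_xsaves, bool host_initiated,
			      uint64_t guest_supported_xss, uint64_t data)
{
	/*
	 * Without XSAVES in guest CPUID, only a host-initiated write of the
	 * default value (0) is accepted, e.g. when userspace restores vCPU
	 * state via KVM_SET_MSRS.
	 */
	if (!guest_has_xsaves && !(host_initiated && data == 0))
		return false;

	/* Reject any bit KVM has not exposed to this particular guest. */
	if (data & ~guest_supported_xss)
		return false;

	return true;
}

int main(void)
{
	/* A host-initiated write of 0 is tolerated even without XSAVES... */
	printf("%d\n", xss_write_allowed(false, true, 0, 0));     /* prints 1 */
	/* ...but any non-zero value is rejected. */
	printf("%d\n", xss_write_allowed(false, true, 0, 0x800)); /* prints 0 */
	return 0;
}

On an accepted write that actually changes the value, the patch also calls kvm_update_cpuid_runtime() so that CPUID.(EAX=0DH,ECX=1).EBX tracks the new (XCR0 | IA32_XSS) set.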
Suggested-by: Sean Christopherson Co-developed-by: Zhang Yi Z Signed-off-by: Zhang Yi Z Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Reviewed-by: Chao Gao --- arch/x86/include/asm/kvm_host.h | 3 ++- arch/x86/kvm/cpuid.c | 15 ++++++++++++++- arch/x86/kvm/x86.c | 13 ++++++++++--- 3 files changed, 26 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index f95e93975242..79f7c18c487b 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -773,7 +773,6 @@ struct kvm_vcpu_arch { bool at_instruction_boundary; bool tpr_access_reporting; bool xfd_no_write_intercept; - u64 ia32_xss; u64 microcode_version; u64 arch_capabilities; u64 perf_capabilities; @@ -829,6 +828,8 @@ struct kvm_vcpu_arch { u64 xcr0; u64 guest_supported_xcr0; + u64 guest_supported_xss; + u64 ia32_xss; struct kvm_pio_request pio; void *pio_data; diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 39529e14ae59..2bb1931103ad 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -275,7 +275,8 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e best = cpuid_entry2_find(entries, nent, 0xD, 1); if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) || cpuid_entry_has(best, X86_FEATURE_XSAVEC))) - best->ebx = xstate_required_size(vcpu->arch.xcr0, true); + best->ebx = xstate_required_size(vcpu->arch.xcr0 | + vcpu->arch.ia32_xss, true); best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent); if (kvm_hlt_in_guest(vcpu->kvm) && best && @@ -312,6 +313,17 @@ static u64 vcpu_get_supported_xcr0(struct kvm_vcpu *vcpu) return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0; } +static u64 vcpu_get_supported_xss(struct kvm_vcpu *vcpu) +{ + struct kvm_cpuid_entry2 *best; + + best = kvm_find_cpuid_entry_index(vcpu, 0xd, 1); + if (!best) + return 0; + + return (best->ecx | ((u64)best->edx << 32)) & kvm_caps.supported_xss; +} + static bool kvm_cpuid_has_hyperv(struct kvm_cpuid_entry2 *entries, int nent) { #ifdef CONFIG_KVM_HYPERV @@ -362,6 +374,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) } vcpu->arch.guest_supported_xcr0 = vcpu_get_supported_xcr0(vcpu); + vcpu->arch.guest_supported_xss = vcpu_get_supported_xss(vcpu); kvm_update_pv_runtime(vcpu); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9eb5c8dbd4fb..b502d68a2576 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3926,16 +3926,23 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) } break; case MSR_IA32_XSS: - if (!msr_info->host_initiated && - !guest_cpuid_has(vcpu, X86_FEATURE_XSAVES)) + /* + * If KVM reported support of XSS MSR, even guest CPUID doesn't + * support XSAVES, still allow userspace to set default value(0) + * to this MSR. + */ + if (!guest_cpuid_has(vcpu, X86_FEATURE_XSAVES) && + !(msr_info->host_initiated && data == 0)) return 1; /* * KVM supports exposing PT to the guest, but does not support * IA32_XSS[bit 8]. Guests have to use RDMSR/WRMSR rather than * XSAVES/XRSTORS to save/restore PT MSRs. 
*/ - if (data & ~kvm_caps.supported_xss) + if (data & ~vcpu->arch.guest_supported_xss) return 1; + if (vcpu->arch.ia32_xss == data) + break; vcpu->arch.ia32_xss = data; kvm_update_cpuid_runtime(vcpu); break; From patchwork Mon Feb 19 07:47:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562302 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A39B42D627; Mon, 19 Feb 2024 07:47:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328873; cv=none; b=ZAy7ESdqaH7f8QapLJ9xDfjmfljR87KdUWbhH3y10okPLGsLbGGucRP9OeuznpND3tWB8anoBx2l9o/ryxe8S0KFuxDX0k2PH4phX5wENDIQYf3VjQGGN5EtjqqYl4yHshTybwwkbDN2opOEvHUJDmyk6H5rRH58oVzRdcfs4j8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328873; c=relaxed/simple; bh=sWkl7Z3XxOiuJFtnFAhcpB/qRiA1y0PFYes31RKezFo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=QKjFdkUMbMlzWc7252S2qVg0sZ/DHwubi1vjir2ytd+rCScaY1SyWdl07Un3WKJPEci9Bc/aWnQ4v5whNYB62S0K2QkozaH2CFZB0q4kQHQasxIyaGfqolIAP1MoBkH6+hC3LP2NfFm4QxSob2/OyY4QUwnchrv+EyQBcLxoNw0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=lTH9M97n; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="lTH9M97n" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328871; x=1739864871; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=sWkl7Z3XxOiuJFtnFAhcpB/qRiA1y0PFYes31RKezFo=; b=lTH9M97n9BU2W5/5FNjSZFQ+Fq5DK0ssFnfmMXcpBh2p2OoLxE2uSsGp oKpZzpk0N0WRpf3kmC44QoS1G8YEgXa4Fqr7CKXsTNNiYRohrInr8skn5 2xKHsoy7hzF9mi094PhAGnciJv434/y3TE3e7/29kDG+BlW8Jnw3rNJWt ZAGLRrjOvPjwJn8++J4Tq1gqXTrNjM2oNKXFlPiNM4iUqCv/gd4mKum77 pw/lLoWpcw/TuxAgtF/owZXHFQllGmPwC0+n8EhuEgSzuFnqs1elN7lYb oV4MBdY/XVphfI8vxU7wYtRu/uuMLaCScI9Oeax1cCjEw9cwk6kqRtLZm g==; X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="2535101" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="2535101" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="826966098" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="826966098" Received: from jf.jf.intel.com (HELO jf.intel.com) ([10.165.9.183]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:43 -0800 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, 
weijiang.yang@intel.com Subject: [PATCH v10 14/27] KVM: x86: Initialize kvm_caps.supported_xss Date: Sun, 18 Feb 2024 23:47:20 -0800 Message-ID: <20240219074733.122080-15-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Set original kvm_caps.supported_xss to (host_xss & KVM_SUPPORTED_XSS) if XSAVES is supported. host_xss contains the host supported xstate feature bits for thread FPU context switch, KVM_SUPPORTED_XSS includes all KVM enabled XSS feature bits, the resulting value represents the supervisor xstates that are available to guest and are backed by host FPU framework for swapping {guest,host} XSAVE-managed registers/MSRs. Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Reviewed-by: Chao Gao --- arch/x86/kvm/x86.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index b502d68a2576..60b574fc04d1 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -226,6 +226,8 @@ static struct kvm_user_return_msrs __percpu *user_return_msrs; | XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \ | XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE) +#define KVM_SUPPORTED_XSS 0 + u64 __read_mostly host_efer; EXPORT_SYMBOL_GPL(host_efer); @@ -9737,12 +9739,13 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) host_xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK); kvm_caps.supported_xcr0 = host_xcr0 & KVM_SUPPORTED_XCR0; } + if (boot_cpu_has(X86_FEATURE_XSAVES)) { + rdmsrl(MSR_IA32_XSS, host_xss); + kvm_caps.supported_xss = host_xss & KVM_SUPPORTED_XSS; + } rdmsrl_safe(MSR_EFER, &host_efer); - if (boot_cpu_has(X86_FEATURE_XSAVES)) - rdmsrl(MSR_IA32_XSS, host_xss); - kvm_init_pmu_capability(ops->pmu_ops); if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES)) From patchwork Mon Feb 19 07:47:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562305 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4A44833080; Mon, 19 Feb 2024 07:47:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328875; cv=none; b=J2bq1f5561WaC02gjc8FHouZ8iRNoxUL/y1aT9nYGXcHPgz9qMoUVJpZFUx6Vg+13CE1Ad+77upsnoOPY6veYzouQUrEY7HkfxshzrsD7dM2bMDfwuancns4gtRmEtDmYzzfJlZsl1LDzbaf0e7W561B67qzJQZRI9tBjeD7K/o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328875; c=relaxed/simple; bh=jwP3gUL56k+hGNBaB4Q0U23pwBgasARA/lYMleLy96s=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=RuRx6t6JYsWJtHHAqElSXgDDhmqeqoltQHUI6YrSAWSaeY1iZWZM9w2jURABAoEHa1nin5n7zGb/kOST4VOpQObaE5PDdMbv7Z8DEhUgtEbxxy4j1z4079Wvq7VuwDS3JNi5kQYfSgCPk9ZpAyfZbkhwyE8Hxwwp56Zq59EI3aI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=CqYsBM9/; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none 
dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="CqYsBM9/" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328873; x=1739864873; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=jwP3gUL56k+hGNBaB4Q0U23pwBgasARA/lYMleLy96s=; b=CqYsBM9/HOuZmKcuRCptjTdlC31YM+6Lqn6CwAjtE2P0v7ilH4a1OJge HQeJsK+feBbCabKaH34kifo1pmL4+956qa+OALD5zwnkQeATb7TQ7AMCo /15BaeVlcikHRzBUQbbYmzafStbRJeEtfeu7qgshsXhA3L8XSTFkwle8Y KC7NQovDdgxX4oTGRqtsIjmmI9+nF14AuK8NvaFP1jQW0AatHWq0BD4FS TVDKxPCd3Feh27MFF/ONSrUWcf2aikUHlgICNjrC61nJUkDY+XRVNQh7D fzNAbbJoAro42f6nCQUl3O+Oq/x9AChrRfoX5rRs/TFNZgbUlShClASQ9 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="2535114" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="2535114" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="826966102" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="826966102" Received: from jf.jf.intel.com (HELO jf.intel.com) ([10.165.9.183]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:43 -0800 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 15/27] KVM: x86: Load guest FPU state when access XSAVE-managed MSRs Date: Sun, 18 Feb 2024 23:47:21 -0800 Message-ID: <20240219074733.122080-16-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Sean Christopherson Load the guest's FPU state if userspace is accessing MSRs whose values are managed by XSAVES. Introduce two helpers, kvm_{get,set}_xstate_msr(), to facilitate access to such kind of MSRs. If MSRs supported in kvm_caps.supported_xss are passed through to guest, the guest MSRs are swapped with host's before vCPU exits to userspace and after it reenters kernel before next VM-entry. Because the modified code is also used for the KVM_GET_MSRS device ioctl(), explicitly check @vcpu is non-null before attempting to load guest state. The XSAVE-managed MSRs cannot be retrieved via the device ioctl() without loading guest FPU state (which doesn't exist). Note that guest_cpuid_has() is not queried as host userspace is allowed to access MSRs that have not been exposed to the guest, e.g. it might do KVM_SET_MSRS prior to KVM_SET_CPUID2. The two helpers are put here in order to manifest accessing xsave-managed MSRs requires special check and handling to guarantee the correctness of read/write to the MSRs. 
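As a small, self-contained sketch of the pattern described above: access_msrs(), load_guest_fpu() and put_guest_fpu() below are hypothetical stand-ins for __msr_io(), kvm_load_guest_fpu() and kvm_put_guest_fpu() in the diff that follows.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct vcpu { int unused; };

/* Stubs standing in for the real KVM helpers; illustrative only. */
static void load_guest_fpu(struct vcpu *v) { (void)v; }
static void put_guest_fpu(struct vcpu *v)  { (void)v; }

/*
 * True for MSRs whose values live in XSTATE; MSR_IA32_U_CET (0x6a0) is used
 * here as a representative example.
 */
static bool is_xstate_managed_msr(uint32_t index)
{
	return index == 0x6a0;
}

struct msr_entry { uint32_t index; uint64_t data; };

/*
 * The pattern from the patch: load the guest FPU at most once, and only if
 * the batch actually touches an XSTATE-managed MSR, then put it after the
 * loop, so the guest values are resident in hardware for RDMSR/WRMSR.
 */
static int access_msrs(struct vcpu *vcpu, struct msr_entry *entries, size_t n,
		       int (*do_msr)(struct vcpu *, uint32_t, uint64_t *))
{
	bool fpu_loaded = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (vcpu && !fpu_loaded &&
		    is_xstate_managed_msr(entries[i].index)) {
			load_guest_fpu(vcpu);
			fpu_loaded = true;
		}
		if (do_msr(vcpu, entries[i].index, &entries[i].data))
			break;
	}
	if (fpu_loaded)
		put_guest_fpu(vcpu);

	return (int)i;
}

static int read_stub(struct vcpu *v, uint32_t index, uint64_t *data)
{
	(void)v; (void)index;
	*data = 0;
	return 0;
}

int main(void)
{
	struct vcpu v = { 0 };
	struct msr_entry batch[] = { { 0x6a0, 0 }, { 0x10, 0 } };

	return access_msrs(&v, batch, 2, read_stub) == 2 ? 0 : 1;
}

Note that the real code additionally checks kvm_caps.supported_xss before loading: if no XSTATE-managed MSRs are exposed at all, the guest FPU is never loaded for the batch.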
Signed-off-by: Sean Christopherson Co-developed-by: Yang Weijiang Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/kvm/x86.c | 35 ++++++++++++++++++++++++++++++++++- arch/x86/kvm/x86.h | 24 ++++++++++++++++++++++++ 2 files changed, 58 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 60b574fc04d1..906307757159 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -133,6 +133,9 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2); static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2); static DEFINE_MUTEX(vendor_module_lock); +static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu); +static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu); + struct kvm_x86_ops kvm_x86_ops __read_mostly; #define KVM_X86_OP(func) \ @@ -4509,6 +4512,21 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) } EXPORT_SYMBOL_GPL(kvm_get_msr_common); +/* + * Returns true if the MSR in question is managed via XSTATE, i.e. is context + * switched with the rest of guest FPU state. + */ +static bool is_xstate_managed_msr(u32 index) +{ + switch (index) { + case MSR_IA32_U_CET: + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + return true; + default: + return false; + } +} + /* * Read or write a bunch of msrs. All parameters are kernel addresses. * @@ -4519,11 +4537,26 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs, int (*do_msr)(struct kvm_vcpu *vcpu, unsigned index, u64 *data)) { + bool fpu_loaded = false; int i; - for (i = 0; i < msrs->nmsrs; ++i) + for (i = 0; i < msrs->nmsrs; ++i) { + /* + * If userspace is accessing one or more XSTATE-managed MSRs, + * temporarily load the guest's FPU state so that the guest's + * MSR value(s) is resident in hardware, i.e. so that KVM can + * get/set the MSR via RDMSR/WRMSR. + */ + if (vcpu && !fpu_loaded && kvm_caps.supported_xss && + is_xstate_managed_msr(entries[i].index)) { + kvm_load_guest_fpu(vcpu); + fpu_loaded = true; + } if (do_msr(vcpu, entries[i].index, &entries[i].data)) break; + } + if (fpu_loaded) + kvm_put_guest_fpu(vcpu); return i; } diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 2f7e19166658..9c19dfb5011d 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -543,4 +543,28 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size, unsigned int port, void *data, unsigned int count, int in); +/* + * Lock and/or reload guest FPU and access xstate MSRs. For accesses initiated + * by host, guest FPU is loaded in __msr_io(). For accesses initiated by guest, + * guest FPU should have been loaded already. 
+ */ + +static inline void kvm_get_xstate_msr(struct kvm_vcpu *vcpu, + struct msr_data *msr_info) +{ + KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm); + kvm_fpu_get(); + rdmsrl(msr_info->index, msr_info->data); + kvm_fpu_put(); +} + +static inline void kvm_set_xstate_msr(struct kvm_vcpu *vcpu, + struct msr_data *msr_info) +{ + KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm); + kvm_fpu_get(); + wrmsrl(msr_info->index, msr_info->data); + kvm_fpu_put(); +} + #endif From patchwork Mon Feb 19 07:47:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562306 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9444633CCF; Mon, 19 Feb 2024 07:47:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328875; cv=none; b=mGvcsJ5UayJM7LNaKOyV2ytz1npTI478UyMrpqcASHRfkSk2wpOY+NaMlBI1XmcYRP7o1f+IOWdscfeOftll2yqhP4b4TBFVQbw0YgoNQpaCOMJ81bgBzS5F1rmdBLB5ZXJJtMQLzrJC3KuArVK0v/3WMlVMjKQlY5dYXUu3FNU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328875; c=relaxed/simple; bh=wNeCjQtUEjXraB2FnEYbRot7k1BW8WqNJpaCluw2O8M=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=UP+9eIbaz5mlAwXTVVCMz2E5f5ETH4gbwG9UnaM9K1HtrJMjsWtdxgzwQYJ/15D4WOMCKnRBpp8gRKXCW++KSMsM84g2dsz+IMh7+x1FJj1ppNMstz5Jq3xaxtU+LMgmxISd/3K4uJoZrp5ju0m6KWSSNd5CSZjRBAHw91yFIQE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=UvRWKZxG; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="UvRWKZxG" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328873; x=1739864873; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=wNeCjQtUEjXraB2FnEYbRot7k1BW8WqNJpaCluw2O8M=; b=UvRWKZxG6kBbz9myC8so8/gBU4U8n2s39t4P6zUbdOccsdL/gRs0/Ru5 NoHrNryz/wisaGJtw90BXmFR5Baayz782lF1ZNG384kVU32xlfmAPgSn2 YZY49NW2Gwi9UgbqR6m0CSfuZBBNB8cc8PkfZWiLu87nGjREDJi0DptAn GkSf2vcmaMQf9EOEfWpcGFqueHeqOx86QhAQsRS+Pdxowcadth/30/hiy 21OJwtJPDg9h7zOcaXVt6vWjBXYruKwGrCv6lNtDp4mnTWA5rCADGMFBm lX2Z0p409D6JGHH0C7Pf7aDGLIxNc64eYzO+81H23dusnelndTBQDvI1J g==; X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="2535107" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="2535107" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="826966105" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="826966105" Received: from jf.jf.intel.com (HELO jf.intel.com) ([10.165.9.183]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:43 -0800 
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 16/27] KVM: x86: Add fault checks for guest CR4.CET setting Date: Sun, 18 Feb 2024 23:47:22 -0800 Message-ID: <20240219074733.122080-17-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Check potential faults for CR4.CET setting per Intel SDM requirements. CET can be enabled if and only if CR0.WP == 1, i.e. setting CR4.CET == 1 faults if CR0.WP == 0 and setting CR0.WP == 0 fails if CR4.CET == 1. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang Reviewed-by: Chao Gao Reviewed-by: Maxim Levitsky --- arch/x86/kvm/x86.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 906307757159..5f5df7e38d3d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1006,6 +1006,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) (is_64_bit_mode(vcpu) || kvm_is_cr4_bit_set(vcpu, X86_CR4_PCIDE))) return 1; + if (!(cr0 & X86_CR0_WP) && kvm_is_cr4_bit_set(vcpu, X86_CR4_CET)) + return 1; + static_call(kvm_x86_set_cr0)(vcpu, cr0); kvm_post_set_cr0(vcpu, old_cr0, cr0); @@ -1217,6 +1220,9 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) return 1; } + if ((cr4 & X86_CR4_CET) && !kvm_is_cr0_bit_set(vcpu, X86_CR0_WP)) + return 1; + static_call(kvm_x86_set_cr4)(vcpu, cr4); kvm_post_set_cr4(vcpu, old_cr4, cr4); From patchwork Mon Feb 19 07:47:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562308 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B02EF364BF; Mon, 19 Feb 2024 07:47:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328876; cv=none; b=RnZ6GH9yMEjBO5Bjmjv0cMBA+Fs0rArfCzIolfS6f+Ddpxg8U10guD+n9r4MeqPhPkQqrJvLpMGIKc/LR93XDEmMxfhp3UXRn2mFbCToq68HfmFfFiLab9XN2IRVLEvSS1ktIgiPnl8+Vr8dlrAoH2swacqv43hDJY4vzz8VsxE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328876; c=relaxed/simple; bh=2KNY8GVHtyU72UPtEQPzxT100cFv4ieC6pwN9/mckEI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=VRYdjof+77B9R/SyEP9/OhoUMzL0OVJhkklhJoML8dQ911jF9eZ6U0WG/TGomvJTY+MHSI7k/4DDWtCMplmOWcmSeMtlcWGBbNx4pDZlAYNOCotCelhqvTpOdrvnvra8W9XjZobf1VuaGfk3268+hZ2jGhjhoGcbosfTPcQ3YNI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=k1UXkp0H; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass 
smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="k1UXkp0H" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328874; x=1739864874; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2KNY8GVHtyU72UPtEQPzxT100cFv4ieC6pwN9/mckEI=; b=k1UXkp0HFkXQNZh+9ba2xhzUzwOIaruf4u+vdrwy4WdR4PLSFVtodRk0 RUyLAh5MczeOpRXpB6+asfs1YA47a4zfiXmCBiBnvGKUYK1cTjzsCYZxH ZZgXgE6X+wwX82MUUgR8SqAcrVrSe+yQfJiaaz7W575OYPckNN2gInZUT 0x7R4vni7FwbT/2q7KWWsjtHsykk+6z36GVg+357vJsf+rRTngneFyW7A N8brC8DQYhetKtvjk6bRzjL1VedA8FhC2ahUddGGm9ZK1AXm3SIEHOm18 4y85kisbR7ZZLcXOVIsM05jQ9rLGhdiMm8xbluBy0iUit1QnX5yyYv2Gr w==; X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="2535123" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="2535123" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="826966108" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="826966108" Received: from jf.jf.intel.com (HELO jf.intel.com) ([10.165.9.183]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:43 -0800 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 17/27] KVM: x86: Report KVM supported CET MSRs as to-be-saved Date: Sun, 18 Feb 2024 23:47:23 -0800 Message-ID: <20240219074733.122080-18-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Add CET MSRs to the list of MSRs reported to userspace if the feature, i.e. IBT or SHSTK, associated with the MSRs is supported by KVM. SSP can only be read via RDSSP. Writing even requires destructive and potentially faulting operations such as SAVEPREVSSP/RSTORSSP or SETSSBSY/CLRSSBSY. Let the host use a pseudo-MSR that is just a wrapper for the GUEST_SSP field of the VMCS. Suggested-by: Chao Gao Signed-off-by: Yang Weijiang --- arch/x86/include/uapi/asm/kvm_para.h | 1 + arch/x86/kvm/vmx/vmx.c | 2 ++ arch/x86/kvm/x86.c | 18 ++++++++++++++++++ 3 files changed, 21 insertions(+) diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h index 605899594ebb..9d08c0bec477 100644 --- a/arch/x86/include/uapi/asm/kvm_para.h +++ b/arch/x86/include/uapi/asm/kvm_para.h @@ -58,6 +58,7 @@ #define MSR_KVM_ASYNC_PF_INT 0x4b564d06 #define MSR_KVM_ASYNC_PF_ACK 0x4b564d07 #define MSR_KVM_MIGRATION_CONTROL 0x4b564d08 +#define MSR_KVM_SSP 0x4b564d09 struct kvm_steal_time { __u64 steal; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 9239a89dea22..46042bc6e2fa 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7007,6 +7007,8 @@ static bool vmx_has_emulated_msr(struct kvm *kvm, u32 index) case MSR_AMD64_TSC_RATIO: /* This is AMD only. 
*/ return false; + case MSR_KVM_SSP: + return kvm_cpu_cap_has(X86_FEATURE_SHSTK); default: return true; } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 5f5df7e38d3d..c0ed69353674 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1476,6 +1476,9 @@ static const u32 msrs_to_save_base[] = { MSR_IA32_XFD, MSR_IA32_XFD_ERR, MSR_IA32_XSS, + MSR_IA32_U_CET, MSR_IA32_S_CET, + MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, MSR_IA32_PL2_SSP, + MSR_IA32_PL3_SSP, MSR_IA32_INT_SSP_TAB, }; static const u32 msrs_to_save_pmu[] = { @@ -1579,6 +1582,7 @@ static const u32 emulated_msrs_all[] = { MSR_K7_HWCR, MSR_KVM_POLL_CONTROL, + MSR_KVM_SSP, }; static u32 emulated_msrs[ARRAY_SIZE(emulated_msrs_all)]; @@ -7441,6 +7445,20 @@ static void kvm_probe_msr_to_save(u32 msr_index) if (!kvm_caps.supported_xss) return; break; + case MSR_IA32_U_CET: + case MSR_IA32_S_CET: + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) && + !kvm_cpu_cap_has(X86_FEATURE_IBT)) + return; + break; + case MSR_IA32_INT_SSP_TAB: + if (!kvm_cpu_cap_has(X86_FEATURE_LM)) + return; + fallthrough; + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK)) + return; + break; default: break; } From patchwork Mon Feb 19 07:47:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562307 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3EE5736126; Mon, 19 Feb 2024 07:47:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328876; cv=none; b=CGiDSjL3+hA2+LaRSECJC0kh5evMn9KmV92P4I3lFibZQ4lbQKp4/5ThL/dBDX2ZLFLhNR8EYOPakxe+wlaaa5yApubvuI4HVK5IIt+81WKTyYpsiA3eMZPMeH+rTWHXwi6WniSy+C0Liuu/jP4ChISI5TgLFTJnmrv3m5nKApc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328876; c=relaxed/simple; bh=1oiB8Ecsu9nSN1HnISHg9tKNmX0Jqyokz3Hwfr97W7A=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=fRAW77fq1giaSH5QuTWZslogH1dSBTFGvu9oy4qlVeAHwZsz556JbyYDkg109TdtPOjX+MqVyk6Z3ivVdtyzZawBf/9KW3Rk3mBcQxcgmKij7rnU3zRkcZgzbxxmGMR69OBhcGU3WrqAiuxlXV1Ry/deFcfhWwDgljBgvOdrvgA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Z8KgaNXc; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Z8KgaNXc" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328874; x=1739864874; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1oiB8Ecsu9nSN1HnISHg9tKNmX0Jqyokz3Hwfr97W7A=; b=Z8KgaNXc9SIEpE0MlzxQR3izSzS8rHix0VKU3bAF1TpG0rhJreWxrwyH o05GxXaNAAy1Yrxg2A6S0VSk40e5Yp+iGUMuk2idt887jaowu6lP63FDI iF84iPp9nkbcg4aSe1mnzlZA2g+He3rO+/vwowAjXQHZ684PAaId5Qf12 /RzSG0trdGcGUXOjT2iVWsXv8qT1eLp0br+l2I6uNhakAnfOUq4DVRE/E 
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com, Zhang Yi Z Subject: [PATCH v10 18/27] KVM: VMX: Introduce CET VMCS fields and control bits Date: Sun, 18 Feb 2024 23:47:24 -0800 Message-ID: <20240219074733.122080-19-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com>

Control-flow Enforcement Technology (CET) is a CPU feature used to prevent Return/CALL/Jump-Oriented Programming (ROP/COP/JOP) attacks. It provides two sub-features, SHSTK and IBT, to defend against these styles of control-flow subversion attack.

Shadow Stack (SHSTK): A shadow stack is a second stack used exclusively for control transfer operations. The shadow stack is separate from the data/normal stack and can be enabled individually in user and kernel mode. When shadow stack is enabled, CALL pushes the return address on both the data and shadow stack. RET pops the return address from both stacks and compares them. If the return addresses from the two stacks do not match, the processor generates a #CP.

Indirect Branch Tracking (IBT): IBT introduces a new instruction (ENDBRANCH) to mark valid target addresses of indirect branches (CALL, JMP, etc.). If an indirect branch is executed and the next instruction is _not_ an ENDBRANCH, the processor generates a #CP. The instruction behaves as a NOP on platforms that do not have CET.

Several new CET MSRs are defined to support CET:
MSR_IA32_{U,S}_CET: CET settings for user and supervisor mode respectively.
MSR_IA32_PL{0,1,2,3}_SSP: SHSTK pointer linear address for CPL{0,1,2,3}.
MSR_IA32_INT_SSP_TAB: Linear address of the SHSTK pointer table, whose entries are indexed by the IST field of the interrupt gate descriptor.

Two XSAVES state bits are introduced for CET:
IA32_XSS:[bit 11]: Controls saving/restoring of user mode CET state.
IA32_XSS:[bit 12]: Controls saving/restoring of supervisor mode CET state.

Six VMCS fields are introduced for CET:
{HOST,GUEST}_S_CET: Stores CET settings for kernel mode.
{HOST,GUEST}_SSP: Stores the current active SSP.
{HOST,GUEST}_INTR_SSP_TABLE: Stores the current active MSR_IA32_INT_SSP_TAB.
On Intel platforms, two additional bits are defined in VM_EXIT and VM_ENTRY control fields: If VM_EXIT_LOAD_CET_STATE = 1, host CET states are loaded from following VMCS fields at VM-Exit: HOST_S_CET HOST_SSP HOST_INTR_SSP_TABLE If VM_ENTRY_LOAD_CET_STATE = 1, guest CET states are loaded from following VMCS fields at VM-Entry: GUEST_S_CET GUEST_SSP GUEST_INTR_SSP_TABLE Co-developed-by: Zhang Yi Z Signed-off-by: Zhang Yi Z Signed-off-by: Yang Weijiang Reviewed-by: Chao Gao Reviewed-by: Maxim Levitsky --- arch/x86/include/asm/vmx.h | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index 0e73616b82f3..451fd4f4fedc 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -104,6 +104,7 @@ #define VM_EXIT_CLEAR_BNDCFGS 0x00800000 #define VM_EXIT_PT_CONCEAL_PIP 0x01000000 #define VM_EXIT_CLEAR_IA32_RTIT_CTL 0x02000000 +#define VM_EXIT_LOAD_CET_STATE 0x10000000 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR 0x00036dff @@ -117,6 +118,7 @@ #define VM_ENTRY_LOAD_BNDCFGS 0x00010000 #define VM_ENTRY_PT_CONCEAL_PIP 0x00020000 #define VM_ENTRY_LOAD_IA32_RTIT_CTL 0x00040000 +#define VM_ENTRY_LOAD_CET_STATE 0x00100000 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x000011ff @@ -345,6 +347,9 @@ enum vmcs_field { GUEST_PENDING_DBG_EXCEPTIONS = 0x00006822, GUEST_SYSENTER_ESP = 0x00006824, GUEST_SYSENTER_EIP = 0x00006826, + GUEST_S_CET = 0x00006828, + GUEST_SSP = 0x0000682a, + GUEST_INTR_SSP_TABLE = 0x0000682c, HOST_CR0 = 0x00006c00, HOST_CR3 = 0x00006c02, HOST_CR4 = 0x00006c04, @@ -357,6 +362,9 @@ enum vmcs_field { HOST_IA32_SYSENTER_EIP = 0x00006c12, HOST_RSP = 0x00006c14, HOST_RIP = 0x00006c16, + HOST_S_CET = 0x00006c18, + HOST_SSP = 0x00006c1a, + HOST_INTR_SSP_TABLE = 0x00006c1c }; /* From patchwork Mon Feb 19 07:47:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562309 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B3F25374FA; Mon, 19 Feb 2024 07:47:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328877; cv=none; b=gvBifPKWJxOypmVNMmPJREfbo/eHqEgAS6cEsVYoMnr0kDTZb1GgGRjgEealveAwY2MlLWFcJ8Ph1ztzJQOcELk3EQRQAdmsUhNztgCUtXyVBkCRdnSa8aE53Vc5GmjwF8CBXirVipX5yfQR97nWo5qkVDIXd2vkFL29r2msTfw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328877; c=relaxed/simple; bh=7yQNwaaMLlyKLE8rZMsYP1fpLrQUPVbGPSALYvBYWdc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=HGG4EC3sVDJ3t7ZzQMjtMKmSqFmP09YPhZQNtcmGi26FbWV2RCddKyZQOYHKS8ubzMqVherIkI35BPaXj/ZD0ZEmm8mIa9u5CDuPaZhNXDUVrehySLPjRgkpR3GX4Bjy60dd+v1ObyK6ja61+ZdCa880LJR/wb6J/wvdg0Rwpcg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=DYQIRXsn; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) 
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 19/27] KVM: x86: Use KVM-governed feature framework to track "SHSTK/IBT enabled" Date: Sun, 18 Feb 2024 23:47:25 -0800 Message-ID: <20240219074733.122080-20-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com>

Use the governed feature framework to track whether the X86_FEATURE_SHSTK and X86_FEATURE_IBT features can be used by userspace and the guest, i.e., the features can be used iff both KVM and the guest CPUID support them.

TODO: remove this patch once Sean's refactor of the "KVM-governed" framework is upstreamed. See the work here [*].
[*]: https://lore.kernel.org/all/20231110235528.1561679-1-seanjc@google.com/ Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/kvm/governed_features.h | 2 ++ arch/x86/kvm/vmx/vmx.c | 2 ++ 2 files changed, 4 insertions(+) diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h index ad463b1ed4e4..daf0c0a3e29c 100644 --- a/arch/x86/kvm/governed_features.h +++ b/arch/x86/kvm/governed_features.h @@ -17,6 +17,8 @@ KVM_GOVERNED_X86_FEATURE(PFTHRESHOLD) KVM_GOVERNED_X86_FEATURE(VGIF) KVM_GOVERNED_X86_FEATURE(VNMI) KVM_GOVERNED_X86_FEATURE(LAM) +KVM_GOVERNED_X86_FEATURE(SHSTK) +KVM_GOVERNED_X86_FEATURE(IBT) #undef KVM_GOVERNED_X86_FEATURE #undef KVM_GOVERNED_FEATURE diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 46042bc6e2fa..6cb94754c2a9 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7764,6 +7764,8 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VMX); kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_LAM); + kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_SHSTK); + kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_IBT); vmx_setup_uret_msrs(vmx); From patchwork Mon Feb 19 07:47:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562310 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AEC85374F2; Mon, 19 Feb 2024 07:47:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328877; cv=none; b=r3g9Hw1FxkA7xc3OaiouWk49x1YiNXa7469zh6Knzt85UJ9IxsT1fHNEAEoJrcAMoDejNUtwvp92Dh70GegzqZzMFwl1xJfViqmASxMXgo/TQEzGPXawxgI5q7KahhDCq3QQ4bEDw9bouKGc/FIVuEU0yGxcC4GCwD0ra9dHnBc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328877; c=relaxed/simple; bh=BTAmCrJM+TMcHQ8B+9ubqY9NJwhtodRsnqEaKj9jxbQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=mbklRxyrOaOsbSL9dNUj+yOgsWShfebGy4M9x6aOru2iLYSeSYcnGorj51FS0XhrpvLygJoNKUTPbLfXXG71lCUR6/CE29BZBekauqwtiYCuMJ/DcJTBFw96Hgc/I8SkDvPbHxgElYd8S9kEKKX6Bo1zHoZOVOjFFA7OFImopDU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=l9dmXkCC; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="l9dmXkCC" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328875; x=1739864875; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=BTAmCrJM+TMcHQ8B+9ubqY9NJwhtodRsnqEaKj9jxbQ=; b=l9dmXkCC1AkFPVYNybVIzkDHkNBlWE8buOUZKbLoKBgFvYUohz7mAGJF SdfK+5vOWwru2GtsYUAGhzodYzDbWq+gMv8gi1ZHtETzforkpwNkrYQds Zsnb+wxKbj2oeAIognWTF+pqf9vGuQUQ+4KApKn9fsYu6SbE36uyjezAX 
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 20/27] KVM: VMX: Emulate read and write to CET MSRs Date: Sun, 18 Feb 2024 23:47:26 -0800 Message-ID: <20240219074733.122080-21-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com>

Add an emulation interface for CET MSR access. The emulation code is split into a common part and a vendor-specific part. The common part performs the generic checks for the MSRs, e.g., accessibility and data validity, then hands the operation off to either the XSAVE-managed MSRs via the helpers or to the CET VMCS fields.
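The split can be pictured with a small, self-contained C sketch. This is only an illustration of the routing idea, not KVM code: the cet_msr_backend enum, cet_msr_value_ok() and route_cet_msr() names are invented for this example, and the backend assignments mirror the diff that follows (U_CET/PLx_SSP handled as xstate, S_CET/INT_SSP_TAB/SSP handled as VMCS fields). MSR index values are per the Intel SDM.

    /* Illustrative userspace sketch: common checks, then route to a backend. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define MSR_IA32_U_CET          0x6a0   /* XSAVE-managed (CET_USER xstate) */
    #define MSR_IA32_S_CET          0x6a2   /* held in the GUEST_S_CET VMCS field */
    #define MSR_IA32_PL3_SSP        0x6a7   /* XSAVE-managed (CET_USER xstate) */
    #define MSR_IA32_INT_SSP_TAB    0x6a8   /* held in a VMCS field */

    enum cet_msr_backend { CET_BACKEND_XSTATE, CET_BACKEND_VMCS, CET_BACKEND_NONE };

    /* Common, vendor-agnostic validity check (SSP-type MSRs must be 4-byte aligned). */
    static bool cet_msr_value_ok(uint32_t index, uint64_t data)
    {
            if (index == MSR_IA32_PL3_SSP || index == MSR_IA32_INT_SSP_TAB)
                    return (data & 0x3) == 0;
            return true;    /* control MSRs get reserved-bit checks instead */
    }

    /* Routing decision: which backend actually holds the architectural state. */
    static enum cet_msr_backend route_cet_msr(uint32_t index)
    {
            switch (index) {
            case MSR_IA32_U_CET:
            case MSR_IA32_PL3_SSP:
                    return CET_BACKEND_XSTATE;      /* saved/loaded via XSAVES helpers */
            case MSR_IA32_S_CET:
            case MSR_IA32_INT_SSP_TAB:
                    return CET_BACKEND_VMCS;        /* read/written via vmcs_readl/writel */
            default:
                    return CET_BACKEND_NONE;
            }
    }

    int main(void)
    {
            uint32_t msr = MSR_IA32_PL3_SSP;
            uint64_t val = 0x7000;

            if (cet_msr_value_ok(msr, val))
                    printf("MSR 0x%x -> backend %d\n", (unsigned)msr, route_cet_msr(msr));
            return 0;
    }

The point of the design is that the common layer never needs to know where the state lives; only the vendor code (VMX here) touches the VMCS.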
Suggested-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/vmx.c | 18 +++++++++ arch/x86/kvm/x86.c | 88 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 106 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 6cb94754c2a9..ff2296fa7d39 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2106,6 +2106,15 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) else msr_info->data = vmx->pt_desc.guest.addr_a[index / 2]; break; + case MSR_IA32_S_CET: + msr_info->data = vmcs_readl(GUEST_S_CET); + break; + case MSR_KVM_SSP: + msr_info->data = vmcs_readl(GUEST_SSP); + break; + case MSR_IA32_INT_SSP_TAB: + msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE); + break; case MSR_IA32_DEBUGCTLMSR: msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL); break; @@ -2415,6 +2424,15 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) else vmx->pt_desc.guest.addr_a[index / 2] = data; break; + case MSR_IA32_S_CET: + vmcs_writel(GUEST_S_CET, data); + break; + case MSR_KVM_SSP: + vmcs_writel(GUEST_SSP, data); + break; + case MSR_IA32_INT_SSP_TAB: + vmcs_writel(GUEST_INTR_SSP_TABLE, data); + break; case MSR_IA32_PERF_CAPABILITIES: if (data && !vcpu_to_pmu(vcpu)->version) return 1; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c0ed69353674..281c3fe728c5 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1849,6 +1849,36 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type) } EXPORT_SYMBOL_GPL(kvm_msr_allowed); +#define CET_US_RESERVED_BITS GENMASK(9, 6) +#define CET_US_SHSTK_MASK_BITS GENMASK(1, 0) +#define CET_US_IBT_MASK_BITS (GENMASK_ULL(5, 2) | GENMASK_ULL(63, 10)) +#define CET_US_LEGACY_BITMAP_BASE(data) ((data) >> 12) + +static bool is_set_cet_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u64 data, + bool host_initiated) +{ + bool msr_ctrl = index == MSR_IA32_S_CET || index == MSR_IA32_U_CET; + + if (guest_can_use(vcpu, X86_FEATURE_SHSTK)) + return true; + + if (msr_ctrl && guest_can_use(vcpu, X86_FEATURE_IBT)) + return true; + + /* + * If KVM supports the MSR, i.e. has enumerated the MSR existence to + * userspace, then userspace is allowed to write '0' irrespective of + * whether or not the MSR is exposed to the guest. + */ + if (!host_initiated || data) + return false; + + if (kvm_cpu_cap_has(X86_FEATURE_SHSTK)) + return true; + + return msr_ctrl && kvm_cpu_cap_has(X86_FEATURE_IBT); +} + /* * Write @data into the MSR specified by @index. Select MSR specific fault * checks are bypassed if @host_initiated is %true. @@ -1908,6 +1938,42 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data, data = (u32)data; break; + case MSR_IA32_U_CET: + case MSR_IA32_S_CET: + if (!is_set_cet_msr_allowed(vcpu, index, data, host_initiated)) + return 1; + if (data & CET_US_RESERVED_BITS) + return 1; + if (!guest_can_use(vcpu, X86_FEATURE_SHSTK) && + (data & CET_US_SHSTK_MASK_BITS)) + return 1; + if (!guest_can_use(vcpu, X86_FEATURE_IBT) && + (data & CET_US_IBT_MASK_BITS)) + return 1; + if (!IS_ALIGNED(CET_US_LEGACY_BITMAP_BASE(data), 4)) + return 1; + /* IBT can be suppressed iff the TRACKER isn't WAIT_ENDBR. */ + if ((data & CET_SUPPRESS) && (data & CET_WAIT_ENDBR)) + return 1; + break; + case MSR_IA32_INT_SSP_TAB: + if (!is_set_cet_msr_allowed(vcpu, index, data, host_initiated)) + return 1; + if (is_noncanonical_address(data, vcpu)) + return 1; + break; + case MSR_KVM_SSP: + if (!host_initiated) + return 1; + fallthrough; + case MSR_IA32_PL0_SSP ... 
MSR_IA32_PL3_SSP: + if (!is_set_cet_msr_allowed(vcpu, index, data, host_initiated)) + return 1; + if (is_noncanonical_address(data, vcpu)) + return 1; + if (!IS_ALIGNED(data, 4)) + return 1; + break; } msr.data = data; @@ -1951,6 +2017,20 @@ static int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, !guest_cpuid_has(vcpu, X86_FEATURE_RDPID)) return 1; break; + case MSR_IA32_U_CET: + case MSR_IA32_S_CET: + if (!guest_can_use(vcpu, X86_FEATURE_SHSTK) && + !guest_can_use(vcpu, X86_FEATURE_IBT)) + return 1; + break; + case MSR_KVM_SSP: + if (!host_initiated) + return 1; + fallthrough; + case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB: + if (!guest_can_use(vcpu, X86_FEATURE_SHSTK)) + return 1; + break; } msr.index = index; @@ -4143,6 +4223,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) vcpu->arch.guest_fpu.xfd_err = data; break; #endif + case MSR_IA32_U_CET: + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + kvm_set_xstate_msr(vcpu, msr_info); + break; default: if (kvm_pmu_is_valid_msr(vcpu, msr)) return kvm_pmu_set_msr(vcpu, msr_info); @@ -4502,6 +4586,10 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) msr_info->data = vcpu->arch.guest_fpu.xfd_err; break; #endif + case MSR_IA32_U_CET: + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + kvm_get_xstate_msr(vcpu, msr_info); + break; default: if (kvm_pmu_is_valid_msr(vcpu, msr_info->index)) return kvm_pmu_get_msr(vcpu, msr_info); From patchwork Mon Feb 19 07:47:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562312 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A6DB53C09F; Mon, 19 Feb 2024 07:47:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328879; cv=none; b=pN/JzzxQ1wzMA8gGh4gPIxlwA/qXwoCvOoX6A3HRYHfjQ4R850bZJpioeYlsExuG8bXNKN0Gr8x+90GRseuZ9yuahL7d0lIZbAM+UexeciT6mszx1/ewZtQnAfCdGqtqj9nuXJtQNuYMirWupPe3TC652SdO966J+uBd6zG+9uQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328879; c=relaxed/simple; bh=j6k0zNCcu78ZjgKB056TtP0O+fKYCAbaHVnQy48CcZA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=iP+jLhn7EofgdSG2g7qYOaj7GsT7w5jDmB4gVENJcSrwEIPiR+dFd4/o/lKf8bh+Vieo9vlvwKvZ2ORQpQaJhQcUGX6up+HHiBE+DLcHXpxqo2RZTMnp8DjlTuE6K04jB5wuJ4e+ghGolstEGWZf38upGJd9KlgbBx8f0CPXxwE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=mNmu02P1; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="mNmu02P1" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328877; x=1739864877; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; 
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 21/27] KVM: x86: Save and reload SSP to/from SMRAM Date: Sun, 18 Feb 2024 23:47:27 -0800 Message-ID: <20240219074733.122080-22-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com>

Save the CET SSP to SMRAM on SMI and reload it on RSM. KVM emulates the hardware behavior when a guest enters/leaves SMM, i.e., it saves registers to SMRAM at the entry to SMM and reloads them at the exit from SMM. Per the SDM, SSP is one such register on 64-bit architectures, so add the support for SSP. Note, on 32-bit architectures SSP is not defined in SMRAM, so fail the launch of a 32-bit CET guest.

Suggested-by: Sean Christopherson Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/kvm/cpuid.c | 11 +++++++++++ arch/x86/kvm/smm.c | 8 ++++++++ arch/x86/kvm/smm.h | 2 +- 3 files changed, 20 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 2bb1931103ad..c0e13040e35b 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -149,6 +149,17 @@ static int kvm_check_cpuid(struct kvm_vcpu *vcpu, if (vaddr_bits != 48 && vaddr_bits != 57 && vaddr_bits != 0) return -EINVAL; } + /* + * Prevent 32-bit guest launch if shadow stack is exposed as SSP + * state is not defined for 32-bit SMRAM.
+ */ + best = cpuid_entry2_find(entries, nent, 0x80000001, + KVM_CPUID_INDEX_NOT_SIGNIFICANT); + if (best && !(best->edx & F(LM))) { + best = cpuid_entry2_find(entries, nent, 0x7, 0); + if (best && (best->ecx & F(SHSTK))) + return -EINVAL; + } /* * Exposing dynamic xfeatures to the guest requires additional diff --git a/arch/x86/kvm/smm.c b/arch/x86/kvm/smm.c index 45c855389ea7..7aac9c54c353 100644 --- a/arch/x86/kvm/smm.c +++ b/arch/x86/kvm/smm.c @@ -275,6 +275,10 @@ static void enter_smm_save_state_64(struct kvm_vcpu *vcpu, enter_smm_save_seg_64(vcpu, &smram->gs, VCPU_SREG_GS); smram->int_shadow = static_call(kvm_x86_get_interrupt_shadow)(vcpu); + + if (guest_can_use(vcpu, X86_FEATURE_SHSTK)) + KVM_BUG_ON(kvm_msr_read(vcpu, MSR_KVM_SSP, &smram->ssp), + vcpu->kvm); } #endif @@ -564,6 +568,10 @@ static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt, static_call(kvm_x86_set_interrupt_shadow)(vcpu, 0); ctxt->interruptibility = (u8)smstate->int_shadow; + if (guest_can_use(vcpu, X86_FEATURE_SHSTK)) + KVM_BUG_ON(kvm_msr_write(vcpu, MSR_KVM_SSP, smstate->ssp), + vcpu->kvm); + return X86EMUL_CONTINUE; } #endif diff --git a/arch/x86/kvm/smm.h b/arch/x86/kvm/smm.h index a1cf2ac5bd78..1e2a3e18207f 100644 --- a/arch/x86/kvm/smm.h +++ b/arch/x86/kvm/smm.h @@ -116,8 +116,8 @@ struct kvm_smram_state_64 { u32 smbase; u32 reserved4[5]; - /* ssp and svm_* fields below are not implemented by KVM */ u64 ssp; + /* svm_* fields below are not implemented by KVM */ u64 svm_guest_pat; u64 svm_host_efer; u64 svm_host_cr4; From patchwork Mon Feb 19 07:47:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562311 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5A901381C2; Mon, 19 Feb 2024 07:47:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328878; cv=none; b=enlj6DE1Xrp4J18hEVgnKTUD3+YwEzrc+Se/E+ii8i/N9LJyOMc8TljUSLVn+kSDnh4WwTAGdsRl6nUa3lAfrq13sZjdZSFGzzWLwRIJu1OuIoIMtNe9k7l5pRUloNPThR7sI9BbaO/5sCdLWliW3bWJ0ydg2pgARYgY2s2PMDw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328878; c=relaxed/simple; bh=lN6usg9zOqtCq4Znlmw51f35EKAS6w8CTKcm4FaLfzI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=d0IS06N/hcGps+s7Wh+Q+MYEu4JwP0l0fRLmsqmVKSiu11LyPAnC1woBsj6YsXY/6ojHihDnKw9aMG93G+2TmWcvcly3YXASCS3cZH8V0A88GmKkqzBs6ZIZqGzmgTX2pXXAjliqPCe1beDPoISQ577cn11hyxhZhxXUse55T0s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=lj+ayEti; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="lj+ayEti" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328876; x=1739864876; h=from:to:cc:subject:date:message-id:in-reply-to: 
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 22/27] KVM: VMX: Set up interception for CET MSRs Date: Sun, 18 Feb 2024 23:47:28 -0800 Message-ID: <20240219074733.122080-23-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com>

Enable/disable CET MSR interception per the associated feature configuration. The Shadow Stack feature requires all CET MSRs to be passed through to the guest in order to support it in both user and supervisor mode, while the IBT feature only depends on MSR_IA32_{U,S}_CET to enable user and supervisor IBT.

Note, this MSR design introduces an architectural limitation of SHSTK and IBT control for the guest, i.e., when SHSTK is exposed, IBT is also available to the guest from an architectural perspective since IBT relies on a subset of the SHSTK-relevant MSRs.

Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Reviewed-by: Chao Gao --- arch/x86/kvm/vmx/vmx.c | 43 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index ff2296fa7d39..24e921c4e7e3 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -159,7 +159,7 @@ module_param(allow_smaller_maxphyaddr, bool, S_IRUGO); /* * List of MSRs that can be directly passed to the guest. - * In addition to these x2apic and PT MSRs are handled specially. + * In addition to these x2apic/PT/CET MSRs are handled specially. */ static u32 vmx_possible_passthrough_msrs[MAX_POSSIBLE_PASSTHROUGH_MSRS] = { MSR_IA32_SPEC_CTRL, @@ -692,6 +692,10 @@ static bool is_valid_passthrough_msr(u32 msr) case MSR_LBR_CORE_TO ... MSR_LBR_CORE_TO + 8: /* LBR MSRs. These are handled in vmx_update_intercept_for_lbr_msrs() */ return true; + case MSR_IA32_U_CET: + case MSR_IA32_S_CET: + case MSR_IA32_PL0_SSP ...
MSR_IA32_INT_SSP_TAB: + return true; } r = possible_passthrough_msr_slot(msr) != -ENOENT; @@ -7767,6 +7771,41 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu) vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4)); } +static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu) +{ + bool incpt; + + if (kvm_cpu_cap_has(X86_FEATURE_SHSTK)) { + incpt = !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK); + + vmx_set_intercept_for_msr(vcpu, MSR_IA32_U_CET, + MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, + MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, + MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP, + MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP, + MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, + MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB, + MSR_TYPE_RW, incpt); + if (!incpt) + return; + } + + if (kvm_cpu_cap_has(X86_FEATURE_IBT)) { + incpt = !guest_cpuid_has(vcpu, X86_FEATURE_IBT); + + vmx_set_intercept_for_msr(vcpu, MSR_IA32_U_CET, + MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, + MSR_TYPE_RW, incpt); + } +} + static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -7845,6 +7884,8 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) /* Refresh #PF interception to account for MAXPHYADDR changes. */ vmx_update_exception_bitmap(vcpu); + + vmx_update_intercept_for_cet_msr(vcpu); } static u64 vmx_get_perf_capabilities(void) From patchwork Mon Feb 19 07:47:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562315 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 773AE3D96C; Mon, 19 Feb 2024 07:47:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328880; cv=none; b=q2qp5ifZFyEEqhkWjB9kG4Xnhm19WcFF068wpyOrt8bEwZrMJnDzeHw0J7rgXZuqfAtv7Bt4HcH24pElSvmoFyG+jHj4hKZ3ezbXC+wPp1ZsM8q19Rw6xc8SzoggSsvmKtKe14TaATqt6GF0cEpaQwVpcxLpRzW9tsp9gYW6Xk8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328880; c=relaxed/simple; bh=Uv5NXvZAXmi5JcnTTPv3AnGIZ1+vXF1EKPUlOm+C42U=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=YoVzcgFVGWTyDpbvGGft0GlqGiB/Oc9I/5vIUBdJUH2A6AWL9raFHHlPDWk1i2OFTM2v8DBhlpc4lzvygSGuWPGtkwoar3B54wdtusYjBB7PNtnCGs84i9hvsfLNMDOxt9VdCY98tWRHDBG3McZIz+QP80uxnwGi6JvJrQmnO0Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=C0FTN5mm; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="C0FTN5mm" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328878; x=1739864878; 
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 23/27] KVM: VMX: Set host constant supervisor states to VMCS fields Date: Sun, 18 Feb 2024 23:47:29 -0800 Message-ID: <20240219074733.122080-24-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com>

Save constant values to the HOST_{S_CET,SSP,INTR_SSP_TABLE} fields explicitly. Kernel IBT is supported and the setting in MSR_IA32_S_CET is static after boot (the exception is the BIOS call case, but a vCPU thread never crosses it), so KVM doesn't need to refresh the HOST_S_CET field before every VM-Enter/VM-Exit sequence.

Host supervisor shadow stack is not enabled now and SSP is not accessible in kernel mode, thus it's safe to set the host IA32_INT_SSP_TAB/SSP VMCS fields to 0. When shadow stack is enabled for CPL3, SSP is reloaded from PL3_SSP before the kernel exits to userspace. Check SDM Vol 2A/B Chapter 3/4 for SYSCALL/SYSRET/SYSENTER/SYSEXIT/RDSSP/CALL etc.

Prevent KVM module loading if host supervisor shadow stack SHSTK_EN is set in MSR_IA32_S_CET, as KVM cannot co-exist with it correctly.
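The shape of the change can be summarized with a short, self-contained C sketch of the "check once at load time, write constant host state once at VMCS setup" pattern. This is not KVM code: cet_host_state_init(), cet_set_constant_host_state(), the rdmsrl_s_cet()/vmcs_writel() stubs and the CET_SHSTK_EN value are stand-ins for the real kernel interfaces, assumed here only for illustration.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define CET_SHSTK_EN    (1ULL << 0)     /* assumed: IA32_S_CET shadow-stack-enable bit */

    static uint64_t host_s_cet;

    /* Stand-ins for the kernel's MSR/VMCS accessors. */
    static uint64_t rdmsrl_s_cet(void) { return 0; /* pretend host SSS is off */ }
    static void vmcs_writel(const char *field, uint64_t val)
    {
            printf("%s = 0x%llx\n", field, (unsigned long long)val);
    }

    /* Done once at module load: refuse to load if host supervisor SHSTK is on. */
    static bool cet_host_state_init(void)
    {
            host_s_cet = rdmsrl_s_cet();
            if (host_s_cet & CET_SHSTK_EN)
                    return false;   /* KVM would clobber host SSS state, bail out */
            return true;
    }

    /* Done once per VMCS setup: host CET state never changes afterwards. */
    static void cet_set_constant_host_state(void)
    {
            vmcs_writel("HOST_S_CET", host_s_cet);
            vmcs_writel("HOST_SSP", 0);             /* SSP comes from PL3_SSP on return to user */
            vmcs_writel("HOST_INTR_SSP_TABLE", 0);
    }

    int main(void)
    {
            if (cet_host_state_init())
                    cet_set_constant_host_state();
            return 0;
    }

Treating the host fields as constants avoids an MSR read and three VMCS writes on every VM-Entry, which is the design choice the commit message argues for.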
Suggested-by: Sean Christopherson Suggested-by: Chao Gao Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Reviewed-by: Chao Gao --- arch/x86/kvm/vmx/capabilities.h | 4 ++++ arch/x86/kvm/vmx/vmx.c | 15 +++++++++++++++ arch/x86/kvm/x86.c | 14 ++++++++++++++ arch/x86/kvm/x86.h | 1 + 4 files changed, 34 insertions(+) diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h index 41a4533f9989..ee8938818c8a 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -106,6 +106,10 @@ static inline bool cpu_has_load_perf_global_ctrl(void) return vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL; } +static inline bool cpu_has_load_cet_ctrl(void) +{ + return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_CET_STATE); +} static inline bool cpu_has_vmx_mpx(void) { return vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_BNDCFGS; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 24e921c4e7e3..342b5b94c892 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -4371,6 +4371,21 @@ void vmx_set_constant_host_state(struct vcpu_vmx *vmx) if (cpu_has_load_ia32_efer()) vmcs_write64(HOST_IA32_EFER, host_efer); + + /* + * Supervisor shadow stack is not enabled on host side, i.e., + * host IA32_S_CET.SHSTK_EN bit is guaranteed to 0 now, per SDM + * description(RDSSP instruction), SSP is not readable in CPL0, + * so resetting the two registers to 0s at VM-Exit does no harm + * to kernel execution. When execution flow exits to userspace, + * SSP is reloaded from IA32_PL3_SSP. Check SDM Vol.2A/B Chapter + * 3 and 4 for details. + */ + if (cpu_has_load_cet_ctrl()) { + vmcs_writel(HOST_S_CET, host_s_cet); + vmcs_writel(HOST_SSP, 0); + vmcs_writel(HOST_INTR_SSP_TABLE, 0); + } } void set_cr4_guest_host_mask(struct vcpu_vmx *vmx) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 281c3fe728c5..73a55d388dd9 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -114,6 +114,8 @@ static u64 __read_mostly efer_reserved_bits = ~((u64)EFER_SCE); #endif static u64 __read_mostly cr4_reserved_bits = CR4_RESERVED_BITS; +u64 __read_mostly host_s_cet; +EXPORT_SYMBOL_GPL(host_s_cet); #define KVM_EXIT_HYPERCALL_VALID_MASK (1 << KVM_HC_MAP_GPA_RANGE) @@ -9862,6 +9864,18 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) return -EIO; } + if (boot_cpu_has(X86_FEATURE_SHSTK)) { + rdmsrl(MSR_IA32_S_CET, host_s_cet); + /* + * Linux doesn't yet support supervisor shadow stacks (SSS), so + * KVM doesn't save/restore the associated MSRs, i.e. KVM may + * clobber the host values. Yell and refuse to load if SSS is + * unexpectedly enabled, e.g. to avoid crashing the host. 
+ */ + if (WARN_ON_ONCE(host_s_cet & CET_SHSTK_EN)) + return -EIO; + } + x86_emulator_cache = kvm_alloc_emulator_cache(); if (!x86_emulator_cache) { pr_err("failed to allocate cache for x86 emulator\n"); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 9c19dfb5011d..656107e64c93 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -325,6 +325,7 @@ fastpath_t handle_fastpath_set_msr_irqoff(struct kvm_vcpu *vcpu); extern u64 host_xcr0; extern u64 host_xss; extern u64 host_arch_capabilities; +extern u64 host_s_cet; extern struct kvm_caps kvm_caps; From patchwork Mon Feb 19 07:47:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562314 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C3E1F383AB; Mon, 19 Feb 2024 07:47:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328880; cv=none; b=SrfZNXPyO28tfbOUbLtgiIOdPNGbiwR9anqUpqL85LXLSl3/+iX3vUBN7gdNz5ylGx32t+N2Vno4DUn23N9DrxTeHS+XBu9E+eHahFTmhosB5RqcSPUYOda2SnUa6bEHogEto6PKt4d+ZG/Dkrh+ExOcf80yH8aF9y3+OYx/kxI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328880; c=relaxed/simple; bh=DCjPPc/YW+5nlsJpiVIipFWghIYTaoPxr0s0Kqx5JLU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=u7xM/fKpJxZLMN0Q9FlnnS00e7rTnQ+YqQbedcgrXTAnDq/dGfbztqUyUn0OPhDL7u2n0OrrmRtjtp470Qmlveb6HtFnS1ahnt6k53/fmbb5J0TRMxI4rDbU1RH5X/yvJkeGI3jELcOd5tPyu6RW5ukYZeMs+8/TYq2Gvne6pxg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=WnP60YHF; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="WnP60YHF" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328876; x=1739864876; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DCjPPc/YW+5nlsJpiVIipFWghIYTaoPxr0s0Kqx5JLU=; b=WnP60YHF/J7wMTr7Ui8Uca75iecRRrp/iMXkrLc86J7ICSeB1spGTQgT 7x1CH9j/+c3iOv//y7hS2oYZ0ISZFJUhWXw6iXm+xLd+e5sbw1OwbKthW xpiuO/Aa1Z84WzHp0iz4D7dGz/rUHVq3WSnAcjqtL2Y6qTjb75PpykieP OzTXdBZnztG/MAh/AXhdEgYgPllWygl6+hJTqaKSiAhmYDqksbGKyfnhm czAmzpgMYMn3fZQHPeBpWPSWBS17SmuO/M5u1HsHYxmbzLUJNUh+KS/xs oEb6NnieRuzLg0uxoGnrlqDWTOg038Gx1ue6evfGPZ+K+6xfyVW/kz+ND Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="2535165" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="2535165" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Feb 2024 23:47:45 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10988"; a="826966131" X-IronPort-AV: E=Sophos;i="6.06,170,1705392000"; d="scan'208";a="826966131" Received: from jf.jf.intel.com (HELO jf.intel.com) ([10.165.9.183]) by orsmga001-auth.jf.intel.com 
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 24/27] KVM: x86: Enable CET virtualization for VMX and advertise to userspace Date: Sun, 18 Feb 2024 23:47:30 -0800 Message-ID: <20240219074733.122080-25-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com>

Expose CET features to the guest if KVM and the host can support them, and clear the CPUID feature bits if they cannot. Set the CPUID feature bits so that CET features are available in guest CPUID. Add CR4.CET bit support in order to allow the guest to set the CET master control bit.

Disable the KVM CET feature if unrestricted_guest is unsupported/disabled, as KVM does not support emulating CET. The CET load-bits in the VM_ENTRY/VM_EXIT control fields should be set to keep guest CET xstates isolated from the host's.

On platforms with VMX_BASIC[bit56] == 0, injecting #CP at VMX entry with an error code will fail, while with VMX_BASIC[bit56] == 1, #CP injection with or without an error code is allowed. Disable the CET feature bits if that MSR bit is cleared so that a nested VMM can inject #CP if and only if VMX_BASIC[bit56] == 1.

Don't expose CET if either of the {U,S}_CET xstate bits is cleared in host XSS or if XSAVES isn't supported. CET MSR contents are architecturally 0 after reset, power-up and INIT, so clear the guest fpstate fields so that the guest MSRs read as 0 after those events.
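The "all or nothing" gating described above can be shown with a small, self-contained C sketch: CET is advertised only if both CET xstate components are supported, otherwise both SHSTK and IBT are withdrawn. The bit positions follow the XSS layout quoted in patch 18 (user = bit 11, supervisor = bit 12); the struct and function names are invented for this illustration and are not KVM's.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define XFEATURE_MASK_CET_USER      (1ULL << 11)
    #define XFEATURE_MASK_CET_KERNEL    (1ULL << 12)
    #define XFEATURE_MASK_CET_ALL       (XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL)

    struct cet_caps {
            bool shstk;
            bool ibt;
            uint64_t supported_xss;
    };

    static void cet_finalize_caps(struct cet_caps *c)
    {
            /* No CET feature at all -> don't advertise the CET xstate bits. */
            if (!c->shstk && !c->ibt)
                    c->supported_xss &= ~XFEATURE_MASK_CET_ALL;

            /* Both xstate components must be available, else drop SHSTK and IBT. */
            if ((c->supported_xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) {
                    c->shstk = false;
                    c->ibt = false;
                    c->supported_xss &= ~XFEATURE_MASK_CET_ALL;
            }
    }

    int main(void)
    {
            /* Example: host advertises only CET_USER in XSS -> CET is withdrawn. */
            struct cet_caps c = { .shstk = true, .ibt = true,
                                  .supported_xss = XFEATURE_MASK_CET_USER };

            cet_finalize_caps(&c);
            printf("shstk=%d ibt=%d xss=0x%llx\n", c.shstk, c.ibt,
                   (unsigned long long)c.supported_xss);
            return 0;
    }

The coupling between the feature bits and the xstate bits is intentional: leaving one without the other would let a guest enable state that KVM could not save or restore.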
Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/include/asm/msr-index.h | 1 + arch/x86/kvm/cpuid.c | 25 ++++++++++++++++++++----- arch/x86/kvm/vmx/capabilities.h | 6 ++++++ arch/x86/kvm/vmx/vmx.c | 30 +++++++++++++++++++++++++++++- arch/x86/kvm/vmx/vmx.h | 6 ++++-- arch/x86/kvm/x86.c | 26 ++++++++++++++++++++++++-- arch/x86/kvm/x86.h | 3 +++ 8 files changed, 88 insertions(+), 11 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 79f7c18c487b..3b263fa171a1 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -134,7 +134,7 @@ | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \ | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \ | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \ - | X86_CR4_LAM_SUP)) + | X86_CR4_LAM_SUP | X86_CR4_CET)) #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index f1bd7b91b3c6..4aa9aaa295f0 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -1110,6 +1110,7 @@ #define VMX_BASIC_MEM_TYPE_MASK 0x003c000000000000LLU #define VMX_BASIC_MEM_TYPE_WB 6LLU #define VMX_BASIC_INOUT 0x0040000000000000LLU +#define VMX_BASIC_NO_HW_ERROR_CODE_CC 0x0100000000000000LLU /* Resctrl MSRs: */ /* - Intel: */ diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index c0e13040e35b..d37f41472043 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -150,14 +150,14 @@ static int kvm_check_cpuid(struct kvm_vcpu *vcpu, return -EINVAL; } /* - * Prevent 32-bit guest launch if shadow stack is exposed as SSP - * state is not defined for 32-bit SMRAM. + * CET is not supported for 32-bit guest, prevent guest launch if + * shadow stack or IBT is enabled for 32-bit guest. */ best = cpuid_entry2_find(entries, nent, 0x80000001, KVM_CPUID_INDEX_NOT_SIGNIFICANT); if (best && !(best->edx & F(LM))) { best = cpuid_entry2_find(entries, nent, 0x7, 0); - if (best && (best->ecx & F(SHSTK))) + if (best && ((best->ecx & F(SHSTK)) || (best->edx & F(IBT)))) return -EINVAL; } @@ -665,7 +665,7 @@ void kvm_set_cpu_caps(void) F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) | F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) | F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ | - F(SGX_LC) | F(BUS_LOCK_DETECT) + F(SGX_LC) | F(BUS_LOCK_DETECT) | F(SHSTK) ); /* Set LA57 based on hardware capability. */ if (cpuid_ecx(7) & F(LA57)) @@ -683,7 +683,8 @@ void kvm_set_cpu_caps(void) F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) | F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) | F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16) | - F(AMX_TILE) | F(AMX_INT8) | F(AMX_BF16) | F(FLUSH_L1D) + F(AMX_TILE) | F(AMX_INT8) | F(AMX_BF16) | F(FLUSH_L1D) | + F(IBT) ); /* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */ @@ -696,6 +697,20 @@ void kvm_set_cpu_caps(void) kvm_cpu_cap_set(X86_FEATURE_INTEL_STIBP); if (boot_cpu_has(X86_FEATURE_AMD_SSBD)) kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD); + /* + * Don't use boot_cpu_has() to check availability of IBT because the + * feature bit is cleared in boot_cpu_data when ibt=off is applied + * in host cmdline. + * + * As currently there's no HW bug which requires disabling IBT feature + * while CPU can enumerate it, host cmdline option ibt=off is most + * likely due to administrative reason on host side, so KVM refers to + * CPU CPUID enumeration to enable the feature. 
In future if there's + * actually some bug clobbered ibt=off option, then enforce additional + * check here to disable the support in KVM. + */ + if (cpuid_edx(7) & F(IBT)) + kvm_cpu_cap_set(X86_FEATURE_IBT); kvm_cpu_cap_mask(CPUID_7_1_EAX, F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) | diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h index ee8938818c8a..e12bc233d88b 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -79,6 +79,12 @@ static inline bool cpu_has_vmx_basic_inout(void) return (((u64)vmcs_config.basic_cap << 32) & VMX_BASIC_INOUT); } +static inline bool cpu_has_vmx_basic_no_hw_errcode(void) +{ + return ((u64)vmcs_config.basic_cap << 32) & + VMX_BASIC_NO_HW_ERROR_CODE_CC; +} + static inline bool cpu_has_virtual_nmis(void) { return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS && diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 342b5b94c892..9df25c9e80f5 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2609,6 +2609,7 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf, { VM_ENTRY_LOAD_IA32_EFER, VM_EXIT_LOAD_IA32_EFER }, { VM_ENTRY_LOAD_BNDCFGS, VM_EXIT_CLEAR_BNDCFGS }, { VM_ENTRY_LOAD_IA32_RTIT_CTL, VM_EXIT_CLEAR_IA32_RTIT_CTL }, + { VM_ENTRY_LOAD_CET_STATE, VM_EXIT_LOAD_CET_STATE }, }; memset(vmcs_conf, 0, sizeof(*vmcs_conf)); @@ -4934,6 +4935,14 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0); /* 22.2.1 */ + if (kvm_cpu_cap_has(X86_FEATURE_SHSTK)) { + vmcs_writel(GUEST_SSP, 0); + vmcs_writel(GUEST_S_CET, 0); + vmcs_writel(GUEST_INTR_SSP_TABLE, 0); + } else if (kvm_cpu_cap_has(X86_FEATURE_IBT)) { + vmcs_writel(GUEST_S_CET, 0); + } + kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu); vpid_sync_context(vmx->vpid); @@ -6353,6 +6362,10 @@ void dump_vmcs(struct kvm_vcpu *vcpu) if (vmcs_read32(VM_EXIT_MSR_STORE_COUNT) > 0) vmx_dump_msrs("guest autostore", &vmx->msr_autostore.guest); + if (vmentry_ctl & VM_ENTRY_LOAD_CET_STATE) + pr_err("S_CET = 0x%016lx, SSP = 0x%016lx, SSP TABLE = 0x%016lx\n", + vmcs_readl(GUEST_S_CET), vmcs_readl(GUEST_SSP), + vmcs_readl(GUEST_INTR_SSP_TABLE)); pr_err("*** Host State ***\n"); pr_err("RIP = 0x%016lx RSP = 0x%016lx\n", vmcs_readl(HOST_RIP), vmcs_readl(HOST_RSP)); @@ -6383,6 +6396,10 @@ void dump_vmcs(struct kvm_vcpu *vcpu) vmcs_read64(HOST_IA32_PERF_GLOBAL_CTRL)); if (vmcs_read32(VM_EXIT_MSR_LOAD_COUNT) > 0) vmx_dump_msrs("host autoload", &vmx->msr_autoload.host); + if (vmexit_ctl & VM_EXIT_LOAD_CET_STATE) + pr_err("S_CET = 0x%016lx, SSP = 0x%016lx, SSP TABLE = 0x%016lx\n", + vmcs_readl(HOST_S_CET), vmcs_readl(HOST_SSP), + vmcs_readl(HOST_INTR_SSP_TABLE)); pr_err("*** Control State ***\n"); pr_err("CPUBased=0x%08x SecondaryExec=0x%08x TertiaryExec=0x%016llx\n", @@ -7965,7 +7982,6 @@ static __init void vmx_set_cpu_caps(void) kvm_cpu_cap_set(X86_FEATURE_UMIP); /* CPUID 0xD.1 */ - kvm_caps.supported_xss = 0; if (!cpu_has_vmx_xsaves()) kvm_cpu_cap_clear(X86_FEATURE_XSAVES); @@ -7977,6 +7993,18 @@ static __init void vmx_set_cpu_caps(void) if (cpu_has_vmx_waitpkg()) kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG); + + /* + * Disable CET if unrestricted_guest is unsupported as KVM doesn't + * enforce CET HW behaviors in emulator. On platforms with + * VMX_BASIC[bit56] == 0, inject #CP at VMX entry with error code + * fails, so disable CET in this case too. 
+ */ + if (!cpu_has_load_cet_ctrl() || !enable_unrestricted_guest || + !cpu_has_vmx_basic_no_hw_errcode()) { + kvm_cpu_cap_clear(X86_FEATURE_SHSTK); + kvm_cpu_cap_clear(X86_FEATURE_IBT); + } } static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index e3b0985bb74a..d0cad2624564 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -484,7 +484,8 @@ static inline u8 vmx_get_rvi(void) VM_ENTRY_LOAD_IA32_EFER | \ VM_ENTRY_LOAD_BNDCFGS | \ VM_ENTRY_PT_CONCEAL_PIP | \ - VM_ENTRY_LOAD_IA32_RTIT_CTL) + VM_ENTRY_LOAD_IA32_RTIT_CTL | \ + VM_ENTRY_LOAD_CET_STATE) #define __KVM_REQUIRED_VMX_VM_EXIT_CONTROLS \ (VM_EXIT_SAVE_DEBUG_CONTROLS | \ @@ -506,7 +507,8 @@ static inline u8 vmx_get_rvi(void) VM_EXIT_LOAD_IA32_EFER | \ VM_EXIT_CLEAR_BNDCFGS | \ VM_EXIT_PT_CONCEAL_PIP | \ - VM_EXIT_CLEAR_IA32_RTIT_CTL) + VM_EXIT_CLEAR_IA32_RTIT_CTL | \ + VM_EXIT_LOAD_CET_STATE) #define KVM_REQUIRED_VMX_PIN_BASED_VM_EXEC_CONTROL \ (PIN_BASED_EXT_INTR_MASK | \ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 73a55d388dd9..cd656099fbfd 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -231,7 +231,8 @@ static struct kvm_user_return_msrs __percpu *user_return_msrs; | XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \ | XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE) -#define KVM_SUPPORTED_XSS 0 +#define KVM_SUPPORTED_XSS (XFEATURE_MASK_CET_USER | \ + XFEATURE_MASK_CET_KERNEL) u64 __read_mostly host_efer; EXPORT_SYMBOL_GPL(host_efer); @@ -9943,6 +9944,20 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES)) kvm_caps.supported_xss = 0; + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) && + !kvm_cpu_cap_has(X86_FEATURE_IBT)) + kvm_caps.supported_xss &= ~(XFEATURE_MASK_CET_USER | + XFEATURE_MASK_CET_KERNEL); + + if ((kvm_caps.supported_xss & (XFEATURE_MASK_CET_USER | + XFEATURE_MASK_CET_KERNEL)) != + (XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL)) { + kvm_cpu_cap_clear(X86_FEATURE_SHSTK); + kvm_cpu_cap_clear(X86_FEATURE_IBT); + kvm_caps.supported_xss &= ~(XFEATURE_MASK_CET_USER | + XFEATURE_MASK_CET_KERNEL); + } + #define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f) cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_); #undef __kvm_cpu_cap_has @@ -12402,7 +12417,9 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) } #define XSTATE_NEED_RESET_MASK (XFEATURE_MASK_BNDREGS | \ - XFEATURE_MASK_BNDCSR) + XFEATURE_MASK_BNDCSR | \ + XFEATURE_MASK_CET_USER | \ + XFEATURE_MASK_CET_KERNEL) static bool kvm_vcpu_has_xstate(unsigned long xfeature) { @@ -12410,6 +12427,11 @@ static bool kvm_vcpu_has_xstate(unsigned long xfeature) case XFEATURE_MASK_BNDREGS: case XFEATURE_MASK_BNDCSR: return kvm_cpu_cap_has(X86_FEATURE_MPX); + case XFEATURE_CET_USER: + return kvm_cpu_cap_has(X86_FEATURE_SHSTK) || + kvm_cpu_cap_has(X86_FEATURE_IBT); + case XFEATURE_CET_KERNEL: + return kvm_cpu_cap_has(X86_FEATURE_SHSTK); default: return false; } diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 656107e64c93..cc585051d24b 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -533,6 +533,9 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type); __reserved_bits |= X86_CR4_PCIDE; \ if (!__cpu_has(__c, X86_FEATURE_LAM)) \ __reserved_bits |= X86_CR4_LAM_SUP; \ + if (!__cpu_has(__c, X86_FEATURE_SHSTK) && \ + !__cpu_has(__c, X86_FEATURE_IBT)) \ + __reserved_bits |= X86_CR4_CET; \ __reserved_bits; \ }) From patchwork Mon Feb 19 07:47:31 2024 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562313
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 25/27] KVM: nVMX: Introduce new VMX_BASIC bit for event error_code delivery to L1 Date: Sun, 18 Feb 2024 23:47:31 -0800 Message-ID: <20240219074733.122080-26-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0
Per the SDM (Vol.3D, Appendix A.1): "If bit 56 is read as 1, software can use VM entry to deliver a hardware exception with or without an error code, regardless of vector." Modify the has_error_code check performed before injecting events into the nested guest: delivering an error code is still rejected when the guest is in real mode or the interruption type is not hardware exception, while the "error code matches the vector" consistency check is enforced only when the vCPU doesn't enumerate bit 56 in VMX_BASIC. In all other cases the check is skipped, which keeps the logic consistent with the SDM.
Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Reviewed-by: Chao Gao ---
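A minimal stand-alone sketch of the intended check (an illustrative note below the cut, not part of the applied patch; the helper name and boolean parameters are invented here and simply mirror the variables used in the nested_check_vm_entry_controls() hunk below):

#include <stdbool.h>

/*
 * Illustrative only: in real mode, or when the event is not a hardware
 * exception, no error code may be delivered.  Otherwise the "error code
 * iff the vector architecturally has one" rule is enforced only when L1
 * does not enumerate VMX_BASIC[56] (VMX_BASIC_NO_HW_ERROR_CODE_CC).
 */
static bool vm_entry_error_code_valid(bool prot_mode, bool hard_exception,
				      bool has_error_code,
				      bool vector_has_error_code,
				      bool no_hw_errcode_cc)
{
	if (!prot_mode || !hard_exception)
		return !has_error_code;
	if (no_hw_errcode_cc)
		return true;
	return has_error_code == vector_has_error_code;
}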
arch/x86/kvm/vmx/nested.c | 27 ++++++++++++++++++--------- arch/x86/kvm/vmx/nested.h | 5 +++++ 2 files changed, 23 insertions(+), 9 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 4be0078ca713..0439208523b8 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -1230,9 +1230,9 @@ static int vmx_restore_vmx_basic(struct vcpu_vmx *vmx, u64 data) { const u64 feature_and_reserved = /* feature (except bit 48; see below) */ - BIT_ULL(49) | BIT_ULL(54) | BIT_ULL(55) | + BIT_ULL(49) | BIT_ULL(54) | BIT_ULL(55) | BIT_ULL(56) | /* reserved */ - BIT_ULL(31) | GENMASK_ULL(47, 45) | GENMASK_ULL(63, 56); + BIT_ULL(31) | GENMASK_ULL(47, 45) | GENMASK_ULL(63, 57); u64 vmx_basic = vmcs_config.nested.basic; if (!is_bitwise_subset(vmx_basic, data, feature_and_reserved)) @@ -2865,7 +2865,6 @@ static int nested_check_vm_entry_controls(struct kvm_vcpu *vcpu, u8 vector = intr_info & INTR_INFO_VECTOR_MASK; u32 intr_type = intr_info & INTR_INFO_INTR_TYPE_MASK; bool has_error_code = intr_info & INTR_INFO_DELIVER_CODE_MASK; - bool should_have_error_code; bool urg = nested_cpu_has2(vmcs12, SECONDARY_EXEC_UNRESTRICTED_GUEST); bool prot_mode = !urg || vmcs12->guest_cr0 & X86_CR0_PE; @@ -2882,12 +2881,20 @@ static int nested_check_vm_entry_controls(struct kvm_vcpu *vcpu, CC(intr_type == INTR_TYPE_OTHER_EVENT && vector != 0)) return -EINVAL; - /* VM-entry interruption-info field: deliver error code */ - should_have_error_code = - intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode && - x86_exception_has_error_code(vector); - if (CC(has_error_code != should_have_error_code)) - return -EINVAL; + /* + * Cannot deliver error code in real mode or if the interrupt + * type is not hardware exception. For other cases, do the + * consistency check only if the vCPU doesn't enumerate + * VMX_BASIC_NO_HW_ERROR_CODE_CC.
+ */ + if (!prot_mode || intr_type != INTR_TYPE_HARD_EXCEPTION) { + if (CC(has_error_code)) + return -EINVAL; + } else if (!nested_cpu_has_no_hw_errcode_cc(vcpu)) { + if (CC(has_error_code != + x86_exception_has_error_code(vector))) + return -EINVAL; + } /* VM-entry exception error code */ if (CC(has_error_code && @@ -7011,6 +7018,8 @@ static void nested_vmx_setup_basic(struct nested_vmx_msrs *msrs) if (cpu_has_vmx_basic_inout()) msrs->basic |= VMX_BASIC_INOUT; + if (cpu_has_vmx_basic_no_hw_errcode()) + msrs->basic |= VMX_BASIC_NO_HW_ERROR_CODE_CC; } static void nested_vmx_setup_cr_fixed(struct nested_vmx_msrs *msrs) diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h index cce4e2aa30fb..747061c2aeb9 100644 --- a/arch/x86/kvm/vmx/nested.h +++ b/arch/x86/kvm/vmx/nested.h @@ -285,6 +285,11 @@ static inline bool nested_cr4_valid(struct kvm_vcpu *vcpu, unsigned long val) __kvm_is_valid_cr4(vcpu, val); } +static inline bool nested_cpu_has_no_hw_errcode_cc(struct kvm_vcpu *vcpu) +{ + return to_vmx(vcpu)->nested.msrs.basic & VMX_BASIC_NO_HW_ERROR_CODE_CC; +} + /* No difference in the restrictions on guest and host CR4 in VMX operation. */ #define nested_guest_cr4_valid nested_cr4_valid #define nested_host_cr4_valid nested_cr4_valid From patchwork Mon Feb 19 07:47:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562316 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A46EA40BF4; Mon, 19 Feb 2024 07:47:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328881; cv=none; b=Wf4pG66TN4Sl9AUScvBxP0tEmNkgjN0vpGQtfw5MrZlMpObTT6RzdftBp+nLW4TaYTC+hQ3WPCNdzA8rH1JLR+pmpsjnto0peOuoo6D8kQIVXuj0q5Xe0erMQfm70SDuOn2R8QVEvKrEKEFN+AWpwgbxT/cmDa2RPr2EKEPPejA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328881; c=relaxed/simple; bh=beytBB2aXs9EkybmEc3IsWJYZyRIF+fS294AYqn0udQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=XW+T04SSgmb31B0Y3a4TPlc7Nn87RKdWe30cAzK0I+AtM1NXg8r3VKKDCW1dUPhzbvXVVd4rzUsJS8eo/CGYJduDPrlvIti03CtSCIHnP4gQDIAV3teyH2bxV9Q45TNg8FH91rzSFlI9vEeHSAT9aF7sUuwXnI+qLU7e6WvXSbI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Miz9tD+i; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Miz9tD+i" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328879; x=1739864879; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=beytBB2aXs9EkybmEc3IsWJYZyRIF+fS294AYqn0udQ=; b=Miz9tD+i6o8uuJTqLos0ELcMiqIGQ+BjLMiQShRyRjbz/VinrFhk7owD MaIaAC/AHuMiUG75VXY7UKGDzRg9M88JXGdwNGsA1cg6B+ryBQIg8mht1 ybfGFd29cjLIindKTzAPRH1M+0bQk8qJpkE6gwPCI2sp33kKgEThoUbw/ 
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 26/27] KVM: nVMX: Enable CET support for nested guest Date: Sun, 18 Feb 2024 23:47:32 -0800 Message-ID: <20240219074733.122080-27-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0
Set up the CET MSRs, the related VM_ENTRY/EXIT control bits and the fixed CR4 setting to enable CET for nested VMs. vmcs12 and vmcs02 need to be synced when L2 exits to L1 or when L1 resumes L2, so that the correct CET state is observed at each level.
Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky --- arch/x86/kvm/vmx/nested.c | 80 ++++++++++++++++++++++++++++++++++++++- arch/x86/kvm/vmx/vmcs12.c | 6 +++ arch/x86/kvm/vmx/vmcs12.h | 14 ++++++- arch/x86/kvm/vmx/vmx.c | 2 + arch/x86/kvm/vmx/vmx.h | 3 ++ 5 files changed, 102 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 0439208523b8..d0311260270c 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -691,6 +691,28 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu, nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, MSR_IA32_FLUSH_CMD, MSR_TYPE_W); + /* Pass CET MSRs to nested VM if L0 and L1 are set to pass-through.
*/ + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_U_CET, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_S_CET, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL0_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL1_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL2_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL3_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW); + kvm_vcpu_unmap(vcpu, &vmx->nested.msr_bitmap_map, false); vmx->nested.force_msr_bitmap_recalc = false; @@ -2438,6 +2460,30 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0 } } +static inline void cet_vmcs_fields_get(struct kvm_vcpu *vcpu, u64 *ssp, + u64 *s_cet, u64 *ssp_tbl) +{ + if (guest_can_use(vcpu, X86_FEATURE_SHSTK)) { + *ssp = vmcs_readl(GUEST_SSP); + *s_cet = vmcs_readl(GUEST_S_CET); + *ssp_tbl = vmcs_readl(GUEST_INTR_SSP_TABLE); + } else if (guest_can_use(vcpu, X86_FEATURE_IBT)) { + *s_cet = vmcs_readl(GUEST_S_CET); + } +} + +static inline void cet_vmcs_fields_put(struct kvm_vcpu *vcpu, u64 ssp, + u64 s_cet, u64 ssp_tbl) +{ + if (guest_can_use(vcpu, X86_FEATURE_SHSTK)) { + vmcs_writel(GUEST_SSP, ssp); + vmcs_writel(GUEST_S_CET, s_cet); + vmcs_writel(GUEST_INTR_SSP_TABLE, ssp_tbl); + } else if (guest_can_use(vcpu, X86_FEATURE_IBT)) { + vmcs_writel(GUEST_S_CET, s_cet); + } +} + static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12) { struct hv_enlightened_vmcs *hv_evmcs = nested_vmx_evmcs(vmx); @@ -2553,6 +2599,11 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12) vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr); vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr); + if (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_CET_STATE) + cet_vmcs_fields_put(&vmx->vcpu, vmcs12->guest_ssp, + vmcs12->guest_s_cet, + vmcs12->guest_ssp_tbl); + set_cr4_guest_host_mask(vmx); } @@ -2591,6 +2642,13 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, kvm_set_dr(vcpu, 7, vcpu->arch.dr7); vmcs_write64(GUEST_IA32_DEBUGCTL, vmx->nested.pre_vmenter_debugctl); } + + if (!vmx->nested.nested_run_pending || + !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_CET_STATE)) + cet_vmcs_fields_put(vcpu, vmx->nested.pre_vmenter_ssp, + vmx->nested.pre_vmenter_s_cet, + vmx->nested.pre_vmenter_ssp_tbl); + if (kvm_mpx_supported() && (!vmx->nested.nested_run_pending || !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))) vmcs_write64(GUEST_BNDCFGS, vmx->nested.pre_vmenter_bndcfgs); @@ -3471,6 +3529,12 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))) vmx->nested.pre_vmenter_bndcfgs = vmcs_read64(GUEST_BNDCFGS); + if (!vmx->nested.nested_run_pending || + !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_CET_STATE)) + cet_vmcs_fields_get(vcpu, &vmx->nested.pre_vmenter_ssp, + &vmx->nested.pre_vmenter_s_cet, + &vmx->nested.pre_vmenter_ssp_tbl); + /* * Overwrite vmcs01.GUEST_CR3 with L1's CR3 if EPT is disabled *and* * nested early checks are disabled. 
In the event of a "late" VM-Fail, @@ -4294,6 +4358,9 @@ static bool is_vmcs12_ext_field(unsigned long field) case GUEST_IDTR_BASE: case GUEST_PENDING_DBG_EXCEPTIONS: case GUEST_BNDCFGS: + case GUEST_SSP: + case GUEST_S_CET: + case GUEST_INTR_SSP_TABLE: return true; default: break; @@ -4344,6 +4411,10 @@ static void sync_vmcs02_to_vmcs12_rare(struct kvm_vcpu *vcpu, vmcs12->guest_pending_dbg_exceptions = vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS); + cet_vmcs_fields_get(&vmx->vcpu, &vmcs12->guest_ssp, + &vmcs12->guest_s_cet, + &vmcs12->guest_ssp_tbl); + vmx->nested.need_sync_vmcs02_to_vmcs12_rare = false; } @@ -4569,6 +4640,10 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu, if (vmcs12->vm_exit_controls & VM_EXIT_CLEAR_BNDCFGS) vmcs_write64(GUEST_BNDCFGS, 0); + if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_CET_STATE) + cet_vmcs_fields_put(vcpu, vmcs12->host_ssp, vmcs12->host_s_cet, + vmcs12->host_ssp_tbl); + if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PAT) { vmcs_write64(GUEST_IA32_PAT, vmcs12->host_ia32_pat); vcpu->arch.pat = vmcs12->host_ia32_pat; @@ -6840,7 +6915,7 @@ static void nested_vmx_setup_exit_ctls(struct vmcs_config *vmcs_conf, VM_EXIT_HOST_ADDR_SPACE_SIZE | #endif VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT | - VM_EXIT_CLEAR_BNDCFGS; + VM_EXIT_CLEAR_BNDCFGS | VM_EXIT_LOAD_CET_STATE; msrs->exit_ctls_high |= VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR | VM_EXIT_LOAD_IA32_EFER | VM_EXIT_SAVE_IA32_EFER | @@ -6862,7 +6937,8 @@ static void nested_vmx_setup_entry_ctls(struct vmcs_config *vmcs_conf, #ifdef CONFIG_X86_64 VM_ENTRY_IA32E_MODE | #endif - VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS; + VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS | + VM_ENTRY_LOAD_CET_STATE; msrs->entry_ctls_high |= (VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR | VM_ENTRY_LOAD_IA32_EFER | VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL); diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c index 106a72c923ca..4233b5ca9461 100644 --- a/arch/x86/kvm/vmx/vmcs12.c +++ b/arch/x86/kvm/vmx/vmcs12.c @@ -139,6 +139,9 @@ const unsigned short vmcs12_field_offsets[] = { FIELD(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions), FIELD(GUEST_SYSENTER_ESP, guest_sysenter_esp), FIELD(GUEST_SYSENTER_EIP, guest_sysenter_eip), + FIELD(GUEST_S_CET, guest_s_cet), + FIELD(GUEST_SSP, guest_ssp), + FIELD(GUEST_INTR_SSP_TABLE, guest_ssp_tbl), FIELD(HOST_CR0, host_cr0), FIELD(HOST_CR3, host_cr3), FIELD(HOST_CR4, host_cr4), @@ -151,5 +154,8 @@ const unsigned short vmcs12_field_offsets[] = { FIELD(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip), FIELD(HOST_RSP, host_rsp), FIELD(HOST_RIP, host_rip), + FIELD(HOST_S_CET, host_s_cet), + FIELD(HOST_SSP, host_ssp), + FIELD(HOST_INTR_SSP_TABLE, host_ssp_tbl), }; const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs12_field_offsets); diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h index 01936013428b..3884489e7f7e 100644 --- a/arch/x86/kvm/vmx/vmcs12.h +++ b/arch/x86/kvm/vmx/vmcs12.h @@ -117,7 +117,13 @@ struct __packed vmcs12 { natural_width host_ia32_sysenter_eip; natural_width host_rsp; natural_width host_rip; - natural_width paddingl[8]; /* room for future expansion */ + natural_width host_s_cet; + natural_width host_ssp; + natural_width host_ssp_tbl; + natural_width guest_s_cet; + natural_width guest_ssp; + natural_width guest_ssp_tbl; + natural_width paddingl[2]; /* room for future expansion */ u32 pin_based_vm_exec_control; u32 cpu_based_vm_exec_control; u32 exception_bitmap; @@ -292,6 +298,12 @@ static inline void vmx_check_vmcs12_offsets(void) 
CHECK_OFFSET(host_ia32_sysenter_eip, 656); CHECK_OFFSET(host_rsp, 664); CHECK_OFFSET(host_rip, 672); + CHECK_OFFSET(host_s_cet, 680); + CHECK_OFFSET(host_ssp, 688); + CHECK_OFFSET(host_ssp_tbl, 696); + CHECK_OFFSET(guest_s_cet, 704); + CHECK_OFFSET(guest_ssp, 712); + CHECK_OFFSET(guest_ssp_tbl, 720); CHECK_OFFSET(pin_based_vm_exec_control, 744); CHECK_OFFSET(cpu_based_vm_exec_control, 748); CHECK_OFFSET(exception_bitmap, 752); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 9df25c9e80f5..210724d2151c 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7727,6 +7727,8 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu) cr4_fixed1_update(X86_CR4_PKE, ecx, feature_bit(PKU)); cr4_fixed1_update(X86_CR4_UMIP, ecx, feature_bit(UMIP)); cr4_fixed1_update(X86_CR4_LA57, ecx, feature_bit(LA57)); + cr4_fixed1_update(X86_CR4_CET, ecx, feature_bit(SHSTK)); + cr4_fixed1_update(X86_CR4_CET, edx, feature_bit(IBT)); entry = kvm_find_cpuid_entry_index(vcpu, 0x7, 1); cr4_fixed1_update(X86_CR4_LAM_SUP, eax, feature_bit(LAM)); diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index d0cad2624564..3c1de37728fe 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -224,6 +224,9 @@ struct nested_vmx { */ u64 pre_vmenter_debugctl; u64 pre_vmenter_bndcfgs; + u64 pre_vmenter_ssp; + u64 pre_vmenter_s_cet; + u64 pre_vmenter_ssp_tbl; /* to migrate it to L1 if L2 writes to L1's CR8 directly */ int l1_tpr_threshold; From patchwork Mon Feb 19 07:47:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13562317 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 155FE41C7A; Mon, 19 Feb 2024 07:48:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328881; cv=none; b=D2Tkp+fk+yl3sAxQ+5Vb6atSorC+phX8yrdJsYrZoZ6ON4Kuuol0LqL192x4Dejk4MHDhn7ScruF5la/FWo69ezq0lb6q2spAqLkSeCqVb+fo+tO1ec+eAz7cQ3mZdoQj4S2W7h/rDYm+SdMZwwPj+yLJVQQ7u1uulw31lVd/w0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708328881; c=relaxed/simple; bh=7BaBpJwT/sD2IyfBYcZweX9LwIqZGDL72eD5AtTEaCU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bwyFHZ+tmLiilBMIVlEPuAy1uEIuAxTuSU11XzGAkc6tpeTQQXyqFMVJQYjSKULwnTfCnJjRcwV1cXfi+YI/AxDmKjYc5hvhqfUiJceBU4/aj6VrEs/zkz/CaO2fwKViL2xD8B75P7ziS2lLKO0v3RIoya3MDFHoirSg4jw8PkY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=eBjXNDgF; arc=none smtp.client-ip=198.175.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="eBjXNDgF" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708328880; x=1739864880; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; 
From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, mlevitsk@redhat.com, john.allen@amd.com, weijiang.yang@intel.com Subject: [PATCH v10 27/27] KVM: x86: Don't emulate instructions guarded by CET Date: Sun, 18 Feb 2024 23:47:33 -0800 Message-ID: <20240219074733.122080-28-weijiang.yang@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240219074733.122080-1-weijiang.yang@intel.com> References: <20240219074733.122080-1-weijiang.yang@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0
Don't emulate branch instructions, e.g., CALL/RET/JMP etc., when CET is active in the guest; return KVM_INTERNAL_ERROR_EMULATION to userspace so it can handle the situation. KVM doesn't emulate the CPU behaviors that enforce CET protections while emulating guest instructions; instead, it stops emulation when it detects that the instruction being emulated is CET protected. By doing so it avoids generating bogus #CP exceptions in the guest and prevents subversion of CET-protected execution flow from the guest side.
Suggested-by: Chao Gao Signed-off-by: Yang Weijiang --- arch/x86/kvm/emulate.c | 46 ++++++++++++++++++++++++++++++++---------- 1 file changed, 35 insertions(+), 11 deletions(-) diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 8ccc17eb78ca..c18616d24ac9 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -178,6 +178,8 @@ #define IncSP ((u64)1 << 54) /* SP is incremented before ModRM calc */ #define TwoMemOp ((u64)1 << 55) /* Instruction has two memory operand */ #define IsBranch ((u64)1 << 56) /* Instruction is considered a branch. */ +#define ShadowStack ((u64)1 << 57) /* Instruction protected by Shadow Stack. */ +#define IndirBrnTrk ((u64)1 << 58) /* Instruction protected by IBT.
*/ #define DstXacc (DstAccLo | SrcAccHi | SrcWrite) @@ -4100,9 +4102,11 @@ static const struct opcode group4[] = { static const struct opcode group5[] = { F(DstMem | SrcNone | Lock, em_inc), F(DstMem | SrcNone | Lock, em_dec), - I(SrcMem | NearBranch | IsBranch, em_call_near_abs), - I(SrcMemFAddr | ImplicitOps | IsBranch, em_call_far), - I(SrcMem | NearBranch | IsBranch, em_jmp_abs), + I(SrcMem | NearBranch | IsBranch | ShadowStack | IndirBrnTrk, + em_call_near_abs), + I(SrcMemFAddr | ImplicitOps | IsBranch | ShadowStack | IndirBrnTrk, + em_call_far), + I(SrcMem | NearBranch | IsBranch | IndirBrnTrk, em_jmp_abs), I(SrcMemFAddr | ImplicitOps | IsBranch, em_jmp_far), I(SrcMem | Stack | TwoMemOp, em_push), D(Undefined), }; @@ -4364,11 +4368,11 @@ static const struct opcode opcode_table[256] = { /* 0xC8 - 0xCF */ I(Stack | SrcImmU16 | Src2ImmByte | IsBranch, em_enter), I(Stack | IsBranch, em_leave), - I(ImplicitOps | SrcImmU16 | IsBranch, em_ret_far_imm), - I(ImplicitOps | IsBranch, em_ret_far), - D(ImplicitOps | IsBranch), DI(SrcImmByte | IsBranch, intn), + I(ImplicitOps | SrcImmU16 | IsBranch | ShadowStack, em_ret_far_imm), + I(ImplicitOps | IsBranch | ShadowStack, em_ret_far), + D(ImplicitOps | IsBranch), DI(SrcImmByte | IsBranch | ShadowStack, intn), D(ImplicitOps | No64 | IsBranch), - II(ImplicitOps | IsBranch, em_iret, iret), + II(ImplicitOps | IsBranch | ShadowStack, em_iret, iret), /* 0xD0 - 0xD7 */ G(Src2One | ByteOp, group2), G(Src2One, group2), G(Src2CL | ByteOp, group2), G(Src2CL, group2), @@ -4384,7 +4388,7 @@ static const struct opcode opcode_table[256] = { I2bvIP(SrcImmUByte | DstAcc, em_in, in, check_perm_in), I2bvIP(SrcAcc | DstImmUByte, em_out, out, check_perm_out), /* 0xE8 - 0xEF */ - I(SrcImm | NearBranch | IsBranch, em_call), + I(SrcImm | NearBranch | IsBranch | ShadowStack, em_call), D(SrcImm | ImplicitOps | NearBranch | IsBranch), I(SrcImmFAddr | No64 | IsBranch, em_jmp_far), D(SrcImmByte | ImplicitOps | NearBranch | IsBranch), @@ -4403,7 +4407,8 @@ static const struct opcode opcode_table[256] = { static const struct opcode twobyte_table[256] = { /* 0x00 - 0x0F */ G(0, group6), GD(0, &group7), N, N, - N, I(ImplicitOps | EmulateOnUD | IsBranch, em_syscall), + N, I(ImplicitOps | EmulateOnUD | IsBranch | ShadowStack | IndirBrnTrk, + em_syscall), II(ImplicitOps | Priv, em_clts, clts), N, DI(ImplicitOps | Priv, invd), DI(ImplicitOps | Priv, wbinvd), N, N, N, D(ImplicitOps | ModRM | SrcMem | NoAccess), N, N, @@ -4434,8 +4439,9 @@ static const struct opcode twobyte_table[256] = { IIP(ImplicitOps, em_rdtsc, rdtsc, check_rdtsc), II(ImplicitOps | Priv, em_rdmsr, rdmsr), IIP(ImplicitOps, em_rdpmc, rdpmc, check_rdpmc), - I(ImplicitOps | EmulateOnUD | IsBranch, em_sysenter), - I(ImplicitOps | Priv | EmulateOnUD | IsBranch, em_sysexit), + I(ImplicitOps | EmulateOnUD | IsBranch | ShadowStack | IndirBrnTrk, + em_sysenter), + I(ImplicitOps | Priv | EmulateOnUD | IsBranch | ShadowStack, em_sysexit), N, N, N, N, N, N, N, N, N, N, /* 0x40 - 0x4F */ @@ -4973,6 +4979,24 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len, int if (ctxt->d == 0) return EMULATION_FAILED; + if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_CET) { + u64 u_cet, s_cet; + bool stop_em; + + if (ctxt->ops->get_msr(ctxt, MSR_IA32_U_CET, &u_cet) || + ctxt->ops->get_msr(ctxt, MSR_IA32_S_CET, &s_cet)) + return EMULATION_FAILED; + + stop_em = ((u_cet & CET_SHSTK_EN) || (s_cet & CET_SHSTK_EN)) && + (opcode.flags & ShadowStack); + + stop_em |= ((u_cet & CET_ENDBR_EN) || (s_cet & CET_ENDBR_EN)) && + 
(opcode.flags & IndirBrnTrk); + + if (stop_em) + return EMULATION_FAILED; + } + ctxt->execute = opcode.u.execute; if (unlikely(emulation_type & EMULTYPE_TRAP_UD) &&