From patchwork Thu Apr 10 07:24:41 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chao Gao X-Patchwork-Id: 14046011 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 15526208989; Thu, 10 Apr 2025 07:22:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.11 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744269745; cv=none; b=ZGQcEsRv7CHbELojfv3ycUoyIWP+FbqyH2IxiZo9/UsDG654nI7zf1glii1rd8g/ND+faN1DAGLw58ehtKwR8E+ia7S72wIlE0G+HOaqkob9hALLzp0BfzUrC6s9MNE+jpZ0rwcNbPUQdRNOAI3YA12Q6Pitx05bOUmO0qJSLh0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744269745; c=relaxed/simple; bh=x75cG+zq9kKplLQMO26hjWO7/9o8dSy+vB9he+XJNfQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=RV1rP2CVfDaj2vLnQDLZgxEhkOuMKByhVlLWxTERGyHxiUBITRB6gOz9COdbYlI3GZJPW2jJu0eSLxNsWJERKkQuC+J2R0WJAlCdcvObmQIx5J84KhuCrznZBS/rxUMPmucA3oiRVQD0RqLCRt7vFWrTVw87HMxiMTYpHZ2zuSg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=MUv+aXEj; arc=none smtp.client-ip=192.198.163.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="MUv+aXEj" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1744269741; x=1775805741; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=x75cG+zq9kKplLQMO26hjWO7/9o8dSy+vB9he+XJNfQ=; b=MUv+aXEjpEW6JOM1Zfu2QuT3c+Y8drzTDPJsqJlZvuohidrzMSm1N+Xx IcQHKMEeodh/gR4v3wQevl408aFGNjeG5qzFc8FJ+PHrkm4ChhUipT5A2 eJS3SC+E6DlbiLF3wKX9iweBVhMQmNFWy/BPqnI/dVr+U0n6nd5A2RquG rllnCzGTLw1iLCEFVuiFijWDCnhRJKVfJgWQZu4yYSGYlozwv8CJ3oJJz LI7mwxknG9dod01kmraHcNon2rGdTidLxJajWPB4NrXK5QjidhbIFa64n 2vBKfq/A8JkF3Y56rsbPS2A6Yg2p5H/FG+hu5w1ZEfss+Zpzp5/ev2UeE Q==; X-CSE-ConnectionGUID: FcEaZsXWRKCYSUQ79EvcFw== X-CSE-MsgGUID: OB2i6Qb+S62DqPOLJNaSgQ== X-IronPort-AV: E=McAfee;i="6700,10204,11399"; a="56439280" X-IronPort-AV: E=Sophos;i="6.15,202,1739865600"; d="scan'208";a="56439280" Received: from orviesa006.jf.intel.com ([10.64.159.146]) by fmvoesa105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Apr 2025 00:22:21 -0700 X-CSE-ConnectionGUID: FtdSiW5ISnut5H2OZgeVzg== X-CSE-MsgGUID: kS6gRNAwR22t/AODvzyTaA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,202,1739865600"; d="scan'208";a="128778096" Received: from spr.sh.intel.com ([10.239.53.19]) by orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Apr 2025 00:22:15 -0700 From: Chao Gao To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, tglx@linutronix.de, dave.hansen@intel.com, seanjc@google.com, pbonzini@redhat.com Cc: peterz@infradead.org, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, bp@alien8.de, chang.seok.bae@intel.com, xin3.li@intel.com, Maxim Levitsky , Chao Gao , Ingo Molnar , Dave Hansen , "H. 
Peter Anvin", Samuel Holland, Mitchell Levy, Vignesh Balasubramanian, Aruna Ramakrishna
Subject: [PATCH v5 1/7] x86/fpu/xstate: Always preserve non-user xfeatures/flags in __state_perm
Date: Thu, 10 Apr 2025 15:24:41 +0800
Message-ID: <20250410072605.2358393-2-chao.gao@intel.com>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20250410072605.2358393-1-chao.gao@intel.com>
References: <20250410072605.2358393-1-chao.gao@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Sean Christopherson

When granting userspace or a KVM guest access to an xfeature, preserve the
entity's existing supervisor and software-defined permissions as tracked
by __state_perm, i.e. use __state_perm to track *all* permissions even
though all supported supervisor xfeatures are granted to all FPUs and
FPU_GUEST_PERM_LOCKED disallows changing permissions.

Effectively clobbering supervisor permissions results in inconsistent
behavior, as xstate_get_group_perm() will report supervisor features for
processes that do NOT request access to dynamic user xfeatures, whereas
any and all supervisor features will be absent from the set of permissions
for any process that is granted access to one or more dynamic xfeatures
(which right now means AMX).

The inconsistency isn't problematic because fpu_xstate_prctl() already
strips out everything except user xfeatures:

        case ARCH_GET_XCOMP_PERM:
                /*
                 * Lockless snapshot as it can also change right after the
                 * dropping the lock.
                 */
                permitted = xstate_get_host_group_perm();
                permitted &= XFEATURE_MASK_USER_SUPPORTED;
                return put_user(permitted, uptr);

        case ARCH_GET_XCOMP_GUEST_PERM:
                permitted = xstate_get_guest_group_perm();
                permitted &= XFEATURE_MASK_USER_SUPPORTED;
                return put_user(permitted, uptr);

and similarly KVM doesn't apply the __state_perm to supervisor states
(kvm_get_filtered_xcr0() incorporates xstate_get_guest_group_perm()):

        case 0xd: {
                u64 permitted_xcr0 = kvm_get_filtered_xcr0();
                u64 permitted_xss = kvm_caps.supported_xss;

But if KVM in particular were to ever change, dropping supervisor
permissions would result in subtle bugs in KVM's reporting of supported
CPUID settings. And the above behavior also means that having supervisor
xfeatures in __state_perm is correctly handled by all users.

Dropping supervisor permissions also creates another landmine for KVM. If
more dynamic user xfeatures are ever added, requesting access to multiple
xfeatures in separate ARCH_REQ_XCOMP_GUEST_PERM calls will result in the
second invocation of __xstate_request_perm() computing the wrong ksize, as
the mask passed to xstate_calculate_size() would not contain *any*
supervisor features.

Commit 781c64bfcb73 ("x86/fpu/xstate: Handle supervisor states in XSTATE
permissions") fudged around the size issue for userspace FPUs, but for
reasons unknown skipped guest FPUs. Lack of a fix for KVM "works" only
because KVM doesn't yet support virtualizing features that have supervisor
xfeatures, i.e. as of today, KVM guest FPUs will never need the relevant
xfeatures.

Simply extending the hack-a-fix for guests would temporarily solve the
ksize issue, but wouldn't address the inconsistency issue and would leave
another lurking pitfall for KVM. KVM support for virtualizing CET will
likely add CET_KERNEL as a guest-only xfeature, i.e. CET_KERNEL will not be
set in xfeatures_mask_supervisor() and would again be dropped when granting
access to dynamic xfeatures.
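
The ksize landmine described above can be modelled in a few lines of
userspace C. This is only a sketch under stated assumptions, not kernel
code: the MODEL_* bit positions, the per-feature sizes, struct model_perm
and both model_request_*() helpers are hypothetical stand-ins, and the
"old" helper mimics only the guest path (no xfeatures_mask_supervisor()
fixup).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout; NOT the real XSAVE component numbering. */
#define MODEL_USER_BASE   (1ULL << 0)  /* default user xfeatures */
#define MODEL_USER_DYN_A  (1ULL << 1)  /* a dynamic user xfeature (AMX-like) */
#define MODEL_USER_DYN_B  (1ULL << 2)  /* an imagined second dynamic xfeature */
#define MODEL_SUPERVISOR  (1ULL << 3)  /* a supervisor xfeature (CET_KERNEL-like) */
#define MODEL_USER_MASK   (MODEL_USER_BASE | MODEL_USER_DYN_A | MODEL_USER_DYN_B)

struct model_perm {
	uint64_t state_perm;	/* stands in for __state_perm */
	unsigned int ksize;	/* stands in for the kernel buffer size */
};

/* Toy replacement for xstate_calculate_size(); the sizes are made up. */
static unsigned int model_calculate_size(uint64_t mask)
{
	unsigned int size = 0;

	if (mask & MODEL_USER_BASE)
		size += 512;
	if (mask & MODEL_USER_DYN_A)
		size += 8192;
	if (mask & MODEL_USER_DYN_B)
		size += 4096;
	if (mask & MODEL_SUPERVISOR)
		size += 24;
	return size;
}

/* Old guest behavior: the mask is narrowed in place, then stored. */
static void model_request_old(struct model_perm *perm, uint64_t requested)
{
	uint64_t mask;

	assert(perm);
	mask = perm->state_perm | requested;
	perm->ksize = model_calculate_size(mask);
	mask &= MODEL_USER_MASK;	/* supervisor bits are lost here */
	perm->state_perm = mask;
}

/* Behavior after the change: the stored mask keeps its supervisor bits. */
static void model_request_fixed(struct model_perm *perm, uint64_t requested)
{
	uint64_t mask;

	assert(perm);
	mask = perm->state_perm | requested;
	perm->ksize = model_calculate_size(mask);
	/* The user size would be derived from (mask & MODEL_USER_MASK). */
	perm->state_perm = mask;
}
```

Starting from MODEL_USER_BASE | MODEL_SUPERVISOR and requesting
MODEL_USER_DYN_A followed by MODEL_USER_DYN_B, the old helper computes the
second ksize without the 24 bytes of supervisor state, while the fixed
helper still accounts for them.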

Note, the existing clobbering behavior is rather subtle. The @permitted
parameter to __xstate_request_perm() comes from:

        permitted = xstate_get_group_perm(guest);

which is either fpu->guest_perm.__state_perm or fpu->perm.__state_perm,
where __state_perm is initialized to:

        fpu->perm.__state_perm = fpu_kernel_cfg.default_features;

and copied to the guest side of things:

        /* Same defaults for guests */
        fpu->guest_perm = fpu->perm;

fpu_kernel_cfg.default_features contains everything except the dynamic
xfeatures, i.e. everything except XFEATURE_MASK_XTILE_DATA:

        fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
        fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;

When __xstate_request_perm() restricts the local "mask" variable to
compute the user state size:

        mask &= XFEATURE_MASK_USER_SUPPORTED;
        usize = xstate_calculate_size(mask, false);

it subtly overwrites the target __state_perm with "mask" containing only
user xfeatures:

        perm = guest ? &fpu->guest_perm : &fpu->perm;
        /* Pairs with the READ_ONCE() in xstate_get_group_perm() */
        WRITE_ONCE(perm->__state_perm, mask);

Cc: Maxim Levitsky
Cc: Weijiang Yang
Cc: Dave Hansen
Cc: Paolo Bonzini
Cc: Peter Zijlstra
Cc: Chao Gao
Cc: Rick Edgecombe
Cc: John Allen
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/all/ZTqgzZl-reO1m01I@google.com
Signed-off-by: Sean Christopherson
Signed-off-by: Yang Weijiang
Signed-off-by: Chao Gao
Reviewed-by: Maxim Levitsky
Reviewed-by: Rick Edgecombe
Acked-by: Dave Hansen
---
 arch/x86/include/asm/fpu/types.h |  8 +++++---
 arch/x86/kernel/fpu/xstate.c     | 18 +++++++++++-------
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index de16862bf230..46cc263f9f4f 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -407,9 +407,11 @@ struct fpu_state_perm {
 	/*
 	 * @__state_perm:
 	 *
-	 * This bitmap indicates the permission for state components, which
-	 * are available to
a thread group. The permission prctl() sets the
-	 * enabled state bits in thread_group_leader()->thread.fpu.
+	 * This bitmap indicates the permission for state components
+	 * available to a thread group, including both user and supervisor
+	 * components and software-defined bits like FPU_GUEST_PERM_LOCKED.
+	 * The permission prctl() sets the enabled state bits in
+	 * thread_group_leader()->thread.fpu.
 	 *
 	 * All run time operations use the per thread information in the
 	 * currently active fpu.fpstate which contains the xfeature masks
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 46c45e2f2a5a..6f10f5490022 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1642,16 +1642,20 @@ static int __xstate_request_perm(u64 permitted, u64 requested, bool guest)
 	if ((permitted & requested) == requested)
 		return 0;
 
-	/* Calculate the resulting kernel state size */
+	/*
+	 * Calculate the resulting kernel state size. Note, @permitted also
+	 * contains supervisor xfeatures even though supervisor are always
+	 * permitted for kernel and guest FPUs, and never permitted for user
+	 * FPUs.
+	 */
 	mask = permitted | requested;
-	/* Take supervisor states into account on the host */
-	if (!guest)
-		mask |= xfeatures_mask_supervisor();
 	ksize = xstate_calculate_size(mask, compacted);
 
-	/* Calculate the resulting user state size */
-	mask &= XFEATURE_MASK_USER_SUPPORTED;
-	usize = xstate_calculate_size(mask, false);
+	/*
+	 * Calculate the resulting user state size. Take care not to clobber
+	 * the supervisor xfeatures in the new mask!
+ */ + usize = xstate_calculate_size(mask & XFEATURE_MASK_USER_SUPPORTED, false); if (!guest) { ret = validate_sigaltstack(usize); From patchwork Thu Apr 10 07:24:42 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chao Gao X-Patchwork-Id: 14046012 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1DD71212FB3; Thu, 10 Apr 2025 07:22:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.11 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744269750; cv=none; b=QR8FlAxvfzOi2p5i49e+nPg7lNp7eHim/7knu20LeT/6HED18853NEn5u2edNlb1DzMhHABsC3sDxj3Imvlco92OJWoyaHGpuDYZ/fOlU8Q8wy/D5JSF1qW8L4F6Jx6rNWF7mLwOjhm3mbMKf9Kbf3yvP3YUr4LzbgZdY2RYLGI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744269750; c=relaxed/simple; bh=bkwRITAvHYW1PQMKc66vh0RX7BrF/SUjCdF5VO+meFY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=qmeOjs0JKjseyz9dokfTIsvwjMOK4gpXxw53bxzKLM2Vc20cG61d3IuGzUnbgeX6vdcSVrS58rTgdJTtgJNCUks31yGYs1jZDKTfE5+qSTLAO2V9wnaDOuQ3J8MVkS3NVpqs5kdlRLN8U7gizxtd3eEIzUqjfFeklCUM+B0/YiI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=jOk1s93j; arc=none smtp.client-ip=192.198.163.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="jOk1s93j" DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1744269749; x=1775805749; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=bkwRITAvHYW1PQMKc66vh0RX7BrF/SUjCdF5VO+meFY=; b=jOk1s93jDr+3Hfz+76D+7/8d7sdNNITdxYotDNjzHIGgRKDj91YRyfuu zUZrd4bayKMJ+9ZvMfsJxTwvyFq7QeQaYcXM/N1vVOryXs1+IJkI2TRHv BzPl3l9JGEyqDJpf9r1Y9qtP0lQ/UgKVtnKifGu2dT6wu9tWy3AsSenqt tHEHp7nLPhbuTiAPv4wattnaji86v6SMoUYH7tjGf5wNLo4Gk9xg9Gcld TCpPDc1ujSTx9/SpW/IKjeIRF8W0S26yRXqzMoswNsI/axCQhu6yK88GG AysYFGtiM38moYq8qSqvpfjrVDelfRhyBiPjAnCjOkbqbA6fGuHTKajC0 g==; X-CSE-ConnectionGUID: GsDjSd5oSb2384MOmrXbbA== X-CSE-MsgGUID: YBT4UTTvQSqQitMDiGRJAA== X-IronPort-AV: E=McAfee;i="6700,10204,11399"; a="56439305" X-IronPort-AV: E=Sophos;i="6.15,202,1739865600"; d="scan'208";a="56439305" Received: from orviesa006.jf.intel.com ([10.64.159.146]) by fmvoesa105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Apr 2025 00:22:28 -0700 X-CSE-ConnectionGUID: o42DGBJGSBqE1nIA8f+LRQ== X-CSE-MsgGUID: zA8eglnlTsGBXzdoZICEmw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,202,1739865600"; d="scan'208";a="128778122" Received: from spr.sh.intel.com ([10.239.53.19]) by orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Apr 2025 00:22:23 -0700 From: Chao Gao To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, tglx@linutronix.de, dave.hansen@intel.com, seanjc@google.com, pbonzini@redhat.com Cc: peterz@infradead.org, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, bp@alien8.de, chang.seok.bae@intel.com, xin3.li@intel.com, Chao Gao , Maxim Levitsky , Ingo Molnar , Dave Hansen , "H. 
Peter Anvin", Samuel Holland, Mitchell Levy, Stanislav Spassov, Eric Biggers
Subject: [PATCH v5 2/7] x86/fpu: Drop @perm from guest pseudo FPU container
Date: Thu, 10 Apr 2025 15:24:42 +0800
Message-ID: <20250410072605.2358393-3-chao.gao@intel.com>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20250410072605.2358393-1-chao.gao@intel.com>
References: <20250410072605.2358393-1-chao.gao@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

Remove @perm from the guest pseudo FPU container. The field is initialized
during allocation and never used later.

Rename fpu_init_guest_permissions() to fpu_lock_guest_permissions() to
show that its sole purpose is to lock down guest permissions.

Suggested-by: Maxim Levitsky
Signed-off-by: Chao Gao
---
v5: drop the useless fpu_guest argument (Chang)
---
 arch/x86/include/asm/fpu/types.h | 7 -------
 arch/x86/kernel/fpu/core.c       | 7 ++-----
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 46cc263f9f4f..9f9ed406b179 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -526,13 +526,6 @@ struct fpu_guest {
 	 */
 	u64 xfeatures;
 
-	/*
-	 * @perm: xfeature bitmap of features which are
-	 * permitted to be enabled for the guest
-	 * vCPU.
-	 */
-	u64 perm;
-
 	/*
 	 * @xfd_err: Save the guest value.
 	 */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1b734a9ff088..28ad7ec56eaa 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -202,7 +202,7 @@ void fpu_reset_from_exception_fixup(void)
 #if IS_ENABLED(CONFIG_KVM)
 static void __fpstate_reset(struct fpstate *fpstate, u64 xfd);
 
-static void fpu_init_guest_permissions(struct fpu_guest *gfpu)
+static void fpu_lock_guest_permissions(void)
 {
 	struct fpu_state_perm *fpuperm;
 	u64 perm;
@@ -218,8 +218,6 @@ static void fpu_init_guest_permissions(struct fpu_guest *gfpu)
 	WRITE_ONCE(fpuperm->__state_perm, perm | FPU_GUEST_PERM_LOCKED);
 
 	spin_unlock_irq(&current->sighand->siglock);
-
-	gfpu->perm = perm & ~FPU_GUEST_PERM_LOCKED;
 }
 
 bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
@@ -240,7 +238,6 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
 
 	gfpu->fpstate = fpstate;
 	gfpu->xfeatures = fpu_kernel_cfg.default_features;
-	gfpu->perm = fpu_kernel_cfg.default_features;
 
 	/*
 	 * KVM sets the FP+SSE bits in the XSAVE header when copying FPU state
@@ -255,7 +252,7 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
 	if (WARN_ON_ONCE(fpu_user_cfg.default_size > gfpu->uabi_size))
 		gfpu->uabi_size = fpu_user_cfg.default_size;
 
-	fpu_init_guest_permissions(gfpu);
+	fpu_lock_guest_permissions();
 
 	return true;
 }

From patchwork Thu Apr 10 07:24:43 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chao Gao
X-Patchwork-Id: 14046013
Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ED2F8204F65; Thu, 10 Apr 2025 07:22:39 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.11
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744269763; cv=none;
b=umLEvjiVLOhj+15isisbdyb5OACqB50dAMVv8DgUcLGe3P+RAG164gFcviwZwJe76Hj6K1FH1pJIV/lowgnAFEW8X4NJsJX+etwzJ4nRQJy1rWfqJojspHzaiPwPZmk7FV8mQzxbTl1WaXduBkobMYyMSqACYl8e23gyALzYNGo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744269763; c=relaxed/simple; bh=qtvKykEg1wiWubS0qpVNWzid0JRvlUFA2Q4b21MMp98=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=cVpQ5GE+f0hedNJ3hT1E0mrEURgFRr7IrNECMwM64552m7SHiWAp/lLYyxNerd1vpj6/4ouCoDphDh5ZXzVMInXgDmp+IAN1wgbBnpPjwvYdqhPgIgdcnYyhgjFFkWKaQTXtizJhN9MD7HOBJ/3Tf9vTvP3WcLhhSzUh/0HS55Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Psknjdy6; arc=none smtp.client-ip=192.198.163.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Psknjdy6" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1744269762; x=1775805762; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=qtvKykEg1wiWubS0qpVNWzid0JRvlUFA2Q4b21MMp98=; b=Psknjdy6Vq3jQCpwPsb11hrFdbJ0ZJcqR97U81LxoPXNqYRfYHPfAMZJ ndptIh4DtD6NS7BUmKM9+XKfnQGLUZTT4Kwwb0Z0KE31uaIQI2WyIbV5M 4vxFEyHITMz8++tjiEMFWBacAPobmWoJpv0T7bgcjihLaZK9BAe09oUvQ 1qv37VCUV2z4eDAdQam5jqRpwMMhEpc418Fo5vRk0l0wBuOI3W6F1GfMt KF6uKhWj0/K2EJNteGV+gxRW6Cy+cO3CExKFFzg+fR1Yzo3foh7gItC6u XHSp/+YzdqSeMZLJ/oB4h4lsR9GrV9cG5tXNCDFeWZaSU2Feb5EDBdDS9 A==; X-CSE-ConnectionGUID: 9glWmQAVQ7qgqYy7lCMkSg== X-CSE-MsgGUID: SZpL+cn7SNiGcaGxtIkizg== X-IronPort-AV: E=McAfee;i="6700,10204,11399"; a="56439335" X-IronPort-AV: E=Sophos;i="6.15,202,1739865600"; 
d="scan'208";a="56439335"
Received: from orviesa006.jf.intel.com ([10.64.159.146]) by fmvoesa105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Apr 2025 00:22:39 -0700
X-CSE-ConnectionGUID: Gs/t1q4AQhaTR6f1acfaqg==
X-CSE-MsgGUID: PgBan7YPTZ6GfXbml3GDJw==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.15,202,1739865600"; d="scan'208";a="128778143"
Received: from spr.sh.intel.com ([10.239.53.19]) by orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Apr 2025 00:22:33 -0700
From: Chao Gao
To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, tglx@linutronix.de, dave.hansen@intel.com, seanjc@google.com, pbonzini@redhat.com
Cc: peterz@infradead.org, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, bp@alien8.de, chang.seok.bae@intel.com, xin3.li@intel.com, Chao Gao, Ingo Molnar, Dave Hansen, "H. Peter Anvin", Maxim Levitsky, Samuel Holland, Mitchell Levy, Eric Biggers, Stanislav Spassov, Vignesh Balasubramanian, Aruna Ramakrishna
Subject: [PATCH v5 3/7] x86/fpu/xstate: Differentiate default features for host and guest FPUs
Date: Thu, 10 Apr 2025 15:24:43 +0800
Message-ID: <20250410072605.2358393-4-chao.gao@intel.com>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20250410072605.2358393-1-chao.gao@intel.com>
References: <20250410072605.2358393-1-chao.gao@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

Currently, guest and host FPUs share the same default features. However,
the CET supervisor xstate is the first feature that needs to be enabled
exclusively for guest FPUs. Enabling it for host FPUs leads to a waste of
24 bytes in the XSAVE buffer.

To support "guest-only" features, add a new structure to hold the
default features and sizes for guest FPUs to clearly differentiate them
from those for host FPUs.
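
The idea of separate host and guest defaults can be sketched in plain
userspace C. This is an illustration under stated assumptions, not what
this patch does: the DEMO_* bits, both structs and demo_init_defaults()
are hypothetical, the 576 and 8192 byte sizes are made up (only the 24
bytes comes from the text above), and the host/guest split shown here is
only the direction a later change might take.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define DEMO_XFEATURE_BASE       (1ULL << 0)  /* always-on default state */
#define DEMO_XFEATURE_DYNAMIC    (1ULL << 1)  /* must be requested before use */
#define DEMO_XFEATURE_GUEST_ONLY (1ULL << 2)  /* e.g. a CET supervisor-like state */

/* Simplified stand-ins for the host and guest configuration structures. */
struct demo_host_cfg {
	uint64_t default_features;
	unsigned int default_size;
};

struct demo_guest_cfg {
	uint64_t features;
	unsigned int size;
};

/* Toy size calculation; only the 24-byte figure is taken from the text. */
static unsigned int demo_calculate_size(uint64_t mask)
{
	unsigned int size = 0;

	if (mask & DEMO_XFEATURE_BASE)
		size += 576;
	if (mask & DEMO_XFEATURE_DYNAMIC)
		size += 8192;
	if (mask & DEMO_XFEATURE_GUEST_ONLY)
		size += 24;
	return size;
}

/* Compute both sets of defaults once, up front, from the maximum set. */
static void demo_init_defaults(uint64_t max_features,
			       struct demo_host_cfg *host,
			       struct demo_guest_cfg *guest)
{
	uint64_t defaults = max_features & ~DEMO_XFEATURE_DYNAMIC;

	assert(host && guest);

	/* Host FPUs skip the guest-only state and so save its 24 bytes. */
	host->default_features = defaults & ~DEMO_XFEATURE_GUEST_ONLY;
	host->default_size = demo_calculate_size(host->default_features);

	/* Guest FPUs keep it in their defaults. */
	guest->features = defaults;
	guest->size = demo_calculate_size(guest->features);
}
```

With all three bits in max_features, the guest default size ends up
exactly 24 bytes larger than the host default size, and neither default
set contains the dynamic feature.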

An alternative approach is adding a guest_only_xfeatures member to
fpu_kernel_cfg and adding two helper functions to calculate the guest
default xfeatures and size. However, calculating these defaults at runtime
would introduce unnecessary overhead.

Note that, for now, the default features for guest and host FPUs remain the
same. This will change in a follow-up patch once guest permissions, default
xfeatures, and fpstate size are all converted to use the guest defaults.

Suggested-by: Chang S. Bae
Signed-off-by: Chao Gao
---
v5: Add a new vcpu_fpu_config instead of adding new members to
    fpu_state_config (Chang)
    Extract a helper to set default values (Chang)
---
 arch/x86/include/asm/fpu/types.h | 43 ++++++++++++++++++++++++++++++++
 arch/x86/kernel/fpu/core.c       |  1 +
 arch/x86/kernel/fpu/xstate.c     | 29 ++++++++++++++++-----
 3 files changed, 67 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 9f9ed406b179..769155a0401a 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -542,6 +542,48 @@ struct fpu_guest {
 	struct fpstate *fpstate;
 };
 
+/*
+ * FPU state configuration data for fpu_guest.
+ * Initialized at boot time. Read only after init.
+ */
+struct vcpu_fpu_config {
+	/*
+	 * @size:
+	 *
+	 * The default size of the register state buffer in guest FPUs.
+	 * Includes all supported features except independent managed
+	 * features and features which have to be requested by user space
+	 * before usage.
+	 */
+	unsigned int size;
+
+	/*
+	 * @user_size:
+	 *
+	 * The default UABI size of the register state buffer in guest
+	 * FPUs. Includes all supported user features except independent
+	 * managed features and features which have to be requested by
+	 * user space before usage.
+	 */
+	unsigned int user_size;
+
+	/*
+	 * @features:
+	 *
+	 * The default supported features bitmap in guest FPUs.
Does not
+	 * include independent managed features and features which have to
+	 * be requested by user space before usage.
+	 */
+	u64 features;
+
+	/*
+	 * @user_features:
+	 *
+	 * Same as @features except only user xfeatures are included.
+	 */
+	u64 user_features;
+};
+
 /*
  * FPU state configuration data. Initialized at boot time. Read only after init.
  */
@@ -597,5 +639,6 @@ struct fpu_state_config {
 
 /* FPU state configuration information */
 extern struct fpu_state_config fpu_kernel_cfg, fpu_user_cfg;
+extern struct vcpu_fpu_config guest_default_cfg;
 
 #endif /* _ASM_X86_FPU_TYPES_H */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 28ad7ec56eaa..25f13cc8ad92 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -36,6 +36,7 @@ DEFINE_PER_CPU(u64, xfd_state);
 /* The FPU state configuration data for kernel and user space */
 struct fpu_state_config fpu_kernel_cfg __ro_after_init;
 struct fpu_state_config fpu_user_cfg __ro_after_init;
+struct vcpu_fpu_config guest_default_cfg __ro_after_init;
 
 /*
  * Represents the initial FPU state.
It's mostly (but not completely) zeroes,
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 6f10f5490022..cdd1e51fb93e 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -738,6 +738,11 @@ static int __init init_xstate_size(void)
 	fpu_user_cfg.default_size =
 		xstate_calculate_size(fpu_user_cfg.default_features, false);
 
+	guest_default_cfg.size =
+		xstate_calculate_size(guest_default_cfg.features, compacted);
+	guest_default_cfg.user_size =
+		xstate_calculate_size(guest_default_cfg.user_features, false);
+
 	return 0;
 }
 
@@ -766,6 +771,22 @@ static void __init fpu__init_disable_system_xstate(unsigned int legacy_size)
 	fpstate_reset(&current->thread.fpu);
 }
 
+static void __init init_default_features(u64 kernel_max_features, u64 user_max_features)
+{
+	u64 kfeatures = kernel_max_features;
+	u64 ufeatures = user_max_features;
+
+	/* Default feature sets should not include dynamic xfeatures. */
+	kfeatures &= ~XFEATURE_MASK_USER_DYNAMIC;
+	ufeatures &= ~XFEATURE_MASK_USER_DYNAMIC;
+
+	fpu_kernel_cfg.default_features = kfeatures;
+	fpu_user_cfg.default_features = ufeatures;
+
+	guest_default_cfg.features = kfeatures;
+	guest_default_cfg.user_features = ufeatures;
+}
+
 /*
  * Enable and initialize the xsave feature.
  * Called once per system bootup.
@@ -837,12 +858,8 @@ void __init fpu__init_system_xstate(unsigned int legacy_size) fpu_user_cfg.max_features = fpu_kernel_cfg.max_features; fpu_user_cfg.max_features &= XFEATURE_MASK_USER_SUPPORTED; - /* Clean out dynamic features from default */ - fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features; - fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC; - - fpu_user_cfg.default_features = fpu_user_cfg.max_features; - fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC; + /* Now, given maximum feature set, determine default values */ + init_default_features(fpu_kernel_cfg.max_features, fpu_user_cfg.max_features); /* Store it for paranoia check at the end */ xfeatures = fpu_kernel_cfg.max_features; From patchwork Thu Apr 10 07:24:44 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chao Gao X-Patchwork-Id: 14046014 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 77B4720B207; Thu, 10 Apr 2025 07:22:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.11 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744269767; cv=none; b=lUItbo86QR78rqkJ6zC474znNlbhcz+sMYURQX0xAxBXcwQyqU3S9gDowUfrFjVtzJgX/EaO5sQtMyLjlnIKEctc4Sj3avLlZeIObXTx1NyvdyZFFe+pqsYE3bC+XeZPozshnbUeg1+4S6cw9inZ2GX/Hb7/lqnuqSgXoXf/EaY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744269767; c=relaxed/simple; bh=WzJewkkVkSDXS36MObVi8a3bojWpuwsnQhRppAK9x8I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=XjEWdM2EmqvxZrwYRxLxZgKH2xNweZQScY8io0JzBgLrZcU2Fh4/tQ60dpnJEbBus8+ln2d1eNX8G3xhfLuVFIMaAnQbTxePAxTUKa2sWpwg6iv9A8cZlme4toEtpQ0xX4OxkK9KwQyBIZiCKpw6uhk/ctpzUNMPZNEWTU4rIl4= ARC-Authentication-Results: 
i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=d1uQPFkI; arc=none smtp.client-ip=192.198.163.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="d1uQPFkI" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1744269765; x=1775805765; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=WzJewkkVkSDXS36MObVi8a3bojWpuwsnQhRppAK9x8I=; b=d1uQPFkI7M/adcVUNOdxxOSukXmlGYDQvecYrbHCvpwEK/F1aO44AwJ3 tpp5G1phesqsjhxb2JUwofonm89ttbZ/TAgYlMUyfDQvEC64f8wPYSCSl HuuM4ezPO+L+AdEXP8who/oqfEMJVyQ2NGOEHG9TK1/kXm5NGWw9feVSH p/F0OyO9D5mCQmmqoB5fg5D8C2omdVjH7O0tY8vUM1+fxsUusu2UCO/Gv /02i/siYYycGYRGbanI0IZhiIbe88JtaCb8oq0svrQmqvqvJnSnVx0RqG +sgGtk4Lzt33xRXIDM6Odt/HB+Lv+sW/6z5Mxd3MjscuxJ7ZYJBNxXQYR g==; X-CSE-ConnectionGUID: DC1AzXmISUWcthi0THR/sw== X-CSE-MsgGUID: 7VTELozkTdSKcNHzcOHy+A== X-IronPort-AV: E=McAfee;i="6700,10204,11399"; a="56439352" X-IronPort-AV: E=Sophos;i="6.15,202,1739865600"; d="scan'208";a="56439352" Received: from orviesa006.jf.intel.com ([10.64.159.146]) by fmvoesa105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Apr 2025 00:22:45 -0700 X-CSE-ConnectionGUID: sd9XQcdYRYmcgfRavt9KiA== X-CSE-MsgGUID: rM0qPWpNRHiFmF9WPdxDkg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,202,1739865600"; d="scan'208";a="128778162" Received: from spr.sh.intel.com ([10.239.53.19]) by orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Apr 2025 00:22:40 -0700 From: Chao Gao To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, tglx@linutronix.de, 
dave.hansen@intel.com, seanjc@google.com, pbonzini@redhat.com
Cc: peterz@infradead.org, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, bp@alien8.de, chang.seok.bae@intel.com, xin3.li@intel.com, Chao Gao, Ingo Molnar, Dave Hansen, "H. Peter Anvin", Eric Biggers, Stanislav Spassov
Subject: [PATCH v5 4/7] x86/fpu: Initialize guest FPU permissions from guest defaults
Date: Thu, 10 Apr 2025 15:24:44 +0800
Message-ID: <20250410072605.2358393-5-chao.gao@intel.com>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20250410072605.2358393-1-chao.gao@intel.com>
References: <20250410072605.2358393-1-chao.gao@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

Currently, fpu->guest_perm is copied from fpu->perm, which is derived from
fpu_kernel_cfg.default_features.

Guest defaults were introduced to differentiate the features and sizes of
host and guest FPUs. Copying guest FPU permissions from the host will lead
to inconsistencies between the guest default features and permissions.

Initialize guest FPU permissions from guest defaults instead of host
defaults. This ensures that any changes to guest default features are
automatically reflected in guest permissions, which in turn guarantees
that fpstate_realloc() allocates a correctly sized XSAVE buffer for guest
FPUs.

Suggested-by: Chang S.
Bae Signed-off-by: Chao Gao --- arch/x86/kernel/fpu/core.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index 25f13cc8ad92..e23e435b85c4 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -543,8 +543,10 @@ void fpstate_reset(struct fpu *fpu) fpu->perm.__state_perm = fpu_kernel_cfg.default_features; fpu->perm.__state_size = fpu_kernel_cfg.default_size; fpu->perm.__user_state_size = fpu_user_cfg.default_size; - /* Same defaults for guests */ - fpu->guest_perm = fpu->perm; + + fpu->guest_perm.__state_perm = guest_default_cfg.features; + fpu->guest_perm.__state_size = guest_default_cfg.size; + fpu->guest_perm.__user_state_size = guest_default_cfg.user_size; } static inline void fpu_inherit_perms(struct fpu *dst_fpu) From patchwork Thu Apr 10 07:24:45 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chao Gao X-Patchwork-Id: 14046015 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E042520B819; Thu, 10 Apr 2025 07:22:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.11 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744269773; cv=none; b=G4OycQkm9LEu6FSWD4Xfape2zLjURvrBjcw7U/Modxj6SIuSroAxZ386WST4vKVjLfxbEQ2G2zZo4Z4C8bN1+2ViwWN0ES6xRDJZ/GG57pc4DMFsNIHB3j7GoQM0QOCKu91J6l8L3osWPeADpLzTs1lfoNvOrKBhCWrDQJfJga4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744269773; c=relaxed/simple; bh=p034v+1t9zJ9E2nJhCRWSagSV/SOx27HYDYAI/C9kig=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=ZdCh9DWZ3Fagd25Jt2ATxSxV1dCO/kZ4bufnRKUE6WGWOl25/aOE6htUsPwETKmCdtDT3NDzStqqhnFCl3QcKmvAncSoyR79WvyAb2W06G1t7vFx58H+NPa+QjL7NUE7yuwJd3MgdC9vtMeQOddq0Lh0KWHBWjb1lQv8SJ4mcH0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=KY5VuvDM; arc=none smtp.client-ip=192.198.163.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="KY5VuvDM" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1744269772; x=1775805772; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=p034v+1t9zJ9E2nJhCRWSagSV/SOx27HYDYAI/C9kig=; b=KY5VuvDM9v+MSlYfNVAIpvRvV+0t1RgnddsbEBHK051KHYmAudpaZtz0 N5VPvS53+SrhQV01iQjYVXB8QSVKf70sMVH5KfNY/97M/CO0ttP3VzFX7 WfTM0dP2G8mf9YXYskX3AhhzI/SCDL1S7lh9FuzDplmobfrYThLzx4YKt HCZ7qGQgoANcT6NYqSQremEpOzckC+kSKuvOsVQkyffD4Nx6NPJHwugrr DZhYSg2Qw51Sq46Mxoby4vEkcqQrEKvRMVBJyXtXsIOS3ElPG65ZqnPbn cQilBHnsfP1+wlM0ofsQNeW3X2QcrvS4U6VWLNUjihO6qiAAsg9juVjj4 w==; X-CSE-ConnectionGUID: 68rVIDbuSyGye04BLlXz2A== X-CSE-MsgGUID: e5mpBPSNSOmIVzsaQ891Qw== X-IronPort-AV: E=McAfee;i="6700,10204,11399"; a="56439375" X-IronPort-AV: E=Sophos;i="6.15,202,1739865600"; d="scan'208";a="56439375" Received: from orviesa006.jf.intel.com ([10.64.159.146]) by fmvoesa105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Apr 2025 00:22:51 -0700 X-CSE-ConnectionGUID: 6HaV4cmbSLqskdeR6/Bshw== X-CSE-MsgGUID: bzhfP9EGSIaZkl/QetYqvQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,202,1739865600"; d="scan'208";a="128778193" Received: from spr.sh.intel.com ([10.239.53.19]) by 
 orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 10 Apr 2025 00:22:47 -0700
From: Chao Gao
To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
 tglx@linutronix.de, dave.hansen@intel.com, seanjc@google.com,
 pbonzini@redhat.com
Cc: peterz@infradead.org, rick.p.edgecombe@intel.com,
 weijiang.yang@intel.com, john.allen@amd.com, bp@alien8.de,
 chang.seok.bae@intel.com, xin3.li@intel.com, Chao Gao, Ingo Molnar,
 Dave Hansen, "H. Peter Anvin", Eric Biggers, Stanislav Spassov
Subject: [PATCH v5 5/7] x86/fpu: Initialize guest fpstate and FPU pseudo container from guest defaults
Date: Thu, 10 Apr 2025 15:24:45 +0800
Message-ID: <20250410072605.2358393-6-chao.gao@intel.com>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20250410072605.2358393-1-chao.gao@intel.com>
References: <20250410072605.2358393-1-chao.gao@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

fpu_alloc_guest_fpstate() currently uses host defaults to initialize guest
fpstate and pseudo containers. Guest defaults were introduced to
differentiate the features and sizes of host and guest FPUs. Switch to
using guest defaults instead.

Additionally, incorporate the initialization of indicators (is_valloc and
is_guest) into the newly added guest-specific reset function to centralize
the resetting of guest fpstate.

Suggested-by: Chang S. Bae
Signed-off-by: Chao Gao
---
v5: init is_valloc/is_guest in the guest-specific reset function (Chang)
---
 arch/x86/kernel/fpu/core.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index e23e435b85c4..f5593f6009a4 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -201,7 +201,20 @@ void fpu_reset_from_exception_fixup(void)
 }

 #if IS_ENABLED(CONFIG_KVM)
-static void __fpstate_reset(struct fpstate *fpstate, u64 xfd);
+static void __guest_fpstate_reset(struct fpstate *fpstate, u64 xfd)
+{
+	/* Initialize sizes and feature masks */
+	fpstate->size = guest_default_cfg.size;
+	fpstate->user_size = guest_default_cfg.user_size;
+	fpstate->xfeatures = guest_default_cfg.features;
+	fpstate->user_xfeatures = guest_default_cfg.user_features;
+	fpstate->xfd = xfd;
+
+	/* Initialize indicators to reflect properties of the fpstate */
+	fpstate->is_valloc = true;
+	fpstate->is_guest = true;
+}
+

 static void fpu_lock_guest_permissions(void)
 {
@@ -226,19 +239,18 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
 	struct fpstate *fpstate;
 	unsigned int size;

-	size = fpu_kernel_cfg.default_size + ALIGN(offsetof(struct fpstate, regs), 64);
+	size = guest_default_cfg.size + ALIGN(offsetof(struct fpstate, regs), 64);
+
 	fpstate = vzalloc(size);
 	if (!fpstate)
 		return false;

 	/* Leave xfd to 0 (the reset value defined by spec) */
-	__fpstate_reset(fpstate, 0);
+	__guest_fpstate_reset(fpstate, 0);
 	fpstate_init_user(fpstate);
-	fpstate->is_valloc = true;
-	fpstate->is_guest = true;

 	gfpu->fpstate = fpstate;
-	gfpu->xfeatures = fpu_kernel_cfg.default_features;
+	gfpu->xfeatures = guest_default_cfg.features;

 	/*
 	 * KVM sets the FP+SSE bits in the XSAVE header when copying FPU state
@@ -250,8 +262,8 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
 	 * all features that can expand the uABI size must be opt-in.
 	 */
 	gfpu->uabi_size = sizeof(struct kvm_xsave);
-	if (WARN_ON_ONCE(fpu_user_cfg.default_size > gfpu->uabi_size))
-		gfpu->uabi_size = fpu_user_cfg.default_size;
+	if (WARN_ON_ONCE(guest_default_cfg.user_size > gfpu->uabi_size))
+		gfpu->uabi_size = guest_default_cfg.user_size;

 	fpu_lock_guest_permissions();


From patchwork Thu Apr 10 07:24:46 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Chao Gao
X-Patchwork-Id: 14046016
Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.11])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id A843720B819;
 Thu, 10 Apr 2025 07:23:04 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.11
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
 t=1744269787; cv=none;
 b=CGBL4U+CKA9qWl0vATp9el1H9HTQ2iGAQs9uWTmJu8SbtaNZ9nhpTBuPl06bdbFN16OfNZcYulLBOU25J5uDgA1nXcs2fdevXd2UpBihzHYWSCucT+AHmmSDykZg8hrABs7CJRd815T19TNr4IptJWjJ2Ghh/Pp0KGcEsXQ1fdE=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
 s=arc-20240116; t=1744269787; c=relaxed/simple;
 bh=bkjS1ShHcprcZLSzOo1xbnZ1tFJ/S/ey47LDxHM6MQ4=;
 h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
 MIME-Version:Content-Type;
 b=sQQzWkTgu9vzwdPJF55G6DGm6rmSfBsOxTsXJPUC7m37BGJFhETTn2DD9eQIMMfWFE7WpX1ZJQu/YHw2Qs1bnX3ntNlob0FZkUqK8+G6SQF1uE8PB3IJ40XdDqvNm6l7R86gpRrEhh1CghiCb14AbW1dD31Lef1/+fstLbfhv2k=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=intel.com;
 spf=pass smtp.mailfrom=intel.com;
 dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=T3Yn+lBT;
 arc=none smtp.client-ip=192.198.163.11
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass
smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="T3Yn+lBT" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1744269786; x=1775805786; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=bkjS1ShHcprcZLSzOo1xbnZ1tFJ/S/ey47LDxHM6MQ4=; b=T3Yn+lBT13X6o2VXNTzpMSVNQCIroXinMyddaWiF5sq9ZnkglbN7bjdU 33L+SxEq9n3RIUqOvLiVqPajqBvu+rVidVmhh8aywgyzalIjAeB0iBmre 7p9ojCkzxR0xq6JJJ0sfy18jRk7Mf6cBA6+Ih3r3oizSYwyqcvxF0Y+a3 lF/phmEEfgQxlb29ETNblQ36PDLVzcI+/Vx20b2jzz3+am0Fh8cWvGc00 aIs5mogoELqCV+UzBHbA+XNSajRkTtm8z3jh20QIx17UwOhgphhGT83UL uL0HP1fFMLEzBiXPA+U7PTYWe+SbsWXmvJNS489z6mpGyuLaNOpFdsbNL g==; X-CSE-ConnectionGUID: BNahLccPQ8CaVoMvCaLLvA== X-CSE-MsgGUID: MWRPLV+3SYuIgd5/uyFIhQ== X-IronPort-AV: E=McAfee;i="6700,10204,11399"; a="56439415" X-IronPort-AV: E=Sophos;i="6.15,202,1739865600"; d="scan'208";a="56439415" Received: from orviesa006.jf.intel.com ([10.64.159.146]) by fmvoesa105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Apr 2025 00:23:03 -0700 X-CSE-ConnectionGUID: fXgORDmPTamGsfX0wIiwSg== X-CSE-MsgGUID: e9YRWaSQTeWdgMcSMygV8g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,202,1739865600"; d="scan'208";a="128778246" Received: from spr.sh.intel.com ([10.239.53.19]) by orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Apr 2025 00:22:58 -0700 From: Chao Gao To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, tglx@linutronix.de, dave.hansen@intel.com, seanjc@google.com, pbonzini@redhat.com Cc: peterz@infradead.org, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, bp@alien8.de, chang.seok.bae@intel.com, xin3.li@intel.com, Chao Gao , Ingo Molnar , Dave Hansen , "H. 
 Peter Anvin", Maxim Levitsky, Mitchell Levy, Samuel Holland, Zhao Liu,
 Vignesh Balasubramanian, Aruna Ramakrishna, Uros Bizjak
Subject: [PATCH v5 6/7] x86/fpu/xstate: Introduce "guest-only" supervisor xfeature set
Date: Thu, 10 Apr 2025 15:24:46 +0800
Message-ID: <20250410072605.2358393-7-chao.gao@intel.com>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20250410072605.2358393-1-chao.gao@intel.com>
References: <20250410072605.2358393-1-chao.gao@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Yang Weijiang

In preparation for upcoming CET virtualization support, the CET supervisor
state will be added as a "guest-only" feature, since it is required only
by KVM (i.e., guest FPUs). Establish the infrastructure for "guest-only"
features.

Define a new XFEATURE_MASK_GUEST_SUPERVISOR mask to specify features that
are enabled by default in guest FPUs but not in host FPUs. Specifically,
for any bit in this set, permission is granted and XSAVE space is allocated
during vCPU creation. Non-guest FPUs cannot enable guest-only features,
even dynamically, and no XSAVE space will be allocated for them.

The mask is currently empty, but this will be changed by a subsequent
patch.

Note that there is no plan to add "guest-only" user xfeatures, so the user
default features remain unchanged.

Co-developed-by: Chao Gao
Signed-off-by: Chao Gao
Signed-off-by: Yang Weijiang
---
v5: Explain in detail the reasoning behind the mask name choice below the
"---" separator line.

In previous versions, the mask was named "XFEATURE_MASK_SUPERVISOR_DYNAMIC".
Dave suggested this name [1], but he also noted, "I don't feel strongly
about it and I've said my piece. I won't NAK it one way or the other."

The term "dynamic" was initially preferred because it reflects the impact
on XSAVE buffers: some buffers accommodate dynamic features while others
do not.
This naming allows for the introduction of dynamic features that are not
strictly "guest-only", offering flexibility beyond KVM.

However, using "dynamic" has led to confusion [2]. Chang pointed out that
permission granting and buffer allocation are actually static at VCPU
allocation, diverging from the model for user dynamic features. He also
questioned the rationale for introducing a kernel dynamic feature mask
while using it as a guest-only feature mask [3]. Moreover, Thomas remarked
that "the dynamic naming is really bad" [4]. Although his specific concerns
are unclear, we should be cautious about reinstating the "kernel dynamic
feature" naming.

Therefore, in v4, I renamed the mask to "XFEATURE_MASK_SUPERVISOR_GUEST"
and further refined it to "XFEATURE_MASK_GUEST_SUPERVISOR" in this v5.

[1]: https://lore.kernel.org/all/893ac578-baaf-4f4f-96ee-e012dfc073a8@intel.com/#t
[2]: https://lore.kernel.org/kvm/e15d1074-d5ec-431d-86e5-a58bc6297df8@intel.com/
[3]: https://lore.kernel.org/kvm/7bee70fd-b2b9-4466-a694-4bf3486b19c7@intel.com/
[4]: https://lore.kernel.org/all/87sg1owmth.ffs@nanos.tec.linutronix.de/
---
 arch/x86/include/asm/fpu/types.h  |  9 +++++----
 arch/x86/include/asm/fpu/xstate.h |  6 +++++-
 arch/x86/kernel/fpu/xstate.c      | 14 +++++++++++---
 arch/x86/kernel/fpu/xstate.h      |  5 +++++
 4 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 769155a0401a..7494d732b296 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -600,8 +600,9 @@ struct fpu_state_config {
 	 * @default_size:
 	 *
 	 * The default size of the register state buffer. Includes all
-	 * supported features except independent managed features and
-	 * features which have to be requested by user space before usage.
+	 * supported features except independent managed features,
+	 * guest-only features and features which have to be requested by
+	 * user space before usage.
 	 */
 	unsigned int default_size;

@@ -617,8 +618,8 @@ struct fpu_state_config {
 	 * @default_features:
 	 *
 	 * The default supported features bitmap. Does not include
-	 * independent managed features and features which have to
-	 * be requested by user space before usage.
+	 * independent managed features, guest-only features and features
+	 * which have to be requested by user space before usage.
 	 */
 	u64 default_features;
 	/*
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 7f39fe7980c5..62768d2131ec 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -45,9 +45,13 @@
 /* Features which are dynamically enabled for a process on request */
 #define XFEATURE_MASK_USER_DYNAMIC XFEATURE_MASK_XTILE_DATA

+/* Supervisor features which are enabled only in guest FPUs */
+#define XFEATURE_MASK_GUEST_SUPERVISOR 0
+
 /* All currently supported supervisor features */
 #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
-					    XFEATURE_MASK_CET_USER)
+					    XFEATURE_MASK_CET_USER | \
+					    XFEATURE_MASK_GUEST_SUPERVISOR)

 /*
  * A supervisor state component may not always contain valuable information,
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index cdd1e51fb93e..c7db9f1407f5 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -776,14 +776,22 @@ static void __init init_default_features(u64 kernel_max_features, u64 user_max_f
 	u64 kfeatures = kernel_max_features;
 	u64 ufeatures = user_max_features;

-	/* Default feature sets should not include dynamic xfeatures. */
-	kfeatures &= ~XFEATURE_MASK_USER_DYNAMIC;
+	/*
+	 * Default feature sets should not include dynamic and guest-only
+	 * xfeatures at all.
+	 */
+	kfeatures &= ~(XFEATURE_MASK_USER_DYNAMIC | XFEATURE_MASK_GUEST_SUPERVISOR);
 	ufeatures &= ~XFEATURE_MASK_USER_DYNAMIC;

 	fpu_kernel_cfg.default_features = kfeatures;
 	fpu_user_cfg.default_features = ufeatures;

-	guest_default_cfg.features = kfeatures;
+	/*
+	 * Ensure VCPU FPU container only reserves a space for guest-only
+	 * xfeatures. This distinction can save kernel memory by
+	 * maintaining a necessary amount of XSAVE buffer.
+	 */
+	guest_default_cfg.features = kfeatures | xfeatures_mask_guest_supervisor();
 	guest_default_cfg.user_features = ufeatures;
 }

diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index 0fd34f53f025..8be3df4aa28b 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -61,6 +61,11 @@ static inline u64 xfeatures_mask_supervisor(void)
 	return fpu_kernel_cfg.max_features & XFEATURE_MASK_SUPERVISOR_SUPPORTED;
 }

+static inline u64 xfeatures_mask_guest_supervisor(void)
+{
+	return fpu_kernel_cfg.max_features & XFEATURE_MASK_GUEST_SUPERVISOR;
+}
+
 static inline u64 xfeatures_mask_independent(void)
 {
 	if (!cpu_feature_enabled(X86_FEATURE_ARCH_LBR))

From patchwork Thu Apr 10 07:24:47 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chao Gao
X-Patchwork-Id: 14046017
Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.11])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id 66F4C20B819;
 Thu, 10 Apr 2025 07:23:14 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.11
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
 t=1744269796; cv=none;
 b=IGAkYPn9I+mw2Gwn9m70hZN0PO+SjnEFH7q8h6E/gSzMStYr4B3UbcHfj2Fj0xps1NfO0ZFn2JDF5wmtDVVHQ1El8N2qM5kyrhB1XDUAfuxXtDY4XJ03yU8QA1acwcGCOmggDUlPLjdXVb6Y/9+y3QmVtXW0NKP9ufuOBGUgigk=
ARC-Message-Signature: i=1; a=rsa-sha256;
d=subspace.kernel.org; s=arc-20240116; t=1744269796; c=relaxed/simple; bh=9ljqzi7I9zgacMhPDRGMkgu7C926bGxUjg2vbBFijHY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=fvwcJTKj1pRDNVj5xL6lYm2D2qdUnzJjpb3pwBrgq0A/XMc8rmiNUBP+Qto7n+iFcfaFOkLrDcPGnegZbPfkUwlKuIlIFgexKfMYnSsNdq3z/ZV45T8T5kO2WoA9WsHYyfbKQn3lNuX/jDA96w9Zcps6T9LFqXDfNfNTJI8J8N4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=g0PaMxCx; arc=none smtp.client-ip=192.198.163.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="g0PaMxCx" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1744269794; x=1775805794; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9ljqzi7I9zgacMhPDRGMkgu7C926bGxUjg2vbBFijHY=; b=g0PaMxCxuqCwpDpuUKr4lOrnk/ZBB9vBQP9NwuvKQFpc9UGp4jRyc2fR BaU9LQsBuMH3u4L584iezaZcgwEU3/9BxsbAORUD8PFp3LLHNLlZ6hju5 fRWjzX+NdtclXCKHf7UHMniyPeQT9bEfs/gdcF/44hlELN4NojibUX82I CdjSlUO9FyYv9ZuUFpwCnTC1Bk/2XYQEaVO/WDVowGE1gCt7ysHdxcXOR hFUDdBjIYIKT5M+oFvPWOLJgIEZzPPYhXDWbaUbeOFnxGU6A3+O2HzOJe tJLs1RyatXCsPcgABNdBEjzBUpP0GaxMlHPo7GUqbLgJbXtNpFBqQqJ0r Q==; X-CSE-ConnectionGUID: rTeLCjY4Syi+/2s/pM1U0g== X-CSE-MsgGUID: GWh7GThSQQSU8ojcT4O63g== X-IronPort-AV: E=McAfee;i="6700,10204,11399"; a="56439466" X-IronPort-AV: E=Sophos;i="6.15,202,1739865600"; d="scan'208";a="56439466" Received: from orviesa006.jf.intel.com ([10.64.159.146]) by fmvoesa105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Apr 2025 00:23:13 -0700 X-CSE-ConnectionGUID: 
 bTeBltl5T2SfA8l+CBpK0Q==
X-CSE-MsgGUID: NypEWP73STODDQNJvhgqcg==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.15,202,1739865600"; d="scan'208";a="128778283"
Received: from spr.sh.intel.com ([10.239.53.19])
 by orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 10 Apr 2025 00:23:08 -0700
From: Chao Gao
To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
 tglx@linutronix.de, dave.hansen@intel.com, seanjc@google.com,
 pbonzini@redhat.com
Cc: peterz@infradead.org, rick.p.edgecombe@intel.com,
 weijiang.yang@intel.com, john.allen@amd.com, bp@alien8.de,
 chang.seok.bae@intel.com, xin3.li@intel.com, Chao Gao, Maxim Levitsky,
 Ingo Molnar, Dave Hansen, "H. Peter Anvin", Samuel Holland,
 Mitchell Levy, Zhao Liu, Aruna Ramakrishna, Vignesh Balasubramanian
Subject: [PATCH v5 7/7] x86/fpu/xstate: Add CET supervisor xfeature support as a guest-only feature
Date: Thu, 10 Apr 2025 15:24:47 +0800
Message-ID: <20250410072605.2358393-8-chao.gao@intel.com>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20250410072605.2358393-1-chao.gao@intel.com>
References: <20250410072605.2358393-1-chao.gao@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Yang Weijiang

== Background ==

CET defines two register states: CET user, which includes user-mode control
registers, and CET supervisor, which consists of shadow-stack pointers for
privilege levels 0-2.

Current kernels disable shadow stacks in kernel mode, making the CET
supervisor state unused and eliminating the need for context switching.

== Problem ==

To virtualize CET for guests, KVM must accurately emulate hardware
behavior. A key challenge arises because there is no CPUID flag to indicate
that shadow stack is supported only in user mode. Therefore, KVM cannot
assume guests will not enable shadow stacks in kernel mode and must
preserve the CET supervisor state of vCPUs.

== Solution ==

An initial proposal to manually save and restore CET supervisor states
using raw RDMSR/WRMSR in KVM was rejected due to performance concerns and
its impact on KVM's ABI. Instead, leveraging the kernel's FPU
infrastructure for context switching was favored [1].

The main question then became whether to enable the CET supervisor state
globally for all processes or restrict it to vCPU processes. This decision
involves a trade-off between a 24-byte XSTATE buffer waste for all non-vCPU
processes and approximately 100 lines of code complexity in the kernel [2].
The agreed approach is to first try this optimal solution [3], i.e.,
restricting the CET supervisor state to guest FPUs only and eliminating
unnecessary space waste.

The guest-only xfeature infrastructure has already been added. Now,
introduce CET supervisor xstate support as the first guest-only feature
to prepare for the upcoming CET virtualization in KVM.

Signed-off-by: Yang Weijiang
Signed-off-by: Chao Gao
Reviewed-by: Rick Edgecombe
Reviewed-by: Maxim Levitsky
Link: https://lore.kernel.org/kvm/ZM1jV3UPL0AMpVDI@google.com/ [1]
Link: https://lore.kernel.org/kvm/1c2fd06e-2e97-4724-80ab-8695aa4334e7@intel.com/ [2]
Link: https://lore.kernel.org/kvm/2597a87b-1248-b8ce-ce60-94074bc67ea4@intel.com/ [3]
---
v5:
Introduce CET supervisor xfeature directly as a guest-only feature, rather
than first introducing it in one patch and then converting it to guest-only
in a subsequent patch. (Chang)
Add new features after cleanups/bug fixes (Chang, Dave, Ingo)
Improve the commit message to follow the suggested
background-problem-solution pattern.
---
 arch/x86/include/asm/fpu/types.h  | 14 ++++++++++++--
 arch/x86/include/asm/fpu/xstate.h |  5 ++---
 arch/x86/kernel/fpu/xstate.c      |  5 ++++-
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 7494d732b296..c9b83beb6d74 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -118,7 +118,7 @@ enum xfeature {
 	XFEATURE_PKRU,
 	XFEATURE_PASID,
 	XFEATURE_CET_USER,
-	XFEATURE_CET_KERNEL_UNUSED,
+	XFEATURE_CET_KERNEL,
 	XFEATURE_RSRVD_COMP_13,
 	XFEATURE_RSRVD_COMP_14,
 	XFEATURE_LBR,
@@ -141,7 +141,7 @@ enum xfeature {
 #define XFEATURE_MASK_PKRU (1 << XFEATURE_PKRU)
 #define XFEATURE_MASK_PASID (1 << XFEATURE_PASID)
 #define XFEATURE_MASK_CET_USER (1 << XFEATURE_CET_USER)
-#define XFEATURE_MASK_CET_KERNEL (1 << XFEATURE_CET_KERNEL_UNUSED)
+#define XFEATURE_MASK_CET_KERNEL (1 << XFEATURE_CET_KERNEL)
 #define XFEATURE_MASK_LBR (1 << XFEATURE_LBR)
 #define XFEATURE_MASK_XTILE_CFG (1 << XFEATURE_XTILE_CFG)
 #define XFEATURE_MASK_XTILE_DATA (1 << XFEATURE_XTILE_DATA)
@@ -266,6 +266,16 @@ struct cet_user_state {
 	u64 user_ssp;
 };

+/*
+ * State component 12 is Control-flow Enforcement supervisor states.
+ * This state includes SSP pointers for privilege levels 0 through 2.
+ */
+struct cet_supervisor_state {
+	u64 pl0_ssp;
+	u64 pl1_ssp;
+	u64 pl2_ssp;
+} __packed;
+
 /*
  * State component 15: Architectural LBR configuration state.
  * The size of Arch LBR state depends on the number of LBRs (lbr_depth).
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 62768d2131ec..86070ac1c708 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -46,7 +46,7 @@
 #define XFEATURE_MASK_USER_DYNAMIC XFEATURE_MASK_XTILE_DATA

 /* Supervisor features which are enabled only in guest FPUs */
-#define XFEATURE_MASK_GUEST_SUPERVISOR 0
+#define XFEATURE_MASK_GUEST_SUPERVISOR XFEATURE_MASK_CET_KERNEL

 /* All currently supported supervisor features */
 #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
@@ -78,8 +78,7 @@
  * Unsupported supervisor features. When a supervisor feature in this mask is
  * supported in the future, move it to the supported supervisor feature mask.
  */
-#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT | \
-					      XFEATURE_MASK_CET_KERNEL)
+#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT)

 /* All supervisor states including supported and unsupported states. */
 #define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c7db9f1407f5..e12df668291c 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -56,7 +56,7 @@ static const char *xfeature_names[] =
 	"Protection Keys User registers",
 	"PASID state",
 	"Control-flow User registers",
-	"Control-flow Kernel registers (unused)",
+	"Control-flow Kernel registers (KVM only)",
 	"unknown xstate feature",
 	"unknown xstate feature",
 	"unknown xstate feature",
@@ -79,6 +79,7 @@ static unsigned short xsave_cpuid_features[] __initdata = {
 	[XFEATURE_PKRU] = X86_FEATURE_OSPKE,
 	[XFEATURE_PASID] = X86_FEATURE_ENQCMD,
 	[XFEATURE_CET_USER] = X86_FEATURE_SHSTK,
+	[XFEATURE_CET_KERNEL] = X86_FEATURE_SHSTK,
 	[XFEATURE_XTILE_CFG] = X86_FEATURE_AMX_TILE,
 	[XFEATURE_XTILE_DATA] = X86_FEATURE_AMX_TILE,
 };
@@ -369,6 +370,7 @@ static __init void os_xrstor_booting(struct xregs_state *xstate)
 	 XFEATURE_MASK_BNDCSR | \
 	 XFEATURE_MASK_PASID | \
 	 XFEATURE_MASK_CET_USER | \
+	 XFEATURE_MASK_CET_KERNEL | \
 	 XFEATURE_MASK_XTILE)

 /*
@@ -569,6 +571,7 @@ static bool __init check_xstate_against_struct(int nr)
 	case XFEATURE_PASID: return XCHECK_SZ(sz, nr, struct ia32_pasid_state);
 	case XFEATURE_XTILE_CFG: return XCHECK_SZ(sz, nr, struct xtile_cfg);
 	case XFEATURE_CET_USER: return XCHECK_SZ(sz, nr, struct cet_user_state);
+	case XFEATURE_CET_KERNEL: return XCHECK_SZ(sz, nr, struct cet_supervisor_state);
 	case XFEATURE_XTILE_DATA: check_xtile_data_against_struct(sz); return true;
 	default:
 		XSTATE_WARN_ON(1, "No structure for xstate: %d\n", nr);
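
Illustration only, not part of the series: a minimal userspace sketch of the
mask arithmetic that patches 6/7 and 7/7 add to init_default_features(), plus
the 24-byte figure quoted in the 7/7 commit message. The helper names
host_default_features() and guest_default_features() and the macro
EXAMPLE_MAX_FEATURES are hypothetical and exist only for this example. The bit
positions 10 to 12 follow the enum xfeature ordering visible in the diff, and
18 for XTILE_DATA is assumed from the upstream enum. The real code works on
fpu_kernel_cfg and guest_default_cfg rather than on plain function arguments.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions assumed from enum xfeature (state components 10, 11, 12, 18). */
#define XFEATURE_MASK_PASID       (1ULL << 10)
#define XFEATURE_MASK_CET_USER    (1ULL << 11)
#define XFEATURE_MASK_CET_KERNEL  (1ULL << 12)
#define XFEATURE_MASK_XTILE_DATA  (1ULL << 18)

#define XFEATURE_MASK_USER_DYNAMIC      XFEATURE_MASK_XTILE_DATA
#define XFEATURE_MASK_GUEST_SUPERVISOR  XFEATURE_MASK_CET_KERNEL

/* Hypothetical "everything the CPU and kernel support" set for the example. */
#define EXAMPLE_MAX_FEATURES (XFEATURE_MASK_PASID | XFEATURE_MASK_CET_USER | \
			      XFEATURE_MASK_CET_KERNEL | XFEATURE_MASK_XTILE_DATA)

/* Same layout as the struct added in 7/7: three 8-byte shadow-stack pointers. */
struct cet_supervisor_state {
	uint64_t pl0_ssp;
	uint64_t pl1_ssp;
	uint64_t pl2_ssp;
};

/* The "24-byte XSTATE buffer waste" in the commit message is this size. */
_Static_assert(sizeof(struct cet_supervisor_state) == 24,
	       "CET supervisor state is expected to be 24 bytes");

/* Host defaults: drop dynamic and guest-only features (hypothetical helper). */
uint64_t host_default_features(uint64_t kernel_max_features)
{
	return kernel_max_features &
	       ~(XFEATURE_MASK_USER_DYNAMIC | XFEATURE_MASK_GUEST_SUPERVISOR);
}

/* Guest defaults: host defaults plus supported guest-only features. */
uint64_t guest_default_features(uint64_t kernel_max_features)
{
	return host_default_features(kernel_max_features) |
	       (kernel_max_features & XFEATURE_MASK_GUEST_SUPERVISOR);
}
```

With EXAMPLE_MAX_FEATURES as input, the host default set keeps only PASID and
CET_USER, while the guest default set additionally keeps CET_KERNEL. XTILE_DATA
stays out of both because it is a user dynamic feature.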