diff mbox series

[v3,09/10] x86/fpu/xstate: Introduce XFEATURE_MASK_KERNEL_DYNAMIC xfeature set

Message ID 20250307164123.1613414-10-chao.gao@intel.com (mailing list archive)
State New
Headers show
Series Introduce CET supervisor state support | expand

Commit Message

Chao Gao March 7, 2025, 4:41 p.m. UTC
From: Yang Weijiang <weijiang.yang@intel.com>

Define a new XFEATURE_MASK_KERNEL_DYNAMIC mask to specify the features
that can be optionally enabled by kernel components. This is similar to
XFEATURE_MASK_USER_DYNAMIC in that it contains optional xfeatures that
can allows the FPU buffer to be dynamically sized. The difference is that
the KERNEL variant contains supervisor features and will be enabled by
kernel components that need them, and not directly by the user. Currently
it's used by KVM to configure guest dedicated fpstate for calculating
the xfeature and fpstate storage size etc.

Kernel dynamic features are enabled for the guest FPU and disabled for
the kernel FPU, effectively making them guest-only features.

Set XFEATURE_CET_KERNEL as the first kernel dynamic feature, as it is
required only by the guest FPU for the upcoming CET virtualization
support in KVM.

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
I am tempted to rename XFEATURE_MASK_KERNEL_DYNAMIC to
XFEATURE_MASK_GUEST_ONLY. But I am not sure if this was discussed
and rejected.
---
 arch/x86/include/asm/fpu/xstate.h | 5 ++++-
 arch/x86/kernel/fpu/xstate.c      | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

Comments

Chang S. Bae March 9, 2025, 10:06 p.m. UTC | #1
On 3/7/2025 8:41 AM, Chao Gao wrote:
> 
> Define a new XFEATURE_MASK_KERNEL_DYNAMIC mask to specify the features
> that can be optionally enabled by kernel components. This is similar to
> XFEATURE_MASK_USER_DYNAMIC in that it contains optional xfeatures that
> can allows the FPU buffer to be dynamically sized. The difference is that
> the KERNEL variant contains supervisor features and will be enabled by
> kernel components that need them, and not directly by the user. Currently
> it's used by KVM to configure guest dedicated fpstate for calculating
> the xfeature and fpstate storage size etc.
> 
> Kernel dynamic features are enabled for the guest FPU and disabled for
> the kernel FPU, effectively making them guest-only features.
> 
> Set XFEATURE_CET_KERNEL as the first kernel dynamic feature, as it is
> required only by the guest FPU for the upcoming CET virtualization
> support in KVM.

When introducing user dynamic features, AMX required a large state, so 
buffer reallocation for expansion was deferred until it was actually 
used. This introduction was associated with introducing a permission 
mechanism, which was expected to be requested by userspace.

For VCPU tasks, the userspace component (QEMU) requests permission [1], 
and buffer expansion then follows based on the exposed CPUID 
determination [2].

Now, regarding the new kernel dynamic features, I’m unsure whether this 
changelog or anything else sufficiently describes its semantics 
distintively. It appears that both permission grant and buffer 
allocation for the kernel dynamic feature occur at VCPU allocation time. 
However, this model differs from the deferred buffer expansion model for 
user dynamic features.

If the kernel dynamic feature model were to follow the same deferred 
reallocation approach as user dynamic features, buffer reallocation 
would be expected. In that case, I'd also question whether fpu_guest_cfg 
is truly necessary.

VCPU allocation could still rely on fpu_kernel_cfg, and fpu->guest_perm 
could be extrapolated from fpu->perm or fpu_kernel_cfg. Then, 
reallocation could proceed as usual based on the permission, extending 
fpu_enable_guest_xfd_features(), possibly renaming it to 
fpu_enable_dynamic_features().

That said, this is a relatively small state. Even if the intent was to 
introduce a new semantic model distinct from user dynamic features, it 
should be clearly documented to avoid confusion.

On the other hand, if the goal is rather to establish a new approach for 
handling a previously nonexistent set of guest-exclusive features, then 
the current approach remains somewhat convoluted without clear 
descriptions. Perhaps, I'm missing something.

Thanks,
Chang

[1] https://github.com/qemu/qemu/blob/master/target/i386/kvm/kvm.c#L6395
[2] 
https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kvm/cpuid.c#n195
Chao Gao March 10, 2025, 3:49 a.m. UTC | #2
>When introducing user dynamic features, AMX required a large state, so buffer
>reallocation for expansion was deferred until it was actually used. This
>introduction was associated with introducing a permission mechanism, which
>was expected to be requested by userspace.
>
>For VCPU tasks, the userspace component (QEMU) requests permission [1], and
>buffer expansion then follows based on the exposed CPUID determination [2].
>
>Now, regarding the new kernel dynamic features, I’m unsure whether this
>changelog or anything else sufficiently describes its semantics distintively.
>It appears that both permission grant and buffer allocation for the kernel
>dynamic feature occur at VCPU allocation time. However, this model differs
>from the deferred buffer expansion model for user dynamic features.
>
>If the kernel dynamic feature model were to follow the same deferred
>reallocation approach as user dynamic features, buffer reallocation would be
>expected. In that case, I'd also question whether fpu_guest_cfg is truly
>necessary.
>
>VCPU allocation could still rely on fpu_kernel_cfg, and fpu->guest_perm could
>be extrapolated from fpu->perm or fpu_kernel_cfg. Then, reallocation could
>proceed as usual based on the permission, extending
>fpu_enable_guest_xfd_features(), possibly renaming it to
>fpu_enable_dynamic_features().
>
>That said, this is a relatively small state.

Yes, there's no need to make the guest FPU dynamically sized for the CET
supervisor state, as it is only 24 bytes.

XFEATURE_MASK_KERNEL_DYNAMIC is a misnomer. It is misleading readers into
thinking it involves permission requests and dynamic sizing, similar to
XFEATURE_MASK_USER_DYNAMIC

>Even if the intent was to
>introduce a new semantic model distinct from user dynamic features, it should
>be clearly documented to avoid confusion.

The goal isn't to add a new semantic model for dynamic features.

>
>On the other hand, if the goal is rather to establish a new approach for
>handling a previously nonexistent set of guest-exclusive features, then the

Yes. This is the goal of this patch.

>current approach remains somewhat convoluted without clear descriptions.
>Perhaps, I'm missing something.

Do you mean this patch is "somewhat convoluted"? or the whole series?

I am assuming you meant this series as this patch itself is quite small.

Here is how this series is organized:

Patches 1–4 : Cleanups and preparatory fixes.
Patches 5–7 : Introduce fpu_guest_cfg to formalize guest FPU configuration.
Patch 8 (Primary Goal): Add CET supervisor state support.
Patches 9–10 : make CET supserviosr state a guest-only feature to save XSAVE buffer
	       space for non-guest FPUs (placed at the end for easier review/drop).

I believe the "somewhat convoluted" impression comes from the introduction of
fpu_guest_cfg. But as I alluded to in patch 5's changelog, fpu_guest_cfg
actually simplifies the architecture rather than adding complexity, with
minimal overhead, i.e., a single global config. It was suggested by Sean [1].
In my view, it offers three benefits:

 - Readability: Removes ambiguity in fpu_alloc_guest_fpstate() by initializing
		the guest FPU with its own config.

 - Extensibility: Supports clean addition of guest-only features (e.g., CET
		  supervisor state) or potentially kernel-only features (e.g.,
		  PASID, which is not used by guest FPUs)

 - Robustness: Prevent issues like those addressed by patches 3/4.


It is possible to make some features guest-only without fpu_guest_cfg, but
doing so would make fpu_alloc_guest_fpstate() a bit difficult to understand.
See [2].

[1]: https://lore.kernel.org/kvm/ZTf5wPKXuHBQk0AN@google.com/
[2]: https://lore.kernel.org/kvm/20230914063325.85503-8-weijiang.yang@intel.com/

>
>Thanks,
>Chang
>
>[1] https://github.com/qemu/qemu/blob/master/target/i386/kvm/kvm.c#L6395
>[2] https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kvm/cpuid.c#n195

Thanks for these references.
Chang S. Bae March 10, 2025, 5:20 a.m. UTC | #3
On 3/9/2025 8:49 PM, Chao Gao wrote:
> 
> It was suggested by Sean [1].
...
 > [1]: https://lore.kernel.org/kvm/ZTf5wPKXuHBQk0AN@google.com/

But, you're defining a kernel "dynamic" feature while introducing a
"guest-only" xfeature concept. Both seem to be mixed together with this
patch. Why not call it as a guest-only feature? That's what Sean was
suggesting, no?

"I would much prefer to avoid the whole "dynamic" thing and instead make 
CET explicitly guest-only.  E.g. fpu_kernel_guest_only_xfeatures?  Or 
even better if it doesn't cause weirdness elsewhere, a dedicated 
fpu_guest_cfg.  For me at least, a fpu_guest_cfg would make it easier to 
understand what all is going on."

Thanks,
Chang
Chao Gao March 10, 2025, 5:53 a.m. UTC | #4
On Sun, Mar 09, 2025 at 10:20:47PM -0700, Chang S. Bae wrote:
>On 3/9/2025 8:49 PM, Chao Gao wrote:
>> 
>> It was suggested by Sean [1].
>...
>> [1]: https://lore.kernel.org/kvm/ZTf5wPKXuHBQk0AN@google.com/
>
>But, you're defining a kernel "dynamic" feature while introducing a
>"guest-only" xfeature concept. Both seem to be mixed together with this
>patch. Why not call it as a guest-only feature? That's what Sean was
>suggesting, no?

Yes. I agree that we should call it as a guest-only feature. That's also why I
included a note in this patch below the "---" to seek feedback on the naming:

	I am tempted to rename XFEATURE_MASK_KERNEL_DYNAMIC to
	XFEATURE_MASK_GUEST_ONLY. But I am not sure if this was discussed
	and rejected.

Thanks for confirming that the renaming is necessary.
diff mbox series

Patch

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 8990cf381bef..f342715d204b 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -42,9 +42,12 @@ 
 #define XFEATURE_MASK_USER_RESTORE	\
 	(XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
 
-/* Features which are dynamically enabled for a process on request */
+/* Features which are dynamically enabled per userspace request */
 #define XFEATURE_MASK_USER_DYNAMIC	XFEATURE_MASK_XTILE_DATA
 
+/* Features which are dynamically enabled per kernel side request */
+#define XFEATURE_MASK_KERNEL_DYNAMIC	XFEATURE_MASK_CET_KERNEL
+
 /* All currently supported supervisor features */
 #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
 					    XFEATURE_MASK_CET_USER | \
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 12613ebdbb5d..e5284e67dfec 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -826,6 +826,7 @@  void __init fpu__init_system_xstate(unsigned int legacy_size)
 	/* Clean out dynamic features from default */
 	fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
 	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
+	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_KERNEL_DYNAMIC;
 
 	fpu_user_cfg.default_features = fpu_user_cfg.max_features;
 	fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;