Message ID | 20250307164123.1613414-10-chao.gao@intel.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Introduce CET supervisor state support | expand |
On 3/7/2025 8:41 AM, Chao Gao wrote: > > Define a new XFEATURE_MASK_KERNEL_DYNAMIC mask to specify the features > that can be optionally enabled by kernel components. This is similar to > XFEATURE_MASK_USER_DYNAMIC in that it contains optional xfeatures that > can allows the FPU buffer to be dynamically sized. The difference is that > the KERNEL variant contains supervisor features and will be enabled by > kernel components that need them, and not directly by the user. Currently > it's used by KVM to configure guest dedicated fpstate for calculating > the xfeature and fpstate storage size etc. > > Kernel dynamic features are enabled for the guest FPU and disabled for > the kernel FPU, effectively making them guest-only features. > > Set XFEATURE_CET_KERNEL as the first kernel dynamic feature, as it is > required only by the guest FPU for the upcoming CET virtualization > support in KVM. When introducing user dynamic features, AMX required a large state, so buffer reallocation for expansion was deferred until it was actually used. This introduction was associated with introducing a permission mechanism, which was expected to be requested by userspace. For VCPU tasks, the userspace component (QEMU) requests permission [1], and buffer expansion then follows based on the exposed CPUID determination [2]. Now, regarding the new kernel dynamic features, I’m unsure whether this changelog or anything else sufficiently describes its semantics distintively. It appears that both permission grant and buffer allocation for the kernel dynamic feature occur at VCPU allocation time. However, this model differs from the deferred buffer expansion model for user dynamic features. If the kernel dynamic feature model were to follow the same deferred reallocation approach as user dynamic features, buffer reallocation would be expected. In that case, I'd also question whether fpu_guest_cfg is truly necessary. VCPU allocation could still rely on fpu_kernel_cfg, and fpu->guest_perm could be extrapolated from fpu->perm or fpu_kernel_cfg. Then, reallocation could proceed as usual based on the permission, extending fpu_enable_guest_xfd_features(), possibly renaming it to fpu_enable_dynamic_features(). That said, this is a relatively small state. Even if the intent was to introduce a new semantic model distinct from user dynamic features, it should be clearly documented to avoid confusion. On the other hand, if the goal is rather to establish a new approach for handling a previously nonexistent set of guest-exclusive features, then the current approach remains somewhat convoluted without clear descriptions. Perhaps, I'm missing something. Thanks, Chang [1] https://github.com/qemu/qemu/blob/master/target/i386/kvm/kvm.c#L6395 [2] https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kvm/cpuid.c#n195
>When introducing user dynamic features, AMX required a large state, so buffer >reallocation for expansion was deferred until it was actually used. This >introduction was associated with introducing a permission mechanism, which >was expected to be requested by userspace. > >For VCPU tasks, the userspace component (QEMU) requests permission [1], and >buffer expansion then follows based on the exposed CPUID determination [2]. > >Now, regarding the new kernel dynamic features, I’m unsure whether this >changelog or anything else sufficiently describes its semantics distintively. >It appears that both permission grant and buffer allocation for the kernel >dynamic feature occur at VCPU allocation time. However, this model differs >from the deferred buffer expansion model for user dynamic features. > >If the kernel dynamic feature model were to follow the same deferred >reallocation approach as user dynamic features, buffer reallocation would be >expected. In that case, I'd also question whether fpu_guest_cfg is truly >necessary. > >VCPU allocation could still rely on fpu_kernel_cfg, and fpu->guest_perm could >be extrapolated from fpu->perm or fpu_kernel_cfg. Then, reallocation could >proceed as usual based on the permission, extending >fpu_enable_guest_xfd_features(), possibly renaming it to >fpu_enable_dynamic_features(). > >That said, this is a relatively small state. Yes, there's no need to make the guest FPU dynamically sized for the CET supervisor state, as it is only 24 bytes. XFEATURE_MASK_KERNEL_DYNAMIC is a misnomer. It is misleading readers into thinking it involves permission requests and dynamic sizing, similar to XFEATURE_MASK_USER_DYNAMIC >Even if the intent was to >introduce a new semantic model distinct from user dynamic features, it should >be clearly documented to avoid confusion. The goal isn't to add a new semantic model for dynamic features. > >On the other hand, if the goal is rather to establish a new approach for >handling a previously nonexistent set of guest-exclusive features, then the Yes. This is the goal of this patch. >current approach remains somewhat convoluted without clear descriptions. >Perhaps, I'm missing something. Do you mean this patch is "somewhat convoluted"? or the whole series? I am assuming you meant this series as this patch itself is quite small. Here is how this series is organized: Patches 1–4 : Cleanups and preparatory fixes. Patches 5–7 : Introduce fpu_guest_cfg to formalize guest FPU configuration. Patch 8 (Primary Goal): Add CET supervisor state support. Patches 9–10 : make CET supserviosr state a guest-only feature to save XSAVE buffer space for non-guest FPUs (placed at the end for easier review/drop). I believe the "somewhat convoluted" impression comes from the introduction of fpu_guest_cfg. But as I alluded to in patch 5's changelog, fpu_guest_cfg actually simplifies the architecture rather than adding complexity, with minimal overhead, i.e., a single global config. It was suggested by Sean [1]. In my view, it offers three benefits: - Readability: Removes ambiguity in fpu_alloc_guest_fpstate() by initializing the guest FPU with its own config. - Extensibility: Supports clean addition of guest-only features (e.g., CET supervisor state) or potentially kernel-only features (e.g., PASID, which is not used by guest FPUs) - Robustness: Prevent issues like those addressed by patches 3/4. It is possible to make some features guest-only without fpu_guest_cfg, but doing so would make fpu_alloc_guest_fpstate() a bit difficult to understand. See [2]. [1]: https://lore.kernel.org/kvm/ZTf5wPKXuHBQk0AN@google.com/ [2]: https://lore.kernel.org/kvm/20230914063325.85503-8-weijiang.yang@intel.com/ > >Thanks, >Chang > >[1] https://github.com/qemu/qemu/blob/master/target/i386/kvm/kvm.c#L6395 >[2] https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kvm/cpuid.c#n195 Thanks for these references.
On 3/9/2025 8:49 PM, Chao Gao wrote: > > It was suggested by Sean [1]. ... > [1]: https://lore.kernel.org/kvm/ZTf5wPKXuHBQk0AN@google.com/ But, you're defining a kernel "dynamic" feature while introducing a "guest-only" xfeature concept. Both seem to be mixed together with this patch. Why not call it as a guest-only feature? That's what Sean was suggesting, no? "I would much prefer to avoid the whole "dynamic" thing and instead make CET explicitly guest-only. E.g. fpu_kernel_guest_only_xfeatures? Or even better if it doesn't cause weirdness elsewhere, a dedicated fpu_guest_cfg. For me at least, a fpu_guest_cfg would make it easier to understand what all is going on." Thanks, Chang
On Sun, Mar 09, 2025 at 10:20:47PM -0700, Chang S. Bae wrote: >On 3/9/2025 8:49 PM, Chao Gao wrote: >> >> It was suggested by Sean [1]. >... >> [1]: https://lore.kernel.org/kvm/ZTf5wPKXuHBQk0AN@google.com/ > >But, you're defining a kernel "dynamic" feature while introducing a >"guest-only" xfeature concept. Both seem to be mixed together with this >patch. Why not call it as a guest-only feature? That's what Sean was >suggesting, no? Yes. I agree that we should call it as a guest-only feature. That's also why I included a note in this patch below the "---" to seek feedback on the naming: I am tempted to rename XFEATURE_MASK_KERNEL_DYNAMIC to XFEATURE_MASK_GUEST_ONLY. But I am not sure if this was discussed and rejected. Thanks for confirming that the renaming is necessary.
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h index 8990cf381bef..f342715d204b 100644 --- a/arch/x86/include/asm/fpu/xstate.h +++ b/arch/x86/include/asm/fpu/xstate.h @@ -42,9 +42,12 @@ #define XFEATURE_MASK_USER_RESTORE \ (XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU) -/* Features which are dynamically enabled for a process on request */ +/* Features which are dynamically enabled per userspace request */ #define XFEATURE_MASK_USER_DYNAMIC XFEATURE_MASK_XTILE_DATA +/* Features which are dynamically enabled per kernel side request */ +#define XFEATURE_MASK_KERNEL_DYNAMIC XFEATURE_MASK_CET_KERNEL + /* All currently supported supervisor features */ #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \ XFEATURE_MASK_CET_USER | \ diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 12613ebdbb5d..e5284e67dfec 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -826,6 +826,7 @@ void __init fpu__init_system_xstate(unsigned int legacy_size) /* Clean out dynamic features from default */ fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features; fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC; + fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_KERNEL_DYNAMIC; fpu_user_cfg.default_features = fpu_user_cfg.max_features; fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;