diff mbox series

[v7,04/26] x86/fpu/xstate: Introduce XFEATURE_MASK_KERNEL_DYNAMIC xfeature set

Message ID 20231124055330.138870-5-weijiang.yang@intel.com (mailing list archive)
State New, archived
Headers show
Series Enable CET Virtualization | expand

Commit Message

Yang, Weijiang Nov. 24, 2023, 5:53 a.m. UTC
Define new XFEATURE_MASK_KERNEL_DYNAMIC set including the features can be
optionally enabled by kernel components, i.e., the features are required by
specific kernel components. Currently it's used by KVM to configure guest
dedicated fpstate for calculating the xfeature and fpstate storage size etc.

The kernel dynamic xfeatures now only contain XFEATURE_CET_KERNEL, which is
supported by host as they're enabled in xsaves/xrstors operating xfeature set
(XCR0 | XSS), but the relevant CPU feature, i.e., supervisor shadow stack, is
not enabled in host kernel so it can be omitted for normal fpstate by default.

Remove the kernel dynamic feature from fpu_kernel_cfg.default_features so that
the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors can be
optimized by HW for normal fpstate.

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/asm/fpu/xstate.h | 5 ++++-
 arch/x86/kernel/fpu/xstate.c      | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

Comments

Edgecombe, Rick P Nov. 28, 2023, 1:46 a.m. UTC | #1
On Fri, 2023-11-24 at 00:53 -0500, Yang Weijiang wrote:
> Define new XFEATURE_MASK_KERNEL_DYNAMIC set including the features
> can be
> optionally enabled by kernel components, i.e., the features are
> required by
> specific kernel components.

The above is a bit tough to parse. Does any of this seem clearer?

Define a new XFEATURE_MASK_KERNEL_DYNAMIC mask to specify the features
that can be optionally enabled by kernel components. This is similar to
XFEATURE_MASK_KERNEL_DYNAMIC in that it contains optional xfeatures
that can allows the FPU buffer to be dynamically sized. The difference
is that the KERNEL variant contains supervisor features and will be
enabled by kernel components that need them, and not directly by the
user.

>  Currently it's used by KVM to configure guest
> dedicated fpstate for calculating the xfeature and fpstate storage
> size etc.
> 
> The kernel dynamic xfeatures now only contain XFEATURE_CET_KERNEL,
> which is
> supported by host as they're enabled in xsaves/xrstors operating
> xfeature set
> (XCR0 | XSS), but the relevant CPU feature, i.e., supervisor shadow
> stack, is
> not enabled in host kernel so it can be omitted for normal fpstate by
> default.
> 
> Remove the kernel dynamic feature from
> fpu_kernel_cfg.default_features so that
> the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors can
> be
> optimized by HW for normal fpstate.

Thanks for breaking these into small patches.
Yang, Weijiang Nov. 28, 2023, 8 a.m. UTC | #2
On 11/28/2023 9:46 AM, Edgecombe, Rick P wrote:
> On Fri, 2023-11-24 at 00:53 -0500, Yang Weijiang wrote:
>> Define new XFEATURE_MASK_KERNEL_DYNAMIC set including the features
>> can be
>> optionally enabled by kernel components, i.e., the features are
>> required by
>> specific kernel components.
> The above is a bit tough to parse. Does any of this seem clearer?
>
> Define a new XFEATURE_MASK_KERNEL_DYNAMIC mask to specify the features
> that can be optionally enabled by kernel components. This is similar to
> XFEATURE_MASK_KERNEL_DYNAMIC in that it contains optional xfeatures
> that can allows the FPU buffer to be dynamically sized. The difference
> is that the KERNEL variant contains supervisor features and will be
> enabled by kernel components that need them, and not directly by the
> user.

Definitely the wording is much better,  I'll apply it to next version, thanks a lot!

>>   Currently it's used by KVM to configure guest
>> dedicated fpstate for calculating the xfeature and fpstate storage
>> size etc.
>>
>> The kernel dynamic xfeatures now only contain XFEATURE_CET_KERNEL,
>> which is
>> supported by host as they're enabled in xsaves/xrstors operating
>> xfeature set
>> (XCR0 | XSS), but the relevant CPU feature, i.e., supervisor shadow
>> stack, is
>> not enabled in host kernel so it can be omitted for normal fpstate by
>> default.
>>
>> Remove the kernel dynamic feature from
>> fpu_kernel_cfg.default_features so that
>> the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors can
>> be
>> optimized by HW for normal fpstate.
> Thanks for breaking these into small patches.
Maxim Levitsky Nov. 30, 2023, 5:33 p.m. UTC | #3
On Fri, 2023-11-24 at 00:53 -0500, Yang Weijiang wrote:
> Define new XFEATURE_MASK_KERNEL_DYNAMIC set including the features can be

I am not sure though that this name is correct, but I don't know if I can
suggest a better name.

> optionally enabled by kernel components, i.e., the features are required by
> specific kernel components. Currently it's used by KVM to configure guest
> dedicated fpstate for calculating the xfeature and fpstate storage size etc.
> 
> The kernel dynamic xfeatures now only contain XFEATURE_CET_KERNEL, which is
> supported by host as they're enabled in xsaves/xrstors operating xfeature set
> (XCR0 | XSS), but the relevant CPU feature, i.e., supervisor shadow stack, is
> not enabled in host kernel so it can be omitted for normal fpstate by default.
> 
> Remove the kernel dynamic feature from fpu_kernel_cfg.default_features so that
> the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors can be
> optimized by HW for normal fpstate.
> 
> Suggested-by: Dave Hansen <dave.hansen@intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/include/asm/fpu/xstate.h | 5 ++++-
>  arch/x86/kernel/fpu/xstate.c      | 1 +
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
> index 3b4a038d3c57..a212d3851429 100644
> --- a/arch/x86/include/asm/fpu/xstate.h
> +++ b/arch/x86/include/asm/fpu/xstate.h
> @@ -46,9 +46,12 @@
>  #define XFEATURE_MASK_USER_RESTORE	\
>  	(XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
>  
> -/* Features which are dynamically enabled for a process on request */
> +/* Features which are dynamically enabled per userspace request */
>  #define XFEATURE_MASK_USER_DYNAMIC	XFEATURE_MASK_XTILE_DATA
>  
> +/* Features which are dynamically enabled per kernel side request */

I suggest to explain this a bit better. How about something like that:

"Kernel features that are not enabled by default for all processes, but can
be still used by some processes, for example to support guest virtualization"

But feel free to keep it as is or propose something else. IMHO this will
be confusing this way or another.


Another question: kernel already has a notion of 'independent features'
which are currently kernel features that are enabled in IA32_XSS but not present in 'fpu_kernel_cfg.max_features'

Currently only 'XFEATURE_LBR' is in this set. These features are saved/restored manually
from independent buffer (in case of LBRs, perf code cares for this).

Does it make sense to add CET_S to there as well instead of having XFEATURE_MASK_KERNEL_DYNAMIC, and maybe rename the
'XFEATURE_MASK_INDEPENDENT' to something like 'XFEATURES_THE_KERNEL_DOESNT_CARE_ABOUT'
(terrible name, but you might think of a better name)


> +#define XFEATURE_MASK_KERNEL_DYNAMIC	XFEATURE_MASK_CET_KERNEL
> +
>  /* All currently supported supervisor features */
>  #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
>  					    XFEATURE_MASK_CET_USER | \
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index b57d909facca..ba4172172afd 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -824,6 +824,7 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
>  	/* Clean out dynamic features from default */
>  	fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
>  	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
> +	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_KERNEL_DYNAMIC;
>  
>  	fpu_user_cfg.default_features = fpu_user_cfg.max_features;
>  	fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;



Best regards,
	Maxim Levitsky
Yang, Weijiang Dec. 1, 2023, 7:49 a.m. UTC | #4
On 12/1/2023 1:33 AM, Maxim Levitsky wrote:
> On Fri, 2023-11-24 at 00:53 -0500, Yang Weijiang wrote:
>> Define new XFEATURE_MASK_KERNEL_DYNAMIC set including the features can be
> I am not sure though that this name is correct, but I don't know if I can
> suggest a better name.

It's a symmetry of XFEATURE_MASK_USER_DYNAMIC ;-)
>> optionally enabled by kernel components, i.e., the features are required by
>> specific kernel components. Currently it's used by KVM to configure guest
>> dedicated fpstate for calculating the xfeature and fpstate storage size etc.
>>
>> The kernel dynamic xfeatures now only contain XFEATURE_CET_KERNEL, which is
>> supported by host as they're enabled in xsaves/xrstors operating xfeature set
>> (XCR0 | XSS), but the relevant CPU feature, i.e., supervisor shadow stack, is
>> not enabled in host kernel so it can be omitted for normal fpstate by default.
>>
>> Remove the kernel dynamic feature from fpu_kernel_cfg.default_features so that
>> the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors can be
>> optimized by HW for normal fpstate.
>>
>> Suggested-by: Dave Hansen <dave.hansen@intel.com>
>> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
>> ---
>>   arch/x86/include/asm/fpu/xstate.h | 5 ++++-
>>   arch/x86/kernel/fpu/xstate.c      | 1 +
>>   2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
>> index 3b4a038d3c57..a212d3851429 100644
>> --- a/arch/x86/include/asm/fpu/xstate.h
>> +++ b/arch/x86/include/asm/fpu/xstate.h
>> @@ -46,9 +46,12 @@
>>   #define XFEATURE_MASK_USER_RESTORE	\
>>   	(XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
>>   
>> -/* Features which are dynamically enabled for a process on request */
>> +/* Features which are dynamically enabled per userspace request */
>>   #define XFEATURE_MASK_USER_DYNAMIC	XFEATURE_MASK_XTILE_DATA
>>   
>> +/* Features which are dynamically enabled per kernel side request */
> I suggest to explain this a bit better. How about something like that:
>
> "Kernel features that are not enabled by default for all processes, but can
> be still used by some processes, for example to support guest virtualization"

It looks good to me, will apply it in next version, thanks!

> But feel free to keep it as is or propose something else. IMHO this will
> be confusing this way or another.
>
>
> Another question: kernel already has a notion of 'independent features'
> which are currently kernel features that are enabled in IA32_XSS but not present in 'fpu_kernel_cfg.max_features'
>
> Currently only 'XFEATURE_LBR' is in this set. These features are saved/restored manually
> from independent buffer (in case of LBRs, perf code cares for this).
>
> Does it make sense to add CET_S to there as well instead of having XFEATURE_MASK_KERNEL_DYNAMIC,

CET_S here refers to PL{0,1,2}_SSP, right?

IMHO, perf relies on dedicated code to switch LBR MSRs for various reason, e.g., overhead, the feature
owns dozens of MSRs, remove xfeature bit will offload the burden of common FPU/xsave framework.

But CET only has 3 supervisor MSRs and they need to be managed together with user mode MSRs.
Enabling it in common FPU framework would make the switch/swap much easier without additional
support code.

>   and maybe rename the
> 'XFEATURE_MASK_INDEPENDENT' to something like 'XFEATURES_THE_KERNEL_DOESNT_CARE_ABOUT'
> (terrible name, but you might think of a better name)
>
>
>> +#define XFEATURE_MASK_KERNEL_DYNAMIC	XFEATURE_MASK_CET_KERNEL
>> +
>>   /* All currently supported supervisor features */
>>   #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
>>   					    XFEATURE_MASK_CET_USER | \
>> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
>> index b57d909facca..ba4172172afd 100644
>> --- a/arch/x86/kernel/fpu/xstate.c
>> +++ b/arch/x86/kernel/fpu/xstate.c
>> @@ -824,6 +824,7 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
>>   	/* Clean out dynamic features from default */
>>   	fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
>>   	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
>> +	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_KERNEL_DYNAMIC;
>>   
>>   	fpu_user_cfg.default_features = fpu_user_cfg.max_features;
>>   	fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
>
>
> Best regards,
> 	Maxim Levitsky
>
>
>
>
Maxim Levitsky Dec. 5, 2023, 9:55 a.m. UTC | #5
On Fri, 2023-12-01 at 15:49 +0800, Yang, Weijiang wrote:
> On 12/1/2023 1:33 AM, Maxim Levitsky wrote:
> > On Fri, 2023-11-24 at 00:53 -0500, Yang Weijiang wrote:
> > > Define new XFEATURE_MASK_KERNEL_DYNAMIC set including the features can be
> > I am not sure though that this name is correct, but I don't know if I can
> > suggest a better name.
> 
> It's a symmetry of XFEATURE_MASK_USER_DYNAMIC ;-)
> > > optionally enabled by kernel components, i.e., the features are required by
> > > specific kernel components. Currently it's used by KVM to configure guest
> > > dedicated fpstate for calculating the xfeature and fpstate storage size etc.
> > > 
> > > The kernel dynamic xfeatures now only contain XFEATURE_CET_KERNEL, which is
> > > supported by host as they're enabled in xsaves/xrstors operating xfeature set
> > > (XCR0 | XSS), but the relevant CPU feature, i.e., supervisor shadow stack, is
> > > not enabled in host kernel so it can be omitted for normal fpstate by default.
> > > 
> > > Remove the kernel dynamic feature from fpu_kernel_cfg.default_features so that
> > > the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors can be
> > > optimized by HW for normal fpstate.
> > > 
> > > Suggested-by: Dave Hansen <dave.hansen@intel.com>
> > > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> > > ---
> > >   arch/x86/include/asm/fpu/xstate.h | 5 ++++-
> > >   arch/x86/kernel/fpu/xstate.c      | 1 +
> > >   2 files changed, 5 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
> > > index 3b4a038d3c57..a212d3851429 100644
> > > --- a/arch/x86/include/asm/fpu/xstate.h
> > > +++ b/arch/x86/include/asm/fpu/xstate.h
> > > @@ -46,9 +46,12 @@
> > >   #define XFEATURE_MASK_USER_RESTORE	\
> > >   	(XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
> > >   
> > > -/* Features which are dynamically enabled for a process on request */
> > > +/* Features which are dynamically enabled per userspace request */
> > >   #define XFEATURE_MASK_USER_DYNAMIC	XFEATURE_MASK_XTILE_DATA
> > >   
> > > +/* Features which are dynamically enabled per kernel side request */
> > I suggest to explain this a bit better. How about something like that:
> > 
> > "Kernel features that are not enabled by default for all processes, but can
> > be still used by some processes, for example to support guest virtualization"
> 
> It looks good to me, will apply it in next version, thanks!
> 
> > But feel free to keep it as is or propose something else. IMHO this will
> > be confusing this way or another.
> > 
> > 
> > Another question: kernel already has a notion of 'independent features'
> > which are currently kernel features that are enabled in IA32_XSS but not present in 'fpu_kernel_cfg.max_features'
> > 
> > Currently only 'XFEATURE_LBR' is in this set. These features are saved/restored manually
> > from independent buffer (in case of LBRs, perf code cares for this).
> > 
> > Does it make sense to add CET_S to there as well instead of having XFEATURE_MASK_KERNEL_DYNAMIC,
> 
> CET_S here refers to PL{0,1,2}_SSP, right?
> 
> IMHO, perf relies on dedicated code to switch LBR MSRs for various reason, e.g., overhead, the feature
> owns dozens of MSRs, remove xfeature bit will offload the burden of common FPU/xsave framework.

This is true, but the question that begs to be asked, is what is the true purpose of the 'independent features' is
from the POV of the kernel FPU framework. IMHO these are features that the framework is not aware of, except
that it enables it in IA32_XSS (and in XCR0 in the future).

For the guest only features, like CET_S, it is also kind of the same thing (xsave but to guest state area only).
I don't insist that we add CET_S to independent features, but I just gave an idea that maybe that is better
from complexity point of view to add CET there. It's up to you to decide.

Sean what do you think?

Best regards,
	Maxim Levitsky


> 
> But CET only has 3 supervisor MSRs and they need to be managed together with user mode MSRs.
> Enabling it in common FPU framework would make the switch/swap much easier without additional
> support code.
> 
> >   and maybe rename the
> > 'XFEATURE_MASK_INDEPENDENT' to something like 'XFEATURES_THE_KERNEL_DOESNT_CARE_ABOUT'
> > (terrible name, but you might think of a better name)
> > 
> > 
> > > +#define XFEATURE_MASK_KERNEL_DYNAMIC	XFEATURE_MASK_CET_KERNEL
> > > +
> > >   /* All currently supported supervisor features */
> > >   #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
> > >   					    XFEATURE_MASK_CET_USER | \
> > > diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> > > index b57d909facca..ba4172172afd 100644
> > > --- a/arch/x86/kernel/fpu/xstate.c
> > > +++ b/arch/x86/kernel/fpu/xstate.c
> > > @@ -824,6 +824,7 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
> > >   	/* Clean out dynamic features from default */
> > >   	fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
> > >   	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
> > > +	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_KERNEL_DYNAMIC;
> > >   
> > >   	fpu_user_cfg.default_features = fpu_user_cfg.max_features;
> > >   	fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
> > 
> > Best regards,
> > 	Maxim Levitsky
> > 
> > 
> > 
> >
Yang, Weijiang Dec. 6, 2023, 3 a.m. UTC | #6
On 12/5/2023 5:55 PM, Maxim Levitsky wrote:
> On Fri, 2023-12-01 at 15:49 +0800, Yang, Weijiang wrote:
>> On 12/1/2023 1:33 AM, Maxim Levitsky wrote:
>>> On Fri, 2023-11-24 at 00:53 -0500, Yang Weijiang wrote:
>>>> Define new XFEATURE_MASK_KERNEL_DYNAMIC set including the features can be
>>> I am not sure though that this name is correct, but I don't know if I can
>>> suggest a better name.
>> It's a symmetry of XFEATURE_MASK_USER_DYNAMIC ;-)
>>>> optionally enabled by kernel components, i.e., the features are required by
>>>> specific kernel components. Currently it's used by KVM to configure guest
>>>> dedicated fpstate for calculating the xfeature and fpstate storage size etc.
>>>>
>>>> The kernel dynamic xfeatures now only contain XFEATURE_CET_KERNEL, which is
>>>> supported by host as they're enabled in xsaves/xrstors operating xfeature set
>>>> (XCR0 | XSS), but the relevant CPU feature, i.e., supervisor shadow stack, is
>>>> not enabled in host kernel so it can be omitted for normal fpstate by default.
>>>>
>>>> Remove the kernel dynamic feature from fpu_kernel_cfg.default_features so that
>>>> the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors can be
>>>> optimized by HW for normal fpstate.
>>>>
>>>> Suggested-by: Dave Hansen <dave.hansen@intel.com>
>>>> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
>>>> ---
>>>>    arch/x86/include/asm/fpu/xstate.h | 5 ++++-
>>>>    arch/x86/kernel/fpu/xstate.c      | 1 +
>>>>    2 files changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
>>>> index 3b4a038d3c57..a212d3851429 100644
>>>> --- a/arch/x86/include/asm/fpu/xstate.h
>>>> +++ b/arch/x86/include/asm/fpu/xstate.h
>>>> @@ -46,9 +46,12 @@
>>>>    #define XFEATURE_MASK_USER_RESTORE	\
>>>>    	(XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
>>>>    
>>>> -/* Features which are dynamically enabled for a process on request */
>>>> +/* Features which are dynamically enabled per userspace request */
>>>>    #define XFEATURE_MASK_USER_DYNAMIC	XFEATURE_MASK_XTILE_DATA
>>>>    
>>>> +/* Features which are dynamically enabled per kernel side request */
>>> I suggest to explain this a bit better. How about something like that:
>>>
>>> "Kernel features that are not enabled by default for all processes, but can
>>> be still used by some processes, for example to support guest virtualization"
>> It looks good to me, will apply it in next version, thanks!
>>
>>> But feel free to keep it as is or propose something else. IMHO this will
>>> be confusing this way or another.
>>>
>>>
>>> Another question: kernel already has a notion of 'independent features'
>>> which are currently kernel features that are enabled in IA32_XSS but not present in 'fpu_kernel_cfg.max_features'
>>>
>>> Currently only 'XFEATURE_LBR' is in this set. These features are saved/restored manually
>>> from independent buffer (in case of LBRs, perf code cares for this).
>>>
>>> Does it make sense to add CET_S to there as well instead of having XFEATURE_MASK_KERNEL_DYNAMIC,
>> CET_S here refers to PL{0,1,2}_SSP, right?
>>
>> IMHO, perf relies on dedicated code to switch LBR MSRs for various reason, e.g., overhead, the feature
>> owns dozens of MSRs, remove xfeature bit will offload the burden of common FPU/xsave framework.
> This is true, but the question that begs to be asked, is what is the true purpose of the 'independent features' is
> from the POV of the kernel FPU framework. IMHO these are features that the framework is not aware of, except
> that it enables it in IA32_XSS (and in XCR0 in the future).

This is the origin intention for introducing independent features(firstly called dynamic feature, renamed later), from the
changelog the major concern is overhead:

commit f0dccc9da4c0fda049e99326f85db8c242fd781f
Author: Kan Liang <kan.liang@linux.intel.com>
Date:   Fri Jul 3 05:49:26 2020 -0700

     x86/fpu/xstate: Support dynamic supervisor feature for LBR

"However, the kernel should not save/restore the LBR state component at
each context switch, like other state components, because of the
following unique features of LBR:
- The LBR state component only contains valuable information when LBR
   is enabled in the perf subsystem, but for most of the time, LBR is
   disabled.
- The size of the LBR state component is huge. For the current
   platform, it's 808 bytes.
If the kernel saves/restores the LBR state at each context switch, for
most of the time, it is just a waste of space and cycles."

>
> For the guest only features, like CET_S, it is also kind of the same thing (xsave but to guest state area only).
> I don't insist that we add CET_S to independent features, but I just gave an idea that maybe that is better
> from complexity point of view to add CET there. It's up to you to decide.
>
> Sean what do you think?
>
> Best regards,
> 	Maxim Levitsky
>
>
>> But CET only has 3 supervisor MSRs and they need to be managed together with user mode MSRs.
>> Enabling it in common FPU framework would make the switch/swap much easier without additional
>> support code.
>>
>>>    and maybe rename the
>>> 'XFEATURE_MASK_INDEPENDENT' to something like 'XFEATURES_THE_KERNEL_DOESNT_CARE_ABOUT'
>>> (terrible name, but you might think of a better name)
>>>
>>>
>>>> +#define XFEATURE_MASK_KERNEL_DYNAMIC	XFEATURE_MASK_CET_KERNEL
>>>> +
>>>>    /* All currently supported supervisor features */
>>>>    #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
>>>>    					    XFEATURE_MASK_CET_USER | \
>>>> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
>>>> index b57d909facca..ba4172172afd 100644
>>>> --- a/arch/x86/kernel/fpu/xstate.c
>>>> +++ b/arch/x86/kernel/fpu/xstate.c
>>>> @@ -824,6 +824,7 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
>>>>    	/* Clean out dynamic features from default */
>>>>    	fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
>>>>    	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
>>>> +	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_KERNEL_DYNAMIC;
>>>>    
>>>>    	fpu_user_cfg.default_features = fpu_user_cfg.max_features;
>>>>    	fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
>>> Best regards,
>>> 	Maxim Levitsky
>>>
>>>
>>>
>>>
>
>
>
Maxim Levitsky Dec. 6, 2023, 4:11 p.m. UTC | #7
On Wed, 2023-12-06 at 11:00 +0800, Yang, Weijiang wrote:
> On 12/5/2023 5:55 PM, Maxim Levitsky wrote:
> > On Fri, 2023-12-01 at 15:49 +0800, Yang, Weijiang wrote:
> > > On 12/1/2023 1:33 AM, Maxim Levitsky wrote:
> > > > On Fri, 2023-11-24 at 00:53 -0500, Yang Weijiang wrote:
> > > > > Define new XFEATURE_MASK_KERNEL_DYNAMIC set including the features can be
> > > > I am not sure though that this name is correct, but I don't know if I can
> > > > suggest a better name.
> > > It's a symmetry of XFEATURE_MASK_USER_DYNAMIC ;-)
> > > > > optionally enabled by kernel components, i.e., the features are required by
> > > > > specific kernel components. Currently it's used by KVM to configure guest
> > > > > dedicated fpstate for calculating the xfeature and fpstate storage size etc.
> > > > > 
> > > > > The kernel dynamic xfeatures now only contain XFEATURE_CET_KERNEL, which is
> > > > > supported by host as they're enabled in xsaves/xrstors operating xfeature set
> > > > > (XCR0 | XSS), but the relevant CPU feature, i.e., supervisor shadow stack, is
> > > > > not enabled in host kernel so it can be omitted for normal fpstate by default.
> > > > > 
> > > > > Remove the kernel dynamic feature from fpu_kernel_cfg.default_features so that
> > > > > the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors can be
> > > > > optimized by HW for normal fpstate.
> > > > > 
> > > > > Suggested-by: Dave Hansen <dave.hansen@intel.com>
> > > > > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> > > > > ---
> > > > >    arch/x86/include/asm/fpu/xstate.h | 5 ++++-
> > > > >    arch/x86/kernel/fpu/xstate.c      | 1 +
> > > > >    2 files changed, 5 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
> > > > > index 3b4a038d3c57..a212d3851429 100644
> > > > > --- a/arch/x86/include/asm/fpu/xstate.h
> > > > > +++ b/arch/x86/include/asm/fpu/xstate.h
> > > > > @@ -46,9 +46,12 @@
> > > > >    #define XFEATURE_MASK_USER_RESTORE	\
> > > > >    	(XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
> > > > >    
> > > > > -/* Features which are dynamically enabled for a process on request */
> > > > > +/* Features which are dynamically enabled per userspace request */
> > > > >    #define XFEATURE_MASK_USER_DYNAMIC	XFEATURE_MASK_XTILE_DATA
> > > > >    
> > > > > +/* Features which are dynamically enabled per kernel side request */
> > > > I suggest to explain this a bit better. How about something like that:
> > > > 
> > > > "Kernel features that are not enabled by default for all processes, but can
> > > > be still used by some processes, for example to support guest virtualization"
> > > It looks good to me, will apply it in next version, thanks!
> > > 
> > > > But feel free to keep it as is or propose something else. IMHO this will
> > > > be confusing this way or another.
> > > > 
> > > > 
> > > > Another question: kernel already has a notion of 'independent features'
> > > > which are currently kernel features that are enabled in IA32_XSS but not present in 'fpu_kernel_cfg.max_features'
> > > > 
> > > > Currently only 'XFEATURE_LBR' is in this set. These features are saved/restored manually
> > > > from independent buffer (in case of LBRs, perf code cares for this).
> > > > 
> > > > Does it make sense to add CET_S to there as well instead of having XFEATURE_MASK_KERNEL_DYNAMIC,
> > > CET_S here refers to PL{0,1,2}_SSP, right?
> > > 
> > > IMHO, perf relies on dedicated code to switch LBR MSRs for various reason, e.g., overhead, the feature
> > > owns dozens of MSRs, remove xfeature bit will offload the burden of common FPU/xsave framework.
> > This is true, but the question that begs to be asked, is what is the true purpose of the 'independent features' is
> > from the POV of the kernel FPU framework. IMHO these are features that the framework is not aware of, except
> > that it enables it in IA32_XSS (and in XCR0 in the future).
> 
> This is the origin intention for introducing independent features(firstly called dynamic feature, renamed later), from the
> changelog the major concern is overhead:

Yes, and to some extent the reason why we want to have CET supervisor state not saved on normal thread's FPU state is also overhead,
because in theory if the kernel did save it, the MSRs will be in INIT state and thus XSAVES shouldn't have any functional impact,
even if it saves/restores them for nothing.

In other words, as I said, independent features = features that FPU state doesn't manage, and are just optionally enabled,
so that a custom code can do a custom xsave(s)/xrstor(s), likely from/to a custom area to save/load these features.

It might make sense to rename independent features again to something like 'unmanaged features' or 'manual features' or something
like that.


Another interesting question that arises here, is once KVM supports arch LBRs, it will likely need to expose the XFEATURE_LBR
to the guest and will need to context switch it similar to CET_S state, which strengthens the argument that CET_S should
be in the same group as the 'independent features'.

Depending on the performance impact, XFEATURE_LBR might even need to be dynamically allocated.


For the reference this is the patch series that introduced the arch LBRs to KVM:
https://www.spinics.net/lists/kvm/msg296507.html


Best regards,
	Maxim Levitsky

> 
> commit f0dccc9da4c0fda049e99326f85db8c242fd781f
> Author: Kan Liang <kan.liang@linux.intel.com>
> Date:   Fri Jul 3 05:49:26 2020 -0700
> 
>      x86/fpu/xstate: Support dynamic supervisor feature for LBR
> 
> "However, the kernel should not save/restore the LBR state component at
> each context switch, like other state components, because of the
> following unique features of LBR:
> - The LBR state component only contains valuable information when LBR
>    is enabled in the perf subsystem, but for most of the time, LBR is
>    disabled.
> - The size of the LBR state component is huge. For the current
>    platform, it's 808 bytes.
> If the kernel saves/restores the LBR state at each context switch, for
> most of the time, it is just a waste of space and cycles."
> 
> > For the guest only features, like CET_S, it is also kind of the same thing (xsave but to guest state area only).
> > I don't insist that we add CET_S to independent features, but I just gave an idea that maybe that is better
> > from complexity point of view to add CET there. It's up to you to decide.
> > 
> > Sean what do you think?
> > 
> > Best regards,
> > 	Maxim Levitsky
> > 
> > 
> > > But CET only has 3 supervisor MSRs and they need to be managed together with user mode MSRs.
> > > Enabling it in common FPU framework would make the switch/swap much easier without additional
> > > support code.
> > > 
> > > >    and maybe rename the
> > > > 'XFEATURE_MASK_INDEPENDENT' to something like 'XFEATURES_THE_KERNEL_DOESNT_CARE_ABOUT'
> > > > (terrible name, but you might think of a better name)
> > > > 
> > > > 
> > > > > +#define XFEATURE_MASK_KERNEL_DYNAMIC	XFEATURE_MASK_CET_KERNEL
> > > > > +
> > > > >    /* All currently supported supervisor features */
> > > > >    #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
> > > > >    					    XFEATURE_MASK_CET_USER | \
> > > > > diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> > > > > index b57d909facca..ba4172172afd 100644
> > > > > --- a/arch/x86/kernel/fpu/xstate.c
> > > > > +++ b/arch/x86/kernel/fpu/xstate.c
> > > > > @@ -824,6 +824,7 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
> > > > >    	/* Clean out dynamic features from default */
> > > > >    	fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
> > > > >    	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
> > > > > +	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_KERNEL_DYNAMIC;
> > > > >    
> > > > >    	fpu_user_cfg.default_features = fpu_user_cfg.max_features;
> > > > >    	fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
> > > > Best regards,
> > > > 	Maxim Levitsky
> > > > 
> > > > 
> > > > 
> > > > 
> > 
> >
Yang, Weijiang Dec. 8, 2023, 3:57 p.m. UTC | #8
On 12/7/2023 12:11 AM, Maxim Levitsky wrote:
> On Wed, 2023-12-06 at 11:00 +0800, Yang, Weijiang wrote:
>> On 12/5/2023 5:55 PM, Maxim Levitsky wrote:
>>> On Fri, 2023-12-01 at 15:49 +0800, Yang, Weijiang wrote:
>>>> On 12/1/2023 1:33 AM, Maxim Levitsky wrote:
>>>>> On Fri, 2023-11-24 at 00:53 -0500, Yang Weijiang wrote:
>>>>>> Define new XFEATURE_MASK_KERNEL_DYNAMIC set including the features can be
>>>>> I am not sure though that this name is correct, but I don't know if I can
>>>>> suggest a better name.
>>>> It's a symmetry of XFEATURE_MASK_USER_DYNAMIC ;-)
>>>>>> optionally enabled by kernel components, i.e., the features are required by
>>>>>> specific kernel components. Currently it's used by KVM to configure guest
>>>>>> dedicated fpstate for calculating the xfeature and fpstate storage size etc.
>>>>>>
>>>>>> The kernel dynamic xfeatures now only contain XFEATURE_CET_KERNEL, which is
>>>>>> supported by host as they're enabled in xsaves/xrstors operating xfeature set
>>>>>> (XCR0 | XSS), but the relevant CPU feature, i.e., supervisor shadow stack, is
>>>>>> not enabled in host kernel so it can be omitted for normal fpstate by default.
>>>>>>
>>>>>> Remove the kernel dynamic feature from fpu_kernel_cfg.default_features so that
>>>>>> the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors can be
>>>>>> optimized by HW for normal fpstate.
>>>>>>
>>>>>> Suggested-by: Dave Hansen <dave.hansen@intel.com>
>>>>>> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
>>>>>> ---
>>>>>>     arch/x86/include/asm/fpu/xstate.h | 5 ++++-
>>>>>>     arch/x86/kernel/fpu/xstate.c      | 1 +
>>>>>>     2 files changed, 5 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
>>>>>> index 3b4a038d3c57..a212d3851429 100644
>>>>>> --- a/arch/x86/include/asm/fpu/xstate.h
>>>>>> +++ b/arch/x86/include/asm/fpu/xstate.h
>>>>>> @@ -46,9 +46,12 @@
>>>>>>     #define XFEATURE_MASK_USER_RESTORE	\
>>>>>>     	(XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
>>>>>>     
>>>>>> -/* Features which are dynamically enabled for a process on request */
>>>>>> +/* Features which are dynamically enabled per userspace request */
>>>>>>     #define XFEATURE_MASK_USER_DYNAMIC	XFEATURE_MASK_XTILE_DATA
>>>>>>     
>>>>>> +/* Features which are dynamically enabled per kernel side request */
>>>>> I suggest to explain this a bit better. How about something like that:
>>>>>
>>>>> "Kernel features that are not enabled by default for all processes, but can
>>>>> be still used by some processes, for example to support guest virtualization"
>>>> It looks good to me, will apply it in next version, thanks!
>>>>
>>>>> But feel free to keep it as is or propose something else. IMHO this will
>>>>> be confusing this way or another.
>>>>>
>>>>>
>>>>> Another question: kernel already has a notion of 'independent features'
>>>>> which are currently kernel features that are enabled in IA32_XSS but not present in 'fpu_kernel_cfg.max_features'
>>>>>
>>>>> Currently only 'XFEATURE_LBR' is in this set. These features are saved/restored manually
>>>>> from independent buffer (in case of LBRs, perf code cares for this).
>>>>>
>>>>> Does it make sense to add CET_S to there as well instead of having XFEATURE_MASK_KERNEL_DYNAMIC,
>>>> CET_S here refers to PL{0,1,2}_SSP, right?
>>>>
>>>> IMHO, perf relies on dedicated code to switch LBR MSRs for various reason, e.g., overhead, the feature
>>>> owns dozens of MSRs, remove xfeature bit will offload the burden of common FPU/xsave framework.
>>> This is true, but the question that begs to be asked, is what is the true purpose of the 'independent features' is
>>> from the POV of the kernel FPU framework. IMHO these are features that the framework is not aware of, except
>>> that it enables it in IA32_XSS (and in XCR0 in the future).
>> This is the origin intention for introducing independent features(firstly called dynamic feature, renamed later), from the
>> changelog the major concern is overhead:
> Yes, and to some extent the reason why we want to have CET supervisor state not saved on normal thread's FPU state is also overhead,
> because in theory if the kernel did save it, the MSRs will be in INIT state and thus XSAVES shouldn't have any functional impact,
> even if it saves/restores them for nothing.

CET supervisor state in normal thread's FPU state won't always be in INIT state. Per SDM, it's INIT state is defined only if 3 MSRs are 0,
but if guest is using supervisor CET,  then with vCPU migration between pCPUs, more and more MSRs would hold non-zero contents.
This doesn't impact host kernel behavior because host CET_S is still disabled, but it does impact host XSAVES/XRSTORS behavior.

> In other words, as I said, independent features = features that FPU state doesn't manage, and are just optionally enabled,
> so that a custom code can do a custom xsave(s)/xrstor(s), likely from/to a custom area to save/load these features.
>
> It might make sense to rename independent features again to something like 'unmanaged features' or 'manual features' or something
> like that.
>
>
> Another interesting question that arises here, is once KVM supports arch LBRs, it will likely need to expose the XFEATURE_LBR
> to the guest and will need to context switch it similar to CET_S state, which strengthens the argument that CET_S should
> be in the same group as the 'independent features'.
>
> Depending on the performance impact, XFEATURE_LBR might even need to be dynamically allocated.

This is most likely true for fpu_guest_cfg instead of fpu_kernel_cfg, let me think it over, thanks for bring up this brilliant idea :-)

> For the reference this is the patch series that introduced the arch LBRs to KVM:
> https://www.spinics.net/lists/kvm/msg296507.html
>
>
> Best regards,
> 	Maxim Levitsky
>
>> commit f0dccc9da4c0fda049e99326f85db8c242fd781f
>> Author: Kan Liang <kan.liang@linux.intel.com>
>> Date:   Fri Jul 3 05:49:26 2020 -0700
>>
>>       x86/fpu/xstate: Support dynamic supervisor feature for LBR
>>
>> "However, the kernel should not save/restore the LBR state component at
>> each context switch, like other state components, because of the
>> following unique features of LBR:
>> - The LBR state component only contains valuable information when LBR
>>     is enabled in the perf subsystem, but for most of the time, LBR is
>>     disabled.
>> - The size of the LBR state component is huge. For the current
>>     platform, it's 808 bytes.
>> If the kernel saves/restores the LBR state at each context switch, for
>> most of the time, it is just a waste of space and cycles."
>>
>>> For the guest only features, like CET_S, it is also kind of the same thing (xsave but to guest state area only).
>>> I don't insist that we add CET_S to independent features, but I just gave an idea that maybe that is better
>>> from complexity point of view to add CET there. It's up to you to decide.
>>>
>>> Sean what do you think?
>>>
>>> Best regards,
>>> 	Maxim Levitsky
>>>
>>>
>>>> But CET only has 3 supervisor MSRs and they need to be managed together with user mode MSRs.
>>>> Enabling it in common FPU framework would make the switch/swap much easier without additional
>>>> support code.
>>>>
>>>>>     and maybe rename the
>>>>> 'XFEATURE_MASK_INDEPENDENT' to something like 'XFEATURES_THE_KERNEL_DOESNT_CARE_ABOUT'
>>>>> (terrible name, but you might think of a better name)
>>>>>
>>>>>
>>>>>> +#define XFEATURE_MASK_KERNEL_DYNAMIC	XFEATURE_MASK_CET_KERNEL
>>>>>> +
>>>>>>     /* All currently supported supervisor features */
>>>>>>     #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
>>>>>>     					    XFEATURE_MASK_CET_USER | \
>>>>>> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
>>>>>> index b57d909facca..ba4172172afd 100644
>>>>>> --- a/arch/x86/kernel/fpu/xstate.c
>>>>>> +++ b/arch/x86/kernel/fpu/xstate.c
>>>>>> @@ -824,6 +824,7 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
>>>>>>     	/* Clean out dynamic features from default */
>>>>>>     	fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
>>>>>>     	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
>>>>>> +	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_KERNEL_DYNAMIC;
>>>>>>     
>>>>>>     	fpu_user_cfg.default_features = fpu_user_cfg.max_features;
>>>>>>     	fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
>>>>> Best regards,
>>>>> 	Maxim Levitsky
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>
diff mbox series

Patch

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 3b4a038d3c57..a212d3851429 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -46,9 +46,12 @@ 
 #define XFEATURE_MASK_USER_RESTORE	\
 	(XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
 
-/* Features which are dynamically enabled for a process on request */
+/* Features which are dynamically enabled per userspace request */
 #define XFEATURE_MASK_USER_DYNAMIC	XFEATURE_MASK_XTILE_DATA
 
+/* Features which are dynamically enabled per kernel side request */
+#define XFEATURE_MASK_KERNEL_DYNAMIC	XFEATURE_MASK_CET_KERNEL
+
 /* All currently supported supervisor features */
 #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
 					    XFEATURE_MASK_CET_USER | \
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index b57d909facca..ba4172172afd 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -824,6 +824,7 @@  void __init fpu__init_system_xstate(unsigned int legacy_size)
 	/* Clean out dynamic features from default */
 	fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
 	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
+	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_KERNEL_DYNAMIC;
 
 	fpu_user_cfg.default_features = fpu_user_cfg.max_features;
 	fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;