Message ID | 20230914063325.85503-15-weijiang.yang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | Enable CET Virtualization |
On Thu, 2023-09-14 at 02:33 -0400, Yang Weijiang wrote:
> From: Sean Christopherson <seanjc@google.com>
>
> Load the guest's FPU state if userspace is accessing MSRs whose values
> are managed by XSAVES. Introduce two helpers, kvm_{get,set}_xstate_msr(),
> to facilitate access to such kind of MSRs.
>
> If MSRs supported in kvm_caps.supported_xss are passed through to guest,
> the guest MSRs are swapped with host's before vCPU exits to userspace and
> after it re-enters kernel before next VM-entry.
>
> Because the modified code is also used for the KVM_GET_MSRS device ioctl(),
> explicitly check @vcpu is non-null before attempting to load guest state.
> The XSS supporting MSRs cannot be retrieved via the device ioctl() without
> loading guest FPU state (which doesn't exist).
>
> Note that guest_cpuid_has() is not queried as host userspace is allowed to
> access MSRs that have not been exposed to the guest, e.g. it might do
> KVM_SET_MSRS prior to KVM_SET_CPUID2.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Co-developed-by: Yang Weijiang <weijiang.yang@intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/kvm/x86.c | 30 +++++++++++++++++++++++++++++-
>  arch/x86/kvm/x86.h | 24 ++++++++++++++++++++++++
>  2 files changed, 53 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 66edbed25db8..a091764bf1d2 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -133,6 +133,9 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
>  static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
>
>  static DEFINE_MUTEX(vendor_module_lock);
> +static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
> +static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
> +
>  struct kvm_x86_ops kvm_x86_ops __read_mostly;
>
>  #define KVM_X86_OP(func)					     \
> @@ -4372,6 +4375,22 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  }
>  EXPORT_SYMBOL_GPL(kvm_get_msr_common);
>
> +static const u32 xstate_msrs[] = {
> +	MSR_IA32_U_CET, MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP,
> +	MSR_IA32_PL2_SSP, MSR_IA32_PL3_SSP,
> +};
> +
> +static bool is_xstate_msr(u32 index)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(xstate_msrs); i++) {
> +		if (index == xstate_msrs[i])
> +			return true;
> +	}
> +	return false;
> +}

The name 'xstate_msr' IMHO is not clear.

How about naming it 'guest_fpu_state_msrs', together with adding a comment like this:

"These MSRs are context switched together with the rest of the guest FPU state,
on exit/entry to/from userspace.

There is also an assumption that loading guest values while the host kernel runs
doesn't cause harm to the host kernel."

But if you prefer something else, it's fine with me; I would appreciate having
some comment attached to 'xstate_msr' at least.

> +
>  /*
>   * Read or write a bunch of msrs. All parameters are kernel addresses.
>   *
> @@ -4382,11 +4401,20 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
>  		    int (*do_msr)(struct kvm_vcpu *vcpu,
>  				  unsigned index, u64 *data))
>  {
> +	bool fpu_loaded = false;
>  	int i;
>
> -	for (i = 0; i < msrs->nmsrs; ++i)
> +	for (i = 0; i < msrs->nmsrs; ++i) {
> +		if (vcpu && !fpu_loaded && kvm_caps.supported_xss &&
> +		    is_xstate_msr(entries[i].index)) {

A comment here about why this is done would also be appreciated:

"Userspace requested us to read an MSR whose value resides in the guest FPU state.
Load this state temporarily to the CPU to read/update it."

> +			kvm_load_guest_fpu(vcpu);
> +			fpu_loaded = true;
> +		}
>  		if (do_msr(vcpu, entries[i].index, &entries[i].data))
>  			break;
> +	}

And maybe here too:

"If KVM loaded the guest FPU state, unload it to restore the original userspace FPU state
and to update the guest FPU state in case it was modified."

> +	if (fpu_loaded)
> +		kvm_put_guest_fpu(vcpu);
>
>  	return i;
>  }
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 1e7be1f6ab29..9a8e3a84eaf4 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -540,4 +540,28 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
>  			 unsigned int port, void *data, unsigned int count,
>  			 int in);
>
> +/*
> + * Lock and/or reload guest FPU and access xstate MSRs. For accesses initiated
> + * by host, guest FPU is loaded in __msr_io(). For accesses initiated by guest,
> + * guest FPU should have been loaded already.
> + */
> +
> +static inline void kvm_get_xstate_msr(struct kvm_vcpu *vcpu,
> +				      struct msr_data *msr_info)
> +{
> +	KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm);
> +	kvm_fpu_get();
> +	rdmsrl(msr_info->index, msr_info->data);
> +	kvm_fpu_put();
> +}
> +
> +static inline void kvm_set_xstate_msr(struct kvm_vcpu *vcpu,
> +				      struct msr_data *msr_info)
> +{
> +	KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm);
> +	kvm_fpu_get();
> +	wrmsrl(msr_info->index, msr_info->data);
> +	kvm_fpu_put();
> +}

These functions are not used in the patch. I think they should be added later, when they are used.

Best regards,
	Maxim Levitsky

> +
>  #endif
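The load-once/put-once flow that the review comments above ask to have documented can be tried outside the kernel with a small mock. The following is only a userspace sketch under stated assumptions, not the KVM code: every `mock_*` name and both counters are hypothetical stand-ins, the MSR indices are the values the Intel SDM lists for IA32_U_CET and IA32_PL0_SSP through IA32_PL3_SSP, and the `vcpu` and `kvm_caps.supported_xss` checks from the patch are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kvm_load_guest_fpu()/kvm_put_guest_fpu(). */
static int fpu_loads, fpu_puts;
static void mock_load_guest_fpu(void) { fpu_loads++; }
static void mock_put_guest_fpu(void)  { fpu_puts++; }

/* Simplified stand-in for struct kvm_msr_entry. */
struct mock_msr_entry {
	uint32_t index;
	uint64_t data;
};

/* Assumed indices: IA32_U_CET = 0x6a0, IA32_PL0_SSP..IA32_PL3_SSP = 0x6a4..0x6a7. */
static bool mock_is_xstate_msr(uint32_t index)
{
	return index == 0x6a0 || (index >= 0x6a4 && index <= 0x6a7);
}

/* Accessor stub: fails for index 0, otherwise stores the index as the value. */
static int mock_do_msr(uint32_t index, uint64_t *data)
{
	if (index == 0)
		return 1;
	*data = index;
	return 0;
}

/* Returns the number of entries processed before the first failure. */
static size_t mock_msr_io(struct mock_msr_entry *entries, size_t n)
{
	bool fpu_loaded = false;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Load guest state lazily, on the first XSTATE-managed MSR only. */
		if (!fpu_loaded && mock_is_xstate_msr(entries[i].index)) {
			mock_load_guest_fpu();
			fpu_loaded = true;
		}
		if (mock_do_msr(entries[i].index, &entries[i].data))
			break;
	}
	/* Put the state back exactly once, even if the loop stopped early. */
	if (fpu_loaded)
		mock_put_guest_fpu();

	return i;
}
```

Calling `mock_msr_io()` on a batch that mixes ordinary and XSTATE-managed indices should leave both counters at one, however many managed MSRs the batch contains, and a batch with no managed MSRs should leave them at zero.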
On Tue, Oct 31, 2023, Maxim Levitsky wrote:
> On Thu, 2023-09-14 at 02:33 -0400, Yang Weijiang wrote:
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 66edbed25db8..a091764bf1d2 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -133,6 +133,9 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
> >  static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
> >
> >  static DEFINE_MUTEX(vendor_module_lock);
> > +static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
> > +static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
> > +
> >  struct kvm_x86_ops kvm_x86_ops __read_mostly;
> >
> >  #define KVM_X86_OP(func)					     \
> > @@ -4372,6 +4375,22 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_get_msr_common);
> >
> > +static const u32 xstate_msrs[] = {
> > +	MSR_IA32_U_CET, MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP,
> > +	MSR_IA32_PL2_SSP, MSR_IA32_PL3_SSP,
> > +};
> > +
> > +static bool is_xstate_msr(u32 index)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(xstate_msrs); i++) {
> > +		if (index == xstate_msrs[i])
> > +			return true;
> > +	}
> > +	return false;
> > +}
>
> The name 'xstate_msr' IMHO is not clear.
>
> How about naming it 'guest_fpu_state_msrs', together with adding a comment like that:

Maybe xstate_managed_msrs?  I'd prefer not to include "guest" because the behavior
is more a property of the architecture and/or the host kernel.  I understand where
you're coming from, but it's the MSR *values* that are part of guest state, whereas
the check is a query on how KVM manages the MSR value, if that makes sense.

And I really don't like "FPU".  I get why the kernel uses the "FPU" terminology,
but for this check in particular I want to tie the behavior back to the architecture,
i.e. provide the hint that the reason why these MSRs are special is because Intel
defined them to be context switched via XSTATE.

Actually, this is unnecessary bikeshedding to some extent, using an array is silly.
It's easier and likely far more performant (not that that matters in this case)
to use a switch statement.

Is this better?

/*
 * Returns true if the MSR in question is managed via XSTATE, i.e. is context
 * switched with the rest of guest FPU state.
 */
static bool is_xstate_managed_msr(u32 index)
{
	switch (index) {
	case MSR_IA32_U_CET:
	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
		return true;
	default:
		return false;
	}
}

/*
 * Read or write a bunch of msrs. All parameters are kernel addresses.
 *
 * @return number of msrs set successfully.
 */
static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
		    struct kvm_msr_entry *entries,
		    int (*do_msr)(struct kvm_vcpu *vcpu,
				  unsigned index, u64 *data))
{
	bool fpu_loaded = false;
	int i;

	for (i = 0; i < msrs->nmsrs; ++i) {
		/*
		 * If userspace is accessing one or more XSTATE-managed MSRs,
		 * temporarily load the guest's FPU state so that the guest's
		 * MSR value(s) is resident in hardware, i.e. so that KVM can
		 * get/set the MSR via RDMSR/WRMSR.
		 */
		if (vcpu && !fpu_loaded && kvm_caps.supported_xss &&
		    is_xstate_managed_msr(entries[i].index)) {
			kvm_load_guest_fpu(vcpu);
			fpu_loaded = true;
		}
		if (do_msr(vcpu, entries[i].index, &entries[i].data))
			break;
	}
	if (fpu_loaded)
		kvm_put_guest_fpu(vcpu);

	return i;
}
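The switch-based helper proposed above is meant to be a drop-in replacement for the array scan in the original patch. The following userspace sketch checks that the two lookups agree. It rests on some assumptions: the MSR values are defined locally from the Intel SDM so that it builds outside the kernel, and `lookup_by_array()`, `lookup_by_switch()` and `lookups_agree()` are hypothetical names. The `case A ... B:` range is a GNU C extension (accepted by GCC and Clang), and it works here only because the four PLx_SSP indices are contiguous.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as listed in the Intel SDM; defined locally for this sketch. */
#define MSR_IA32_U_CET   0x000006a0
#define MSR_IA32_PL0_SSP 0x000006a4
#define MSR_IA32_PL1_SSP 0x000006a5
#define MSR_IA32_PL2_SSP 0x000006a6
#define MSR_IA32_PL3_SSP 0x000006a7

static const uint32_t xstate_msrs[] = {
	MSR_IA32_U_CET, MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP,
	MSR_IA32_PL2_SSP, MSR_IA32_PL3_SSP,
};

/* Array scan, as in the original patch. */
static bool lookup_by_array(uint32_t index)
{
	size_t i;

	for (i = 0; i < sizeof(xstate_msrs) / sizeof(xstate_msrs[0]); i++) {
		if (index == xstate_msrs[i])
			return true;
	}
	return false;
}

/* Switch with a GNU C case range, as in the proposed replacement. */
static bool lookup_by_switch(uint32_t index)
{
	switch (index) {
	case MSR_IA32_U_CET:
	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
		return true;
	default:
		return false;
	}
}

/* Returns true if both lookups give the same answer for every index in [first, last]. */
static bool lookups_agree(uint32_t first, uint32_t last)
{
	uint32_t i;

	for (i = first; ; i++) {
		if (lookup_by_array(i) != lookup_by_switch(i))
			return false;
		if (i == last)	/* checked here so last == UINT32_MAX cannot wrap */
			break;
	}
	return true;
}
```

A call such as `lookups_agree(0x600, 0x700)` sweeps the neighbourhood of the CET MSRs, including the gap at 0x6a1 through 0x6a3 that neither version should match.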
On Wed, 2023-11-01 at 11:05 -0700, Sean Christopherson wrote:
> On Tue, Oct 31, 2023, Maxim Levitsky wrote:
> > On Thu, 2023-09-14 at 02:33 -0400, Yang Weijiang wrote:
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 66edbed25db8..a091764bf1d2 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -133,6 +133,9 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
> > >  static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
> > >
> > >  static DEFINE_MUTEX(vendor_module_lock);
> > > +static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
> > > +static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
> > > +
> > >  struct kvm_x86_ops kvm_x86_ops __read_mostly;
> > >
> > >  #define KVM_X86_OP(func)					     \
> > > @@ -4372,6 +4375,22 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > >  }
> > >  EXPORT_SYMBOL_GPL(kvm_get_msr_common);
> > >
> > > +static const u32 xstate_msrs[] = {
> > > +	MSR_IA32_U_CET, MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP,
> > > +	MSR_IA32_PL2_SSP, MSR_IA32_PL3_SSP,
> > > +};
> > > +
> > > +static bool is_xstate_msr(u32 index)
> > > +{
> > > +	int i;
> > > +
> > > +	for (i = 0; i < ARRAY_SIZE(xstate_msrs); i++) {
> > > +		if (index == xstate_msrs[i])
> > > +			return true;
> > > +	}
> > > +	return false;
> > > +}
> >
> > The name 'xstate_msr' IMHO is not clear.
> >
> > How about naming it 'guest_fpu_state_msrs', together with adding a comment like that:
>
> Maybe xstate_managed_msrs?  I'd prefer not to include "guest" because the behavior
> is more a property of the architecture and/or the host kernel.  I understand where
> you're coming from, but it's the MSR *values* that are part of guest state, whereas
> the check is a query on how KVM manages the MSR value, if that makes sense.

Makes sense.

>
> And I really don't like "FPU".  I get why the kernel uses the "FPU" terminology,
> but for this check in particular I want to tie the behavior back to the architecture,
> i.e. provide the hint that the reason why these MSRs are special is because Intel
> defined them to be context switched via XSTATE.
>
> Actually, this is unnecessary bikeshedding to some extent, using an array is silly.
> It's easier and likely far more performant (not that that matters in this case)
> to use a switch statement.
>
> Is this better?
>
> /*
>  * Returns true if the MSR in question is managed via XSTATE, i.e. is context
>  * switched with the rest of guest FPU state.
>  */
> static bool is_xstate_managed_msr(u32 index)
> {
> 	switch (index) {
> 	case MSR_IA32_U_CET:
> 	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> 		return true;
> 	default:
> 		return false;
> 	}
> }

Reasonable.

>
> /*
>  * Read or write a bunch of msrs. All parameters are kernel addresses.
>  *
>  * @return number of msrs set successfully.
>  */
> static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
> 		    struct kvm_msr_entry *entries,
> 		    int (*do_msr)(struct kvm_vcpu *vcpu,
> 				  unsigned index, u64 *data))
> {
> 	bool fpu_loaded = false;
> 	int i;
>
> 	for (i = 0; i < msrs->nmsrs; ++i) {
> 		/*
> 		 * If userspace is accessing one or more XSTATE-managed MSRs,
> 		 * temporarily load the guest's FPU state so that the guest's
> 		 * MSR value(s) is resident in hardware, i.e. so that KVM can
> 		 * get/set the MSR via RDMSR/WRMSR.
> 		 */

Reasonable as well.

> 		if (vcpu && !fpu_loaded && kvm_caps.supported_xss &&
> 		    is_xstate_managed_msr(entries[i].index)) {
> 			kvm_load_guest_fpu(vcpu);
> 			fpu_loaded = true;
> 		}
> 		if (do_msr(vcpu, entries[i].index, &entries[i].data))
> 			break;
> 	}
> 	if (fpu_loaded)
> 		kvm_put_guest_fpu(vcpu);
>
> 	return i;
> }
>

Best regards,
	Maxim Levitsky
On 11/2/2023 2:05 AM, Sean Christopherson wrote:
> On Tue, Oct 31, 2023, Maxim Levitsky wrote:
>> On Thu, 2023-09-14 at 02:33 -0400, Yang Weijiang wrote:
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 66edbed25db8..a091764bf1d2 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -133,6 +133,9 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
>>>  static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
>>>
>>>  static DEFINE_MUTEX(vendor_module_lock);
>>> +static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
>>> +static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
>>> +
>>>  struct kvm_x86_ops kvm_x86_ops __read_mostly;
>>>
>>>  #define KVM_X86_OP(func)					     \
>>> @@ -4372,6 +4375,22 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>>  }
>>>  EXPORT_SYMBOL_GPL(kvm_get_msr_common);
>>>
>>> +static const u32 xstate_msrs[] = {
>>> +	MSR_IA32_U_CET, MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP,
>>> +	MSR_IA32_PL2_SSP, MSR_IA32_PL3_SSP,
>>> +};
>>> +
>>> +static bool is_xstate_msr(u32 index)
>>> +{
>>> +	int i;
>>> +
>>> +	for (i = 0; i < ARRAY_SIZE(xstate_msrs); i++) {
>>> +		if (index == xstate_msrs[i])
>>> +			return true;
>>> +	}
>>> +	return false;
>>> +}
>> The name 'xstate_msr' IMHO is not clear.
>>
>> How about naming it 'guest_fpu_state_msrs', together with adding a comment like that:
> Maybe xstate_managed_msrs?  I'd prefer not to include "guest" because the behavior
> is more a property of the architecture and/or the host kernel.  I understand where
> you're coming from, but it's the MSR *values* that are part of guest state, whereas
> the check is a query on how KVM manages the MSR value, if that makes sense.
>
> And I really don't like "FPU".  I get why the kernel uses the "FPU" terminology,
> but for this check in particular I want to tie the behavior back to the architecture,
> i.e. provide the hint that the reason why these MSRs are special is because Intel
> defined them to be context switched via XSTATE.
>
> Actually, this is unnecessary bikeshedding to some extent, using an array is silly.
> It's easier and likely far more performant (not that that matters in this case)
> to use a switch statement.
>
> Is this better?

The change looks good to me! Thanks!

> /*
>  * Returns true if the MSR in question is managed via XSTATE, i.e. is context
>  * switched with the rest of guest FPU state.
>  */
> static bool is_xstate_managed_msr(u32 index)

How about is_xfeature_msr()? xfeature is XSAVE-Supported-Feature, just to align
with the SDM convention.

> {
> 	switch (index) {
> 	case MSR_IA32_U_CET:
> 	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> 		return true;
> 	default:
> 		return false;
> 	}
> }
>
> /*
>  * Read or write a bunch of msrs. All parameters are kernel addresses.
>  *
>  * @return number of msrs set successfully.
>  */
> static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
> 		    struct kvm_msr_entry *entries,
> 		    int (*do_msr)(struct kvm_vcpu *vcpu,
> 				  unsigned index, u64 *data))
> {
> 	bool fpu_loaded = false;
> 	int i;
>
> 	for (i = 0; i < msrs->nmsrs; ++i) {
> 		/*
> 		 * If userspace is accessing one or more XSTATE-managed MSRs,
> 		 * temporarily load the guest's FPU state so that the guest's
> 		 * MSR value(s) is resident in hardware, i.e. so that KVM can
> 		 * get/set the MSR via RDMSR/WRMSR.
> 		 */
> 		if (vcpu && !fpu_loaded && kvm_caps.supported_xss &&
> 		    is_xstate_managed_msr(entries[i].index)) {
> 			kvm_load_guest_fpu(vcpu);
> 			fpu_loaded = true;
> 		}
> 		if (do_msr(vcpu, entries[i].index, &entries[i].data))
> 			break;
> 	}
> 	if (fpu_loaded)
> 		kvm_put_guest_fpu(vcpu);
>
> 	return i;
> }
On Fri, Nov 03, 2023, Weijiang Yang wrote:
> On 11/2/2023 2:05 AM, Sean Christopherson wrote:
> > /*
> >  * Returns true if the MSR in question is managed via XSTATE, i.e. is context
> >  * switched with the rest of guest FPU state.
> >  */
> > static bool is_xstate_managed_msr(u32 index)
>
> How about is_xfeature_msr()? xfeature is XSAVE-Supported-Feature, just to align with SDM
> convention.

My vote remains for is_xstate_managed_msr().  is_xfeature_msr() could also refer to
MSRs that control XSTATE features, e.g. XSS.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 66edbed25db8..a091764bf1d2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -133,6 +133,9 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
 static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
 
 static DEFINE_MUTEX(vendor_module_lock);
+static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
+static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
+
 struct kvm_x86_ops kvm_x86_ops __read_mostly;
 
 #define KVM_X86_OP(func)					     \
@@ -4372,6 +4375,22 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 }
 EXPORT_SYMBOL_GPL(kvm_get_msr_common);
 
+static const u32 xstate_msrs[] = {
+	MSR_IA32_U_CET, MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP,
+	MSR_IA32_PL2_SSP, MSR_IA32_PL3_SSP,
+};
+
+static bool is_xstate_msr(u32 index)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(xstate_msrs); i++) {
+		if (index == xstate_msrs[i])
+			return true;
+	}
+	return false;
+}
+
 /*
  * Read or write a bunch of msrs. All parameters are kernel addresses.
  *
@@ -4382,11 +4401,20 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
 		    int (*do_msr)(struct kvm_vcpu *vcpu,
 				  unsigned index, u64 *data))
 {
+	bool fpu_loaded = false;
 	int i;
 
-	for (i = 0; i < msrs->nmsrs; ++i)
+	for (i = 0; i < msrs->nmsrs; ++i) {
+		if (vcpu && !fpu_loaded && kvm_caps.supported_xss &&
+		    is_xstate_msr(entries[i].index)) {
+			kvm_load_guest_fpu(vcpu);
+			fpu_loaded = true;
+		}
 		if (do_msr(vcpu, entries[i].index, &entries[i].data))
 			break;
+	}
+	if (fpu_loaded)
+		kvm_put_guest_fpu(vcpu);
 
 	return i;
 }
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 1e7be1f6ab29..9a8e3a84eaf4 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -540,4 +540,28 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
 			 unsigned int port, void *data, unsigned int count,
 			 int in);
 
+/*
+ * Lock and/or reload guest FPU and access xstate MSRs. For accesses initiated
+ * by host, guest FPU is loaded in __msr_io(). For accesses initiated by guest,
+ * guest FPU should have been loaded already.
+ */
+
+static inline void kvm_get_xstate_msr(struct kvm_vcpu *vcpu,
+				      struct msr_data *msr_info)
+{
+	KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm);
+	kvm_fpu_get();
+	rdmsrl(msr_info->index, msr_info->data);
+	kvm_fpu_put();
+}
+
+static inline void kvm_set_xstate_msr(struct kvm_vcpu *vcpu,
+				      struct msr_data *msr_info)
+{
+	KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm);
+	kvm_fpu_get();
+	wrmsrl(msr_info->index, msr_info->data);
+	kvm_fpu_put();
+}
+
 #endif
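The two helpers at the end of the patch share one shape: check that the guest FPU state is resident, then bracket the hardware access with a get/put pair. A rough userspace model of that shape follows. Everything in it is a hypothetical stand-in (`struct mock_vcpu`, the `mock_*` functions, the slot-indexed array in place of real MSRs). It also deviates from the patch in one deliberate way: the patch flags the condition with KVM_BUG_ON() and still performs the access, whereas this sketch refuses the access so the precondition is easy to observe.

```c
#include <stdbool.h>
#include <stdint.h>

#define MOCK_NR_MSRS 4

struct mock_vcpu {
	bool guest_fpu_in_use;		/* models guest_fpu.fpstate->in_use */
	uint64_t hw_msr[MOCK_NR_MSRS];	/* models MSR values resident in hardware */
};

static int mock_fpu_depth;	/* current nesting of the get/put bracket */
static int mock_fpu_sections;	/* number of completed brackets */

/* Hypothetical stand-ins for kvm_fpu_get()/kvm_fpu_put(). */
static void mock_fpu_get(void) { mock_fpu_depth++; }
static void mock_fpu_put(void) { mock_fpu_depth--; mock_fpu_sections++; }

/* Reads a slot only if guest state is resident; returns false otherwise. */
static bool mock_get_xstate_msr(const struct mock_vcpu *vcpu,
				unsigned int slot, uint64_t *data)
{
	if (!vcpu->guest_fpu_in_use || slot >= MOCK_NR_MSRS)
		return false;

	mock_fpu_get();
	*data = vcpu->hw_msr[slot];
	mock_fpu_put();
	return true;
}

/* Writes a slot only if guest state is resident; returns false otherwise. */
static bool mock_set_xstate_msr(struct mock_vcpu *vcpu,
				unsigned int slot, uint64_t data)
{
	if (!vcpu->guest_fpu_in_use || slot >= MOCK_NR_MSRS)
		return false;

	mock_fpu_get();
	vcpu->hw_msr[slot] = data;
	mock_fpu_put();
	return true;
}
```

In this model a write attempted before `guest_fpu_in_use` is set is rejected, and after any successful access `mock_fpu_depth` is back at zero, which is the invariant the get/put bracket is there to maintain.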