Message ID | 20230505175705.18098-4-alejandro.vallejo@cloud.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Add CpuidUserDis support | expand |
On 05.05.2023 19:57, Alejandro Vallejo wrote: > This is in order to aid guests of AMD hardware that we have exposed > CPUID faulting to. If they try to modify the Intel MSR that enables > the feature, trigger levelling so AMD's version of it (CpuidUserDis) > is used instead. > > Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> > --- > xen/arch/x86/msr.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) Don't you also need to update cpu-policy.c:calculate_host_policy() for the guest to actually know it can use the functionality? Which in turn would appear to require some form of adjustment to lib/x86/policy.c:x86_cpu_policies_are_compatible(). > --- a/xen/arch/x86/msr.c > +++ b/xen/arch/x86/msr.c > @@ -431,6 +431,13 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val) > { > bool old_cpuid_faulting = msrs->misc_features_enables.cpuid_faulting; > > + /* > + * The boot CPU must support Intel's CPUID faulting _or_ > + * AMD's CpuidUserDis. > + */ > + bool can_fault_cpuid = cpu_has_cpuid_faulting || > + boot_cpu_has(X86_FEATURE_CPUID_USER_DIS); I'd like to suggest that in such a comment it not be emphasized that it's the boot CPU (alone) we check. In fact I'm not convinced any comment is needed here at all. I'm further inclined to suggest to omit this (single-use) variable altogether. Jan
On 08/05/2023 2:18 pm, Jan Beulich wrote: > On 05.05.2023 19:57, Alejandro Vallejo wrote: >> This is in order to aid guests of AMD hardware that we have exposed >> CPUID faulting to. If they try to modify the Intel MSR that enables >> the feature, trigger levelling so AMD's version of it (CpuidUserDis) >> is used instead. >> >> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> >> --- >> xen/arch/x86/msr.c | 9 ++++++++- >> 1 file changed, 8 insertions(+), 1 deletion(-) > Don't you also need to update cpu-policy.c:calculate_host_policy() > for the guest to actually know it can use the functionality? Which > in turn would appear to require some form of adjustment to > lib/x86/policy.c:x86_cpu_policies_are_compatible(). I asked Alejandro to do it like this. Advertising this to guests requires plumbing another MSR into the infrastructure which isn't quite set up properly let, and is in flux from my work. For now, this just lets Xen enforce the policy over PV guests, which is an improvement in and of itself. ~Andrew
On 09.05.2023 12:05, Andrew Cooper wrote: > On 08/05/2023 2:18 pm, Jan Beulich wrote: >> On 05.05.2023 19:57, Alejandro Vallejo wrote: >>> This is in order to aid guests of AMD hardware that we have exposed >>> CPUID faulting to. If they try to modify the Intel MSR that enables >>> the feature, trigger levelling so AMD's version of it (CpuidUserDis) >>> is used instead. >>> >>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> >>> --- >>> xen/arch/x86/msr.c | 9 ++++++++- >>> 1 file changed, 8 insertions(+), 1 deletion(-) >> Don't you also need to update cpu-policy.c:calculate_host_policy() >> for the guest to actually know it can use the functionality? Which >> in turn would appear to require some form of adjustment to >> lib/x86/policy.c:x86_cpu_policies_are_compatible(). > > I asked Alejandro to do it like this. > > Advertising this to guests requires plumbing another MSR into the > infrastructure which isn't quite set up properly let, and is in flux > from my work. > > For now, this just lets Xen enforce the policy over PV guests, which is > an improvement in and of itself. But as per the title this patch is about HVM guests (aiui the PV aspect is taken care of already without the patch here). In any event - if the omissions are intentional (for the time being), then I think that wants mentioning in the description. Jan
On Tue, May 09, 2023 at 04:41:49PM +0200, Jan Beulich wrote: > > I asked Alejandro to do it like this. > > > > Advertising this to guests requires plumbing another MSR into the > > infrastructure which isn't quite set up properly let, and is in flux > > from my work. > > > > For now, this just lets Xen enforce the policy over PV guests, which is > > an improvement in and of itself. > > But as per the title this patch is about HVM guests (aiui the PV aspect > is taken care of already without the patch here). In any event - if the > omissions are intentional (for the time being), then I think that wants > mentioning in the description. > > Jan HVM guests are always exposed the Intel interface (emulated if not natively available). The HVM max policy forces it on, and I don't see anything in the default policy overriding it. My attempt here was to let AMD guests use the emulated Intel MSR and trigger levellling that would itself rely on AMD's CpuidUserDis without guest intervention. That said, several cans of worms exist in mantaining this internal routing. I'll get rid of that last patch and leave HVM guests alone for the time being. They are functionally correct (albeit their CPUID take 2 faults, whereas 1 would suffice). Alejandro
On 09.05.2023 12:05, Andrew Cooper wrote: > On 08/05/2023 2:18 pm, Jan Beulich wrote: >> On 05.05.2023 19:57, Alejandro Vallejo wrote: >>> This is in order to aid guests of AMD hardware that we have exposed >>> CPUID faulting to. If they try to modify the Intel MSR that enables >>> the feature, trigger levelling so AMD's version of it (CpuidUserDis) >>> is used instead. >>> >>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> >>> --- >>> xen/arch/x86/msr.c | 9 ++++++++- >>> 1 file changed, 8 insertions(+), 1 deletion(-) >> Don't you also need to update cpu-policy.c:calculate_host_policy() >> for the guest to actually know it can use the functionality? Which >> in turn would appear to require some form of adjustment to >> lib/x86/policy.c:x86_cpu_policies_are_compatible(). > > I asked Alejandro to do it like this. > > Advertising this to guests requires plumbing another MSR into the > infrastructure which isn't quite set up properly let, and is in flux > from my work. Maybe there was some misunderstanding here, as I realize only now. I wasn't asking to expose the AMD feature, but instead I was after /* 0x000000ce MSR_INTEL_PLATFORM_INFO */ /* probe_cpuid_faulting() sanity checks presence of MISC_FEATURES_ENABLES */ p->platform_info.cpuid_faulting = cpu_has_cpuid_faulting; wanting to be extended by "|| boot_cpu_has(X86_FEATURE_CPUID_USER_DIS)". That, afaict, has no connection to plumbing yet another MSR. Jan
On Wed, May 10, 2023 at 10:15:31AM +0200, Jan Beulich wrote: > On 09.05.2023 12:05, Andrew Cooper wrote: > > On 08/05/2023 2:18 pm, Jan Beulich wrote: > >> On 05.05.2023 19:57, Alejandro Vallejo wrote: > >>> This is in order to aid guests of AMD hardware that we have exposed > >>> CPUID faulting to. If they try to modify the Intel MSR that enables > >>> the feature, trigger levelling so AMD's version of it (CpuidUserDis) > >>> is used instead. > >>> > >>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> > >>> --- > >>> xen/arch/x86/msr.c | 9 ++++++++- > >>> 1 file changed, 8 insertions(+), 1 deletion(-) > >> Don't you also need to update cpu-policy.c:calculate_host_policy() > >> for the guest to actually know it can use the functionality? Which > >> in turn would appear to require some form of adjustment to > >> lib/x86/policy.c:x86_cpu_policies_are_compatible(). > > > > I asked Alejandro to do it like this. > > > > Advertising this to guests requires plumbing another MSR into the > > infrastructure which isn't quite set up properly let, and is in flux > > from my work. > > Maybe there was some misunderstanding here, as I realize only now. I > wasn't asking to expose the AMD feature, but instead I was after > > /* 0x000000ce MSR_INTEL_PLATFORM_INFO */ > /* probe_cpuid_faulting() sanity checks presence of MISC_FEATURES_ENABLES */ > p->platform_info.cpuid_faulting = cpu_has_cpuid_faulting; > > wanting to be extended by "|| boot_cpu_has(X86_FEATURE_CPUID_USER_DIS)". > That, afaict, has no connection to plumbing yet another MSR. > > Jan Let me backtrack a little. There's 2 different (but related) matters under discussion: 1. Trapping PV guest attempts to run the cpuid instruction 2. Providing a virtualized CPUID faulting interface so a guest kernel can intercept the CPUID instructions of code running under it. This series was meant to provide fix (1) on AMD hardware. I did go a bit beyond in v1, trying to provide support for a unified faulting mechanism on AMD, but providing a virtualized CPUID faulting interface needs to be done a bit more carefully than that. Doing it partially now just adds tech debt to be paid when CpuidUserDis is exposed to the domains. Changing the policy to expose the Intel interface when AMD is the backend falls on (2), which is probably better off done separately in a unified way. v2 removes the changes to x86/msr.c so only (1) is addressed. Does this make sense? Alejandro
On 10.05.2023 12:52, Alejandro Vallejo wrote: > On Wed, May 10, 2023 at 10:15:31AM +0200, Jan Beulich wrote: >> On 09.05.2023 12:05, Andrew Cooper wrote: >>> On 08/05/2023 2:18 pm, Jan Beulich wrote: >>>> On 05.05.2023 19:57, Alejandro Vallejo wrote: >>>>> This is in order to aid guests of AMD hardware that we have exposed >>>>> CPUID faulting to. If they try to modify the Intel MSR that enables >>>>> the feature, trigger levelling so AMD's version of it (CpuidUserDis) >>>>> is used instead. >>>>> >>>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> >>>>> --- >>>>> xen/arch/x86/msr.c | 9 ++++++++- >>>>> 1 file changed, 8 insertions(+), 1 deletion(-) >>>> Don't you also need to update cpu-policy.c:calculate_host_policy() >>>> for the guest to actually know it can use the functionality? Which >>>> in turn would appear to require some form of adjustment to >>>> lib/x86/policy.c:x86_cpu_policies_are_compatible(). >>> >>> I asked Alejandro to do it like this. >>> >>> Advertising this to guests requires plumbing another MSR into the >>> infrastructure which isn't quite set up properly let, and is in flux >>> from my work. >> >> Maybe there was some misunderstanding here, as I realize only now. I >> wasn't asking to expose the AMD feature, but instead I was after >> >> /* 0x000000ce MSR_INTEL_PLATFORM_INFO */ >> /* probe_cpuid_faulting() sanity checks presence of MISC_FEATURES_ENABLES */ >> p->platform_info.cpuid_faulting = cpu_has_cpuid_faulting; >> >> wanting to be extended by "|| boot_cpu_has(X86_FEATURE_CPUID_USER_DIS)". >> That, afaict, has no connection to plumbing yet another MSR. > > Let me backtrack a little. There's 2 different (but related) matters under > discussion: > > 1. Trapping PV guest attempts to run the cpuid instruction > 2. Providing a virtualized CPUID faulting interface so a guest kernel > can intercept the CPUID instructions of code running under it. > > This series was meant to provide fix (1) on AMD hardware. I did go a bit > beyond in v1, trying to provide support for a unified faulting mechanism > on AMD, but providing a virtualized CPUID faulting interface needs to be > done a bit more carefully than that. Doing it partially now just adds tech > debt to be paid when CpuidUserDis is exposed to the domains. > > Changing the policy to expose the Intel interface when AMD is the backend > falls on (2), which is probably better off done separately in a unified way. > > v2 removes the changes to x86/msr.c so only (1) is addressed. > > Does this make sense? Certainly. Nevertheless I would like to understand what Andrew meant, even if only for staying better in sync with all the work he has been doing (and is still planning to do) in the area of CPU policies and leveling. Jan
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c index ecf126566d..984aedf180 100644 --- a/xen/arch/x86/msr.c +++ b/xen/arch/x86/msr.c @@ -431,6 +431,13 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val) { bool old_cpuid_faulting = msrs->misc_features_enables.cpuid_faulting; + /* + * The boot CPU must support Intel's CPUID faulting _or_ + * AMD's CpuidUserDis. + */ + bool can_fault_cpuid = cpu_has_cpuid_faulting || + boot_cpu_has(X86_FEATURE_CPUID_USER_DIS); + rsvd = ~0ull; if ( cp->platform_info.cpuid_faulting ) rsvd &= ~MSR_MISC_FEATURES_CPUID_FAULTING; @@ -440,7 +447,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val) msrs->misc_features_enables.raw = val; - if ( v == curr && is_hvm_domain(d) && cpu_has_cpuid_faulting && + if ( v == curr && is_hvm_domain(d) && can_fault_cpuid && (old_cpuid_faulting ^ msrs->misc_features_enables.cpuid_faulting) ) ctxt_switch_levelling(v); break;
This is in order to aid guests of AMD hardware that we have exposed CPUID faulting to. If they try to modify the Intel MSR that enables the feature, trigger levelling so AMD's version of it (CpuidUserDis) is used instead. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- xen/arch/x86/msr.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)