Message ID | 20230706194022.2485195-2-john.allen@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Fix MCE handling on AMD hosts | expand |
Hi John, Thanks for the patches. Few comments below. On 7/6/23 14:40, John Allen wrote: > Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to > be exposed to guests to allow them to handle machine check exceptions on AMD > hosts. > > Reported-by: William Roche <william.roche@oracle.com> > Signed-off-by: John Allen <john.allen@amd.com> > --- > target/i386/cpu.c | 2 +- > target/i386/cpu.h | 4 ++++ > 2 files changed, 5 insertions(+), 1 deletion(-) > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c > index 06009b80e8..09fae9337a 100644 > --- a/target/i386/cpu.c > +++ b/target/i386/cpu.c > @@ -5874,7 +5874,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, > break; > case 0x80000007: > *eax = 0; > - *ebx = 0; > + *ebx = env->features[FEAT_8000_0007_EBX] | CPUID_8000_0007_EBX_SUCCOR; This is adding this feature unconditionally which does not seem right. Couple of things. 1. Add the feature word for SUCCOR. Users can add this feature using the feature word "+succor". 2. Also define CPUID_8000_0007_EBX_SUCCOR : In this case, we can add this feature as part of the EPYC Model update. Thanks Babu
+x86 qemu folks On 06/07/2023 21:22, Moger, Babu wrote: > Hi John, > Thanks for the patches. Few comments below. > > On 7/6/23 14:40, John Allen wrote: >> Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to >> be exposed to guests to allow them to handle machine check exceptions on AMD >> hosts. >> >> Reported-by: William Roche <william.roche@oracle.com> >> Signed-off-by: John Allen <john.allen@amd.com> >> --- >> target/i386/cpu.c | 2 +- >> target/i386/cpu.h | 4 ++++ >> 2 files changed, 5 insertions(+), 1 deletion(-) >> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c >> index 06009b80e8..09fae9337a 100644 >> --- a/target/i386/cpu.c >> +++ b/target/i386/cpu.c >> @@ -5874,7 +5874,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, >> break; >> case 0x80000007: >> *eax = 0; >> - *ebx = 0; >> + *ebx = env->features[FEAT_8000_0007_EBX] | CPUID_8000_0007_EBX_SUCCOR; > > This is adding this feature unconditionally which does not seem right. > Couple of things. > 1. Add the feature word for SUCCOR. Users can add this feature using the > feature word "+succor". > The feature requires no specific SVM/KVM support, it's really just a bit enumerated to the guest that's used to tell whether AMD MCE handling is supported, instead of blantly crashing outright. So I think the thinking here is that GET_SUPPORTED_CPUID won't return SUCCOR and the feature gets filtered out if it's declared as a regular named feature like you do in the CPU models. Making it relevant for any hypervisor kernel. > 2. Also define CPUID_8000_0007_EBX_SUCCOR : In this case, we can add this > feature as part of the EPYC Model update. > how would you set CPUID regardless of the hypervisor enumerates should it not require active emulation by the host? I am sort of interested on what is the right way to do this, as it would be better to do this regardless of the gets advertised by ioctls alike. > Thanks > Babu >
On 7/6/23 21:40, John Allen wrote: > case 0x80000007: > *eax = 0; > - *ebx = 0; > + *ebx = env->features[FEAT_8000_0007_EBX] | CPUID_8000_0007_EBX_SUCCOR; > *ecx = 0; > *edx = env->features[FEAT_8000_0007_EDX]; > break; I agree that it needs no hypervisor support, but Babu is right that you cannot add it unconditionally (especially not on Intel processors). You can special case CPUID_8000_0007_EBX_SUCCOR in kvm_arch_get_supported_cpuid() so that it is added even on old kernels. There are already several such cases. Adding it to KVM is nice to have anyway, so please send a patch for that. Also, the patch does not compile (probably you missed a prerequisite) as it lacks all the rigamarole that is needed to add FEAT_8000_0007_EBX. Paolo
On Fri, Jul 07, 2023 at 04:25:22PM +0200, Paolo Bonzini wrote: > On 7/6/23 21:40, John Allen wrote: > > case 0x80000007: > > *eax = 0; > > - *ebx = 0; > > + *ebx = env->features[FEAT_8000_0007_EBX] | CPUID_8000_0007_EBX_SUCCOR; > > *ecx = 0; > > *edx = env->features[FEAT_8000_0007_EDX]; > > break; > > I agree that it needs no hypervisor support, but Babu is right that you > cannot add it unconditionally (especially not on Intel processors). > > You can special case CPUID_8000_0007_EBX_SUCCOR in > kvm_arch_get_supported_cpuid() so that it is added even on old kernels. > There are already several such cases. Adding it to KVM is nice to have > anyway, so please send a patch for that. By adding it to KVM do you mean adding a patch to the kernel to expose the cpuid bit? Or do you mean just adding the special case to kvm_arch_get_supported_cpuid? For the kvm_arch_get_supported_cpuid case, I don't understand how this would be different from unconditionally exposing the bit as done above. Can you help me understand what you have in mind for this? I might add a case like below: ... } else if (function == 0x80000007 && reg == R_EBX) { ret |= CPUID_8000_0007_EBX_SUCCOR; ... If we wanted to only expose the bit for AMD cpus, we would then need to call IS_AMD_CPU with the CPUX86State as a paramter which would mean that kvm_arch_get_supported_cpuid and all of its callers would need to take the CPUX86State as a parameter. Is there another way to differentiate between AMD and Intel cpus in this case? > > Also, the patch does not compile (probably you missed a prerequisite) as it > lacks all the rigamarole that is needed to add FEAT_8000_0007_EBX. I'm not encountering any compilation issues. What are the errors that you are seeing? Thanks, John
On 12/07/2023 20:11, John Allen wrote: > On Fri, Jul 07, 2023 at 04:25:22PM +0200, Paolo Bonzini wrote: >> On 7/6/23 21:40, John Allen wrote: >>> case 0x80000007: >>> *eax = 0; >>> - *ebx = 0; >>> + *ebx = env->features[FEAT_8000_0007_EBX] | CPUID_8000_0007_EBX_SUCCOR; >>> *ecx = 0; >>> *edx = env->features[FEAT_8000_0007_EDX]; >>> break; >> >> I agree that it needs no hypervisor support, but Babu is right that you >> cannot add it unconditionally (especially not on Intel processors). >> >> You can special case CPUID_8000_0007_EBX_SUCCOR in >> kvm_arch_get_supported_cpuid() so that it is added even on old kernels. >> There are already several such cases. Adding it to KVM is nice to have >> anyway, so please send a patch for that. > > By adding it to KVM do you mean adding a patch to the kernel to expose > the cpuid bit? Or do you mean just adding the special case to > kvm_arch_get_supported_cpuid? > > For the kvm_arch_get_supported_cpuid case, I don't understand how this > would be different from unconditionally exposing the bit as done above. > Can you help me understand what you have in mind for this? > > I might add a case like below: > ... > } else if (function == 0x80000007 && reg == R_EBX) { > ret |= CPUID_8000_0007_EBX_SUCCOR; > ... > IIUC the thinking here is to cover a hypervisor that doesn't advertise SUCCOR via supported cpuid from the kernel, by adding that elif statement above. So when a CPU model is configured, the filtering of cpuid leafs logic takes place, it doesn't "dissappear" when the values from the kernel isn't set. Otherwise, if it's advertised by the kernel, everything goes as normal in this filtering logic and everything works. This presumes that you set "succor" feature string in the CPUID model definition instead of unilaterally defining it like this patch. Such that only AMD CPU models advertise the feature instead of everybody. > If we wanted to only expose the bit for AMD cpus, we would then need to > call IS_AMD_CPU with the CPUX86State as a paramter which would mean that > kvm_arch_get_supported_cpuid and all of its callers would need to take > the CPUX86State as a parameter. Is there another way to differentiate > between AMD and Intel cpus in this case? > There might be an implicit convetion here where the caller is the one selecting all the bits -- so kvm_arch_get_supported_cpuid won't care to validate which vendor supports a given bit. Usually steered down via CPU model names (-cpu Genoa), or named CPU features (-cpu Genoa,-feature-name) e.g. Nothing stops you from trying to use a Intel CPUID on a AMD host. So the IS_AMD_CPU inside kvm_arch_get_supported_cpuid doesn't seem to matter Joao
diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 06009b80e8..09fae9337a 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -5874,7 +5874,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, break; case 0x80000007: *eax = 0; - *ebx = 0; + *ebx = env->features[FEAT_8000_0007_EBX] | CPUID_8000_0007_EBX_SUCCOR; *ecx = 0; *edx = env->features[FEAT_8000_0007_EDX]; break; diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 3edaad7688..fccb9b5a97 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -555,6 +555,7 @@ typedef enum FeatureWord { FEAT_7_1_EAX, /* CPUID[EAX=7,ECX=1].EAX */ FEAT_8000_0001_EDX, /* CPUID[8000_0001].EDX */ FEAT_8000_0001_ECX, /* CPUID[8000_0001].ECX */ + FEAT_8000_0007_EBX, /* CPUID[8000_0007].EBX */ FEAT_8000_0007_EDX, /* CPUID[8000_0007].EDX */ FEAT_8000_0008_EBX, /* CPUID[8000_0008].EBX */ FEAT_C000_0001_EDX, /* CPUID[C000_0001].EDX */ @@ -856,6 +857,9 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS]; /* Packets which contain IP payload have LIP values */ #define CPUID_14_0_ECX_LIP (1U << 31) +/* RAS Features */ +#define CPUID_8000_0007_EBX_SUCCOR (1U << 1) + /* CLZERO instruction */ #define CPUID_8000_0008_EBX_CLZERO (1U << 0) /* Always save/restore FP error pointers */
Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to be exposed to guests to allow them to handle machine check exceptions on AMD hosts. Reported-by: William Roche <william.roche@oracle.com> Signed-off-by: John Allen <john.allen@amd.com> --- target/i386/cpu.c | 2 +- target/i386/cpu.h | 4 ++++ 2 files changed, 5 insertions(+), 1 deletion(-)