Message ID | 20230726204157.3604531-2-john.allen@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Fix MCE handling on AMD hosts | expand |
On 26/07/2023 21:41, John Allen wrote: > Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to > be exposed to guests to allow them to handle machine check exceptions on AMD > hosts. > > Reported-by: William Roche <william.roche@oracle.com> > Signed-off-by: John Allen <john.allen@amd.com> I think this is matching the last discussion: Reviewed-by: Joao Martins <joao.m.martins@oracle.com> The patch ordering doesn't look correct though. Perhaps we should expose succor only after MCE is fixed so this patch would be the second, not the first? Also, this should in generally be OK for -cpu host, but might be missing a third patch that adds "succor" to the AMD models e.g. diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 97ad229d8ba3..d132cb3bbbbe 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -4731,6 +4731,16 @@ static const X86CPUDefinition builtin_x86_defs[] = { { /* end of list */ } }, }, + { + .version = 5, + .props = (PropValue[]) { + /* Erratum 1386 */ + { "model-id", + "AMD EPYC-Rome-v5 Processor" }, + { "succor", "on" }, + { /* end of list */ } + }, + }, { /* end of list */ } } }, @@ -4806,6 +4816,16 @@ static const X86CPUDefinition builtin_x86_defs[] = { }, .cache_info = &epyc_milan_v2_cache_info }, + { + .version = 3, + .props = (PropValue[]) { + /* Erratum 1386 */ + { "model-id", + "AMD EPYC-Milan-v3 Processor" }, + { "succor", "on" }, + { /* end of list */ } + }, + }, { /* end of list */ } } }, @@ -4880,6 +4900,20 @@ static const X86CPUDefinition builtin_x86_defs[] = { .xlevel = 0x80000022, .model_id = "AMD EPYC-Genoa Processor", .cache_info = &epyc_genoa_cache_info, + .versions = (X86CPUVersionDefinition[]) { + { .version = 1 }, + { + .version = 2, + .props = (PropValue[]) { + /* Erratum 1386 */ + { "model-id", + "AMD EPYC-Genoa-v2 Processor" }, + { "succor", "on" }, + { /* end of list */ } + }, + }, + { /* end of list */ } + } },
On Fri, Sep 01, 2023 at 11:30:53AM +0100, Joao Martins wrote: > On 26/07/2023 21:41, John Allen wrote: > > Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to > > be exposed to guests to allow them to handle machine check exceptions on AMD > > hosts. > > > > Reported-by: William Roche <william.roche@oracle.com> > > Signed-off-by: John Allen <john.allen@amd.com> > > I think this is matching the last discussion: > > Reviewed-by: Joao Martins <joao.m.martins@oracle.com> > > The patch ordering doesn't look correct though. Perhaps we should expose succor > only after MCE is fixed so this patch would be the second, not the first? Yes, that makes sense. I will address this and send another version of the series with the correct ordering. > > Also, this should in generally be OK for -cpu host, but might be missing a third > patch that adds "succor" to the AMD models e.g. Babu, I think we previously discussed adding this to the models later in a separate series. Is this your preferred course of action or can we add it with this series? Thanks, John > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c > index 97ad229d8ba3..d132cb3bbbbe 100644 > --- a/target/i386/cpu.c > +++ b/target/i386/cpu.c > @@ -4731,6 +4731,16 @@ static const X86CPUDefinition builtin_x86_defs[] = { > { /* end of list */ } > }, > }, > + { > + .version = 5, > + .props = (PropValue[]) { > + /* Erratum 1386 */ > + { "model-id", > + "AMD EPYC-Rome-v5 Processor" }, > + { "succor", "on" }, > + { /* end of list */ } > + }, > + }, > { /* end of list */ } > } > }, > @@ -4806,6 +4816,16 @@ static const X86CPUDefinition builtin_x86_defs[] = { > }, > .cache_info = &epyc_milan_v2_cache_info > }, > + { > + .version = 3, > + .props = (PropValue[]) { > + /* Erratum 1386 */ > + { "model-id", > + "AMD EPYC-Milan-v3 Processor" }, > + { "succor", "on" }, > + { /* end of list */ } > + }, > + }, > { /* end of list */ } > } > }, > @@ -4880,6 +4900,20 @@ static const X86CPUDefinition builtin_x86_defs[] = { > .xlevel = 0x80000022, > .model_id = "AMD EPYC-Genoa Processor", > .cache_info = &epyc_genoa_cache_info, > + .versions = (X86CPUVersionDefinition[]) { > + { .version = 1 }, > + { > + .version = 2, > + .props = (PropValue[]) { > + /* Erratum 1386 */ > + { "model-id", > + "AMD EPYC-Genoa-v2 Processor" }, > + { "succor", "on" }, > + { /* end of list */ } > + }, > + }, > + { /* end of list */ } > + } > },
Hi John, On 9/5/2023 10:01 AM, John Allen wrote: > On Fri, Sep 01, 2023 at 11:30:53AM +0100, Joao Martins wrote: >> On 26/07/2023 21:41, John Allen wrote: >>> Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to >>> be exposed to guests to allow them to handle machine check exceptions on AMD >>> hosts. >>> >>> Reported-by: William Roche<william.roche@oracle.com> >>> Signed-off-by: John Allen<john.allen@amd.com> >> I think this is matching the last discussion: >> >> Reviewed-by: Joao Martins<joao.m.martins@oracle.com> >> >> The patch ordering doesn't look correct though. Perhaps we should expose succor >> only after MCE is fixed so this patch would be the second, not the first? > Yes, that makes sense. I will address this and send another version of > the series with the correct ordering. > >> Also, this should in generally be OK for -cpu host, but might be missing a third >> patch that adds "succor" to the AMD models e.g. > Babu, > > I think we previously discussed adding this to the models later in a > separate series. Is this your preferred course of action or can we add > it with this series? Yes. We can add it later as a separate series. We just added EPYC-Genoa. We don't want to add EPYC-Genoa-v2 at this point. We have few more features pending as well. Thanks Babu
diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 06009b80e8..9708785255 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -872,6 +872,22 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = { }, .tcg_features = TCG_7_1_EAX_FEATURES, }, + [FEAT_8000_0007_EBX] = { + .type = CPUID_FEATURE_WORD, + .feat_names = { + NULL, "succor", NULL, NULL, + NULL, NULL, NULL, NULL, + NULL, NULL, NULL, NULL, + NULL, NULL, NULL, NULL, + NULL, NULL, NULL, NULL, + NULL, NULL, NULL, NULL, + NULL, NULL, NULL, NULL, + NULL, NULL, NULL, NULL, + }, + .cpuid = { .eax = 0x80000007, .reg = R_EBX, }, + .tcg_features = 0, + .unmigratable_flags = 0, + }, [FEAT_8000_0007_EDX] = { .type = CPUID_FEATURE_WORD, .feat_names = { @@ -5874,7 +5890,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, break; case 0x80000007: *eax = 0; - *ebx = 0; + *ebx = env->features[FEAT_8000_0007_EBX]; *ecx = 0; *edx = env->features[FEAT_8000_0007_EDX]; break; diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 3edaad7688..fccb9b5a97 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -555,6 +555,7 @@ typedef enum FeatureWord { FEAT_7_1_EAX, /* CPUID[EAX=7,ECX=1].EAX */ FEAT_8000_0001_EDX, /* CPUID[8000_0001].EDX */ FEAT_8000_0001_ECX, /* CPUID[8000_0001].ECX */ + FEAT_8000_0007_EBX, /* CPUID[8000_0007].EBX */ FEAT_8000_0007_EDX, /* CPUID[8000_0007].EDX */ FEAT_8000_0008_EBX, /* CPUID[8000_0008].EBX */ FEAT_C000_0001_EDX, /* CPUID[C000_0001].EDX */ @@ -856,6 +857,9 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS]; /* Packets which contain IP payload have LIP values */ #define CPUID_14_0_ECX_LIP (1U << 31) +/* RAS Features */ +#define CPUID_8000_0007_EBX_SUCCOR (1U << 1) + /* CLZERO instruction */ #define CPUID_8000_0008_EBX_CLZERO (1U << 0) /* Always save/restore FP error pointers */ diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index f25837f63f..4b62138459 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -417,6 +417,8 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, */ cpuid_1_edx = kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX); ret |= cpuid_1_edx & CPUID_EXT2_AMD_ALIASES; + } else if (function == 0x80000007 && reg == R_EBX) { + ret |= CPUID_8000_0007_EBX_SUCCOR; } else if (function == KVM_CPUID_FEATURES && reg == R_EAX) { /* kvm_pv_unhalt is reported by GET_SUPPORTED_CPUID, but it can't * be enabled without the in-kernel irqchip
Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to be exposed to guests to allow them to handle machine check exceptions on AMD hosts. Reported-by: William Roche <william.roche@oracle.com> Signed-off-by: John Allen <john.allen@amd.com> --- v2: - Add "succor" feature word. - Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature. --- target/i386/cpu.c | 18 +++++++++++++++++- target/i386/cpu.h | 4 ++++ target/i386/kvm/kvm.c | 2 ++ 3 files changed, 23 insertions(+), 1 deletion(-)