diff mbox series

[1/2] i386: Add support for SUCCOR feature

Message ID 20230706194022.2485195-2-john.allen@amd.com (mailing list archive)
State New, archived
Headers show
Series Fix MCE handling on AMD hosts | expand

Commit Message

John Allen July 6, 2023, 7:40 p.m. UTC
Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to
be exposed to guests to allow them to handle machine check exceptions on AMD
hosts.

Reported-by: William Roche <william.roche@oracle.com>
Signed-off-by: John Allen <john.allen@amd.com>
---
 target/i386/cpu.c | 2 +-
 target/i386/cpu.h | 4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)

Comments

Moger, Babu July 6, 2023, 8:22 p.m. UTC | #1
Hi John,
Thanks for the patches. Few comments below.

On 7/6/23 14:40, John Allen wrote:
> Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to
> be exposed to guests to allow them to handle machine check exceptions on AMD
> hosts.
> 
> Reported-by: William Roche <william.roche@oracle.com>
> Signed-off-by: John Allen <john.allen@amd.com>
> ---
>  target/i386/cpu.c | 2 +-
>  target/i386/cpu.h | 4 ++++
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 06009b80e8..09fae9337a 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5874,7 +5874,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>          break;
>      case 0x80000007:
>          *eax = 0;
> -        *ebx = 0;
> +        *ebx = env->features[FEAT_8000_0007_EBX] | CPUID_8000_0007_EBX_SUCCOR;

This is adding this feature unconditionally which does not seem right.
Couple of things.
1. Add the feature word for SUCCOR. Users can add this feature using the
feature word "+succor".

2. Also define  CPUID_8000_0007_EBX_SUCCOR : In this case, we can add this
feature as part of the EPYC Model update.

Thanks
Babu
Joao Martins July 6, 2023, 9:07 p.m. UTC | #2
+x86 qemu folks

On 06/07/2023 21:22, Moger, Babu wrote:
> Hi John,
> Thanks for the patches. Few comments below.
> 
> On 7/6/23 14:40, John Allen wrote:
>> Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to
>> be exposed to guests to allow them to handle machine check exceptions on AMD
>> hosts.
>>
>> Reported-by: William Roche <william.roche@oracle.com>
>> Signed-off-by: John Allen <john.allen@amd.com>
>> ---
>>  target/i386/cpu.c | 2 +-
>>  target/i386/cpu.h | 4 ++++
>>  2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index 06009b80e8..09fae9337a 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -5874,7 +5874,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>          break;
>>      case 0x80000007:
>>          *eax = 0;
>> -        *ebx = 0;
>> +        *ebx = env->features[FEAT_8000_0007_EBX] | CPUID_8000_0007_EBX_SUCCOR;
> 
> This is adding this feature unconditionally which does not seem right.
> Couple of things.
> 1. Add the feature word for SUCCOR. Users can add this feature using the
> feature word "+succor".
> 

The feature requires no specific SVM/KVM support, it's really just a bit
enumerated to the guest that's used to tell whether AMD MCE handling is
supported, instead of blantly crashing outright. So I think the thinking here is
that GET_SUPPORTED_CPUID won't return SUCCOR and the feature gets filtered out
if it's declared as a regular named feature like you do in the CPU models.
Making it relevant for any hypervisor kernel.

> 2. Also define  CPUID_8000_0007_EBX_SUCCOR : In this case, we can add this
> feature as part of the EPYC Model update.
> 

how would you set CPUID regardless of the hypervisor enumerates should it not
require active emulation by the host? I am sort of interested on what is the
right way to do this, as it would be better to do this regardless of the gets
advertised by ioctls alike.

> Thanks
> Babu
>
Paolo Bonzini July 7, 2023, 2:25 p.m. UTC | #3
On 7/6/23 21:40, John Allen wrote:
>       case 0x80000007:
>           *eax = 0;
> -        *ebx = 0;
> +        *ebx = env->features[FEAT_8000_0007_EBX] | CPUID_8000_0007_EBX_SUCCOR;
>           *ecx = 0;
>           *edx = env->features[FEAT_8000_0007_EDX];
>           break;

I agree that it needs no hypervisor support, but Babu is right that you 
cannot add it unconditionally (especially not on Intel processors).

You can special case CPUID_8000_0007_EBX_SUCCOR in 
kvm_arch_get_supported_cpuid() so that it is added even on old kernels. 
There are already several such cases.  Adding it to KVM is nice to have 
anyway, so please send a patch for that.

Also, the patch does not compile (probably you missed a prerequisite) as 
it lacks all the rigamarole that is needed to add FEAT_8000_0007_EBX.

Paolo
John Allen July 12, 2023, 7:11 p.m. UTC | #4
On Fri, Jul 07, 2023 at 04:25:22PM +0200, Paolo Bonzini wrote:
> On 7/6/23 21:40, John Allen wrote:
> >       case 0x80000007:
> >           *eax = 0;
> > -        *ebx = 0;
> > +        *ebx = env->features[FEAT_8000_0007_EBX] | CPUID_8000_0007_EBX_SUCCOR;
> >           *ecx = 0;
> >           *edx = env->features[FEAT_8000_0007_EDX];
> >           break;
> 
> I agree that it needs no hypervisor support, but Babu is right that you
> cannot add it unconditionally (especially not on Intel processors).
> 
> You can special case CPUID_8000_0007_EBX_SUCCOR in
> kvm_arch_get_supported_cpuid() so that it is added even on old kernels.
> There are already several such cases.  Adding it to KVM is nice to have
> anyway, so please send a patch for that.

By adding it to KVM do you mean adding a patch to the kernel to expose
the cpuid bit? Or do you mean just adding the special case to
kvm_arch_get_supported_cpuid?

For the kvm_arch_get_supported_cpuid case, I don't understand how this
would be different from unconditionally exposing the bit as done above.
Can you help me understand what you have in mind for this?

I might add a case like below:
...
    } else if (function == 0x80000007 && reg == R_EBX) {
        ret |= CPUID_8000_0007_EBX_SUCCOR;
...

If we wanted to only expose the bit for AMD cpus, we would then need to
call IS_AMD_CPU with the CPUX86State as a paramter which would mean that
kvm_arch_get_supported_cpuid and all of its callers would need to take
the CPUX86State as a parameter. Is there another way to differentiate
between AMD and Intel cpus in this case?

> 
> Also, the patch does not compile (probably you missed a prerequisite) as it
> lacks all the rigamarole that is needed to add FEAT_8000_0007_EBX.

I'm not encountering any compilation issues. What are the errors that
you are seeing?

Thanks,
John
Joao Martins July 20, 2023, 1:29 p.m. UTC | #5
On 12/07/2023 20:11, John Allen wrote:
> On Fri, Jul 07, 2023 at 04:25:22PM +0200, Paolo Bonzini wrote:
>> On 7/6/23 21:40, John Allen wrote:
>>>       case 0x80000007:
>>>           *eax = 0;
>>> -        *ebx = 0;
>>> +        *ebx = env->features[FEAT_8000_0007_EBX] | CPUID_8000_0007_EBX_SUCCOR;
>>>           *ecx = 0;
>>>           *edx = env->features[FEAT_8000_0007_EDX];
>>>           break;
>>
>> I agree that it needs no hypervisor support, but Babu is right that you
>> cannot add it unconditionally (especially not on Intel processors).
>>
>> You can special case CPUID_8000_0007_EBX_SUCCOR in
>> kvm_arch_get_supported_cpuid() so that it is added even on old kernels.
>> There are already several such cases.  Adding it to KVM is nice to have
>> anyway, so please send a patch for that.
> 
> By adding it to KVM do you mean adding a patch to the kernel to expose
> the cpuid bit? Or do you mean just adding the special case to
> kvm_arch_get_supported_cpuid?
> 
> For the kvm_arch_get_supported_cpuid case, I don't understand how this
> would be different from unconditionally exposing the bit as done above.
> Can you help me understand what you have in mind for this?
> 
> I might add a case like below:
> ...
>     } else if (function == 0x80000007 && reg == R_EBX) {
>         ret |= CPUID_8000_0007_EBX_SUCCOR;
> ...
> 

IIUC the thinking here is to cover a hypervisor that doesn't advertise SUCCOR
via supported cpuid from the kernel, by adding that elif statement above.
So when a CPU model is configured, the filtering of cpuid leafs logic
takes place, it doesn't "dissappear" when the values from the kernel isn't set.

Otherwise, if it's advertised by the kernel, everything goes as normal in this
filtering logic and everything works.

This presumes that you set "succor" feature string in the CPUID model definition
instead of unilaterally defining it like this patch. Such that only AMD CPU
models advertise the feature instead of everybody.

> If we wanted to only expose the bit for AMD cpus, we would then need to
> call IS_AMD_CPU with the CPUX86State as a paramter which would mean that
> kvm_arch_get_supported_cpuid and all of its callers would need to take
> the CPUX86State as a parameter. Is there another way to differentiate
> between AMD and Intel cpus in this case?
> 
There might be an implicit convetion here where the caller is the one selecting
all the bits -- so kvm_arch_get_supported_cpuid won't care to validate which
vendor supports a given bit. Usually steered down via CPU model names (-cpu
Genoa), or named CPU features (-cpu Genoa,-feature-name) e.g. Nothing stops you
from trying to use a Intel CPUID on a AMD host. So the IS_AMD_CPU inside
kvm_arch_get_supported_cpuid doesn't seem to matter

	Joao
diff mbox series

Patch

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 06009b80e8..09fae9337a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5874,7 +5874,7 @@  void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         break;
     case 0x80000007:
         *eax = 0;
-        *ebx = 0;
+        *ebx = env->features[FEAT_8000_0007_EBX] | CPUID_8000_0007_EBX_SUCCOR;
         *ecx = 0;
         *edx = env->features[FEAT_8000_0007_EDX];
         break;
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 3edaad7688..fccb9b5a97 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -555,6 +555,7 @@  typedef enum FeatureWord {
     FEAT_7_1_EAX,       /* CPUID[EAX=7,ECX=1].EAX */
     FEAT_8000_0001_EDX, /* CPUID[8000_0001].EDX */
     FEAT_8000_0001_ECX, /* CPUID[8000_0001].ECX */
+    FEAT_8000_0007_EBX, /* CPUID[8000_0007].EBX */
     FEAT_8000_0007_EDX, /* CPUID[8000_0007].EDX */
     FEAT_8000_0008_EBX, /* CPUID[8000_0008].EBX */
     FEAT_C000_0001_EDX, /* CPUID[C000_0001].EDX */
@@ -856,6 +857,9 @@  typedef uint64_t FeatureWordArray[FEATURE_WORDS];
 /* Packets which contain IP payload have LIP values */
 #define CPUID_14_0_ECX_LIP              (1U << 31)
 
+/* RAS Features */
+#define CPUID_8000_0007_EBX_SUCCOR      (1U << 1)
+
 /* CLZERO instruction */
 #define CPUID_8000_0008_EBX_CLZERO      (1U << 0)
 /* Always save/restore FP error pointers */