KVM: x86: work around QEMU issue with synthetic CPUID leaves

Message ID 20220429192553.932611-1-pbonzini@redhat.com (mailing list archive)
State New, archived

Commit Message

Paolo Bonzini April 29, 2022, 7:25 p.m. UTC
Synthesizing AMD leaves up to 0x80000021 caused problems with QEMU,
which assumes the *host* CPUID[0x80000000].EAX is higher or equal
to what KVM_GET_SUPPORTED_CPUID reports.

This causes QEMU to issue bogus host CPUIDs when preparing the input
to KVM_SET_CPUID2.  It can even get into an infinite loop, which is
only terminated by an abort():

   cpuid_data is full, no space for cpuid(eax:0x8000001d,ecx:0x3e)

To work around this, only synthesize those leaves if 0x8000001d exists
on the host.  The synthetic 0x80000021 leaf is mostly useful on Zen2,
which satisfies the condition.

Fixes: f144c49e8c39 ("KVM: x86: synthesize CPUID leaf 0x80000021h if useful")
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/cpuid.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)
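
For readers who want to see the failure mode in isolation, here is a minimal user-space model of it. It is not QEMU's code: the callback type, the table capacity, the fake hosts, and the "stop when EAX reads as zero" rule are all assumptions chosen to mirror the description above. On a host that really implements 0x8000001d the walk ends at a terminating subleaf; on a host that answers the out-of-range query with the same non-zero data for every ECX, the walk only ends when the table is full.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 100 /* hypothetical capacity of the userspace CPUID table */

struct cpuid_regs { uint32_t eax, ebx, ecx, edx; };

/* Hypothetical callback: returns the host's CPUID output for (leaf, subleaf). */
typedef struct cpuid_regs (*cpuid_fn)(uint32_t leaf, uint32_t subleaf);

/*
 * Walk the subleaves of 0x8000001d until one reads EAX == 0.
 * Returns the number of entries stored, or -1 if the table fills up
 * before a terminating subleaf is seen (the point where the real
 * program would abort).
 */
static inline int enumerate_cache_leaf(cpuid_fn cpuid,
                                       struct cpuid_regs *table,
                                       size_t capacity)
{
    size_t n = 0;

    for (uint32_t sub = 0; ; sub++) {
        struct cpuid_regs r = cpuid(0x8000001du, sub);

        if (r.eax == 0)
            return (int)n;      /* terminating subleaf reached */
        if (n == capacity)
            return -1;          /* "cpuid_data is full" case */
        table[n++] = r;
    }
}

/* Fake host that implements the leaf: four subleaves, then a terminator. */
static inline struct cpuid_regs host_with_leaf(uint32_t leaf, uint32_t subleaf)
{
    struct cpuid_regs r = { subleaf < 4 ? subleaf + 1 : 0, 0, 0, 0 };

    (void)leaf;
    return r;
}

/*
 * Fake host without the leaf: the out-of-range query returns the same
 * non-zero data no matter what ECX is, so no terminator ever appears.
 */
static inline struct cpuid_regs host_without_leaf(uint32_t leaf, uint32_t subleaf)
{
    struct cpuid_regs r = { 0x7u, 0, 0, 0 };

    (void)leaf;
    (void)subleaf;
    return r;
}
```

Calling enumerate_cache_leaf() with host_with_leaf stores four entries, while host_without_leaf exhausts the table and returns -1.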

Comments

Maxim Levitsky May 1, 2022, 11:16 a.m. UTC | #1
On Fri, 2022-04-29 at 15:25 -0400, Paolo Bonzini wrote:
> Synthesizing AMD leaves up to 0x80000021 caused problems with QEMU,
> which assumes the *host* CPUID[0x80000000].EAX is higher or equal
> to what KVM_GET_SUPPORTED_CPUID reports.
> 
> This causes QEMU to issue bogus host CPUIDs when preparing the input
> to KVM_SET_CPUID2.  It can even get into an infinite loop, which is
> only terminated by an abort():
> 
>    cpuid_data is full, no space for cpuid(eax:0x8000001d,ecx:0x3e)
> 
> To work around this, only synthesize those leaves if 0x8000001d exists
> on the host.  The synthetic 0x80000021 leaf is mostly useful on Zen2,
> which satisfies the condition.
> 
> Fixes: f144c49e8c39 ("KVM: x86: synthesize CPUID leaf 0x80000021h if useful")
> Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/cpuid.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index b24ca7f4ed7c..598334ed5fbc 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -1085,12 +1085,21 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>  	case 0x80000000:
>  		entry->eax = min(entry->eax, 0x80000021);
>  		/*
> -		 * Serializing LFENCE is reported in a multitude of ways,
> -		 * and NullSegClearsBase is not reported in CPUID on Zen2;
> -		 * help userspace by providing the CPUID leaf ourselves.
> +		 * Serializing LFENCE is reported in a multitude of ways, and
> +		 * NullSegClearsBase is not reported in CPUID on Zen2; help
> +		 * userspace by providing the CPUID leaf ourselves.
> +		 *
> +		 * However, only do it if the host has CPUID leaf 0x8000001d.
> +		 * QEMU thinks that it can query the host blindly for that
> +		 * CPUID leaf if KVM reports that it supports 0x8000001d or
> +		 * above.  The processor merrily returns values from the
> +		 * highest Intel leaf which QEMU tries to use as the guest's
> +		 * 0x8000001d.  Even worse, this can result in an infinite
> +		 * loop if said highest leaf has no subleaves indexed by ECX.

Very small nitpick: It might be useful to add a note that qemu does this only for the
leaf 0x8000001d.

>  		 */
> -		if (static_cpu_has(X86_FEATURE_LFENCE_RDTSC)
> -		    || !static_cpu_has_bug(X86_BUG_NULL_SEG))
> +		if (entry->eax >= 0x8000001d &&
> +		    (static_cpu_has(X86_FEATURE_LFENCE_RDTSC)
> +		     || !static_cpu_has_bug(X86_BUG_NULL_SEG)))
>  			entry->eax = max(entry->eax, 0x80000021);
>  		break;
>  	case 0x80000001:

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky
Paolo Bonzini May 1, 2022, 5:37 p.m. UTC | #2
On 5/1/22 13:16, Maxim Levitsky wrote:
>> +		 * However, only do it if the host has CPUID leaf 0x8000001d.
>> +		 * QEMU thinks that it can query the host blindly for that
>> +		 * CPUID leaf if KVM reports that it supports 0x8000001d or
>> +		 * above.  The processor merrily returns values from the
>> +		 * highest Intel leaf which QEMU tries to use as the guest's
>> +		 * 0x8000001d.  Even worse, this can result in an infinite
>> +		 * loop if said highest leaf has no subleaves indexed by ECX.
>
> Very small nitpick: It might be useful to add a note that qemu does this only for the
> leaf 0x8000001d.

Yes, it's there: "QEMU thinks that it can query the host blindly for 
that CPUID leaf", "that" is 0x8000001d in the previous sentence.

Paolo
Maxim Levitsky May 2, 2022, 6:25 a.m. UTC | #3
On Sun, 2022-05-01 at 19:37 +0200, Paolo Bonzini wrote:
> On 5/1/22 13:16, Maxim Levitsky wrote:
> > > +		 * However, only do it if the host has CPUID leaf 0x8000001d.
> > > +		 * QEMU thinks that it can query the host blindly for that
> > > +		 * CPUID leaf if KVM reports that it supports 0x8000001d or
> > > +		 * above.  The processor merrily returns values from the
> > > +		 * highest Intel leaf which QEMU tries to use as the guest's
> > > +		 * 0x8000001d.  Even worse, this can result in an infinite
> > > +		 * loop if said highest leaf has no subleaves indexed by ECX.
> > 
> > Very small nitpick: It might be useful to add a note that qemu does this only for the
> > leaf 0x8000001d.
> 
> Yes, it's there: "QEMU thinks that it can query the host blindly for 
> that CPUID leaf", "that" is 0x8000001d in the previous sentence.

Yes, I see it, but it doesn't state that QEMU doesn't do this for other leaves in the affected range.

I had to check the QEMU source to be sure that checking for 0x8000001d
is enough.

Just a tiny minor nitpick though.

Best regards,
	Maxim Levitsky

> 
> Paolo
>

Patch

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b24ca7f4ed7c..598334ed5fbc 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -1085,12 +1085,21 @@  static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 	case 0x80000000:
 		entry->eax = min(entry->eax, 0x80000021);
 		/*
-		 * Serializing LFENCE is reported in a multitude of ways,
-		 * and NullSegClearsBase is not reported in CPUID on Zen2;
-		 * help userspace by providing the CPUID leaf ourselves.
+		 * Serializing LFENCE is reported in a multitude of ways, and
+		 * NullSegClearsBase is not reported in CPUID on Zen2; help
+		 * userspace by providing the CPUID leaf ourselves.
+		 *
+		 * However, only do it if the host has CPUID leaf 0x8000001d.
+		 * QEMU thinks that it can query the host blindly for that
+		 * CPUID leaf if KVM reports that it supports 0x8000001d or
+		 * above.  The processor merrily returns values from the
+		 * highest Intel leaf which QEMU tries to use as the guest's
+		 * 0x8000001d.  Even worse, this can result in an infinite
+		 * loop if said highest leaf has no subleaves indexed by ECX.
 		 */
-		if (static_cpu_has(X86_FEATURE_LFENCE_RDTSC)
-		    || !static_cpu_has_bug(X86_BUG_NULL_SEG))
+		if (entry->eax >= 0x8000001d &&
+		    (static_cpu_has(X86_FEATURE_LFENCE_RDTSC)
+		     || !static_cpu_has_bug(X86_BUG_NULL_SEG)))
 			entry->eax = max(entry->eax, 0x80000021);
 		break;
 	case 0x80000001:
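
The effect of the new condition is easier to check outside the kernel. The sketch below restates the patched 0x80000000 case as a pure function. The function name and its parameters are hypothetical: the two booleans stand in for static_cpu_has(X86_FEATURE_LFENCE_RDTSC) and static_cpu_has_bug(X86_BUG_NULL_SEG), and host_max_ext stands in for the host's CPUID[0x80000000].EAX.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone model of the patched 0x80000000 case.
 * Returns the maximum extended leaf that would be reported to userspace.
 */
static inline uint32_t kvm_max_ext_leaf(uint32_t host_max_ext,
                                        bool lfence_rdtsc,
                                        bool null_seg_bug)
{
    /* entry->eax = min(entry->eax, 0x80000021) */
    uint32_t eax = host_max_ext < 0x80000021u ? host_max_ext : 0x80000021u;

    /*
     * Only raise the limit to the synthetic leaf when the host already
     * has 0x8000001d, so userspace never sees a maximum at or above
     * 0x8000001d on a host that lacks that leaf.
     */
    if (eax >= 0x8000001du && (lfence_rdtsc || !null_seg_bug))
        eax = 0x80000021u;  /* max(eax, 0x80000021); eax is already <= it */

    return eax;
}
```

With a host maximum of 0x80000008 the result stays 0x80000008 whatever the feature bits say, whereas a host maximum of 0x8000001e is raised to 0x80000021 as soon as either condition holds.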