diff mbox series

[v2,2/4] KVM: x86: Partially allow KVM_SET_CPUID{,2} after KVM_RUN

Message ID 20220117150542.2176196-3-vkuznets@redhat.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86: Partially allow KVM_SET_CPUID{,2} after KVM_RUN for CPU hotplug | expand

Commit Message

Vitaly Kuznetsov Jan. 17, 2022, 3:05 p.m. UTC
Commit feb627e8d6f6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN")
forbade changing CPUID altogether but unfortunately this is not fully
compatible with existing VMMs. In particular, QEMU reuses vCPU fds for
CPU hotplug after unplug and it calls KVM_SET_CPUID2. Instead of full ban,
check whether the supplied CPUID data is equal to what was previously set.

Reported-by: Igor Mammedov <imammedo@redhat.com>
Fixes: feb627e8d6f6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/cpuid.c | 33 +++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c   | 19 -------------------
 2 files changed, 33 insertions(+), 19 deletions(-)

Comments

Paolo Bonzini Jan. 17, 2022, 5:30 p.m. UTC | #1
On 1/17/22 16:05, Vitaly Kuznetsov wrote:
>   
> +/* Check whether the supplied CPUID data is equal to what is already set for the vCPU. */
> +static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
> +				 int nent)
> +{
> +	struct kvm_cpuid_entry2 *best;
> +	int i;
> +
> +	for (i = 0; i < nent; i++) {
> +		best = kvm_find_cpuid_entry(vcpu, e2[i].function, e2[i].index);
> +		if (!best)
> +			return -EINVAL;
> +
> +		if (e2[i].eax != best->eax || e2[i].ebx != best->ebx ||
> +		    e2[i].ecx != best->ecx || e2[i].edx != best->edx)
> +			return -EINVAL;
> +	}
> +
> +	return 0;
> +}

What about this alternative implementation:

/* Check whether the supplied CPUID data is equal to what is already set for the vCPU. */
static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
                                  int nent)
{
         struct kvm_cpuid_entry2 *orig;
         int i;

         if (nent != vcpu->arch.cpuid_nent)
                 return -EINVAL;

         for (i = 0; i < nent; i++) {
                 orig = &vcpu->arch.cpuid_entries[i];
                 if (e2[i].function != orig->function ||
                     e2[i].index != orig->index ||
                     e2[i].eax != orig->eax || e2[i].ebx != orig->ebx ||
                     e2[i].ecx != orig->ecx || e2[i].edx != orig->edx)
                         return -EINVAL;
         }

         return 0;
}

avoiding the repeated calls to kvm_find_cpuid_entry?

Paolo
Vitaly Kuznetsov Jan. 18, 2022, 8:40 a.m. UTC | #2
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 1/17/22 16:05, Vitaly Kuznetsov wrote:
>>   
>> +/* Check whether the supplied CPUID data is equal to what is already set for the vCPU. */
>> +static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
>> +				 int nent)
>> +{
>> +	struct kvm_cpuid_entry2 *best;
>> +	int i;
>> +
>> +	for (i = 0; i < nent; i++) {
>> +		best = kvm_find_cpuid_entry(vcpu, e2[i].function, e2[i].index);
>> +		if (!best)
>> +			return -EINVAL;
>> +
>> +		if (e2[i].eax != best->eax || e2[i].ebx != best->ebx ||
>> +		    e2[i].ecx != best->ecx || e2[i].edx != best->edx)
>> +			return -EINVAL;
>> +	}
>> +
>> +	return 0;
>> +}
>
> What about this alternative implementation:
>
> /* Check whether the supplied CPUID data is equal to what is already set for the vCPU. */
> static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
>                                   int nent)
> {
>          struct kvm_cpuid_entry2 *orig;
>          int i;
>
>          if (nent != vcpu->arch.cpuid_nent)
>                  return -EINVAL;
>
>          for (i = 0; i < nent; i++) {
>                  orig = &vcpu->arch.cpuid_entries[i];
>                  if (e2[i].function != orig->function ||
>                      e2[i].index != orig->index ||
>                      e2[i].eax != orig->eax || e2[i].ebx != orig->ebx ||
>                      e2[i].ecx != orig->ecx || e2[i].edx != orig->edx)
>                          return -EINVAL;
>          }
>
>          return 0;
> }
>
> avoiding the repeated calls to kvm_find_cpuid_entry?
>

My version is a bit more permissive as it allows supplying CPUID entries
in any order, not necessarily matching the original. I *guess* this
doesn't matter much for the QEMU problem we're trying to workaround,
I'll have to check.
Vitaly Kuznetsov Jan. 18, 2022, 2:37 p.m. UTC | #3
Vitaly Kuznetsov <vkuznets@redhat.com> writes:

> Paolo Bonzini <pbonzini@redhat.com> writes:
>
>> On 1/17/22 16:05, Vitaly Kuznetsov wrote:
>>>   
>>> +/* Check whether the supplied CPUID data is equal to what is already set for the vCPU. */
>>> +static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
>>> +				 int nent)
>>> +{
>>> +	struct kvm_cpuid_entry2 *best;
>>> +	int i;
>>> +
>>> +	for (i = 0; i < nent; i++) {
>>> +		best = kvm_find_cpuid_entry(vcpu, e2[i].function, e2[i].index);
>>> +		if (!best)
>>> +			return -EINVAL;
>>> +
>>> +		if (e2[i].eax != best->eax || e2[i].ebx != best->ebx ||
>>> +		    e2[i].ecx != best->ecx || e2[i].edx != best->edx)
>>> +			return -EINVAL;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>
>> What about this alternative implementation:
>>
...
>>
>> avoiding the repeated calls to kvm_find_cpuid_entry?
>>
>
> My version is a bit more permissive as it allows supplying CPUID entries
> in any order, not necessarily matching the original. I *guess* this
> doesn't matter much for the QEMU problem we're trying to workaround,
> I'll have to check.

I tried this with QEMU and nothing blew up, during CPU hotplug entries
come in the same order as the original. v3 which I've just sent
implements this suggestion.
diff mbox series

Patch

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 812190a707f6..c2e9c860fc3f 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -119,6 +119,25 @@  static int kvm_check_cpuid(struct kvm_vcpu *vcpu,
 	return fpu_enable_guest_xfd_features(&vcpu->arch.guest_fpu, xfeatures);
 }
 
+/* Check whether the supplied CPUID data is equal to what is already set for the vCPU. */
+static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
+				 int nent)
+{
+	struct kvm_cpuid_entry2 *best;
+	int i;
+
+	for (i = 0; i < nent; i++) {
+		best = kvm_find_cpuid_entry(vcpu, e2[i].function, e2[i].index);
+		if (!best)
+			return -EINVAL;
+
+		if (e2[i].eax != best->eax || e2[i].ebx != best->ebx ||
+		    e2[i].ecx != best->ecx || e2[i].edx != best->edx)
+			return -EINVAL;
+	}
+
+	return 0;
+}
 static void kvm_update_kvm_cpuid_base(struct kvm_vcpu *vcpu)
 {
 	u32 function;
@@ -313,6 +332,20 @@  static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
 
 	__kvm_update_cpuid_runtime(vcpu, e2, nent);
 
+	/*
+	 * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
+	 * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
+	 * tracked in kvm_mmu_page_role.  As a result, KVM may miss guest page
+	 * faults due to reusing SPs/SPTEs. In practice no sane VMM mucks with
+	 * the core vCPU model on the fly. It would've been better to forbid any
+	 * KVM_SET_CPUID{,2} calls after KVM_RUN altogether but unfortunately
+	 * some VMMs (e.g. QEMU) reuse vCPU fds for CPU hotplug/unplug and do
+	 * KVM_SET_CPUID{,2} again. To support this legacy behavior, check
+	 * whether the supplied CPUID data is equal to what's already set.
+	 */
+	if (vcpu->arch.last_vmentry_cpu != -1)
+		return kvm_cpuid_check_equal(vcpu, e2, nent);
+
 	r = kvm_check_cpuid(vcpu, e2, nent);
 	if (r)
 		return r;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 76b4803dd3bd..ff1416010728 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5230,17 +5230,6 @@  long kvm_arch_vcpu_ioctl(struct file *filp,
 		struct kvm_cpuid __user *cpuid_arg = argp;
 		struct kvm_cpuid cpuid;
 
-		/*
-		 * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
-		 * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
-		 * tracked in kvm_mmu_page_role.  As a result, KVM may miss guest page
-		 * faults due to reusing SPs/SPTEs.  In practice no sane VMM mucks with
-		 * the core vCPU model on the fly, so fail.
-		 */
-		r = -EINVAL;
-		if (vcpu->arch.last_vmentry_cpu != -1)
-			goto out;
-
 		r = -EFAULT;
 		if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
 			goto out;
@@ -5251,14 +5240,6 @@  long kvm_arch_vcpu_ioctl(struct file *filp,
 		struct kvm_cpuid2 __user *cpuid_arg = argp;
 		struct kvm_cpuid2 cpuid;
 
-		/*
-		 * KVM_SET_CPUID{,2} after KVM_RUN is forbidded, see the comment in
-		 * KVM_SET_CPUID case above.
-		 */
-		r = -EINVAL;
-		if (vcpu->arch.last_vmentry_cpu != -1)
-			goto out;
-
 		r = -EFAULT;
 		if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
 			goto out;