diff mbox series

[v2,01/66] KVM: x86: Return -E2BIG when KVM_GET_SUPPORTED_CPUID hits max entries

Message ID 20200302235709.27467-2-sean.j.christopherson@intel.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86: Introduce KVM cpu caps | expand

Commit Message

Sean Christopherson March 2, 2020, 11:56 p.m. UTC
Fix a long-standing bug that causes KVM to return 0 instead of -E2BIG
when userspace's array is insufficiently sized.

This technically breaks backwards compatibility, e.g. a userspace with a
hardcoded cpuid->nent could theoretically be broken as it would see an
error instead of success if cpuid->nent is less than the number of
entries required to fully enumerate the host CPU.  But, the lowest known
cpuid->nent hardcoded by a VMM is 100 (lkvm and selftests), and the
largest realistic limit on Intel and AMD is well under a 100.  E.g.
Intel's Icelake server with all the bells and whistles tops out at ~60
entries (variable due to SGX sub-leafs), and AMD's CPUID documentation
allows for less than 50 (KVM hard caps CPUID 0xD at a single sub-leaf).

Note, while the Fixes: tag is accurate with respect to the immediate
bug, it's likely that similar bugs in KVM_GET_SUPPORTED_CPUID existed
prior to the refactoring, e.g. Qemu contains a workaround for the broken
KVM_GET_SUPPORTED_CPUID behavior that predates the buggy commit by over
two years.  The Qemu workaround is also likely the main reason the bug
has gone unreported for so long.

Qemu hack:
  commit 76ae317f7c16aec6b469604b1764094870a75470
  Author: Mark McLoughlin <markmc@redhat.com>
  Date:   Tue May 19 18:55:21 2009 +0100

    kvm: work around supported cpuid ioctl() brokenness

    KVM_GET_SUPPORTED_CPUID has been known to fail to return -E2BIG
    when it runs out of entries. Detect this by always trying again
    with a bigger table if the ioctl() fills the table.

Fixes: 831bf664e9c1f ("KVM: Refactor and simplify kvm_dev_ioctl_get_supported_cpuid")
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/cpuid.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Paolo Bonzini March 3, 2020, 2:16 p.m. UTC | #1
On 03/03/20 00:56, Sean Christopherson wrote:
> (KVM hard caps CPUID 0xD at a single sub-leaf).

Hmm... no it doesn't?

                for (idx = 1, i = 1; idx < 64; ++idx) {
                        u64 mask = ((u64)1 << idx);
                        if (*nent >= maxnent)
                                goto out;

                        do_host_cpuid(&entry[i], function, idx);
                        if (idx == 1) {
                                entry[i].eax &= kvm_cpuid_D_1_eax_x86_features;
                                cpuid_mask(&entry[i].eax, CPUID_D_1_EAX);
                                entry[i].ebx = 0;
                                if (entry[i].eax & (F(XSAVES)|F(XSAVEC)))
                                        entry[i].ebx =
                                                xstate_required_size(supported,
                                                                     true);
                        } else {
                                if (entry[i].eax == 0 || !(supported & mask))
                                        continue;
                                if (WARN_ON_ONCE(entry[i].ecx & 1))
                                        continue;
                        }
                        entry[i].ecx = 0;
                        entry[i].edx = 0;
                        ++*nent;
                        ++i;
                }

I still think the patch is correct, what matters is that no KVM in
existence supports enough processor features to reach 100 or so subleaves.

Paolo
Sean Christopherson March 3, 2020, 3:17 p.m. UTC | #2
On Tue, Mar 03, 2020 at 03:16:16PM +0100, Paolo Bonzini wrote:
> On 03/03/20 00:56, Sean Christopherson wrote:
> > (KVM hard caps CPUID 0xD at a single sub-leaf).
> 
> Hmm... no it doesn't?
> 
>                 for (idx = 1, i = 1; idx < 64; ++idx) {
>                         u64 mask = ((u64)1 << idx);
>                         if (*nent >= maxnent)
>                                 goto out;
> 
>                         do_host_cpuid(&entry[i], function, idx);
>                         if (idx == 1) {
>                                 entry[i].eax &= kvm_cpuid_D_1_eax_x86_features;
>                                 cpuid_mask(&entry[i].eax, CPUID_D_1_EAX);
>                                 entry[i].ebx = 0;
>                                 if (entry[i].eax & (F(XSAVES)|F(XSAVEC)))
>                                         entry[i].ebx =
>                                                 xstate_required_size(supported,
>                                                                      true);
>                         } else {
>                                 if (entry[i].eax == 0 || !(supported & mask))
>                                         continue;
>                                 if (WARN_ON_ONCE(entry[i].ecx & 1))
>                                         continue;
>                         }
>                         entry[i].ecx = 0;
>                         entry[i].edx = 0;
>                         ++*nent;
>                         ++i;
>                 }

Ah rats, I was thinking of CPUID 0x7 when I wrote that.  Maybe just reword
it to "(KVM hard caps the number of CPUID 0xD sub-leafs)."?

> I still think the patch is correct, what matters is that no KVM in
> existence supports enough processor features to reach 100 or so subleaves.
Jim Mattson March 3, 2020, 7:47 p.m. UTC | #3
On Mon, Mar 2, 2020 at 3:57 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Fix a long-standing bug that causes KVM to return 0 instead of -E2BIG
> when userspace's array is insufficiently sized.
>
> This technically breaks backwards compatibility, e.g. a userspace with a
> hardcoded cpuid->nent could theoretically be broken as it would see an
> error instead of success if cpuid->nent is less than the number of
> entries required to fully enumerate the host CPU.  But, the lowest known
> cpuid->nent hardcoded by a VMM is 100 (lkvm and selftests), and the

I have an existence proof for 98. :-)

> largest realistic limit on Intel and AMD is well under a 100.  E.g.
> Intel's Icelake server with all the bells and whistles tops out at ~60
> entries (variable due to SGX sub-leafs), and AMD's CPUID documentation
> allows for less than 50 (KVM hard caps CPUID 0xD at a single sub-leaf).
>
> Note, while the Fixes: tag is accurate with respect to the immediate
> bug, it's likely that similar bugs in KVM_GET_SUPPORTED_CPUID existed
> prior to the refactoring, e.g. Qemu contains a workaround for the broken
> KVM_GET_SUPPORTED_CPUID behavior that predates the buggy commit by over
> two years.  The Qemu workaround is also likely the main reason the bug
> has gone unreported for so long.
>
> Qemu hack:
>   commit 76ae317f7c16aec6b469604b1764094870a75470
>   Author: Mark McLoughlin <markmc@redhat.com>
>   Date:   Tue May 19 18:55:21 2009 +0100
>
>     kvm: work around supported cpuid ioctl() brokenness
>
>     KVM_GET_SUPPORTED_CPUID has been known to fail to return -E2BIG
>     when it runs out of entries. Detect this by always trying again
>     with a bigger table if the ioctl() fills the table.
>
> Fixes: 831bf664e9c1f ("KVM: Refactor and simplify kvm_dev_ioctl_get_supported_cpuid")
> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
diff mbox series

Patch

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b1c469446b07..47ce04762c20 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -908,9 +908,14 @@  int kvm_dev_ioctl_get_cpuid(struct kvm_cpuid2 *cpuid,
 			goto out_free;
 
 		limit = cpuid_entries[nent - 1].eax;
-		for (func = ent->func + 1; func <= limit && nent < cpuid->nent && r == 0; ++func)
+		for (func = ent->func + 1; func <= limit && r == 0; ++func) {
+			if (nent >= cpuid->nent) {
+				r = -E2BIG;
+				goto out_free;
+			}
 			r = do_cpuid_func(&cpuid_entries[nent], func,
 				          &nent, cpuid->nent, type);
+		}
 
 		if (r)
 			goto out_free;