From patchwork Mon Aug 15 17:50:59 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= X-Patchwork-Id: 9281731 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 1FAD160467 for ; Mon, 15 Aug 2016 17:51:09 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 12BA628E15 for ; Mon, 15 Aug 2016 17:51:09 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 05A9628E19; Mon, 15 Aug 2016 17:51:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.9 required=2.0 tests=BAYES_00,HK_RANDOM_FROM, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3E4FD28E15 for ; Mon, 15 Aug 2016 17:51:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752930AbcHORvF (ORCPT ); Mon, 15 Aug 2016 13:51:05 -0400 Received: from mx1.redhat.com ([209.132.183.28]:46244 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752801AbcHORvD (ORCPT ); Mon, 15 Aug 2016 13:51:03 -0400 Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 1DA2361E5E for ; Mon, 15 Aug 2016 17:51:03 +0000 (UTC) Received: from potion (dhcp-1-206.brq.redhat.com [10.34.1.206]) by int-mx14.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with SMTP id u7FHp0r6000407; Mon, 15 Aug 2016 13:51:00 -0400 Received: by potion (sSMTP sendmail emulation); Mon, 15 Aug 2016 19:50:59 +0200 Date: Mon, 15 Aug 2016 19:50:59 +0200 From: Radim =?utf-8?B?S3LEjW3DocWZ?= To: Eduardo Habkost Cc: Marcelo Tosatti , Paolo Bonzini , peterx@redhat.com, Andrew Jones , kvm@vger.kernel.org Subject: Re: GET_SUPPORTED_CPUID and irqchip (was Re: kvm_pv_unhalt and kernel_irqchip=off) Message-ID: <20160815175059.GA12532@potion> References: <20160810182704.GE5627@thinpad.lan.raisama.net> <20160810190412.GA8001@potion> <20160812183742.GM5627@thinpad.lan.raisama.net> <20160813124303.GA20960@potion> <20160815130346.GP5627@thinpad.lan.raisama.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20160815130346.GP5627@thinpad.lan.raisama.net> X-Scanned-By: MIMEDefang 2.68 on 10.5.11.27 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Mon, 15 Aug 2016 17:51:03 +0000 (UTC) Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP 2016-08-15 10:03-0300, Eduardo Habkost: > On Sat, Aug 13, 2016 at 02:43:03PM +0200, Radim Krčmář wrote: >> I would like to return -EINVAL from KVM_SET_CPUID2 if userspace >> requested a new CPUID feature that cannot work in given situation. >> >> Another way would be to disable buggy features in KVM_SET_CPUID2, which >> would require userspace to call KVM_GET_CPUID2 afterwards to learn what >> the guest is actually using. > > I'd prefer to get -EINVAL. QEMU needs to be 100% aware of the > resulting CPUID bits, so if we end up trying to enable something > that will never work, it's already too late to silently clear > CPUID bits. > > This shouldn't make any difference for existing QEMU code, > because it already filters out all CPUID feature flags based > on GET_SUPPORTED_CPUID (+ the extra fixups in > QEMU's kvm_arch_get_supported_cpuid()). > > But: note that returning -EINVAL might break very old QEMU > versions. Maybe we really want them to break (because they have > broken CPUID logic), but I am not sure. Returning -EINVAL would break even current QEMU users, because they might have '-cpu host' with a guest that doesn't use PV_UNHALT. -EINVAL without any interface changes could be done only for future features. :/ >> I have patches that implement the latter for X2APIC and PV_UNHALT, but >> I'm not sure if it's better than leaving the bug unfixed, because QEMU >> doesn't use KVM_GET_CPUID2 and migration to older KVM would change >> CPUID, which is a very subtle bug. >> What do you think? > > Implementing those checks basically mean moving the existing > kvm_arch_get_supported_cpuid() logic inside KVM. If we do that, > can we get an interface that will allow QEMU to just query KVM > instead of duplicating that logic? The QEMU code for old kernels likely has to stay so only future features wouldn't be duplicated. An interface to check what KVM disabled already exists, KVM_GET_CPUID2. We'd be hijacking it for "SET -> GET -> compare" where QEMU would exit() if features got cleared (and/or added?). It an extension of existing practice -- new features would be notified with a capability and SET+GET would check if they really can be enabled. We'd need a toggle (a way for QEMU to tell KVM that it will GET and compare) to allow KVM safely clear already released features. e.g. migration wouldn't enable this feature. The toggle for SET+GET would be very simple in the kernel, but might turn out to be complicated in QEMU ... See the rough idea at the bottom. > Making KVM_SET_CPUID2 clear those bits and/or return -EINVAL > would probably be enough as an interface, as long as QEMU has a > way to know in advance if it needs to clear CPUID bits based on > the legacy kvm_arch_get_supported_cpuid() code (for older > kernels), or if it can just give everything to KVM_SET_CPUID2, > and check for unsupported/cleared flags later (for newer > kernels). A new IOCTL that returns valid-but-not-in-supported and invalid-from-supported CPUIDs and forces KVM to -EINVAL if userspace doesn't listen is in the game too. It is more complicated, though. ---8<--- Rough (doesn't even define everything) idea of the clearing approach. --- -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 739db9ab16b2..8d085823c4af 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -545,6 +545,9 @@ struct kvm_cpuid { struct kvm_cpuid_entry entries[0]; }; +If KVM_CAP_CLEAR_INVALID_CPUID is enabled, then the ioctls can modify CPUID +bits to prevent invalid configurations. + 4.21 KVM_SET_SIGNAL_MASK @@ -3900,6 +3903,18 @@ to take care of that. This capability can be enabled dynamically even if VCPUs were already created and are running. +7.9 KVM_CAP_CLEAR_INVALID_CPUID + +Architectures: x86 +Parameters: none + +After setting this capability, KVM_SET_CPUID2 can modify CPUID bits that are +invalid in the given configuration, e.g. disable KVM_FEATURE_PV_UNHALT when the +LAPIC is in the userspace. + +The userspace is expected to use KVM_GET_CPUID2 to retrieve the final CPUID. + + 8. Other capabilities. ---------------------- diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index afa7bbb596cd..afa05bb31cc3 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -160,6 +160,21 @@ static void cpuid_fix_nx_cap(struct kvm_vcpu *vcpu) } } +static void kvm_clear_invalid_cpuid(struct kvm_vcpu *vcpu) +{ + struct kvm *kvm = vcpu->kvm; + struct kvm_cpuid_entry2 *best; + + if (!kvm->clear_invalid_cpuid) + return; + + best = kvm_find_cpuid_entry(vcpu, 1, 0); + if (best && !lapic_in_kernel(vcpu)) + best->ecx &= ~F(X2APIC); + + /* TODO: to disable KVM_FEATURE_PV_UNHALT */ +} + int cpuid_query_maxphyaddr(struct kvm_vcpu *vcpu) { struct kvm_cpuid_entry2 *best; @@ -234,6 +249,8 @@ int kvm_vcpu_ioctl_set_cpuid2(struct kvm_vcpu *vcpu, cpuid->nent * sizeof(struct kvm_cpuid_entry2))) goto out; vcpu->arch.cpuid_nent = cpuid->nent; + + kvm_clear_invalid_cpuid(vcpu); kvm_apic_set_version(vcpu); kvm_x86_ops->cpuid_update(vcpu); r = kvm_update_cpuid(vcpu); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 19f9f9e05c2a..99028df615e7 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3812,6 +3812,13 @@ split_irqchip_unlock: r = 0; break; + case KVM_CAP_CLEAR_INVALID_CPUID: + r = -EEXIST; + if (kvm->created_vcpus) + break; + kvm->clear_invalid_cpuid = true; + r = 0; + break; default: r = -EINVAL; break;