From patchwork Wed Oct 31 13:26:31 2018
X-Patchwork-Submitter: Marc Orr
X-Patchwork-Id: 10662673
Date: Wed, 31 Oct 2018 06:26:31 -0700
Message-Id: <20181031132634.50440-2-marcorr@google.com>
In-Reply-To: <20181031132634.50440-1-marcorr@google.com>
Subject: [kvm PATCH v5 1/4] kvm: x86: Use task struct's fpu field for user
From: Marc Orr
To: kvm@vger.kernel.org, jmattson@google.com, rientjes@google.com,
    konrad.wilk@oracle.com, linux-mm@kvack.org, akpm@linux-foundation.org,
    pbonzini@redhat.com, rkrcmar@redhat.com, willy@infradead.org,
    sean.j.christopherson@intel.com, dave.hansen@linux.intel.com,
    kernellwp@gmail.com
Cc: Marc Orr
Previously, x86's instantiation of 'struct kvm_vcpu_arch' included a
user_fpu field, used to save and restore the userspace FPU state, which
differs from KVM's guest FPU state. However, this field is redundant with
the 'struct fpu' named fpu that is already embedded in the task struct via
its thread field. Thus, this patch removes the user_fpu field from the
kvm_vcpu_arch struct and uses the task struct's fpu field in its place.

This change is significant because the fpu struct is quite large. For
example, on the system used to develop this patch, it reduces the size of
the vcpu_vmx struct from 23680 bytes down to 19520 bytes when building the
kernel with kvmconfig. This reduction moves us closer to being able to
allocate the struct at order 2, rather than order 3.

Suggested-by: Dave Hansen
Signed-off-by: Marc Orr
---
 arch/x86/include/asm/kvm_host.h | 7 +++----
 arch/x86/kvm/x86.c              | 4 ++--
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 55e51ff7e421..ebb1d7a755d4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -601,16 +601,15 @@ struct kvm_vcpu_arch {

 	/*
 	 * QEMU userspace and the guest each have their own FPU state.
-	 * In vcpu_run, we switch between the user and guest FPU contexts.
-	 * While running a VCPU, the VCPU thread will have the guest FPU
-	 * context.
+	 * In vcpu_run, we switch between the user, maintained in the
+	 * task_struct struct, and guest FPU contexts. While running a VCPU,
+	 * the VCPU thread will have the guest FPU context.
 	 *
 	 * Note that while the PKRU state lives inside the fpu registers,
 	 * it is switched out separately at VMENTER and VMEXIT time. The
 	 * "guest_fpu" state here contains the guest FPU context, with the
 	 * host PRKU bits.
 	 */
-	struct fpu user_fpu;
 	struct fpu guest_fpu;

 	u64 xcr0;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bdcb5babfb68..ff77514f7367 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7999,7 +7999,7 @@ static int complete_emulated_mmio(struct kvm_vcpu *vcpu)
 static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
 {
 	preempt_disable();
-	copy_fpregs_to_fpstate(&vcpu->arch.user_fpu);
+	copy_fpregs_to_fpstate(&current->thread.fpu);
 	/* PKRU is separately restored in kvm_x86_ops->run. */
 	__copy_kernel_to_fpregs(&vcpu->arch.guest_fpu.state,
 				~XFEATURE_MASK_PKRU);
@@ -8012,7 +8012,7 @@ static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
 {
 	preempt_disable();
 	copy_fpregs_to_fpstate(&vcpu->arch.guest_fpu);
-	copy_kernel_to_fpregs(&vcpu->arch.user_fpu.state);
+	copy_kernel_to_fpregs(&current->thread.fpu.state);
 	preempt_enable();
 	++vcpu->stat.fpu_reload;
 	trace_kvm_fpu(0);
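One subtlety this patch relies on but does not spell out: the save of the
user FPU state and the load of the guest state must both happen under
preempt_disable(). This is a comment-only sketch reasoning from the
kernel's general FPU-switching rules, not text from the patch:

	/*
	 * Why the preempt_disable()/preempt_enable() pair matters: the
	 * scheduler's context-switch path also saves the running task's
	 * FPU registers into current->thread.fpu. If the vCPU thread
	 * were preempted between
	 *
	 *     copy_fpregs_to_fpstate(&current->thread.fpu);   // save user
	 *     __copy_kernel_to_fpregs(&vcpu->arch.guest_fpu.state,
	 *                             ~XFEATURE_MASK_PKRU);   // load guest
	 *
	 * the context switch could clobber the state that was just saved,
	 * or rewrite the registers underneath the load. Disabling
	 * preemption makes the save-then-load pair atomic with respect to
	 * the scheduler.
	 */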

From patchwork Wed Oct 31 13:26:32 2018
X-Patchwork-Submitter: Marc Orr
X-Patchwork-Id: 10662677
Date: Wed, 31 Oct 2018 06:26:32 -0700
Message-Id: <20181031132634.50440-3-marcorr@google.com>
In-Reply-To: <20181031132634.50440-1-marcorr@google.com>
Subject: [kvm PATCH v5 2/4] kvm: x86: Dynamically allocate guest_fpu
From: Marc Orr
To: kvm@vger.kernel.org, jmattson@google.com, rientjes@google.com,
    konrad.wilk@oracle.com, linux-mm@kvack.org, akpm@linux-foundation.org,
    pbonzini@redhat.com, rkrcmar@redhat.com, willy@infradead.org,
    sean.j.christopherson@intel.com, dave.hansen@linux.intel.com,
    kernellwp@gmail.com
Cc: Marc Orr

Previously, the guest_fpu field was embedded in the kvm_vcpu_arch struct.
Unfortunately, the field is quite large (e.g., 4352 bytes on my current
setup). This bloats the x86 vcpu structs that contain it into an order 3
memory allocation, which can become a problem on overcommitted machines.
Thus, this patch moves the fpu state out of the kvm_vcpu_arch struct and
allocates it dynamically. With this patch applied, the vcpu_vmx struct is
reduced to 15168 bytes on my setup when building the kernel with
kvmconfig.

Suggested-by: Dave Hansen
Signed-off-by: Marc Orr
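To make the order arithmetic concrete, here is a back-of-the-envelope
check using only the sizes quoted in these two commit messages and 4 KiB
pages (an editorial illustration, not a measurement of any other
configuration):

	/*
	 * Page-allocation orders with 4 KiB pages:
	 *   order 2 = 4 pages = 16384 bytes
	 *   order 3 = 8 pages = 32768 bytes
	 *
	 * sizeof(struct vcpu_vmx) after patch 1/4:   19520 bytes -> order 3
	 * minus the ~4352-byte embedded guest_fpu:   15168 bytes -> order 2
	 *
	 * i.e., get_order(19520) == 3 but get_order(15168) == 2, so pulling
	 * the fpu state out of the struct is what lets the vcpu allocation
	 * drop an order.
	 */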
---
 arch/x86/include/asm/kvm_host.h |  3 ++-
 arch/x86/kvm/svm.c              | 10 ++++++++
 arch/x86/kvm/vmx.c              | 10 ++++++++
 arch/x86/kvm/x86.c              | 45 +++++++++++++++++++++++----------
 4 files changed, 54 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ebb1d7a755d4..c8a2a263f91f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -610,7 +610,7 @@ struct kvm_vcpu_arch {
 	 * "guest_fpu" state here contains the guest FPU context, with the
 	 * host PRKU bits.
 	 */
-	struct fpu guest_fpu;
+	struct fpu *guest_fpu;

 	u64 xcr0;
 	u64 guest_supported_xcr0;
@@ -1194,6 +1194,7 @@ struct kvm_arch_async_pf {
 };

 extern struct kvm_x86_ops *kvm_x86_ops;
+extern struct kmem_cache *x86_fpu_cache;

 #define __KVM_HAVE_ARCH_VM_ALLOC
 static inline struct kvm *kvm_arch_alloc_vm(void)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f416f5c7f2ae..ac0c52ca22c6 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2121,6 +2121,13 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
 		goto out;
 	}

+	svm->vcpu.arch.guest_fpu = kmem_cache_zalloc(x86_fpu_cache, GFP_KERNEL);
+	if (!svm->vcpu.arch.guest_fpu) {
+		printk(KERN_ERR "kvm: failed to allocate vcpu's fpu\n");
+		err = -ENOMEM;
+		goto free_partial_svm;
+	}
+
 	err = kvm_vcpu_init(&svm->vcpu, kvm, id);
 	if (err)
 		goto free_svm;
@@ -2180,6 +2187,8 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
 uninit:
 	kvm_vcpu_uninit(&svm->vcpu);
 free_svm:
+	kmem_cache_free(x86_fpu_cache, svm->vcpu.arch.guest_fpu);
+free_partial_svm:
 	kmem_cache_free(kvm_vcpu_cache, svm);
 out:
 	return ERR_PTR(err);
@@ -2194,6 +2203,7 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
 	__free_page(virt_to_page(svm->nested.hsave));
 	__free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
 	kvm_vcpu_uninit(vcpu);
+	kmem_cache_free(x86_fpu_cache, svm->vcpu.arch.guest_fpu);
 	kmem_cache_free(kvm_vcpu_cache, svm);
 	/*
 	 * The vmcb page can be recycled, causing a false negative in
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index abeeb45d1c33..4078cf15a4b0 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11476,6 +11476,7 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu)
 	free_loaded_vmcs(vmx->loaded_vmcs);
 	kfree(vmx->guest_msrs);
 	kvm_vcpu_uninit(vcpu);
+	kmem_cache_free(x86_fpu_cache, vmx->vcpu.arch.guest_fpu);
 	kmem_cache_free(kvm_vcpu_cache, vmx);
 }

@@ -11489,6 +11490,13 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 	if (!vmx)
 		return ERR_PTR(-ENOMEM);

+	vmx->vcpu.arch.guest_fpu = kmem_cache_zalloc(x86_fpu_cache, GFP_KERNEL);
+	if (!vmx->vcpu.arch.guest_fpu) {
+		printk(KERN_ERR "kvm: failed to allocate vcpu's fpu\n");
+		err = -ENOMEM;
+		goto free_partial_vcpu;
+	}
+
 	vmx->vpid = allocate_vpid();

 	err = kvm_vcpu_init(&vmx->vcpu, kvm, id);
@@ -11576,6 +11584,8 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 	kvm_vcpu_uninit(&vmx->vcpu);
 free_vcpu:
 	free_vpid(vmx->vpid);
+	kmem_cache_free(x86_fpu_cache, vmx->vcpu.arch.guest_fpu);
+free_partial_vcpu:
 	kmem_cache_free(kvm_vcpu_cache, vmx);
 	return ERR_PTR(err);
 }
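Both create-path hunks above follow the standard kernel unwind idiom: each
allocation gets a label placed so that a failure at step N frees exactly
the N-1 earlier allocations, in reverse order. A generic sketch of the
pattern (all names here are hypothetical, not from the patch):

	#include <linux/slab.h>
	#include <linux/err.h>

	/* Hypothetical illustration types and caches. */
	struct bar { int dummy; };
	struct foo { struct bar *bar; };
	static struct kmem_cache *foo_cache, *bar_cache;
	static int foo_init(struct foo *foo) { return 0; }

	static struct foo *foo_create(void)
	{
		int err;
		struct foo *foo = kmem_cache_zalloc(foo_cache, GFP_KERNEL);

		if (!foo)
			return ERR_PTR(-ENOMEM);

		foo->bar = kmem_cache_zalloc(bar_cache, GFP_KERNEL);
		if (!foo->bar) {
			err = -ENOMEM;
			goto free_partial_foo; /* only foo exists so far */
		}

		err = foo_init(foo);
		if (err)
			goto free_bar; /* unwind in reverse allocation order */

		return foo;

	free_bar:
		kmem_cache_free(bar_cache, foo->bar);
	free_partial_foo:
		kmem_cache_free(foo_cache, foo);
		return ERR_PTR(err);
	}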
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ff77514f7367..420516f0749a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -213,6 +213,9 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {

 u64 __read_mostly host_xcr0;

+struct kmem_cache *x86_fpu_cache;
+EXPORT_SYMBOL_GPL(x86_fpu_cache);
+
 static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt);

 static inline void kvm_async_pf_hash_reset(struct kvm_vcpu *vcpu)
@@ -3635,7 +3638,7 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,

 static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)
 {
-	struct xregs_state *xsave = &vcpu->arch.guest_fpu.state.xsave;
+	struct xregs_state *xsave = &vcpu->arch.guest_fpu->state.xsave;
 	u64 xstate_bv = xsave->header.xfeatures;
 	u64 valid;

@@ -3677,7 +3680,7 @@ static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)

 static void load_xsave(struct kvm_vcpu *vcpu, u8 *src)
 {
-	struct xregs_state *xsave = &vcpu->arch.guest_fpu.state.xsave;
+	struct xregs_state *xsave = &vcpu->arch.guest_fpu->state.xsave;
 	u64 xstate_bv = *(u64 *)(src + XSAVE_HDR_OFFSET);
 	u64 valid;

@@ -3725,7 +3728,7 @@ static void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
 		fill_xsave((u8 *) guest_xsave->region, vcpu);
 	} else {
 		memcpy(guest_xsave->region,
-			&vcpu->arch.guest_fpu.state.fxsave,
+			&vcpu->arch.guest_fpu->state.fxsave,
 			sizeof(struct fxregs_state));
 		*(u64 *)&guest_xsave->region[XSAVE_HDR_OFFSET / sizeof(u32)] =
 			XFEATURE_MASK_FPSSE;
@@ -3755,7 +3758,7 @@ static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
 		if (xstate_bv & ~XFEATURE_MASK_FPSSE ||
 			mxcsr & ~mxcsr_feature_mask)
 			return -EINVAL;
-		memcpy(&vcpu->arch.guest_fpu.state.fxsave,
+		memcpy(&vcpu->arch.guest_fpu->state.fxsave,
 			guest_xsave->region, sizeof(struct fxregs_state));
 	}
 	return 0;
@@ -6819,10 +6822,23 @@ int kvm_arch_init(void *opaque)
 	}

 	r = -ENOMEM;
+	x86_fpu_cache = kmem_cache_create_usercopy(
+				"x86_fpu",
+				sizeof(struct fpu),
+				__alignof__(struct fpu),
+				SLAB_ACCOUNT,
+				offsetof(struct fpu, state),
+				sizeof_field(struct fpu, state),
+				NULL);
+	if (!x86_fpu_cache) {
+		printk(KERN_ERR "kvm: failed to allocate cache for x86 fpu\n");
+		goto out;
+	}
+
 	shared_msrs = alloc_percpu(struct kvm_shared_msrs);
 	if (!shared_msrs) {
 		printk(KERN_ERR "kvm: failed to allocate percpu kvm_shared_msrs\n");
-		goto out;
+		goto out_free_x86_fpu_cache;
 	}

 	r = kvm_mmu_module_init();
@@ -6855,6 +6871,8 @@ int kvm_arch_init(void *opaque)

 out_free_percpu:
 	free_percpu(shared_msrs);
+out_free_x86_fpu_cache:
+	kmem_cache_destroy(x86_fpu_cache);
 out:
 	return r;
 }
@@ -6878,6 +6896,7 @@ void kvm_arch_exit(void)
 	kvm_x86_ops = NULL;
 	kvm_mmu_module_exit();
 	free_percpu(shared_msrs);
+	kmem_cache_destroy(x86_fpu_cache);
 }

 int kvm_vcpu_halt(struct kvm_vcpu *vcpu)
@@ -8001,7 +8020,7 @@ static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
 	preempt_disable();
 	copy_fpregs_to_fpstate(&current->thread.fpu);
 	/* PKRU is separately restored in kvm_x86_ops->run. */
-	__copy_kernel_to_fpregs(&vcpu->arch.guest_fpu.state,
+	__copy_kernel_to_fpregs(&vcpu->arch.guest_fpu->state,
 				~XFEATURE_MASK_PKRU);
 	preempt_enable();
 	trace_kvm_fpu(1);
@@ -8011,7 +8030,7 @@ static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
 static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
 {
 	preempt_disable();
-	copy_fpregs_to_fpstate(&vcpu->arch.guest_fpu);
+	copy_fpregs_to_fpstate(vcpu->arch.guest_fpu);
 	copy_kernel_to_fpregs(&current->thread.fpu.state);
 	preempt_enable();
 	++vcpu->stat.fpu_reload;
@@ -8506,7 +8525,7 @@ int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)

 	vcpu_load(vcpu);

-	fxsave = &vcpu->arch.guest_fpu.state.fxsave;
+	fxsave = &vcpu->arch.guest_fpu->state.fxsave;
 	memcpy(fpu->fpr, fxsave->st_space, 128);
 	fpu->fcw = fxsave->cwd;
 	fpu->fsw = fxsave->swd;
@@ -8526,7 +8545,7 @@ int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)

 	vcpu_load(vcpu);

-	fxsave = &vcpu->arch.guest_fpu.state.fxsave;
+	fxsave = &vcpu->arch.guest_fpu->state.fxsave;

 	memcpy(fxsave->st_space, fpu->fpr, 128);
 	fxsave->cwd = fpu->fcw;
@@ -8582,9 +8601,9 @@ static int sync_regs(struct kvm_vcpu *vcpu)

 static void fx_init(struct kvm_vcpu *vcpu)
 {
-	fpstate_init(&vcpu->arch.guest_fpu.state);
+	fpstate_init(&vcpu->arch.guest_fpu->state);
 	if (boot_cpu_has(X86_FEATURE_XSAVES))
-		vcpu->arch.guest_fpu.state.xsave.header.xcomp_bv =
+		vcpu->arch.guest_fpu->state.xsave.header.xcomp_bv =
 			host_xcr0 | XSTATE_COMPACTION_ENABLED;

 	/*
@@ -8708,11 +8727,11 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	 */
 	if (init_event)
 		kvm_put_guest_fpu(vcpu);
-	mpx_state_buffer = get_xsave_addr(&vcpu->arch.guest_fpu.state.xsave,
+	mpx_state_buffer = get_xsave_addr(&vcpu->arch.guest_fpu->state.xsave,
 					XFEATURE_MASK_BNDREGS);
 	if (mpx_state_buffer)
 		memset(mpx_state_buffer, 0, sizeof(struct mpx_bndreg_state));
-	mpx_state_buffer = get_xsave_addr(&vcpu->arch.guest_fpu.state.xsave,
+	mpx_state_buffer = get_xsave_addr(&vcpu->arch.guest_fpu->state.xsave,
 					XFEATURE_MASK_BNDCSR);
 	if (mpx_state_buffer)
 		memset(mpx_state_buffer, 0, sizeof(struct mpx_bndcsr));
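The guest_fpu cache above is created with kmem_cache_create_usercopy()
rather than plain kmem_cache_create() for a reason the diff leaves
implicit: with CONFIG_HARDENED_USERCOPY, copy_to_user()/copy_from_user()
against slab memory is only permitted within an explicitly whitelisted
window of the object, and the KVM_GET_XSAVE/KVM_SET_XSAVE paths copy the
register image in fpu.state to and from userspace. The same call from the
diff, re-annotated argument by argument:

	x86_fpu_cache = kmem_cache_create_usercopy(
			"x86_fpu",                       /* name in /proc/slabinfo  */
			sizeof(struct fpu),              /* object size             */
			__alignof__(struct fpu),         /* object alignment        */
			SLAB_ACCOUNT,                    /* charge objects to memcg */
			offsetof(struct fpu, state),     /* usercopy window: offset */
			sizeof_field(struct fpu, state), /* usercopy window: size   */
			NULL);                           /* no constructor          */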

From patchwork Wed Oct 31 13:26:33 2018
X-Patchwork-Submitter: Marc Orr
X-Patchwork-Id: 10662679
Date: Wed, 31 Oct 2018 06:26:33 -0700
Message-Id: <20181031132634.50440-4-marcorr@google.com>
In-Reply-To: <20181031132634.50440-1-marcorr@google.com>
Subject: [kvm PATCH v5 3/4] kvm: vmx: refactor vmx_msrs struct for vmalloc
From: Marc Orr
To: kvm@vger.kernel.org, jmattson@google.com, rientjes@google.com,
    konrad.wilk@oracle.com, linux-mm@kvack.org, akpm@linux-foundation.org,
    pbonzini@redhat.com, rkrcmar@redhat.com, willy@infradead.org,
    sean.j.christopherson@intel.com, dave.hansen@linux.intel.com,
    kernellwp@gmail.com
Cc: Marc Orr

Previously, the vmx_msrs struct relied on being embedded in a struct
backed by the direct map (e.g., memory allocated with kmalloc()).
Specifically, that is what allowed the virtual addresses associated with
the struct to be translated to physical addresses. However, we'd like to
refactor the host struct, vcpu_vmx, to be allocated with vmalloc(), so
that allocation will succeed even when contiguous physical memory is
scarce. Thus, this patch refactors how vmx_msrs is declared and allocated,
to ensure that it can still be mapped to the physical address space even
when vmx_msrs resides within a vmalloc()'d struct.

Signed-off-by: Marc Orr
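The page-boundary requirement the following diff enforces rests on a small
alignment fact: if an object's size s is a power of two with
s <= PAGE_SIZE, and its address is s-aligned, the object cannot straddle a
page. A sketch of the argument, with an illustrative helper that is not
part of the patch:

	#include <linux/mm.h>	/* PAGE_SIZE */

	/*
	 * If addr is s-aligned and s is a power of two <= PAGE_SIZE, then
	 * addr % PAGE_SIZE is itself a multiple of s, hence at most
	 * PAGE_SIZE - s, so [addr, addr + s) stays within one page. kmem
	 * caches whose alignment equals their power-of-two object size
	 * satisfy the precondition, which is why the patch sizes and
	 * aligns the vmx_msr_entry cache the way it does.
	 */
	static inline bool fits_in_one_page(unsigned long addr, size_t s)
	{
		return (addr & (PAGE_SIZE - 1)) + s <= PAGE_SIZE;
	}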
---
 arch/x86/kvm/vmx.c | 57 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 55 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 4078cf15a4b0..315cf4b5f262 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -970,8 +970,25 @@ static inline int pi_test_sn(struct pi_desc *pi_desc)

 struct vmx_msrs {
 	unsigned int		nr;
-	struct vmx_msr_entry	val[NR_AUTOLOAD_MSRS];
+	struct vmx_msr_entry	*val;
 };
+struct kmem_cache *vmx_msr_entry_cache;
+
+/*
+ * To prevent a vmx_msr_entry array from crossing a page boundary, require
+ * sizeof(*vmx_msrs.vmx_msr_entry.val) to be a power of two. This is
+ * guaranteed through compile-time asserts that:
+ *   - NR_AUTOLOAD_MSRS * sizeof(struct vmx_msr_entry) is a power of two
+ *   - NR_AUTOLOAD_MSRS * sizeof(struct vmx_msr_entry) <= PAGE_SIZE
+ *   - The allocation of vmx_msrs.vmx_msr_entry.val is aligned to its size.
+ */
+#define CHECK_POWER_OF_TWO(val) \
+	BUILD_BUG_ON_MSG(!((val) && !((val) & ((val) - 1))), \
+			 #val " is not a power of two.")
+#define CHECK_INTRA_PAGE(val) do { \
+		CHECK_POWER_OF_TWO(val); \
+		BUILD_BUG_ON(!(val <= PAGE_SIZE)); \
+	} while (0)

 struct vcpu_vmx {
 	struct kvm_vcpu       vcpu;
@@ -11497,6 +11514,19 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 		goto free_partial_vcpu;
 	}

+	vmx->msr_autoload.guest.val =
+		kmem_cache_zalloc(vmx_msr_entry_cache, GFP_KERNEL);
+	if (!vmx->msr_autoload.guest.val) {
+		err = -ENOMEM;
+		goto free_fpu;
+	}
+	vmx->msr_autoload.host.val =
+		kmem_cache_zalloc(vmx_msr_entry_cache, GFP_KERNEL);
+	if (!vmx->msr_autoload.host.val) {
+		err = -ENOMEM;
+		goto free_msr_autoload_guest;
+	}
+
 	vmx->vpid = allocate_vpid();

 	err = kvm_vcpu_init(&vmx->vcpu, kvm, id);
@@ -11584,6 +11614,10 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 	kvm_vcpu_uninit(&vmx->vcpu);
 free_vcpu:
 	free_vpid(vmx->vpid);
+	kmem_cache_free(vmx_msr_entry_cache, vmx->msr_autoload.host.val);
+free_msr_autoload_guest:
+	kmem_cache_free(vmx_msr_entry_cache, vmx->msr_autoload.guest.val);
+free_fpu:
 	kmem_cache_free(x86_fpu_cache, vmx->vcpu.arch.guest_fpu);
 free_partial_vcpu:
 	kmem_cache_free(kvm_vcpu_cache, vmx);
@@ -15163,6 +15197,10 @@ module_exit(vmx_exit);
 static int __init vmx_init(void)
 {
 	int r;
+	size_t vmx_msr_entry_size =
+		sizeof(struct vmx_msr_entry) * NR_AUTOLOAD_MSRS;
+
+	CHECK_INTRA_PAGE(vmx_msr_entry_size);

 #if IS_ENABLED(CONFIG_HYPERV)
 	/*
@@ -15194,9 +15232,21 @@ static int __init vmx_init(void)
 #endif

 	r = kvm_init(&vmx_x86_ops, sizeof(struct vcpu_vmx),
-		__alignof__(struct vcpu_vmx), THIS_MODULE);
+		     __alignof__(struct vcpu_vmx), THIS_MODULE);
 	if (r)
 		return r;
+	/*
+	 * A vmx_msr_entry array resides exclusively within the kernel. Thus,
+	 * use kmem_cache_create_usercopy(), with the usersize argument set to
+	 * ZERO, to blacklist copying vmx_msr_entry to/from user space.
+	 */
+	vmx_msr_entry_cache =
+		kmem_cache_create_usercopy("vmx_msr_entry", vmx_msr_entry_size,
+					   vmx_msr_entry_size, SLAB_ACCOUNT,
+					   0, 0, NULL);
+	if (!vmx_msr_entry_cache) {
+		r = -ENOMEM;
+		goto out;
+	}

 	/*
 	 * Must be called after kvm_init() so enable_ept is properly set
@@ -15220,5 +15270,8 @@ static int __init vmx_init(void)
 	vmx_check_vmcs12_offsets();

 	return 0;
+out:
+	kvm_exit();
+	return r;
 }
 module_init(vmx_init);
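A small point about the CHECK_INTRA_PAGE() machinery above: BUILD_BUG_ON()
expands to a statement, so it can only appear inside a function, which is
why the check runs from vmx_init() rather than next to the struct
definition. Reduced to its essentials (the size constant here is a
hypothetical stand-in for vmx_msr_entry_size):

	#include <linux/build_bug.h>
	#include <linux/mm.h>

	#define ENTRY_BYTES 512	/* hypothetical */

	static int __init example_init(void)
	{
		/* Violations fail the build, not the boot. */
		BUILD_BUG_ON(ENTRY_BYTES & (ENTRY_BYTES - 1)); /* power of 2 */
		BUILD_BUG_ON(ENTRY_BYTES > PAGE_SIZE);	/* fits in a page */
		return 0;
	}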

From patchwork Wed Oct 31 13:26:34 2018
X-Patchwork-Submitter: Marc Orr
X-Patchwork-Id: 10662683
Date: Wed, 31 Oct 2018 06:26:34 -0700
Message-Id: <20181031132634.50440-5-marcorr@google.com>
In-Reply-To: <20181031132634.50440-1-marcorr@google.com>
Subject: [kvm PATCH v5 4/4] kvm: vmx: use vmalloc() to allocate vcpus
From: Marc Orr
To: kvm@vger.kernel.org, jmattson@google.com, rientjes@google.com,
    konrad.wilk@oracle.com, linux-mm@kvack.org, akpm@linux-foundation.org,
    pbonzini@redhat.com, rkrcmar@redhat.com, willy@infradead.org,
    sean.j.christopherson@intel.com, dave.hansen@linux.intel.com,
    kernellwp@gmail.com
Cc: Marc Orr

Previously, vcpus were allocated through the kmem_cache_zalloc() API,
which requires the underlying physical memory to be contiguous. Because
the x86 vcpu struct, struct vcpu_vmx, is relatively large (e.g., currently
47680 bytes on my setup), it can become hard to find contiguous memory. At
the same time, the comments in the code indicate that the primary reason
for using the kmem_cache_zalloc() API is to align the memory, rather than
to provide physical contiguity. Thus, this patch updates the vcpu
allocation logic for vmx to use the vmalloc() API.

Signed-off-by: Marc Orr
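The cost of switching to vmalloc(), handled in the first hunks below, is
that the allocation is only virtually contiguous, so any field whose
physical address is handed to hardware (here, the posted-interrupt
descriptor) must be translated through the page tables rather than with
__pa(). A minimal sketch of that translation, valid only when the object
does not cross a page boundary (true for the 64-byte, 64-byte-aligned
pi_desc):

	#include <linux/vmalloc.h>
	#include <linux/mm.h>

	/*
	 * __pa() only works for direct-map addresses. For a vmalloc()'d
	 * pointer, look up the backing page and add the offset within it.
	 */
	static phys_addr_t vmalloc_obj_to_phys(void *p)
	{
		return page_to_phys(vmalloc_to_page(p)) + offset_in_page(p);
	}

offset_in_page() is equivalent to the "% PAGE_SIZE" arithmetic the patch
open-codes in vmx_vcpu_setup().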
---
 arch/x86/kvm/vmx.c  | 37 ++++++++++++++++++++++++++++++-------
 virt/kvm/kvm_main.c | 28 ++++++++++++++++------------
 2 files changed, 46 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 315cf4b5f262..af651540ee45 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -898,7 +898,14 @@ struct nested_vmx {
 #define POSTED_INTR_ON  0
 #define POSTED_INTR_SN  1

-/* Posted-Interrupt Descriptor */
+/*
+ * Posted-Interrupt Descriptor
+ *
+ * Note, the physical address of this structure is used by VMX. Furthermore, the
+ * translation code assumes that the entire pi_desc struct resides within a
+ * single page, which will be true because the struct is 64 bytes and 64-byte
+ * aligned.
+ */
 struct pi_desc {
 	u32 pir[8];     /* Posted interrupt requested */
 	union {
@@ -6633,6 +6640,14 @@ static void vmx_vcpu_setup(struct vcpu_vmx *vmx)
 	}

 	if (kvm_vcpu_apicv_active(&vmx->vcpu)) {
+		/*
+		 * Note, pi_desc is contained within a single
+		 * page because the struct is 64 bytes and 64-byte aligned.
+		 */
+		phys_addr_t pi_desc_phys =
+			page_to_phys(vmalloc_to_page(&vmx->pi_desc)) +
+			(u64)&vmx->pi_desc % PAGE_SIZE;
+
 		vmcs_write64(EOI_EXIT_BITMAP0, 0);
 		vmcs_write64(EOI_EXIT_BITMAP1, 0);
 		vmcs_write64(EOI_EXIT_BITMAP2, 0);
@@ -6641,7 +6656,7 @@ static void vmx_vcpu_setup(struct vcpu_vmx *vmx)

 		vmcs_write16(GUEST_INTR_STATUS, 0);

 		vmcs_write16(POSTED_INTR_NV, POSTED_INTR_VECTOR);
-		vmcs_write64(POSTED_INTR_DESC_ADDR, __pa((&vmx->pi_desc)));
+		vmcs_write64(POSTED_INTR_DESC_ADDR, pi_desc_phys);
 	}

 	if (!kvm_pause_in_guest(vmx->vcpu.kvm)) {
@@ -11494,13 +11509,18 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu)
 	kfree(vmx->guest_msrs);
 	kvm_vcpu_uninit(vcpu);
 	kmem_cache_free(x86_fpu_cache, vmx->vcpu.arch.guest_fpu);
-	kmem_cache_free(kvm_vcpu_cache, vmx);
+	kmem_cache_free(vmx_msr_entry_cache, vmx->msr_autoload.guest.val);
+	kmem_cache_free(vmx_msr_entry_cache, vmx->msr_autoload.host.val);
+	vfree(vmx);
 }

 static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 {
 	int err;
-	struct vcpu_vmx *vmx = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL);
+	struct vcpu_vmx *vmx =
+		__vmalloc(sizeof(struct vcpu_vmx),
+			  GFP_KERNEL | __GFP_ZERO | __GFP_ACCOUNT,
+			  PAGE_KERNEL);
 	unsigned long *msr_bitmap;
 	int cpu;

@@ -11620,7 +11640,7 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 free_fpu:
 	kmem_cache_free(x86_fpu_cache, vmx->vcpu.arch.guest_fpu);
 free_partial_vcpu:
-	kmem_cache_free(kvm_vcpu_cache, vmx);
+	vfree(vmx);
 	return ERR_PTR(err);
 }

@@ -15231,8 +15251,11 @@ static int __init vmx_init(void)
 	}
 #endif

-	r = kvm_init(&vmx_x86_ops, sizeof(struct vcpu_vmx),
-		     __alignof__(struct vcpu_vmx), THIS_MODULE);
+	/*
+	 * Disable kmem cache; vmalloc will be used instead
+	 * to avoid OOM'ing when memory is available but not contiguous.
+	 */
+	r = kvm_init(&vmx_x86_ops, 0, 0, THIS_MODULE);
 	if (r)
 		return r;

 	/*
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 786ade1843a2..8b979e7c3ecd 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4038,18 +4038,22 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 		goto out_free_2;
 	register_reboot_notifier(&kvm_reboot_notifier);

-	/* A kmem cache lets us meet the alignment requirements of fx_save. */
-	if (!vcpu_align)
-		vcpu_align = __alignof__(struct kvm_vcpu);
-	kvm_vcpu_cache =
-		kmem_cache_create_usercopy("kvm_vcpu", vcpu_size, vcpu_align,
-					   SLAB_ACCOUNT,
-					   offsetof(struct kvm_vcpu, arch),
-					   sizeof_field(struct kvm_vcpu, arch),
-					   NULL);
-	if (!kvm_vcpu_cache) {
-		r = -ENOMEM;
-		goto out_free_3;
+	/*
+	 * When vcpu_size is zero,
+	 * architecture-specific code manages its own vcpu allocation.
+	 */
+	kvm_vcpu_cache = NULL;
+	if (vcpu_size) {
+		if (!vcpu_align)
+			vcpu_align = __alignof__(struct kvm_vcpu);
+		kvm_vcpu_cache = kmem_cache_create_usercopy(
+			"kvm_vcpu", vcpu_size, vcpu_align, SLAB_ACCOUNT,
+			offsetof(struct kvm_vcpu, arch),
+			sizeof_field(struct kvm_vcpu, arch), NULL);
+		if (!kvm_vcpu_cache) {
+			r = -ENOMEM;
+			goto out_free_3;
+		}
 	}

 	r = kvm_async_pf_init();