From patchwork Mon Feb 11 19:02:49 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 10806707
Date: Mon, 11 Feb 2019 11:02:49 -0800
Message-Id: <20190211190252.198101-1-bgardon@google.com>
X-Mailer: git-send-email 2.20.1.791.gb4d0f1c61a-goog
Subject: [PATCH v3 1/4] kvm: Add memcg accounting to KVM allocations
From: Ben Gardon
To: Paolo Bonzini, Sean Christopherson
Cc: Radim Krčmář, Shakeel Butt, kvm@vger.kernel.org, Ben Gardon
Sender: kvm-owner@vger.kernel.org
X-Mailing-List: kvm@vger.kernel.org

There are many KVM kernel memory allocations which are tied to the life of
the VM process and should be charged to the VM process's cgroup. If the
allocations aren't tied to the process, the OOM killer will not know
that killing the process will free the associated kernel memory.

Add __GFP_ACCOUNT flags to many of the allocations which are not yet being
charged to the VM process's cgroup.

Tested:
	Ran all kvm-unit-tests on a 64-bit Haswell machine; the patch
	introduced no new failures.
	Ran a kernel memory accounting test which creates a VM to touch
	memory and then checks that the kernel memory allocated for the
	process is within certain bounds.
	With this patch we account for much more of the vmalloc and slab
	memory allocated for the VM.

There remain a few allocations which should be charged to the VM's
cgroup but are not. They include:
	vcpu->run
	kvm->coalesced_mmio_ring
These allocations are unaccounted in this patch because they are mapped
to userspace, and accounting them to a cgroup causes problems. This
should be addressed in a future patch.

Signed-off-by: Ben Gardon
Reviewed-by: Shakeel Butt
---
 virt/kvm/coalesced_mmio.c |  3 ++-
 virt/kvm/eventfd.c        |  7 ++++---
 virt/kvm/irqchip.c        |  4 ++--
 virt/kvm/kvm_main.c       | 29 +++++++++++++++--------------
 virt/kvm/vfio.c           |  4 ++--
 5 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
index 6855cce3e5287..5294abb3f1788 100644
--- a/virt/kvm/coalesced_mmio.c
+++ b/virt/kvm/coalesced_mmio.c
@@ -144,7 +144,8 @@ int kvm_vm_ioctl_register_coalesced_mmio(struct kvm *kvm,
 	if (zone->pio != 1 && zone->pio != 0)
 		return -EINVAL;
 
-	dev = kzalloc(sizeof(struct kvm_coalesced_mmio_dev), GFP_KERNEL);
+	dev = kzalloc(sizeof(struct kvm_coalesced_mmio_dev),
+		      GFP_KERNEL_ACCOUNT);
 	if (!dev)
 		return -ENOMEM;
 
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index b20b751286fc6..4325250afd728 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -297,7 +297,7 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
 	if (!kvm_arch_intc_initialized(kvm))
 		return -EAGAIN;
 
-	irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
+	irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL_ACCOUNT);
 	if (!irqfd)
 		return -ENOMEM;
 
@@ -345,7 +345,8 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
 		}
 
 		if (!irqfd->resampler) {
-			resampler = kzalloc(sizeof(*resampler), GFP_KERNEL);
+			resampler = kzalloc(sizeof(*resampler),
+					    GFP_KERNEL_ACCOUNT);
 			if (!resampler) {
 				ret = -ENOMEM;
 				mutex_unlock(&kvm->irqfds.resampler_lock);
@@ -797,7 +798,7 @@ static int kvm_assign_ioeventfd_idx(struct kvm *kvm,
 	if (IS_ERR(eventfd))
 		return PTR_ERR(eventfd);
 
-	p = kzalloc(sizeof(*p), GFP_KERNEL);
+	p = kzalloc(sizeof(*p), GFP_KERNEL_ACCOUNT);
 	if (!p) {
 		ret = -ENOMEM;
 		goto fail;
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index b1286c4e07122..3547b0d8c91ea 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -196,7 +196,7 @@ int kvm_set_irq_routing(struct kvm *kvm,
 	nr_rt_entries += 1;
 
 	new = kzalloc(sizeof(*new) + (nr_rt_entries * sizeof(struct hlist_head)),
-		      GFP_KERNEL);
+		      GFP_KERNEL_ACCOUNT);
 	if (!new)
 		return -ENOMEM;
 
@@ -208,7 +208,7 @@ int kvm_set_irq_routing(struct kvm *kvm,
 
 	for (i = 0; i < nr; ++i) {
 		r = -ENOMEM;
-		e = kzalloc(sizeof(*e), GFP_KERNEL);
+		e = kzalloc(sizeof(*e), GFP_KERNEL_ACCOUNT);
 		if (!e)
 			goto out;
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1f888a103f788..0127dd5ff2eb3 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -525,7 +525,7 @@ static struct kvm_memslots *kvm_alloc_memslots(void)
 	int i;
 	struct kvm_memslots *slots;
 
-	slots = kvzalloc(sizeof(struct kvm_memslots), GFP_KERNEL);
+	slots = kvzalloc(sizeof(struct kvm_memslots), GFP_KERNEL_ACCOUNT);
 	if (!slots)
 		return NULL;
 
@@ -601,12 +601,12 @@ static int kvm_create_vm_debugfs(struct kvm *kvm, int fd)
 
 	kvm->debugfs_stat_data = kcalloc(kvm_debugfs_num_entries,
 					 sizeof(*kvm->debugfs_stat_data),
-					 GFP_KERNEL);
+					 GFP_KERNEL_ACCOUNT);
 	if (!kvm->debugfs_stat_data)
 		return -ENOMEM;
 
 	for (p = debugfs_entries; p->name; p++) {
-		stat_data = kzalloc(sizeof(*stat_data), GFP_KERNEL);
+		stat_data = kzalloc(sizeof(*stat_data), GFP_KERNEL_ACCOUNT);
 		if (!stat_data)
 			return -ENOMEM;
 
@@ -671,7 +671,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
 		goto out_err_no_irq_srcu;
 	for (i = 0; i < KVM_NR_BUSES; i++) {
 		rcu_assign_pointer(kvm->buses[i],
-			kzalloc(sizeof(struct kvm_io_bus), GFP_KERNEL));
+			kzalloc(sizeof(struct kvm_io_bus), GFP_KERNEL_ACCOUNT));
 		if (!kvm->buses[i])
 			goto out_err;
 	}
@@ -789,7 +789,7 @@ static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
 {
 	unsigned long dirty_bytes = 2 * kvm_dirty_bitmap_bytes(memslot);
 
-	memslot->dirty_bitmap = kvzalloc(dirty_bytes, GFP_KERNEL);
+	memslot->dirty_bitmap = kvzalloc(dirty_bytes, GFP_KERNEL_ACCOUNT);
 	if (!memslot->dirty_bitmap)
 		return -ENOMEM;
 
@@ -1018,7 +1018,7 @@ int __kvm_set_memory_region(struct kvm *kvm,
 			goto out_free;
 	}
 
-	slots = kvzalloc(sizeof(struct kvm_memslots), GFP_KERNEL);
+	slots = kvzalloc(sizeof(struct kvm_memslots), GFP_KERNEL_ACCOUNT);
 	if (!slots)
 		goto out_free;
 	memcpy(slots, __kvm_memslots(kvm, as_id), sizeof(struct kvm_memslots));
@@ -2678,7 +2678,7 @@ static long kvm_vcpu_ioctl(struct file *filp,
 		struct kvm_regs *kvm_regs;
 
 		r = -ENOMEM;
-		kvm_regs = kzalloc(sizeof(struct kvm_regs), GFP_KERNEL);
+		kvm_regs = kzalloc(sizeof(struct kvm_regs), GFP_KERNEL_ACCOUNT);
 		if (!kvm_regs)
 			goto out;
 		r = kvm_arch_vcpu_ioctl_get_regs(vcpu, kvm_regs);
@@ -2706,7 +2706,8 @@ static long kvm_vcpu_ioctl(struct file *filp,
 		break;
 	}
 	case KVM_GET_SREGS: {
-		kvm_sregs = kzalloc(sizeof(struct kvm_sregs), GFP_KERNEL);
+		kvm_sregs = kzalloc(sizeof(struct kvm_sregs),
+				    GFP_KERNEL_ACCOUNT);
 		r = -ENOMEM;
 		if (!kvm_sregs)
 			goto out;
@@ -2798,7 +2799,7 @@ static long kvm_vcpu_ioctl(struct file *filp,
 		break;
 	}
 	case KVM_GET_FPU: {
-		fpu = kzalloc(sizeof(struct kvm_fpu), GFP_KERNEL);
+		fpu = kzalloc(sizeof(struct kvm_fpu), GFP_KERNEL_ACCOUNT);
 		r = -ENOMEM;
 		if (!fpu)
 			goto out;
@@ -2975,7 +2976,7 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
 	if (test)
 		return 0;
 
-	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+	dev = kzalloc(sizeof(*dev), GFP_KERNEL_ACCOUNT);
 	if (!dev)
 		return -ENOMEM;
 
@@ -3709,7 +3710,7 @@ int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
 		return -ENOSPC;
 
 	new_bus = kmalloc(sizeof(*bus) + ((bus->dev_count + 1) *
-			  sizeof(struct kvm_io_range)), GFP_KERNEL);
+			  sizeof(struct kvm_io_range)), GFP_KERNEL_ACCOUNT);
 	if (!new_bus)
 		return -ENOMEM;
 
@@ -3755,7 +3756,7 @@ void kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
 		return;
 
 	new_bus = kmalloc(sizeof(*bus) + ((bus->dev_count - 1) *
-			  sizeof(struct kvm_io_range)), GFP_KERNEL);
+			  sizeof(struct kvm_io_range)), GFP_KERNEL_ACCOUNT);
 	if (!new_bus) {
 		pr_err("kvm: failed to shrink bus, removing it completely\n");
 		goto broken;
@@ -4023,7 +4024,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
 	active = kvm_active_vms;
 	spin_unlock(&kvm_lock);
 
-	env = kzalloc(sizeof(*env), GFP_KERNEL);
+	env = kzalloc(sizeof(*env), GFP_KERNEL_ACCOUNT);
 	if (!env)
 		return;
 
@@ -4039,7 +4040,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
 	add_uevent_var(env, "PID=%d", kvm->userspace_pid);
 
 	if (kvm->debugfs_dentry) {
-		char *tmp, *p = kmalloc(PATH_MAX, GFP_KERNEL);
+		char *tmp, *p = kmalloc(PATH_MAX, GFP_KERNEL_ACCOUNT);
 
 		if (p) {
 			tmp = dentry_path_raw(kvm->debugfs_dentry, p, PATH_MAX);
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index d99850c462a18..524cbd20379fb 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -219,7 +219,7 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
 			}
 		}
 
-		kvg = kzalloc(sizeof(*kvg), GFP_KERNEL);
+		kvg = kzalloc(sizeof(*kvg), GFP_KERNEL_ACCOUNT);
 		if (!kvg) {
 			mutex_unlock(&kv->lock);
 			kvm_vfio_group_put_external_user(vfio_group);
@@ -405,7 +405,7 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 type)
 		if (tmp->ops == &kvm_vfio_ops)
 			return -EBUSY;
 
-	kv = kzalloc(sizeof(*kv), GFP_KERNEL);
+	kv = kzalloc(sizeof(*kv), GFP_KERNEL_ACCOUNT);
 	if (!kv)
 		return -ENOMEM;
 
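
Note for readers: every hunk in this patch makes the same substitution, GFP_KERNEL to GFP_KERNEL_ACCOUNT. In the kernel, GFP_KERNEL_ACCOUNT is GFP_KERNEL with __GFP_ACCOUNT OR'd in, and that extra bit is what asks the allocator to charge the allocation to the current task's memory cgroup. The following is only a rough user-space model of that idea, not kernel code: every sketch_ name is hypothetical, the bit values are arbitrary, and the real charging path lives in the memcg code rather than in a counter like this one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag bits for this sketch; real kernel values differ. */
#define SKETCH_GFP_KERNEL         0x1u
#define SKETCH_GFP_ACCOUNT        0x2u
/* Mirrors the kernel relationship GFP_KERNEL_ACCOUNT = GFP_KERNEL | __GFP_ACCOUNT. */
#define SKETCH_GFP_KERNEL_ACCOUNT (SKETCH_GFP_KERNEL | SKETCH_GFP_ACCOUNT)

/* Hypothetical stand-in for a memory cgroup's kernel-memory counter. */
struct sketch_cgroup {
	size_t charged_bytes;
};

/*
 * Zeroing allocator in the spirit of kzalloc(): the memory is always
 * returned, but it is charged to the cgroup only when the ACCOUNT bit
 * is present in the flags.
 */
static void *sketch_alloc(struct sketch_cgroup *cg, size_t size,
			  unsigned int flags)
{
	void *p = calloc(1, size);

	if (p && (flags & SKETCH_GFP_ACCOUNT))
		cg->charged_bytes += size;
	return p;
}
```

In this model, an allocation made with SKETCH_GFP_KERNEL leaves charged_bytes untouched while one made with SKETCH_GFP_KERNEL_ACCOUNT increases it by the requested size, which corresponds to the accounting difference the commit message describes for the OOM killer.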