From patchwork Mon Jul 1 06:22:04 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rusty Russell
X-Patchwork-Id: 2806511
Return-Path:
X-Original-To: patchwork-kvm@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.19.201])
	by patchwork1.web.kernel.org (Postfix) with ESMTP id DEA7F9F9FF
	for ; Mon, 1 Jul 2013 08:54:27 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id F3B95201A4
	for ; Mon, 1 Jul 2013 08:54:26 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 151582018C
	for ; Mon, 1 Jul 2013 08:54:26 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753347Ab3GAIxF (ORCPT ); Mon, 1 Jul 2013 04:53:05 -0400
Received: from ozlabs.org ([203.10.76.45]:34342 "EHLO ozlabs.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753268Ab3GAIxB (ORCPT ); Mon, 1 Jul 2013 04:53:01 -0400
Received: by ozlabs.org (Postfix, from userid 1011)
	id 22CBB2C01FC; Mon, 1 Jul 2013 18:53:00 +1000 (EST)
From: Rusty Russell
To: Chegu Vinod , prarit@redhat.com, LKML , Gleb Natapov , Paolo Bonzini
Cc: KVM
Cc: Prarit Bhargava
Subject: Re: kvm_intel: Could not allocate 42 bytes percpu data
In-Reply-To: <51C897A7.50302@hp.com>
References: <51C897A7.50302@hp.com>
User-Agent: Notmuch/0.15.2+81~gd2c8818 (http://notmuchmail.org)
	Emacs/23.4.1 (i686-pc-linux-gnu)
Date: Mon, 01 Jul 2013 15:52:04 +0930
Message-ID: <87ehbisstv.fsf@rustcorp.com.au>
MIME-Version: 1.0
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org
X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00,
	RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, UNPARSEABLE_RELAY
	autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Chegu Vinod writes:
> Hello,
>
> Lots (~700+) of the following messages are showing up in the dmesg of a
> 3.10-rc1 based kernel (Host OS is running on a large socket count box
> with HT-on).
>
> [ 82.270682] PERCPU: allocation failed, size=42 align=16, alloc from
> reserved chunk failed
> [ 82.272633] kvm_intel: Could not allocate 42 bytes percpu data

Woah, weird....

Oh. Shit. Um, this is embarrassing.

Thanks,
Rusty.

Tested-by: Jim Hull
---
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

===
module: do percpu allocation after uniqueness check. No, really!

v3.8-rc1-5-g1fb9341 was supposed to stop parallel kvm loads exhausting
percpu memory on large machines:

    Now we have a new state MODULE_STATE_UNFORMED, we can insert the
    module into the list (and thus guarantee its uniqueness) before we
    allocate the per-cpu region.

In my defence, it didn't actually say the patch did this. Just that
we "can".

This patch actually *does* it.

Signed-off-by: Rusty Russell
Tested-by: Noone it seems.

diff --git a/kernel/module.c b/kernel/module.c
index cab4bce..fa53db8 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2927,7 +2927,6 @@ static struct module *layout_and_allocate(struct load_info *info, int flags)
 {
 	/* Module within temporary copy. */
 	struct module *mod;
-	Elf_Shdr *pcpusec;
 	int err;
 
 	mod = setup_load_info(info, flags);
@@ -2942,17 +2941,10 @@ static struct module *layout_and_allocate(struct load_info *info, int flags)
 	err = module_frob_arch_sections(info->hdr, info->sechdrs,
 					info->secstrings, mod);
 	if (err < 0)
-		goto out;
+		return ERR_PTR(err);
 
-	pcpusec = &info->sechdrs[info->index.pcpu];
-	if (pcpusec->sh_size) {
-		/* We have a special allocation for this section. */
-		err = percpu_modalloc(mod,
-				      pcpusec->sh_size, pcpusec->sh_addralign);
-		if (err)
-			goto out;
-		pcpusec->sh_flags &= ~(unsigned long)SHF_ALLOC;
-	}
+	/* We will do a special allocation for per-cpu sections later. */
+	info->sechdrs[info->index.pcpu].sh_flags &= ~(unsigned long)SHF_ALLOC;
 
 	/* Determine total sizes, and put offsets in sh_entsize. For now
 	   this is done generically; there doesn't appear to be any
@@ -2963,17 +2955,22 @@ static struct module *layout_and_allocate(struct load_info *info, int flags)
 	/* Allocate and move to the final place */
 	err = move_module(mod, info);
 	if (err)
-		goto free_percpu;
+		return ERR_PTR(err);
 
 	/* Module has been copied to its final place now: return it. */
 	mod = (void *)info->sechdrs[info->index.mod].sh_addr;
 	kmemleak_load_module(mod, info);
 	return mod;
+}
 
-free_percpu:
-	percpu_modfree(mod);
-out:
-	return ERR_PTR(err);
+static int alloc_module_percpu(struct module *mod, struct load_info *info)
+{
+	Elf_Shdr *pcpusec = &info->sechdrs[info->index.pcpu];
+	if (!pcpusec->sh_size)
+		return 0;
+
+	/* We have a special allocation for this section. */
+	return percpu_modalloc(mod, pcpusec->sh_size, pcpusec->sh_addralign);
 }
 
 /* mod is no longer valid after this! */
@@ -3237,6 +3234,11 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	}
 #endif
 
+	/* To avoid stressing percpu allocator, do this once we're unique. */
+	err = alloc_module_percpu(mod, info);
+	if (err)
+		goto unlink_mod;
+
 	/* Now module is in final location, initialize linked lists, etc. */
 	err = module_unload_init(mod);
 	if (err)
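
Illustration only, not part of the patch: a minimal userspace sketch of why
the ordering matters, assuming a tiny fixed pool standing in for the
reserved per-cpu chunk. Every name here (POOL_BYTES, struct sim, pool_alloc,
claim_name, failures_alloc_first, failures_unique_first) is hypothetical and
none of it is kernel API. Alignment is ignored, and the race is modelled by
letting every loader reach the allocation step before any duplicate is
rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_BYTES 64u	/* hypothetical size of the reserved chunk */

struct sim {
	size_t pool_used;	/* bytes currently taken from the pool */
	const char *loaded;	/* name already on the "module list", or NULL */
};

/* Take size bytes from the pool: 0 on success, -1 if it does not fit. */
static int pool_alloc(struct sim *s, size_t size)
{
	if (size > POOL_BYTES - s->pool_used)
		return -1;
	s->pool_used += size;
	return 0;
}

/* Claim the module name: 0 if we are first, -1 if it is already there. */
static int claim_name(struct sim *s, const char *name)
{
	if (s->loaded && strcmp(s->loaded, name) == 0)
		return -1;
	s->loaded = name;
	return 0;
}

/*
 * Old ordering: every racing loader allocates before any uniqueness
 * check has run.  Returns the number of failed allocations.
 */
static int failures_alloc_first(int loaders, size_t size)
{
	struct sim s = { 0, NULL };
	int failed = 0;

	for (int i = 0; i < loaders; i++)
		if (pool_alloc(&s, size) != 0)
			failed++;
	return failed;
}

/*
 * New ordering: only the loader that wins the name touches the pool;
 * duplicates bail out first.  Returns the number of failed allocations.
 */
static int failures_unique_first(int loaders, size_t size, const char *name)
{
	struct sim s = { 0, NULL };
	int failed = 0;

	for (int i = 0; i < loaders; i++) {
		if (claim_name(&s, name) != 0)
			continue;	/* duplicate: no allocation attempted */
		if (pool_alloc(&s, size) != 0)
			failed++;
	}
	return failed;
}
```

Calling failures_alloc_first(4, 42) and failures_unique_first(4, 42,
"kvm_intel") compares the two orderings for four simultaneous loads of the
same module with the 42-byte request from the report above; the pool size
is made up, so only the relative difference is meaningful.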