From patchwork Mon Jul  1 06:22:04 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rusty Russell <rusty@rustcorp.com.au>
X-Patchwork-Id: 2806511
Return-Path: <kvm-owner@kernel.org>
X-Original-To: patchwork-kvm@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.19.201])
	by patchwork1.web.kernel.org (Postfix) with ESMTP id DEA7F9F9FF
	for <patchwork-kvm@patchwork.kernel.org>;
	Mon,  1 Jul 2013 08:54:27 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id F3B95201A4
	for <patchwork-kvm@patchwork.kernel.org>;
	Mon,  1 Jul 2013 08:54:26 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 151582018C
	for <patchwork-kvm@patchwork.kernel.org>;
	Mon,  1 Jul 2013 08:54:26 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753347Ab3GAIxF (ORCPT
	<rfc822;patchwork-kvm@patchwork.kernel.org>);
	Mon, 1 Jul 2013 04:53:05 -0400
Received: from ozlabs.org ([203.10.76.45]:34342 "EHLO ozlabs.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753268Ab3GAIxB (ORCPT <rfc822;kvm@vger.kernel.org>);
	Mon, 1 Jul 2013 04:53:01 -0400
Received: by ozlabs.org (Postfix, from userid 1011)
	id 22CBB2C01FC; Mon,  1 Jul 2013 18:53:00 +1000 (EST)
From: Rusty Russell <rusty@rustcorp.com.au>
To: Chegu Vinod <chegu_vinod@hp.com>, prarit@redhat.com,
	LKML <linux-kernel@vger.kernel.org>, Gleb Natapov <gleb@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Cc: KVM <kvm@vger.kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Subject: Re: kvm_intel: Could not allocate 42 bytes percpu data
In-Reply-To: <51C897A7.50302@hp.com>
References: <51C897A7.50302@hp.com>
User-Agent: Notmuch/0.15.2+81~gd2c8818 (http://notmuchmail.org) Emacs/23.4.1
	(i686-pc-linux-gnu)
Date: Mon, 01 Jul 2013 15:52:04 +0930
Message-ID: <87ehbisstv.fsf@rustcorp.com.au>
MIME-Version: 1.0
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org
X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI,
	RP_MATCHES_RCVD,
	UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Chegu Vinod <chegu_vinod@hp.com> writes:
> Hello,
>
> Lots (~700+) of the following messages are showing up in the dmesg of a 
> 3.10-rc1 based kernel (Host OS is running on a large socket count box 
> with HT-on).
>
> [   82.270682] PERCPU: allocation failed, size=42 align=16, alloc from 
> reserved chunk failed
> [   82.272633] kvm_intel: Could not allocate 42 bytes percpu data

Woah, weird....

Oh.  Shit.  Um, this is embarrassing.

Thanks,
Rusty.
Tested-by: Jim Hull <jim.hull@hp.com>
---
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

===
module: do percpu allocation after uniqueness check.  No, really!

v3.8-rc1-5-g1fb9341 was supposed to stop parallel kvm loads exhausting
percpu memory on large machines:

    Now we have a new state MODULE_STATE_UNFORMED, we can insert the
    module into the list (and thus guarantee its uniqueness) before we
    allocate the per-cpu region.

In my defence, it didn't actually say the patch did this.  Just that
we "can".

This patch actually *does* it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Noone it seems.

diff --git a/kernel/module.c b/kernel/module.c
index cab4bce..fa53db8 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2927,7 +2927,6 @@ static struct module *layout_and_allocate(struct load_info *info, int flags)
 {
 	/* Module within temporary copy. */
 	struct module *mod;
-	Elf_Shdr *pcpusec;
 	int err;
 
 	mod = setup_load_info(info, flags);
@@ -2942,17 +2941,10 @@ static struct module *layout_and_allocate(struct load_info *info, int flags)
 	err = module_frob_arch_sections(info->hdr, info->sechdrs,
 					info->secstrings, mod);
 	if (err < 0)
-		goto out;
+		return ERR_PTR(err);
 
-	pcpusec = &info->sechdrs[info->index.pcpu];
-	if (pcpusec->sh_size) {
-		/* We have a special allocation for this section. */
-		err = percpu_modalloc(mod,
-				      pcpusec->sh_size, pcpusec->sh_addralign);
-		if (err)
-			goto out;
-		pcpusec->sh_flags &= ~(unsigned long)SHF_ALLOC;
-	}
+	/* We will do a special allocation for per-cpu sections later. */
+	info->sechdrs[info->index.pcpu].sh_flags &= ~(unsigned long)SHF_ALLOC;
 
 	/* Determine total sizes, and put offsets in sh_entsize.  For now
 	   this is done generically; there doesn't appear to be any
@@ -2963,17 +2955,22 @@ static struct module *layout_and_allocate(struct load_info *info, int flags)
 	/* Allocate and move to the final place */
 	err = move_module(mod, info);
 	if (err)
-		goto free_percpu;
+		return ERR_PTR(err);
 
 	/* Module has been copied to its final place now: return it. */
 	mod = (void *)info->sechdrs[info->index.mod].sh_addr;
 	kmemleak_load_module(mod, info);
 	return mod;
+}
 
-free_percpu:
-	percpu_modfree(mod);
-out:
-	return ERR_PTR(err);
+static int alloc_module_percpu(struct module *mod, struct load_info *info)
+{
+	Elf_Shdr *pcpusec = &info->sechdrs[info->index.pcpu];
+	if (!pcpusec->sh_size)
+		return 0;
+
+	/* We have a special allocation for this section. */
+	return percpu_modalloc(mod, pcpusec->sh_size, pcpusec->sh_addralign);
 }
 
 /* mod is no longer valid after this! */
@@ -3237,6 +3234,11 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	}
 #endif
 
+	/* To avoid stressing percpu allocator, do this once we're unique. */
+	err = alloc_module_percpu(mod, info);
+	if (err)
+		goto unlink_mod;
+
 	/* Now module is in final location, initialize linked lists, etc. */
 	err = module_unload_init(mod);
 	if (err)