From patchwork Thu Nov 29 14:52:44 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rob Herring X-Patchwork-Id: 1821531 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork2.kernel.org Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) by patchwork2.kernel.org (Postfix) with ESMTP id 1C8A7DF23A for ; Thu, 29 Nov 2012 14:56:26 +0000 (UTC) Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.76 #1 (Red Hat Linux)) id 1Te5UB-0007mq-2Q; Thu, 29 Nov 2012 14:53:07 +0000 Received: from mail-ob0-f177.google.com ([209.85.214.177]) by merlin.infradead.org with esmtps (Exim 4.76 #1 (Red Hat Linux)) id 1Te5U8-0007mX-9V for linux-arm-kernel@lists.infradead.org; Thu, 29 Nov 2012 14:53:05 +0000 Received: by mail-ob0-f177.google.com with SMTP id ef5so2436249obb.36 for ; Thu, 29 Nov 2012 06:53:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id:x-mailer; bh=RMIoIGKmBsuOCgV5trRaCTTaPCv9HVOi3l6svZjffys=; b=lEEJnv+Q3+mzze0fnrUKflNM/QiVGrAdnUigCy2J7Rx6XElbPN/vEH9uOAJ63gSt3V 33yvtQwCE5McVuAEw1ok6WzmA2lZsptdJC11xb+KwoDC4YxFBirD3HUNOdRqeb4hsNhq lBbkftUKp6sHKORtxsPUuZdsWmtFCMoVeUGWD4lbUNXgAI2ftr6Zuw26ROPFQ6FaN2wq a42jaNpNecw1ftHmoUQMonW4ymR/7U/JaRniKpnabMumiYqnFExDTPI0bdRyRH/vrRkD IRdpm5drS73FL24vKfGWx0ZnSbVuZdu43vmFksV1ydhp8XNrO+KtwGrPKydbI4bfXOWg xgmw== Received: by 10.60.27.97 with SMTP id s1mr1755074oeg.6.1354200782157; Thu, 29 Nov 2012 06:53:02 -0800 (PST) Received: from rob-laptop.grandenetworks.net (65-36-73-129.dyn.grandenetworks.net. [65.36.73.129]) by mx.google.com with ESMTPS id m3sm1796438obm.21.2012.11.29.06.53.00 (version=SSLv3 cipher=OTHER); Thu, 29 Nov 2012 06:53:01 -0800 (PST) From: Rob Herring To: linux-arm-kernel@lists.infradead.org Subject: [PATCH v2] ARM: implement optimized percpu variable access Date: Thu, 29 Nov 2012 08:52:44 -0600 Message-Id: <1354200764-23751-1-git-send-email-robherring2@gmail.com> X-Mailer: git-send-email 1.7.10.4 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20121129_095304_427212_DFC9B991 X-CRM114-Status: GOOD ( 21.66 ) X-Spam-Score: -2.5 (--) X-Spam-Report: SpamAssassin version 3.3.2 on merlin.infradead.org summary: Content analysis details: (-2.5 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.7 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/, low trust [209.85.214.177 listed in list.dnswl.org] 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider (robherring2[at]gmail.com) -0.0 SPF_PASS SPF: sender matches SPF record 0.2 FREEMAIL_ENVFROM_END_DIGIT Envelope-from freemail username ends in digit (robherring2[at]gmail.com) -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature Cc: Russell King , Nicolas Pitre , Tony Lindgren , Will Deacon , Jamie Lokier , Rob Herring X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: linux-arm-kernel-bounces@lists.infradead.org Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org From: Rob Herring Use the previously unused TPIDRPRW register to store percpu offsets. TPIDRPRW is only accessible in PL1, so it can only be used in the kernel. This replaces 2 loads with a mrc instruction for each percpu variable access. With hackbench, the performance improvement is 1.4% on Cortex-A9 (highbank). Taking an average of 30 runs of "hackbench -l 1000" yields: Before: 6.2191 After: 6.1348 Will Deacon reported similar delta on v6 with 11MPCore. The asm "memory" constraints are needed here to ensure the percpu offset gets reloaded. Testing by Will found that this would not happen in __schedule() which is a bit of a special case as preemption is disabled but the execution can move cores. Signed-off-by: Rob Herring Acked-by: Will Deacon Acked-by: Nicolas Pitre Acked-by: Tony Lindgren --- Changes in v2: - Add asm "memory" constraint - Only enable on v6K and v7 and avoid enabling for v6 SMP_ON_UP - Fix missing initialization of TPIDRPRW for resume path - Move cpu_init to beginning of secondary_start_kernel to ensure percpu variables can be accessed as early as possible. arch/arm/include/asm/Kbuild | 1 - arch/arm/include/asm/percpu.h | 45 +++++++++++++++++++++++++++++++++++++++++ arch/arm/kernel/setup.c | 6 ++++++ arch/arm/kernel/smp.c | 4 +++- 4 files changed, 54 insertions(+), 2 deletions(-) create mode 100644 arch/arm/include/asm/percpu.h diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild index f70ae17..2ffdaac 100644 --- a/arch/arm/include/asm/Kbuild +++ b/arch/arm/include/asm/Kbuild @@ -16,7 +16,6 @@ generic-y += local64.h generic-y += msgbuf.h generic-y += param.h generic-y += parport.h -generic-y += percpu.h generic-y += poll.h generic-y += resource.h generic-y += sections.h diff --git a/arch/arm/include/asm/percpu.h b/arch/arm/include/asm/percpu.h new file mode 100644 index 0000000..968c0a1 --- /dev/null +++ b/arch/arm/include/asm/percpu.h @@ -0,0 +1,45 @@ +/* + * Copyright 2012 Calxeda, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see . + */ +#ifndef _ASM_ARM_PERCPU_H_ +#define _ASM_ARM_PERCPU_H_ + +/* + * Same as asm-generic/percpu.h, except that we store the per cpu offset + * in the TPIDRPRW. TPIDRPRW only exists on V6K and V7 + */ +#if defined(CONFIG_SMP) && !defined(CONFIG_CPU_V6) +static inline void set_my_cpu_offset(unsigned long off) +{ + /* Set TPIDRPRW */ + asm volatile("mcr p15, 0, %0, c13, c0, 4" : : "r" (off) : "memory"); +} + +static inline unsigned long __my_cpu_offset(void) +{ + unsigned long off; + /* Read TPIDRPRW */ + asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : : "memory"); + return off; +} +#define __my_cpu_offset __my_cpu_offset() +#else +#define set_my_cpu_offset(x) do {} while(0) + +#endif /* CONFIG_SMP */ + +#include + +#endif /* _ASM_ARM_PERCPU_H_ */ diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c index da1d1aa..b2909ce 100644 --- a/arch/arm/kernel/setup.c +++ b/arch/arm/kernel/setup.c @@ -383,6 +383,12 @@ void cpu_init(void) BUG(); } + /* + * This only works on resume and secondary cores. For booting on the + * boot cpu, smp_prepare_boot_cpu is called after percpu area setup. + */ + set_my_cpu_offset(per_cpu_offset(cpu)); + cpu_proc_init(); /* diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index fbc8b26..aadcca7 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -296,6 +296,8 @@ asmlinkage void __cpuinit secondary_start_kernel(void) struct mm_struct *mm = &init_mm; unsigned int cpu; + cpu_init(); + /* * The identity mapping is uncached (strongly ordered), so * switch away from it before attempting any exclusive accesses. @@ -315,7 +317,6 @@ asmlinkage void __cpuinit secondary_start_kernel(void) printk("CPU%u: Booted secondary processor\n", cpu); - cpu_init(); preempt_disable(); trace_hardirqs_off(); @@ -371,6 +372,7 @@ void __init smp_cpus_done(unsigned int max_cpus) void __init smp_prepare_boot_cpu(void) { + set_my_cpu_offset(per_cpu_offset(smp_processor_id())); } void __init smp_prepare_cpus(unsigned int max_cpus)