From patchwork Mon Jun 10 03:53:12 2013
X-Patchwork-Submitter: Nicolas Pitre
X-Patchwork-Id: 2695631
Date: Sun, 9 Jun 2013 23:53:12 -0400 (EDT)
From: Nicolas Pitre
To: Lorenzo Pieralisi
Cc: Pawel Moll, Dave P Martin,
 "linux-arm-kernel@lists.infradead.org", "patches@linaro.org"
Subject: Re: [PATCH 1/2] ARM: vexpress/TC2: basic PM support
In-Reply-To: <20130607142645.GD3111@e102568-lin.cambridge.arm.com>
References: <1370587152-4630-1-git-send-email-nicolas.pitre@linaro.org>
 <1370587152-4630-2-git-send-email-nicolas.pitre@linaro.org>
 <20130607142645.GD3111@e102568-lin.cambridge.arm.com>

On Fri, 7 Jun 2013, Lorenzo Pieralisi wrote:

> On Fri, Jun 07, 2013 at 07:39:11AM +0100, Nicolas Pitre wrote:
> > +#include <...>
>
> Is the include above needed ?

Apparently not.

> > +static int tc2_pm_power_up(unsigned int cpu, unsigned int cluster)
> > +{
> > +        pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> > +        if (cluster >= 2 || cpu >= vexpress_spc_get_nb_cpus(cluster))
> > +                return -EINVAL;
>
> We could stash the value of vexpress_spc_get_nb_cpus(), it never
> changes.

Well... sure.  However, given that cpu_up is not a very hot path, and
because the cpu argument is externally provided in this case, I'm
inclined to leave this instance as is.  I'll hardcode a constant in the
other cases, where the cpu number is obtained from the MPIDR and
validated with BUG_ON(); there the check exists only to prevent
overflowing the tc2_pm_use_count array if something goes very wrong.
Oh, and I'll use defines for those constants as well.

> > +        if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
> > +                arch_spin_unlock(&tc2_pm_lock);
> > +
> > +                set_cr(get_cr() & ~CR_C);
>
> We must disable L2 prefetching on A15 before cleaning L2.

OK.

> > +                flush_cache_all();
> > +                asm volatile ("clrex");
> > +                set_auxcr(get_auxcr() & ~(1 << 6));
>
> I think we should add comments here to avoid copy'n'paste mayhem. The
> code above is safe on CPUs like the A15/A7 (I know this back-end can
> only be run on those processors), which still hit in the cache with
> the C bit in SCTLR cleared; it would explode on processors (eg the
> A9) that do not hit in the cache with the C bit cleared. I am
> wondering if it is better to write inline asm and jump to v7 cache
> functions that do not need stack push/pop straight away.

Well... yeah.  This is unfortunate, because we moved this part from
assembly to C over time to clean it up.  But we'd better use something
safer in case it gets copied, as you say.
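To spell the hazard out for future readers, here is the quoted
sequence once more with annotations (an illustration only -- the new
version below performs the same steps entirely in inline asm precisely
so that nothing touches memory in between):

        /*
         * Fragile pattern: between these C calls the compiler is free
         * to spill registers to the stack, and the calls themselves
         * push/pop a stack frame.
         */
        set_cr(get_cr() & ~CR_C);            /* SCTLR.C off: no new D-cache allocations */
        flush_cache_all();                   /* clean + invalidate all cache levels */
        asm volatile ("clrex");              /* clear the local exclusive monitor */
        set_auxcr(get_auxcr() & ~(1 << 6));  /* ACTLR bit 6 off: disable local coherency */

This only works on the A15/A7 because those cores still hit in the
cache with the C bit cleared, so the stack traffic in between keeps
seeing consistent data; on a core like the A9, which does not hit in
the cache once the C bit is cleared, the same stack traffic would
bypass dirty cache lines and corrupt data.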
> > +                cci_disable_port_by_cpu(mpidr);
> > +
> > +                /*
> > +                 * Ensure that both C & I bits are disabled in the SCTLR
> > +                 * before disabling ACE snoops. This ensures that no
> > +                 * coherency traffic will originate from this cpu after
> > +                 * ACE snoops are turned off.
> > +                 */
> > +                cpu_proc_fin();
>
> Mmm, the C bit is already cleared, so why clear the I bit (and the A
> bit) ? I do not think cpu_proc_fin() is needed, and I am really keen
> on getting the power-down procedure right to avoid copy'n'paste
> induced errors from the start.

I trusted the above and its accompanying comment from Achin's initial
cluster shutdown code.  But I suppose I-cache lookups don't create
snoop requests?

> > +                __mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
> > +        } else {
> > +                /*
> > +                 * If last man then undo any setup done previously.
> > +                 */
> > +                if (last_man) {
> > +                        vexpress_spc_powerdown_enable(cluster, 0);
> > +                        vexpress_spc_set_global_wakeup_intr(0);
> > +                }
> > +
> > +                arch_spin_unlock(&tc2_pm_lock);
> > +
> > +                set_cr(get_cr() & ~CR_C);
> > +                flush_cache_louis();
> > +                asm volatile ("clrex");
> > +                set_auxcr(get_auxcr() & ~(1 << 6));
> > +        }
> > +
> > +        __mcpm_cpu_down(cpu, cluster);
> > +
> > +        /* Now we are prepared for power-down, do it: */
> > +        if (!skip_wfi)
> > +                wfi();
> > +
> > +        /* Not dead at this point? Let our caller cope. */
>
> This function should disable the GIC CPU IF, but I guess you will add
> the code when CPUidle is merged.

The GIC code does not provide a hook for doing that at the moment.
That needs to be sorted out there before it can be added here.
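For reference, the missing operation itself is tiny: the CPU interface
is turned off by clearing the enable bit in GICC_CTRL, roughly as in
the sketch below, where gic_cpu_base stands in for the register
mapping that only the GIC driver owns -- which is exactly why the hook
has to live there:

        /* hypothetical: disable this CPU's GIC CPU interface */
        writel_relaxed(0, gic_cpu_base + GIC_CPU_CTRL);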
[...]

OK, here's another version:

From: Nicolas Pitre
Date: Fri, 19 Oct 2012 20:48:50 -0400
Subject: [PATCH] ARM: vexpress/TC2: basic PM support

This is the MCPM backend for the Versatile Express A15x2 A7x3 CoreTile
aka TC2.  This provides cluster management for SMP secondary boot and
CPU hotplug.

Signed-off-by: Nicolas Pitre
Reviewed-by: Lorenzo Pieralisi

diff --git a/arch/arm/mach-vexpress/Kconfig b/arch/arm/mach-vexpress/Kconfig
index b8bbabec63..e7a825d7df 100644
--- a/arch/arm/mach-vexpress/Kconfig
+++ b/arch/arm/mach-vexpress/Kconfig
@@ -66,4 +66,13 @@ config ARCH_VEXPRESS_DCSCB
 	  This is needed to provide CPU and cluster power management
 	  on RTSM implementing big.LITTLE.
 
+config ARCH_VEXPRESS_TC2
+	bool "Versatile Express TC2 power management"
+	depends on MCPM
+	select VEXPRESS_SPC
+	select ARM_CCI
+	help
+	  Support for CPU and cluster power management on Versatile Express
+	  with a TC2 (A15x2 A7x3) big.LITTLE core tile.
+
 endmenu
diff --git a/arch/arm/mach-vexpress/Makefile b/arch/arm/mach-vexpress/Makefile
index 48ba89a814..b1cf227fa5 100644
--- a/arch/arm/mach-vexpress/Makefile
+++ b/arch/arm/mach-vexpress/Makefile
@@ -7,5 +7,6 @@ ccflags-$(CONFIG_ARCH_MULTIPLATFORM) := -I$(srctree)/$(src)/include \
 obj-y					:= v2m.o
 obj-$(CONFIG_ARCH_VEXPRESS_CA9X4)	+= ct-ca9x4.o
 obj-$(CONFIG_ARCH_VEXPRESS_DCSCB)	+= dcscb.o dcscb_setup.o
+obj-$(CONFIG_ARCH_VEXPRESS_TC2)		+= tc2_pm.o
 obj-$(CONFIG_SMP)			+= platsmp.o
 obj-$(CONFIG_HOTPLUG_CPU)		+= hotplug.o
diff --git a/arch/arm/mach-vexpress/tc2_pm.c b/arch/arm/mach-vexpress/tc2_pm.c
new file mode 100644
index 0000000000..f0673b4814
--- /dev/null
+++ b/arch/arm/mach-vexpress/tc2_pm.c
@@ -0,0 +1,275 @@
+/*
+ * arch/arm/mach-vexpress/tc2_pm.c - TC2 power management support
+ *
+ * Created by:  Nicolas Pitre, October 2012
+ * Copyright:   (C) 2012-2013  Linaro Limited
+ *
+ * Some portions of this file were originally written by Achin Gupta
+ * Copyright:   (C) 2012  ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/spinlock.h>
+#include <linux/errno.h>
+
+#include <asm/mcpm.h>
+#include <asm/proc-fns.h>
+#include <asm/cacheflush.h>
+#include <asm/cputype.h>
+#include <asm/cp15.h>
+
+#include <linux/vexpress.h>
+#include <linux/arm-cci.h>
+
+/*
+ * We can't use regular spinlocks. In the switcher case, it is possible
+ * for an outbound CPU to call power_down() after its inbound counterpart
+ * is already live using the same logical CPU number which trips lockdep
+ * debugging.
+ */
+static arch_spinlock_t tc2_pm_lock = __ARCH_SPIN_LOCK_UNLOCKED;
+
+#define TC2_CLUSTERS			2
+#define TC2_MAX_CPUS_PER_CLUSTER	3
+static int tc2_pm_use_count[TC2_MAX_CPUS_PER_CLUSTER][TC2_CLUSTERS];
+
+static int tc2_pm_power_up(unsigned int cpu, unsigned int cluster)
+{
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	if (cluster >= TC2_CLUSTERS || cpu >= vexpress_spc_get_nb_cpus(cluster))
+		return -EINVAL;
+
+	/*
+	 * Since this is called with IRQs enabled, and no arch_spin_lock_irq
+	 * variant exists, we need to disable IRQs manually here.
+	 */
+	local_irq_disable();
+	arch_spin_lock(&tc2_pm_lock);
+
+	if (!tc2_pm_use_count[0][cluster] &&
+	    !tc2_pm_use_count[1][cluster] &&
+	    !tc2_pm_use_count[2][cluster])
+		vexpress_spc_powerdown_enable(cluster, 0);
+
+	tc2_pm_use_count[cpu][cluster]++;
+	if (tc2_pm_use_count[cpu][cluster] == 1) {
+		vexpress_spc_write_resume_reg(cluster, cpu,
+					      virt_to_phys(mcpm_entry_point));
+		vexpress_spc_set_cpu_wakeup_irq(cpu, cluster, 1);
+	} else if (tc2_pm_use_count[cpu][cluster] != 2) {
+		/*
+		 * The only possible values are:
+		 * 0 = CPU down
+		 * 1 = CPU (still) up
+		 * 2 = CPU requested to be up before it had a chance
+		 *     to actually make itself down.
+		 * Any other value is a bug.
+		 */
+		BUG();
+	}
+
+	arch_spin_unlock(&tc2_pm_lock);
+	local_irq_enable();
+
+	return 0;
+}
+
+static void tc2_pm_power_down(void)
+{
+	unsigned int mpidr, cpu, cluster;
+	bool last_man = false, skip_wfi = false;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cluster >= TC2_CLUSTERS || cpu >= TC2_MAX_CPUS_PER_CLUSTER);
+
+	__mcpm_cpu_going_down(cpu, cluster);
+
+	arch_spin_lock(&tc2_pm_lock);
+	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
+	tc2_pm_use_count[cpu][cluster]--;
+	if (tc2_pm_use_count[cpu][cluster] == 0) {
+		vexpress_spc_set_cpu_wakeup_irq(cpu, cluster, 1);
+		if (!tc2_pm_use_count[0][cluster] &&
+		    !tc2_pm_use_count[1][cluster] &&
+		    !tc2_pm_use_count[2][cluster]) {
+			vexpress_spc_powerdown_enable(cluster, 1);
+			vexpress_spc_set_global_wakeup_intr(1);
+			last_man = true;
+		}
+	} else if (tc2_pm_use_count[cpu][cluster] == 1) {
+		/*
+		 * A power_up request went ahead of us.
+		 * Even if we do not want to shut this CPU down,
+		 * the caller expects a certain state as if the WFI
+		 * was aborted.  So let's continue with cache cleaning.
+		 */
+		skip_wfi = true;
+	} else
+		BUG();
+
+	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
+		arch_spin_unlock(&tc2_pm_lock);
+
+		if ((read_cpuid_id() & 0xf0) == 0xf0) {
+			/*
+			 * On the Cortex-A15 we need to disable
+			 * L2 prefetching before flushing the cache.
+			 */
+			asm volatile(
+			"mcr	p15, 1, %0, c15, c0, 3 \n\t"
+			"isb	\n\t"
+			"dsb	"
+			: : "r" (0x400) );
+		}
+
+		/*
+		 * We need to disable and flush the whole (L1 and L2) cache.
+		 * Let's do it in the safest possible way, i.e. with
+		 * no memory access within the following sequence,
+		 * including the stack.
+		 */
+		asm volatile(
+		"mrc	p15, 0, r0, c1, c0, 0	@ get CR \n\t"
+		"bic	r0, r0, #"__stringify(CR_C)" \n\t"
+		"mcr	p15, 0, r0, c1, c0, 0	@ set CR \n\t"
+		"isb	\n\t"
+		"bl	v7_flush_dcache_all \n\t"
+		"clrex	\n\t"
+		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR \n\t"
+		"bic	r0, r0, #(1 << 6)	@ disable local coherency \n\t"
+		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR \n\t"
+		"isb	"
+		: : : "r0","r1","r2","r3","r4","r5","r6","r7",
+		      "r9","r10","r11","lr","memory");
+
+		cci_disable_port_by_cpu(mpidr);
+
+		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
+	} else {
+		/*
+		 * If last man then undo any setup done previously.
+		 */
+		if (last_man) {
+			vexpress_spc_powerdown_enable(cluster, 0);
+			vexpress_spc_set_global_wakeup_intr(0);
+		}
+
+		arch_spin_unlock(&tc2_pm_lock);
+
+		/*
+		 * We need to disable and flush only the L1 cache.
+		 * Let's do it in the safest possible way as above.
+		 */
+		asm volatile(
+		"mrc	p15, 0, r0, c1, c0, 0	@ get CR \n\t"
+		"bic	r0, r0, #"__stringify(CR_C)" \n\t"
+		"mcr	p15, 0, r0, c1, c0, 0	@ set CR \n\t"
+		"isb	\n\t"
+		"bl	v7_flush_dcache_louis \n\t"
+		"clrex	\n\t"
+		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR \n\t"
+		"bic	r0, r0, #(1 << 6)	@ disable local coherency \n\t"
+		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR \n\t"
+		"isb	"
+		: : : "r0","r1","r2","r3","r4","r5","r6","r7",
+		      "r9","r10","r11","lr","memory");
+	}
+
+	__mcpm_cpu_down(cpu, cluster);
+
+	/* Now we are prepared for power-down, do it: */
+	if (!skip_wfi)
+		wfi();
+
+	/* Not dead at this point? Let our caller cope. */
+}
+
+static void tc2_pm_powered_up(void)
+{
+	unsigned int mpidr, cpu, cluster;
+	unsigned long flags;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cluster >= TC2_CLUSTERS || cpu >= TC2_MAX_CPUS_PER_CLUSTER);
+
+	local_irq_save(flags);
+	arch_spin_lock(&tc2_pm_lock);
+
+	if (!tc2_pm_use_count[0][cluster] &&
+	    !tc2_pm_use_count[1][cluster] &&
+	    !tc2_pm_use_count[2][cluster]) {
+		vexpress_spc_powerdown_enable(cluster, 0);
+		vexpress_spc_set_global_wakeup_intr(0);
+	}
+
+	if (!tc2_pm_use_count[cpu][cluster])
+		tc2_pm_use_count[cpu][cluster] = 1;
+
+	vexpress_spc_set_cpu_wakeup_irq(cpu, cluster, 0);
+	vexpress_spc_write_resume_reg(cluster, cpu, 0);
+
+	arch_spin_unlock(&tc2_pm_lock);
+	local_irq_restore(flags);
+}
+
+static const struct mcpm_platform_ops tc2_pm_power_ops = {
+	.power_up	= tc2_pm_power_up,
+	.power_down	= tc2_pm_power_down,
+	.powered_up	= tc2_pm_powered_up,
+};
+
+static void __init tc2_pm_usage_count_init(void)
+{
+	unsigned int mpidr, cpu, cluster;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cluster >= TC2_CLUSTERS || cpu >= TC2_MAX_CPUS_PER_CLUSTER);
+	tc2_pm_use_count[cpu][cluster] = 1;
+}
+
+/*
+ * Enable cluster-level coherency, in preparation for turning on the MMU.
+ */
+static void __naked tc2_pm_power_up_setup(unsigned int affinity_level)
+{
+	asm volatile (" \n"
+"	cmp	r0, #1 \n"
+"	bxne	lr \n"
+"	b	cci_enable_port_for_self ");
+}
+
+static int __init tc2_pm_init(void)
+{
+	int ret;
+
+	if (!vexpress_spc_check_loaded())
+		return -ENODEV;
+
+	tc2_pm_usage_count_init();
+
+	ret = mcpm_platform_register(&tc2_pm_power_ops);
+	if (!ret)
+		ret = mcpm_sync_init(tc2_pm_power_up_setup);
+	if (!ret)
+		pr_info("TC2 power management initialized\n");
+	return ret;
+}
+
+early_initcall(tc2_pm_init);
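
For context, the ops above are never called directly: they sit behind
the generic MCPM entry points declared in <asm/mcpm.h>.  A rough
sketch of the flow (not part of the patch):

        /*
         * On some other CPU, e.g. from the secondary boot or
         * hotplug-in path: power the target CPU on; MCPM forwards
         * this to tc2_pm_power_up(cpu, cluster).
         */
        mcpm_cpu_power_up(cpu, cluster);

        /*
         * On a CPU going down, e.g. from the hotplug cpu_die() path:
         * MCPM calls tc2_pm_power_down(), which cleans its caches,
         * exits coherency and executes WFI.  A later wake-up re-enters
         * the kernel through mcpm_entry_point, and the platform SMP
         * code eventually invokes tc2_pm_powered_up().
         */
        mcpm_cpu_power_down();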