From patchwork Mon Aug 15 13:55:10 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Rutland X-Patchwork-Id: 1067492 Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) by demeter1.kernel.org (8.14.4/8.14.4) with ESMTP id p7FDxBOB023921 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 15 Aug 2011 13:59:31 GMT Received: from canuck.infradead.org ([134.117.69.58]) by merlin.infradead.org with esmtps (Exim 4.76 #1 (Red Hat Linux)) id 1Qsxgf-0000rB-UJ; Mon, 15 Aug 2011 13:58:43 +0000 Received: from localhost ([127.0.0.1] helo=canuck.infradead.org) by canuck.infradead.org with esmtp (Exim 4.76 #1 (Red Hat Linux)) id 1Qsxgf-0001qJ-19; Mon, 15 Aug 2011 13:58:41 +0000 Received: from cam-admin0.cambridge.arm.com ([217.140.96.50]) by canuck.infradead.org with esmtp (Exim 4.76 #1 (Red Hat Linux)) id 1Qsxdd-00015r-I6 for linux-arm-kernel@lists.infradead.org; Mon, 15 Aug 2011 13:55:41 +0000 Received: from localhost.localdomain (e102822-lin.cambridge.arm.com [10.1.70.54]) by cam-admin0.cambridge.arm.com (8.12.6/8.12.6) with ESMTP id p7FDsX1A015179; Mon, 15 Aug 2011 14:54:35 +0100 (BST) From: Mark Rutland To: linux-arm-kernel@lists.infradead.org Subject: [RFC PATCH 09/15] ARM: perf: lock PMU registers per-CPU Date: Mon, 15 Aug 2011 14:55:10 +0100 Message-Id: <1313416516-8006-10-git-send-email-mark.rutland@arm.com> X-Mailer: git-send-email 1.7.0.4 In-Reply-To: <1313416516-8006-1-git-send-email-mark.rutland@arm.com> References: <1313416516-8006-1-git-send-email-mark.rutland@arm.com> X-CRM114-Version: 20090807-BlameThorstenAndJenny ( TRE 0.7.6 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20110815_095534_013303_D3557A0E X-CRM114-Status: GOOD ( 19.34 ) X-Spam-Score: -3.3 (---) X-Spam-Report: SpamAssassin version 3.3.1 on canuck.infradead.org summary: Content analysis details: (-3.3 points) pts rule name description ---- ---------------------- -------------------------------------------------- -2.3 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/, medium trust [217.140.96.50 listed in list.dnswl.org] -1.0 RP_MATCHES_RCVD Envelope sender domain matches handover relay domain Cc: Mark Rutland , Jamie Iles , Will Deacon , Jean Pihet , Ashwin Chaugule X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: linux-arm-kernel-bounces@lists.infradead.org Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by milter-greylist-4.2.6 (demeter1.kernel.org [140.211.167.41]); Mon, 15 Aug 2011 13:59:34 +0000 (UTC) Currently, a single lock serialises access to CPU PMU registers. This global locking is unnecessary as PMU registers are local to the CPU they monitor. This patch replaces the global lock with a per-CPU lock. As the lock is in struct cpu_hw_events, PMUs providing a single cpu_hw_events instance can be locked globally. Signed-off-by: Mark Rutland Reviewed-by: Will Deacon --- arch/arm/kernel/perf_event.c | 17 +++++++++----- arch/arm/kernel/perf_event_v6.c | 25 +++++++++++++-------- arch/arm/kernel/perf_event_v7.c | 20 ++++++++++------- arch/arm/kernel/perf_event_xscale.c | 40 +++++++++++++++++++++-------------- 4 files changed, 62 insertions(+), 40 deletions(-) diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c index 5ce6c33..9331d57 100644 --- a/arch/arm/kernel/perf_event.c +++ b/arch/arm/kernel/perf_event.c @@ -27,12 +27,6 @@ #include /* - * Hardware lock to serialize accesses to PMU registers. Needed for the - * read/modify/write sequences. - */ -static DEFINE_RAW_SPINLOCK(pmu_lock); - -/* * ARMv6 supports a maximum of 3 events, starting from index 0. If we add * another platform that supports more, we need to increase this to be the * largest of all platforms. @@ -55,6 +49,12 @@ struct cpu_hw_events { * an event. A 0 means that the counter can be used. */ unsigned long used_mask[BITS_TO_LONGS(ARMPMU_MAX_HWEVENTS)]; + + /* + * Hardware lock to serialize accesses to PMU registers. Needed for the + * read/modify/write sequences. + */ + raw_spinlock_t pmu_lock; }; static DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events); @@ -685,6 +685,11 @@ static struct cpu_hw_events *armpmu_get_cpu_events(void) static void __init cpu_pmu_init(struct arm_pmu *armpmu) { + int cpu; + for_each_possible_cpu(cpu) { + struct cpu_hw_events *events = &per_cpu(cpu_hw_events, cpu); + raw_spin_lock_init(&events->pmu_lock); + } armpmu->get_hw_events = armpmu_get_cpu_events; } diff --git a/arch/arm/kernel/perf_event_v6.c b/arch/arm/kernel/perf_event_v6.c index 8390128..68cf704 100644 --- a/arch/arm/kernel/perf_event_v6.c +++ b/arch/arm/kernel/perf_event_v6.c @@ -433,6 +433,7 @@ armv6pmu_enable_event(struct hw_perf_event *hwc, int idx) { unsigned long val, mask, evt, flags; + struct cpu_hw_events *events = armpmu->get_hw_events(); if (ARMV6_CYCLE_COUNTER == idx) { mask = 0; @@ -454,12 +455,12 @@ armv6pmu_enable_event(struct hw_perf_event *hwc, * Mask out the current event and set the counter to count the event * that we're interested in. */ - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = armv6_pmcr_read(); val &= ~mask; val |= evt; armv6_pmcr_write(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static int counter_is_active(unsigned long pmcr, int idx) @@ -544,24 +545,26 @@ static void armv6pmu_start(void) { unsigned long flags, val; + struct cpu_hw_events *events = armpmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = armv6_pmcr_read(); val |= ARMV6_PMCR_ENABLE; armv6_pmcr_write(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void armv6pmu_stop(void) { unsigned long flags, val; + struct cpu_hw_events *events = armpmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = armv6_pmcr_read(); val &= ~ARMV6_PMCR_ENABLE; armv6_pmcr_write(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static int @@ -595,6 +598,7 @@ armv6pmu_disable_event(struct hw_perf_event *hwc, int idx) { unsigned long val, mask, evt, flags; + struct cpu_hw_events *events = armpmu->get_hw_events(); if (ARMV6_CYCLE_COUNTER == idx) { mask = ARMV6_PMCR_CCOUNT_IEN; @@ -615,12 +619,12 @@ armv6pmu_disable_event(struct hw_perf_event *hwc, * of ETM bus signal assertion cycles. The external reporting should * be disabled and so this should never increment. */ - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = armv6_pmcr_read(); val &= ~mask; val |= evt; armv6_pmcr_write(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void @@ -628,6 +632,7 @@ armv6mpcore_pmu_disable_event(struct hw_perf_event *hwc, int idx) { unsigned long val, mask, flags, evt = 0; + struct cpu_hw_events *events = armpmu->get_hw_events(); if (ARMV6_CYCLE_COUNTER == idx) { mask = ARMV6_PMCR_CCOUNT_IEN; @@ -644,12 +649,12 @@ armv6mpcore_pmu_disable_event(struct hw_perf_event *hwc, * Unlike UP ARMv6, we don't have a way of stopping the counters. We * simply disable the interrupt reporting. */ - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = armv6_pmcr_read(); val &= ~mask; val |= evt; armv6_pmcr_write(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static struct arm_pmu armv6pmu = { diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c index f4170fc..68ac522 100644 --- a/arch/arm/kernel/perf_event_v7.c +++ b/arch/arm/kernel/perf_event_v7.c @@ -936,12 +936,13 @@ static void armv7_pmnc_dump_regs(void) static void armv7pmu_enable_event(struct hw_perf_event *hwc, int idx) { unsigned long flags; + struct cpu_hw_events *events = armpmu->get_hw_events(); /* * Enable counter and interrupt, and set the counter to count * the event that we're interested in. */ - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); /* * Disable counter @@ -966,17 +967,18 @@ static void armv7pmu_enable_event(struct hw_perf_event *hwc, int idx) */ armv7_pmnc_enable_counter(idx); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void armv7pmu_disable_event(struct hw_perf_event *hwc, int idx) { unsigned long flags; + struct cpu_hw_events *events = armpmu->get_hw_events(); /* * Disable counter and interrupt */ - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); /* * Disable counter @@ -988,7 +990,7 @@ static void armv7pmu_disable_event(struct hw_perf_event *hwc, int idx) */ armv7_pmnc_disable_intens(idx); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static irqreturn_t armv7pmu_handle_irq(int irq_num, void *dev) @@ -1054,21 +1056,23 @@ static irqreturn_t armv7pmu_handle_irq(int irq_num, void *dev) static void armv7pmu_start(void) { unsigned long flags; + struct cpu_hw_events *events = armpmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); /* Enable all counters */ armv7_pmnc_write(armv7_pmnc_read() | ARMV7_PMNC_E); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void armv7pmu_stop(void) { unsigned long flags; + struct cpu_hw_events *events = armpmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); /* Disable all counters */ armv7_pmnc_write(armv7_pmnc_read() & ~ARMV7_PMNC_E); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static int armv7pmu_get_event_idx(struct cpu_hw_events *cpuc, diff --git a/arch/arm/kernel/perf_event_xscale.c b/arch/arm/kernel/perf_event_xscale.c index ca89a06..18e4823 100644 --- a/arch/arm/kernel/perf_event_xscale.c +++ b/arch/arm/kernel/perf_event_xscale.c @@ -281,6 +281,7 @@ static void xscale1pmu_enable_event(struct hw_perf_event *hwc, int idx) { unsigned long val, mask, evt, flags; + struct cpu_hw_events *events = armpmu->get_hw_events(); switch (idx) { case XSCALE_CYCLE_COUNTER: @@ -302,18 +303,19 @@ xscale1pmu_enable_event(struct hw_perf_event *hwc, int idx) return; } - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = xscale1pmu_read_pmnc(); val &= ~mask; val |= evt; xscale1pmu_write_pmnc(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void xscale1pmu_disable_event(struct hw_perf_event *hwc, int idx) { unsigned long val, mask, evt, flags; + struct cpu_hw_events *events = armpmu->get_hw_events(); switch (idx) { case XSCALE_CYCLE_COUNTER: @@ -333,12 +335,12 @@ xscale1pmu_disable_event(struct hw_perf_event *hwc, int idx) return; } - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = xscale1pmu_read_pmnc(); val &= ~mask; val |= evt; xscale1pmu_write_pmnc(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static int @@ -365,24 +367,26 @@ static void xscale1pmu_start(void) { unsigned long flags, val; + struct cpu_hw_events *events = armpmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = xscale1pmu_read_pmnc(); val |= XSCALE_PMU_ENABLE; xscale1pmu_write_pmnc(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void xscale1pmu_stop(void) { unsigned long flags, val; + struct cpu_hw_events *events = armpmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = xscale1pmu_read_pmnc(); val &= ~XSCALE_PMU_ENABLE; xscale1pmu_write_pmnc(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static inline u32 @@ -610,6 +614,7 @@ static void xscale2pmu_enable_event(struct hw_perf_event *hwc, int idx) { unsigned long flags, ien, evtsel; + struct cpu_hw_events *events = armpmu->get_hw_events(); ien = xscale2pmu_read_int_enable(); evtsel = xscale2pmu_read_event_select(); @@ -643,16 +648,17 @@ xscale2pmu_enable_event(struct hw_perf_event *hwc, int idx) return; } - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); xscale2pmu_write_event_select(evtsel); xscale2pmu_write_int_enable(ien); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void xscale2pmu_disable_event(struct hw_perf_event *hwc, int idx) { unsigned long flags, ien, evtsel; + struct cpu_hw_events *events = armpmu->get_hw_events(); ien = xscale2pmu_read_int_enable(); evtsel = xscale2pmu_read_event_select(); @@ -686,10 +692,10 @@ xscale2pmu_disable_event(struct hw_perf_event *hwc, int idx) return; } - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); xscale2pmu_write_event_select(evtsel); xscale2pmu_write_int_enable(ien); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static int @@ -712,24 +718,26 @@ static void xscale2pmu_start(void) { unsigned long flags, val; + struct cpu_hw_events *events = armpmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = xscale2pmu_read_pmnc() & ~XSCALE_PMU_CNT64; val |= XSCALE_PMU_ENABLE; xscale2pmu_write_pmnc(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void xscale2pmu_stop(void) { unsigned long flags, val; + struct cpu_hw_events *events = armpmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = xscale2pmu_read_pmnc(); val &= ~XSCALE_PMU_ENABLE; xscale2pmu_write_pmnc(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static inline u32