From patchwork Fri Mar 10 10:46:14 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Rutland <mark.rutland@arm.com>
X-Patchwork-Id: 9615953
Return-Path: 
 <linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	9D7C760415 for <patchwork-linux-arm@patchwork.kernel.org>;
	Fri, 10 Mar 2017 10:48:19 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 742B4285A6
	for <patchwork-linux-arm@patchwork.kernel.org>;
	Fri, 10 Mar 2017 10:48:19 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 68C902870E; Fri, 10 Mar 2017 10:48:19 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=2.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID autolearn=ham version=3.3.1
Received: from bombadil.infradead.org (bombadil.infradead.org
	[65.50.211.133])
	(using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 8E366285A6
	for <patchwork-linux-arm@patchwork.kernel.org>;
	Fri, 10 Mar 2017 10:48:18 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=lists.infradead.org; s=bombadil.20170209; h=Sender:
	Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe:
	List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:References:
	In-Reply-To:Message-Id:Date:Subject:To:From:Reply-To:Content-ID:
	Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc
	:Resent-Message-ID:List-Owner;
	bh=7lbVJWma9yubMx59XdacM1zmmXjRunzfygcbwhr6rmU=;
	b=jQDL2eJ8BPBlxsy2ETfmLVtoqY
	BHZtHImST1xSk1fffBvJ8VcI+KGfGCYChmnFKNcbjBlluZEe/ssBsIf6LZ5sDI+vtFNY0nsD3XSGj
	7VDYIYy2fuJKRX/OPr/vAPjqpbM98N7pnqFYnwqAl+ZtmhkowqBUoehRcSxOjqGbV1oaqv0BsfQCM
	ClQhWoGHz7AHliyAk1SrpvGLzg803cWBQn4RKDwBZQXt5juSSjSe5Cr9B83esl5TZH9t/1hfrIzJ/
	6xkyspS367pUupr2DZZv902giybdHnkPmiQsfUvTkYwe/h3YPKjjhqVgS1HAXtBBcnSqZYPOELOz1
	msKc/QJg==;
Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org)
	by bombadil.infradead.org with esmtp (Exim 4.87 #1 (Red Hat Linux))
	id 1cmI5p-0008Or-U2; Fri, 10 Mar 2017 10:48:17 +0000
Received: from foss.arm.com ([217.140.101.70])
	by bombadil.infradead.org with esmtp (Exim 4.87 #1 (Red Hat Linux))
	id 1cmI4R-0007Qo-Go for linux-arm-kernel@lists.infradead.org;
	Fri, 10 Mar 2017 10:46:56 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])
	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 71C9F1476;
	Fri, 10 Mar 2017 02:46:35 -0800 (PST)
Received: from leverpostej.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com
	[10.72.51.249])
	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id
	C77723F220; Fri, 10 Mar 2017 02:46:34 -0800 (PST)
From: Mark Rutland <mark.rutland@arm.com>
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCHv4 2/3] drivers/perf: arm_pmu: manage interrupts per-cpu
Date: Fri, 10 Mar 2017 10:46:14 +0000
Message-Id: <1489142775-11290-3-git-send-email-mark.rutland@arm.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1489142775-11290-1-git-send-email-mark.rutland@arm.com>
References: <1489142775-11290-1-git-send-email-mark.rutland@arm.com>
X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 
X-CRM114-CacheID: sfid-20170310_024651_667823_02F064E1 
X-CRM114-Status: GOOD (  23.21  )
X-BeenThere: linux-arm-kernel@lists.infradead.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <linux-arm-kernel.lists.infradead.org>
List-Unsubscribe: 
 <http://lists.infradead.org/mailman/options/linux-arm-kernel>,
	<mailto:linux-arm-kernel-request@lists.infradead.org?subject=unsubscribe>
List-Archive: <http://lists.infradead.org/pipermail/linux-arm-kernel/>
List-Post: <mailto:linux-arm-kernel@lists.infradead.org>
List-Help: <mailto:linux-arm-kernel-request@lists.infradead.org?subject=help>
List-Subscribe: 
 <http://lists.infradead.org/mailman/listinfo/linux-arm-kernel>,
	<mailto:linux-arm-kernel-request@lists.infradead.org?subject=subscribe>
Cc: mark.rutland@arm.com, will.deacon@arm.com
MIME-Version: 1.0
Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org>
Errors-To: 
 linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org
X-Virus-Scanned: ClamAV using ClamSMTP

When requesting or freeing interrupts, we use platform_get_irq() to find
relevant irqs, backing this up with additional information in an
optional irq_affinity table.

This means that our irq request and free paths are tied to a
platform_device, and our request path must jump through a number of
hoops in order to determine the required affinity of each interrupt.

Given that the affinity must be static, we can compute the affinity once
up-front at probe time, simplifying the irq request and free paths. By
recording interrupts in a per-cpu data structure, we simplify a few
paths, and permit a subsequent rework of the request and free paths.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 drivers/perf/arm_pmu.c       | 314 ++++++++++++++++++++++---------------------
 include/linux/perf/arm_pmu.h |   3 +-
 2 files changed, 166 insertions(+), 151 deletions(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index dd0f3b2..68d9f08 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -617,94 +617,76 @@ static void cpu_pmu_disable_percpu_irq(void *data)
 
 static void cpu_pmu_free_irq(struct arm_pmu *cpu_pmu)
 {
-	int i, irq, irqs;
-	struct platform_device *pmu_device = cpu_pmu->plat_device;
+	int cpu;
 	struct pmu_hw_events __percpu *hw_events = cpu_pmu->hw_events;
 
-	irqs = min(pmu_device->num_resources, num_possible_cpus());
-
-	irq = platform_get_irq(pmu_device, 0);
-	if (irq > 0 && irq_is_percpu(irq)) {
-		on_each_cpu_mask(&cpu_pmu->supported_cpus,
-				 cpu_pmu_disable_percpu_irq, &irq, 1);
-		free_percpu_irq(irq, &hw_events->percpu_pmu);
-	} else {
-		for (i = 0; i < irqs; ++i) {
-			int cpu = i;
-
-			if (cpu_pmu->irq_affinity)
-				cpu = cpu_pmu->irq_affinity[i];
-
-			if (!cpumask_test_and_clear_cpu(cpu, &cpu_pmu->active_irqs))
-				continue;
-			irq = platform_get_irq(pmu_device, i);
-			if (irq > 0)
-				free_irq(irq, per_cpu_ptr(&hw_events->percpu_pmu, cpu));
+	for_each_cpu(cpu, &cpu_pmu->supported_cpus) {
+		int irq = per_cpu(hw_events->irq, cpu);
+		if (!irq)
+			continue;
+
+		if (irq_is_percpu(irq)) {
+			on_each_cpu_mask(&cpu_pmu->supported_cpus,
+					 cpu_pmu_disable_percpu_irq, &irq, 1);
+			free_percpu_irq(irq, &hw_events->percpu_pmu);
+
+			break;
 		}
+
+		if (!cpumask_test_and_clear_cpu(cpu, &cpu_pmu->active_irqs))
+			continue;
+
+		free_irq(irq, per_cpu_ptr(&hw_events->percpu_pmu, cpu));
 	}
 }
 
 static int cpu_pmu_request_irq(struct arm_pmu *cpu_pmu, irq_handler_t handler)
 {
-	int i, err, irq, irqs;
-	struct platform_device *pmu_device = cpu_pmu->plat_device;
+	int cpu, err;
 	struct pmu_hw_events __percpu *hw_events = cpu_pmu->hw_events;
 
-	if (!pmu_device)
-		return -ENODEV;
-
-	irqs = min(pmu_device->num_resources, num_possible_cpus());
-	if (irqs < 1) {
-		pr_warn_once("perf/ARM: No irqs for PMU defined, sampling events not supported\n");
-		return 0;
-	}
-
-	irq = platform_get_irq(pmu_device, 0);
-	if (irq > 0 && irq_is_percpu(irq)) {
-		err = request_percpu_irq(irq, handler, "arm-pmu",
-					 &hw_events->percpu_pmu);
-		if (err) {
-			pr_err("unable to request IRQ%d for ARM PMU counters\n",
-				irq);
-			return err;
-		}
-
-		on_each_cpu_mask(&cpu_pmu->supported_cpus,
-				 cpu_pmu_enable_percpu_irq, &irq, 1);
-	} else {
-		for (i = 0; i < irqs; ++i) {
-			int cpu = i;
-
-			err = 0;
-			irq = platform_get_irq(pmu_device, i);
-			if (irq < 0)
-				continue;
-
-			if (cpu_pmu->irq_affinity)
-				cpu = cpu_pmu->irq_affinity[i];
-
-			/*
-			 * If we have a single PMU interrupt that we can't shift,
-			 * assume that we're running on a uniprocessor machine and
-			 * continue. Otherwise, continue without this interrupt.
-			 */
-			if (irq_set_affinity(irq, cpumask_of(cpu)) && irqs > 1) {
-				pr_warn("unable to set irq affinity (irq=%d, cpu=%u)\n",
-					irq, cpu);
-				continue;
-			}
+	for_each_cpu(cpu, &cpu_pmu->supported_cpus) {
+		int irq = per_cpu(hw_events->irq, cpu);
+		if (!irq)
+			continue;
 
-			err = request_irq(irq, handler,
-					  IRQF_NOBALANCING | IRQF_NO_THREAD, "arm-pmu",
-					  per_cpu_ptr(&hw_events->percpu_pmu, cpu));
+		if (irq_is_percpu(irq)) {
+			err = request_percpu_irq(irq, handler, "arm-pmu",
+						 &hw_events->percpu_pmu);
 			if (err) {
 				pr_err("unable to request IRQ%d for ARM PMU counters\n",
 					irq);
 				return err;
 			}
 
-			cpumask_set_cpu(cpu, &cpu_pmu->active_irqs);
+			on_each_cpu_mask(&cpu_pmu->supported_cpus,
+					 cpu_pmu_enable_percpu_irq, &irq, 1);
+
+			break;
 		}
+
+		/*
+		 * If we have a single PMU interrupt that we can't shift,
+		 * assume that we're running on a uniprocessor machine and
+		 * continue. Otherwise, continue without this interrupt.
+		 */
+		if (irq_set_affinity(irq, cpumask_of(cpu)) &&
+		    num_possible_cpus() > 1) {
+			pr_warn("unable to set irq affinity (irq=%d, cpu=%u)\n",
+				irq, cpu);
+			continue;
+		}
+
+		err = request_irq(irq, handler,
+				  IRQF_NOBALANCING | IRQF_NO_THREAD, "arm-pmu",
+				  per_cpu_ptr(&hw_events->percpu_pmu, cpu));
+		if (err) {
+			pr_err("unable to request IRQ%d for ARM PMU counters\n",
+				irq);
+			return err;
+		}
+
+		cpumask_set_cpu(cpu, &cpu_pmu->active_irqs);
 	}
 
 	return 0;
@@ -846,10 +828,6 @@ static int cpu_pmu_init(struct arm_pmu *cpu_pmu)
 		on_each_cpu_mask(&cpu_pmu->supported_cpus, cpu_pmu->reset,
 			 cpu_pmu, 1);
 
-	/* If no interrupts available, set the corresponding capability flag */
-	if (!platform_get_irq(cpu_pmu->plat_device, 0))
-		cpu_pmu->pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
-
 	/*
 	 * This is a CPU PMU potentially in a heterogeneous configuration (e.g.
 	 * big.LITTLE). This is not an uncore PMU, and we have taken ctx
@@ -897,98 +875,133 @@ static int probe_current_pmu(struct arm_pmu *pmu,
 	return ret;
 }
 
-static int of_pmu_irq_cfg(struct arm_pmu *pmu)
+static int pmu_parse_percpu_irq(struct arm_pmu *pmu, int irq)
 {
-	int *irqs, i = 0;
-	bool using_spi = false;
-	struct platform_device *pdev = pmu->plat_device;
+	int cpu, ret;
+	struct pmu_hw_events __percpu *hw_events = pmu->hw_events;
 
-	irqs = kcalloc(pdev->num_resources, sizeof(*irqs), GFP_KERNEL);
-	if (!irqs)
-		return -ENOMEM;
+	ret = irq_get_percpu_devid_partition(irq, &pmu->supported_cpus);
+	if (ret)
+		return ret;
 
-	do {
-		struct device_node *dn;
-		int cpu, irq;
+	for_each_cpu(cpu, &pmu->supported_cpus)
+		per_cpu(hw_events->irq, cpu) = irq;
 
-		/* See if we have an affinity entry */
-		dn = of_parse_phandle(pdev->dev.of_node, "interrupt-affinity", i);
-		if (!dn)
-			break;
+	return 0;
+}
 
-		/* Check the IRQ type and prohibit a mix of PPIs and SPIs */
-		irq = platform_get_irq(pdev, i);
-		if (irq > 0) {
-			bool spi = !irq_is_percpu(irq);
-
-			if (i > 0 && spi != using_spi) {
-				pr_err("PPI/SPI IRQ type mismatch for %s!\n",
-					dn->name);
-				of_node_put(dn);
-				kfree(irqs);
-				return -EINVAL;
-			}
+static bool pmu_has_irq_affinity(struct device_node *node)
+{
+	return !!of_find_property(node, "interrupt-affinity", NULL);
+}
 
-			using_spi = spi;
-		}
+static int pmu_parse_irq_affinity(struct device_node *node, int i)
+{
+	struct device_node *dn;
+	int cpu;
 
-		/* Now look up the logical CPU number */
-		for_each_possible_cpu(cpu) {
-			struct device_node *cpu_dn;
+	/*
+	 * If we don't have an interrupt-affinity property, we guess irq
+	 * affinity matches our logical CPU order, as we used to assume.
+	 * This is fragile, so we'll warn in pmu_parse_irqs().
+	 */
+	if (!pmu_has_irq_affinity(node))
+		return i;
 
-			cpu_dn = of_cpu_device_node_get(cpu);
-			of_node_put(cpu_dn);
+	dn = of_parse_phandle(node, "interrupt-affinity", i);
+	if (!dn) {
+		pr_warn("failed to parse interrupt-affinity[%d] for %s\n",
+			i, node->name);
+		return -EINVAL;
+	}
 
-			if (dn == cpu_dn)
-				break;
-		}
+	/* Now look up the logical CPU number */
+	for_each_possible_cpu(cpu) {
+		struct device_node *cpu_dn;
+
+		cpu_dn = of_cpu_device_node_get(cpu);
+		of_node_put(cpu_dn);
 
-		if (cpu >= nr_cpu_ids) {
-			pr_warn("Failed to find logical CPU for %s\n",
-				dn->name);
-			of_node_put(dn);
-			cpumask_setall(&pmu->supported_cpus);
+		if (dn == cpu_dn)
 			break;
-		}
-		of_node_put(dn);
+	}
 
-		/* For SPIs, we need to track the affinity per IRQ */
-		if (using_spi) {
-			if (i >= pdev->num_resources)
-				break;
+	if (cpu >= nr_cpu_ids) {
+		pr_warn("failed to find logical CPU for %s\n", dn->name);
+	}
 
-			irqs[i] = cpu;
-		}
+	of_node_put(dn);
 
-		/* Keep track of the CPUs containing this PMU type */
-		cpumask_set_cpu(cpu, &pmu->supported_cpus);
-		i++;
-	} while (1);
+	return cpu;
+}
+
+static int pmu_parse_irqs(struct arm_pmu *pmu)
+{
+	int i = 0, nr_irqs;
+	struct platform_device *pdev = pmu->plat_device;
+	struct pmu_hw_events __percpu *hw_events = pmu->hw_events;
+
+	nr_irqs = platform_irq_count(pdev);
+	if (nr_irqs < 0) {
+		pr_err("unable to count PMU IRQs\n");
+		return nr_irqs;
+	}
+
+	/*
+	 * In this case we have no idea which CPUs are covered by the PMU.
+	 * To match our prior behaviour, we assume all CPUs in this case.
+	 */
+	if (nr_irqs == 0) {
+		pr_warn("no irqs for PMU, sampling events not supported\n");
+		pmu->pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
+		cpumask_setall(&pmu->supported_cpus);
+		return 0;
+	}
 
-	/* If we didn't manage to parse anything, try the interrupt affinity */
-	if (cpumask_weight(&pmu->supported_cpus) == 0) {
+	if (nr_irqs == 1) {
 		int irq = platform_get_irq(pdev, 0);
+		if (irq && irq_is_percpu(irq))
+			return pmu_parse_percpu_irq(pmu, irq);
+	}
 
-		if (irq > 0 && irq_is_percpu(irq)) {
-			/* If using PPIs, check the affinity of the partition */
-			int ret;
+	if (!pmu_has_irq_affinity(pdev->dev.of_node)) {
+		pr_warn("no interrupt-affinity property for %s, guessing.\n",
+			of_node_full_name(pdev->dev.of_node));
+	}
 
-			ret = irq_get_percpu_devid_partition(irq, &pmu->supported_cpus);
-			if (ret) {
-				kfree(irqs);
-				return ret;
-			}
-		} else {
-			/* Otherwise default to all CPUs */
-			cpumask_setall(&pmu->supported_cpus);
+	/*
+	 * Some platforms have all PMU IRQs OR'd into a single IRQ, with a
+	 * special platdata function that attempts to demux them.
+	 */
+	if (dev_get_platdata(&pdev->dev))
+		cpumask_setall(&pmu->supported_cpus);
+
+	for (i = 0; i < nr_irqs; i++) {
+		int cpu, irq;
+
+		irq = platform_get_irq(pdev, i);
+		if (WARN_ON(irq <= 0))
+			continue;
+
+		if (irq_is_percpu(irq)) {
+			pr_warn("multiple PPIs or mismatched SPI/PPI detected\n");
+			return -EINVAL;
 		}
-	}
 
-	/* If we matched up the IRQ affinities, use them to route the SPIs */
-	if (using_spi && i == pdev->num_resources)
-		pmu->irq_affinity = irqs;
-	else
-		kfree(irqs);
+		cpu = pmu_parse_irq_affinity(pdev->dev.of_node, i);
+		if (cpu < 0)
+			return cpu;
+		if (cpu >= nr_cpu_ids)
+			continue;
+
+		if (per_cpu(hw_events->irq, cpu)) {
+			pr_warn("multiple PMU IRQs for the same CPU detected\n");
+			return -EINVAL;
+		}
+
+		per_cpu(hw_events->irq, cpu) = irq;
+		cpumask_set_cpu(cpu, &pmu->supported_cpus);
+	}
 
 	return 0;
 }
@@ -1050,6 +1063,10 @@ int arm_pmu_device_probe(struct platform_device *pdev,
 
 	pmu->plat_device = pdev;
 
+	ret = pmu_parse_irqs(pmu);
+	if (ret)
+		goto out_free;
+
 	if (node && (of_id = of_match_node(of_table, pdev->dev.of_node))) {
 		init_fn = of_id->data;
 
@@ -1062,9 +1079,7 @@ int arm_pmu_device_probe(struct platform_device *pdev,
 			pmu->secure_access = false;
 		}
 
-		ret = of_pmu_irq_cfg(pmu);
-		if (!ret)
-			ret = init_fn(pmu);
+		ret = init_fn(pmu);
 	} else if (probe_table) {
 		cpumask_setall(&pmu->supported_cpus);
 		ret = probe_current_pmu(pmu, probe_table);
@@ -1097,7 +1112,6 @@ int arm_pmu_device_probe(struct platform_device *pdev,
 out_free:
 	pr_info("%s: failed to register PMU devices!\n",
 		of_node_full_name(node));
-	kfree(pmu->irq_affinity);
 	armpmu_free(pmu);
 	return ret;
 }
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 8462da2..05a3eb4 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -75,6 +75,8 @@ struct pmu_hw_events {
 	 * already have to allocate this struct per cpu.
 	 */
 	struct arm_pmu		*percpu_pmu;
+
+	int irq;
 };
 
 enum armpmu_attr_groups {
@@ -88,7 +90,6 @@ struct arm_pmu {
 	struct pmu	pmu;
 	cpumask_t	active_irqs;
 	cpumask_t	supported_cpus;
-	int		*irq_affinity;
 	char		*name;
 	irqreturn_t	(*handle_irq)(int irq_num, void *dev);
 	void		(*enable)(struct perf_event *event);