From patchwork Fri Mar 10 10:46:14 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Rutland X-Patchwork-Id: 9615953 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 9D7C760415 for ; Fri, 10 Mar 2017 10:48:19 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 742B4285A6 for ; Fri, 10 Mar 2017 10:48:19 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 68C902870E; Fri, 10 Mar 2017 10:48:19 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID autolearn=ham version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [65.50.211.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 8E366285A6 for ; Fri, 10 Mar 2017 10:48:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:References: In-Reply-To:Message-Id:Date:Subject:To:From:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=7lbVJWma9yubMx59XdacM1zmmXjRunzfygcbwhr6rmU=; b=jQDL2eJ8BPBlxsy2ETfmLVtoqY BHZtHImST1xSk1fffBvJ8VcI+KGfGCYChmnFKNcbjBlluZEe/ssBsIf6LZ5sDI+vtFNY0nsD3XSGj 7VDYIYy2fuJKRX/OPr/vAPjqpbM98N7pnqFYnwqAl+ZtmhkowqBUoehRcSxOjqGbV1oaqv0BsfQCM ClQhWoGHz7AHliyAk1SrpvGLzg803cWBQn4RKDwBZQXt5juSSjSe5Cr9B83esl5TZH9t/1hfrIzJ/ 6xkyspS367pUupr2DZZv902giybdHnkPmiQsfUvTkYwe/h3YPKjjhqVgS1HAXtBBcnSqZYPOELOz1 msKc/QJg==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.87 #1 (Red Hat Linux)) id 1cmI5p-0008Or-U2; Fri, 10 Mar 2017 10:48:17 +0000 Received: from foss.arm.com ([217.140.101.70]) by bombadil.infradead.org with esmtp (Exim 4.87 #1 (Red Hat Linux)) id 1cmI4R-0007Qo-Go for linux-arm-kernel@lists.infradead.org; Fri, 10 Mar 2017 10:46:56 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 71C9F1476; Fri, 10 Mar 2017 02:46:35 -0800 (PST) Received: from leverpostej.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.72.51.249]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id C77723F220; Fri, 10 Mar 2017 02:46:34 -0800 (PST) From: Mark Rutland To: linux-arm-kernel@lists.infradead.org Subject: [PATCHv4 2/3] drivers/perf: arm_pmu: manage interrupts per-cpu Date: Fri, 10 Mar 2017 10:46:14 +0000 Message-Id: <1489142775-11290-3-git-send-email-mark.rutland@arm.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1489142775-11290-1-git-send-email-mark.rutland@arm.com> References: <1489142775-11290-1-git-send-email-mark.rutland@arm.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20170310_024651_667823_02F064E1 X-CRM114-Status: GOOD ( 23.21 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mark.rutland@arm.com, will.deacon@arm.com MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP When requesting or freeing interrupts, we use platform_get_irq() to find relevant irqs, backing this up with additional information in an optional irq_affinity table. This means that our irq request and free paths are tied to a platform_device, and our request path must jump through a number of hoops in order to determine the required affinity of each interrupt. Given that the affinity must be static, we can compute the affinity once up-front at probe time, simplifying the irq request and free paths. By recording interrupts in a per-cpu data structure, we simplify a few paths, and permit a subsequent rework of the request and free paths. Signed-off-by: Mark Rutland Cc: Will Deacon --- drivers/perf/arm_pmu.c | 314 ++++++++++++++++++++++--------------------- include/linux/perf/arm_pmu.h | 3 +- 2 files changed, 166 insertions(+), 151 deletions(-) diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c index dd0f3b2..68d9f08 100644 --- a/drivers/perf/arm_pmu.c +++ b/drivers/perf/arm_pmu.c @@ -617,94 +617,76 @@ static void cpu_pmu_disable_percpu_irq(void *data) static void cpu_pmu_free_irq(struct arm_pmu *cpu_pmu) { - int i, irq, irqs; - struct platform_device *pmu_device = cpu_pmu->plat_device; + int cpu; struct pmu_hw_events __percpu *hw_events = cpu_pmu->hw_events; - irqs = min(pmu_device->num_resources, num_possible_cpus()); - - irq = platform_get_irq(pmu_device, 0); - if (irq > 0 && irq_is_percpu(irq)) { - on_each_cpu_mask(&cpu_pmu->supported_cpus, - cpu_pmu_disable_percpu_irq, &irq, 1); - free_percpu_irq(irq, &hw_events->percpu_pmu); - } else { - for (i = 0; i < irqs; ++i) { - int cpu = i; - - if (cpu_pmu->irq_affinity) - cpu = cpu_pmu->irq_affinity[i]; - - if (!cpumask_test_and_clear_cpu(cpu, &cpu_pmu->active_irqs)) - continue; - irq = platform_get_irq(pmu_device, i); - if (irq > 0) - free_irq(irq, per_cpu_ptr(&hw_events->percpu_pmu, cpu)); + for_each_cpu(cpu, &cpu_pmu->supported_cpus) { + int irq = per_cpu(hw_events->irq, cpu); + if (!irq) + continue; + + if (irq_is_percpu(irq)) { + on_each_cpu_mask(&cpu_pmu->supported_cpus, + cpu_pmu_disable_percpu_irq, &irq, 1); + free_percpu_irq(irq, &hw_events->percpu_pmu); + + break; } + + if (!cpumask_test_and_clear_cpu(cpu, &cpu_pmu->active_irqs)) + continue; + + free_irq(irq, per_cpu_ptr(&hw_events->percpu_pmu, cpu)); } } static int cpu_pmu_request_irq(struct arm_pmu *cpu_pmu, irq_handler_t handler) { - int i, err, irq, irqs; - struct platform_device *pmu_device = cpu_pmu->plat_device; + int cpu, err; struct pmu_hw_events __percpu *hw_events = cpu_pmu->hw_events; - if (!pmu_device) - return -ENODEV; - - irqs = min(pmu_device->num_resources, num_possible_cpus()); - if (irqs < 1) { - pr_warn_once("perf/ARM: No irqs for PMU defined, sampling events not supported\n"); - return 0; - } - - irq = platform_get_irq(pmu_device, 0); - if (irq > 0 && irq_is_percpu(irq)) { - err = request_percpu_irq(irq, handler, "arm-pmu", - &hw_events->percpu_pmu); - if (err) { - pr_err("unable to request IRQ%d for ARM PMU counters\n", - irq); - return err; - } - - on_each_cpu_mask(&cpu_pmu->supported_cpus, - cpu_pmu_enable_percpu_irq, &irq, 1); - } else { - for (i = 0; i < irqs; ++i) { - int cpu = i; - - err = 0; - irq = platform_get_irq(pmu_device, i); - if (irq < 0) - continue; - - if (cpu_pmu->irq_affinity) - cpu = cpu_pmu->irq_affinity[i]; - - /* - * If we have a single PMU interrupt that we can't shift, - * assume that we're running on a uniprocessor machine and - * continue. Otherwise, continue without this interrupt. - */ - if (irq_set_affinity(irq, cpumask_of(cpu)) && irqs > 1) { - pr_warn("unable to set irq affinity (irq=%d, cpu=%u)\n", - irq, cpu); - continue; - } + for_each_cpu(cpu, &cpu_pmu->supported_cpus) { + int irq = per_cpu(hw_events->irq, cpu); + if (!irq) + continue; - err = request_irq(irq, handler, - IRQF_NOBALANCING | IRQF_NO_THREAD, "arm-pmu", - per_cpu_ptr(&hw_events->percpu_pmu, cpu)); + if (irq_is_percpu(irq)) { + err = request_percpu_irq(irq, handler, "arm-pmu", + &hw_events->percpu_pmu); if (err) { pr_err("unable to request IRQ%d for ARM PMU counters\n", irq); return err; } - cpumask_set_cpu(cpu, &cpu_pmu->active_irqs); + on_each_cpu_mask(&cpu_pmu->supported_cpus, + cpu_pmu_enable_percpu_irq, &irq, 1); + + break; } + + /* + * If we have a single PMU interrupt that we can't shift, + * assume that we're running on a uniprocessor machine and + * continue. Otherwise, continue without this interrupt. + */ + if (irq_set_affinity(irq, cpumask_of(cpu)) && + num_possible_cpus() > 1) { + pr_warn("unable to set irq affinity (irq=%d, cpu=%u)\n", + irq, cpu); + continue; + } + + err = request_irq(irq, handler, + IRQF_NOBALANCING | IRQF_NO_THREAD, "arm-pmu", + per_cpu_ptr(&hw_events->percpu_pmu, cpu)); + if (err) { + pr_err("unable to request IRQ%d for ARM PMU counters\n", + irq); + return err; + } + + cpumask_set_cpu(cpu, &cpu_pmu->active_irqs); } return 0; @@ -846,10 +828,6 @@ static int cpu_pmu_init(struct arm_pmu *cpu_pmu) on_each_cpu_mask(&cpu_pmu->supported_cpus, cpu_pmu->reset, cpu_pmu, 1); - /* If no interrupts available, set the corresponding capability flag */ - if (!platform_get_irq(cpu_pmu->plat_device, 0)) - cpu_pmu->pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT; - /* * This is a CPU PMU potentially in a heterogeneous configuration (e.g. * big.LITTLE). This is not an uncore PMU, and we have taken ctx @@ -897,98 +875,133 @@ static int probe_current_pmu(struct arm_pmu *pmu, return ret; } -static int of_pmu_irq_cfg(struct arm_pmu *pmu) +static int pmu_parse_percpu_irq(struct arm_pmu *pmu, int irq) { - int *irqs, i = 0; - bool using_spi = false; - struct platform_device *pdev = pmu->plat_device; + int cpu, ret; + struct pmu_hw_events __percpu *hw_events = pmu->hw_events; - irqs = kcalloc(pdev->num_resources, sizeof(*irqs), GFP_KERNEL); - if (!irqs) - return -ENOMEM; + ret = irq_get_percpu_devid_partition(irq, &pmu->supported_cpus); + if (ret) + return ret; - do { - struct device_node *dn; - int cpu, irq; + for_each_cpu(cpu, &pmu->supported_cpus) + per_cpu(hw_events->irq, cpu) = irq; - /* See if we have an affinity entry */ - dn = of_parse_phandle(pdev->dev.of_node, "interrupt-affinity", i); - if (!dn) - break; + return 0; +} - /* Check the IRQ type and prohibit a mix of PPIs and SPIs */ - irq = platform_get_irq(pdev, i); - if (irq > 0) { - bool spi = !irq_is_percpu(irq); - - if (i > 0 && spi != using_spi) { - pr_err("PPI/SPI IRQ type mismatch for %s!\n", - dn->name); - of_node_put(dn); - kfree(irqs); - return -EINVAL; - } +static bool pmu_has_irq_affinity(struct device_node *node) +{ + return !!of_find_property(node, "interrupt-affinity", NULL); +} - using_spi = spi; - } +static int pmu_parse_irq_affinity(struct device_node *node, int i) +{ + struct device_node *dn; + int cpu; - /* Now look up the logical CPU number */ - for_each_possible_cpu(cpu) { - struct device_node *cpu_dn; + /* + * If we don't have an interrupt-affinity property, we guess irq + * affinity matches our logical CPU order, as we used to assume. + * This is fragile, so we'll warn in pmu_parse_irqs(). + */ + if (!pmu_has_irq_affinity(node)) + return i; - cpu_dn = of_cpu_device_node_get(cpu); - of_node_put(cpu_dn); + dn = of_parse_phandle(node, "interrupt-affinity", i); + if (!dn) { + pr_warn("failed to parse interrupt-affinity[%d] for %s\n", + i, node->name); + return -EINVAL; + } - if (dn == cpu_dn) - break; - } + /* Now look up the logical CPU number */ + for_each_possible_cpu(cpu) { + struct device_node *cpu_dn; + + cpu_dn = of_cpu_device_node_get(cpu); + of_node_put(cpu_dn); - if (cpu >= nr_cpu_ids) { - pr_warn("Failed to find logical CPU for %s\n", - dn->name); - of_node_put(dn); - cpumask_setall(&pmu->supported_cpus); + if (dn == cpu_dn) break; - } - of_node_put(dn); + } - /* For SPIs, we need to track the affinity per IRQ */ - if (using_spi) { - if (i >= pdev->num_resources) - break; + if (cpu >= nr_cpu_ids) { + pr_warn("failed to find logical CPU for %s\n", dn->name); + } - irqs[i] = cpu; - } + of_node_put(dn); - /* Keep track of the CPUs containing this PMU type */ - cpumask_set_cpu(cpu, &pmu->supported_cpus); - i++; - } while (1); + return cpu; +} + +static int pmu_parse_irqs(struct arm_pmu *pmu) +{ + int i = 0, nr_irqs; + struct platform_device *pdev = pmu->plat_device; + struct pmu_hw_events __percpu *hw_events = pmu->hw_events; + + nr_irqs = platform_irq_count(pdev); + if (nr_irqs < 0) { + pr_err("unable to count PMU IRQs\n"); + return nr_irqs; + } + + /* + * In this case we have no idea which CPUs are covered by the PMU. + * To match our prior behaviour, we assume all CPUs in this case. + */ + if (nr_irqs == 0) { + pr_warn("no irqs for PMU, sampling events not supported\n"); + pmu->pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT; + cpumask_setall(&pmu->supported_cpus); + return 0; + } - /* If we didn't manage to parse anything, try the interrupt affinity */ - if (cpumask_weight(&pmu->supported_cpus) == 0) { + if (nr_irqs == 1) { int irq = platform_get_irq(pdev, 0); + if (irq && irq_is_percpu(irq)) + return pmu_parse_percpu_irq(pmu, irq); + } - if (irq > 0 && irq_is_percpu(irq)) { - /* If using PPIs, check the affinity of the partition */ - int ret; + if (!pmu_has_irq_affinity(pdev->dev.of_node)) { + pr_warn("no interrupt-affinity property for %s, guessing.\n", + of_node_full_name(pdev->dev.of_node)); + } - ret = irq_get_percpu_devid_partition(irq, &pmu->supported_cpus); - if (ret) { - kfree(irqs); - return ret; - } - } else { - /* Otherwise default to all CPUs */ - cpumask_setall(&pmu->supported_cpus); + /* + * Some platforms have all PMU IRQs OR'd into a single IRQ, with a + * special platdata function that attempts to demux them. + */ + if (dev_get_platdata(&pdev->dev)) + cpumask_setall(&pmu->supported_cpus); + + for (i = 0; i < nr_irqs; i++) { + int cpu, irq; + + irq = platform_get_irq(pdev, i); + if (WARN_ON(irq <= 0)) + continue; + + if (irq_is_percpu(irq)) { + pr_warn("multiple PPIs or mismatched SPI/PPI detected\n"); + return -EINVAL; } - } - /* If we matched up the IRQ affinities, use them to route the SPIs */ - if (using_spi && i == pdev->num_resources) - pmu->irq_affinity = irqs; - else - kfree(irqs); + cpu = pmu_parse_irq_affinity(pdev->dev.of_node, i); + if (cpu < 0) + return cpu; + if (cpu >= nr_cpu_ids) + continue; + + if (per_cpu(hw_events->irq, cpu)) { + pr_warn("multiple PMU IRQs for the same CPU detected\n"); + return -EINVAL; + } + + per_cpu(hw_events->irq, cpu) = irq; + cpumask_set_cpu(cpu, &pmu->supported_cpus); + } return 0; } @@ -1050,6 +1063,10 @@ int arm_pmu_device_probe(struct platform_device *pdev, pmu->plat_device = pdev; + ret = pmu_parse_irqs(pmu); + if (ret) + goto out_free; + if (node && (of_id = of_match_node(of_table, pdev->dev.of_node))) { init_fn = of_id->data; @@ -1062,9 +1079,7 @@ int arm_pmu_device_probe(struct platform_device *pdev, pmu->secure_access = false; } - ret = of_pmu_irq_cfg(pmu); - if (!ret) - ret = init_fn(pmu); + ret = init_fn(pmu); } else if (probe_table) { cpumask_setall(&pmu->supported_cpus); ret = probe_current_pmu(pmu, probe_table); @@ -1097,7 +1112,6 @@ int arm_pmu_device_probe(struct platform_device *pdev, out_free: pr_info("%s: failed to register PMU devices!\n", of_node_full_name(node)); - kfree(pmu->irq_affinity); armpmu_free(pmu); return ret; } diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h index 8462da2..05a3eb4 100644 --- a/include/linux/perf/arm_pmu.h +++ b/include/linux/perf/arm_pmu.h @@ -75,6 +75,8 @@ struct pmu_hw_events { * already have to allocate this struct per cpu. */ struct arm_pmu *percpu_pmu; + + int irq; }; enum armpmu_attr_groups { @@ -88,7 +90,6 @@ struct arm_pmu { struct pmu pmu; cpumask_t active_irqs; cpumask_t supported_cpus; - int *irq_affinity; char *name; irqreturn_t (*handle_irq)(int irq_num, void *dev); void (*enable)(struct perf_event *event);