From patchwork Fri Feb 12 16:55:06 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Glauber
X-Patchwork-Id: 8293641
From: Jan Glauber
To: Will Deacon, Mark Rutland
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Jan Glauber
Subject: [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX
Date: Fri, 12 Feb 2016 17:55:06 +0100
Message-Id: <8bd93a25e069ed6428bc6c21fc53270962ccafcc.1455295032.git.jglauber@cavium.com>
X-Mailer: git-send-email 1.9.1

Provide uncore facilities for non-CPU performance counter units.
Based on Intel/AMD uncore pmu support.

The uncore PMUs can be found under /sys/bus/event_source/devices.
All counters are exported via sysfs in the corresponding events
files under the PMU directory so the perf tool can list the event names.

There are 2 points that are special in this implementation:

1) The PMU detection solely relies on PCI device detection. If a
   matching PCI device is found the PMU is created. The code can deal
   with multiple units of the same type, e.g. more than one memory
   controller.

2) Counters are summarized across the different units of the same type,
   e.g. L2C TAD 0..7 is presented as a single counter (adding the
   values from TAD 0 to 7). Although losing the ability to read a
   single value the merged values are easier to use and yield
   enough information.

Signed-off-by: Jan Glauber
---
 arch/arm64/kernel/Makefile               |   1 +
 arch/arm64/kernel/uncore/Makefile        |   1 +
 arch/arm64/kernel/uncore/uncore_cavium.c | 210 +++++++++++++++++++++++++++++++
 arch/arm64/kernel/uncore/uncore_cavium.h |  73 +++++++++++
 4 files changed, 285 insertions(+)
 create mode 100644 arch/arm64/kernel/uncore/Makefile
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium.c
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium.h

diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 83cd7e6..c2d2810 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -42,6 +42,7 @@ arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
 arm64-obj-$(CONFIG_PARAVIRT)		+= paravirt.o
+arm64-obj-$(CONFIG_ARCH_THUNDER)	+= uncore/
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/uncore/Makefile b/arch/arm64/kernel/uncore/Makefile
new file mode 100644
index 0000000..b9c72c2
--- /dev/null
+++ b/arch/arm64/kernel/uncore/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_ARCH_THUNDER) += uncore_cavium.o
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.c b/arch/arm64/kernel/uncore/uncore_cavium.c
new file mode 100644
index 0000000..0cfcc83
--- /dev/null
+++ b/arch/arm64/kernel/uncore/uncore_cavium.c
@@ -0,0 +1,210 @@
+/*
+ * Cavium Thunder uncore PMU support. Derived from Intel and AMD uncore code.
+ *
+ * Copyright (C) 2015,2016 Cavium Inc.
+ * Author: Jan Glauber
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+
+#include "uncore_cavium.h"
+
+int thunder_uncore_version;
+
+struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event)
+{
+	return NULL;
+}
+
+void thunder_uncore_read(struct perf_event *event)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev, new = 0;
+	s64 delta;
+	int i;
+
+	/*
+	 * since we do not enable counter overflow interrupts,
+	 * we do not have to worry about prev_count changing on us
+	 */
+
+	prev = local64_read(&hwc->prev_count);
+
+	/* read counter values from all units */
+	for (i = 0; i < uncore->nr_units; i++)
+		new += readq(map_offset(hwc->event_base, uncore, i));
+
+	local64_set(&hwc->prev_count, new);
+	delta = new - prev;
+	local64_add(delta, &event->count);
+}
+
+void thunder_uncore_del(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	event->pmu->stop(event, PERF_EF_UPDATE);
+
+	for (i = 0; i < uncore->num_counters; i++) {
+		if (cmpxchg(&uncore->events[i], event, NULL) == event)
+			break;
+	}
+	hwc->idx = -1;
+}
+
+int thunder_uncore_event_init(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore *uncore;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* we do not support sampling */
+	if (is_sampling_event(event))
+		return -EINVAL;
+
+	/* counters do not have these bits */
+	if (event->attr.exclude_user	||
+	    event->attr.exclude_kernel	||
+	    event->attr.exclude_host	||
+	    event->attr.exclude_guest	||
+	    event->attr.exclude_hv	||
+	    event->attr.exclude_idle)
+		return -EINVAL;
+
+	/* and we do not enable counter overflow interrupts */
+
+	uncore = event_to_thunder_uncore(event);
+	if (!uncore)
+		return -ENODEV;
+	if (!uncore->event_valid(event->attr.config))
+		return -EINVAL;
+
+	hwc->config = event->attr.config;
+	hwc->idx = -1;
+
+	/* and we don't care about CPU */
+
+	return 0;
+}
+
+static cpumask_t thunder_active_mask;
+
+static ssize_t thunder_uncore_attr_show_cpumask(struct device *dev,
+						struct device_attribute *attr,
+						char *buf)
+{
+	cpumask_t *active_mask = &thunder_active_mask;
+
+	/*
+	 * Thunder uncore events are independent from CPUs. Provide a cpumask
+	 * nevertheless to prevent perf from adding the event per-cpu and just
+	 * set the mask to one online CPU.
+	 */
+	cpumask_set_cpu(cpumask_first(cpu_online_mask), active_mask);
+
+	return cpumap_print_to_pagebuf(true, buf, active_mask);
+}
+static DEVICE_ATTR(cpumask, S_IRUGO, thunder_uncore_attr_show_cpumask, NULL);
+
+static struct attribute *thunder_uncore_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL,
+};
+
+struct attribute_group thunder_uncore_attr_group = {
+	.attrs = thunder_uncore_attrs,
+};
+
+ssize_t thunder_events_sysfs_show(struct device *dev,
+				  struct device_attribute *attr,
+				  char *page)
+{
+	struct perf_pmu_events_attr *pmu_attr =
+		container_of(attr, struct perf_pmu_events_attr, attr);
+
+	if (pmu_attr->event_str)
+		return sprintf(page, "%s", pmu_attr->event_str);
+
+	return 0;
+}
+
+int __init thunder_uncore_setup(struct thunder_uncore *uncore, int id,
+			 unsigned long offset, unsigned long size,
+			 struct pmu *pmu)
+{
+	struct pci_dev *pdev = NULL;
+	pci_bus_addr_t start;
+	int ret, node = 0;
+
+	/* detect PCI devices */
+	do {
+		pdev = pci_get_device(PCI_VENDOR_ID_CAVIUM, id, pdev);
+		if (!pdev)
+			break;
+		start = pci_resource_start(pdev, 0);
+		uncore->pdevs[node].pdev = pdev;
+		uncore->pdevs[node].base = start;
+		uncore->pdevs[node].map = ioremap(start + offset, size);
+		node++;
+		if (node >= MAX_NR_UNCORE_PDEVS) {
+			pr_err("reached pdev limit\n");
+			break;
+		}
+	} while (1);
+
+	if (!node)
+		return -ENODEV;
+
+	uncore->nr_units = node;
+
+	ret = perf_pmu_register(pmu, pmu->name, -1);
+	if (ret)
+		goto fail;
+
+	uncore->pmu = pmu;
+	return 0;
+
+fail:
+	for (node = 0; node < MAX_NR_UNCORE_PDEVS; node++) {
+		pdev = uncore->pdevs[node].pdev;
+		if (!pdev)
+			break;
+		iounmap(uncore->pdevs[node].map);
+		pci_dev_put(pdev);
+	}
+	return ret;
+}
+
+static int __init thunder_uncore_init(void)
+{
+	unsigned long implementor = read_cpuid_implementor();
+	unsigned long part_number = read_cpuid_part_number();
+	u32 variant;
+
+	if (implementor != ARM_CPU_IMP_CAVIUM ||
+	    part_number != CAVIUM_CPU_PART_THUNDERX)
+		return -ENODEV;
+
+	/* detect pass2 which contains different counters */
+	variant = MIDR_VARIANT(read_cpuid_id());
+	if (variant == 1)
+		thunder_uncore_version = 1;
+	pr_info("PMU version: %d\n", thunder_uncore_version);
+
+	return 0;
+}
+late_initcall(thunder_uncore_init);
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.h b/arch/arm64/kernel/uncore/uncore_cavium.h
new file mode 100644
index 0000000..acd121d
--- /dev/null
+++ b/arch/arm64/kernel/uncore/uncore_cavium.h
@@ -0,0 +1,73 @@
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+
+#undef pr_fmt
+#define pr_fmt(fmt)     "thunderx_uncore: " fmt
+
+enum uncore_type {
+	NOP_TYPE,
+};
+
+extern int thunder_uncore_version;
+
+#define MAX_NR_UNCORE_PDEVS		16
+
+/* maximum number of parallel hardware counters for all uncore parts */
+#define MAX_COUNTERS			64
+
+/* generic uncore struct for different pmu types */
+struct thunder_uncore {
+	int num_counters;
+	int nr_units;
+	int type;
+	struct pmu *pmu;
+	int (*event_valid)(u64);
+	struct {
+		unsigned long base;
+		void __iomem *map;
+		struct pci_dev *pdev;
+	} pdevs[MAX_NR_UNCORE_PDEVS];
+	struct perf_event *events[MAX_COUNTERS];
+};
+
+#define EVENT_PTR(_id) (&event_attr_##_id.attr.attr)
+
+#define EVENT_ATTR(_name, _val)						   \
+static struct perf_pmu_events_attr event_attr_##_name = {		   \
+	.attr	   = __ATTR(_name, 0444, thunder_events_sysfs_show, NULL), \
+	.event_str = "event=" __stringify(_val),			   \
+};
+
+#define EVENT_ATTR_STR(_name, _str)					   \
+static struct perf_pmu_events_attr event_attr_##_name = {		   \
+	.attr	   = __ATTR(_name, 0444, thunder_events_sysfs_show, NULL), \
+	.event_str = _str,						   \
+};
+
+static inline void __iomem *map_offset(unsigned long addr,
+				struct thunder_uncore *uncore, int unit)
+{
+	return (void __iomem *) (addr + uncore->pdevs[unit].map);
+}
+
+extern struct attribute_group thunder_uncore_attr_group;
+
+/* Prototypes */
+struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event);
+void thunder_uncore_del(struct perf_event *event, int flags);
+int thunder_uncore_event_init(struct perf_event *event);
+void thunder_uncore_read(struct perf_event *event);
+int thunder_uncore_setup(struct thunder_uncore *uncore, int id,
+			 unsigned long offset, unsigned long size,
+			 struct pmu *pmu);
+ssize_t thunder_events_sysfs_show(struct device *dev,
+				  struct device_attribute *attr,
+				  char *page);
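
Illustration only, not part of the patch: point 2 of the commit message
(one counter presented as the sum over all units of a type) and the delta
handling in thunder_uncore_read() can be sketched in plain userspace C.
Everything prefixed sketch_ or SKETCH_ is a hypothetical name made up for
this note; the per-unit array stands in for the readq() of each unit's
mapped register, and the plain fields stand in for the local64_t
prev_count and count of the perf core.

```c
#include <stdint.h>

#define SKETCH_MAX_UNITS 16	/* hypothetical, mirrors MAX_NR_UNCORE_PDEVS */

/* Hypothetical stand-in for one event replicated over several units. */
struct sketch_counter {
	int nr_units;
	uint64_t unit_value[SKETCH_MAX_UNITS];	/* simulated per-unit registers */
	uint64_t prev_count;			/* summed raw value at last read */
	uint64_t count;				/* accumulated event count */
};

/* Add up the raw value of every unit (the patch uses readq() here). */
static uint64_t sketch_sum_units(const struct sketch_counter *c)
{
	uint64_t sum = 0;
	int i;

	for (i = 0; i < c->nr_units; i++)
		sum += c->unit_value[i];
	return sum;
}

/* Fold the change since the previous read into the running count. */
static void sketch_read(struct sketch_counter *c)
{
	uint64_t now = sketch_sum_units(c);

	c->count += now - c->prev_count;
	c->prev_count = now;
}
```

Calling sketch_read() repeatedly accumulates only the increase of the
summed value between two reads, which is why the per-unit values cannot be
recovered from the merged counter afterwards.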