diff mbox series

[v5,1/2] arm64: perf: Add support caps in sysfs

Message ID 1595328573-12751-1-git-send-email-zhangshaokun@hisilicon.com (mailing list archive)
State New, archived
Headers show
Series [v5,1/2] arm64: perf: Add support caps in sysfs | expand

Commit Message

Shaokun Zhang July 21, 2020, 10:49 a.m. UTC
ARMv8.4-PMU introduces the PMMIR_EL1 registers and some new PMU events,
like STALL_SLOT etc, are related to it. Let's add a caps directory to
/sys/bus/event_source/devices/armv8_pmuv3_0/ and support slots from
PMMIR_EL1 registers in this entry. The user programs can get the slots
from sysfs directly.

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
---
ChangeLog in v5:
    * Add check STALL_SLOT in PMCEID1_EL0

ChangeLog in v4:
    * Address Will's comments.

ChangeLog in v3:
    * Fix one typo in patch3

ChangeLog in v2:
    * Add caps entry in sysfs
    * Fix the PMU events typos
    * Add one new patch to correct event ID in sysfs

 arch/arm64/include/asm/sysreg.h |   2 +
 arch/arm64/kernel/perf_event.c  | 127 ++++++++++++++++++++++++++++------------
 include/linux/perf/arm_pmu.h    |   1 +
 3 files changed, 93 insertions(+), 37 deletions(-)

Comments

Will Deacon July 21, 2020, 11:56 a.m. UTC | #1
Hi Shaokun,

On Tue, Jul 21, 2020 at 06:49:32PM +0800, Shaokun Zhang wrote:
> ARMv8.4-PMU introduces the PMMIR_EL1 registers and some new PMU events,
> like STALL_SLOT etc, are related to it. Let's add a caps directory to
> /sys/bus/event_source/devices/armv8_pmuv3_0/ and support slots from
> PMMIR_EL1 registers in this entry. The user programs can get the slots
> from sysfs directly.
> 
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
> ---
> ChangeLog in v5:
>     * Add check STALL_SLOT in PMCEID1_EL0

Thanks. I was just about to apply this, but then I realised that it's
completely broken for big.LITTLE :( One CPU might have PMMIR_EL1, but
another might not and so code such as:

> +static umode_t
> +armv8pmu_caps_attr_is_visible(struct kobject *kobj, struct attribute *attr,
> +			      int unused)
> +{
> +	int pmuver = armv8pmu_get_pmu_version();
> +	u32 pmceid1 = read_sysreg(pmceid1_el0);
> +
> +	/* Check the PMU version is >= v8.4 and STALL_SLOT is implemented */
> +	if (pmuver >= ID_AA64DFR0_PMUVER_8_4 && (pmceid1 & BIT(31)))
> +		return attr->mode;
> +
> +	return 0;
> +}
> +
> +static ssize_t slots_show(struct device *dev, struct device_attribute *attr,
> +			  char *buf)
> +{
> +	int slots = read_sysreg_s(SYS_PMMIR_EL1) & 0xFF;
> +
> +	return snprintf(buf, PAGE_SIZE, "%d\n", slots);
> +}

Is dangerous if you can migrate between the two functions.

So I think what we need to do is use the cpu_pmu structure to stash the
PMMIR_EL1 register during probe, setting it to zero for CPUs without it,
and then you can just report that value on slots_show(), getting rid of
armv8pmu_caps_attr_is_visible() entirely.

Does that make sense? Sorry I didn't spot this earlier.

Will
Will Deacon July 21, 2020, 12:07 p.m. UTC | #2
On Tue, 21 Jul 2020 18:49:32 +0800, Shaokun Zhang wrote:
> ARMv8.4-PMU introduces the PMMIR_EL1 registers and some new PMU events,
> like STALL_SLOT etc, are related to it. Let's add a caps directory to
> /sys/bus/event_source/devices/armv8_pmuv3_0/ and support slots from
> PMMIR_EL1 registers in this entry. The user programs can get the slots
> from sysfs directly.

Applied to will (for-next/perf), thanks!

(Please note that I only applied the second patch)

[2/2] arm64: perf: Expose some new events via sysfs
      https://git.kernel.org/will/c/55fdc1f44cd6

Cheers,
Mark Rutland July 21, 2020, 12:17 p.m. UTC | #3
On Tue, Jul 21, 2020 at 06:49:32PM +0800, Shaokun Zhang wrote:
> ARMv8.4-PMU introduces the PMMIR_EL1 registers and some new PMU events,
> like STALL_SLOT etc, are related to it. Let's add a caps directory to
> /sys/bus/event_source/devices/armv8_pmuv3_0/ and support slots from
> PMMIR_EL1 registers in this entry. The user programs can get the slots
> from sysfs directly.
> 
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>

[...]

> +static int armv8pmu_get_pmu_version(void)
> +{
> +	int pmuver;
> +	u64 dfr0;
> +
> +	dfr0 = read_sysreg(id_aa64dfr0_el1);
> +	pmuver = cpuid_feature_extract_unsigned_field(dfr0,
> +			ID_AA64DFR0_PMUVER_SHIFT);
> +
> +	return pmuver;
> +}
> +
> +static umode_t
> +armv8pmu_caps_attr_is_visible(struct kobject *kobj, struct attribute *attr,
> +			      int unused)
> +{
> +	int pmuver = armv8pmu_get_pmu_version();
> +	u32 pmceid1 = read_sysreg(pmceid1_el0);
> +
> +	/* Check the PMU version is >= v8.4 and STALL_SLOT is implemented */
> +	if (pmuver >= ID_AA64DFR0_PMUVER_8_4 && (pmceid1 & BIT(31)))
> +		return attr->mode;
> +
> +	return 0;
> +}
> +
> +static ssize_t slots_show(struct device *dev, struct device_attribute *attr,
> +			  char *buf)
> +{
> +	int slots = read_sysreg_s(SYS_PMMIR_EL1) & 0xFF;
> +
> +	return snprintf(buf, PAGE_SIZE, "%d\n", slots);
> +}

You'll need to use the snapshots of these fields we take at probe time;
it's not safe to use read_sysreg() here as on a system with
heterogeneous CPUs this code can be running on one CPU while it's
printing information for another CPU.

Mark.
Mark Rutland July 21, 2020, 12:24 p.m. UTC | #4
On Tue, Jul 21, 2020 at 01:07:24PM +0100, Will Deacon wrote:
> On Tue, 21 Jul 2020 18:49:32 +0800, Shaokun Zhang wrote:
> > ARMv8.4-PMU introduces the PMMIR_EL1 registers and some new PMU events,
> > like STALL_SLOT etc, are related to it. Let's add a caps directory to
> > /sys/bus/event_source/devices/armv8_pmuv3_0/ and support slots from
> > PMMIR_EL1 registers in this entry. The user programs can get the slots
> > from sysfs directly.
> 
> Applied to will (for-next/perf), thanks!
> 
> (Please note that I only applied the second patch)
> 
> [2/2] arm64: perf: Expose some new events via sysfs
>       https://git.kernel.org/will/c/55fdc1f44cd6

FWIW, that looks good to me.

Mark.
Shaokun Zhang July 22, 2020, 2:25 a.m. UTC | #5
Hi Will,

在 2020/7/21 19:56, Will Deacon 写道:
> Hi Shaokun,
> 
> On Tue, Jul 21, 2020 at 06:49:32PM +0800, Shaokun Zhang wrote:
>> ARMv8.4-PMU introduces the PMMIR_EL1 registers and some new PMU events,
>> like STALL_SLOT etc, are related to it. Let's add a caps directory to
>> /sys/bus/event_source/devices/armv8_pmuv3_0/ and support slots from
>> PMMIR_EL1 registers in this entry. The user programs can get the slots
>> from sysfs directly.
>>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
>> ---
>> ChangeLog in v5:
>>     * Add check STALL_SLOT in PMCEID1_EL0
> 
> Thanks. I was just about to apply this, but then I realised that it's
> completely broken for big.LITTLE :( One CPU might have PMMIR_EL1, but

My bad, I miss it completely.

> another might not and so code such as:
> 
>> +static umode_t
>> +armv8pmu_caps_attr_is_visible(struct kobject *kobj, struct attribute *attr,
>> +			      int unused)
>> +{
>> +	int pmuver = armv8pmu_get_pmu_version();
>> +	u32 pmceid1 = read_sysreg(pmceid1_el0);
>> +
>> +	/* Check the PMU version is >= v8.4 and STALL_SLOT is implemented */
>> +	if (pmuver >= ID_AA64DFR0_PMUVER_8_4 && (pmceid1 & BIT(31)))
>> +		return attr->mode;
>> +
>> +	return 0;
>> +}
>> +
>> +static ssize_t slots_show(struct device *dev, struct device_attribute *attr,
>> +			  char *buf)
>> +{
>> +	int slots = read_sysreg_s(SYS_PMMIR_EL1) & 0xFF;
>> +
>> +	return snprintf(buf, PAGE_SIZE, "%d\n", slots);
>> +}
> 
> Is dangerous if you can migrate between the two functions.
> 
> So I think what we need to do is use the cpu_pmu structure to stash the
> PMMIR_EL1 register during probe, setting it to zero for CPUs without it,
> and then you can just report that value on slots_show(), getting rid of

Got it.

> armv8pmu_caps_attr_is_visible() entirely.
> 

Do we need to check whether PMMIR_EL1 register is !0 and return attr->mode?
If not, it shall remove it completely.

> Does that make sense? Sorry I didn't spot this earlier.
> 

Thanks Will's detailed explanation, I will do it soon.

Shaokun,

> Will
> 
> .
>
Shaokun Zhang July 22, 2020, 2:32 a.m. UTC | #6
Hi Mark,

在 2020/7/21 20:17, Mark Rutland 写道:
> On Tue, Jul 21, 2020 at 06:49:32PM +0800, Shaokun Zhang wrote:
>> ARMv8.4-PMU introduces the PMMIR_EL1 registers and some new PMU events,
>> like STALL_SLOT etc, are related to it. Let's add a caps directory to
>> /sys/bus/event_source/devices/armv8_pmuv3_0/ and support slots from
>> PMMIR_EL1 registers in this entry. The user programs can get the slots
>> from sysfs directly.
>>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
> 
> [...]
> 
>> +static int armv8pmu_get_pmu_version(void)
>> +{
>> +	int pmuver;
>> +	u64 dfr0;
>> +
>> +	dfr0 = read_sysreg(id_aa64dfr0_el1);
>> +	pmuver = cpuid_feature_extract_unsigned_field(dfr0,
>> +			ID_AA64DFR0_PMUVER_SHIFT);
>> +
>> +	return pmuver;
>> +}
>> +
>> +static umode_t
>> +armv8pmu_caps_attr_is_visible(struct kobject *kobj, struct attribute *attr,
>> +			      int unused)
>> +{
>> +	int pmuver = armv8pmu_get_pmu_version();
>> +	u32 pmceid1 = read_sysreg(pmceid1_el0);
>> +
>> +	/* Check the PMU version is >= v8.4 and STALL_SLOT is implemented */
>> +	if (pmuver >= ID_AA64DFR0_PMUVER_8_4 && (pmceid1 & BIT(31)))
>> +		return attr->mode;
>> +
>> +	return 0;
>> +}
>> +
>> +static ssize_t slots_show(struct device *dev, struct device_attribute *attr,
>> +			  char *buf)
>> +{
>> +	int slots = read_sysreg_s(SYS_PMMIR_EL1) & 0xFF;
>> +
>> +	return snprintf(buf, PAGE_SIZE, "%d\n", slots);
>> +}
> 
> You'll need to use the snapshots of these fields we take at probe time;

Ok,

> it's not safe to use read_sysreg() here as on a system with
> heterogeneous CPUs this code can be running on one CPU while it's

Apologies, I did not realize it at the beginning and will fix it in next
version.

Thanks,
Shaokun

> printing information for another CPU.
> 
> Mark.
> 
> .
>
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 463175f80341..56c45a9207c7 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -321,6 +321,8 @@ 
 #define SYS_PMINTENSET_EL1		sys_reg(3, 0, 9, 14, 1)
 #define SYS_PMINTENCLR_EL1		sys_reg(3, 0, 9, 14, 2)
 
+#define SYS_PMMIR_EL1			sys_reg(3, 0, 9, 14, 6)
+
 #define SYS_MAIR_EL1			sys_reg(3, 0, 10, 2, 0)
 #define SYS_AMAIR_EL1			sys_reg(3, 0, 10, 3, 0)
 
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 4d7879484cec..1be042e1e565 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -277,6 +277,53 @@  static struct attribute_group armv8_pmuv3_format_attr_group = {
 	.attrs = armv8_pmuv3_format_attrs,
 };
 
+static int armv8pmu_get_pmu_version(void)
+{
+	int pmuver;
+	u64 dfr0;
+
+	dfr0 = read_sysreg(id_aa64dfr0_el1);
+	pmuver = cpuid_feature_extract_unsigned_field(dfr0,
+			ID_AA64DFR0_PMUVER_SHIFT);
+
+	return pmuver;
+}
+
+static umode_t
+armv8pmu_caps_attr_is_visible(struct kobject *kobj, struct attribute *attr,
+			      int unused)
+{
+	int pmuver = armv8pmu_get_pmu_version();
+	u32 pmceid1 = read_sysreg(pmceid1_el0);
+
+	/* Check the PMU version is >= v8.4 and STALL_SLOT is implemented */
+	if (pmuver >= ID_AA64DFR0_PMUVER_8_4 && (pmceid1 & BIT(31)))
+		return attr->mode;
+
+	return 0;
+}
+
+static ssize_t slots_show(struct device *dev, struct device_attribute *attr,
+			  char *buf)
+{
+	int slots = read_sysreg_s(SYS_PMMIR_EL1) & 0xFF;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", slots);
+}
+
+static DEVICE_ATTR_RO(slots);
+
+static struct attribute *armv8_pmuv3_caps_attrs[] = {
+	&dev_attr_slots.attr,
+	NULL,
+};
+
+static struct attribute_group armv8_pmuv3_caps_attr_group = {
+	.name = "caps",
+	.attrs = armv8_pmuv3_caps_attrs,
+	.is_visible = armv8pmu_caps_attr_is_visible,
+};
+
 /*
  * Perf Events' indices
  */
@@ -940,14 +987,11 @@  static void __armv8pmu_probe_pmu(void *info)
 {
 	struct armv8pmu_probe_info *probe = info;
 	struct arm_pmu *cpu_pmu = probe->pmu;
-	u64 dfr0;
 	u64 pmceid_raw[2];
 	u32 pmceid[2];
 	int pmuver;
 
-	dfr0 = read_sysreg(id_aa64dfr0_el1);
-	pmuver = cpuid_feature_extract_unsigned_field(dfr0,
-			ID_AA64DFR0_PMUVER_SHIFT);
+	pmuver = armv8pmu_get_pmu_version();
 	if (pmuver == 0xf || pmuver == 0)
 		return;
 
@@ -994,7 +1038,8 @@  static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
 static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
 			  int (*map_event)(struct perf_event *event),
 			  const struct attribute_group *events,
-			  const struct attribute_group *format)
+			  const struct attribute_group *format,
+			  const struct attribute_group *caps)
 {
 	int ret = armv8pmu_probe_pmu(cpu_pmu);
 	if (ret)
@@ -1019,104 +1064,112 @@  static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
 			events : &armv8_pmuv3_events_attr_group;
 	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = format ?
 			format : &armv8_pmuv3_format_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_CAPS] = caps ?
+			caps : &armv8_pmuv3_caps_attr_group;
 
 	return 0;
 }
 
+static int armv8_pmu_init_nogroups(struct arm_pmu *cpu_pmu, char *name,
+				   int (*map_event)(struct perf_event *event))
+{
+	return armv8_pmu_init(cpu_pmu, name, map_event, NULL, NULL, NULL);
+}
+
 static int armv8_pmuv3_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_pmuv3",
-			      armv8_pmuv3_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_pmuv3",
+				       armv8_pmuv3_map_event);
 }
 
 static int armv8_a34_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_cortex_a34",
-			      armv8_pmuv3_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a34",
+				       armv8_pmuv3_map_event);
 }
 
 static int armv8_a35_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_cortex_a35",
-			      armv8_a53_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a35",
+				       armv8_a53_map_event);
 }
 
 static int armv8_a53_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_cortex_a53",
-			      armv8_a53_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a53",
+				       armv8_a53_map_event);
 }
 
 static int armv8_a55_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_cortex_a55",
-			      armv8_pmuv3_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a55",
+				       armv8_pmuv3_map_event);
 }
 
 static int armv8_a57_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_cortex_a57",
-			      armv8_a57_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a57",
+				       armv8_a57_map_event);
 }
 
 static int armv8_a65_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_cortex_a65",
-			      armv8_pmuv3_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a65",
+				       armv8_pmuv3_map_event);
 }
 
 static int armv8_a72_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_cortex_a72",
-			      armv8_a57_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a72",
+				       armv8_a57_map_event);
 }
 
 static int armv8_a73_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_cortex_a73",
-			      armv8_a73_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a73",
+				       armv8_a73_map_event);
 }
 
 static int armv8_a75_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_cortex_a75",
-			      armv8_pmuv3_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a75",
+				       armv8_pmuv3_map_event);
 }
 
 static int armv8_a76_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_cortex_a76",
-			      armv8_pmuv3_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a76",
+				       armv8_pmuv3_map_event);
 }
 
 static int armv8_a77_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_cortex_a77",
-			      armv8_pmuv3_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a77",
+				       armv8_pmuv3_map_event);
 }
 
 static int armv8_e1_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_neoverse_e1",
-			      armv8_pmuv3_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_neoverse_e1",
+				       armv8_pmuv3_map_event);
 }
 
 static int armv8_n1_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_neoverse_n1",
-			      armv8_pmuv3_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_neoverse_n1",
+				       armv8_pmuv3_map_event);
 }
 
 static int armv8_thunder_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_cavium_thunder",
-			      armv8_thunder_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cavium_thunder",
+				       armv8_thunder_map_event);
 }
 
 static int armv8_vulcan_pmu_init(struct arm_pmu *cpu_pmu)
 {
-	return armv8_pmu_init(cpu_pmu, "armv8_brcm_vulcan",
-			      armv8_vulcan_map_event, NULL, NULL);
+	return armv8_pmu_init_nogroups(cpu_pmu, "armv8_brcm_vulcan",
+				       armv8_vulcan_map_event);
 }
 
 static const struct of_device_id armv8_pmu_of_device_ids[] = {
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 5b616dde9a4c..1e129b57d51a 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -73,6 +73,7 @@  enum armpmu_attr_groups {
 	ARMPMU_ATTR_GROUP_COMMON,
 	ARMPMU_ATTR_GROUP_EVENTS,
 	ARMPMU_ATTR_GROUP_FORMATS,
+	ARMPMU_ATTR_GROUP_CAPS,
 	ARMPMU_NR_ATTR_GROUPS
 };