Message ID | 20230710122138.1450930-2-james.clark@arm.com (mailing list archive) |
---|---|
State | New, archived |
Series | arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability |
On Mon, Jul 10, 2023 at 5:22 AM James Clark <james.clark@arm.com> wrote:
>
> This capability gives us the ability to open PERF_TYPE_HARDWARE and
> PERF_TYPE_HW_CACHE events on a specific PMU for free. All the
> implementation is contained in the Perf core and tool code so no change
> to the Arm PMU driver is needed.
>
> The following basic use case now results in Perf opening the event on
> all PMUs rather than picking only one in an unpredictable way:
>
>   $ perf stat -e cycles -- taskset --cpu-list 0,1 stress -c 2
>
>    Performance counter stats for 'taskset --cpu-list 0,1 stress -c 2':
>
>          963279620      armv8_cortex_a57/cycles/                (99.19%)
>          752745657      armv8_cortex_a53/cycles/                (94.80%)
>
> Fixes: 55bcf6ef314a ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")
> Suggested-by: Ian Rogers <irogers@google.com>
> Signed-off-by: James Clark <james.clark@arm.com>

Acked-by: Ian Rogers <irogers@google.com>

Thanks,
Ian

> ---
>  drivers/perf/arm_pmu.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 277e29fbd504..d8844a9461a2 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -875,8 +875,13 @@ struct arm_pmu *armpmu_alloc(void)
>                  * configuration (e.g. big.LITTLE). This is not an uncore PMU,
>                  * and we have taken ctx sharing into account (e.g. with our
>                  * pmu::filter callback and pmu::event_init group validation).
> +                *
> +                * PERF_PMU_CAP_EXTENDED_HW_TYPE is required to open the legacy
> +                * PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE events on a
> +                * specific PMU.
>                  */
> -               .capabilities   = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS,
> +               .capabilities   = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS |
> +                                 PERF_PMU_CAP_EXTENDED_HW_TYPE,
>         };
>
>         pmu->attr_groups[ARMPMU_ATTR_GROUP_COMMON] =
> --
> 2.34.1
>
On 7/10/23 17:51, James Clark wrote:
> This capability gives us the ability to open PERF_TYPE_HARDWARE and
> PERF_TYPE_HW_CACHE events on a specific PMU for free. All the
> implementation is contained in the Perf core and tool code so no change
> to the Arm PMU driver is needed.
>
> The following basic use case now results in Perf opening the event on
> all PMUs rather than picking only one in an unpredictable way:
>
>   $ perf stat -e cycles -- taskset --cpu-list 0,1 stress -c 2
>
>    Performance counter stats for 'taskset --cpu-list 0,1 stress -c 2':
>
>          963279620      armv8_cortex_a57/cycles/                (99.19%)
>          752745657      armv8_cortex_a53/cycles/                (94.80%)
>
> Fixes: 55bcf6ef314a ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")
> Suggested-by: Ian Rogers <irogers@google.com>
> Signed-off-by: James Clark <james.clark@arm.com>
> ---
>  drivers/perf/arm_pmu.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 277e29fbd504..d8844a9461a2 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -875,8 +875,13 @@ struct arm_pmu *armpmu_alloc(void)
>                  * configuration (e.g. big.LITTLE). This is not an uncore PMU,
>                  * and we have taken ctx sharing into account (e.g. with our
>                  * pmu::filter callback and pmu::event_init group validation).
> +                *
> +                * PERF_PMU_CAP_EXTENDED_HW_TYPE is required to open the legacy

s/legacy/generic ? These hardware events are still around.

> +                * PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE events on a
> +                * specific PMU.
>                  */
> -               .capabilities   = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS,
> +               .capabilities   = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS |
> +                                 PERF_PMU_CAP_EXTENDED_HW_TYPE,
>         };
>
>         pmu->attr_groups[ARMPMU_ATTR_GROUP_COMMON] =
On 11/07/2023 13:01, Anshuman Khandual wrote:
>
>
> On 7/10/23 17:51, James Clark wrote:
>> This capability gives us the ability to open PERF_TYPE_HARDWARE and
>> PERF_TYPE_HW_CACHE events on a specific PMU for free. All the
>> implementation is contained in the Perf core and tool code so no change
>> to the Arm PMU driver is needed.
>>
>> The following basic use case now results in Perf opening the event on
>> all PMUs rather than picking only one in an unpredictable way:
>>
>>   $ perf stat -e cycles -- taskset --cpu-list 0,1 stress -c 2
>>
>>    Performance counter stats for 'taskset --cpu-list 0,1 stress -c 2':
>>
>>          963279620      armv8_cortex_a57/cycles/                (99.19%)
>>          752745657      armv8_cortex_a53/cycles/                (94.80%)
>>
>> Fixes: 55bcf6ef314a ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")
>> Suggested-by: Ian Rogers <irogers@google.com>
>> Signed-off-by: James Clark <james.clark@arm.com>
>> ---
>>  drivers/perf/arm_pmu.c | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
>> index 277e29fbd504..d8844a9461a2 100644
>> --- a/drivers/perf/arm_pmu.c
>> +++ b/drivers/perf/arm_pmu.c
>> @@ -875,8 +875,13 @@ struct arm_pmu *armpmu_alloc(void)
>>                  * configuration (e.g. big.LITTLE). This is not an uncore PMU,
>>                  * and we have taken ctx sharing into account (e.g. with our
>>                  * pmu::filter callback and pmu::event_init group validation).
>> +                *
>> +                * PERF_PMU_CAP_EXTENDED_HW_TYPE is required to open the legacy
>
> s/legacy/generic ? These hardware events are still around.

True, I thought I saw it mentioned that way somewhere, but I can
probably just remove it altogether. PERF_TYPE_HARDWARE and
PERF_TYPE_HW_CACHE is enough.

>
>> +                * PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE events on a
>> +                * specific PMU.
>>                  */
>> -               .capabilities   = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS,
>> +               .capabilities   = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS |
>> +                                 PERF_PMU_CAP_EXTENDED_HW_TYPE,
>>         };
>>
>>         pmu->attr_groups[ARMPMU_ATTR_GROUP_COMMON] =
On Tue, Jul 11, 2023 at 7:12 AM James Clark <james.clark@arm.com> wrote:
>
>
>
> On 11/07/2023 13:01, Anshuman Khandual wrote:
> >
> >
> > On 7/10/23 17:51, James Clark wrote:
> >> This capability gives us the ability to open PERF_TYPE_HARDWARE and
> >> PERF_TYPE_HW_CACHE events on a specific PMU for free. All the
> >> implementation is contained in the Perf core and tool code so no change
> >> to the Arm PMU driver is needed.
> >>
> >> The following basic use case now results in Perf opening the event on
> >> all PMUs rather than picking only one in an unpredictable way:
> >>
> >>   $ perf stat -e cycles -- taskset --cpu-list 0,1 stress -c 2
> >>
> >>    Performance counter stats for 'taskset --cpu-list 0,1 stress -c 2':
> >>
> >>          963279620      armv8_cortex_a57/cycles/                (99.19%)
> >>          752745657      armv8_cortex_a53/cycles/                (94.80%)
> >>
> >> Fixes: 55bcf6ef314a ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")
> >> Suggested-by: Ian Rogers <irogers@google.com>
> >> Signed-off-by: James Clark <james.clark@arm.com>

Hi ARM Linux and ARM Linux PMU people,

Could this patch be picked up for Linux 6.5? I don't see it in the
tree and it seems a shame to have to wait for it. The other patches do
cleanup and so waiting for 6.6 seems okay.

Thanks,
Ian

> >> ---
> >>  drivers/perf/arm_pmu.c | 7 ++++++-
> >>  1 file changed, 6 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> >> index 277e29fbd504..d8844a9461a2 100644
> >> --- a/drivers/perf/arm_pmu.c
> >> +++ b/drivers/perf/arm_pmu.c
> >> @@ -875,8 +875,13 @@ struct arm_pmu *armpmu_alloc(void)
> >>                  * configuration (e.g. big.LITTLE). This is not an uncore PMU,
> >>                  * and we have taken ctx sharing into account (e.g. with our
> >>                  * pmu::filter callback and pmu::event_init group validation).
> >> +                *
> >> +                * PERF_PMU_CAP_EXTENDED_HW_TYPE is required to open the legacy
> >
> > s/legacy/generic ? These hardware events are still around.
>
> True, I thought I saw it mentioned that way somewhere, but I can
> probably just remove it altogether. PERF_TYPE_HARDWARE and
> PERF_TYPE_HW_CACHE is enough.
>
> >
> >> +                * PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE events on a
> >> +                * specific PMU.
> >>                  */
> >> -               .capabilities   = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS,
> >> +               .capabilities   = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS |
> >> +                                 PERF_PMU_CAP_EXTENDED_HW_TYPE,
> >>         };
> >>
> >>         pmu->attr_groups[ARMPMU_ATTR_GROUP_COMMON] =
On Thu, Jul 20, 2023 at 10:12:21AM -0700, Ian Rogers wrote:
> On Tue, Jul 11, 2023 at 7:12 AM James Clark <james.clark@arm.com> wrote:
> >
> >
> >
> > On 11/07/2023 13:01, Anshuman Khandual wrote:
> > >
> > >
> > > On 7/10/23 17:51, James Clark wrote:
> > >> This capability gives us the ability to open PERF_TYPE_HARDWARE and
> > >> PERF_TYPE_HW_CACHE events on a specific PMU for free. All the
> > >> implementation is contained in the Perf core and tool code so no change
> > >> to the Arm PMU driver is needed.
> > >>
> > >> The following basic use case now results in Perf opening the event on
> > >> all PMUs rather than picking only one in an unpredictable way:
> > >>
> > >>   $ perf stat -e cycles -- taskset --cpu-list 0,1 stress -c 2
> > >>
> > >>    Performance counter stats for 'taskset --cpu-list 0,1 stress -c 2':
> > >>
> > >>          963279620      armv8_cortex_a57/cycles/                (99.19%)
> > >>          752745657      armv8_cortex_a53/cycles/                (94.80%)
> > >>
> > >> Fixes: 55bcf6ef314a ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")
> > >> Suggested-by: Ian Rogers <irogers@google.com>
> > >> Signed-off-by: James Clark <james.clark@arm.com>
>
> Hi ARM Linux and ARM Linux PMU people,
>
> Could this patch be picked up for Linux 6.5? I don't see it in the
> tree and it seems a shame to have to wait for it. The other patches do
> cleanup and so waiting for 6.6 seems okay.

I'm only taking fixes for 6.5 and I don't think this qualifies. If it
was an oversight introduced during the recent merge window, then I'd be
happier fixing it up, but 55bcf6ef314a was merged ages ago (v5.12?), so
I think we can wait.

I'll be queuing perf changes for 6.6 next week, so I'll look at this then.

Cheers,

Will
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 277e29fbd504..d8844a9461a2 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -875,8 +875,13 @@ struct arm_pmu *armpmu_alloc(void)
                 * configuration (e.g. big.LITTLE). This is not an uncore PMU,
                 * and we have taken ctx sharing into account (e.g. with our
                 * pmu::filter callback and pmu::event_init group validation).
+                *
+                * PERF_PMU_CAP_EXTENDED_HW_TYPE is required to open the legacy
+                * PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE events on a
+                * specific PMU.
                 */
-               .capabilities   = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS,
+               .capabilities   = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS |
+                                 PERF_PMU_CAP_EXTENDED_HW_TYPE,
        };
 
        pmu->attr_groups[ARMPMU_ATTR_GROUP_COMMON] =
This capability gives us the ability to open PERF_TYPE_HARDWARE and
PERF_TYPE_HW_CACHE events on a specific PMU for free. All the
implementation is contained in the Perf core and tool code so no change
to the Arm PMU driver is needed.

The following basic use case now results in Perf opening the event on
all PMUs rather than picking only one in an unpredictable way:

  $ perf stat -e cycles -- taskset --cpu-list 0,1 stress -c 2

   Performance counter stats for 'taskset --cpu-list 0,1 stress -c 2':

         963279620      armv8_cortex_a57/cycles/                (99.19%)
         752745657      armv8_cortex_a53/cycles/                (94.80%)

Fixes: 55bcf6ef314a ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
---
 drivers/perf/arm_pmu.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
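
For readers wondering what "opening a PERF_TYPE_HARDWARE event on a specific PMU" looks like from userspace, here is a minimal sketch, not taken from the patch. It assumes the extended encoding introduced by 55bcf6ef314a, where the PMU's type ID goes in the upper 32 bits of attr.config and the generic event ID in the lower 32 bits. The SKETCH_* constants are local copies meant to mirror PERF_PMU_TYPE_SHIFT and PERF_HW_EVENT_MASK from the UAPI header, and the helper names extended_hw_config() and read_pmu_type() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed to match PERF_PMU_TYPE_SHIFT / PERF_HW_EVENT_MASK in
 * include/uapi/linux/perf_event.h; defined locally so the sketch
 * builds without kernel headers. */
#define SKETCH_PMU_TYPE_SHIFT 32
#define SKETCH_HW_EVENT_MASK  0xffffffffULL

/* Hypothetical helper: combine a PMU type ID with a generic hardware
 * event ID (e.g. 0 for PERF_COUNT_HW_CPU_CYCLES) into an extended
 * attr.config value. */
uint64_t extended_hw_config(uint32_t pmu_type, uint64_t hw_id)
{
	/* The event ID must fit in the lower 32 bits. */
	assert((hw_id & ~SKETCH_HW_EVENT_MASK) == 0);
	return ((uint64_t)pmu_type << SKETCH_PMU_TYPE_SHIFT) | hw_id;
}

/* Hypothetical helper: read a PMU's dynamic type ID from sysfs,
 * e.g. for "armv8_cortex_a57". Returns -1 if it cannot be read. */
long read_pmu_type(const char *pmu_name)
{
	char path[256];
	long type = -1;
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path),
		     "/sys/bus/event_source/devices/%s/type", pmu_name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%ld", &type) != 1)
		type = -1;
	fclose(f);
	return type;
}
```

A caller would set attr.type to PERF_TYPE_HARDWARE and attr.config to the value returned by extended_hw_config() before calling perf_event_open(). As the patch comment says, a PMU has to advertise PERF_PMU_CAP_EXTENDED_HW_TYPE for such an event to be opened on it specifically.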