Message ID | 20240813132323.98728-1-james.clark@linaro.org (mailing list archive) |
---|---|
Series | perf stat: Make default perf stat command work on Arm big.LITTLE |
On Tue, Aug 13, 2024 at 6:24 AM James Clark <james.clark@linaro.org> wrote:
>
> The important patches are 3 and 5, the rest are tidyups and tests.
>
> I don't think there is any interaction with the other open issues
> about the uncore DSU cycles event or JSON/legacy hw event priorities
> because only hw events on core PMUs are used for the default
> stat command. And also just sharing the existing x86 code works so
> no big changes are required.
>
> For patch 3 the weak arch specific symbol has to continue to be used
> rather than picking the implementation based on
> perf_pmus__supports_extended_type() like in patch 5. This is because
> that function ends up calling evsel__hw_name() itself which results
> in recursion. But at least one weak arch_* construct has been removed,
> so it's better than nothing.

Let's not do things this way. The use of strings is architecture
neutral, means we don't need to create new arch functions on things
like RISC-V, it encapsulates the complexity of things like topdown
events, Apple ARM M CPUs not supporting legacy events, etc.
Duplicating the existing x86 logic, when that was something trying to
be removed, is not the way to go. That logic was a holdover from the
hybrid tech debt we've been working to remove with a generic approach.

Thanks,
Ian

> James Clark (7):
>   perf stat: Initialize instead of overwriting clock event
>   perf stat: Remove unused default_null_attrs
>   perf evsel: Use the same arch_evsel__hw_name() on arm64 as x86
>   perf evsel: Remove duplicated __evsel__hw_name() code
>   perf evlist: Use hybrid default attrs whenever extended type is
>     supported
>   perf test: Make stat test work on DT devices
>   perf test: Add a test for default perf stat command
>
>  tools/perf/arch/arm64/util/Build   |  1 +
>  tools/perf/arch/arm64/util/evsel.c |  7 ++++
>  tools/perf/arch/x86/util/evlist.c  | 65 ------------------------------
>  tools/perf/arch/x86/util/evsel.c   | 17 +-------
>  tools/perf/builtin-stat.c          | 12 ++----
>  tools/perf/tests/shell/stat.sh     | 33 ++++++++++++---
>  tools/perf/util/evlist.c           | 65 ++++++++++++++++++++++++++----
>  tools/perf/util/evlist.h           |  6 +--
>  tools/perf/util/evsel.c            | 19 +++++++++
>  tools/perf/util/evsel.h            |  2 +-
>  10 files changed, 119 insertions(+), 108 deletions(-)
>  create mode 100644 tools/perf/arch/arm64/util/evsel.c
>
> --
> 2.34.1
>
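For readers unfamiliar with the "weak arch specific symbol" construct mentioned in the cover letter, here is a minimal sketch of the general pattern, assuming GCC or Clang. The names demo_arch_hw_name() and demo_hw_name() are hypothetical stand-ins and are not perf's actual symbols or signatures.

```c
#include <stddef.h>
#include <stdio.h>

/* All names here are hypothetical stand-ins, not perf's real symbols. */

/*
 * Generic fallback. Because it is weak, the linker prefers a non-weak
 * definition with the same name from any other object file, e.g. one
 * that is only built for a particular architecture.
 */
__attribute__((weak)) int demo_arch_hw_name(unsigned int config, char *bf, size_t size)
{
	return snprintf(bf, size, "generic-hw-%u", config);
}

/* Common code calls the hook without knowing which definition was linked. */
int demo_hw_name(unsigned int config, char *bf, size_t size)
{
	return demo_arch_hw_name(config, bf, size);
}
```

With only this file linked, demo_hw_name(7, ...) produces "generic-hw-7". An architecture that wants different behaviour would add its own object file with a non-weak demo_arch_hw_name(). The choice is made at link time, which is why every architecture needing the alternative behaviour needs its own file.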
On 13/08/2024 3:35 pm, Ian Rogers wrote:
> On Tue, Aug 13, 2024 at 6:24 AM James Clark <james.clark@linaro.org> wrote:
>>
>> The important patches are 3 and 5, the rest are tidyups and tests.
>>
>> I don't think there is any interaction with the other open issues
>> about the uncore DSU cycles event or JSON/legacy hw event priorities
>> because only hw events on core PMUs are used for the default
>> stat command. And also just sharing the existing x86 code works so
>> no big changes are required.
>>
>> For patch 3 the weak arch specific symbol has to continue to be used
>> rather than picking the implementation based on
>> perf_pmus__supports_extended_type() like in patch 5. This is because
>> that function ends up calling evsel__hw_name() itself which results
>> in recursion. But at least one weak arch_* construct has been removed,
>> so it's better than nothing.
>
> Let's not do things this way. The use of strings is architecture
> neutral, means we don't need to create new arch functions on things
> like RISC-V, it encapsulates the complexity of things like topdown

If the new arch function is an issue that could be worked around by
calling perf_pmus__supports_extended_type() on patch 3 as well? It just
needs a small change to not recurse.

> events, Apple ARM M CPUs not supporting legacy events, etc.

If Apple M doesn't support the HW events does _any_ default Perf stat
command (hybrid or not) work? I'm not really trying to fix that here,
just make whatever already works work on big.LITTLE.

> Duplicating the existing x86 logic, when that was something trying to
> be removed, is not the way to go. That logic was a holdover from the
> hybrid tech debt we've been working to remove with a generic approach.
>
> Thanks,
> Ian
>

I think all of that may make sense, but in this case I haven't actually
duplicated anything, rather shared the existing code to also be used on
Arm.

This means we can have the default perf stat working on Arm from today,
and if any other changes get made it will continue to work as I've also
added a test for it.
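The thread does not show what the "small change to not recurse" would be. As an illustration only, one common way to break this kind of cycle is a cached probe with a re-entrancy flag. In the sketch below, probe_extended_type() and guarded_hw_name() are hypothetical stand-ins for a capability check and a name formatter that call each other. They are not perf's real functions, and the probe result is simply assumed to be true.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these are perf's real functions. */

static bool probe_done;     /* result is cached after the first probe */
static bool probe_running;  /* set while the probe itself is executing */
static bool probe_result;
static int probe_calls;     /* how many times the probe body actually ran */

int guarded_hw_name(unsigned int config, char *bf, size_t size);

/* Pretend capability probe that formats an event name as a side effect. */
static bool probe_extended_type(void)
{
	char scratch[32];

	if (probe_done)
		return probe_result;
	if (probe_running)
		return false;	/* re-entered from guarded_hw_name(): break the cycle */

	probe_running = true;
	probe_calls++;
	/* Without the guard above, this call would recurse forever. */
	guarded_hw_name(0, scratch, sizeof(scratch));
	probe_result = true;	/* assumed outcome for this sketch */
	probe_running = false;
	probe_done = true;
	return probe_result;
}

/* Picks the naming scheme at run time instead of via a weak symbol. */
int guarded_hw_name(unsigned int config, char *bf, size_t size)
{
	if (probe_extended_type())
		return snprintf(bf, size, "pmu0/hw-%u/", config);
	return snprintf(bf, size, "hw-%u", config);
}
```

The trade-off is that any name formatted while the probe is still running uses the plain form. Whether that is acceptable depends on what the real probe does with the name.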
On 13/08/2024 3:45 pm, James Clark wrote:
>
>
> On 13/08/2024 3:35 pm, Ian Rogers wrote:
>> On Tue, Aug 13, 2024 at 6:24 AM James Clark <james.clark@linaro.org>
>> wrote:
>>>
>>> The important patches are 3 and 5, the rest are tidyups and tests.
>>>
>>> I don't think there is any interaction with the other open issues
>>> about the uncore DSU cycles event or JSON/legacy hw event priorities
>>> because only hw events on core PMUs are used for the default
>>> stat command. And also just sharing the existing x86 code works so
>>> no big changes are required.
>>>
>>> For patch 3 the weak arch specific symbol has to continue to be used
>>> rather than picking the implementation based on
>>> perf_pmus__supports_extended_type() like in patch 5. This is because
>>> that function ends up calling evsel__hw_name() itself which results
>>> in recursion. But at least one weak arch_* construct has been removed,
>>> so it's better than nothing.
>>
>> Let's not do things this way. The use of strings is architecture
>> neutral, means we don't need to create new arch functions on things
>> like RISC-V, it encapsulates the complexity of things like topdown
>
> If the new arch function is an issue that could be worked around by
> calling perf_pmus__supports_extended_type() on patch 3 as well? It just
> needs a small change to not recurse.
>
>> events, Apple ARM M CPUs not supporting legacy events, etc.
>
> If Apple M doesn't support the HW events does _any_ default Perf stat
> command (hybrid or not) work? I'm not really trying to fix that here,
> just make whatever already works work on big.LITTLE.
>
>> Duplicating the existing x86 logic, when that was something trying to
>> be removed, is not the way to go. That logic was a holdover from the
>> hybrid tech debt we've been working to remove with a generic approach.
>>
>> Thanks,
>> Ian
>>
>
> I think all of that may make sense, but in this case I haven't actually
> duplicated anything, rather shared the existing code to also be used on
> Arm.
>
> This means we can have the default perf stat working on Arm from today,
> and if any other changes get made it will continue to work as I've also
> added a test for it.
>

I would also like to note that (not including the new test) this
patchset _removes_ more code than it adds, so to say it duplicates code
is a bit unfair.

Of course this touches some similar areas to your other change, but
that doesn't mean I think we shouldn't continue with that one. I'm
still happy to review and test or contribute to that one if you like.

I think it just helped me to do it in this order and get this thing in
a working state first before the next bigger step.