Message ID | 20221221223420.2157113-1-irogers@google.com (mailing list archive) |
---|---|
Series | jevents/pmu-events improvements |
On Wed, Dec 21, 2022 at 2:34 PM Ian Rogers <irogers@google.com> wrote:
>
> Add an optimization to jevents using the metric code, rewrite metrics
> in terms of each other in order to minimize size and improve
> readability. For example, on Power8
> other_stall_cpi is rewritten from:
> "PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
> to:
> "stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi"
> Which more closely matches the definition on Power9.
>
> A limitation of the substitutions is that they depend on strict
> equality and the shape of the tree. This means that for "a + b + c"
> then a substitution of "a + b" will succeed while "b + c" will fail
> (the LHS for "+ c" is "a + b" not just "b").
>
> Separate out the events and metrics in the pmu-events tables saving
> 14.8% in the table size while making it that metrics no longer need to
> iterate over all events and vice versa. These changes remove evsel's
> direct metric support as the pmu_event no longer has a metric to
> populate it. This is a minor issue as the code wasn't working
> properly, metrics for this are rare and can still be properly run
> using '-M'.
>
> Add an ability to just build certain models into the jevents generated
> pmu-metrics.c code. This functionality is appropriate for operating
> systems like ChromeOS, that aim to minimize binary size and know all
> the target CPU models.
>
> v2. Rebase. Modify the code that skips rewriting a metric with the
>     same name with itself, to make the name check case insensitive.
>
> Ian Rogers (9):
>   perf jevents metric: Correct Function equality
>   perf jevents metric: Add ability to rewrite metrics in terms of others
>   perf jevents: Rewrite metrics in the same file with each other
>   perf pmu-events: Separate metric out of pmu_event
>   perf stat: Remove evsel metric_name/expr
>   perf jevents: Combine table prefix and suffix writing
>   perf pmu-events: Introduce pmu_metrics_table
>   perf jevents: Generate metrics and events as separate tables
>   perf jevents: Add model list option

Ping. Looking for reviews.

Thanks,
Ian

>  tools/perf/arch/arm64/util/pmu.c | 23 +-
>  tools/perf/arch/powerpc/util/header.c | 4 +-
>  tools/perf/builtin-list.c | 20 +-
>  tools/perf/builtin-stat.c | 1 -
>  tools/perf/pmu-events/Build | 3 +-
>  tools/perf/pmu-events/empty-pmu-events.c | 111 ++++++-
>  tools/perf/pmu-events/jevents.py | 353 ++++++++++++++++++-----
>  tools/perf/pmu-events/metric.py | 79 ++++-
>  tools/perf/pmu-events/metric_test.py | 10 +
>  tools/perf/pmu-events/pmu-events.h | 26 +-
>  tools/perf/tests/expand-cgroup.c | 4 +-
>  tools/perf/tests/parse-metric.c | 4 +-
>  tools/perf/tests/pmu-events.c | 68 ++---
>  tools/perf/util/cgroup.c | 1 -
>  tools/perf/util/evsel.c | 2 -
>  tools/perf/util/evsel.h | 2 -
>  tools/perf/util/metricgroup.c | 203 +++++++------
>  tools/perf/util/metricgroup.h | 4 +-
>  tools/perf/util/parse-events.c | 2 -
>  tools/perf/util/pmu.c | 44 +--
>  tools/perf/util/pmu.h | 10 +-
>  tools/perf/util/print-events.c | 32 +-
>  tools/perf/util/print-events.h | 3 +-
>  tools/perf/util/python.c | 7 -
>  tools/perf/util/stat-shadow.c | 112 -------
>  tools/perf/util/stat.h | 1 -
>  26 files changed, 666 insertions(+), 463 deletions(-)
>
> --
> 2.39.0.314.g84b9a713c41-goog
>
On 21/12/2022 22:34, Ian Rogers wrote:
> Add an optimization to jevents using the metric code, rewrite metrics
> in terms of each other in order to minimize size and improve
> readability. For example, on Power8
> other_stall_cpi is rewritten from:
> "PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
> to:
> "stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi"
> Which more closely matches the definition on Power9.
>
> A limitation of the substitutions is that they depend on strict
> equality and the shape of the tree. This means that for "a + b + c"
> then a substitution of "a + b" will succeed while "b + c" will fail
> (the LHS for "+ c" is "a + b" not just "b").
>
> Separate out the events and metrics in the pmu-events tables saving
> 14.8% in the table size while making it that metrics no longer need to
> iterate over all events and vice versa. These changes remove evsel's
> direct metric support as the pmu_event no longer has a metric to
> populate it. This is a minor issue as the code wasn't working
> properly, metrics for this are rare and can still be properly run
> using '-M'.
>
> Add an ability to just build certain models into the jevents generated
> pmu-metrics.c code. This functionality is appropriate for operating
> systems like ChromeOS, that aim to minimize binary size and know all
> the target CPU models.

From a glance, this does not look like it would work for arm64. As I
see in the code, we check the model in the arch folder to decide whether
it gets built. arm64 uses an arch/implementor/model folder organization,
and not just arch/model (like x86).

So on the assumption that it does not work for arm64 (or any arch
which uses the arch/implementor/model folder organization), it would be
nice to have that feature there also. Or maybe also support specifying
not just the model but also the implementor.

>
> v2. Rebase. Modify the code that skips rewriting a metric with the
>     same name with itself, to make the name check case insensitive.
>

Unfortunately you might need another rebase as this does not apply to
acme perf/core (if that is what you want), now for me at:

5670ebf54bd2 (HEAD, origin/tmp.perf/core, origin/perf/core, perf/core)
perf cs-etm: Ensure that Coresight timestamps don't go backwards

> Ian Rogers (9):
>   perf jevents metric: Correct Function equality
>   perf jevents metric: Add ability to rewrite metrics in terms of others
>   perf jevents: Rewrite metrics in the same file with each other
>   perf pmu-events: Separate metric out of pmu_event
>   perf stat: Remove evsel metric_name/expr
>   perf jevents: Combine table prefix and suffix writing
>   perf pmu-events: Introduce pmu_metrics_table
>   perf jevents: Generate metrics and events as separate tables
>   perf jevents: Add model list option

Thanks,
John
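For readers following the thread, the tree-shape limitation quoted from the cover letter can be illustrated with a small sketch. This is not the code in metric.py; the `Sym` and `Add` classes and the `parse_sum`, `substitute` and `show` helpers are hypothetical names made up for the illustration, and only left-associative `+` is modelled.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Sym:
    """A leaf: an event or metric name (hypothetical helper type)."""
    name: str


@dataclass(frozen=True)
class Add:
    """A binary '+' node (hypothetical helper type)."""
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Sym, Add]


def parse_sum(text: str) -> Expr:
    """Parse 'a + b + c' left-associatively, i.e. as ((a + b) + c)."""
    terms = [Sym(t.strip()) for t in text.split("+")]
    tree: Expr = terms[0]
    for term in terms[1:]:
        tree = Add(tree, term)
    return tree


def substitute(tree: Expr, pattern: Expr, name: str) -> Expr:
    """Replace each subtree strictly equal to pattern with Sym(name)."""
    if tree == pattern:
        return Sym(name)
    if isinstance(tree, Add):
        return Add(substitute(tree.lhs, pattern, name),
                   substitute(tree.rhs, pattern, name))
    return tree


def show(tree: Expr) -> str:
    """Render the tree back to text."""
    if isinstance(tree, Sym):
        return tree.name
    return f"{show(tree.lhs)} + {show(tree.rhs)}"


expr = parse_sum("a + b + c")
# "a + b" is a subtree of ((a + b) + c), so it is replaced.
print(show(substitute(expr, parse_sum("a + b"), "m1")))  # m1 + c
# "b + c" never appears as a subtree, so nothing changes.
print(show(substitute(expr, parse_sum("b + c"), "m2")))  # a + b + c
```

Because the match is on whole subtrees, a rewrite that handled "b + c" would need to re-associate or flatten the sum first, which the cover letter says the series does not attempt.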
On Mon, Jan 23, 2023 at 5:26 AM John Garry <john.g.garry@oracle.com> wrote:
>
> On 21/12/2022 22:34, Ian Rogers wrote:
> > Add an optimization to jevents using the metric code, rewrite metrics
> > in terms of each other in order to minimize size and improve
> > readability. For example, on Power8
> > other_stall_cpi is rewritten from:
> > "PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
> > to:
> > "stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi"
> > Which more closely matches the definition on Power9.
> >
> > A limitation of the substitutions is that they depend on strict
> > equality and the shape of the tree. This means that for "a + b + c"
> > then a substitution of "a + b" will succeed while "b + c" will fail
> > (the LHS for "+ c" is "a + b" not just "b").
> >
> > Separate out the events and metrics in the pmu-events tables saving
> > 14.8% in the table size while making it that metrics no longer need to
> > iterate over all events and vice versa. These changes remove evsel's
> > direct metric support as the pmu_event no longer has a metric to
> > populate it. This is a minor issue as the code wasn't working
> > properly, metrics for this are rare and can still be properly run
> > using '-M'.
> >
> > Add an ability to just build certain models into the jevents generated
> > pmu-metrics.c code. This functionality is appropriate for operating
> > systems like ChromeOS, that aim to minimize binary size and know all
> > the target CPU models.
>
> From a glance, this does not look like it would work for arm64. As I
> see in the code, we check the model in the arch folder to decide whether
> it gets built. arm64 uses an arch/implementor/model folder organization,
> and not just arch/model (like x86).
>
> So on the assumption that it does not work for arm64 (or any arch
> which uses the arch/implementor/model folder organization), it would be
> nice to have that feature there also. Or maybe also support specifying
> not just the model but also the implementor.

Hmm.. this is tricky as x86 isn't following the implementor pattern. I
will tweak the comment for the ARM64 case where --model will select an
implementor.

> >
> > v2. Rebase. Modify the code that skips rewriting a metric with the
> >     same name with itself, to make the name check case insensitive.
> >
>
> Unfortunately you might need another rebase as this does not apply to
> acme perf/core (if that is what you want), now for me at:
>
> 5670ebf54bd2 (HEAD, origin/tmp.perf/core, origin/perf/core, perf/core)
> perf cs-etm: Ensure that Coresight timestamps don't go backwards

Will do, thanks!
Ian

> > Ian Rogers (9):
> >   perf jevents metric: Correct Function equality
> >   perf jevents metric: Add ability to rewrite metrics in terms of others
> >   perf jevents: Rewrite metrics in the same file with each other
> >   perf pmu-events: Separate metric out of pmu_event
> >   perf stat: Remove evsel metric_name/expr
> >   perf jevents: Combine table prefix and suffix writing
> >   perf pmu-events: Introduce pmu_metrics_table
> >   perf jevents: Generate metrics and events as separate tables
> >   perf jevents: Add model list option
>
> Thanks,
> John
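To make the directory-layout point in this exchange concrete, here is a rough sketch of how a model filter could treat both layouts. It is not taken from jevents.py: `is_selected` is a hypothetical helper, the directory names are only illustrative, and accepting a full "implementor/model" path is an assumption that corresponds to the suggestion above rather than to anything the series is stated to do.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional


def is_selected(rel_dir: str, models: Optional[Iterable[str]] = None) -> bool:
    """Decide whether a directory under the per-arch folder gets built.

    rel_dir is relative to the arch folder: "skylake" for an arch/model
    layout, "arm/cortex-a53" for an arch/implementor/model layout.
    models=None is taken to mean "build everything" (an assumption).
    """
    if models is None:
        return True
    wanted = set(models)
    path = PurePosixPath(rel_dir)
    # The first component is the model on x86 and the implementor on arm64.
    # Matching the full relative path is the hypothetical extension that
    # would allow selecting implementor and model together.
    return path.parts[0] in wanted or str(path) in wanted


x86_dirs = ["skylake", "icelake"]
arm64_dirs = ["arm/cortex-a53", "arm/neoverse-n1", "hisilicon/hip08"]

print([d for d in x86_dirs if is_selected(d, ["skylake"])])
print([d for d in arm64_dirs if is_selected(d, ["arm"])])
print([d for d in arm64_dirs if is_selected(d, ["arm/cortex-a53"])])
```

With only first-component matching, a selection on arm64 pulls in every model of the chosen implementor, which matches the behaviour described in the reply above; the full-path match narrows that to a single model.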