[v2,0/9] jevents/pmu-events improvements

Message ID 20221221223420.2157113-1-irogers@google.com (mailing list archive)

Message

Ian Rogers Dec. 21, 2022, 10:34 p.m. UTC
Add an optimization to jevents that uses the metric code to rewrite
metrics in terms of each other, in order to minimize size and improve
readability. For example, on Power8
other_stall_cpi is rewritten from:
"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
to:
"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi"
The result more closely matches the definition on Power9.

A limitation of the substitutions is that they depend on strict
equality and the shape of the tree. For "a + b + c", a substitution
of "a + b" will succeed while "b + c" will fail (the LHS of "+ c" is
"a + b", not just "b").
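
To illustrate the tree-shape limitation, here is a minimal sketch. It is
not the code in metric.py: it uses Python's standard ast module (3.9+
for ast.unparse) as a stand-in expression tree, and the helper name
rewrite() is hypothetical.

```python
import ast


def rewrite(expr: str, pattern: str, name: str) -> str:
    """Replace subtrees of expr that are strictly equal to pattern with name.

    Illustrative only: Python's ast stands in for the metric expression tree.
    """
    tree = ast.parse(expr, mode="eval").body
    # ast.dump() omits source positions by default, so equal dumps mean
    # the two subtrees have the same shape and the same leaves.
    target = ast.dump(ast.parse(pattern, mode="eval").body)

    class Rewriter(ast.NodeTransformer):
        def visit(self, node):
            if ast.dump(node) == target:
                return ast.Name(id=name, ctx=ast.Load())
            return self.generic_visit(node)

    return ast.unparse(Rewriter().visit(tree))


# "a + b + c" parses as ((a + b) + c), so "a + b" is a subtree ...
print(rewrite("a + b + c", "a + b", "m"))
# ... while "b + c" never appears as a subtree and nothing is replaced.
print(rewrite("a + b + c", "b + c", "m"))
```

With explicit parentheses, as in "a + (b + c)", the second pattern
would match because the tree shape then contains "b + c" as a node.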

Separate out the events and metrics in the pmu-events tables, saving
14.8% in the table size while ensuring that metrics no longer need to
iterate over all events and vice versa. These changes remove evsel's
direct metric support as the pmu_event no longer has a metric to
populate it. This is a minor issue: the code wasn't working
properly, and metrics for this are rare and can still be run properly
using '-M'.

Add an ability to build only certain models into the jevents-generated
pmu-metrics.c code. This functionality is appropriate for operating
systems like ChromeOS that aim to minimize binary size and know all
the target CPU models.

v2. Rebase. Modify the code that skips rewriting a metric in terms of
    itself, so that the name check is case insensitive.

Ian Rogers (9):
  perf jevents metric: Correct Function equality
  perf jevents metric: Add ability to rewrite metrics in terms of others
  perf jevents: Rewrite metrics in the same file with each other
  perf pmu-events: Separate metric out of pmu_event
  perf stat: Remove evsel metric_name/expr
  perf jevents: Combine table prefix and suffix writing
  perf pmu-events: Introduce pmu_metrics_table
  perf jevents: Generate metrics and events as separate tables
  perf jevents: Add model list option

 tools/perf/arch/arm64/util/pmu.c         |  23 +-
 tools/perf/arch/powerpc/util/header.c    |   4 +-
 tools/perf/builtin-list.c                |  20 +-
 tools/perf/builtin-stat.c                |   1 -
 tools/perf/pmu-events/Build              |   3 +-
 tools/perf/pmu-events/empty-pmu-events.c | 111 ++++++-
 tools/perf/pmu-events/jevents.py         | 353 ++++++++++++++++++-----
 tools/perf/pmu-events/metric.py          |  79 ++++-
 tools/perf/pmu-events/metric_test.py     |  10 +
 tools/perf/pmu-events/pmu-events.h       |  26 +-
 tools/perf/tests/expand-cgroup.c         |   4 +-
 tools/perf/tests/parse-metric.c          |   4 +-
 tools/perf/tests/pmu-events.c            |  68 ++---
 tools/perf/util/cgroup.c                 |   1 -
 tools/perf/util/evsel.c                  |   2 -
 tools/perf/util/evsel.h                  |   2 -
 tools/perf/util/metricgroup.c            | 203 +++++++------
 tools/perf/util/metricgroup.h            |   4 +-
 tools/perf/util/parse-events.c           |   2 -
 tools/perf/util/pmu.c                    |  44 +--
 tools/perf/util/pmu.h                    |  10 +-
 tools/perf/util/print-events.c           |  32 +-
 tools/perf/util/print-events.h           |   3 +-
 tools/perf/util/python.c                 |   7 -
 tools/perf/util/stat-shadow.c            | 112 -------
 tools/perf/util/stat.h                   |   1 -
 26 files changed, 666 insertions(+), 463 deletions(-)

Comments

Ian Rogers Jan. 19, 2023, 3:54 p.m. UTC | #1
On Wed, Dec 21, 2022 at 2:34 PM Ian Rogers <irogers@google.com> wrote:
>
> Add an optimization to jevents using the metric code, rewrite metrics
> in terms of each other in order to minimize size and improve
> readability. For example, on Power8
> other_stall_cpi is rewritten from:
> "PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
> to:
> "stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi"
> Which more closely matches the definition on Power9.
>
> A limitation of the substitutions are that they depend on strict
> equality and the shape of the tree. This means that for "a + b + c"
> then a substitution of "a + b" will succeed while "b + c" will fail
> (the LHS for "+ c" is "a + b" not just "b").
>
> Separate out the events and metrics in the pmu-events tables saving
> 14.8% in the table size while making it that metrics no longer need to
> iterate over all events and vice versa. These changes remove evsel's
> direct metric support as the pmu_event no longer has a metric to
> populate it. This is a minor issue as the code wasn't working
> properly, metrics for this are rare and can still be properly ran
> using '-M'.
>
> Add an ability to just build certain models into the jevents generated
> pmu-metrics.c code. This functionality is appropriate for operating
> systems like ChromeOS, that aim to minimize binary size and know all
> the target CPU models.
>
> v2. Rebase. Modify the code that skips rewriting a metric with the
>     same name with itself, to make the name check case insensitive.
>
> Ian Rogers (9):
>   perf jevents metric: Correct Function equality
>   perf jevents metric: Add ability to rewrite metrics in terms of others
>   perf jevents: Rewrite metrics in the same file with each other
>   perf pmu-events: Separate metric out of pmu_event
>   perf stat: Remove evsel metric_name/expr
>   perf jevents: Combine table prefix and suffix writing
>   perf pmu-events: Introduce pmu_metrics_table
>   perf jevents: Generate metrics and events as separate tables
>   perf jevents: Add model list option

Ping. Looking for reviews.

Thanks,
Ian

>  tools/perf/arch/arm64/util/pmu.c         |  23 +-
>  tools/perf/arch/powerpc/util/header.c    |   4 +-
>  tools/perf/builtin-list.c                |  20 +-
>  tools/perf/builtin-stat.c                |   1 -
>  tools/perf/pmu-events/Build              |   3 +-
>  tools/perf/pmu-events/empty-pmu-events.c | 111 ++++++-
>  tools/perf/pmu-events/jevents.py         | 353 ++++++++++++++++++-----
>  tools/perf/pmu-events/metric.py          |  79 ++++-
>  tools/perf/pmu-events/metric_test.py     |  10 +
>  tools/perf/pmu-events/pmu-events.h       |  26 +-
>  tools/perf/tests/expand-cgroup.c         |   4 +-
>  tools/perf/tests/parse-metric.c          |   4 +-
>  tools/perf/tests/pmu-events.c            |  68 ++---
>  tools/perf/util/cgroup.c                 |   1 -
>  tools/perf/util/evsel.c                  |   2 -
>  tools/perf/util/evsel.h                  |   2 -
>  tools/perf/util/metricgroup.c            | 203 +++++++------
>  tools/perf/util/metricgroup.h            |   4 +-
>  tools/perf/util/parse-events.c           |   2 -
>  tools/perf/util/pmu.c                    |  44 +--
>  tools/perf/util/pmu.h                    |  10 +-
>  tools/perf/util/print-events.c           |  32 +-
>  tools/perf/util/print-events.h           |   3 +-
>  tools/perf/util/python.c                 |   7 -
>  tools/perf/util/stat-shadow.c            | 112 -------
>  tools/perf/util/stat.h                   |   1 -
>  26 files changed, 666 insertions(+), 463 deletions(-)
>
> --
> 2.39.0.314.g84b9a713c41-goog
>
John Garry Jan. 23, 2023, 1:25 p.m. UTC | #2
On 21/12/2022 22:34, Ian Rogers wrote:
> Add an optimization to jevents using the metric code, rewrite metrics
> in terms of each other in order to minimize size and improve
> readability. For example, on Power8
> other_stall_cpi is rewritten from:
> "PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
> to:
> "stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi"
> Which more closely matches the definition on Power9.
> 
> A limitation of the substitutions are that they depend on strict
> equality and the shape of the tree. This means that for "a + b + c"
> then a substitution of "a + b" will succeed while "b + c" will fail
> (the LHS for "+ c" is "a + b" not just "b").
> 
> Separate out the events and metrics in the pmu-events tables saving
> 14.8% in the table size while making it that metrics no longer need to
> iterate over all events and vice versa. These changes remove evsel's
> direct metric support as the pmu_event no longer has a metric to
> populate it. This is a minor issue as the code wasn't working
> properly, metrics for this are rare and can still be properly ran
> using '-M'.
> 
> Add an ability to just build certain models into the jevents generated
> pmu-metrics.c code. This functionality is appropriate for operating
> systems like ChromeOS, that aim to minimize binary size and know all
> the target CPU models.

From a glance, this does not look like it would work for arm64. As I
see in the code, we check the model in the arch folder to decide
whether it is built. arm64, however, uses an arch/implementer/model
folder organization, and not just arch/model (like x86).

So on the assumption that it does not work for arm64 (or any arch
which uses the arch/implementer/model folder organization), it would
be nice to have that feature there also. Or maybe support specifying
not just the model but also the implementer.
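
A rough sketch of the concern, assuming the check compares the
directory directly below the arch folder against the requested models.
This is not the jevents.py code: both helper names are hypothetical and
the paths are only illustrative.

```python
from pathlib import PurePosixPath


def first_level_match(model_dir: str, models: set) -> bool:
    """Match only the directory directly below the arch directory."""
    parts = PurePosixPath(model_dir).parts
    return len(parts) > 1 and parts[1] in models


def any_level_match(model_dir: str, models: set) -> bool:
    """Match any directory component below the arch directory."""
    return any(p in models for p in PurePosixPath(model_dir).parts[1:])


# arch/model layout (x86): the first level is the model.
print(first_level_match("x86/skylake", {"skylake"}))
# arch/implementer/model layout (arm64): the first level is the
# implementer, so a model name is not found there ...
print(first_level_match("arm64/arm/cortex-a53", {"cortex-a53"}))
# ... unless deeper components are also considered.
print(any_level_match("arm64/arm/cortex-a53", {"cortex-a53"}))
```

With a first-level check, a model name given for arm64 only matches if
it happens to be an implementer directory name.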

> 
> v2. Rebase. Modify the code that skips rewriting a metric with the
>      same name with itself, to make the name check case insensitive.
> 


Unfortunately you might need another rebase as this does not apply to 
acme perf/core (if that is what you want), now for me at:

5670ebf54bd2 (HEAD, origin/tmp.perf/core, origin/perf/core, perf/core)
perf cs-etm: Ensure that Coresight timestamps don't go backwards

> Ian Rogers (9):
>    perf jevents metric: Correct Function equality
>    perf jevents metric: Add ability to rewrite metrics in terms of others
>    perf jevents: Rewrite metrics in the same file with each other
>    perf pmu-events: Separate metric out of pmu_event
>    perf stat: Remove evsel metric_name/expr
>    perf jevents: Combine table prefix and suffix writing
>    perf pmu-events: Introduce pmu_metrics_table
>    perf jevents: Generate metrics and events as separate tables
>    perf jevents: Add model list option

Thanks,
John
Ian Rogers Jan. 24, 2023, 5:04 a.m. UTC | #3
On Mon, Jan 23, 2023 at 5:26 AM John Garry <john.g.garry@oracle.com> wrote:
>
> On 21/12/2022 22:34, Ian Rogers wrote:
> > Add an optimization to jevents using the metric code, rewrite metrics
> > in terms of each other in order to minimize size and improve
> > readability. For example, on Power8
> > other_stall_cpi is rewritten from:
> > "PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"
> > to:
> > "stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi"
> > Which more closely matches the definition on Power9.
> >
> > A limitation of the substitutions are that they depend on strict
> > equality and the shape of the tree. This means that for "a + b + c"
> > then a substitution of "a + b" will succeed while "b + c" will fail
> > (the LHS for "+ c" is "a + b" not just "b").
> >
> > Separate out the events and metrics in the pmu-events tables saving
> > 14.8% in the table size while making it that metrics no longer need to
> > iterate over all events and vice versa. These changes remove evsel's
> > direct metric support as the pmu_event no longer has a metric to
> > populate it. This is a minor issue as the code wasn't working
> > properly, metrics for this are rare and can still be properly ran
> > using '-M'.
> >
> > Add an ability to just build certain models into the jevents generated
> > pmu-metrics.c code. This functionality is appropriate for operating
> > systems like ChromeOS, that aim to minimize binary size and know all
> > the target CPU models.
>
>  From a glance, this does not look like it would work for arm64. As I
> see in the code, we check the model in the arch folder for the test to
> see if built. For arm64, as it uses arch/implementator/model folder org,
> and not just arch/model (like x86)
>
> So on the assumption that it does not work for arm64 (or just any arch
> which uses arch/implementator/model folder org), it would be nice to
> have that feature also. Or maybe also support not just specifying model
> but also implementator.

Hmm.. this is tricky as x86 isn't following the implementor pattern. I
will tweak the comment for the ARM64 case where --model will select an
implementor.

> >
> > v2. Rebase. Modify the code that skips rewriting a metric with the
> >      same name with itself, to make the name check case insensitive.
> >
>
>
> Unfortunately you might need another rebase as this does not apply to
> acme perf/core (if that is what you want), now for me at:
>
> 5670ebf54bd2 (HEAD, origin/tmp.perf/core, origin/perf/core, perf/core)
> perf cs-etm: Ensure that Coresight timestamps don't go backwards

Will do, thanks!
Ian

> > Ian Rogers (9):
> >    perf jevents metric: Correct Function equality
> >    perf jevents metric: Add ability to rewrite metrics in terms of others
> >    perf jevents: Rewrite metrics in the same file with each other
> >    perf pmu-events: Separate metric out of pmu_event
> >    perf stat: Remove evsel metric_name/expr
> >    perf jevents: Combine table prefix and suffix writing
> >    perf pmu-events: Introduce pmu_metrics_table
> >    perf jevents: Generate metrics and events as separate tables
> >    perf jevents: Add model list option
>
> Thanks,
> John