[V4,0/7] Clean up perf mem

Message ID 20240123185036.3461837-1-kan.liang@linux.intel.com (mailing list archive)

Message

Liang, Kan Jan. 23, 2024, 6:50 p.m. UTC
From: Kan Liang <kan.liang@linux.intel.com>

Changes since V3:
- Fix the PowerPC build error (Kajol Jain)
- s390 does not support perf mem, so remove its code. (Thomas)
- Add reviewed-by and tested-by from Kajol Jain for patches 1 and 2
- Add tested-by from Leo

Changes since V2:
- Fix the Arm64 build error (Leo)
- Add two new patches to clean up perf_mem_events__record_args()
  and perf_pmus__num_mem_pmus() (Leo)

Changes since V1:
- Fix strcmp of PMU name checking (Ravi)
- Fix "/," typo (Ian)
- Rename several functions with perf_pmu__mem_events prefix. (Ian)
- Fold the header removal patch into the patch where the cleanups are made.
  (Arnaldo)
- Add reviewed-by and tested-by from Ian and Ravi

As discussed in the thread below, this patch set cleans up perf mem.
https://lore.kernel.org/lkml/afefab15-cffc-4345-9cf4-c6a4128d4d9c@linux.intel.com/

Introduce the generic functions perf_mem_events__ptr(),
perf_mem_events__name(), and is_mem_loads_aux_event() to replace the
ARCH-specific ones.
Simplify perf_mem_event__supported().

Keep only the ARCH-specific perf_mem_events array in the corresponding
mem-events.c for each ARCH.

There is no functional change.
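
To illustrate the shape of this cleanup, here is a minimal sketch. It is not
taken from the patches. The type and function names (struct mem_event,
struct pmu, mem_events_ptr(), example_arch_mem_events) are simplified,
hypothetical stand-ins for the real perf structures. The idea shown is that
each ARCH only provides a table of events, a PMU carries a pointer to that
table, and one generic accessor serves every ARCH.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real perf definitions. */
enum {
	MEM_EVENTS__LOAD,
	MEM_EVENTS__STORE,
	MEM_EVENTS__MAX,
};

struct mem_event {
	const char *tag;       /* user-visible name, e.g. "ldlat-loads" */
	const char *name_fmt;  /* event string template */
	bool supported;
};

struct pmu {
	const char *name;
	struct mem_event *mem_events;  /* NULL if the PMU has no mem events */
};

/* The only per-ARCH piece: a table owned by that ARCH's mem-events.c. */
static struct mem_event example_arch_mem_events[MEM_EVENTS__MAX] = {
	[MEM_EVENTS__LOAD]  = { "ldlat-loads",  "%s/mem-loads,ldlat=%u/P", true },
	[MEM_EVENTS__STORE] = { "ldlat-stores", "%s/mem-stores/P",         true },
};

/* Generic accessor shared by all ARCHs: bounds-check, then index the table. */
static struct mem_event *mem_events_ptr(const struct pmu *pmu, int i)
{
	if (!pmu || !pmu->mem_events || i < 0 || i >= MEM_EVENTS__MAX)
		return NULL;
	return &pmu->mem_events[i];
}
```

With this layout, a caller would look up an event through the PMU, for
example mem_events_ptr(pmu, MEM_EVENTS__LOAD), and never needs an
ARCH-specific override of the accessor itself.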

The patch set touches almost all the ARCHs (Intel, AMD, ARM, Power, etc.),
but I could only test it on two Intel platforms.
Please give it a try if you have machines with other ARCHs.

Here are the test results:
Intel hybrid machine:

$perf mem record -e list
ldlat-loads  : available
ldlat-stores : available

$perf mem record -e ldlat-loads -v --ldlat 50
calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P

$perf mem record -v
calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P

$perf mem record -t store -v
calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P
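
The hybrid output above has one "-e" argument per core PMU. As a rough
sketch of how such an argument list could be composed (this is not the code
in the patches; build_load_args() and its parameters are hypothetical), the
helper below appends one load event per PMU name:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "-e <pmu>/mem-loads,ldlat=<ldlat>/P" for each
 * PMU name, separated by spaces. Returns the number of characters written,
 * or -1 if buf is too small.
 */
static int build_load_args(char *buf, size_t size,
			   const char *const *pmus, size_t nr_pmus,
			   unsigned int ldlat)
{
	size_t off = 0;

	if (size == 0)
		return -1;
	buf[0] = '\0';

	for (size_t i = 0; i < nr_pmus; i++) {
		int n = snprintf(buf + off, size - off,
				 "%s-e %s/mem-loads,ldlat=%u/P",
				 i ? " " : "", pmus[i], ldlat);

		/* snprintf() reports the untruncated length; detect overflow. */
		if (n < 0 || (size_t)n >= size - off)
			return -1;
		off += (size_t)n;
	}
	return (int)off;
}
```

Called with the PMU names "cpu_atom" and "cpu_core" and ldlat 50, this
produces the same string as the "calling: record" line for --ldlat 50.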


Intel SPR:
$perf mem record -e list
ldlat-loads  : available
ldlat-stores : available

$perf mem record -e ldlat-loads -v --ldlat 50
calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P

$perf mem record -v
calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P

$perf mem record -t store -v
calling: record -e cpu/mem-stores/P
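
In the SPR output above, the load event is grouped with mem-loads-aux,
while the hybrid output has no such group. A minimal sketch of that choice
follows. It is an illustration only: format_load_event() and the need_aux
flag are hypothetical names, and how perf actually decides whether the aux
event is required is not shown here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the load event for one PMU, optionally
 * grouped with the auxiliary event. Returns the string length, or -1 if
 * buf is too small.
 */
static int format_load_event(char *buf, size_t size, const char *pmu,
			     unsigned int ldlat, bool need_aux)
{
	int n;

	if (need_aux)
		n = snprintf(buf, size,
			     "{%s/mem-loads-aux/,%s/mem-loads,ldlat=%u/}:P",
			     pmu, pmu, ldlat);
	else
		n = snprintf(buf, size, "%s/mem-loads,ldlat=%u/P", pmu, ldlat);

	return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

With pmu "cpu", ldlat 50 and need_aux set, the result matches the SPR
"calling: record" line; with need_aux clear it has the per-PMU form seen on
the hybrid machine.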

Kan Liang (7):
  perf mem: Add mem_events into the supported perf_pmu
  perf mem: Clean up perf_mem_events__ptr()
  perf mem: Clean up perf_mem_events__name()
  perf mem: Clean up perf_mem_event__supported()
  perf mem: Clean up is_mem_loads_aux_event()
  perf mem: Clean up perf_mem_events__record_args()
  perf mem: Clean up perf_pmus__num_mem_pmus()

 tools/perf/arch/arm/util/pmu.c            |   3 +
 tools/perf/arch/arm64/util/mem-events.c   |  39 +---
 tools/perf/arch/arm64/util/mem-events.h   |   7 +
 tools/perf/arch/powerpc/util/Build        |   1 +
 tools/perf/arch/powerpc/util/mem-events.c |  16 +-
 tools/perf/arch/powerpc/util/mem-events.h |   7 +
 tools/perf/arch/powerpc/util/pmu.c        |  12 ++
 tools/perf/arch/x86/util/mem-events.c     |  99 ++--------
 tools/perf/arch/x86/util/mem-events.h     |  10 +
 tools/perf/arch/x86/util/pmu.c            |  19 +-
 tools/perf/builtin-c2c.c                  |  45 ++---
 tools/perf/builtin-mem.c                  |  48 ++---
 tools/perf/util/mem-events.c              | 217 +++++++++++++---------
 tools/perf/util/mem-events.h              |  19 +-
 tools/perf/util/pmu.c                     |   4 +-
 tools/perf/util/pmu.h                     |   7 +
 tools/perf/util/pmus.c                    |   6 -
 tools/perf/util/pmus.h                    |   1 -
 18 files changed, 279 insertions(+), 281 deletions(-)
 create mode 100644 tools/perf/arch/arm64/util/mem-events.h
 create mode 100644 tools/perf/arch/powerpc/util/mem-events.h
 create mode 100644 tools/perf/arch/powerpc/util/pmu.c
 create mode 100644 tools/perf/arch/x86/util/mem-events.h

Comments

Ian Rogers Jan. 24, 2024, 6:24 p.m. UTC | #1
On Tue, Jan 23, 2024 at 10:51 AM <kan.liang@linux.intel.com> wrote:
> Kan Liang (7):
>   perf mem: Add mem_events into the supported perf_pmu
>   perf mem: Clean up perf_mem_events__ptr()
>   perf mem: Clean up perf_mem_events__name()
>   perf mem: Clean up perf_mem_event__supported()
>   perf mem: Clean up is_mem_loads_aux_event()
>   perf mem: Clean up perf_mem_events__record_args()
>   perf mem: Clean up perf_pmus__num_mem_pmus()

I think this is ready to land in perf-tools-next; it has multiple
Tested-by and Reviewed-by tags.

Thanks,
Ian

Namhyung Kim Jan. 25, 2024, 5:24 a.m. UTC | #2
On Wed, Jan 24, 2024 at 10:24 AM Ian Rogers <irogers@google.com> wrote:
> I think this is ready to land in perf-tools-next; it has multiple
> Tested-by and Reviewed-by tags.

Sure, queued for local testing.

Thanks,
Namhyung

Namhyung Kim Jan. 25, 2024, 9:44 p.m. UTC | #3
On Wed, Jan 24, 2024 at 9:24 PM Namhyung Kim <namhyung@kernel.org> wrote:
> Sure, queued for local testing.

Applied to perf-tools-next, thanks!

Namhyung