[V3,0/7] Clean up perf mem

Message ID 20231213195154.1085945-1-kan.liang@linux.intel.com (mailing list archive)

Message

Liang, Kan Dec. 13, 2023, 7:51 p.m. UTC
From: Kan Liang <kan.liang@linux.intel.com>

Changes since V2:
- Fix the Arm64 building error (Leo)
- Add two new patches to clean up perf_mem_events__record_args()
  and perf_pmus__num_mem_pmus() (Leo)

Changes since V1:
- Fix strcmp of PMU name checking (Ravi)
- Fix "/," typo (Ian)
- Rename several functions with perf_pmu__mem_events prefix. (Ian)
- Fold the header removal patch into the patch where the cleanups are made.
  (Arnaldo)
- Add reviewed-by and tested-by from Ian and Ravi

As discussed in the thread below, this patch set cleans up perf mem.
https://lore.kernel.org/lkml/afefab15-cffc-4345-9cf4-c6a4128d4d9c@linux.intel.com/

Introduce generic functions perf_mem_events__ptr(),
perf_mem_events__name(), and is_mem_loads_aux_event() to replace the
ARCH-specific ones.
Simplify perf_mem_event__supported().

Keep only the ARCH-specific perf_mem_events array in the corresponding
mem-events.c for each ARCH.

There is no functional change.
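
As a rough illustration of the split described above (one ARCH-specific
table, generic accessors), here is a minimal sketch. The type and function
names (struct mem_event, struct pmu, mem_events_ptr, mem_events_x86) and the
table contents are simplified assumptions for illustration, not the actual
definitions in tools/perf.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-ins; not the perf tool's real definitions. */
enum {
	MEM_EVENTS_LOAD,
	MEM_EVENTS_STORE,
	MEM_EVENTS_MAX,
};

struct mem_event {
	const char *tag;        /* user-visible name, e.g. "ldlat-loads" */
	const char *event_name; /* PMU event, e.g. "mem-loads" */
	int ldlat;              /* non-zero if the event takes an ldlat term */
};

struct pmu {
	const char *name;
	struct mem_event *mem_events; /* ARCH table, NULL if none */
};

/* The only ARCH-specific piece in this sketch: a table of events. */
struct mem_event mem_events_x86[MEM_EVENTS_MAX] = {
	[MEM_EVENTS_LOAD]  = { "ldlat-loads",  "mem-loads",  1 },
	[MEM_EVENTS_STORE] = { "ldlat-stores", "mem-stores", 0 },
};

/* Generic accessor shared by every ARCH; returns NULL when out of range. */
struct mem_event *mem_events_ptr(const struct pmu *pmu, int i)
{
	if (!pmu || !pmu->mem_events || i < 0 || i >= MEM_EVENTS_MAX)
		return NULL;
	return &pmu->mem_events[i];
}
```

With this layout, supporting another ARCH would only mean providing another
table and pointing the PMU at it; the lookup code stays untouched.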

The patch set touches almost all the ARCHs: Intel, AMD, ARM, Power,
etc. But I can only test it on two Intel platforms.
Please give it a try if you have machines with other ARCHs.

Here are the test results:
Intel hybrid machine:

$perf mem record -e list
ldlat-loads  : available
ldlat-stores : available

$perf mem record -e ldlat-loads -v --ldlat 50
calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P

$perf mem record -v
calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P

$perf mem record -t store -v
calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P


Intel SPR:
$perf mem record -e list
ldlat-loads  : available
ldlat-stores : available

$perf mem record -e ldlat-loads -v --ldlat 50
calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P

$perf mem record -v
calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P

$perf mem record -t store -v
calling: record -e cpu/mem-stores/P
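
The "calling:" lines above follow three string shapes: a plain event, an
event with an ldlat term, and a load event grouped with its auxiliary event.
The sketch below reproduces those shapes with snprintf(). The helper name
format_mem_event() and its parameters are hypothetical; the real tool builds
these strings differently.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format one "-e" argument for a memory event.
 * Returns what snprintf() returns (length that would have been written).
 */
int format_mem_event(char *buf, size_t len, const char *pmu, const char *event,
		     int has_ldlat, unsigned int ldlat, int needs_aux)
{
	/* Load event that must be grouped with its "-aux" companion. */
	if (has_ldlat && needs_aux)
		return snprintf(buf, len, "{%s/%s-aux/,%s/%s,ldlat=%u/}:P",
				pmu, event, pmu, event, ldlat);

	/* Event that accepts a load-latency threshold. */
	if (has_ldlat)
		return snprintf(buf, len, "%s/%s,ldlat=%u/P", pmu, event, ldlat);

	/* Plain event without any extra term. */
	return snprintf(buf, len, "%s/%s/P", pmu, event);
}
```

On a hybrid machine the same call would simply be repeated once per PMU
name (cpu_atom, cpu_core), which matches the repeated -e arguments in the
output above.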

Kan Liang (7):
  perf mem: Add mem_events into the supported perf_pmu
  perf mem: Clean up perf_mem_events__ptr()
  perf mem: Clean up perf_mem_events__name()
  perf mem: Clean up perf_mem_event__supported()
  perf mem: Clean up is_mem_loads_aux_event()
  perf mem: Clean up perf_mem_events__record_args()
  perf mem: Clean up perf_pmus__num_mem_pmus()

 tools/perf/arch/arm/util/pmu.c            |   3 +
 tools/perf/arch/arm64/util/mem-events.c   |  39 +---
 tools/perf/arch/arm64/util/mem-events.h   |   7 +
 tools/perf/arch/powerpc/util/mem-events.c |  13 +-
 tools/perf/arch/powerpc/util/mem-events.h |   7 +
 tools/perf/arch/powerpc/util/pmu.c        |  11 ++
 tools/perf/arch/s390/util/pmu.c           |   3 +
 tools/perf/arch/x86/util/mem-events.c     |  99 ++--------
 tools/perf/arch/x86/util/mem-events.h     |  10 +
 tools/perf/arch/x86/util/pmu.c            |  19 +-
 tools/perf/builtin-c2c.c                  |  45 ++---
 tools/perf/builtin-mem.c                  |  48 ++---
 tools/perf/util/mem-events.c              | 217 +++++++++++++---------
 tools/perf/util/mem-events.h              |  19 +-
 tools/perf/util/pmu.c                     |   4 +-
 tools/perf/util/pmu.h                     |   7 +
 tools/perf/util/pmus.c                    |   6 -
 tools/perf/util/pmus.h                    |   1 -
 18 files changed, 278 insertions(+), 280 deletions(-)
 create mode 100644 tools/perf/arch/arm64/util/mem-events.h
 create mode 100644 tools/perf/arch/powerpc/util/mem-events.h
 create mode 100644 tools/perf/arch/powerpc/util/pmu.c
 create mode 100644 tools/perf/arch/x86/util/mem-events.h

Comments

Leo Yan Dec. 16, 2023, 3:29 a.m. UTC | #1
On Wed, Dec 13, 2023 at 11:51:47AM -0800, kan.liang@linux.intel.com wrote:

[...]

> Introduce generic functions perf_mem_events__ptr(),
> perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
> ARCH specific ones.
> Simplify the perf_mem_event__supported().
> 
> Only keeps the ARCH-specific perf_mem_events array in the corresponding
> mem-events.c for each ARCH.
> 
> There is no functional change.
> 
> The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
> etc. But I can only test it on two Intel platforms.
> Please give it try, if you have machines with other ARCHs.

This patch series looks fine to me:

Reviewed-by: Leo Yan <leo.yan@linaro.org>

I only compiled it successfully on my Arm64 machine; I didn't test it
because I don't have access to a machine with Arm SPE.

James, could you test it?  Thanks a lot!

kajoljain Dec. 19, 2023, 9:26 a.m. UTC | #2
Hi,
  I was trying to test this patch set on powerpc.

After applying it on top of acme's perf-tools-next branch, I get the
error below:

  INSTALL libsubcmd_headers
  INSTALL libperf_headers
  INSTALL libsymbol_headers
  INSTALL libapi_headers
  INSTALL libbpf_headers
  CC      arch/powerpc/util/mem-events.o
In file included from arch/powerpc/util/mem-events.c:3:
arch/powerpc/util/mem-events.h:5:52: error: ‘PERF_MEM_EVENTS__MAX’ undeclared here (not in a function)
    5 | extern struct perf_mem_event perf_mem_events_power[PERF_MEM_EVENTS__MAX];
      |                                                    ^~~~~~~~~~~~~~~~~~~~
make[6]: *** [/home/kajol/linux/tools/build/Makefile.build:105: arch/powerpc/util/mem-events.o] Error 1
make[5]: *** [/home/kajol/linux/tools/build/Makefile.build:158: util] Error 2
make[4]: *** [/home/kajol/linux/tools/build/Makefile.build:158: powerpc] Error 2
make[3]: *** [/home/kajol/linux/tools/build/Makefile.build:158: arch] Error 2
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [Makefile.perf:693: perf-in.o] Error 2
make[1]: *** [Makefile.perf:251: sub-make] Error 2
make: *** [Makefile:70: all] Error 2

It seems some header files are missing from
arch/powerpc/util/mem-events.c

Thanks,
Kajol Jain

Liang, Kan Dec. 19, 2023, 2:15 p.m. UTC | #3
On 2023-12-19 4:26 a.m., kajoljain wrote:
> Hi,
>   I was trying to test this patchset on powerpc.
> 
> After applying it on top of acme's perf-tools-next branch, I am getting
> below error:
> 
>   INSTALL libsubcmd_headers
>   INSTALL libperf_headers
>   INSTALL libsymbol_headers
>   INSTALL libapi_headers
>   INSTALL libbpf_headers
>   CC      arch/powerpc/util/mem-events.o
> In file included from arch/powerpc/util/mem-events.c:3:
> arch/powerpc/util/mem-events.h:5:52: error: ‘PERF_MEM_EVENTS__MAX’ undeclared here (not in a function)
>     5 | extern struct perf_mem_event perf_mem_events_power[PERF_MEM_EVENTS__MAX];
>       |                                                    ^~~~~~~~~~~~~~~~~~~~
> make[6]: *** [/home/kajol/linux/tools/build/Makefile.build:105: arch/powerpc/util/mem-events.o] Error 1
> make[5]: *** [/home/kajol/linux/tools/build/Makefile.build:158: util] Error 2
> make[4]: *** [/home/kajol/linux/tools/build/Makefile.build:158: powerpc] Error 2
> make[3]: *** [/home/kajol/linux/tools/build/Makefile.build:158: arch] Error 2
> make[3]: *** Waiting for unfinished jobs....
> make[2]: *** [Makefile.perf:693: perf-in.o] Error 2
> make[1]: *** [Makefile.perf:251: sub-make] Error 2
> make: *** [Makefile:70: all] Error 2
> 
> It seems some headerfiles are missing from
> arch/powerpc/util/mem-events.c
> 

Leo updated the header files for ARM. https://termbin.com/0dkn

I guess powerpc has to do the same thing. Could you please try the patch
below?

diff --git a/tools/perf/arch/powerpc/util/mem-events.c b/tools/perf/arch/powerpc/util/mem-events.c
index 72a6ac2b52f5..765d4a054b0a 100644
--- a/tools/perf/arch/powerpc/util/mem-events.c
+++ b/tools/perf/arch/powerpc/util/mem-events.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
-#include "map_symbol.h"
+#include "util/map_symbol.h"
+#include "util/mem-events.h"
 #include "mem-events.h"

 #define E(t, n, s, l, a) { .tag = t, .name = n, .event_name = s, .ldlat = l, .aux_event = a }
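
To show why the extra include matters, here is a single-file sketch of the
dependency. The three commented sections stand in for the generic header,
the powerpc header, and the powerpc source file. The enum members, the
field types, and the table values are assumptions for illustration; only
PERF_MEM_EVENTS__MAX, perf_mem_events_power, and the E() macro are taken
from this thread.

```c
#include <stdbool.h>

/* --- stand-in for util/mem-events.h (generic header) --- */
enum {
	PERF_MEM_EVENTS__LOAD,
	PERF_MEM_EVENTS__STORE,
	PERF_MEM_EVENTS__MAX,
};

struct perf_mem_event {
	const char *tag;
	const char *name;
	const char *event_name;
	bool ldlat;
	unsigned int aux_event;
};

/* --- stand-in for arch/powerpc/util/mem-events.h --- */
/* This only compiles because PERF_MEM_EVENTS__MAX is declared above. */
extern struct perf_mem_event perf_mem_events_power[PERF_MEM_EVENTS__MAX];

/* --- stand-in for arch/powerpc/util/mem-events.c --- */
#define E(t, n, s, l, a) { .tag = t, .name = n, .event_name = s, .ldlat = l, .aux_event = a }

/* Illustrative values only; neither entry takes an ldlat term. */
struct perf_mem_event perf_mem_events_power[PERF_MEM_EVENTS__MAX] = {
	[PERF_MEM_EVENTS__LOAD]  = E("ldlat-loads",  "%s/mem-loads/",  "mem-loads",  false, 0),
	[PERF_MEM_EVENTS__STORE] = E("ldlat-stores", "%s/mem-stores/", "mem-stores", false, 0),
};
```

Moving the extern declaration above the enum reproduces the "undeclared
here (not in a function)" error from the build log, which is the ordering
the added include is meant to fix.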

Thanks,
Kan

Liang, Kan Jan. 2, 2024, 8:08 p.m. UTC | #4
Hi Kajol Jain

On 2023-12-19 9:15 a.m., Liang, Kan wrote:
> 
> 
> On 2023-12-19 4:26 a.m., kajoljain wrote:
>> Hi,
>>   I was trying to test this patchset on powerpc.
>>
>> After applying it on top of acme's perf-tools-next branch, I am getting
>> below error:
>>
>>   INSTALL libsubcmd_headers
>>   INSTALL libperf_headers
>>   INSTALL libsymbol_headers
>>   INSTALL libapi_headers
>>   INSTALL libbpf_headers
>>   CC      arch/powerpc/util/mem-events.o
>> In file included from arch/powerpc/util/mem-events.c:3:
>> arch/powerpc/util/mem-events.h:5:52: error: ‘PERF_MEM_EVENTS__MAX’ undeclared here (not in a function)
>>     5 | extern struct perf_mem_event perf_mem_events_power[PERF_MEM_EVENTS__MAX];
>>       |                                                    ^~~~~~~~~~~~~~~~~~~~
>> make[6]: *** [/home/kajol/linux/tools/build/Makefile.build:105: arch/powerpc/util/mem-events.o] Error 1
>> make[5]: *** [/home/kajol/linux/tools/build/Makefile.build:158: util] Error 2
>> make[4]: *** [/home/kajol/linux/tools/build/Makefile.build:158: powerpc] Error 2
>> make[3]: *** [/home/kajol/linux/tools/build/Makefile.build:158: arch] Error 2
>> make[3]: *** Waiting for unfinished jobs....
>> make[2]: *** [Makefile.perf:693: perf-in.o] Error 2
>> make[1]: *** [Makefile.perf:251: sub-make] Error 2
>> make: *** [Makefile:70: all] Error 2
>>
>> It seems some headerfiles are missing from
>> arch/powerpc/util/mem-events.c
>>
> 
> Leo updated the headerfiles for ARM. https://termbin.com/0dkn
> 
> I guess powerpc has to do the same thing. Could you please try the below
> patch?


Does the patch work on powerpc?


Thanks,
Kan
kajoljain Jan. 5, 2024, 6:38 a.m. UTC | #5
On 1/3/24 01:38, Liang, Kan wrote:
> Hi Kajol Jain
> 
> On 2023-12-19 9:15 a.m., Liang, Kan wrote:
>>
>>
>> On 2023-12-19 4:26 a.m., kajoljain wrote:
>>> Hi,
>>>   I was trying to test this patchset on powerpc.
>>>
>>> After applying it on top of acme's perf-tools-next branch, I am getting
>>> below error:
>>>
>>>   INSTALL libsubcmd_headers
>>>   INSTALL libperf_headers
>>>   INSTALL libsymbol_headers
>>>   INSTALL libapi_headers
>>>   INSTALL libbpf_headers
>>>   CC      arch/powerpc/util/mem-events.o
>>> In file included from arch/powerpc/util/mem-events.c:3:
>>> arch/powerpc/util/mem-events.h:5:52: error: ‘PERF_MEM_EVENTS__MAX’ undeclared here (not in a function)
>>>     5 | extern struct perf_mem_event perf_mem_events_power[PERF_MEM_EVENTS__MAX];
>>>       |                                                    ^~~~~~~~~~~~~~~~~~~~
>>> make[6]: *** [/home/kajol/linux/tools/build/Makefile.build:105: arch/powerpc/util/mem-events.o] Error 1
>>> make[5]: *** [/home/kajol/linux/tools/build/Makefile.build:158: util] Error 2
>>> make[4]: *** [/home/kajol/linux/tools/build/Makefile.build:158: powerpc] Error 2
>>> make[3]: *** [/home/kajol/linux/tools/build/Makefile.build:158: arch] Error 2
>>> make[3]: *** Waiting for unfinished jobs....
>>> make[2]: *** [Makefile.perf:693: perf-in.o] Error 2
>>> make[1]: *** [Makefile.perf:251: sub-make] Error 2
>>> make: *** [Makefile:70: all] Error 2
>>>
>>> It seems some headerfiles are missing from
>>> arch/powerpc/util/mem-events.c
>>>
>>
>> Leo updated the headerfiles for ARM. https://termbin.com/0dkn
>>
>> I guess powerpc has to do the same thing. Could you please try the below
>> patch?
> 
> 
> Does the patch work on powerpc?

Hi Kan,
   Sorry, I was on vacation and couldn't reply sooner. Yes, this fix works.
But we have another issue: the changes in this patch set use the ldlat
attribute. ldlat is not supported on powerpc, so perf mem fails on
powerpc.

I am looking into a workaround for this issue and will follow up with the fix.

Thanks,
Kajol Jain


Liang, Kan Jan. 5, 2024, 2:38 p.m. UTC | #6
On 2024-01-05 1:38 a.m., kajoljain wrote:
> 
> 
> On 1/3/24 01:38, Liang, Kan wrote:
>> Hi Kajol Jain
>>
>> On 2023-12-19 9:15 a.m., Liang, Kan wrote:
>>>
>>>
>>> On 2023-12-19 4:26 a.m., kajoljain wrote:
>>>> Hi,
>>>>   I was trying to test this patchset on powerpc.
>>>>
>>>> After applying it on top of acme's perf-tools-next branch, I am getting
>>>> below error:
>>>>
>>>>   INSTALL libsubcmd_headers
>>>>   INSTALL libperf_headers
>>>>   INSTALL libsymbol_headers
>>>>   INSTALL libapi_headers
>>>>   INSTALL libbpf_headers
>>>>   CC      arch/powerpc/util/mem-events.o
>>>> In file included from arch/powerpc/util/mem-events.c:3:
>>>> arch/powerpc/util/mem-events.h:5:52: error: ‘PERF_MEM_EVENTS__MAX’ undeclared here (not in a function)
>>>>     5 | extern struct perf_mem_event perf_mem_events_power[PERF_MEM_EVENTS__MAX];
>>>>       |                                                    ^~~~~~~~~~~~~~~~~~~~
>>>> make[6]: *** [/home/kajol/linux/tools/build/Makefile.build:105: arch/powerpc/util/mem-events.o] Error 1
>>>> make[5]: *** [/home/kajol/linux/tools/build/Makefile.build:158: util] Error 2
>>>> make[4]: *** [/home/kajol/linux/tools/build/Makefile.build:158: powerpc] Error 2
>>>> make[3]: *** [/home/kajol/linux/tools/build/Makefile.build:158: arch] Error 2
>>>> make[3]: *** Waiting for unfinished jobs....
>>>> make[2]: *** [Makefile.perf:693: perf-in.o] Error 2
>>>> make[1]: *** [Makefile.perf:251: sub-make] Error 2
>>>> make: *** [Makefile:70: all] Error 2
>>>>
>>>> It seems some headerfiles are missing from
>>>> arch/powerpc/util/mem-events.c
>>>>
>>>
>>> Leo updated the headerfiles for ARM. https://termbin.com/0dkn
>>>
>>> I guess powerpc has to do the same thing. Could you please try the below
>>> patch?
>>
>>
>> Does the patch work on powerpc?
> 
> Hi Kan,
>    Sorry I went for vacation so couldn't update. Yes this fix works. 

Thanks for the update.

> But
> we have another issue, actually this patch set changes uses ldlat
> attribute. But ldlat is not supported in powerpc because of which perf
> mem is failing in powerpc.

For powerpc, patch 3 introduced perf_mem_events_power, which doesn't
have ldlat. But it is only assigned when pmu->is_core is set. I'm not
sure if that's the problem.
Also, S390 still uses the default perf_mem_events, which includes ldlat.
I'm not sure if S390 supports ldlat.
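
To make the ldlat question concrete, here is a minimal standalone sketch.
It is not the perf code itself: the struct is a cut-down stand-in whose
field names follow the E() macro quoted in this thread, and the helper
format_mem_event() plus both tables are hypothetical. It only shows how a
per-event ldlat flag could decide whether ",ldlat=N" ends up in the event
string.

```c
/*
 * Illustrative only: a simplified stand-in for perf's mem-events table.
 * format_mem_event() and the table contents are assumptions made for
 * this sketch, not the actual perf implementation.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct mem_event {
	const char *tag;        /* user-visible name, e.g. "ldlat-loads" */
	const char *event_name; /* event name without the PMU prefix */
	bool ldlat;             /* true if the event accepts ldlat=N */
};

/* Default-style table: loads take an ldlat threshold. */
static const struct mem_event events_default[] = {
	{ "ldlat-loads",  "mem-loads",  true  },
	{ "ldlat-stores", "mem-stores", false },
};

/* Table for an arch without ldlat support: the flag is false everywhere. */
static const struct mem_event events_no_ldlat[] = {
	{ "ldlat-loads",  "mem-loads",  false },
	{ "ldlat-stores", "mem-stores", false },
};

/* Build "<pmu>/<event>[,ldlat=N]/P" into buf; returns snprintf()'s result. */
static int format_mem_event(char *buf, size_t len, const char *pmu,
			    const struct mem_event *e, unsigned int ldlat)
{
	assert(buf && pmu && e);

	if (e->ldlat)
		return snprintf(buf, len, "%s/%s,ldlat=%u/P",
				pmu, e->event_name, ldlat);

	return snprintf(buf, len, "%s/%s/P", pmu, e->event_name);
}
```

With the first table, a load event formatted for PMU "cpu" with a
threshold of 30 gets the ldlat term appended. With the second table the
same call yields the plain event, which is what an arch without ldlat
would need.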

Thanks,
Kan
> 
> I am looking into a work around to fix this issue. I will update the fix.
> 
> Thanks,
> Kajol Jain
> 
> 
>>
>>
>> Thanks,
>> Kan
>>>
>>> diff --git a/tools/perf/arch/powerpc/util/mem-events.c
>>> b/tools/perf/arch/powerpc/util/mem-events.c
>>> index 72a6ac2b52f5..765d4a054b0a 100644
>>> --- a/tools/perf/arch/powerpc/util/mem-events.c
>>> +++ b/tools/perf/arch/powerpc/util/mem-events.c
>>> @@ -1,5 +1,6 @@
>>>  // SPDX-License-Identifier: GPL-2.0
>>> -#include "map_symbol.h"
>>> +#include "util/map_symbol.h"
>>> +#include "util/mem-events.h"
>>>  #include "mem-events.h"
>>>
>>>  #define E(t, n, s, l, a) { .tag = t, .name = n, .event_name = s, .ldlat
>>> = l, .aux_event = a }
>>>
>>> Thanks,
>>> Kan
>>>
>>>> Thanks,
>>>> Kajol Jain
>>>>
>>>> On 12/14/23 01:21, kan.liang@linux.intel.com wrote:
>>>>> From: Kan Liang <kan.liang@linux.intel.com>
>>>>>
>>>>> Changes since V2:
>>>>> - Fix the Arm64 building error (Leo)
>>>>> - Add two new patches to clean up perf_mem_events__record_args()
>>>>>   and perf_pmus__num_mem_pmus() (Leo)
>>>>>
>>>>> Changes since V1:
>>>>> - Fix strcmp of PMU name checking (Ravi)
>>>>> - Fix "/," typo (Ian)
>>>>> - Rename several functions with perf_pmu__mem_events prefix. (Ian)
>>>>> - Fold the header removal patch into the patch where the cleanups made.
>>>>>   (Arnaldo)
>>>>> - Add reviewed-by and tested-by from Ian and Ravi
>>>>>
>>>>> As discussed in the below thread, the patch set is to clean up perf mem.
>>>>> https://lore.kernel.org/lkml/afefab15-cffc-4345-9cf4-c6a4128d4d9c@linux.intel.com/
>>>>>
>>>>> Introduce generic functions perf_mem_events__ptr(),
>>>>> perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
>>>>> ARCH specific ones.
>>>>> Simplify the perf_mem_event__supported().
>>>>>
>>>>> Only keeps the ARCH-specific perf_mem_events array in the corresponding
>>>>> mem-events.c for each ARCH.
>>>>>
>>>>> There is no functional change.
>>>>>
>>>>> The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
>>>>> etc. But I can only test it on two Intel platforms.
>>>>> Please give it try, if you have machines with other ARCHs.
>>>>>
>>>>> Here are the test results:
>>>>> Intel hybrid machine:
>>>>>
>>>>> $perf mem record -e list
>>>>> ldlat-loads  : available
>>>>> ldlat-stores : available
>>>>>
>>>>> $perf mem record -e ldlat-loads -v --ldlat 50
>>>>> calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P
>>>>>
>>>>> $perf mem record -v
>>>>> calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P
>>>>>
>>>>> $perf mem record -t store -v
>>>>> calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P
>>>>>
>>>>>
>>>>> Intel SPR:
>>>>> $perf mem record -e list
>>>>> ldlat-loads  : available
>>>>> ldlat-stores : available
>>>>>
>>>>> $perf mem record -e ldlat-loads -v --ldlat 50
>>>>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P
>>>>>
>>>>> $perf mem record -v
>>>>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P
>>>>>
>>>>> $perf mem record -t store -v
>>>>> calling: record -e cpu/mem-stores/P
>>>>>
>>>>> Kan Liang (7):
>>>>>   perf mem: Add mem_events into the supported perf_pmu
>>>>>   perf mem: Clean up perf_mem_events__ptr()
>>>>>   perf mem: Clean up perf_mem_events__name()
>>>>>   perf mem: Clean up perf_mem_event__supported()
>>>>>   perf mem: Clean up is_mem_loads_aux_event()
>>>>>   perf mem: Clean up perf_mem_events__record_args()
>>>>>   perf mem: Clean up perf_pmus__num_mem_pmus()
>>>>>
>>>>>  tools/perf/arch/arm/util/pmu.c            |   3 +
>>>>>  tools/perf/arch/arm64/util/mem-events.c   |  39 +---
>>>>>  tools/perf/arch/arm64/util/mem-events.h   |   7 +
>>>>>  tools/perf/arch/powerpc/util/mem-events.c |  13 +-
>>>>>  tools/perf/arch/powerpc/util/mem-events.h |   7 +
>>>>>  tools/perf/arch/powerpc/util/pmu.c        |  11 ++
>>>>>  tools/perf/arch/s390/util/pmu.c           |   3 +
>>>>>  tools/perf/arch/x86/util/mem-events.c     |  99 ++--------
>>>>>  tools/perf/arch/x86/util/mem-events.h     |  10 +
>>>>>  tools/perf/arch/x86/util/pmu.c            |  19 +-
>>>>>  tools/perf/builtin-c2c.c                  |  45 ++---
>>>>>  tools/perf/builtin-mem.c                  |  48 ++---
>>>>>  tools/perf/util/mem-events.c              | 217 +++++++++++++---------
>>>>>  tools/perf/util/mem-events.h              |  19 +-
>>>>>  tools/perf/util/pmu.c                     |   4 +-
>>>>>  tools/perf/util/pmu.h                     |   7 +
>>>>>  tools/perf/util/pmus.c                    |   6 -
>>>>>  tools/perf/util/pmus.h                    |   1 -
>>>>>  18 files changed, 278 insertions(+), 280 deletions(-)
>>>>>  create mode 100644 tools/perf/arch/arm64/util/mem-events.h
>>>>>  create mode 100644 tools/perf/arch/powerpc/util/mem-events.h
>>>>>  create mode 100644 tools/perf/arch/powerpc/util/pmu.c
>>>>>  create mode 100644 tools/perf/arch/x86/util/mem-events.h
>>>>>
>>>>
>>>
>
Leo Yan Jan. 7, 2024, 4:08 a.m. UTC | #7
On Wed, Dec 13, 2023 at 11:51:47AM -0800, kan.liang@linux.intel.com wrote:

[...]

> Introduce generic functions perf_mem_events__ptr(),
> perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
> ARCH specific ones.
> Simplify the perf_mem_event__supported().
> 
> Only keeps the ARCH-specific perf_mem_events array in the corresponding
> mem-events.c for each ARCH.
> 
> There is no functional change.
> 
> The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
> etc. But I can only test it on two Intel platforms.
> Please give it try, if you have machines with other ARCHs.
> 
> Here are the test results:
> Intel hybrid machine:
> 
> $perf mem record -e list
> ldlat-loads  : available
> ldlat-stores : available
> 
> $perf mem record -e ldlat-loads -v --ldlat 50
> calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P
> 
> $perf mem record -v
> calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P
> 
> $perf mem record -t store -v
> calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P
> 
> 
> Intel SPR:
> $perf mem record -e list
> ldlat-loads  : available
> ldlat-stores : available
> 
> $perf mem record -e ldlat-loads -v --ldlat 50
> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P
> 
> $perf mem record -v
> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P
> 
> $perf mem record -t store -v
> calling: record -e cpu/mem-stores/P

After applying this series, the tests below pass with Arm SPE:

# ./perf c2c record -- /home/leoy/false_sharing.exe 2
# ./perf c2c report

# ./perf mem record -e list
# ./perf mem record -e spe-load -v --ldlat 50
# ./perf mem record -v
# ./perf mem report
# ./perf mem record -t store -v
# ./perf mem report

Tested-by: Leo Yan <leo.yan@linaro.org>
Liang, Kan Jan. 9, 2024, 2:01 p.m. UTC | #8
On 2024-01-06 11:08 p.m., Leo Yan wrote:
> On Wed, Dec 13, 2023 at 11:51:47AM -0800, kan.liang@linux.intel.com wrote:
> 
> [...]
> 
>> Introduce generic functions perf_mem_events__ptr(),
>> perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
>> ARCH specific ones.
>> Simplify the perf_mem_event__supported().
>>
>> Only keeps the ARCH-specific perf_mem_events array in the corresponding
>> mem-events.c for each ARCH.
>>
>> There is no functional change.
>>
>> The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
>> etc. But I can only test it on two Intel platforms.
>> Please give it try, if you have machines with other ARCHs.
>>
>> Here are the test results:
>> Intel hybrid machine:
>>
>> $perf mem record -e list
>> ldlat-loads  : available
>> ldlat-stores : available
>>
>> $perf mem record -e ldlat-loads -v --ldlat 50
>> calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P
>>
>> $perf mem record -v
>> calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P
>>
>> $perf mem record -t store -v
>> calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P
>>
>>
>> Intel SPR:
>> $perf mem record -e list
>> ldlat-loads  : available
>> ldlat-stores : available
>>
>> $perf mem record -e ldlat-loads -v --ldlat 50
>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P
>>
>> $perf mem record -v
>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P
>>
>> $perf mem record -t store -v
>> calling: record -e cpu/mem-stores/P
> 
> After applying this series, below tests pass with Arm SPE:
> 
> # ./perf c2c record -- /home/leoy/false_sharing.exe 2
> # ./perf c2c report
> 
> # ./perf mem record -e list
> # ./perf mem record -e spe-load -v --ldlat 50
> # ./perf mem record -v
> # ./perf mem report
> # ./perf mem record -t store -v
> # ./perf mem report
> 
> Tested-by: Leo Yan <leo.yan@linaro.org>
>

Thanks Leo.

Kan
kajoljain Jan. 16, 2024, 2:05 p.m. UTC | #9
On 1/5/24 20:08, Liang, Kan wrote:
> 
> 
> On 2024-01-05 1:38 a.m., kajoljain wrote:
>>
>>
>> On 1/3/24 01:38, Liang, Kan wrote:
>>> Hi Kajol Jain
>>>
>>> On 2023-12-19 9:15 a.m., Liang, Kan wrote:
>>>>
>>>>
>>>> On 2023-12-19 4:26 a.m., kajoljain wrote:
>>>>> Hi,
>>>>>   I was trying to test this patchset on powerpc.
>>>>>
>>>>> After applying it on top of acme's perf-tools-next branch, I am getting
>>>>> below error:
>>>>>
>>>>>   INSTALL libsubcmd_headers
>>>>>   INSTALL libperf_headers
>>>>>   INSTALL libsymbol_headers
>>>>>   INSTALL libapi_headers
>>>>>   INSTALL libbpf_headers
>>>>>   CC      arch/powerpc/util/mem-events.o
>>>>> In file included from arch/powerpc/util/mem-events.c:3:
>>>>> arch/powerpc/util/mem-events.h:5:52: error: ‘PERF_MEM_EVENTS__MAX’
>>>>> undeclared here (not in a function)
>>>>>     5 | extern struct perf_mem_event
>>>>> perf_mem_events_power[PERF_MEM_EVENTS__MAX];
>>>>>       |
>>>>> ^~~~~~~~~~~~~~~~~~~~
>>>>> make[6]: *** [/home/kajol/linux/tools/build/Makefile.build:105:
>>>>> arch/powerpc/util/mem-events.o] Error 1
>>>>> make[5]: *** [/home/kajol/linux/tools/build/Makefile.build:158: util]
>>>>> Error 2
>>>>> make[4]: *** [/home/kajol/linux/tools/build/Makefile.build:158: powerpc]
>>>>> Error 2
>>>>> make[3]: *** [/home/kajol/linux/tools/build/Makefile.build:158: arch]
>>>>> Error 2
>>>>> make[3]: *** Waiting for unfinished jobs....
>>>>> make[2]: *** [Makefile.perf:693: perf-in.o] Error 2
>>>>> make[1]: *** [Makefile.perf:251: sub-make] Error 2
>>>>> make: *** [Makefile:70: all] Error 2
>>>>>
>>>>> It seems some headerfiles are missing from arch/powerpc/util/mem-
>>>>> events.c
>>>>>
>>>>
>>>> Leo updated the headerfiles for ARM. https://termbin.com/0dkn
>>>>
>>>> I guess powerpc has to do the same thing. Could you please try the below
>>>> patch?
>>>
>>>
>>> Does the patch work on powerpc?
>>
>> Hi Kan,
>>    Sorry I went for vacation so couldn't update. Yes this fix works. 
> 
> Thanks for the update.
> 
>> But
>> we have another issue, actually this patch set changes uses ldlat
>> attribute. But ldlat is not supported in powerpc because of which perf
>> mem is failing in powerpc.
> 
> For powerpc, the patch 3 introduced a perf_mem_events_power, which
> doesn't have ldlat. But it only be assigned to the pmu->is_core. I'm not
> sure if it's the problem.

Hi Kan,
 Correct, there were some small issues with patch 3. I added a fix for that.

> Also, S390 still uses the default perf_mem_events, which includes ldlat.
> I'm not sure if S390 supports the ldlat.

I checked it and didn't find an ldlat parameter defined in the arch/s390
directory. I think it's better to make the default ldlat value false
in the tools/perf/util/mem-events.c file.

Thanks,
Kajol Jain

> 
> Thanks,
> Kan
>>
>> I am looking into a work around to fix this issue. I will update the fix.
>>
>> Thanks,
>> Kajol Jain
>>
>>
>>>
>>>
>>> Thanks,
>>> Kan
>>>>
>>>> diff --git a/tools/perf/arch/powerpc/util/mem-events.c
>>>> b/tools/perf/arch/powerpc/util/mem-events.c
>>>> index 72a6ac2b52f5..765d4a054b0a 100644
>>>> --- a/tools/perf/arch/powerpc/util/mem-events.c
>>>> +++ b/tools/perf/arch/powerpc/util/mem-events.c
>>>> @@ -1,5 +1,6 @@
>>>>  // SPDX-License-Identifier: GPL-2.0
>>>> -#include "map_symbol.h"
>>>> +#include "util/map_symbol.h"
>>>> +#include "util/mem-events.h"
>>>>  #include "mem-events.h"
>>>>
>>>>  #define E(t, n, s, l, a) { .tag = t, .name = n, .event_name = s, .ldlat
>>>> = l, .aux_event = a }
>>>>
>>>> Thanks,
>>>> Kan
>>>>
>>>>> Thanks,
>>>>> Kajol Jain
>>>>>
>>>>> On 12/14/23 01:21, kan.liang@linux.intel.com wrote:
>>>>>> From: Kan Liang <kan.liang@linux.intel.com>
>>>>>>
>>>>>> Changes since V2:
>>>>>> - Fix the Arm64 building error (Leo)
>>>>>> - Add two new patches to clean up perf_mem_events__record_args()
>>>>>>   and perf_pmus__num_mem_pmus() (Leo)
>>>>>>
>>>>>> Changes since V1:
>>>>>> - Fix strcmp of PMU name checking (Ravi)
>>>>>> - Fix "/," typo (Ian)
>>>>>> - Rename several functions with perf_pmu__mem_events prefix. (Ian)
>>>>>> - Fold the header removal patch into the patch where the cleanups made.
>>>>>>   (Arnaldo)
>>>>>> - Add reviewed-by and tested-by from Ian and Ravi
>>>>>>
>>>>>> As discussed in the below thread, the patch set is to clean up perf mem.
>>>>>> https://lore.kernel.org/lkml/afefab15-cffc-4345-9cf4-c6a4128d4d9c@linux.intel.com/
>>>>>>
>>>>>> Introduce generic functions perf_mem_events__ptr(),
>>>>>> perf_mem_events__name() ,and is_mem_loads_aux_event() to replace the
>>>>>> ARCH specific ones.
>>>>>> Simplify the perf_mem_event__supported().
>>>>>>
>>>>>> Only keeps the ARCH-specific perf_mem_events array in the corresponding
>>>>>> mem-events.c for each ARCH.
>>>>>>
>>>>>> There is no functional change.
>>>>>>
>>>>>> The patch set touches almost all the ARCHs, Intel, AMD, ARM, Power and
>>>>>> etc. But I can only test it on two Intel platforms.
>>>>>> Please give it try, if you have machines with other ARCHs.
>>>>>>
>>>>>> Here are the test results:
>>>>>> Intel hybrid machine:
>>>>>>
>>>>>> $perf mem record -e list
>>>>>> ldlat-loads  : available
>>>>>> ldlat-stores : available
>>>>>>
>>>>>> $perf mem record -e ldlat-loads -v --ldlat 50
>>>>>> calling: record -e cpu_atom/mem-loads,ldlat=50/P -e cpu_core/mem-loads,ldlat=50/P
>>>>>>
>>>>>> $perf mem record -v
>>>>>> calling: record -e cpu_atom/mem-loads,ldlat=30/P -e cpu_atom/mem-stores/P -e cpu_core/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P
>>>>>>
>>>>>> $perf mem record -t store -v
>>>>>> calling: record -e cpu_atom/mem-stores/P -e cpu_core/mem-stores/P
>>>>>>
>>>>>>
>>>>>> Intel SPR:
>>>>>> $perf mem record -e list
>>>>>> ldlat-loads  : available
>>>>>> ldlat-stores : available
>>>>>>
>>>>>> $perf mem record -e ldlat-loads -v --ldlat 50
>>>>>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=50/}:P
>>>>>>
>>>>>> $perf mem record -v
>>>>>> calling: record -e {cpu/mem-loads-aux/,cpu/mem-loads,ldlat=30/}:P -e cpu/mem-stores/P
>>>>>>
>>>>>> $perf mem record -t store -v
>>>>>> calling: record -e cpu/mem-stores/P
>>>>>>
>>>>>> Kan Liang (7):
>>>>>>   perf mem: Add mem_events into the supported perf_pmu
>>>>>>   perf mem: Clean up perf_mem_events__ptr()
>>>>>>   perf mem: Clean up perf_mem_events__name()
>>>>>>   perf mem: Clean up perf_mem_event__supported()
>>>>>>   perf mem: Clean up is_mem_loads_aux_event()
>>>>>>   perf mem: Clean up perf_mem_events__record_args()
>>>>>>   perf mem: Clean up perf_pmus__num_mem_pmus()
>>>>>>
>>>>>>  tools/perf/arch/arm/util/pmu.c            |   3 +
>>>>>>  tools/perf/arch/arm64/util/mem-events.c   |  39 +---
>>>>>>  tools/perf/arch/arm64/util/mem-events.h   |   7 +
>>>>>>  tools/perf/arch/powerpc/util/mem-events.c |  13 +-
>>>>>>  tools/perf/arch/powerpc/util/mem-events.h |   7 +
>>>>>>  tools/perf/arch/powerpc/util/pmu.c        |  11 ++
>>>>>>  tools/perf/arch/s390/util/pmu.c           |   3 +
>>>>>>  tools/perf/arch/x86/util/mem-events.c     |  99 ++--------
>>>>>>  tools/perf/arch/x86/util/mem-events.h     |  10 +
>>>>>>  tools/perf/arch/x86/util/pmu.c            |  19 +-
>>>>>>  tools/perf/builtin-c2c.c                  |  45 ++---
>>>>>>  tools/perf/builtin-mem.c                  |  48 ++---
>>>>>>  tools/perf/util/mem-events.c              | 217 +++++++++++++---------
>>>>>>  tools/perf/util/mem-events.h              |  19 +-
>>>>>>  tools/perf/util/pmu.c                     |   4 +-
>>>>>>  tools/perf/util/pmu.h                     |   7 +
>>>>>>  tools/perf/util/pmus.c                    |   6 -
>>>>>>  tools/perf/util/pmus.h                    |   1 -
>>>>>>  18 files changed, 278 insertions(+), 280 deletions(-)
>>>>>>  create mode 100644 tools/perf/arch/arm64/util/mem-events.h
>>>>>>  create mode 100644 tools/perf/arch/powerpc/util/mem-events.h
>>>>>>  create mode 100644 tools/perf/arch/powerpc/util/pmu.c
>>>>>>  create mode 100644 tools/perf/arch/x86/util/mem-events.h
>>>>>>
>>>>>
>>>>
>>
>
Liang, Kan Jan. 16, 2024, 4:37 p.m. UTC | #10
On 2024-01-16 9:05 a.m., kajoljain wrote:
>> For powerpc, the patch 3 introduced a perf_mem_events_power, which
>> doesn't have ldlat. But it only be assigned to the pmu->is_core. I'm not
>> sure if it's the problem.
> Hi Kan,
>  Correct there were some small issues with patch 3, I added fix for that.
>

Thanks Kajol Jain! I will fold your fix into V4.

>> Also, S390 still uses the default perf_mem_events, which includes ldlat.
>> I'm not sure if S390 supports the ldlat.
> I checked it, I didn't find ldlat parameter defined in arch/s390
> directory. I think its better to make default ldlat value as false
> in tools/perf/util/mem-events.c file.

s390 may not be the only user of the default perf_mem_events[] in
tools/perf/util/mem-events.c. We probably cannot change the default
value.
We may share perf_mem_events_power[] between powerpc and s390. (We
did a similar share for arm and arm64.)

How about the patch below? (Not tested.)

diff --git a/tools/perf/arch/s390/util/pmu.c
b/tools/perf/arch/s390/util/pmu.c
index 225d7dc2379c..411034c984bb 100644
--- a/tools/perf/arch/s390/util/pmu.c
+++ b/tools/perf/arch/s390/util/pmu.c
@@ -8,6 +8,7 @@
 #include <string.h>

 #include "../../../util/pmu.h"
+#include "../../powerpc/util/mem-events.h"

 #define        S390_PMUPAI_CRYPTO      "pai_crypto"
 #define        S390_PMUPAI_EXT         "pai_ext"
@@ -21,5 +22,5 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
                pmu->selectable = true;

        if (pmu->is_core)
-               pmu->mem_events = perf_mem_events;
+               pmu->mem_events = perf_mem_events_power;
 }



However, the original s390 code doesn't include any s390-specific code
for perf_mem. So I thought it used the default perf_mem_events[].
Is there something I missed?

Or does s390 even support mem events? If not, I may remove
mem_events from s390.
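
For the removal option, a rough standalone sketch of the idea follows.
All names here (struct pmu, arch_init_with_mem(), arch_init_without_mem(),
num_mem_pmus()) are hypothetical stand-ins rather than the real perf
symbols, and the sketch assumes the generic code treats a NULL mem_events
pointer as "this PMU has no perf mem support".

```c
/*
 * Illustrative only: simplified stand-ins for struct perf_pmu and the
 * arch init hook. None of these names are the actual perf API.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mem_event {
	const char *tag;
};

struct pmu {
	const char *name;
	bool is_core;
	const struct mem_event *mem_events; /* NULL: no perf mem support */
};

static const struct mem_event generic_events[] = {
	{ "ldlat-loads" },
	{ "ldlat-stores" },
};

/* Hypothetical hook for an arch that supports mem events. */
static void arch_init_with_mem(struct pmu *pmu)
{
	if (pmu->is_core)
		pmu->mem_events = generic_events;
}

/* Hypothetical hook for an arch without them: mem_events stays NULL. */
static void arch_init_without_mem(struct pmu *pmu)
{
	(void)pmu;
}

/* Count the PMUs that perf mem could use, skipping NULL tables. */
static int num_mem_pmus(const struct pmu *pmus, size_t n)
{
	int count = 0;

	assert(pmus || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (pmus[i].mem_events)
			count++;
	}

	return count;
}
```

Under that assumption, a PMU initialized by the second hook is simply
never counted or iterated as a mem PMU, so no s390-specific table would
be needed.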

Thanks,
Kan
kajoljain Jan. 23, 2024, 5:30 a.m. UTC | #11
On 1/16/24 22:07, Liang, Kan wrote:
> 
> 
> On 2024-01-16 9:05 a.m., kajoljain wrote:
>>> For powerpc, the patch 3 introduced a perf_mem_events_power, which
>>> doesn't have ldlat. But it only be assigned to the pmu->is_core. I'm not
>>> sure if it's the problem.
>> Hi Kan,
>>  Correct there were some small issues with patch 3, I added fix for that.
>>
> 
> Thanks Kajol Jain! I will fold your fix into V4.
> 
>>> Also, S390 still uses the default perf_mem_events, which includes ldlat.
>>> I'm not sure if S390 supports the ldlat.
>> I checked it, I didn't find ldlat parameter defined in arch/s390
>> directory. I think its better to make default ldlat value as false
>> in tools/perf/util/mem-events.c file.
> 
> The s390 may not be the only user for the default perf_mem_events[] in
> the tools/perf/util/mem-events.c. We probably cannot change the default
> value.
> We may share the perf_mem_events_power[] between powerpc and s390. (We
> did the similar share for arm and arm64.)
> 
> How about the below patch (not tested.)
> 
> diff --git a/tools/perf/arch/s390/util/pmu.c
> b/tools/perf/arch/s390/util/pmu.c
> index 225d7dc2379c..411034c984bb 100644
> --- a/tools/perf/arch/s390/util/pmu.c
> +++ b/tools/perf/arch/s390/util/pmu.c
> @@ -8,6 +8,7 @@
>  #include <string.h>
> 
>  #include "../../../util/pmu.h"
> +#include "../../powerpc/util/mem-events.h"
> 
>  #define        S390_PMUPAI_CRYPTO      "pai_crypto"
>  #define        S390_PMUPAI_EXT         "pai_ext"
> @@ -21,5 +22,5 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
>                 pmu->selectable = true;
> 
>         if (pmu->is_core)
> -               pmu->mem_events = perf_mem_events;
> +               pmu->mem_events = perf_mem_events_power;
>  }
> 
> 
> 
> However, the original s390 code doesn't include any s390 specific code
> for perf_mem. So I thought it uses the default perf_mem_events[].
> Is there something I missed?
> 
> Or does the s390 even support mem events? If not, I may remove the
> mem_events from s390.

Hi Kan,
   I don't have an s390 system to test on. But from my end I am fine
with the changes.

Thanks,
Kajol Jain

> 
> Thanks,
> Kan
Thomas Richter Jan. 23, 2024, 5:56 a.m. UTC | #12
On 1/23/24 06:30, kajoljain wrote:
> 
> 
> On 1/16/24 22:07, Liang, Kan wrote:
>>
>>
>> On 2024-01-16 9:05 a.m., kajoljain wrote:
>>>> For powerpc, the patch 3 introduced a perf_mem_events_power, which
>>>> doesn't have ldlat. But it only be assigned to the pmu->is_core. I'm not
>>>> sure if it's the problem.
>>> Hi Kan,
>>>  Correct there were some small issues with patch 3, I added fix for that.
>>>
>>
>> Thanks Kajol Jain! I will fold your fix into V4.
>>
>>>> Also, S390 still uses the default perf_mem_events, which includes ldlat.
>>>> I'm not sure if S390 supports the ldlat.
>>> I checked it, I didn't find ldlat parameter defined in arch/s390
>>> directory. I think its better to make default ldlat value as false
>>> in tools/perf/util/mem-events.c file.
>>
>> The s390 may not be the only user for the default perf_mem_events[] in
>> the tools/perf/util/mem-events.c. We probably cannot change the default
>> value.
>> We may share the perf_mem_events_power[] between powerpc and s390. (We
>> did the similar share for arm and arm64.)
>>
>> How about the below patch (not tested.)
>>
>> diff --git a/tools/perf/arch/s390/util/pmu.c
>> b/tools/perf/arch/s390/util/pmu.c
>> index 225d7dc2379c..411034c984bb 100644
>> --- a/tools/perf/arch/s390/util/pmu.c
>> +++ b/tools/perf/arch/s390/util/pmu.c
>> @@ -8,6 +8,7 @@
>>  #include <string.h>
>>
>>  #include "../../../util/pmu.h"
>> +#include "../../powerpc/util/mem-events.h"
>>
>>  #define        S390_PMUPAI_CRYPTO      "pai_crypto"
>>  #define        S390_PMUPAI_EXT         "pai_ext"
>> @@ -21,5 +22,5 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
>>                 pmu->selectable = true;
>>
>>         if (pmu->is_core)
>> -               pmu->mem_events = perf_mem_events;
>> +               pmu->mem_events = perf_mem_events_power;
>>  }
>>
>>
>>
>> However, the original s390 code doesn't include any s390 specific code
>> for perf_mem. So I thought it uses the default perf_mem_events[].
>> Is there something I missed?
>>
>> Or does the s390 even support mem events? If not, I may remove the
>> mem_events from s390.
> 
> Hi Kan,
>    I don't have s390 system to do testing. But from my end I am fine
> with the changes.
> 
> Thanks,
> Kajol Jain
> 

s390 does not support perf mem at all. Right now it is safe to remove it from s390.
Thanks

>>
>> Thanks,
>> Kan
>
Liang, Kan Jan. 23, 2024, 2:36 p.m. UTC | #13
On 2024-01-23 12:56 a.m., Thomas Richter wrote:
> On 1/23/24 06:30, kajoljain wrote:
>>
>>
>> On 1/16/24 22:07, Liang, Kan wrote:
>>>
>>>
>>> On 2024-01-16 9:05 a.m., kajoljain wrote:
>>>>> For powerpc, the patch 3 introduced a perf_mem_events_power, which
>>>>> doesn't have ldlat. But it only be assigned to the pmu->is_core. I'm not
>>>>> sure if it's the problem.
>>>> Hi Kan,
>>>>  Correct there were some small issues with patch 3, I added fix for that.
>>>>
>>>
>>> Thanks Kajol Jain! I will fold your fix into V4.
>>>
>>>>> Also, S390 still uses the default perf_mem_events, which includes ldlat.
>>>>> I'm not sure if S390 supports the ldlat.
>>>> I checked it, I didn't find ldlat parameter defined in arch/s390
>>>> directory. I think its better to make default ldlat value as false
>>>> in tools/perf/util/mem-events.c file.
>>>
>>> The s390 may not be the only user for the default perf_mem_events[] in
>>> the tools/perf/util/mem-events.c. We probably cannot change the default
>>> value.
>>> We may share the perf_mem_events_power[] between powerpc and s390. (We
>>> did the similar share for arm and arm64.)
>>>
>>> How about the below patch (not tested.)
>>>
>>> diff --git a/tools/perf/arch/s390/util/pmu.c
>>> b/tools/perf/arch/s390/util/pmu.c
>>> index 225d7dc2379c..411034c984bb 100644
>>> --- a/tools/perf/arch/s390/util/pmu.c
>>> +++ b/tools/perf/arch/s390/util/pmu.c
>>> @@ -8,6 +8,7 @@
>>>  #include <string.h>
>>>
>>>  #include "../../../util/pmu.h"
>>> +#include "../../powerpc/util/mem-events.h"
>>>
>>>  #define        S390_PMUPAI_CRYPTO      "pai_crypto"
>>>  #define        S390_PMUPAI_EXT         "pai_ext"
>>> @@ -21,5 +22,5 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
>>>                 pmu->selectable = true;
>>>
>>>         if (pmu->is_core)
>>> -               pmu->mem_events = perf_mem_events;
>>> +               pmu->mem_events = perf_mem_events_power;
>>>  }
>>>
>>>
>>>
>>> However, the original s390 code doesn't include any s390 specific code
>>> for perf_mem. So I thought it uses the default perf_mem_events[].
>>> Is there something I missed?
>>>
>>> Or does the s390 even support mem events? If not, I may remove the
>>> mem_events from s390.
>>
>> Hi Kan,
>>    I don't have s390 system to do testing. But from my end I am fine
>> with the changes.
>>
>> Thanks,
>> Kajol Jain
>>
> 
> s390 does not support perf mem at all. Right now it is save to remove it from s390.

Thanks for the confirmation!

Thanks,
Kan

> Thanks
> 
>>>
>>> Thanks,
>>> Kan
>>
>