Message ID | 20220704081149.16797-10-mike.leach@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Series | coresight: Add new API to allocate trace source ID values |
On 04/07/2022 09:11, Mike Leach wrote:
> Trace IDs are now dynamically allocated.
>
> Previously used the static association algorithm that is no longer
> used. The 'cpu * 2 + seed' was outdated and broken for systems with high
> core counts (>46). as it did not scale and was broken for larger
> core counts.
>
> Trace ID is as unknown in AUXINFO record, and the ID / CPU association
> will now be sent in PERF_RECORD_AUX_OUTPUT_HW_ID record.
>
> Remove legacy Trace ID allocation algorithm.
>
> Signed-off-by: Mike Leach <mike.leach@linaro.org>
> ---
>  include/linux/coresight-pmu.h       | 19 +++++++------------
>  tools/include/linux/coresight-pmu.h | 19 +++++++------------

I usually see mentions that these header updates need to be separate commits
because they are merged through different trees.

>  tools/perf/arch/arm/util/cs-etm.c   | 21 ++++++++++++---------
>  3 files changed, 26 insertions(+), 33 deletions(-)
>
[...]
> diff --git a/tools/include/linux/coresight-pmu.h b/tools/include/linux/coresight-pmu.h
> index 6c2fd6cc5a98..31d007fab3a6 100644
> --- a/tools/include/linux/coresight-pmu.h
> +++ b/tools/include/linux/coresight-pmu.h
> @@ -8,7 +8,13 @@
>  #define _LINUX_CORESIGHT_PMU_H
>
>  #define CORESIGHT_ETM_PMU_NAME "cs_etm"
> -#define CORESIGHT_ETM_PMU_SEED 0x10
> +
> +/*
> + * Metadata now contains an unused trace ID - IDs are transmitted using a
> + * PERF_RECORD_AUX_OUTPUT_HW_ID record.
> + * Value architecturally defined as reserved in CoreSight.
> + */
> +#define CS_UNUSED_TRACE_ID 0x7F
>

minor nit: this isn't used in the kernel so only needs to be defined on the
tools side.

[...]
Hi James

On Wed, 20 Jul 2022 at 15:41, James Clark <james.clark@arm.com> wrote:
>
> On 04/07/2022 09:11, Mike Leach wrote:
> > Trace IDs are now dynamically allocated.
[...]
> > diff --git a/tools/include/linux/coresight-pmu.h b/tools/include/linux/coresight-pmu.h
> > index 6c2fd6cc5a98..31d007fab3a6 100644
> > --- a/tools/include/linux/coresight-pmu.h
> > +++ b/tools/include/linux/coresight-pmu.h
> > @@ -8,7 +8,13 @@
> >  #define _LINUX_CORESIGHT_PMU_H
> >
> >  #define CORESIGHT_ETM_PMU_NAME "cs_etm"
> > -#define CORESIGHT_ETM_PMU_SEED 0x10
> > +
> > +/*
> > + * Metadata now contains an unused trace ID - IDs are transmitted using a
> > + * PERF_RECORD_AUX_OUTPUT_HW_ID record.
> > + * Value architecturally defined as reserved in CoreSight.
> > + */
> > +#define CS_UNUSED_TRACE_ID 0x7F
> >
>
> minor nit: this isn't used in the kernel so only needs to be defined on the
> tools side.
>

Unfortunately if the two versions of coresight-pmu.h are different,
the build process for perf throws out a warning. So they have to be
identical.

Thanks

Mike

[...]
On August 9, 2022 1:13:15 PM GMT-03:00, Mike Leach <mike.leach@linaro.org> wrote:
>Hi James
>
>On Wed, 20 Jul 2022 at 15:41, James Clark <james.clark@arm.com> wrote:
>>
>> On 04/07/2022 09:11, Mike Leach wrote:
[...]
>> > +#define CS_UNUSED_TRACE_ID 0x7F
>> >
>>
>> minor nit: this isn't used in the kernel so only needs to be defined on the
>> tools side.
>>
>
>Unfortunately if the two versions of coresight-pmu.h are different,
>the build process for perf throws out a warning. So they have to be
>identical.

James is right, don't worry about the warning, kernel stuff can't go via the perf-tools tree.

- Arnaldo

>
>Thanks
>
>Mike
>
[...]
On 09/08/2022 17:13, Mike Leach wrote:
> Hi James
>
> On Wed, 20 Jul 2022 at 15:41, James Clark <james.clark@arm.com> wrote:
>>
>> On 04/07/2022 09:11, Mike Leach wrote:
[...]
>>> +#define CS_UNUSED_TRACE_ID 0x7F
>>>
>>
>> minor nit: this isn't used in the kernel so only needs to be defined on the
>> tools side.
>>
>
> Unfortunately if the two versions of coresight-pmu.h are different,
> the build process for perf throws out a warning. So they have to be
> identical.

I was thinking more along the lines of putting it in a header that is only
present on the perf side, rather than only having it in one version of a
shared header.

>
> Thanks
>
> Mike
>
[...]
diff --git a/include/linux/coresight-pmu.h b/include/linux/coresight-pmu.h
index 4ac5c081af93..9f7ee380266b 100644
--- a/include/linux/coresight-pmu.h
+++ b/include/linux/coresight-pmu.h
@@ -8,7 +8,13 @@
 #define _LINUX_CORESIGHT_PMU_H
 
 #define CORESIGHT_ETM_PMU_NAME "cs_etm"
-#define CORESIGHT_ETM_PMU_SEED 0x10
+
+/*
+ * Metadata now contains an unused trace ID - IDs are transmitted using a
+ * PERF_RECORD_AUX_OUTPUT_HW_ID record.
+ * Value architecturally defined as reserved in CoreSight.
+ */
+#define CS_UNUSED_TRACE_ID 0x7F
 
 /*
  * Below are the definition of bit offsets for perf option, and works as
@@ -32,15 +38,4 @@
 #define ETM4_CFG_BIT_RETSTK 12
 #define ETM4_CFG_BIT_VMID_OPT 15
 
-static inline int coresight_get_trace_id(int cpu)
-{
-	/*
-	 * A trace ID of value 0 is invalid, so let's start at some
-	 * random value that fits in 7 bits and go from there. Since
-	 * the common convention is to have data trace IDs be I(N) + 1,
-	 * set instruction trace IDs as a function of the CPU number.
-	 */
-	return (CORESIGHT_ETM_PMU_SEED + (cpu * 2));
-}
-
 #endif
diff --git a/tools/include/linux/coresight-pmu.h b/tools/include/linux/coresight-pmu.h
index 6c2fd6cc5a98..31d007fab3a6 100644
--- a/tools/include/linux/coresight-pmu.h
+++ b/tools/include/linux/coresight-pmu.h
@@ -8,7 +8,13 @@
 #define _LINUX_CORESIGHT_PMU_H
 
 #define CORESIGHT_ETM_PMU_NAME "cs_etm"
-#define CORESIGHT_ETM_PMU_SEED 0x10
+
+/*
+ * Metadata now contains an unused trace ID - IDs are transmitted using a
+ * PERF_RECORD_AUX_OUTPUT_HW_ID record.
+ * Value architecturally defined as reserved in CoreSight.
+ */
+#define CS_UNUSED_TRACE_ID 0x7F
 
 /*
  * Below are the definition of bit offsets for perf option, and works as
@@ -34,15 +40,4 @@
 #define ETM4_CFG_BIT_RETSTK 12
 #define ETM4_CFG_BIT_VMID_OPT 15
 
-static inline int coresight_get_trace_id(int cpu)
-{
-	/*
-	 * A trace ID of value 0 is invalid, so let's start at some
-	 * random value that fits in 7 bits and go from there. Since
-	 * the common convention is to have data trace IDs be I(N) + 1,
-	 * set instruction trace IDs as a function of the CPU number.
-	 */
-	return (CORESIGHT_ETM_PMU_SEED + (cpu * 2));
-}
-
 #endif
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index 1b54638d53b0..2d68e6a722ed 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -421,13 +421,16 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
 	evlist__to_front(evlist, cs_etm_evsel);
 
 	/*
-	 * In the case of per-cpu mmaps, we need the CPU on the
-	 * AUX event. We also need the contextID in order to be notified
+	 * get the CPU on the sample - need it to associate trace ID in the
+	 * AUX_OUTPUT_HW_ID event, and the AUX event for per-cpu mmaps.
+	 */
+	evsel__set_sample_bit(cs_etm_evsel, CPU);
+
+	/*
+	 * Also the case of per-cpu mmaps, need the contextID in order to be notified
 	 * when a context switch happened.
 	 */
 	if (!perf_cpu_map__empty(cpus)) {
-		evsel__set_sample_bit(cs_etm_evsel, CPU);
-
 		err = cs_etm_set_option(itr, cs_etm_evsel,
 					BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_TS));
 		if (err)
@@ -633,8 +636,9 @@ static void cs_etm_save_etmv4_header(__u64 data[], struct auxtrace_record *itr,
 
 	/* Get trace configuration register */
 	data[CS_ETMV4_TRCCONFIGR] = cs_etmv4_get_config(itr);
-	/* Get traceID from the framework */
-	data[CS_ETMV4_TRCTRACEIDR] = coresight_get_trace_id(cpu);
+	/* traceID set to unused */
+	data[CS_ETMV4_TRCTRACEIDR] = CS_UNUSED_TRACE_ID;
+
 	/* Get read-only information from sysFS */
 	data[CS_ETMV4_TRCIDR0] = cs_etm_get_ro(cs_etm_pmu, cpu,
 					       metadata_etmv4_ro[CS_ETMV4_TRCIDR0]);
@@ -681,9 +685,8 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
 		magic = __perf_cs_etmv3_magic;
 		/* Get configuration register */
 		info->priv[*offset + CS_ETM_ETMCR] = cs_etm_get_config(itr);
-		/* Get traceID from the framework */
-		info->priv[*offset + CS_ETM_ETMTRACEIDR] =
-						coresight_get_trace_id(cpu);
+		/* traceID set to unused */
+		info->priv[*offset + CS_ETM_ETMTRACEIDR] = CS_UNUSED_TRACE_ID;
 		/* Get read-only information from sysFS */
 		info->priv[*offset + CS_ETM_ETMCCER] =
 			cs_etm_get_ro(cs_etm_pmu, cpu,
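For readers following the diff, here is a minimal sketch of what the change implies for a consumer of the metadata: the AUXINFO slot now carries CS_UNUSED_TRACE_ID, so the real ID for a CPU is only known once a hardware-ID record for that CPU has been seen. Everything prefixed `sketch_` is hypothetical and is not a perf structure or function; only the 0x7F value comes from the patch, and the 7-bit mask follows the "fits in 7 bits" remark in the removed comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CS_UNUSED_TRACE_ID 0x7F /* value taken from the patch */

/* Hypothetical per-CPU bookkeeping for a decoder; not a perf type. */
struct sketch_cpu_trace_id {
	int cpu;
	uint8_t trace_id;
};

/* Seed an entry from the trace ID slot of the AUXINFO metadata. */
static inline void sketch_init_from_metadata(struct sketch_cpu_trace_id *e,
					     int cpu, uint64_t metadata_id)
{
	assert(e);
	e->cpu = cpu;
	e->trace_id = (uint8_t)(metadata_id & 0x7F); /* trace IDs are 7 bits */
}

/* True once a real ID has replaced the "unused" placeholder. */
static inline bool sketch_id_known(const struct sketch_cpu_trace_id *e)
{
	assert(e);
	return e->trace_id != CS_UNUSED_TRACE_ID;
}

/* Apply the ID carried by a hardware-ID record for this CPU. */
static inline void sketch_update_from_hw_id(struct sketch_cpu_trace_id *e,
					    uint8_t hw_id)
{
	assert(e);
	e->trace_id = hw_id;
}
```

The entry is keyed by CPU because the patch sets the CPU sample bit unconditionally so that the ID can be associated with a CPU; how the record itself is parsed is outside the scope of this sketch.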
Trace IDs are now dynamically allocated.

The static 'cpu * 2 + seed' association algorithm is no longer used. It
did not scale and was broken for systems with high core counts (>46).

The Trace ID is now recorded as unused in the AUXINFO record, and the
ID / CPU association is sent in a PERF_RECORD_AUX_OUTPUT_HW_ID record.

Remove the legacy Trace ID allocation algorithm.

Signed-off-by: Mike Leach <mike.leach@linaro.org>
---
 include/linux/coresight-pmu.h       | 19 +++++++------------
 tools/include/linux/coresight-pmu.h | 19 +++++++------------
 tools/perf/arch/arm/util/cs-etm.c   | 21 ++++++++++++---------
 3 files changed, 26 insertions(+), 33 deletions(-)
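
To illustrate why the static mapping cannot scale, the following sketch reproduces the removed 'cpu * 2 + seed' formula and checks it against a valid-ID window. The window of 0x01 to 0x6F is an assumption that the patch itself does not state; the `sketch_` names are hypothetical. The core count at which the formula breaks depends on which IDs are treated as reserved, so the number this sketch produces is illustrative and is not meant as a correction to the figure in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

#define CORESIGHT_ETM_PMU_SEED 0x10 /* removed by the patch */
#define CS_UNUSED_TRACE_ID     0x7F /* added by the patch */

/*
 * ASSUMPTION (not stated in the patch): usable trace IDs are 0x01..0x6F,
 * with 0x00 and 0x70..0x7F reserved.
 */
#define SKETCH_TRACE_ID_MIN 0x01
#define SKETCH_TRACE_ID_MAX 0x6F

/* The legacy static mapping that the patch removes. */
static inline int legacy_trace_id(int cpu)
{
	assert(cpu >= 0);
	return CORESIGHT_ETM_PMU_SEED + (cpu * 2);
}

/* True if the ID lies inside the assumed valid window. */
static inline bool sketch_trace_id_is_valid(int id)
{
	return id >= SKETCH_TRACE_ID_MIN && id <= SKETCH_TRACE_ID_MAX;
}

/* Lowest CPU number whose legacy ID falls outside the assumed window. */
static inline int sketch_first_bad_cpu(void)
{
	int cpu = 0;

	while (sketch_trace_id_is_valid(legacy_trace_id(cpu)))
		cpu++;
	return cpu;
}
```

Because the formula consumes two IDs per CPU from a 7-bit space, it runs out after a few dozen CPUs whatever the exact reserved ranges are, which is the scaling problem the dynamic allocator addresses.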