[00/10] coresight: Add new API to allocate trace source ID values

Message-Id: <20220308205000.27646-1-mike.leach@linaro.org>
From: Mike Leach <mike.leach@linaro.org>
To: suzuki.poulose@arm.com, coresight@lists.linaro.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: mathieu.poirier@linaro.org, peterz@infradead.org, mingo@redhat.com, acme@kernel.org, linux-perf-users@vger.kernel.org, leo.yan@linaro.org, Mike Leach <mike.leach@linaro.org>
Subject: [PATCH 00/10] coresight: Add new API to allocate trace source ID values
Date: Tue, 8 Mar 2022 20:49:50 +0000

Mike Leach March 8, 2022, 8:49 p.m. UTC

The current method for allocating trace source ID values to sources is
to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10).
The STM is allocated ID 0x1.

This fixed algorithm is used in both the CoreSight driver code, and by
perf when writing the trace metadata in the AUXTRACE_INFO record.

The method needs replacing because currently:
1. It uses the available IDs inefficiently.
2. It does not scale to larger systems with many cores, and because the
algorithm has no limits it will generate invalid trace IDs for cpu number > 44.
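The fixed scheme described above can be sketched in a few lines of C. This is only an illustration: the valid range of 0x01 to 0x6F is an assumption (this thread does not state which IDs are reserved), and all function and macro names are hypothetical rather than taken from the driver.

```c
#include <stdbool.h>

/* Legacy scheme from the cover letter: ID derived directly from the CPU number. */
#define LEGACY_ID_BASE  0x10

/*
 * Assumption, not stated in this thread: IDs 0x01..0x6F are usable and
 * everything outside that range is reserved.
 */
#define ASSUMED_ID_MIN  0x01
#define ASSUMED_ID_MAX  0x6F

/* cpu_num * 2 + 0x10: only every other ID above the base is ever used. */
static inline int legacy_cpu_trace_id(int cpu)
{
	return cpu * 2 + LEGACY_ID_BASE;
}

static inline bool trace_id_in_assumed_range(int id)
{
	return id >= ASSUMED_ID_MIN && id <= ASSUMED_ID_MAX;
}

/* First CPU number whose legacy ID falls outside the assumed range. */
static inline int legacy_first_out_of_range_cpu(void)
{
	int cpu = 0;

	while (trace_id_in_assumed_range(legacy_cpu_trace_id(cpu)))
		cpu++;
	return cpu;
}
```

The formula skips every odd ID and never checks an upper bound, which is the scaling problem named in point 2. The boundary reported by `legacy_first_out_of_range_cpu()` depends entirely on the assumed range and need not match the figure quoted above.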

Additionally, some systems have shown a requirement to allocate
further system IDs.

This patch set introduces an API that allows the allocation of trace IDs
in a dynamic manner.

Architecturally reserved IDs are never allocated, and the system is
limited to allocating only valid IDs.
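As a rough illustration of dynamic allocation that never hands out a reserved ID, here is a minimal bitmap sketch. It is not the code from this series: the struct, the function names, the 7-bit ID space and the reserved range (0x00 and 0x70 upwards) are all assumptions made for the example.

```c
#include <stdint.h>
#include <string.h>

#define ID_SPACE         128   /* assumed 7-bit trace ID space */
#define ASSUMED_RES_TOP  0x70  /* assumed: 0x70 and above are reserved */

/* Hypothetical map: one bit per trace ID, set while the ID is in use. */
struct trace_id_map {
	uint8_t used[ID_SPACE / 8];
};

static inline int id_is_used(const struct trace_id_map *map, int id)
{
	return (map->used[id / 8] >> (id % 8)) & 1;
}

static inline void id_mark(struct trace_id_map *map, int id, int in_use)
{
	if (in_use)
		map->used[id / 8] |= (uint8_t)(1u << (id % 8));
	else
		map->used[id / 8] &= (uint8_t)~(1u << (id % 8));
}

/* Reserved IDs are marked as used up front, so they can never be allocated. */
static inline void trace_id_map_init(struct trace_id_map *map)
{
	int id;

	memset(map, 0, sizeof(*map));
	id_mark(map, 0, 1);
	for (id = ASSUMED_RES_TOP; id < ID_SPACE; id++)
		id_mark(map, id, 1);
}

/* Return the lowest free ID, or -1 when every valid ID is taken. */
static inline int trace_id_alloc(struct trace_id_map *map)
{
	int id;

	for (id = 0; id < ID_SPACE; id++) {
		if (!id_is_used(map, id)) {
			id_mark(map, id, 1);
			return id;
		}
	}
	return -1;
}

/* Freeing a reserved or out-of-range ID is ignored. */
static inline void trace_id_free(struct trace_id_map *map, int id)
{
	if (id > 0 && id < ASSUMED_RES_TOP)
		id_mark(map, id, 0);
}
```

Pre-marking the reserved IDs keeps the allocation loop free of special cases, and running out of IDs becomes an explicit error return instead of an invalid value.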

Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use
the new API.

perf handling is changed so that the ID associated with the CPU is read
from sysfs. The ID allocator is notified when perf events start and stop
so CPU based IDs are kept constant throughout any perf session.

For the ETMx.x devices, IDs are allocated on certain events:
a) When using sysfs, an ID will be allocated on hardware enable, and freed
when the sysfs reset is written.
b) When using perf, an ID is allocated on hardware enable, and freed on
hardware disable.

In both cases, an ID is also allocated when sysfs is read to get the
current trace ID. This ensures that consistent decode metadata can be
extracted from the system even when this read occurs before device enable.

Note: This patchset breaks backward compatibility for perf record.
Because the method for generating the AUXTRACE_INFO metadata has
changed, using an older perf record will result in metadata that
does not match the trace IDs used in the recorded trace data.
This mismatch will cause subsequent decode to fail. Older versions of
perf will still be able to decode data generated by the updated system.


Applies to coresight/next [b54f53bc11a5]
Tested on DB410c

Mike Leach (10):
  coresight: trace-id: Add API to dynamically assign trace ID values
  coresight: trace-id: Set up source trace ID map for system
  coresight: stm: Update STM driver to use Trace ID api
  coresight: etm4x: Use trace ID API to dynamically allocate trace ID
  coresight: etm3x: Use trace ID API to allocate IDs
  coresight: perf: traceid: Add perf notifiers for trace ID
  perf: cs-etm: Update event to read trace ID from sysfs
  coresight: Remove legacy Trace ID allocation mechanism
  coresight: etmX.X: stm: Remove unused legacy source trace ID ops
  coresight: trace-id: Add debug & test macros to trace id allocation

 drivers/hwtracing/coresight/Makefile          |   2 +-
 drivers/hwtracing/coresight/coresight-core.c  |  64 ++---
 .../hwtracing/coresight/coresight-etm-perf.c  |  16 +-
 drivers/hwtracing/coresight/coresight-etm.h   |   3 +-
 .../coresight/coresight-etm3x-core.c          |  93 ++++---
 .../coresight/coresight-etm3x-sysfs.c         |  28 +-
 .../coresight/coresight-etm4x-core.c          |  63 ++++-
 .../coresight/coresight-etm4x-sysfs.c         |  32 ++-
 drivers/hwtracing/coresight/coresight-etm4x.h |   3 +
 drivers/hwtracing/coresight/coresight-priv.h  |   1 +
 drivers/hwtracing/coresight/coresight-stm.c   |  49 +---
 .../hwtracing/coresight/coresight-trace-id.c  | 255 ++++++++++++++++++
 .../hwtracing/coresight/coresight-trace-id.h  |  69 +++++
 include/linux/coresight-pmu.h                 |  12 -
 include/linux/coresight.h                     |   3 -
 tools/perf/arch/arm/util/cs-etm.c             |  12 +-
 16 files changed, 530 insertions(+), 175 deletions(-)
 create mode 100644 drivers/hwtracing/coresight/coresight-trace-id.c
 create mode 100644 drivers/hwtracing/coresight/coresight-trace-id.h

Suzuki K Poulose March 22, 2022, 10:43 a.m. UTC | #1

+ Cc: James Clark

Hi Mike,

On 08/03/2022 20:49, Mike Leach wrote:
> The current method for allocating trace source ID values to sources is
> to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10).
> The STM is allocated ID 0x1.
> 
> This fixed algorithm is used in both the CoreSight driver code, and by
> perf when writing the trace metadata in the AUXTRACE_INFO record.
> 
> The method needs replacing as currently:-
> 1. It is inefficient in using available IDs.
> 2. Does not scale to larger systems with many cores and the algorithm
> has no limits so will generate invalid trace IDs for cpu number > 44.

Thanks for addressing this issue.

> 
> Additionally requirements to allocate additional system IDs on some
> systems have been seen.
> 
> This patch set  introduces an API that allows the allocation of trace IDs
> in a dynamic manner.
> 
> Architecturally reserved IDs are never allocated, and the system is
> limited to allocating only valid IDs.
> 
> Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use
> the new API.
> 
> perf handling is changed so that the ID associated with the CPU is read
> from sysfs. The ID allocator is notified when perf events start and stop
> so CPU based IDs are kept constant throughout any perf session.
> 
> For the ETMx.x devices IDs are allocated on certain events
> a) When using sysfs, an ID will be allocated on hardware enable, and freed
> when the sysfs reset is written.
> b) When using perf, ID is allocated on hardware enable, and freed on
> hardware disable.
> 
> For both cases the ID is allocated when sysfs is read to get the current
> trace ID. This ensures that consistent decode metadata can be extracted
> from the system where this read occurs before device enable.


> 
> Note: This patchset breaks backward compatibility for perf record.
> Because the method for generating the AUXTRACE_INFO meta data has
> changed, using an older perf record will result in metadata that
> does not match the trace IDs used in the recorded trace data.
> This mismatch will cause subsequent decode to fail. Older versions of
> perf will still be able to decode data generated by the updated system.

I have some concerns over this and the future plans for the dynamic
allocation per sink. i.e., we are breaking/modifying the perf now to
accommodate the dynamic nature of the trace id of a given CPU/ETM.
The proposed approach of exposing this via sysfs may (I am not sure if
this would be the case) break with the trace-id per sink change, as
each sink could assign a different trace-id to a given CPU.

So, if we instead make the trace-id available in a perf record (say, a
new record format, PERF_RECORD_CS_ETM_TRACEID ?), we can rely on the
new packet for the trace-id, irrespective of how it is allocated, and
remove the locking/linking of the trace-id to sysfs. This
is not something that exists today. (Ideally it would have been nice to
have some additional fields in RECORD_AUXINFO, but we don't. Instead of
breaking/extending that, we could add a new RECORD).

I believe the packet may need to be generated only once for a session
and that will also allow the flexibility of moving trace-id allocation
around (to a sink in the future).

Thoughts ?

Kind regards
Suzuki



Mike Leach March 22, 2022, 11:38 a.m. UTC | #2

Hi Suzuki,

On Tue, 22 Mar 2022 at 10:43, Suzuki Kuruppassery Poulose
<suzuki.poulose@arm.com> wrote:
>
> + Cc: James Clark
>
> Hi Mike,
>
> On 08/03/2022 20:49, Mike Leach wrote:
> > The current method for allocating trace source ID values to sources is
> > to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10).
> > The STM is allocated ID 0x1.
> >
> > This fixed algorithm is used in both the CoreSight driver code, and by
> > perf when writing the trace metadata in the AUXTRACE_INFO record.
> >
> > The method needs replacing as currently:-
> > 1. It is inefficient in using available IDs.
> > 2. Does not scale to larger systems with many cores and the algorithm
> > has no limits so will generate invalid trace IDs for cpu number > 44.
>
> Thanks for addressing this issue.
>
> >
> > Additionally requirements to allocate additional system IDs on some
> > systems have been seen.
> >
> > This patch set  introduces an API that allows the allocation of trace IDs
> > in a dynamic manner.
> >
> > Architecturally reserved IDs are never allocated, and the system is
> > limited to allocating only valid IDs.
> >
> > Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use
> > the new API.
> >
> > perf handling is changed so that the ID associated with the CPU is read
> > from sysfs. The ID allocator is notified when perf events start and stop
> > so CPU based IDs are kept constant throughout any perf session.
> >
> > For the ETMx.x devices IDs are allocated on certain events
> > a) When using sysfs, an ID will be allocated on hardware enable, and freed
> > when the sysfs reset is written.
> > b) When using perf, ID is allocated on hardware enable, and freed on
> > hardware disable.
> >
> > For both cases the ID is allocated when sysfs is read to get the current
> > trace ID. This ensures that consistent decode metadata can be extracted
> > from the system where this read occurs before device enable.
>
>
> >
> > Note: This patchset breaks backward compatibility for perf record.
> > Because the method for generating the AUXTRACE_INFO meta data has
> > changed, using an older perf record will result in metadata that
> > does not match the trace IDs used in the recorded trace data.
> > This mismatch will cause subsequent decode to fail. Older versions of
> > perf will still be able to decode data generated by the updated system.
>
> I have some concerns over this and the future plans for the dynamic
> allocation per sink. i.e., we are breaking/modifying the perf now to
> accommodate the dynamic nature of the trace id of a given CPU/ETM.

I don't believe we have a choice for this - we cannot retain what is
an essentially broken allocation mechanism.

> The proposed approach of exposing this via sysfs may (am not sure if
> this would be the case) break for the trace-id per sink change, as a
> sink could assign different trace-id for a CPU depending.
>

If a path exists between a CPU and a sink - the current framework as
far as I can tell would not allow for a new path to be set up between
the cpu and another sink.

However, if we allow multiple paths per CPU, the implementation does
both allocate on read and allocate on enable. Both API functions take
a trace ID allocation structure as input. At present this is global,
but if we need to introduce per sink allocation, then the mechanisms
for sink / ID table management will have to ensure that the correct
table is provided for the sink at the end of the path in each case.
Thus the API still works as long as you get the sink ID table
management correct. That is why it was designed to take the TraceID
tables as input to all the functions - it is independent of any per
sink management that might come later.
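To make the "table as input" point concrete, here is a small sketch in which every call takes the ID map explicitly, so the same functions work against one global map or against a per-sink map. The type, the function names, the CPU limit and the ID range are hypothetical and only illustrate the shape of such an API, not the one in this series.

```c
#include <string.h>

#define MAX_CPUS        8      /* small illustrative limit */
#define ASSUMED_ID_MIN  0x01   /* assumed valid ID range */
#define ASSUMED_ID_MAX  0x6F

/* Hypothetical ID table: could be one global instance or one per sink. */
struct id_map {
	unsigned char used[ASSUMED_ID_MAX + 1];
	int cpu_id[MAX_CPUS];   /* 0 means no ID currently assigned */
};

static inline void id_map_init(struct id_map *map)
{
	memset(map, 0, sizeof(*map));
}

/*
 * Get-or-allocate: the same call serves "allocate on sysfs read" and
 * "allocate on enable", and returns the existing ID if one is assigned.
 */
static inline int id_map_get_cpu_id(struct id_map *map, int cpu)
{
	int id;

	if (cpu < 0 || cpu >= MAX_CPUS)
		return -1;
	if (map->cpu_id[cpu])
		return map->cpu_id[cpu];
	for (id = ASSUMED_ID_MIN; id <= ASSUMED_ID_MAX; id++) {
		if (!map->used[id]) {
			map->used[id] = 1;
			map->cpu_id[cpu] = id;
			return id;
		}
	}
	return -1;
}

/* Release the ID held by a CPU in this particular map, if any. */
static inline void id_map_put_cpu_id(struct id_map *map, int cpu)
{
	if (cpu < 0 || cpu >= MAX_CPUS || !map->cpu_id[cpu])
		return;
	map->used[map->cpu_id[cpu]] = 0;
	map->cpu_id[cpu] = 0;
}
```

With this shape, moving to per-sink allocation would mean choosing which `struct id_map` to pass for the sink at the end of the path, while the allocation functions themselves stay unchanged.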

My view is that any multi-sink system is likely to be multi-socket as
well - where different trace infrastructures exist per socket but need
to be handled by a single software infrastructure.

> So, instead if we make the trace-id available in the perf (say, an new
> record format, PERF_RECORD_CS_ETM_TRACEID ?) record, we can rely on the
> new packet for the trace-id, irrespective of how that is allocated and
> remove the locking/linking of the trace-id with that of the sysfs.

The issue here is how do we transmit the information from the driver
to the perf executable?
Even with a new record that problem still exists. The current perf
solves this problem by using the same static algorithm that the driver
uses - so no actual communication is necessary. A similar method is
used to synthesize the value of the etm control register. The command
line options are interpreted by perf, then the same data is passed to
the driver from perf through the event structures and reinterpreted -
hopefully in the same way. All the other values in the perf records
are read directly from sysfs.


> This
> is not something that exists today. (Ideally it would have been nice to
> have some additional fields in RECORD_AUXINFO, but we don't. Instead of
> breaking/extending that, we could add a new RECORD).
>

The trace ID is currently part of RECORD_AUXTRACE_INFO, is it not? And
we have extended this in the past for the additional requirements for
ETE - i.e. an additional ID register - read from sysfs, along with a
version number for the record.

Regards

Mike

> I believe the packet may need to be generated only once for a session
> and that will also allow the flexibility of moving trace-id allocation
> around (to a sink in the future).
>
> Thoughts ?
>
> Kind regards
> Suzuki

Suzuki K Poulose March 22, 2022, 12:35 p.m. UTC | #3

On 22/03/2022 11:38, Mike Leach wrote:
> HI Suzuki,
> 
> On Tue, 22 Mar 2022 at 10:43, Suzuki Kuruppassery Poulose
> <suzuki.poulose@arm.com> wrote:
>>
>> + Cc: James Clark
>>
>> Hi Mike,
>>
>> On 08/03/2022 20:49, Mike Leach wrote:
>>> The current method for allocating trace source ID values to sources is
>>> to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10).
>>> The STM is allocated ID 0x1.
>>>
>>> This fixed algorithm is used in both the CoreSight driver code, and by
>>> perf when writing the trace metadata in the AUXTRACE_INFO record.
>>>
>>> The method needs replacing as currently:-
>>> 1. It is inefficient in using available IDs.
>>> 2. Does not scale to larger systems with many cores and the algorithm
>>> has no limits so will generate invalid trace IDs for cpu number > 44.
>>
>> Thanks for addressing this issue.
>>
>>>
>>> Additionally requirements to allocate additional system IDs on some
>>> systems have been seen.
>>>
>>> This patch set  introduces an API that allows the allocation of trace IDs
>>> in a dynamic manner.
>>>
>>> Architecturally reserved IDs are never allocated, and the system is
>>> limited to allocating only valid IDs.
>>>
>>> Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use
>>> the new API.
>>>
>>> perf handling is changed so that the ID associated with the CPU is read
>>> from sysfs. The ID allocator is notified when perf events start and stop
>>> so CPU based IDs are kept constant throughout any perf session.
>>>
>>> For the ETMx.x devices IDs are allocated on certain events
>>> a) When using sysfs, an ID will be allocated on hardware enable, and freed
>>> when the sysfs reset is written.
>>> b) When using perf, ID is allocated on hardware enable, and freed on
>>> hardware disable.
>>>
>>> For both cases the ID is allocated when sysfs is read to get the current
>>> trace ID. This ensures that consistent decode metadata can be extracted
>>> from the system where this read occurs before device enable.
>>
>>
>>>
>>> Note: This patchset breaks backward compatibility for perf record.
>>> Because the method for generating the AUXTRACE_INFO meta data has
>>> changed, using an older perf record will result in metadata that
>>> does not match the trace IDs used in the recorded trace data.
>>> This mismatch will cause subsequent decode to fail. Older versions of
>>> perf will still be able to decode data generated by the updated system.
>>
>> I have some concerns over this and the future plans for the dynamic
>> allocation per sink. i.e., we are breaking/modifying the perf now to
>> accommodate the dynamic nature of the trace id of a given CPU/ETM.
> 
> I don't beleive we have a choice for this - we cannot retain what is
> an essentially broken allocation mechanism.
> 

I completely agree and I am happy with the current step by step approach
of moving to a dynamic allocation scheme. Apologies, this wasn't 
conveyed appropriately.

>> The proposed approach of exposing this via sysfs may (am not sure if
>> this would be the case) break for the trace-id per sink change, as a
>> sink could assign different trace-id for a CPU depending.
>>
> 
> If a path exists between a CPU and a sink - the current framework as
> far as I can tell would not allow for a new path to be set up between
> the cpu and another sink.

E.g., if we have concurrent perf sessions in the future, with sink-based
allocation:

perf record -e cs_etm/@sink1/... payload1
perf record -e cs_etm/@sink2/... payload2
perf record -e cs_etm// ...      payload3

The trace id allocated for the first session for CPU0 *could* be different
from that of the second or the third. And it may be tricky to guarantee
that the traceids are unique across the sinks for a given CPU.

Please note that the different perf sessions may be executing on
different CPUs at the same time as long as they go to different sinks.
So, reading the sysfs could only give out a single traceid, which may
or may not be the correct one for a given "perf".

> 
> However, if we allow multiple paths per CPU, the implementation does
> both allocate on read and allocate on enable. Both API functions take
> a input of a trace ID allocation structure. At present this is global,
> but if we need to introduce per sink allocation, then the mechanisms
> for sink / ID table management will have to ensure that the correct
> table is provided for the sink at the end of the path in each case.
> Thus the API still works as long as you get the sink ID table
> management correct. That is why it was designed to take the TraceID
> tables as input to all the functions - it is independent of any per
> sink management that might come later.
> 
> My view is that any multi-sink system is likely to be multi-socket as
> well - where different trace infrastructures exist per socket but need
> to be handled by a single software infrastructure.
> 
>> So, instead if we make the trace-id available in the perf (say, an new
>> record format, PERF_RECORD_CS_ETM_TRACEID ?) record, we can rely on the
>> new packet for the trace-id, irrespective of how that is allocated and
>> remove the locking/linking of the trace-id with that of the sysfs.
> 
> The issue here is how to we transmit the information from the driver
> to the perf executable?

Yes, exactly.

> Even with a new record that problem still exists. The current perf
> solves this problem by using the same static algorithm that the driver
> uses - so no actual communication is necessary. A similar method is
> used to synthesize the value of the etm control register. The command
> line options are interpreted by perf, then the same data is passed to
> the driver from perf through the event structures and reinterpreted -
> hopefully in the same way. All the other values in the perf records
> are read directly from sysfs.

Yes, correct. Now, the trace-id is something that could change per 
session and with the move to sink based allocation, this could break.
So,
> 
> 
>> This
>> is not something that exists today. (Ideally it would have been nice to
>> have some additional fields in RECORD_AUXINFO, but we don't. Instead of
>> breaking/extending that, we could add a new RECORD).
>>
> 
> The trace ID is currently part of RECORD_AUXTRACE_INFO is it not? And
> we have extended this in the past for the additional requirements for
> ETE - i.e. an additional ID register - read from sysfs, along with a
> version number for the record.

Sorry, I meant RECORD_AUX (which gets emitted for each session
of the ETM, with the offset/size and flags).

There are:

RECORD_AUXINFO -> perf created statically.
RECORD_AUX     -> emitted for each "run" of ETM, offset, size, flags
RECORD_AUXTRACE -> actual hw trace

I see that there is already something that we could use:


  /*
   * Data written to the AUX area by hardware due to aux_output, may need
   * to be matched to the event by an architecture-specific hardware ID.
   * This records the hardware ID, but requires sample_id to provide the
   * event ID. e.g. Intel PT uses this record to disambiguate PEBS-via-PT
   * records from multiple events.
   *
   * struct {
   *     struct perf_event_header        header;
   *     u64                             hw_id;
   *     struct sample_id                sample_id;
   * };
   */
  PERF_RECORD_AUX_OUTPUT_HW_ID           = 21,

My suggestion is to emit a PERF_RECORD_AUX_OUTPUT_HW_ID record for each
CPU/ETM in a perf session, and have perf report construct the TraceID
map for each ETM at decode time from those records. That way the "perf"
userspace has a future-proof way to find the trace-id for a given ETM,
rather than reading sysfs, which could be problematic.
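A decode-side sketch of this idea might look like the following. The record struct here is a simplified stand-in, not the real perf layout (in the real record the CPU would come from sample_id), and the function name, the 128-entry ID space and the conflict policy are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define ID_SPACE  128    /* assumed 7-bit trace ID space */
#define NO_CPU    (-1)

/* Simplified, hypothetical view of one HW_ID record after parsing. */
struct hw_id_record {
	int cpu;          /* CPU the record was emitted for */
	uint64_t hw_id;   /* trace ID carried in the hw_id field */
};

/*
 * Build a trace-ID-to-CPU table from the records seen in a session.
 * Returns 0 on success, or -1 if an ID is out of range or is claimed
 * by two different CPUs.
 */
static inline int build_id_to_cpu(const struct hw_id_record *recs, size_t n,
				  int id_to_cpu[ID_SPACE])
{
	size_t i;
	int id;

	for (id = 0; id < ID_SPACE; id++)
		id_to_cpu[id] = NO_CPU;

	for (i = 0; i < n; i++) {
		if (recs[i].hw_id >= ID_SPACE)
			return -1;
		id = (int)recs[i].hw_id;
		if (id_to_cpu[id] != NO_CPU && id_to_cpu[id] != recs[i].cpu)
			return -1;
		id_to_cpu[id] = recs[i].cpu;
	}
	return 0;
}
```

Because the table is built only from what was recorded in the session, the decoder does not depend on how or where the kernel allocated the IDs.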

Suzuki


> 
> Regards
> 
> Mike
> 
>> I believe the packet may need to be generated only once for a session
>> and that will also allow the flexibility of moving trace-id allocation
>> around (to a sink in the future).
>>
>> Thoughts ?
>>
>> Kind regards
>> Suzuki
>>

Mike Leach March 22, 2022, 2:27 p.m. UTC | #4

Hi Suzuki

On Tue, 22 Mar 2022 at 12:35, Suzuki Kuruppassery Poulose
<suzuki.poulose@arm.com> wrote:
>
> On 22/03/2022 11:38, Mike Leach wrote:
> > HI Suzuki,
> >
> > On Tue, 22 Mar 2022 at 10:43, Suzuki Kuruppassery Poulose
> > <suzuki.poulose@arm.com> wrote:
> >>
> >> + Cc: James Clark
> >>
> >> Hi Mike,
> >>
> >> On 08/03/2022 20:49, Mike Leach wrote:
> >>> The current method for allocating trace source ID values to sources is
> >>> to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10).
> >>> The STM is allocated ID 0x1.
> >>>
> >>> This fixed algorithm is used in both the CoreSight driver code, and by
> >>> perf when writing the trace metadata in the AUXTRACE_INFO record.
> >>>
> >>> The method needs replacing as currently:-
> >>> 1. It is inefficient in using available IDs.
> >>> 2. Does not scale to larger systems with many cores and the algorithm
> >>> has no limits so will generate invalid trace IDs for cpu number > 44.
> >>
> >> Thanks for addressing this issue.
> >>
> >>>
> >>> Additionally requirements to allocate additional system IDs on some
> >>> systems have been seen.
> >>>
> >>> This patch set  introduces an API that allows the allocation of trace IDs
> >>> in a dynamic manner.
> >>>
> >>> Architecturally reserved IDs are never allocated, and the system is
> >>> limited to allocating only valid IDs.
> >>>
> >>> Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use
> >>> the new API.
> >>>
> >>> perf handling is changed so that the ID associated with the CPU is read
> >>> from sysfs. The ID allocator is notified when perf events start and stop
> >>> so CPU based IDs are kept constant throughout any perf session.
> >>>
> >>> For the ETMx.x devices IDs are allocated on certain events
> >>> a) When using sysfs, an ID will be allocated on hardware enable, and freed
> >>> when the sysfs reset is written.
> >>> b) When using perf, ID is allocated on hardware enable, and freed on
> >>> hardware disable.
> >>>
> >>> For both cases the ID is allocated when sysfs is read to get the current
> >>> trace ID. This ensures that consistent decode metadata can be extracted
> >>> from the system where this read occurs before device enable.
> >>
> >>
> >>>
> >>> Note: This patchset breaks backward compatibility for perf record.
> >>> Because the method for generating the AUXTRACE_INFO meta data has
> >>> changed, using an older perf record will result in metadata that
> >>> does not match the trace IDs used in the recorded trace data.
> >>> This mismatch will cause subsequent decode to fail. Older versions of
> >>> perf will still be able to decode data generated by the updated system.
> >>
> >> I have some concerns over this and the future plans for the dynamic
> >> allocation per sink. i.e., we are breaking/modifying the perf now to
> >> accommodate the dynamic nature of the trace id of a given CPU/ETM.
> >
> > I don't beleive we have a choice for this - we cannot retain what is
> > an essentially broken allocation mechanism.
> >
>
> I completely agree and I am happy with the current step by step approach
> of moving to a dynamic allocation scheme. Apologies, this wasn't
> conveyed appropriately.
>
> >> The proposed approach of exposing this via sysfs may (am not sure if
> >> this would be the case) break for the trace-id per sink change, as a
> >> sink could assign different trace-id for a CPU depending.
> >>
> >
> > If a path exists between a CPU and a sink - the current framework as
> > far as I can tell would not allow for a new path to be set up between
> > the cpu and another sink.
>
> e.g, if we have concurrent perf sessions, in the future with sink  based
> allocation :
>
> perf record -e cs_etm/@sink1/... payload1
> perf record -e cs_etm/@sink2/... payload2
> perf record -e cs_etm// ...      payload3
>
> The trace id allocated for first session for CPU0 *could* be different
> from that of the second or the third.

If these sessions run concurrently then the same Trace ID will be used
for CPU0 for all the sessions.
We ensure this by notifications that a cs_etm session is starting and
stopping - and keep a refcount.
Only when the perf session refcount hits zero can Trace IDs be
released and re-used. Otherwise the association between CPU x and
Trace ID y is maintained - the first session using CPU0 will assign
the ID, the last session to terminate will release the ID from CPU0
(and any other IDs that were allocated during the sessions).
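
A minimal sketch of that refcount behaviour, written as userspace C
rather than driver code. All names here (struct id_map, get_cpu_id,
put_cpu_id, perf_session_start, perf_session_stop) are hypothetical and
are not the functions in the patch set, and the valid ID range 0x01 to
0x6F is an assumption. It only shows that a CPU keeps its ID while any
session is live, and that the last session to stop does the release.

```c
#include <stdbool.h>
#include <string.h>

#define NR_CPUS 8
#define ID_MIN  0x01	/* assumed lowest valid trace ID */
#define ID_MAX  0x6F	/* assumed highest valid trace ID */

struct id_map {
	int  cpu_id[NR_CPUS];	/* 0 means no ID is held by this CPU */
	bool used[ID_MAX + 1];	/* ID values currently handed out */
	bool pending[NR_CPUS];	/* release deferred until sessions end */
	int  perf_sessions;	/* live cs_etm perf sessions */
};

static void id_map_init(struct id_map *m)
{
	memset(m, 0, sizeof(*m));
}

static void release_cpu_id(struct id_map *m, int cpu)
{
	m->used[m->cpu_id[cpu]] = false;
	m->cpu_id[cpu] = 0;
	m->pending[cpu] = false;
}

/* Return the ID already tied to this CPU, or allocate the lowest free one. */
static int get_cpu_id(struct id_map *m, int cpu)
{
	if (m->cpu_id[cpu]) {
		m->pending[cpu] = false;
		return m->cpu_id[cpu];
	}
	for (int id = ID_MIN; id <= ID_MAX; id++) {
		if (!m->used[id]) {
			m->used[id] = true;
			m->cpu_id[cpu] = id;
			return id;
		}
	}
	return -1;
}

/* Release now if no perf session is live, otherwise defer the release. */
static void put_cpu_id(struct id_map *m, int cpu)
{
	if (!m->cpu_id[cpu])
		return;
	if (m->perf_sessions > 0)
		m->pending[cpu] = true;
	else
		release_cpu_id(m, cpu);
}

static void perf_session_start(struct id_map *m)
{
	m->perf_sessions++;
}

/* The last session to stop performs the deferred releases. */
static void perf_session_stop(struct id_map *m)
{
	if (--m->perf_sessions > 0)
		return;
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (m->pending[cpu])
			release_cpu_id(m, cpu);
	}
}
```

With two overlapping sessions, a put_cpu_id() for CPU0 during the
overlap only marks the ID as pending; it is freed for re-use when the
second perf_session_stop() brings the count to zero.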

> And it may be tricky to guarantee
> that the traceids may be unique across the sinks for a given CPU.
>




> Please note that the different perf sessions may be executing on
> different CPUs at the same time as long as they go to different sinks.
> So, reading the sysfs could only give out a single traceid, which may
> or may not be the correct one for a given "perf".

As above - reading the trace ID for a given CPU will work for
concurrent sessions.

>
> >
> > However, if we allow multiple paths per CPU, the implementation does
> > both allocate on read and allocate on enable. Both API functions take
> > a input of a trace ID allocation structure. At present this is global,
> > but if we need to introduce per sink allocation, then the mechanisms
> > for sink / ID table management will have to ensure that the correct
> > table is provided for the sink at the end of the path in each case.
> > Thus the API still works as long as you get the sink ID table
> > management correct. That is why it was designed to take the TraceID
> > tables as input to all the functions - it is independent of any per
> > sink management that might come later.
> >
> > My view is that any multi-sink system is likely to be multi-socket as
> > well - where different trace infrastructures exist per socket but need
> > to be handled by a single software infrastructure.
> >
> >> So, instead if we make the trace-id available in the perf (say, an new
> >> record format, PERF_RECORD_CS_ETM_TRACEID ?) record, we can rely on the
> >> new packet for the trace-id, irrespective of how that is allocated and
> >> remove the locking/linking of the trace-id with that of the sysfs.
> >
> > The issue here is how to we transmit the information from the driver
> > to the perf executable?
>
> Yes, exactly.
>
> > Even with a new record that problem still exists. The current perf
> > solves this problem by using the same static algorithm that the driver
> > uses - so no actual communication is necessary. A similar method is
> > used to synthesize the value of the etm control register. The command
> > line options are interpreted by perf, then the same data is passed to
> > the driver from perf through the event structures and reinterpreted -
> > hopefully in the same way. All the other values in the perf records
> > are read directly from sysfs.
>
> Yes, correct. Now, the trace-id is something that could change per
> session and with the move to sink based allocation, this could break.
> So,
> >
> >
> >> This
> >> is not something that exists today. (Ideally it would have been nice to
> >> have some additional fields in RECORD_AUXINFO, but we don't. Instead of
> >> breaking/extending that, we could add a new RECORD).
> >>
> >
> > The trace ID is currently part of RECORD_AUXTRACE_INFO is it not? And
> > we have extended this in the past for the additional requirements for
> > ETE - i.e. an additional ID register - read from sysfs, along with a
> > version number for the record.
>
> Sorry, I meant the RECORD_AUX (which perf gets emitted for each session
> of the ETM, with the offset/size and flags).
>
> There are:
>
> RECORD_AUXINFO -> perf created statically.
> RECORD_AUX     -> emitted for each "run" of ETM, offset, size, flags
> RECORD_AUXTRACE -> actual hw trace
>
> I see that there is already something that we could use;
>
>
>   /*
>    * Data written to the AUX area by hardware due to aux_output, may need
>    * to be matched to the event by an architecture-specific hardware ID.
>    * This records the hardware ID, but requires sample_id to provide the
>    * event ID. e.g. Intel PT uses this record to disambiguate PEBS-via-PT
>    * records from multiple events.
>    *
>    * struct {
>    *     struct perf_event_header        header;
>    *     u64                             hw_id;
>    *     struct sample_id                sample_id;
>    * };
>    */
>   PERF_RECORD_AUX_OUTPUT_HW_ID           = 21,
>
> My suggestion is to emit a record say :
>
> PERF_RECORD_AUX_OUTPUT_HW_ID for each CPU/ETM for a perf session. And
> have the perf report construct the TraceID map for each ETM at decode
> from the PERF_RECORD_AUX_OUTPUT_HW_ID records. That way it is future
> proof for the "perf" userspace to find the trace-id for a given ETM
> rather than reading the sysfs which could be problematic.
>

I think this is an interesting idea - we would effectively drop the
use of the Trace ID in AUXINFO and replace it with this new record -
presumably emitted from somewhere in the etm driver.
It is still a compatibility breaking solution. In fact more so than
the current patch set. With the current patch set you need the driver
changes, and the kernel perf changes to generate a useable file that
will work with earlier versions of the userspace perf report.
With this change you need a change to the drivers, kernel perf and
userspace perf.

However - this is a perf only solution - it does not help when driving
trace from sysfs directly.

I think this could be done. But I think this is a separate task from
the current patch set - and could easily be added later if required.
It involves many more changes to the user side of perf, which are not
required at present.
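
For reference, a rough sketch of what the decode side might do with
such records. This is an assumption about a possible design, not
existing perf code: struct hw_id_rec is a hypothetical, already-parsed
form of the record (CPU taken from sample_id, hw_id from the record
body), and placing the trace ID in the low 7 bits of hw_id is also an
assumption.

```c
#include <stdint.h>

#define NR_TRACE_IDS	128
#define TRACE_ID_MASK	0x7F	/* assumption: ID held in the low 7 bits of hw_id */

/* Hypothetical parsed form of one record. */
struct hw_id_rec {
	uint32_t cpu;
	uint64_t hw_id;
};

struct trace_id_map {
	int cpu_of_id[NR_TRACE_IDS];	/* -1 means no record seen for this ID */
};

static void trace_id_map_init(struct trace_id_map *m)
{
	for (int i = 0; i < NR_TRACE_IDS; i++)
		m->cpu_of_id[i] = -1;
}

/* Record one ID/CPU association; reject an ID already claimed by another CPU. */
static int trace_id_map_add(struct trace_id_map *m, const struct hw_id_rec *r)
{
	int id = (int)(r->hw_id & TRACE_ID_MASK);

	if (m->cpu_of_id[id] >= 0 && m->cpu_of_id[id] != (int)r->cpu)
		return -1;
	m->cpu_of_id[id] = (int)r->cpu;
	return 0;
}
```

The decoder would then look up cpu_of_id[] for each trace ID it sees in
the trace stream, without needing the ID in AUXTRACE_INFO or sysfs.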

Regards

Mike

> Suzuki
>
>
> >
> > Regards
> >
> > Mike
> >
> >> I believe the packet may need to be generated only once for a session
> >> and that will also allow the flexibility of moving trace-id allocation
> >> around (to a sink in the future).
> >>
> >> Thoughts ?
> >>
> >> Kind regards
> >> Suzuki
> >>
> >>
> >>>
> >>>
> >>> Applies to coresight/next [b54f53bc11a5]
> >>> Tested on DB410c
> >>>
> >>> Mike Leach (10):
> >>>     coresight: trace-id: Add API to dynamically assign trace ID values
> >>>     coresight: trace-id: Set up source trace ID map for system
> >>>     coresight: stm: Update STM driver to use Trace ID api
> >>>     coresight: etm4x: Use trace ID API to dynamically allocate trace ID
> >>>     coresight: etm3x: Use trace ID API to allocate IDs
> >>>     coresight: perf: traceid: Add perf notifiers for trace ID
> >>>     perf: cs-etm: Update event to read trace ID from sysfs
> >>>     coresight: Remove legacy Trace ID allocation mechanism
> >>>     coresight: etmX.X: stm: Remove unused legacy source trace ID ops
> >>>     coresight: trace-id: Add debug & test macros to trace id allocation
> >>>
> >>>    drivers/hwtracing/coresight/Makefile          |   2 +-
> >>>    drivers/hwtracing/coresight/coresight-core.c  |  64 ++---
> >>>    .../hwtracing/coresight/coresight-etm-perf.c  |  16 +-
> >>>    drivers/hwtracing/coresight/coresight-etm.h   |   3 +-
> >>>    .../coresight/coresight-etm3x-core.c          |  93 ++++---
> >>>    .../coresight/coresight-etm3x-sysfs.c         |  28 +-
> >>>    .../coresight/coresight-etm4x-core.c          |  63 ++++-
> >>>    .../coresight/coresight-etm4x-sysfs.c         |  32 ++-
> >>>    drivers/hwtracing/coresight/coresight-etm4x.h |   3 +
> >>>    drivers/hwtracing/coresight/coresight-priv.h  |   1 +
> >>>    drivers/hwtracing/coresight/coresight-stm.c   |  49 +---
> >>>    .../hwtracing/coresight/coresight-trace-id.c  | 255 ++++++++++++++++++
> >>>    .../hwtracing/coresight/coresight-trace-id.h  |  69 +++++
> >>>    include/linux/coresight-pmu.h                 |  12 -
> >>>    include/linux/coresight.h                     |   3 -
> >>>    tools/perf/arch/arm/util/cs-etm.c             |  12 +-
> >>>    16 files changed, 530 insertions(+), 175 deletions(-)
> >>>    create mode 100644 drivers/hwtracing/coresight/coresight-trace-id.c
> >>>    create mode 100644 drivers/hwtracing/coresight/coresight-trace-id.h
> >>>
> >>
> >
> >
>

Suzuki K Poulose March 22, 2022, 6:52 p.m. UTC | #5

Hi Mike

On 22/03/2022 14:27, Mike Leach wrote:
> Hi Suzuki
> 
> On Tue, 22 Mar 2022 at 12:35, Suzuki Kuruppassery Poulose
> <suzuki.poulose@arm.com> wrote:
>>
>> On 22/03/2022 11:38, Mike Leach wrote:
>>> HI Suzuki,
>>>
>>> On Tue, 22 Mar 2022 at 10:43, Suzuki Kuruppassery Poulose
>>> <suzuki.poulose@arm.com> wrote:
>>>>
>>>> + Cc: James Clark
>>>>
>>>> Hi Mike,
>>>>
>>>> On 08/03/2022 20:49, Mike Leach wrote:
>>>>> The current method for allocating trace source ID values to sources is
>>>>> to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10).
>>>>> The STM is allocated ID 0x1.
>>>>>
>>>>> This fixed algorithm is used in both the CoreSight driver code, and by
>>>>> perf when writing the trace metadata in the AUXTRACE_INFO record.
>>>>>
>>>>> The method needs replacing as currently:-
>>>>> 1. It is inefficient in using available IDs.
>>>>> 2. Does not scale to larger systems with many cores and the algorithm
>>>>> has no limits so will generate invalid trace IDs for cpu number > 44.
>>>>
>>>> Thanks for addressing this issue.
>>>>
>>>>>
>>>>> Additionally requirements to allocate additional system IDs on some
>>>>> systems have been seen.
>>>>>
>>>>> This patch set  introduces an API that allows the allocation of trace IDs
>>>>> in a dynamic manner.
>>>>>
>>>>> Architecturally reserved IDs are never allocated, and the system is
>>>>> limited to allocating only valid IDs.
>>>>>
>>>>> Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use
>>>>> the new API.
>>>>>
>>>>> perf handling is changed so that the ID associated with the CPU is read
>>>>> from sysfs. The ID allocator is notified when perf events start and stop
>>>>> so CPU based IDs are kept constant throughout any perf session.
>>>>>
>>>>> For the ETMx.x devices IDs are allocated on certain events
>>>>> a) When using sysfs, an ID will be allocated on hardware enable, and freed
>>>>> when the sysfs reset is written.
>>>>> b) When using perf, ID is allocated on hardware enable, and freed on
>>>>> hardware disable.
>>>>>
>>>>> For both cases the ID is allocated when sysfs is read to get the current
>>>>> trace ID. This ensures that consistent decode metadata can be extracted
>>>>> from the system where this read occurs before device enable.
>>>>
>>>>
>>>>>
>>>>> Note: This patchset breaks backward compatibility for perf record.
>>>>> Because the method for generating the AUXTRACE_INFO meta data has
>>>>> changed, using an older perf record will result in metadata that
>>>>> does not match the trace IDs used in the recorded trace data.
>>>>> This mismatch will cause subsequent decode to fail. Older versions of
>>>>> perf will still be able to decode data generated by the updated system.
>>>>
>>>> I have some concerns over this and the future plans for the dynamic
>>>> allocation per sink. i.e., we are breaking/modifying the perf now to
>>>> accommodate the dynamic nature of the trace id of a given CPU/ETM.
>>>
>>> I don't believe we have a choice for this - we cannot retain what is
>>> an essentially broken allocation mechanism.
>>>
>>
>> I completely agree and I am happy with the current step by step approach
>> of moving to a dynamic allocation scheme. Apologies, this wasn't
>> conveyed appropriately.
>>
>>>> The proposed approach of exposing this via sysfs may (am not sure if
>>>> this would be the case) break for the trace-id per sink change, as a
>>>> sink could assign different trace-id for a CPU depending.
>>>>
>>>
>>> If a path exists between a CPU and a sink - the current framework as
>>> far as I can tell would not allow for a new path to be set up between
>>> the cpu and another sink.
>>
>> e.g, if we have concurrent perf sessions, in the future with sink  based
>> allocation :
>>
>> perf record -e cs_etm/@sink1/... payload1
>> perf record -e cs_etm/@sink2/... payload2
>> perf record -e cs_etm// ...      payload3
>>
>> The trace id allocated for first session for CPU0 *could* be different
>> from that of the second or the third.
> 
> If these sessions run concurrently then the same Trace ID will be used
> for CPU0 for all the sessions.
> We ensure this by notifications that a cs_etm session is starting and
> stopping - and keep a refcount.

The scheme is fine now, with a global trace-id map. But with per-sink
allocation, this could cause problems.

e.g., there could be a situation where:

trace_id[CPU0][sink0] == trace_id[CPU1][sink1]

So if we have a session where both CPU0 and CPU1 trace to a common sink,
we get the traces mixed with no way of splitting them, as perf will
read the trace-id for CPU0 from that of sink0 and CPU1 from sink1.
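
To make the collision concrete, a minimal sketch assuming each sink
keeps its own independent pool and hands out the lowest free ID.
struct sink_map and sink_map_alloc() are hypothetical names, not from
the patches, and the ID range is assumed:

```c
#include <stdbool.h>

#define ID_MIN 0x01	/* assumed lowest valid trace ID */
#define ID_MAX 0x6F	/* assumed highest valid trace ID */

/* Hypothetical per-sink ID map: one independent pool per sink. */
struct sink_map {
	bool used[ID_MAX + 1];
};

/* Hand out the lowest free ID in this sink's own pool, or -1 if full. */
static int sink_map_alloc(struct sink_map *m)
{
	for (int id = ID_MIN; id <= ID_MAX; id++) {
		if (!m->used[id]) {
			m->used[id] = true;
			return id;
		}
	}
	return -1;
}
```

Allocating for CPU0 from sink0's map and for CPU1 from sink1's map
gives both CPUs the same ID, since neither pool knows about the other.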

So my point is, we are changing the ABI for perf to grab the TraceID
with your patches. And clearly this approach could break easily when
we extend to sink-based idmap. So, let's make the ABI change for perf
scalable and bullet proof (as far as we can) by exposing this
information via the perf RECORD. That way any future changes in the
scheme won't affect perf as long as it has reliable information
within each "record".


My point is, let us fix this once and for all, so that we don't
need to change this again. I understand this involves more work
in the perf tool. I believe that is for the better.

Thoughts ?

Suzuki

Mike Leach March 23, 2022, 10:07 a.m. UTC | #6

Hi Suzuki,

On Tue, 22 Mar 2022 at 18:52, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>
> Hi Mike
>
> On 22/03/2022 14:27, Mike Leach wrote:
> > Hi Suzuki
> >
> > On Tue, 22 Mar 2022 at 12:35, Suzuki Kuruppassery Poulose
> > <suzuki.poulose@arm.com> wrote:
> >>
> >> On 22/03/2022 11:38, Mike Leach wrote:
> >>> HI Suzuki,
> >>>
> >>> On Tue, 22 Mar 2022 at 10:43, Suzuki Kuruppassery Poulose
> >>> <suzuki.poulose@arm.com> wrote:
> >>>>
> >>>> + Cc: James Clark
> >>>>
> >>>> Hi Mike,
> >>>>
> >>>> On 08/03/2022 20:49, Mike Leach wrote:
> >>>>> The current method for allocating trace source ID values to sources is
> >>>>> to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10).
> >>>>> The STM is allocated ID 0x1.
> >>>>>
> >>>>> This fixed algorithm is used in both the CoreSight driver code, and by
> >>>>> perf when writing the trace metadata in the AUXTRACE_INFO record.
> >>>>>
> >>>>> The method needs replacing as currently:-
> >>>>> 1. It is inefficient in using available IDs.
> >>>>> 2. Does not scale to larger systems with many cores and the algorithm
> >>>>> has no limits so will generate invalid trace IDs for cpu number > 44.
> >>>>
> >>>> Thanks for addressing this issue.
> >>>>
> >>>>>
> >>>>> Additionally requirements to allocate additional system IDs on some
> >>>>> systems have been seen.
> >>>>>
> >>>>> This patch set  introduces an API that allows the allocation of trace IDs
> >>>>> in a dynamic manner.
> >>>>>
> >>>>> Architecturally reserved IDs are never allocated, and the system is
> >>>>> limited to allocating only valid IDs.
> >>>>>
> >>>>> Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use
> >>>>> the new API.
> >>>>>
> >>>>> perf handling is changed so that the ID associated with the CPU is read
> >>>>> from sysfs. The ID allocator is notified when perf events start and stop
> >>>>> so CPU based IDs are kept constant throughout any perf session.
> >>>>>
> >>>>> For the ETMx.x devices IDs are allocated on certain events
> >>>>> a) When using sysfs, an ID will be allocated on hardware enable, and freed
> >>>>> when the sysfs reset is written.
> >>>>> b) When using perf, ID is allocated on hardware enable, and freed on
> >>>>> hardware disable.
> >>>>>
> >>>>> For both cases the ID is allocated when sysfs is read to get the current
> >>>>> trace ID. This ensures that consistent decode metadata can be extracted
> >>>>> from the system where this read occurs before device enable.
> >>>>
> >>>>
> >>>>>
> >>>>> Note: This patchset breaks backward compatibility for perf record.
> >>>>> Because the method for generating the AUXTRACE_INFO meta data has
> >>>>> changed, using an older perf record will result in metadata that
> >>>>> does not match the trace IDs used in the recorded trace data.
> >>>>> This mismatch will cause subsequent decode to fail. Older versions of
> >>>>> perf will still be able to decode data generated by the updated system.
> >>>>
> >>>> I have some concerns over this and the future plans for the dynamic
> >>>> allocation per sink. i.e., we are breaking/modifying the perf now to
> >>>> accommodate the dynamic nature of the trace id of a given CPU/ETM.
> >>>
> >>> I don't believe we have a choice for this - we cannot retain what is
> >>> an essentially broken allocation mechanism.
> >>>
> >>
> >> I completely agree and I am happy with the current step by step approach
> >> of moving to a dynamic allocation scheme. Apologies, this wasn't
> >> conveyed appropriately.
> >>
> >>>> The proposed approach of exposing this via sysfs may (am not sure if
> >>>> this would be the case) break for the trace-id per sink change, as a
> >>>> sink could assign different trace-id for a CPU depending.
> >>>>
> >>>
> >>> If a path exists between a CPU and a sink - the current framework as
> >>> far as I can tell would not allow for a new path to be set up between
> >>> the cpu and another sink.
> >>
> >> e.g, if we have concurrent perf sessions, in the future with sink  based
> >> allocation :
> >>
> >> perf record -e cs_etm/@sink1/... payload1
> >> perf record -e cs_etm/@sink2/... payload2
> >> perf record -e cs_etm// ...      payload3
> >>
> >> The trace id allocated for first session for CPU0 *could* be different
> >> from that of the second or the third.
> >
> > If these sessions run concurrently then the same Trace ID will be used
> > for CPU0 for all the sessions.
> > We ensure this by notifications that a cs_etm session is starting and
> > stopping - and keep a refcount.
>
> The scheme is fine now, with a global trace-id map. But with per-sink
> allocation, this could cause problems.
>
> e.g., there could be a situation where:
>
> trace_id[CPU0][sink0] == trace_id[CPU1][sink1]
>
> So if we have a session where both CPU0 and CPU1 trace to a common sink,
> we get the trace mixed with no way of splitting them. As the perf will
> read the trace-id for CPU0 from that of sink0 and CPU1 from sink1.

I think we need to consider the CoreSight hardware topology here.

Any CPUx that can trace to a sink reachable by another CPUy must
always get trace IDs from the same pool as CPUy.

Consider the options for multi sink topologies:-

CPU0->funnel0->ETF->ETR
CPU1--+^

Clearly - in-line sinks can never have simultaneous independent
sessions - the session into ETR traces through ETF

Now we could have replicators / programmable replicators -

ATB->Prog replicator->ETR0
                   +->ETR1

however programmable replicators use trace ID for filtering - this is
effectively a single sink on the input, so once again the Trace IDs
must come from the same pool.

Now, we could have independent per cluster / socket topology
Cluster 0
CPU0->funnel0->ETF0->ETR0
CPU1--+^

Cluster 1
CPU2->funnel1->ETF1->ETR1
CPU3--+^

Here cluster 0 & 1 can have independent sets of trace IDs as their
respective cores can never trace to the same sink.

Finally we have the ETE+TRBE 1:1 type topologies. These could actually
not bother allocating any trace ID when in 1:1 mode, which should
probably be a follow on incremental addition to this initial set.

So, my conclusion when I was considering all this is that "per-sink"
trace ID allocation is in fact "per unique trace path set" allocation.
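
A small sketch of how such path sets could be derived, assuming each
CPU is described by a bitmask of the sinks it can reach. The function
names and the bitmask representation are hypothetical, for illustration
only. CPUs whose masks overlap, directly or through a chain of other
CPUs, end up in the same pool:

```c
/* Follow parent links until reaching a CPU that is its own root. */
static int find_root(const int *parent, int i)
{
	while (parent[i] != i)
		i = parent[i];
	return i;
}

/*
 * reach[cpu] is a bitmask of the sinks that CPU can trace to.
 * On return, pool[cpu] is the lowest-numbered CPU in the same pool.
 */
static void assign_pools(const unsigned int *reach, int *pool, int n)
{
	for (int i = 0; i < n; i++)
		pool[i] = i;

	for (int i = 0; i < n; i++) {
		for (int j = i + 1; j < n; j++) {
			if (!(reach[i] & reach[j]))
				continue;
			/* Shared sink: merge, keeping the lower root. */
			int a = find_root(pool, i);
			int b = find_root(pool, j);

			if (a < b)
				pool[b] = a;
			else
				pool[a] = b;
		}
	}

	for (int i = 0; i < n; i++)
		pool[i] = find_root(pool, i);
}
```

For the two cluster example above, with CPU0/CPU1 reaching ETF0 and ETR0
and CPU2/CPU3 reaching ETF1 and ETR1, this yields two pools, one per
cluster.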



>
> So my point is, we are changing the ABI for perf to grab the TraceID
> with your patches. And clearly this approach could break easily when
> we extend to sink-based idmap. So, let's make the ABI change for perf
> scalable and bullet proof (as far as we can) by exposing this
> information via the perf RECORD. That way any future changes in the
> scheme won't affect perf as long as it has reliable information
> within each "record".
>
>
> My point is, let us fix this once and for all, so that we don't
> need to change this again. I understand this involves more work
> in the perf tool. I believe that is for the better.
>
> Thoughts ?
>

My preference is the incremental approach.
Fix the trace ID allocation issues that partners are having now, then
update to the perf record approach in a separate follow up patchset.
Then when we start to see systems that require it - update to using
the per-unique-path trace ID pools.

Regards

Mike

> Suzuki

Al Grant March 23, 2022, 10:35 a.m. UTC | #7

> -----Original Message-----
> From: Mike Leach <mike.leach@linaro.org>
> Sent: 23 March 2022 10:08
> To: Suzuki Poulose <Suzuki.Poulose@arm.com>
> Cc: coresight@lists.linaro.org; linux-arm-kernel@lists.infradead.org;
> linux-kernel@vger.kernel.org; peterz@infradead.org; mingo@redhat.com;
> acme@kernel.org; linux-perf-users@vger.kernel.org; James Clark
> <James.Clark@arm.com>
> Subject: Re: [PATCH 00/10] coresight: Add new API to allocate trace source ID
> values
> 
> Hi Suzuki,
> 
> On Tue, 22 Mar 2022 at 18:52, Suzuki K Poulose <suzuki.poulose@arm.com>
> wrote:
> >
> > Hi Mike
> >
> > On 22/03/2022 14:27, Mike Leach wrote:
> > > Hi Suzuki
> > >
> > > On Tue, 22 Mar 2022 at 12:35, Suzuki Kuruppassery Poulose
> > > <suzuki.poulose@arm.com> wrote:
> > >>
> > >> On 22/03/2022 11:38, Mike Leach wrote:
> > >>> HI Suzuki,
> > >>>
> > >>> On Tue, 22 Mar 2022 at 10:43, Suzuki Kuruppassery Poulose
> > >>> <suzuki.poulose@arm.com> wrote:
> > >>>>
> > >>>> + Cc: James Clark
> > >>>>
> > >>>> Hi Mike,
> > >>>>
> > >>>> On 08/03/2022 20:49, Mike Leach wrote:
> > > >>>>> The current method for allocating trace source ID values to
> > > >>>>> sources is to use a fixed algorithm for CPU based sources of
> > > >>>>> (cpu_num * 2 + 0x10).
> > > >>>>> The STM is allocated ID 0x1.
> > > >>>>>
> > > >>>>> This fixed algorithm is used in both the CoreSight driver code,
> > > >>>>> and by perf when writing the trace metadata in the AUXTRACE_INFO
> > > >>>>> record.
> > > >>>>>
> > > >>>>> The method needs replacing as currently:-
> > > >>>>> 1. It is inefficient in using available IDs.
> > > >>>>> 2. Does not scale to larger systems with many cores and the
> > > >>>>> algorithm has no limits so will generate invalid trace IDs for cpu
> > > >>>>> number > 44.
> > >>>>
> > >>>> Thanks for addressing this issue.
> > >>>>
> > >>>>>
> > >>>>> Additionally requirements to allocate additional system IDs on
> > >>>>> some systems have been seen.
> > >>>>>
> > >>>>> This patch set  introduces an API that allows the allocation of
> > >>>>> trace IDs in a dynamic manner.
> > >>>>>
> > >>>>> Architecturally reserved IDs are never allocated, and the system
> > >>>>> is limited to allocating only valid IDs.
> > >>>>>
> > >>>>> Each of the current trace sources ETM3.x, ETM4.x and STM is
> > >>>>> updated to use the new API.
> > >>>>>
> > > >>>>> perf handling is changed so that the ID associated with the CPU
> > > >>>>> is read from sysfs. The ID allocator is notified when perf
> > > >>>>> events start and stop so CPU based IDs are kept constant
> > > >>>>> throughout any perf session.
> > >>>>>
> > >>>>> For the ETMx.x devices IDs are allocated on certain events
> > >>>>> a) When using sysfs, an ID will be allocated on hardware enable,
> > >>>>> and freed when the sysfs reset is written.
> > >>>>> b) When using perf, ID is allocated on hardware enable, and
> > >>>>> freed on hardware disable.
> > >>>>>
> > > >>>>> For both cases the ID is allocated when sysfs is read to get the
> > > >>>>> current trace ID. This ensures that consistent decode metadata
> > > >>>>> can be extracted from the system where this read occurs before
> > > >>>>> device enable.
> > >>>>
> > >>>>
> > >>>>>
> > >>>>> Note: This patchset breaks backward compatibility for perf record.
> > >>>>> Because the method for generating the AUXTRACE_INFO meta data
> > >>>>> has changed, using an older perf record will result in metadata
> > >>>>> that does not match the trace IDs used in the recorded trace data.
> > > >>>>> This mismatch will cause subsequent decode to fail. Older
> > > >>>>> versions of perf will still be able to decode data generated by the
> > > >>>>> updated system.
> > >>>>
> > > >>>> I have some concerns over this and the future plans for the
> > > >>>> dynamic allocation per sink. i.e., we are breaking/modifying the
> > > >>>> perf now to accommodate the dynamic nature of the trace id of a
> > > >>>> given CPU/ETM.
> > >>>
> > > >>> I don't believe we have a choice for this - we cannot retain what
> > >>> is an essentially broken allocation mechanism.
> > >>>
> > >>
> > >> I completely agree and I am happy with the current step by step
> > >> approach of moving to a dynamic allocation scheme. Apologies, this
> > >> wasn't conveyed appropriately.
> > >>
> > >>>> The proposed approach of exposing this via sysfs may (am not sure
> > >>>> if this would be the case) break for the trace-id per sink
> > >>>> change, as a sink could assign different trace-id for a CPU depending.
> > >>>>
> > >>>
> > >>> If a path exists between a CPU and a sink - the current framework
> > >>> as far as I can tell would not allow for a new path to be set up
> > >>> between the cpu and another sink.
> > >>
> > >> e.g, if we have concurrent perf sessions, in the future with sink
> > >> based allocation :
> > >>
> > > >> perf record -e cs_etm/@sink1/... payload1
> > > >> perf record -e cs_etm/@sink2/... payload2
> > > >> perf record -e cs_etm// ...      payload3
> > >>
> > >> The trace id allocated for first session for CPU0 *could* be
> > >> different from that of the second or the third.
> > >
> > > If these sessions run concurrently then the same Trace ID will be
> > > used for CPU0 for all the sessions.
> > > We ensure this by notifications that a cs_etm session is starting
> > > and stopping - and keep a refcount.
> >
> > The scheme is fine now, with a global trace-id map. But with per-sink
> > allocation, this could cause problems.
> >
> > e.g., there could be a situation where:
> >
> > trace_id[CPU0][sink0] == trace_id[CPU1][sink1]
> >
> > So if we have a session where both CPU0 and CPU1 trace to a common
> > sink, we get the trace mixed with no way of splitting them. As the
> > perf will read the trace-id for CPU0 from that of sink0 and CPU1 from sink1.
> 
> I think we need to consider the CoreSight hardware topology here.
> 
> Any CPUx that can trace to a sink reachable by another CPUy must always
> get trace IDs from the same pool as CPUy.
> 
> Consider the options for multi sink topologies:-
> 
> CPU0->funnel0->ETF->ETR
> CPU1--+^
> 
> Clearly - in-line sinks can never have simultaneous independent sessions -
> the session into ETR traces through ETF
> 
> Now we could have replicators / programmable replicators -
> 
> ATB->Prog replicator->ETR0
>                    +->ETR1
> 
> however programmable replicators use trace ID for filtering - this is
> effectively a single sink on the input, so once again the Trace IDs must come
> from the same pool.
> 
> Now, we could have independent per cluster / socket topology
> Cluster 0
> CPU0->funnel0->ETF0->ETR0
> CPU1--+^
> 
> Cluster 1
> CPU2->funnel1->ETF1->ETR1
> CPU3--+^
> 
> Here cluster 0 & 1 can have independent sets of trace IDs as their respective
> cores can never trace to the same sink.
> 
> Finally we have the ETE+TRBE 1:1 type topologies. These could actually not
> bother allocating any trace ID when in 1:1 mode, which should probably be a
> follow on incremental addition to this initial set.
> 
> So, my conclusion when I was considering all this is that "per-sink"
> trace ID allocation is in fact "per unique trace path set" allocation.

The id pools can't be defined statically in all situations, e.g. if
you have 128 CPUs, each with its own ETR, and also replicated into a
funnel network leading to a common ETR.

This topology supports (at least) two distinct modes: (a) all CPUs
enabled for tracing to their own ETRs (b) a selection of CPUs
(up to some limit), combined together. Both are valid dynamic
configurations. Perf might have a preference on which one to use,
but both are valid.

But there's no static id pool that works for both. The pool for
(b) has to allocate to some random selection of 128 CPUs,
from only around 120 numbers. The pool has to take account of
which CPUs are selected.
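
As a rough illustration of mode (b), a sketch of an allocator that only
hands IDs to the CPUs in the current selection. assign_selected() is a
hypothetical name, and the usable range 0x01 to 0x6F is an assumption
about which IDs are valid:

```c
#include <stdbool.h>

#define NR_CPUS 128
#define ID_MIN  0x01	/* assumed lowest valid trace ID */
#define ID_MAX  0x6F	/* assumed highest valid trace ID */

/*
 * Give an ID to each selected CPU and 0 to every other CPU.
 * Returns the number of IDs assigned, or -1 if the selection
 * does not fit in the pool.
 */
static int assign_selected(const bool *selected, int *id_of_cpu)
{
	int next = ID_MIN;
	int count = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		id_of_cpu[cpu] = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!selected[cpu])
			continue;
		if (next > ID_MAX)
			return -1;
		id_of_cpu[cpu] = next++;
		count++;
	}
	return count;
}
```

Selecting all 128 CPUs fails, while any selection that fits within the
pool succeeds regardless of which CPU numbers it contains.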

So your comment "any CPU that can trace..." has to be interpreted as
"any CPU that can trace in the currently configured dynamic topology"
rather than "any CPU that can be dynamically configured to trace..."...
is that what you meant?

Alternatively, we could just declare that such systems are too
complicated to support, and say that we wouldn't support the
use of a global sink that (statically) was reachable by 128 CPUs.

Al


> 
> 
> 
> >
> > So my point is, we are changing the ABI for perf to grab the TraceID
> > with your patches. And clearly this approach could break easily when
> > we extend to sink-based idmap. So, let's make the ABI change for perf
> > scalable and bullet proof (as far as we can) by exposing this
> > information via the perf RECORD. That way any future changes in the
> > scheme won't affect perf as long as it has reliable information
> > within each "record".
> >
> >
> > My point is, let us fix this once and for all, so that we don't need
> > to change this again. I understand this involves more work in the perf
> > tool. I believe that is for the better.
> >
> > Thoughts ?
> >
> 
> My preference is the incremental approach.
> Fix the trace ID allocation issues that partners are having now, then update
> to the perf record approach in a separate follow up patchset.
> Then when we start to see systems that require it - update to using the
> per-unique-path trace ID pools.
> 
> Regards
> 
> Mike
> 
> > Suzuki
> 
> 
> 
> --
> Mike Leach
> Principal Engineer, ARM Ltd.
> Manchester Design Centre. UK

Suzuki K Poulose March 23, 2022, 10:41 a.m. UTC | #8

Hi Mike

On 23/03/2022 10:07, Mike Leach wrote:
> Hi Suzuki,
> 
> On Tue, 22 Mar 2022 at 18:52, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>>
>> Hi Mike
>>
>> On 22/03/2022 14:27, Mike Leach wrote:
>>> Hi Suzuki
>>>
>>> On Tue, 22 Mar 2022 at 12:35, Suzuki Kuruppassery Poulose
>>> <suzuki.poulose@arm.com> wrote:
>>>>
>>>> On 22/03/2022 11:38, Mike Leach wrote:
>>>>> HI Suzuki,
>>>>>
>>>>> On Tue, 22 Mar 2022 at 10:43, Suzuki Kuruppassery Poulose
>>>>> <suzuki.poulose@arm.com> wrote:
>>>>>>
>>>>>> + Cc: James Clark
>>>>>>
>>>>>> Hi Mike,
>>>>>>
>>>>>> On 08/03/2022 20:49, Mike Leach wrote:
>>>>>>> The current method for allocating trace source ID values to sources is
>>>>>>> to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10).
>>>>>>> The STM is allocated ID 0x1.
>>>>>>>
>>>>>>> This fixed algorithm is used in both the CoreSight driver code, and by
>>>>>>> perf when writing the trace metadata in the AUXTRACE_INFO record.
>>>>>>>
>>>>>>> The method needs replacing as currently:-
>>>>>>> 1. It is inefficient in using available IDs.
>>>>>>> 2. Does not scale to larger systems with many cores and the algorithm
>>>>>>> has no limits so will generate invalid trace IDs for cpu number > 44.
>>>>>>
>>>>>> Thanks for addressing this issue.
>>>>>>
>>>>>>>
>>>>>>> Additionally requirements to allocate additional system IDs on some
>>>>>>> systems have been seen.
>>>>>>>
>>>>>>> This patch set  introduces an API that allows the allocation of trace IDs
>>>>>>> in a dynamic manner.
>>>>>>>
>>>>>>> Architecturally reserved IDs are never allocated, and the system is
>>>>>>> limited to allocating only valid IDs.
>>>>>>>
>>>>>>> Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use
>>>>>>> the new API.
>>>>>>>
>>>>>>> perf handling is changed so that the ID associated with the CPU is read
>>>>>>> from sysfs. The ID allocator is notified when perf events start and stop
>>>>>>> so CPU based IDs are kept constant throughout any perf session.
>>>>>>>
>>>>>>> For the ETMx.x devices IDs are allocated on certain events
>>>>>>> a) When using sysfs, an ID will be allocated on hardware enable, and freed
>>>>>>> when the sysfs reset is written.
>>>>>>> b) When using perf, ID is allocated on hardware enable, and freed on
>>>>>>> hardware disable.
>>>>>>>
>>>>>>> For both cases the ID is allocated when sysfs is read to get the current
>>>>>>> trace ID. This ensures that consistent decode metadata can be extracted
>>>>>>> from the system where this read occurs before device enable.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Note: This patchset breaks backward compatibility for perf record.
>>>>>>> Because the method for generating the AUXTRACE_INFO meta data has
>>>>>>> changed, using an older perf record will result in metadata that
>>>>>>> does not match the trace IDs used in the recorded trace data.
>>>>>>> This mismatch will cause subsequent decode to fail. Older versions of
>>>>>>> perf will still be able to decode data generated by the updated system.
>>>>>>
>>>>>> I have some concerns over this and the future plans for the dynamic
>>>>>> allocation per sink. i.e., we are breaking/modifying the perf now to
>>>>>> accommodate the dynamic nature of the trace id of a given CPU/ETM.
>>>>>
>>>>> I don't beleive we have a choice for this - we cannot retain what is
>>>>> an essentially broken allocation mechanism.
>>>>>
>>>>
>>>> I completely agree and I am happy with the current step by step approach
>>>> of moving to a dynamic allocation scheme. Apologies, this wasn't
>>>> conveyed appropriately.
>>>>
>>>>>> The proposed approach of exposing this via sysfs may (am not sure if
>>>>>> this would be the case) break for the trace-id per sink change, as a
>>>>>> sink could assign different trace-id for a CPU depending.
>>>>>>
>>>>>
>>>>> If a path exists between a CPU and a sink - the current framework as
>>>>> far as I can tell would not allow for a new path to be set up between
>>>>> the cpu and another sink.
>>>>
>>>> e.g, if we have concurrent perf sessions, in the future with sink  based
>>>> allocation :
>>>>
>>>> perf record -e cs_etm/@sink1/... payload1
>>>> perf record -e cs_etm/@sink2/... payload2
>>>> perf record -e cs_etm// ...      payload3
>>>>
>>>> The trace id allocated for first session for CPU0 *could* be different
>>>> from that of the second or the third.
>>>
>>> If these sessions run concurrently then the same Trace ID will be used
>>> for CPU0 for all the sessions.
>>> We ensure this by notifications that a cs_etm session is starting and
>>> stopping - and keep a refcount.
>>
>> The scheme is fine now, with a global trace-id map. But with per-sink
>> allocation, this could cause problems.
>>
>> e.g., there could be a situation where:
>>
>> trace_id[CPU0][sink0] == trace_id[CPU1][sink1]
>>
>> So if we have a session where both CPU0 and CPU1 trace to a common sink,
>> we get the trace mixed with no way of splitting them. As the perf will
>> read the trace-id for CPU0 from that of sink0 and CPU1 from sink1.
> 
> I think we need to consider the CoreSight hardware topology here.
> 
> Any CPUx that can trace to a sink reachable by another CPUy must
> always get trace IDs from the same pool as CPUy.
> 
> Consider the options for multi sink topologies:-
> 
> CPU0->funnel0->ETF->ETR
> CPU1--+^
> 
> Clearly - in-line sinks can never have simultaneous independent
> sessions - the session into ETR traces through ETF
> 
> Now we could have replicators / programmable replicators -
> 
> ATB->Prog replicator->ETR0
>                                   +->ETR1
> 
> however programmable replicators use trace ID for filtering - this is
> effectively a single sink on the input, so once again the Trace IDs
> must come from the same pool.
> 
> Now, we could have independent per cluster / socket topology
> Cluster 0
> CPU0->funnel0->ETF0->ETR0
> CPU1--+^
> 
> Cluster 1
> CPU2->funnel1->ETF1->ETR1
> CPU3--+^
> 

What if the ETR was a common one ? i.e.

Cluster0
CPU0 -> ETF0 .....
                    \
Cluster1            -- ETR0
CPU1 -> ETF1 ..... /

And let's say there are 3 perf sessions in parallel, started in the
order below

perf record -e cs_etm/@etf0/ app1 # CPU0 gets a trace-id[etf0] -> 0x50
perf record -e cs_etm/@etf1/ app2 # CPU1 gets a trace-id[etf1] -> 0x50
perf record -e cs_etm/@etr/  app3 # CPU0 and CPU1 both use the existing
                                  # trace ids from the allocations.

So, when app3 threads are scheduled on CPU0 & CPU1, we get the trace in
ETR with the same trace-id of 0x50.
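
The scenario above can be modelled in a few lines. This is only a sketch under stated assumptions: `SinkIdMap` and its `get()` method are hypothetical names invented for illustration, not part of the CoreSight driver, and the starting ID of 0x50 is simply the value used in the example.

```python
# Hypothetical model of per-sink trace ID maps; not the kernel API.
FIRST_ID = 0x50  # arbitrary starting ID, matching the example above


class SinkIdMap:
    """Each sink hands out IDs independently of every other sink."""

    def __init__(self):
        self.next_id = FIRST_ID
        self.cpu_to_id = {}

    def get(self, cpu):
        # A CPU keeps whatever ID this sink first gave it.
        if cpu not in self.cpu_to_id:
            self.cpu_to_id[cpu] = self.next_id
            self.next_id += 1
        return self.cpu_to_id[cpu]


etf0, etf1 = SinkIdMap(), SinkIdMap()

id_cpu0 = etf0.get(0)  # session 1: CPU0 traces to ETF0
id_cpu1 = etf1.get(1)  # session 2: CPU1 traces to ETF1

# Session 3 sends both CPUs to the shared ETR and reuses the existing IDs,
# so the ETR sees two sources carrying the same ID.
collision = id_cpu0 == id_cpu1
print(hex(id_cpu0), hex(id_cpu1), collision)
```

Because the two maps know nothing about each other, nothing stops them from handing out the same value, which is the collision being described.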

Suzuki

Mike Leach March 23, 2022, 11:05 a.m. UTC | #9

Hi Al,

On Wed, 23 Mar 2022 at 10:36, Al Grant <Al.Grant@arm.com> wrote:
>
> > -----Original Message-----
> > From: Mike Leach <mike.leach@linaro.org>
> > Sent: 23 March 2022 10:08
> > To: Suzuki Poulose <Suzuki.Poulose@arm.com>
> > Cc: coresight@lists.linaro.org; linux-arm-kernel@lists.infradead.org; linux-
> > kernel@vger.kernel.org; peterz@infradead.org; mingo@redhat.com;
> > acme@kernel.org; linux-perf-users@vger.kernel.org; James Clark
> > <James.Clark@arm.com>
> > Subject: Re: [PATCH 00/10] coresight: Add new API to allocate trace source ID
> > values
> >
> > Hi Suzuki,
> >
> > On Tue, 22 Mar 2022 at 18:52, Suzuki K Poulose <suzuki.poulose@arm.com>
> > wrote:
> > >
> > > Hi Mike
> > >
> > > On 22/03/2022 14:27, Mike Leach wrote:
> > > > Hi Suzuki
> > > >
> > > > On Tue, 22 Mar 2022 at 12:35, Suzuki Kuruppassery Poulose
> > > > <suzuki.poulose@arm.com> wrote:
> > > >>
> > > >> On 22/03/2022 11:38, Mike Leach wrote:
> > > >>> HI Suzuki,
> > > >>>
> > > >>> On Tue, 22 Mar 2022 at 10:43, Suzuki Kuruppassery Poulose
> > > >>> <suzuki.poulose@arm.com> wrote:
> > > >>>>
> > > >>>> + Cc: James Clark
> > > >>>>
> > > >>>> Hi Mike,
> > > >>>>
> > > >>>> On 08/03/2022 20:49, Mike Leach wrote:
> > > >>>>> The current method for allocating trace source ID values to
> > > >>>>> sources is to use a fixed algorithm for CPU based sources of
> > (cpu_num * 2 + 0x10).
> > > >>>>> The STM is allocated ID 0x1.
> > > >>>>>
> > > >>>>> This fixed algorithm is used in both the CoreSight driver code,
> > > >>>>> and by perf when writing the trace metadata in the AUXTRACE_INFO
> > record.
> > > >>>>>
> > > >>>>> The method needs replacing as currently:-
> > > >>>>> 1. It is inefficient in using available IDs.
> > > >>>>> 2. Does not scale to larger systems with many cores and the
> > > >>>>> algorithm has no limits so will generate invalid trace IDs for cpu
> > > >>>>> number > 44.
> > > >>>>
> > > >>>> Thanks for addressing this issue.
> > > >>>>
> > > >>>>>
> > > >>>>> Additionally requirements to allocate additional system IDs on
> > > >>>>> some systems have been seen.
> > > >>>>>
> > > >>>>> This patch set  introduces an API that allows the allocation of
> > > >>>>> trace IDs in a dynamic manner.
> > > >>>>>
> > > >>>>> Architecturally reserved IDs are never allocated, and the system
> > > >>>>> is limited to allocating only valid IDs.
> > > >>>>>
> > > >>>>> Each of the current trace sources ETM3.x, ETM4.x and STM is
> > > >>>>> updated to use the new API.
> > > >>>>>
> > > >>>>> perf handling is changed so that the ID associated with the CPU
> > > >>>>> is read from sysfs. The ID allocator is notified when perf
> > > >>>>> events start and stop so CPU based IDs are kept constant
> > throughout any perf session.
> > > >>>>>
> > > >>>>> For the ETMx.x devices IDs are allocated on certain events
> > > >>>>> a) When using sysfs, an ID will be allocated on hardware enable,
> > > >>>>> and freed when the sysfs reset is written.
> > > >>>>> b) When using perf, ID is allocated on hardware enable, and
> > > >>>>> freed on hardware disable.
> > > >>>>>
> > > >>>>> For both cases the ID is allocated when sysfs is read to get the
> > > >>>>> current trace ID. This ensures that consistent decode metadata
> > > >>>>> can be extracted from the system where this read occurs before
> > device enable.
> > > >>>>
> > > >>>>
> > > >>>>>
> > > >>>>> Note: This patchset breaks backward compatibility for perf record.
> > > >>>>> Because the method for generating the AUXTRACE_INFO meta data
> > > >>>>> has changed, using an older perf record will result in metadata
> > > >>>>> that does not match the trace IDs used in the recorded trace data.
> > > >>>>> This mismatch will cause subsequent decode to fail. Older
> > > >>>>> versions of perf will still be able to decode data generated by the
> > updated system.
> > > >>>>
> > > >>>> I have some concerns over this and the future plans for the
> > > >>>> dynamic allocation per sink. i.e., we are breaking/modifying the
> > > >>>> perf now to accommodate the dynamic nature of the trace id of a
> > given CPU/ETM.
> > > >>>
> > > >>> I don't beleive we have a choice for this - we cannot retain what
> > > >>> is an essentially broken allocation mechanism.
> > > >>>
> > > >>
> > > >> I completely agree and I am happy with the current step by step
> > > >> approach of moving to a dynamic allocation scheme. Apologies, this
> > > >> wasn't conveyed appropriately.
> > > >>
> > > >>>> The proposed approach of exposing this via sysfs may (am not sure
> > > >>>> if this would be the case) break for the trace-id per sink
> > > >>>> change, as a sink could assign different trace-id for a CPU depending.
> > > >>>>
> > > >>>
> > > >>> If a path exists between a CPU and a sink - the current framework
> > > >>> as far as I can tell would not allow for a new path to be set up
> > > >>> between the cpu and another sink.
> > > >>
> > > >> e.g, if we have concurrent perf sessions, in the future with sink
> > > >> based allocation :
> > > >>
> > > >> perf record -e cs_etm/@sink1/... payload1
> > > >> perf record -e cs_etm/@sink2/... payload2
> > > >> perf record -e cs_etm// ...      payload3
> > > >>
> > > >> The trace id allocated for first session for CPU0 *could* be
> > > >> different from that of the second or the third.
> > > >
> > > > If these sessions run concurrently then the same Trace ID will be
> > > > used for CPU0 for all the sessions.
> > > > We ensure this by notifications that a cs_etm session is starting
> > > > and stopping - and keep a refcount.
> > >
> > > The scheme is fine now, with a global trace-id map. But with per-sink
> > > allocation, this could cause problems.
> > >
> > > e.g., there could be a situation where:
> > >
> > > trace_id[CPU0][sink0] == trace_id[CPU1][sink1]
> > >
> > > So if we have a session where both CPU0 and CPU1 trace to a common
> > > sink, we get the trace mixed with no way of splitting them. As the
> > > perf will read the trace-id for CPU0 from that of sink0 and CPU1 from sink1.
> >
> > I think we need to consider the CoreSight hardware topology here.
> >
> > Any CPUx that can trace to a sink reachable by another CPUy must always
> > get trace IDs from the same pool as CPUy.
> >
> > Consider the options for multi sink topologies:-
> >
> > CPU0->funnel0->ETF->ETR
> > CPU1--+^
> >
> > Clearly - in-line sinks can never have simultaneous independent sessions -
> > the session into ETR traces through ETF
> >
> > Now we could have replicators / programmable replicators -
> >
> > ATB->Prog replicator->ETR0
> >                                  +->ETR1
> >
> > however programmable replicators use trace ID for filtering - this is
> > effectively a single sink on the input, so once again the Trace IDs must come
> > from the same pool.
> >
> > Now, we could have independent per cluster / socket topology
> > Cluster 0
> > CPU0->funnel0->ETF0->ETR0
> > CPU1--+^
> >
> > Cluster 1
> > CPU2->funnel1->ETF1->ETR1
> > CPU3--+^
> >
> > Here cluster 0 & 1 can have independent sets of trace IDs as their respective
> > cores can never trace to the same sink.
> >
> > Finally we have the ETE+TRBE 1:1 type topologies. These could actually not
> > bother allocating any trace ID when in 1:1 mode, which should probably be a
> > follow on incremental addition to this initial set.
> >
> > So, my conclusion when I was considering all this is that "per-sink"
> > trace Id allocation is in fact "per unique trace path set" allocation.
>
> The id pools can't always be defined statically in all situations.
> E.g. if you have 128 CPUs each with their own ETR, and also
> replicated into a funnel network leading to a common ETR.
>

Agreed - and this is an issue that will need to be considered
when implementing multiple ID pools.
This is a possibility when ETE traces with TRBE switched off.


> This topology supports (at least) two distinct modes: (a) all CPUs
> enabled for tracing to their own ETRs (b) a selection of CPUs
> (up to some limit), combined together. Both are valid dynamic
> configurations. Perf might have a preference on which one to use,
> but both are valid.
>
> But there's no static id pool that works for both. The pool for
> (b) has to allocate to some random selection of 128 CPUs,
> from only around 120 numbers. The pool has to take account of
> which CPUs are selected.
>

Also agreed - it is entirely possible to run out of IDs. The choice
then is to not trace on any, or simply trace on the first N that get
valid IDs.
That choice has to be a function of the perf handling code. IDs can be
allocated when perf is running through the code to set up the trace
paths from the selected cores.
The present system "allows" trace to go from all cores by allocating
invalid trace IDs once we get past too many cores (46 at present).


> So your comment "any CPU that can trace..." has to be interpreted as
> "any CPU that can trace in the currently configured dynamic topology"
> rather than "any CPU that can be dynamically configured to trace..."...
> is that what you meant?

Pretty much. The real problem with "per-sink" / Trace ID pools is
recognizing the currently configured topology when this can be dynamic.
I anticipated that the metadata attached to allocating trace IDs will
have to expand to recognise this once we get to advanced topologies -
which is why the API was designed to always pass in the metadata.


>
> Alternatively, we could just declare that such systems are too
> complicated to support, and say that we wouldn't support the
> use of a global sink that (statically) was reachable by 128 CPUs.
>

I think that we can support these systems - it is just that the user /
perf will have to accept that there is a limit to the number of trace
IDs that can be allocated & hence the number of CPUs that will
actually trace the event.
That will mean selecting a set of CPUs to trace on, or not scheduling
the event on a core that cannot trace due to the lack of a trace ID.
The infrastructure will not enable a trace path if that path cannot
allocate a unique ID for the current trace ID set. (This has always
been the case - the only change is that it is handled by the dynamic
allocator now.)

Alternatively we could simply say that Trace IDs are a limited
resource - we will only support using a single pool at once (111
possible IDs when you take into account the reserved values) - which
is the situation with this patchset.
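
As a rough illustration of the single-pool limit, the sketch below models one pool of 111 IDs being asked to serve 128 CPUs. It assumes that IDs 0x00 and 0x70-0x7F are the reserved values, leaving 0x01-0x6F; `TraceIdPool`, `alloc()` and `release()` are hypothetical names for this example, not the API in the patchset.

```python
# Assumption: IDs 0x00 and 0x70-0x7F are reserved, leaving 0x01-0x6F.
VALID_IDS = range(0x01, 0x70)


class TraceIdPool:
    """Hypothetical single system-wide pool of trace IDs."""

    def __init__(self):
        self.free = list(VALID_IDS)
        self.by_cpu = {}

    def alloc(self, cpu):
        """Return the CPU's ID, or None when the pool is exhausted."""
        if cpu in self.by_cpu:
            return self.by_cpu[cpu]
        if not self.free:
            return None
        self.by_cpu[cpu] = self.free.pop(0)
        return self.by_cpu[cpu]

    def release(self, cpu):
        trace_id = self.by_cpu.pop(cpu, None)
        if trace_id is not None:
            self.free.append(trace_id)


pool = TraceIdPool()
results = [pool.alloc(cpu) for cpu in range(128)]
traced = [cpu for cpu, tid in enumerate(results) if tid is not None]
untraced = [cpu for cpu, tid in enumerate(results) if tid is None]
print(len(VALID_IDS), len(traced), len(untraced))
```

In this model the CPUs that ask last simply get no ID, so a caller would have to either skip tracing on them or pick the set of CPUs up front.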


Mike

> Al
>
>
> >
> >
> >
> > >
> > > So my point is, we are changing the ABI for perf to grab the TraceID
> > > with your patches. And clearly this approach could break easily when
> > > we extend to sink-based idmap. So, lets make the ABI change for perf
> > > scalable and bullet proof (as far as we can) by exposing this
> > > information via the perf RECORD. That way any future changes in the
> > > scheme won't affect the perf as long as it has a reliable information
> > > within each "record".
> > >
> > >
> > > My point is, let us fix this once and for all, so that we don't need
> > > to change this again. I understand this involves more work in the perf
> > > tool. I believe that is for better
> > >
> > > Thoughts ?
> > >
> >
> > My preference is the incremental approach.
> > Fix the trace ID allocation issues that partners are having now, then update
> > to the perf record approach in a separate follow up patchset.
> > Then when we start to see systems that require it - update to using the per-
> > unique-path trace ID pools.
> >
> > Regards
> >
> > Mike
> >
> > > Suzuki
> >
> >
> >
> > --
> > Mike Leach
> > Principal Engineer, ARM Ltd.
> > Manchester Design Centre. UK

Mike Leach March 23, 2022, 11:35 a.m. UTC | #10

Hi Suzuki

On Wed, 23 Mar 2022 at 10:41, Suzuki Kuruppassery Poulose
<suzuki.poulose@arm.com> wrote:
>
> Hi Mike
>
> On 23/03/2022 10:07, Mike Leach wrote:
> > Hi Suzuki,
> >
> > On Tue, 22 Mar 2022 at 18:52, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> >>
> >> Hi Mike
> >>
> >> On 22/03/2022 14:27, Mike Leach wrote:
> >>> Hi Suzuki
> >>>
> >>> On Tue, 22 Mar 2022 at 12:35, Suzuki Kuruppassery Poulose
> >>> <suzuki.poulose@arm.com> wrote:
> >>>>
> >>>> On 22/03/2022 11:38, Mike Leach wrote:
> >>>>> HI Suzuki,
> >>>>>
> >>>>> On Tue, 22 Mar 2022 at 10:43, Suzuki Kuruppassery Poulose
> >>>>> <suzuki.poulose@arm.com> wrote:
> >>>>>>
> >>>>>> + Cc: James Clark
> >>>>>>
> >>>>>> Hi Mike,
> >>>>>>
> >>>>>> On 08/03/2022 20:49, Mike Leach wrote:
> >>>>>>> The current method for allocating trace source ID values to sources is
> >>>>>>> to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10).
> >>>>>>> The STM is allocated ID 0x1.
> >>>>>>>
> >>>>>>> This fixed algorithm is used in both the CoreSight driver code, and by
> >>>>>>> perf when writing the trace metadata in the AUXTRACE_INFO record.
> >>>>>>>
> >>>>>>> The method needs replacing as currently:-
> >>>>>>> 1. It is inefficient in using available IDs.
> >>>>>>> 2. Does not scale to larger systems with many cores and the algorithm
> >>>>>>> has no limits so will generate invalid trace IDs for cpu number > 44.
> >>>>>>
> >>>>>> Thanks for addressing this issue.
> >>>>>>
> >>>>>>>
> >>>>>>> Additionally requirements to allocate additional system IDs on some
> >>>>>>> systems have been seen.
> >>>>>>>
> >>>>>>> This patch set  introduces an API that allows the allocation of trace IDs
> >>>>>>> in a dynamic manner.
> >>>>>>>
> >>>>>>> Architecturally reserved IDs are never allocated, and the system is
> >>>>>>> limited to allocating only valid IDs.
> >>>>>>>
> >>>>>>> Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use
> >>>>>>> the new API.
> >>>>>>>
> >>>>>>> perf handling is changed so that the ID associated with the CPU is read
> >>>>>>> from sysfs. The ID allocator is notified when perf events start and stop
> >>>>>>> so CPU based IDs are kept constant throughout any perf session.
> >>>>>>>
> >>>>>>> For the ETMx.x devices IDs are allocated on certain events
> >>>>>>> a) When using sysfs, an ID will be allocated on hardware enable, and freed
> >>>>>>> when the sysfs reset is written.
> >>>>>>> b) When using perf, ID is allocated on hardware enable, and freed on
> >>>>>>> hardware disable.
> >>>>>>>
> >>>>>>> For both cases the ID is allocated when sysfs is read to get the current
> >>>>>>> trace ID. This ensures that consistent decode metadata can be extracted
> >>>>>>> from the system where this read occurs before device enable.
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> Note: This patchset breaks backward compatibility for perf record.
> >>>>>>> Because the method for generating the AUXTRACE_INFO meta data has
> >>>>>>> changed, using an older perf record will result in metadata that
> >>>>>>> does not match the trace IDs used in the recorded trace data.
> >>>>>>> This mismatch will cause subsequent decode to fail. Older versions of
> >>>>>>> perf will still be able to decode data generated by the updated system.
> >>>>>>
> >>>>>> I have some concerns over this and the future plans for the dynamic
> >>>>>> allocation per sink. i.e., we are breaking/modifying the perf now to
> >>>>>> accommodate the dynamic nature of the trace id of a given CPU/ETM.
> >>>>>
> >>>>> I don't beleive we have a choice for this - we cannot retain what is
> >>>>> an essentially broken allocation mechanism.
> >>>>>
> >>>>
> >>>> I completely agree and I am happy with the current step by step approach
> >>>> of moving to a dynamic allocation scheme. Apologies, this wasn't
> >>>> conveyed appropriately.
> >>>>
> >>>>>> The proposed approach of exposing this via sysfs may (am not sure if
> >>>>>> this would be the case) break for the trace-id per sink change, as a
> >>>>>> sink could assign different trace-id for a CPU depending.
> >>>>>>
> >>>>>
> >>>>> If a path exists between a CPU and a sink - the current framework as
> >>>>> far as I can tell would not allow for a new path to be set up between
> >>>>> the cpu and another sink.
> >>>>
> >>>> e.g, if we have concurrent perf sessions, in the future with sink  based
> >>>> allocation :
> >>>>
> >>>> perf record -e cs_etm/@sink1/... payload1
> >>>> perf record -e cs_etm/@sink2/... payload2
> >>>> perf record -e cs_etm// ...      payload3
> >>>>
> >>>> The trace id allocated for first session for CPU0 *could* be different
> >>>> from that of the second or the third.
> >>>
> >>> If these sessions run concurrently then the same Trace ID will be used
> >>> for CPU0 for all the sessions.
> >>> We ensure this by notifications that a cs_etm session is starting and
> >>> stopping - and keep a refcount.
> >>
> >> The scheme is fine now, with a global trace-id map. But with per-sink
> >> allocation, this could cause problems.
> >>
> >> e.g., there could be a situation where:
> >>
> >> trace_id[CPU0][sink0] == trace_id[CPU1][sink1]
> >>
> >> So if we have a session where both CPU0 and CPU1 trace to a common sink,
> >> we get the trace mixed with no way of splitting them. As the perf will
> >> read the trace-id for CPU0 from that of sink0 and CPU1 from sink1.
> >
> > I think we need to consider the CoreSight hardware topology here.
> >
> > Any CPUx that can trace to a sink reachable by another CPUy must
> > always get trace IDs from the same pool as CPUy.
> >
> > Consider the options for multi sink topologies:-
> >
> > CPU0->funnel0->ETF->ETR
> > CPU1--+^
> >
> > Clearly - in-line sinks can never have simultaneous independent
> > sessions - the session into ETR traces through ETF
> >
> > Now we could have replicators / programmable replicators -
> >
> > ATB->Prog replicator->ETR0
> >                                   +->ETR1
> >
> > however programmable replicators use trace ID for filtering - this is
> > effectively a single sink on the input, so once again the Trace IDs
> > must come from the same pool.
> >
> > Now, we could have independent per cluster / socket topology
> > Cluster 0
> > CPU0->funnel0->ETF0->ETR0
> > CPU1--+^
> >
> > Cluster 1
> > CPU2->funnel1->ETF1->ETR1
> > CPU3--+^
> >
>
> What if the ETR was a common one ? i.e.
>
> Cluster0
> CPU0 -> ETF0 .....
>                     \
> Cluster1            -- ETR0
> CPU1 -> ETF1 ..... /
>
> And lets there are 3 perf sessions in parallel, started in the
> order below
>
> perf record -e cs_etm/@etf0/ app1 # CPU0 gets a trace-id[etf0] -> 0x50
> perf record -e cs_etm/@etf1/ app2 # CPU1 gets a trace-id[etf1] -> 0x50
> perf record -e cs_etm/@etr/  app3 # CPU0 and CPU1 both use the existing
> trace ids from the allocations.
>

This could be treated as a single combined topology - as soon as any
sink is reachable by CPU0 and CPU1 then we have to treat this as
having a single pool of trace IDs and so CPU0 and CPU1 cannot have the
same ID.
Alternatively, once the allocation metadata is expanded to recognize
trace ID pools - it is entirely possible to ensure that the CPU / ID
fixing is done on a per pool basis.
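
One way to picture the "single combined topology" rule is as a grouping problem: any two CPUs that can reach a common sink end up in the same ID pool. The sketch below does this with a small union-find. The function name `build_pools` and the sink names are made up for illustration, and the reachability map is assumed to be supplied by the caller; nothing here reflects how the driver discovers topology.

```python
def build_pools(reachable):
    """Group CPUs into pools; CPUs sharing any reachable sink share a pool.

    `reachable` maps a CPU number to the set of sink names it can trace to.
    """
    parent = {cpu: cpu for cpu in reachable}

    def find(cpu):
        while parent[cpu] != cpu:
            parent[cpu] = parent[parent[cpu]]  # path halving
            cpu = parent[cpu]
        return cpu

    first_cpu_for_sink = {}
    for cpu, sinks in reachable.items():
        for sink in sinks:
            if sink in first_cpu_for_sink:
                # Merge this CPU's pool with the pool already using the sink.
                parent[find(cpu)] = find(first_cpu_for_sink[sink])
            else:
                first_cpu_for_sink[sink] = cpu

    pools = {}
    for cpu in reachable:
        pools.setdefault(find(cpu), set()).add(cpu)
    return sorted(sorted(p) for p in pools.values())


# Two clusters feeding one common ETR: a single pool.
shared_etr = build_pools({0: {"etf0", "etr0"}, 1: {"etf1", "etr0"}})

# Two fully independent clusters: two pools.
independent = build_pools({
    0: {"etf0", "etr0"}, 1: {"etf0", "etr0"},
    2: {"etf1", "etr1"}, 3: {"etf1", "etr1"},
})
print(shared_etr, independent)
```

With the common ETR, CPU0 and CPU1 collapse into one pool even though their ETFs are separate.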

One of the reasons we need to ensure that the CPU / ID allocation
remains constant is that an event can be scheduled on a CPU multiple
times for a single aux buffer - resulting in multiple trace blocks in
the buffer / multiple buffers in the data file, so we cannot have the
ID change mid buffer / file without significant changes to the decode
process and tracking of CPU / ID changes on an intra-buffer basis.
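
The "keep the CPU / ID mapping constant while sessions are active" idea can be sketched as a session refcount that defers releases. This is a simplified model, not the patchset's code: `CpuIdHolder` and its methods are hypothetical names, and for brevity it never recycles ID values.

```python
class CpuIdHolder:
    """Keeps CPU -> ID mappings fixed while any perf session is active."""

    def __init__(self):
        self.sessions = 0
        self.cpu_ids = {}
        self.pending_release = set()
        self.next_id = 0x10  # arbitrary first ID for the sketch

    def session_start(self):
        self.sessions += 1

    def session_stop(self):
        self.sessions -= 1
        if self.sessions == 0:
            # Last session gone: releases deferred earlier can now happen.
            for cpu in self.pending_release:
                self.cpu_ids.pop(cpu, None)
            self.pending_release.clear()

    def get_id(self, cpu):
        self.pending_release.discard(cpu)
        if cpu not in self.cpu_ids:
            self.cpu_ids[cpu] = self.next_id
            self.next_id += 1
        return self.cpu_ids[cpu]

    def put_id(self, cpu):
        if self.sessions > 0:
            self.pending_release.add(cpu)  # defer until sessions end
        else:
            self.cpu_ids.pop(cpu, None)


holder = CpuIdHolder()
holder.session_start()
first = holder.get_id(0)   # event scheduled on CPU0
holder.put_id(0)           # event scheduled out
second = holder.get_id(0)  # scheduled on CPU0 again, same aux buffer
holder.put_id(0)
holder.session_stop()
released = 0 not in holder.cpu_ids
print(first == second, released)
```

The event sees the same ID both times it lands on CPU0, and the mapping is only dropped once the last session has stopped.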

Mike

> So, when app3 threads are scheduled on CPU0 & CPU1, we get the trace in
> ETR with the same trace-id of 0x50.
>
> Suzuki

Suzuki K Poulose March 23, 2022, 12:08 p.m. UTC | #11

Hi Mike

On 23/03/2022 11:35, Mike Leach wrote:
> Hi Suzuki
> 
> On Wed, 23 Mar 2022 at 10:41, Suzuki Kuruppassery Poulose
> <suzuki.poulose@arm.com> wrote:
>>
>> Hi Mike
>>
>> On 23/03/2022 10:07, Mike Leach wrote:
>>> Hi Suzuki,
>>>
>>> On Tue, 22 Mar 2022 at 18:52, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>>>>
>>>> Hi Mike
>>>>
>>>> On 22/03/2022 14:27, Mike Leach wrote:
>>>>> Hi Suzuki
>>>>>
>>>>> On Tue, 22 Mar 2022 at 12:35, Suzuki Kuruppassery Poulose
>>>>> <suzuki.poulose@arm.com> wrote:
>>>>>>
>>>>>> On 22/03/2022 11:38, Mike Leach wrote:
>>>>>>> HI Suzuki,
>>>>>>>
>>>>>>> On Tue, 22 Mar 2022 at 10:43, Suzuki Kuruppassery Poulose
>>>>>>> <suzuki.poulose@arm.com> wrote:
>>>>>>>>
>>>>>>>> + Cc: James Clark
>>>>>>>>
>>>>>>>> Hi Mike,
>>>>>>>>
>>>>>>>> On 08/03/2022 20:49, Mike Leach wrote:
>>>>>>>>> The current method for allocating trace source ID values to sources is
>>>>>>>>> to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10).
>>>>>>>>> The STM is allocated ID 0x1.
>>>>>>>>>
>>>>>>>>> This fixed algorithm is used in both the CoreSight driver code, and by
>>>>>>>>> perf when writing the trace metadata in the AUXTRACE_INFO record.
>>>>>>>>>
>>>>>>>>> The method needs replacing as currently:-
>>>>>>>>> 1. It is inefficient in using available IDs.
>>>>>>>>> 2. Does not scale to larger systems with many cores and the algorithm
>>>>>>>>> has no limits so will generate invalid trace IDs for cpu number > 44.
>>>>>>>>
>>>>>>>> Thanks for addressing this issue.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Additionally requirements to allocate additional system IDs on some
>>>>>>>>> systems have been seen.
>>>>>>>>>
>>>>>>>>> This patch set  introduces an API that allows the allocation of trace IDs
>>>>>>>>> in a dynamic manner.
>>>>>>>>>
>>>>>>>>> Architecturally reserved IDs are never allocated, and the system is
>>>>>>>>> limited to allocating only valid IDs.
>>>>>>>>>
>>>>>>>>> Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use
>>>>>>>>> the new API.
>>>>>>>>>
>>>>>>>>> perf handling is changed so that the ID associated with the CPU is read
>>>>>>>>> from sysfs. The ID allocator is notified when perf events start and stop
>>>>>>>>> so CPU based IDs are kept constant throughout any perf session.
>>>>>>>>>
>>>>>>>>> For the ETMx.x devices IDs are allocated on certain events
>>>>>>>>> a) When using sysfs, an ID will be allocated on hardware enable, and freed
>>>>>>>>> when the sysfs reset is written.
>>>>>>>>> b) When using perf, ID is allocated on hardware enable, and freed on
>>>>>>>>> hardware disable.
>>>>>>>>>
>>>>>>>>> For both cases the ID is allocated when sysfs is read to get the current
>>>>>>>>> trace ID. This ensures that consistent decode metadata can be extracted
>>>>>>>>> from the system where this read occurs before device enable.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Note: This patchset breaks backward compatibility for perf record.
>>>>>>>>> Because the method for generating the AUXTRACE_INFO meta data has
>>>>>>>>> changed, using an older perf record will result in metadata that
>>>>>>>>> does not match the trace IDs used in the recorded trace data.
>>>>>>>>> This mismatch will cause subsequent decode to fail. Older versions of
>>>>>>>>> perf will still be able to decode data generated by the updated system.
>>>>>>>>
>>>>>>>> I have some concerns over this and the future plans for the dynamic
>>>>>>>> allocation per sink. i.e., we are breaking/modifying the perf now to
>>>>>>>> accommodate the dynamic nature of the trace id of a given CPU/ETM.
>>>>>>>
>>>>>>> I don't beleive we have a choice for this - we cannot retain what is
>>>>>>> an essentially broken allocation mechanism.
>>>>>>>
>>>>>>
>>>>>> I completely agree and I am happy with the current step by step approach
>>>>>> of moving to a dynamic allocation scheme. Apologies, this wasn't
>>>>>> conveyed appropriately.
>>>>>>
>>>>>>>> The proposed approach of exposing this via sysfs may (am not sure if
>>>>>>>> this would be the case) break for the trace-id per sink change, as a
>>>>>>>> sink could assign different trace-id for a CPU depending.
>>>>>>>>
>>>>>>>
>>>>>>> If a path exists between a CPU and a sink - the current framework as
>>>>>>> far as I can tell would not allow for a new path to be set up between
>>>>>>> the cpu and another sink.
>>>>>>
>>>>>> e.g, if we have concurrent perf sessions, in the future with sink  based
>>>>>> allocation :
>>>>>>
>>>>>> perf record -e cs_etm/@sink1/... payload1
>>>>>> perf record -e cs_etm/@sink2/... payload2
>>>>>> perf record -e cs_etm// ...      payload3
>>>>>>
>>>>>> The trace id allocated for first session for CPU0 *could* be different
>>>>>> from that of the second or the third.
>>>>>
>>>>> If these sessions run concurrently then the same Trace ID will be used
>>>>> for CPU0 for all the sessions.
>>>>> We ensure this by notifications that a cs_etm session is starting and
>>>>> stopping - and keep a refcount.
>>>>
>>>> The scheme is fine now, with a global trace-id map. But with per-sink
>>>> allocation, this could cause problems.
>>>>
>>>> e.g., there could be a situation where:
>>>>
>>>> trace_id[CPU0][sink0] == trace_id[CPU1][sink1]
>>>>
>>>> So if we have a session where both CPU0 and CPU1 trace to a common sink,
>>>> we get the trace mixed with no way of splitting them. As the perf will
>>>> read the trace-id for CPU0 from that of sink0 and CPU1 from sink1.
>>>
>>> I think we need to consider the CoreSight hardware topology here.
>>>
>>> Any CPUx that can trace to a sink reachable by another CPUy must
>>> always get trace IDs from the same pool as CPUy.
>>>
>>> Consider the options for multi sink topologies:-
>>>
>>> CPU0->funnel0->ETF->ETR
>>> CPU1--+^
>>>
>>> Clearly - in-line sinks can never have simultaneous independent
>>> sessions - the session into ETR traces through ETF
>>>
>>> Now we could have replicators / programmable replicators -
>>>
>>> ATB->Prog replicator->ETR0
>>>                                    +->ETR1
>>>
>>> however programmable replicators use trace ID for filtering - this is
>>> effectively a single sink on the input, so once again the Trace IDs
>>> must come from the same pool.
>>>
>>> Now, we could have independent per cluster / socket topology
>>> Cluster 0
>>> CPU0->funnel0->ETF0->ETR0
>>> CPU1--+^
>>>
>>> Cluster 1
>>> CPU2->funnel1->ETF1->ETR1
>>> CPU3--+^
>>>
>>
>> What if the ETR was a common one ? i.e.
>>
>> Cluster0
>> CPU0 -> ETF0 .....
>>                      \
>> Cluster1            -- ETR0
>> CPU1 -> ETF1 ..... /
>>
>> And lets there are 3 perf sessions in parallel, started in the
>> order below
>>
>> perf record -e cs_etm/@etf0/ app1 # CPU0 gets a trace-id[etf0] -> 0x50
>> perf record -e cs_etm/@etf1/ app2 # CPU1 gets a trace-id[etf1] -> 0x50
>> perf record -e cs_etm/@etr/  app3 # CPU0 and CPU1 both use the existing
>> trace ids from the allocations.
>>
> 
> This could be treated as a single combined topology - as soon as any
> sink is reachable by CPU0 and CPU1 then we have to treat this as
> having a single pool of trace IDs and so CPU0 and CPU1 cannot have the
> same ID.
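The "single combined topology" rule can be illustrated with a small union-find structure: whenever two CPUs can reach a common sink, their pools are merged, and every CPU in a merged pool must then draw IDs from the same set. This is only a sketch of the grouping idea under discussion, not code from the patch set; MAX_CPUS, pools_init(), pool_of() and share_sink() are hypothetical names.

```c
#define MAX_CPUS 8	/* hypothetical size, for illustration only */

/* parent[] is a union-find forest; each root identifies one trace ID pool. */
static int parent[MAX_CPUS];

/* Start with every CPU in its own pool. */
static void pools_init(void)
{
	for (int i = 0; i < MAX_CPUS; i++)
		parent[i] = i;
}

/* Return the representative (root) of the pool that cpu belongs to. */
static int pool_of(int cpu)
{
	while (parent[cpu] != cpu) {
		parent[cpu] = parent[parent[cpu]];	/* path halving */
		cpu = parent[cpu];
	}
	return cpu;
}

/* Record that cpu_a and cpu_b can both reach some common sink. */
static void share_sink(int cpu_a, int cpu_b)
{
	parent[pool_of(cpu_a)] = pool_of(cpu_b);
}
```

With the two-cluster topology above, share_sink(0, 1) and share_sink(2, 3) give two independent pools. Adding a common ETR reachable from CPU1 and CPU2, i.e. share_sink(1, 2), collapses them into one pool, which is the case being debated here.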

IIUC, that is indeed going to be much more complex than allocating
a trace-id per sink. Moreover, we are going to end up "enforcing" the
pool restriction (imposed by a global system-wide ETR) on a sink that is
local to a cluster, for example. And thus we could be back to square one.

> Alternatively, once the allocation metadata is expanded to recognize
> trace ID pools - it is entirely possible to ensure that the CPU / ID
> fixing is done on a per pool basis.
> 
> One of the reasons we need to ensure that the CPU / ID allocation
> remains constant is that an event can be scheduled on a CPU multiple
> times for a single aux buffer - resulting in multiple trace blocks in
> the buffer / multiple buffers in the data file, so we cannot have the
> ID change mid buffer / file without significant changes to the decode
> process and tracking of CPU / ID changes on an intra-buffer basis.

Correct, and we must not. My proposal is not to change the traceid of a
CPU within a given "perf record". Instead, since the sink for a CPU is
fixed for a given "perf record" and cannot change, we can allocate
a traceid map per sink, which will remain the same throughout that record.
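A rough sketch of what a per-sink trace ID map could look like, assuming one map object per sink and a valid ID range of 0x01 to 0x6F (an assumption based on the CoreSight reserved values, not stated in this thread). The struct and function names are hypothetical and do not come from the patch set.

```c
#include <stdint.h>

#define TRACE_ID_MIN	0x01	/* assumed lowest valid ID */
#define TRACE_ID_MAX	0x6F	/* assumed highest valid ID */
#define NR_CPUS_DEMO	4	/* hypothetical, for illustration */

/* One of these per sink: IDs are unique only within a single map. */
struct sink_id_map {
	uint8_t used[TRACE_ID_MAX + 1];	/* 1 if the ID is taken on this sink */
	uint8_t cpu_id[NR_CPUS_DEMO];	/* 0 means "not allocated yet" */
};

/*
 * Return the trace ID for cpu on this sink, allocating the lowest free
 * ID on first use. Repeated calls return the same ID. Returns -1 on
 * a bad CPU number or when the map is exhausted.
 */
static int sink_get_cpu_id(struct sink_id_map *map, int cpu)
{
	if (cpu < 0 || cpu >= NR_CPUS_DEMO)
		return -1;
	if (map->cpu_id[cpu])
		return map->cpu_id[cpu];

	for (int id = TRACE_ID_MIN; id <= TRACE_ID_MAX; id++) {
		if (!map->used[id]) {
			map->used[id] = 1;
			map->cpu_id[cpu] = (uint8_t)id;
			return id;
		}
	}
	return -1;
}
```

Within one map, two CPUs never share an ID and a CPU keeps its ID across calls. Two different maps (say, one for etf0 and one for etf1) may hand out the same numeric ID to different CPUs, which is harmless only as long as those CPUs never trace into a common sink in the same session.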

Cheers
Suzuki



> Mike
> 
>> So, when app3 threads are scheduled on CPU0 & CPU1, we get the trace in
>> ETR with the same trace-id of 0x50.
>>
>> Suzuki
> 
> 
>

Mathieu Poirier April 4, 2022, 4:15 p.m. UTC | #12

Good morning,

On Tue, Mar 08, 2022 at 08:49:50PM +0000, Mike Leach wrote:
> The current method for allocating trace source ID values to sources is
> to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10).
> The STM is allocated ID 0x1.
> 
> This fixed algorithm is used in both the CoreSight driver code, and by
> perf when writing the trace metadata in the AUXTRACE_INFO record.
> 
> The method needs replacing as currently:-
> 1. It is inefficient in using available IDs.
> 2. Does not scale to larger systems with many cores and the algorithm
> has no limits so will generate invalid trace IDs for cpu number > 44.
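The fixed scheme described above can be sketched in a few lines. This is an illustration only: the helper names are hypothetical, and the validity check (IDs strictly between 0x00 and 0x70) is an assumption based on the CoreSight reserved values rather than something stated in this cover letter.

```c
#include <stdbool.h>

#define LEGACY_ID_BASE		0x10	/* from the formula cpu_num * 2 + 0x10 */
#define TRACE_ID_RES_0		0x00	/* assumed reserved value */
#define TRACE_ID_RES_TOP	0x70	/* assumed start of reserved top range */

/* Legacy fixed mapping from CPU number to trace ID. */
static int legacy_trace_id(int cpu)
{
	return cpu * 2 + LEGACY_ID_BASE;
}

/* True if id lies in the assumed valid range. */
static bool trace_id_is_valid(int id)
{
	return id > TRACE_ID_RES_0 && id < TRACE_ID_RES_TOP;
}
```

Because the mapping has no upper bound, a sufficiently large CPU number produces an ID that fails trace_id_is_valid(). The formula also only ever yields even values at or above 0x10, so the odd IDs in between are never used for CPUs, which is the inefficiency noted in point 1.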
> 
> Additionally, requirements to allocate additional system IDs on some
> systems have been seen.
> 
> This patch set introduces an API that allows the allocation of trace IDs
> in a dynamic manner.
> 
> Architecturally reserved IDs are never allocated, and the system is
> limited to allocating only valid IDs.
> 
> Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use
> the new API.
> 
> perf handling is changed so that the ID associated with the CPU is read
> from sysfs. The ID allocator is notified when perf events start and stop
> so CPU based IDs are kept constant throughout any perf session.
> 
> For the ETMx.x devices, IDs are allocated on certain events:
> a) When using sysfs, an ID will be allocated on hardware enable, and freed
> when the sysfs reset is written.
> b) When using perf, ID is allocated on hardware enable, and freed on
> hardware disable.
> 
> For both cases the ID is allocated when sysfs is read to get the current
> trace ID. This ensures that consistent decode metadata can be extracted
> from the system where this read occurs before device enable.
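One way to picture the "allocate on read, keep constant during perf sessions" behaviour described above is an allocator that defers releases while any perf session is active. This is a sketch under stated assumptions, not the implementation in the series: all names are hypothetical, and the ID range 0x01 to 0x6F is assumed from the CoreSight reserved values.

```c
#define DEMO_CPUS	4	/* hypothetical, for illustration */
#define ID_MIN		0x01	/* assumed lowest valid ID */
#define ID_MAX		0x6F	/* assumed highest valid ID */

struct id_state {
	unsigned char used[ID_MAX + 1];		/* 1 if the ID is taken */
	unsigned char cpu_id[DEMO_CPUS];	/* 0 means "not allocated" */
	unsigned char pending_release[DEMO_CPUS];
	int perf_sessions;			/* active perf sessions */
};

/* Read the ID for cpu, allocating the lowest free ID on first use. */
static int get_cpu_id(struct id_state *s, int cpu)
{
	if (cpu < 0 || cpu >= DEMO_CPUS)
		return -1;
	s->pending_release[cpu] = 0;	/* a new user cancels a deferred release */
	if (s->cpu_id[cpu])
		return s->cpu_id[cpu];
	for (int id = ID_MIN; id <= ID_MAX; id++) {
		if (!s->used[id]) {
			s->used[id] = 1;
			s->cpu_id[cpu] = (unsigned char)id;
			return id;
		}
	}
	return -1;
}

static void release_now(struct id_state *s, int cpu)
{
	s->used[s->cpu_id[cpu]] = 0;
	s->cpu_id[cpu] = 0;
	s->pending_release[cpu] = 0;
}

/* Release the ID for cpu, or defer the release while perf is running. */
static void put_cpu_id(struct id_state *s, int cpu)
{
	if (cpu < 0 || cpu >= DEMO_CPUS || !s->cpu_id[cpu])
		return;
	if (s->perf_sessions > 0)
		s->pending_release[cpu] = 1;
	else
		release_now(s, cpu);
}

static void perf_start(struct id_state *s)
{
	s->perf_sessions++;
}

/* When the last session stops, carry out all deferred releases. */
static void perf_stop(struct id_state *s)
{
	if (s->perf_sessions > 0 && --s->perf_sessions == 0)
		for (int cpu = 0; cpu < DEMO_CPUS; cpu++)
			if (s->pending_release[cpu])
				release_now(s, cpu);
}
```

In this sketch an event that is disabled and re-enabled on the same CPU between perf_start() and perf_stop() sees the same ID each time, so the metadata read at the start of the session stays consistent with the trace data.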
> 
> Note: This patchset breaks backward compatibility for perf record.
> Because the method for generating the AUXTRACE_INFO metadata has
> changed, using an older perf record will result in metadata that
> does not match the trace IDs used in the recorded trace data.
> This mismatch will cause subsequent decode to fail. Older versions of
> perf will still be able to decode data generated by the updated system.
> 

I have started looking at this set, comments to follow shortly.

Thanks,
Mathieu

> 
> Applies to coresight/next [b54f53bc11a5]
> Tested on DB410c
> 
> Mike Leach (10):
>   coresight: trace-id: Add API to dynamically assign trace ID values
>   coresight: trace-id: Set up source trace ID map for system
>   coresight: stm: Update STM driver to use Trace ID api
>   coresight: etm4x: Use trace ID API to dynamically allocate trace ID
>   coresight: etm3x: Use trace ID API to allocate IDs
>   coresight: perf: traceid: Add perf notifiers for trace ID
>   perf: cs-etm: Update event to read trace ID from sysfs
>   coresight: Remove legacy Trace ID allocation mechanism
>   coresight: etmX.X: stm: Remove unused legacy source trace ID ops
>   coresight: trace-id: Add debug & test macros to trace id allocation
> 
>  drivers/hwtracing/coresight/Makefile          |   2 +-
>  drivers/hwtracing/coresight/coresight-core.c  |  64 ++---
>  .../hwtracing/coresight/coresight-etm-perf.c  |  16 +-
>  drivers/hwtracing/coresight/coresight-etm.h   |   3 +-
>  .../coresight/coresight-etm3x-core.c          |  93 ++++---
>  .../coresight/coresight-etm3x-sysfs.c         |  28 +-
>  .../coresight/coresight-etm4x-core.c          |  63 ++++-
>  .../coresight/coresight-etm4x-sysfs.c         |  32 ++-
>  drivers/hwtracing/coresight/coresight-etm4x.h |   3 +
>  drivers/hwtracing/coresight/coresight-priv.h  |   1 +
>  drivers/hwtracing/coresight/coresight-stm.c   |  49 +---
>  .../hwtracing/coresight/coresight-trace-id.c  | 255 ++++++++++++++++++
>  .../hwtracing/coresight/coresight-trace-id.h  |  69 +++++
>  include/linux/coresight-pmu.h                 |  12 -
>  include/linux/coresight.h                     |   3 -
>  tools/perf/arch/arm/util/cs-etm.c             |  12 +-
>  16 files changed, 530 insertions(+), 175 deletions(-)
>  create mode 100644 drivers/hwtracing/coresight/coresight-trace-id.c
>  create mode 100644 drivers/hwtracing/coresight/coresight-trace-id.h
> 
> -- 
> 2.17.1
>
