diff mbox series

[v7,2/2] perf cs-etm: Split --dump-raw-trace by AUX records

Message ID 20210624164303.28632-3-james.clark@arm.com (mailing list archive)
State New, archived
Headers show
Series perf cs-etm: Split Coresight decode by aux records | expand

Commit Message

James Clark June 24, 2021, 4:43 p.m. UTC
Currently --dump-raw-trace skips queueing and splitting buffers because
of an early exit condition in cs_etm__process_auxtrace_info(). Once
that is removed we can print the split data by using the queues
and searching for split buffers with the same reference as the
one that is currently being processed.

This keeps the same behaviour of dumping in file order when an AUXTRACE
event appears, rather than moving trace dump to where AUX records are in
the file.

There will be a newline and size printout for each fragment. For example
this buffer is comprised of two AUX records, but was printed as one:

  0 0 0x8098 [0x30]: PERF_RECORD_AUXTRACE size: 0xa0  offset: 0  ref: 0x491a4dfc52fc0e6e  idx: 0  t

  . ... CoreSight ETM Trace data: size 160 bytes
          Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
          Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
          Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
          Idx:80; ID:10;  I_ASYNC : Alignment Synchronisation.
          Idx:92; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
          Idx:97; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFDE2AD3FD76D4;

But is now printed as two fragments:

  0 0 0x8098 [0x30]: PERF_RECORD_AUXTRACE size: 0xa0  offset: 0  ref: 0x491a4dfc52fc0e6e  idx: 0  t

  . ... CoreSight ETM Trace data: size 80 bytes
          Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
          Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
          Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;

  . ... CoreSight ETM Trace data: size 80 bytes
          Idx:80; ID:10;  I_ASYNC : Alignment Synchronisation.
          Idx:92; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
          Idx:97; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFDE2AD3FD76D4;

Decoding errors that appeared in problematic files are now not present,
for example:

        Idx:808; ID:1c; I_BAD_SEQUENCE : Invalid Sequence in packet.[I_ASYNC]
        ...
        PKTP_ETMV4I_0016 : 0x0014 (OCSD_ERR_INVALID_PCKT_HDR) [Invalid packet header]; TrcIdx=822

Signed-off-by: James Clark <james.clark@arm.com>
---
 tools/perf/util/cs-etm.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

Comments

Leo Yan June 28, 2021, 1:27 a.m. UTC | #1
Hi James,

On Thu, Jun 24, 2021 at 05:43:03PM +0100, James Clark wrote:
> Currently --dump-raw-trace skips queueing and splitting buffers because
> of an early exit condition in cs_etm__process_auxtrace_info(). Once
> that is removed we can print the split data by using the queues
> and searching for split buffers with the same reference as the
> one that is currently being processed.
> 
> This keeps the same behaviour of dumping in file order when an AUXTRACE
> event appears, rather than moving trace dump to where AUX records are in
> the file.
> 
> There will be a newline and size printout for each fragment. For example
> this buffer is comprised of two AUX records, but was printed as one:
> 
>   0 0 0x8098 [0x30]: PERF_RECORD_AUXTRACE size: 0xa0  offset: 0  ref: 0x491a4dfc52fc0e6e  idx: 0  t
> 
>   . ... CoreSight ETM Trace data: size 160 bytes
>           Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
>           Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>           Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
>           Idx:80; ID:10;  I_ASYNC : Alignment Synchronisation.
>           Idx:92; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>           Idx:97; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFDE2AD3FD76D4;
> 
> But is now printed as two fragments:
> 
>   0 0 0x8098 [0x30]: PERF_RECORD_AUXTRACE size: 0xa0  offset: 0  ref: 0x491a4dfc52fc0e6e  idx: 0  t
> 
>   . ... CoreSight ETM Trace data: size 80 bytes
>           Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
>           Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>           Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
> 
>   . ... CoreSight ETM Trace data: size 80 bytes
>           Idx:80; ID:10;  I_ASYNC : Alignment Synchronisation.
>           Idx:92; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>           Idx:97; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFDE2AD3FD76D4;
> 
> Decoding errors that appeared in problematic files are now not present,
> for example:
> 
>         Idx:808; ID:1c; I_BAD_SEQUENCE : Invalid Sequence in packet.[I_ASYNC]
>         ...
>         PKTP_ETMV4I_0016 : 0x0014 (OCSD_ERR_INVALID_PCKT_HDR) [Invalid packet header]; TrcIdx=822
> 
> Signed-off-by: James Clark <james.clark@arm.com>

I tested this patch and the result looks good for me.

I found a side effect introduced by this change is the perf raw dump
also synthesizes events PERF_RECORD_ATTR.  This is because function
cs_etm__synth_events() will execute after applying this patch and it
synthesizes PERF_RECORD_ATTR events.  I don't see any harm for this,
so:

Tested-by: Leo Yan <leo.yan@linaro.org>

Please see an extra comment in below.

> ---
>  tools/perf/util/cs-etm.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 88e8122f73c9..ad777c2a342f 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -2430,6 +2430,22 @@ static int cs_etm__process_event(struct perf_session *session,
>  	return 0;
>  }
>  
> +static void dump_queued_data(struct cs_etm_auxtrace *etm,
> +			     struct perf_record_auxtrace *event)
> +{
> +	struct auxtrace_buffer *buf;
> +	unsigned int i;
> +	/*
> +	 * Find all buffers with same reference in the queues and dump them.
> +	 * This is because the queues can contain multiple entries of the same
> +	 * buffer that were split on aux records.
> +	 */
> +	for (i = 0; i < etm->queues.nr_queues; ++i)
> +		list_for_each_entry(buf, &etm->queues.queue_array[i].head, list)
> +			if (buf->reference == event->reference)
> +				cs_etm__dump_event(etm, buf);
> +}
> +
>  static int cs_etm__process_auxtrace_event(struct perf_session *session,
>  					  union perf_event *event,
>  					  struct perf_tool *tool __maybe_unused)
> @@ -2462,7 +2478,8 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session,
>  				cs_etm__dump_event(etm, buffer);
>  				auxtrace_buffer__put_data(buffer);
>  			}
> -	}
> +	} else if (dump_trace)
> +		dump_queued_data(etm, &event->auxtrace);

IIUC, in the function cs_etm__process_auxtrace_event(), since
"etm->data_queued" is always true, below flow will never run:

    if (!etm->data_queued) {
        ......

        if (dump_trace)
            if (auxtrace_buffer__get_data(buffer, fd)) {
                    cs_etm__dump_event(etm, buffer);
                    auxtrace_buffer__put_data(buffer);
            }
    }

If so, it's better to use a new patch to polish the code.

Thanks,
Leo

>  
>  	return 0;
>  }
> @@ -3038,7 +3055,6 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
>  
>  	if (dump_trace) {
>  		cs_etm__print_auxtrace_info(auxtrace_info->priv, num_cpu);
> -		return 0;
>  	}
>  
>  	err = cs_etm__synth_events(etm, session);
> -- 
> 2.28.0
>
James Clark June 28, 2021, 10:38 a.m. UTC | #2
On 28/06/2021 02:27, Leo Yan wrote:
> Hi James,
> 
> On Thu, Jun 24, 2021 at 05:43:03PM +0100, James Clark wrote:
>> Currently --dump-raw-trace skips queueing and splitting buffers because
>> of an early exit condition in cs_etm__process_auxtrace_info(). Once
>> that is removed we can print the split data by using the queues
>> and searching for split buffers with the same reference as the
>> one that is currently being processed.
>>
>> This keeps the same behaviour of dumping in file order when an AUXTRACE
>> event appears, rather than moving trace dump to where AUX records are in
>> the file.
>>
>> There will be a newline and size printout for each fragment. For example
>> this buffer is comprised of two AUX records, but was printed as one:
>>
>>   0 0 0x8098 [0x30]: PERF_RECORD_AUXTRACE size: 0xa0  offset: 0  ref: 0x491a4dfc52fc0e6e  idx: 0  t
>>
>>   . ... CoreSight ETM Trace data: size 160 bytes
>>           Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
>>           Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>>           Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
>>           Idx:80; ID:10;  I_ASYNC : Alignment Synchronisation.
>>           Idx:92; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>>           Idx:97; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFDE2AD3FD76D4;
>>
>> But is now printed as two fragments:
>>
>>   0 0 0x8098 [0x30]: PERF_RECORD_AUXTRACE size: 0xa0  offset: 0  ref: 0x491a4dfc52fc0e6e  idx: 0  t
>>
>>   . ... CoreSight ETM Trace data: size 80 bytes
>>           Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
>>           Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>>           Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
>>
>>   . ... CoreSight ETM Trace data: size 80 bytes
>>           Idx:80; ID:10;  I_ASYNC : Alignment Synchronisation.
>>           Idx:92; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>>           Idx:97; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFDE2AD3FD76D4;
>>
>> Decoding errors that appeared in problematic files are now not present,
>> for example:
>>
>>         Idx:808; ID:1c; I_BAD_SEQUENCE : Invalid Sequence in packet.[I_ASYNC]
>>         ...
>>         PKTP_ETMV4I_0016 : 0x0014 (OCSD_ERR_INVALID_PCKT_HDR) [Invalid packet header]; TrcIdx=822
>>
>> Signed-off-by: James Clark <james.clark@arm.com>
> 
> I tested this patch and the result looks good for me.
> 
> I found a side effect introduced by this change is the perf raw dump
> also synthesizes events PERF_RECORD_ATTR.  This is because function
> cs_etm__synth_events() will execute after applying this patch and it
> synthesizes PERF_RECORD_ATTR events.  I don't see any harm for this,
> so:
> 
> Tested-by: Leo Yan <leo.yan@linaro.org>
> 

Thanks for the testing!

> Please see an extra comment in below.
> 
>> ---
>>  tools/perf/util/cs-etm.c | 20 ++++++++++++++++++--
>>  1 file changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
>> index 88e8122f73c9..ad777c2a342f 100644
>> --- a/tools/perf/util/cs-etm.c
>> +++ b/tools/perf/util/cs-etm.c
>> @@ -2430,6 +2430,22 @@ static int cs_etm__process_event(struct perf_session *session,
>>  	return 0;
>>  }
>>  
>> +static void dump_queued_data(struct cs_etm_auxtrace *etm,
>> +			     struct perf_record_auxtrace *event)
>> +{
>> +	struct auxtrace_buffer *buf;
>> +	unsigned int i;
>> +	/*
>> +	 * Find all buffers with same reference in the queues and dump them.
>> +	 * This is because the queues can contain multiple entries of the same
>> +	 * buffer that were split on aux records.
>> +	 */
>> +	for (i = 0; i < etm->queues.nr_queues; ++i)
>> +		list_for_each_entry(buf, &etm->queues.queue_array[i].head, list)
>> +			if (buf->reference == event->reference)
>> +				cs_etm__dump_event(etm, buf);
>> +}
>> +
>>  static int cs_etm__process_auxtrace_event(struct perf_session *session,
>>  					  union perf_event *event,
>>  					  struct perf_tool *tool __maybe_unused)
>> @@ -2462,7 +2478,8 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session,
>>  				cs_etm__dump_event(etm, buffer);
>>  				auxtrace_buffer__put_data(buffer);
>>  			}
>> -	}
>> +	} else if (dump_trace)
>> +		dump_queued_data(etm, &event->auxtrace);
> 
> IIUC, in the function cs_etm__process_auxtrace_event(), since
> "etm->data_queued" is always true, below flow will never run:
> 
>     if (!etm->data_queued) {
>         ......
> 
>         if (dump_trace)
>             if (auxtrace_buffer__get_data(buffer, fd)) {
>                     cs_etm__dump_event(etm, buffer);
>                     auxtrace_buffer__put_data(buffer);
>             }
>     }
> 
> If so, it's better to use a new patch to polish the code.
> 

Hi Leo,

I think this is not true in piped mode because there is no auxtrace index.
In that mode, events are processed only in file order and cs_etm__process_auxtrace_event()
is called for each buffer.

You can reproduce this with something like this:

     ./perf record -o - ls > stdio.data
     cat stdio.data | ./perf report -i -

There are some other Coresight features that don't work as expected in this mode, like
sorting timestamps between CPUs. The aux split patchset won't work either because random
access isn't possible. And the TRBE patch that I'm working on now won't work, because it
also requires the random access to lookup the flags on the AUX record to configure the 
decoder for unformatted trace.


Thanks
James

> Thanks,
> Leo
> 
>>  
>>  	return 0;
>>  }
>> @@ -3038,7 +3055,6 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
>>  
>>  	if (dump_trace) {
>>  		cs_etm__print_auxtrace_info(auxtrace_info->priv, num_cpu);
>> -		return 0;
>>  	}
>>  
>>  	err = cs_etm__synth_events(etm, session);
>> -- 
>> 2.28.0
>>
Leo Yan June 28, 2021, 12:08 p.m. UTC | #3
On Mon, Jun 28, 2021 at 11:38:34AM +0100, James Clark wrote:

[...]

> >>  static int cs_etm__process_auxtrace_event(struct perf_session *session,
> >>  					  union perf_event *event,
> >>  					  struct perf_tool *tool __maybe_unused)
> >> @@ -2462,7 +2478,8 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session,
> >>  				cs_etm__dump_event(etm, buffer);
> >>  				auxtrace_buffer__put_data(buffer);
> >>  			}
> >> -	}
> >> +	} else if (dump_trace)
> >> +		dump_queued_data(etm, &event->auxtrace);
> > 
> > IIUC, in the function cs_etm__process_auxtrace_event(), since
> > "etm->data_queued" is always true, below flow will never run:
> > 
> >     if (!etm->data_queued) {
> >         ......
> > 
> >         if (dump_trace)
> >             if (auxtrace_buffer__get_data(buffer, fd)) {
> >                     cs_etm__dump_event(etm, buffer);
> >                     auxtrace_buffer__put_data(buffer);
> >             }
> >     }
> > 
> > If so, it's better to use a new patch to polish the code.
> > 
> 
> Hi Leo,
> 
> I think this is not true in piped mode because there is no auxtrace index.
> In that mode, events are processed only in file order and cs_etm__process_auxtrace_event()
> is called for each buffer.
> 
> You can reproduce this with something like this:
> 
>      ./perf record -o - ls > stdio.data
>      cat stdio.data | ./perf report -i -

You are right!  I tried these two commands with cs_etm event, just as
you said, in this case, the AUX trace data is not queued; so the flow
for "if (!etm->data_queued)" should be kept.  If so, I am very fine
for current change.  Thanks for sharing the knowledge.

> There are some other Coresight features that don't work as expected in this mode, like
> sorting timestamps between CPUs. The aux split patchset won't work either because random
> access isn't possible. And the TRBE patch that I'm working on now won't work, because it
> also requires the random access to lookup the flags on the AUX record to configure the 
> decoder for unformatted trace.

Cool, looking forward for the patches :)

Leo
Mathieu Poirier June 28, 2021, 8:01 p.m. UTC | #4
On Mon, Jun 28, 2021 at 08:08:02PM +0800, Leo Yan wrote:
> On Mon, Jun 28, 2021 at 11:38:34AM +0100, James Clark wrote:
> 
> [...]
> 
> > >>  static int cs_etm__process_auxtrace_event(struct perf_session *session,
> > >>  					  union perf_event *event,
> > >>  					  struct perf_tool *tool __maybe_unused)
> > >> @@ -2462,7 +2478,8 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session,
> > >>  				cs_etm__dump_event(etm, buffer);
> > >>  				auxtrace_buffer__put_data(buffer);
> > >>  			}
> > >> -	}
> > >> +	} else if (dump_trace)
> > >> +		dump_queued_data(etm, &event->auxtrace);
> > > 
> > > IIUC, in the function cs_etm__process_auxtrace_event(), since
> > > "etm->data_queued" is always true, below flow will never run:
> > > 
> > >     if (!etm->data_queued) {
> > >         ......
> > > 
> > >         if (dump_trace)
> > >             if (auxtrace_buffer__get_data(buffer, fd)) {
> > >                     cs_etm__dump_event(etm, buffer);
> > >                     auxtrace_buffer__put_data(buffer);
> > >             }
> > >     }
> > > 
> > > If so, it's better to use a new patch to polish the code.
> > > 
> > 
> > Hi Leo,
> > 
> > I think this is not true in piped mode because there is no auxtrace index.
> > In that mode, events are processed only in file order and cs_etm__process_auxtrace_event()
> > is called for each buffer.
> > 
> > You can reproduce this with something like this:
> > 
> >      ./perf record -o - ls > stdio.data
> >      cat stdio.data | ./perf report -i -
> 
> You are right!  I tried these two commands with cs_etm event, just as
> you said, in this case, the AUX trace data is not queued; so the flow
> for "if (!etm->data_queued)" should be kept.  If so, I am very fine
> for current change.  Thanks for sharing the knowledge.
> 
> > There are some other Coresight features that don't work as expected in this mode, like
> > sorting timestamps between CPUs. The aux split patchset won't work either because random
> > access isn't possible. And the TRBE patch that I'm working on now won't work, because it
> > also requires the random access to lookup the flags on the AUX record to configure the 
> > decoder for unformatted trace.
>

There is a lot of things happening in this area.  Based on the above should I
still plan to review this set or should I wait for another revision?

Thanks,
Mathieu

> Cool, looking forward for the patches :)
> 
> Leo
Leo Yan June 29, 2021, 6:09 a.m. UTC | #5
Hi Mathieu,

On Mon, Jun 28, 2021 at 02:01:32PM -0600, Mathieu Poirier wrote:

[...]

> > > Hi Leo,
> > > 
> > > I think this is not true in piped mode because there is no auxtrace index.
> > > In that mode, events are processed only in file order and cs_etm__process_auxtrace_event()
> > > is called for each buffer.
> > > 
> > > You can reproduce this with something like this:
> > > 
> > >      ./perf record -o - ls > stdio.data
> > >      cat stdio.data | ./perf report -i -
> > 
> > You are right!  I tried these two commands with cs_etm event, just as
> > you said, in this case, the AUX trace data is not queued; so the flow
> > for "if (!etm->data_queued)" should be kept.  If so, I am very fine
> > for current change.  Thanks for sharing the knowledge.
> > 
> > > There are some other Coresight features that don't work as expected in this mode, like
> > > sorting timestamps between CPUs. The aux split patchset won't work either because random
> > > access isn't possible. And the TRBE patch that I'm working on now won't work, because it
> > > also requires the random access to lookup the flags on the AUX record to configure the 
> > > decoder for unformatted trace.
> >
> 
> There is a lot of things happening in this area.  Based on the above should I
> still plan to review this set or should I wait for another revision?

I think you could continue to review this patch set for AUX data splitting.

Since we have concern for the AUX data splitting with snapshot mode,
James and me both have verified the AUX data splitting (this patch
set) with snapshot mode, and the testing result shows this patch set
is reliable.

Regard another patch set for fixing snapshot mode [1], I will send a new
version (drop patch 1/3 and refine for patch 3/3), so you could hold
on for that patch set.

Thanks,
Leo

[1] https://lore.kernel.org/patchwork/cover/1437696/
James Clark June 29, 2021, 8:52 a.m. UTC | #6
On 28/06/2021 21:01, Mathieu Poirier wrote:
> On Mon, Jun 28, 2021 at 08:08:02PM +0800, Leo Yan wrote:
>> On Mon, Jun 28, 2021 at 11:38:34AM +0100, James Clark wrote:
>>
>> [...]
>>
>>>>>  static int cs_etm__process_auxtrace_event(struct perf_session *session,
>>>>>  					  union perf_event *event,
>>>>>  					  struct perf_tool *tool __maybe_unused)
>>>>> @@ -2462,7 +2478,8 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session,
>>>>>  				cs_etm__dump_event(etm, buffer);
>>>>>  				auxtrace_buffer__put_data(buffer);
>>>>>  			}
>>>>> -	}
>>>>> +	} else if (dump_trace)
>>>>> +		dump_queued_data(etm, &event->auxtrace);
>>>>
>>>> IIUC, in the function cs_etm__process_auxtrace_event(), since
>>>> "etm->data_queued" is always true, below flow will never run:
>>>>
>>>>     if (!etm->data_queued) {
>>>>         ......
>>>>
>>>>         if (dump_trace)
>>>>             if (auxtrace_buffer__get_data(buffer, fd)) {
>>>>                     cs_etm__dump_event(etm, buffer);
>>>>                     auxtrace_buffer__put_data(buffer);
>>>>             }
>>>>     }
>>>>
>>>> If so, it's better to use a new patch to polish the code.
>>>>
>>>
>>> Hi Leo,
>>>
>>> I think this is not true in piped mode because there is no auxtrace index.
>>> In that mode, events are processed only in file order and cs_etm__process_auxtrace_event()
>>> is called for each buffer.
>>>
>>> You can reproduce this with something like this:
>>>
>>>      ./perf record -o - ls > stdio.data
>>>      cat stdio.data | ./perf report -i -
>>
>> You are right!  I tried these two commands with cs_etm event, just as
>> you said, in this case, the AUX trace data is not queued; so the flow
>> for "if (!etm->data_queued)" should be kept.  If so, I am very fine
>> for current change.  Thanks for sharing the knowledge.
>>
>>> There are some other Coresight features that don't work as expected in this mode, like
>>> sorting timestamps between CPUs. The aux split patchset won't work either because random
>>> access isn't possible. And the TRBE patch that I'm working on now won't work, because it
>>> also requires the random access to lookup the flags on the AUX record to configure the 
>>> decoder for unformatted trace.
>>
> 
> There is a lot of things happening in this area.  Based on the above should I
> still plan to review this set or should I wait for another revision?

From my point of view, this one is final. It looks like both Leo and I have tested
it with and without his snapshot changes and it's working as expected in both cases.

Thanks
James

> 
> Thanks,
> Mathieu
> 
>> Cool, looking forward for the patches :)
>>
>> Leo
Mathieu Poirier June 29, 2021, 7:08 p.m. UTC | #7
On Tue, 29 Jun 2021 at 02:52, James Clark <james.clark@arm.com> wrote:
>
>
>
> On 28/06/2021 21:01, Mathieu Poirier wrote:
> > On Mon, Jun 28, 2021 at 08:08:02PM +0800, Leo Yan wrote:
> >> On Mon, Jun 28, 2021 at 11:38:34AM +0100, James Clark wrote:
> >>
> >> [...]
> >>
> >>>>>  static int cs_etm__process_auxtrace_event(struct perf_session *session,
> >>>>>                                     union perf_event *event,
> >>>>>                                     struct perf_tool *tool __maybe_unused)
> >>>>> @@ -2462,7 +2478,8 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session,
> >>>>>                           cs_etm__dump_event(etm, buffer);
> >>>>>                           auxtrace_buffer__put_data(buffer);
> >>>>>                   }
> >>>>> - }
> >>>>> + } else if (dump_trace)
> >>>>> +         dump_queued_data(etm, &event->auxtrace);
> >>>>
> >>>> IIUC, in the function cs_etm__process_auxtrace_event(), since
> >>>> "etm->data_queued" is always true, below flow will never run:
> >>>>
> >>>>     if (!etm->data_queued) {
> >>>>         ......
> >>>>
> >>>>         if (dump_trace)
> >>>>             if (auxtrace_buffer__get_data(buffer, fd)) {
> >>>>                     cs_etm__dump_event(etm, buffer);
> >>>>                     auxtrace_buffer__put_data(buffer);
> >>>>             }
> >>>>     }
> >>>>
> >>>> If so, it's better to use a new patch to polish the code.
> >>>>
> >>>
> >>> Hi Leo,
> >>>
> >>> I think this is not true in piped mode because there is no auxtrace index.
> >>> In that mode, events are processed only in file order and cs_etm__process_auxtrace_event()
> >>> is called for each buffer.
> >>>
> >>> You can reproduce this with something like this:
> >>>
> >>>      ./perf record -o - ls > stdio.data
> >>>      cat stdio.data | ./perf report -i -
> >>
> >> You are right!  I tried these two commands with cs_etm event, just as
> >> you said, in this case, the AUX trace data is not queued; so the flow
> >> for "if (!etm->data_queued)" should be kept.  If so, I am very fine
> >> for current change.  Thanks for sharing the knowledge.
> >>
> >>> There are some other Coresight features that don't work as expected in this mode, like
> >>> sorting timestamps between CPUs. The aux split patchset won't work either because random
> >>> access isn't possible. And the TRBE patch that I'm working on now won't work, because it
> >>> also requires the random access to lookup the flags on the AUX record to configure the
> >>> decoder for unformatted trace.
> >>
> >
> > There is a lot of things happening in this area.  Based on the above should I
> > still plan to review this set or should I wait for another revision?
>
> From my point of view, this one is final. It looks like both Leo and I have tested
> it with and without his snapshot changes and it's working as expected in both cases.
>

Very well - I will start working on it once I'm through with the
ultrasoc patchset.

> Thanks
> James
>
> >
> > Thanks,
> > Mathieu
> >
> >> Cool, looking forward for the patches :)
> >>
> >> Leo
Mathieu Poirier July 5, 2021, 7:33 p.m. UTC | #8
On Thu, Jun 24, 2021 at 05:43:03PM +0100, James Clark wrote:
> Currently --dump-raw-trace skips queueing and splitting buffers because
> of an early exit condition in cs_etm__process_auxtrace_info(). Once
> that is removed we can print the split data by using the queues
> and searching for split buffers with the same reference as the
> one that is currently being processed.
> 
> This keeps the same behaviour of dumping in file order when an AUXTRACE
> event appears, rather than moving trace dump to where AUX records are in
> the file.
> 
> There will be a newline and size printout for each fragment. For example
> this buffer is comprised of two AUX records, but was printed as one:
> 
>   0 0 0x8098 [0x30]: PERF_RECORD_AUXTRACE size: 0xa0  offset: 0  ref: 0x491a4dfc52fc0e6e  idx: 0  t
> 
>   . ... CoreSight ETM Trace data: size 160 bytes
>           Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
>           Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>           Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
>           Idx:80; ID:10;  I_ASYNC : Alignment Synchronisation.
>           Idx:92; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>           Idx:97; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFDE2AD3FD76D4;
> 
> But is now printed as two fragments:
> 
>   0 0 0x8098 [0x30]: PERF_RECORD_AUXTRACE size: 0xa0  offset: 0  ref: 0x491a4dfc52fc0e6e  idx: 0  t
> 
>   . ... CoreSight ETM Trace data: size 80 bytes
>           Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
>           Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>           Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
> 
>   . ... CoreSight ETM Trace data: size 80 bytes
>           Idx:80; ID:10;  I_ASYNC : Alignment Synchronisation.
>           Idx:92; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
>           Idx:97; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFDE2AD3FD76D4;
> 
> Decoding errors that appeared in problematic files are now not present,
> for example:
> 
>         Idx:808; ID:1c; I_BAD_SEQUENCE : Invalid Sequence in packet.[I_ASYNC]
>         ...
>         PKTP_ETMV4I_0016 : 0x0014 (OCSD_ERR_INVALID_PCKT_HDR) [Invalid packet header]; TrcIdx=822
> 
> Signed-off-by: James Clark <james.clark@arm.com>
> ---
>  tools/perf/util/cs-etm.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>

> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 88e8122f73c9..ad777c2a342f 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -2430,6 +2430,22 @@ static int cs_etm__process_event(struct perf_session *session,
>  	return 0;
>  }
>  
> +static void dump_queued_data(struct cs_etm_auxtrace *etm,
> +			     struct perf_record_auxtrace *event)
> +{
> +	struct auxtrace_buffer *buf;
> +	unsigned int i;
> +	/*
> +	 * Find all buffers with same reference in the queues and dump them.
> +	 * This is because the queues can contain multiple entries of the same
> +	 * buffer that were split on aux records.
> +	 */
> +	for (i = 0; i < etm->queues.nr_queues; ++i)
> +		list_for_each_entry(buf, &etm->queues.queue_array[i].head, list)
> +			if (buf->reference == event->reference)
> +				cs_etm__dump_event(etm, buf);
> +}
> +
>  static int cs_etm__process_auxtrace_event(struct perf_session *session,
>  					  union perf_event *event,
>  					  struct perf_tool *tool __maybe_unused)
> @@ -2462,7 +2478,8 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session,
>  				cs_etm__dump_event(etm, buffer);
>  				auxtrace_buffer__put_data(buffer);
>  			}
> -	}
> +	} else if (dump_trace)
> +		dump_queued_data(etm, &event->auxtrace);
>  
>  	return 0;
>  }
> @@ -3038,7 +3055,6 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
>  
>  	if (dump_trace) {
>  		cs_etm__print_auxtrace_info(auxtrace_info->priv, num_cpu);
> -		return 0;
>  	}
>  
>  	err = cs_etm__synth_events(etm, session);
> -- 
> 2.28.0
>
Arnaldo Carvalho de Melo July 14, 2021, 5:45 p.m. UTC | #9
Em Mon, Jul 05, 2021 at 01:33:38PM -0600, Mathieu Poirier escreveu:
> On Thu, Jun 24, 2021 at 05:43:03PM +0100, James Clark wrote:
> > for example:
> > 
> >         Idx:808; ID:1c; I_BAD_SEQUENCE : Invalid Sequence in packet.[I_ASYNC]
> >         ...
> >         PKTP_ETMV4I_0016 : 0x0014 (OCSD_ERR_INVALID_PCKT_HDR) [Invalid packet header]; TrcIdx=822
> > 
> > Signed-off-by: James Clark <james.clark@arm.com>
> > ---
> >  tools/perf/util/cs-etm.c | 20 ++++++++++++++++++--
> >  1 file changed, 18 insertions(+), 2 deletions(-)
> > 
> 
> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>



Thanks, applied.

- Arnaldo
Mathieu Poirier July 19, 2021, 4:33 p.m. UTC | #10
Hi Arnaldo,

On Wed, Jul 14, 2021 at 02:45:34PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jul 05, 2021 at 01:33:38PM -0600, Mathieu Poirier escreveu:
> > On Thu, Jun 24, 2021 at 05:43:03PM +0100, James Clark wrote:
> > > for example:
> > > 
> > >         Idx:808; ID:1c; I_BAD_SEQUENCE : Invalid Sequence in packet.[I_ASYNC]
> > >         ...
> > >         PKTP_ETMV4I_0016 : 0x0014 (OCSD_ERR_INVALID_PCKT_HDR) [Invalid packet header]; TrcIdx=822
> > > 
> > > Signed-off-by: James Clark <james.clark@arm.com>
> > > ---
> > >  tools/perf/util/cs-etm.c | 20 ++++++++++++++++++--
> > >  1 file changed, 18 insertions(+), 2 deletions(-)
> > > 
> > 
> > Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> 
> 
> 
> Thanks, applied.

This patch is the second of a two patch series[1][2].  The first one was applied and
is in Linus' current master branch.  But I can't find the second one in
either your core branch or Linus', and that despite your reply above that it has been
applied.  As such I am guessing something went wrong with the process.  

How do you want to proceed - should we resend the second patch[2] or you still
have it somewhere in your Inbox?

Thanks,
Mathieu

[1]. https://lists.linaro.org/pipermail/coresight/2021-June/006575.html
[2]. https://lists.linaro.org/pipermail/coresight/2021-June/006576.html

> 
> - Arnaldo
>
Arnaldo Carvalho de Melo July 19, 2021, 8:10 p.m. UTC | #11
Em Mon, Jul 19, 2021 at 10:33:36AM -0600, Mathieu Poirier escreveu:
> Hi Arnaldo,
> 
> On Wed, Jul 14, 2021 at 02:45:34PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Jul 05, 2021 at 01:33:38PM -0600, Mathieu Poirier escreveu:
> > > On Thu, Jun 24, 2021 at 05:43:03PM +0100, James Clark wrote:
> > > > for example:
> > > > 
> > > >         Idx:808; ID:1c; I_BAD_SEQUENCE : Invalid Sequence in packet.[I_ASYNC]
> > > >         ...
> > > >         PKTP_ETMV4I_0016 : 0x0014 (OCSD_ERR_INVALID_PCKT_HDR) [Invalid packet header]; TrcIdx=822
> > > > 
> > > > Signed-off-by: James Clark <james.clark@arm.com>
> > > > ---
> > > >  tools/perf/util/cs-etm.c | 20 ++++++++++++++++++--
> > > >  1 file changed, 18 insertions(+), 2 deletions(-)
> > > > 
> > > 
> > > Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> > 
> > 
> > 
> > Thanks, applied.
> 
> This patch is the second of a two patch series[1][2].  The first one was applied and
> is in Linus' current master branch.  But I can't find the second one in
> either your core branch or Linus', and that despite your reply above that it has been
> applied.  As such I am guessing something went wrong with the process.  
> 
> How do you want to proceed - should we resend the second patch[2] or you still
> have it somewhere in your Inbox?

My bad, its no in tmp.perf/urgent, after testing together with other
patches will move to perf/urgent and by the end of this week I'll submit
it to Linus, sorry.

- Arnaldo
 
> Thanks,
> Mathieu
> 
> [1]. https://lists.linaro.org/pipermail/coresight/2021-June/006575.html
> [2]. https://lists.linaro.org/pipermail/coresight/2021-June/006576.html
> 
> > 
> > - Arnaldo
> >
Mathieu Poirier July 20, 2021, 3:15 p.m. UTC | #12
On Mon, Jul 19, 2021 at 05:10:34PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jul 19, 2021 at 10:33:36AM -0600, Mathieu Poirier escreveu:
> > Hi Arnaldo,
> > 
> > On Wed, Jul 14, 2021 at 02:45:34PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Mon, Jul 05, 2021 at 01:33:38PM -0600, Mathieu Poirier escreveu:
> > > > On Thu, Jun 24, 2021 at 05:43:03PM +0100, James Clark wrote:
> > > > > for example:
> > > > > 
> > > > >         Idx:808; ID:1c; I_BAD_SEQUENCE : Invalid Sequence in packet.[I_ASYNC]
> > > > >         ...
> > > > >         PKTP_ETMV4I_0016 : 0x0014 (OCSD_ERR_INVALID_PCKT_HDR) [Invalid packet header]; TrcIdx=822
> > > > > 
> > > > > Signed-off-by: James Clark <james.clark@arm.com>
> > > > > ---
> > > > >  tools/perf/util/cs-etm.c | 20 ++++++++++++++++++--
> > > > >  1 file changed, 18 insertions(+), 2 deletions(-)
> > > > > 
> > > > 
> > > > Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> > > 
> > > 
> > > 
> > > Thanks, applied.
> > 
> > This patch is the second of a two patch series[1][2].  The first one was applied and
> > is in Linus' current master branch.  But I can't find the second one in
> > either your core branch or Linus', and that despite your reply above that it has been
> > applied.  As such I am guessing something went wrong with the process.  
> > 
> > How do you want to proceed - should we resend the second patch[2] or you still
> > have it somewhere in your Inbox?
> 
> My bad, its no in tmp.perf/urgent, after testing together with other
> patches will move to perf/urgent and by the end of this week I'll submit
> it to Linus, sorry.
>

Very good - thanks for the follow up.

> - Arnaldo
>  
> > Thanks,
> > Mathieu
> > 
> > [1]. https://lists.linaro.org/pipermail/coresight/2021-June/006575.html
> > [2]. https://lists.linaro.org/pipermail/coresight/2021-June/006576.html
> > 
> > > 
> > > - Arnaldo
> > > 
> 
> -- 
> 
> - Arnaldo
diff mbox series

Patch

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 88e8122f73c9..ad777c2a342f 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -2430,6 +2430,22 @@  static int cs_etm__process_event(struct perf_session *session,
 	return 0;
 }
 
+static void dump_queued_data(struct cs_etm_auxtrace *etm,
+			     struct perf_record_auxtrace *event)
+{
+	struct auxtrace_buffer *buf;
+	unsigned int i;
+	/*
+	 * Find all buffers with same reference in the queues and dump them.
+	 * This is because the queues can contain multiple entries of the same
+	 * buffer that were split on aux records.
+	 */
+	for (i = 0; i < etm->queues.nr_queues; ++i)
+		list_for_each_entry(buf, &etm->queues.queue_array[i].head, list)
+			if (buf->reference == event->reference)
+				cs_etm__dump_event(etm, buf);
+}
+
 static int cs_etm__process_auxtrace_event(struct perf_session *session,
 					  union perf_event *event,
 					  struct perf_tool *tool __maybe_unused)
@@ -2462,7 +2478,8 @@  static int cs_etm__process_auxtrace_event(struct perf_session *session,
 				cs_etm__dump_event(etm, buffer);
 				auxtrace_buffer__put_data(buffer);
 			}
-	}
+	} else if (dump_trace)
+		dump_queued_data(etm, &event->auxtrace);
 
 	return 0;
 }
@@ -3038,7 +3055,6 @@  int cs_etm__process_auxtrace_info(union perf_event *event,
 
 	if (dump_trace) {
 		cs_etm__print_auxtrace_info(auxtrace_info->priv, num_cpu);
-		return 0;
 	}
 
 	err = cs_etm__synth_events(etm, session);