[v1,3/7] perf cs-etm: Calculate per CPU metadata array size

Message ID 20210109074435.626855-4-leo.yan@linaro.org (mailing list archive)
State New, archived
Series coresight: etm-perf: Fix pid tracing with VHE

Commit Message

Leo Yan Jan. 9, 2021, 7:44 a.m. UTC
The metadata array can be extended over time. If the tool uses a
predefined macro (such as CS_ETMV4_PRIV_MAX for ETMv4) as the metadata
array size when copying data, this can cause compatibility issues
between different versions of the perf tool.

For example, suppose a data file is recorded with an old version of the
tool and later parsed with a new version. The metadata array has been
extended in the meantime and the macro CS_ETMV4_PRIV_MAX has changed,
so using it to parse perf data in the old format leads to a mismatch.

To maintain backward compatibility, this patch calculates the per-CPU
metadata array size at runtime. The calculation is based on the info
stored in the data file, so it is reliable.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)
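
The size calculation described in the commit message can be sketched in isolation. This is a minimal illustration, not the perf code: per_cpu_entries() and payload_bytes are hypothetical names, with payload_bytes standing in for the record size minus the fixed record header. Note that it silently assumes every CPU contributes a block of the same length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of perf: derive the number of u64 entries
 * per CPU from the size of the metadata payload found in the data file.
 * Assumption: all CPUs have metadata blocks of identical length.
 */
static size_t per_cpu_entries(size_t payload_bytes, size_t num_cpu)
{
	assert(num_cpu > 0);
	return payload_bytes / num_cpu / sizeof(uint64_t);
}
```

For a payload of 320 bytes covering 4 CPUs this yields 10 entries per CPU, whatever value the PRIV_MAX macro has in the tool that reads the file.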

Comments

Suzuki K Poulose Jan. 11, 2021, 7:28 a.m. UTC | #1
On 1/9/21 7:44 AM, Leo Yan wrote:
> The metadata array can be extended over time and the tool, if using the
> predefined macro (like CS_ETMV4_PRIV_MAX for ETMv4) as metadata array
> size to copy data, it can cause compatible issue within different
> versions of perf tool.
> 
> E.g. we recorded a data file with an old version tool, afterwards if
> use the new version perf tool to parse the file, since the metadata
> array has been extended and the macro CS_ETMV4_PRIV_MAX has been
> altered, if use it to parse the perf data with old format, this will
> lead to mismatch.
> 
> To maintain backward compatibility, this patch calculates per CPU
> metadata array size on the runtime, the calculation is based on the
> info stored in the data file so that it's reliable.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>

Looks good to me.

Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Mike Leach Jan. 11, 2021, 12:09 p.m. UTC | #2
Hi Leo,

I think there is an issue here in that your modification assumes that
all cpus in the system are of the same ETM type. The original routine
allowed for differing ETM types, thus differing cpu ETM field lengths
between ETMv4 / ETMv3, the field size was used after the relevant
magic number for the cpu ETM was read.

You have replaced two different sizes - with a single calculated size.

Moving forwards we are seeing the newer FEAT_ETE protocol drivers
appearing on the list, which will ultimately need a new metadata
structure.

We have had discussions within ARM regarding the changing of the
format to be more self describing - which should probably be opened
out to the CS mailing list.

Regards

Mike


On Mon, 11 Jan 2021 at 07:29, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>
> On 1/9/21 7:44 AM, Leo Yan wrote:
> > The metadata array can be extended over time and the tool, if using the
> > predefined macro (like CS_ETMV4_PRIV_MAX for ETMv4) as metadata array
> > size to copy data, it can cause compatible issue within different
> > versions of perf tool.
> >
> > E.g. we recorded a data file with an old version tool, afterwards if
> > use the new version perf tool to parse the file, since the metadata
> > array has been extended and the macro CS_ETMV4_PRIV_MAX has been
> > altered, if use it to parse the perf data with old format, this will
> > lead to mismatch.
> >
> > To maintain backward compatibility, this patch calculates per CPU
> > metadata array size on the runtime, the calculation is based on the
> > info stored in the data file so that it's reliable.
> >
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
>
> Looks good to me.
>
> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>


--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
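
The mixed ETMv3/ETMv4 concern raised in this reply can be shown with a small sketch. Everything here is illustrative: the magic constants and the block lengths V3_WORDS and V4_WORDS are placeholders chosen for the demo, and both walker functions are hypothetical, not perf routines. One walker uses a single averaged stride, the other picks the stride from each block's magic word as the original routine did.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; the real magics are defined in perf's cs-etm.h. */
#define MAGIC_V3 0x3030303030303030ULL
#define MAGIC_V4 0x4040404040404040ULL

/* Hypothetical block lengths in u64 words, chosen only for this demo. */
#define V3_WORDS 4
#define V4_WORDS 6

/*
 * Walk the buffer with one fixed stride and count the blocks that start
 * on a recognised magic word. Returns -1 at the first misaligned block.
 */
static int walk_fixed_stride(const uint64_t *buf, size_t words, size_t stride)
{
	int blocks = 0;

	assert(stride > 0);
	for (size_t i = 0; i + stride <= words; i += stride) {
		if (buf[i] != MAGIC_V3 && buf[i] != MAGIC_V4)
			return -1;
		blocks++;
	}
	return blocks;
}

/* Walk the buffer choosing the stride from each block's magic word. */
static int walk_per_type(const uint64_t *buf, size_t words)
{
	int blocks = 0;
	size_t i = 0;

	while (i < words) {
		size_t len;

		if (buf[i] == MAGIC_V3)
			len = V3_WORDS;
		else if (buf[i] == MAGIC_V4)
			len = V4_WORDS;
		else
			return -1;
		if (len > words - i)
			return -1;	/* truncated block */
		i += len;
		blocks++;
	}
	return blocks;
}
```

With one 4-word ETMv3 block followed by one 6-word ETMv4 block, the averaged stride is 5 words, so the fixed-stride walk lands inside the second block and fails, while the per-type walk finds both blocks.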
Leo Yan Jan. 11, 2021, 3:06 p.m. UTC | #3
Hi Mike,

On Mon, Jan 11, 2021 at 12:09:12PM +0000, Mike Leach wrote:
> Hi Leo,
> 
> I think there is an issue here in that your modification assumes that
> all cpus in the system are of the same ETM type. The original routine
> allowed for differing ETM types, thus differing cpu ETM field lengths
> between ETMv4 / ETMv3, the field size was used after the relevant
> magic number for the cpu ETM was read.
> 
> You have replaced two different sizes - with a single calculated size.

Thanks for pointing this out.

> Moving forwards we are seeing the newer FEAT_ETE protocol drivers
> appearing on the list, which will ultimately need a new metadata
> structure.
> 
> We have had discussions within ARM regarding the changing of the
> format to be more self describing - which should probably be opened
> out to the CS mailing list.

I think there are two options.  The first is to use
__perf_cs_etmv3_magic/__perf_cs_etmv4_magic as the indicator for the
start of the next metadata array: when copying the metadata, always
check the next item in the buffer, and if it is __perf_cs_etmv3_magic
or __perf_cs_etmv4_magic, break out of the loop and start copying the
metadata array for the next CPU.  The suggested change is pasted below.

The other option is that I drop patches 03/07 and 05/07 from the series
and leave the backward compatibility fix for a separate patch series
that uses the self-describing method.  In particular, if you think the
first option would make it harder to enable the self-describing format
later, then I am happy to drop patches 03 and 05.

What do you think?

Thanks,
Leo

---8<---

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index a2a369e2fbb6..edaec57362f0 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -2558,12 +2558,19 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
 				err = -ENOMEM;
 				goto err_free_metadata;
 			}
-			for (k = 0; k < CS_ETM_PRIV_MAX; k++)
+			for (k = 0; k < CS_ETM_PRIV_MAX; k++) {
 				metadata[j][k] = ptr[i + k];
 
+				if (ptr[i + k + 1] == __perf_cs_etmv3_magic ||
+				    ptr[i + k + 1] == __perf_cs_etmv4_magic) {
+					k++;
+					break;
+				}
+			}
+
 			/* The traceID is our handle */
 			idx = metadata[j][CS_ETM_ETMTRACEIDR];
-			i += CS_ETM_PRIV_MAX;
+			i += k;
 		} else if (ptr[i] == __perf_cs_etmv4_magic) {
 			metadata[j] = zalloc(sizeof(*metadata[j]) *
 					     CS_ETMV4_PRIV_MAX);
@@ -2571,12 +2578,19 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
 				err = -ENOMEM;
 				goto err_free_metadata;
 			}
-			for (k = 0; k < CS_ETMV4_PRIV_MAX; k++)
+			for (k = 0; k < CS_ETMV4_PRIV_MAX; k++) {
 				metadata[j][k] = ptr[i + k];
 
+				if (ptr[i + k + 1] == __perf_cs_etmv3_magic ||
+				    ptr[i + k + 1] == __perf_cs_etmv4_magic) {
+					k++;
+					break;
+				}
+			}
+
 			/* The traceID is our handle */
 			idx = metadata[j][CS_ETMV4_TRCTRACEIDR];
-			i += CS_ETMV4_PRIV_MAX;
+			i += k;
 		}
 
 		/* Get an RB node for this CPU */
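
The magic-delimited copy suggested above can be sketched as a standalone helper. This is a variation under stated assumptions rather than the proposed patch: copy_block() and is_magic() are hypothetical names, the magic constants are illustrative, and the loop is bounded by the remaining buffer length so that the look-ahead never reads past the end of the last CPU's block. A payload word that happens to equal a magic value would still end the block early, which is an inherent weakness of sentinel scanning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; the real magics are defined in perf's cs-etm.h. */
#define MAGIC_V3 0x3030303030303030ULL
#define MAGIC_V4 0x4040404040404040ULL

static int is_magic(uint64_t v)
{
	return v == MAGIC_V3 || v == MAGIC_V4;
}

/*
 * Copy one CPU block that starts at src[0] (which must be a magic word).
 * Scanning stops at the next magic word or at the end of the buffer.
 * At most dst_cap words are stored; surplus words are consumed but dropped.
 * Returns the number of words consumed so the caller can advance its index.
 */
static size_t copy_block(uint64_t *dst, size_t dst_cap,
			 const uint64_t *src, size_t src_words)
{
	size_t k;

	assert(src_words > 0 && is_magic(src[0]));
	for (k = 0; k < src_words; k++) {
		if (k > 0 && is_magic(src[k]))
			break;			/* next CPU's block begins here */
		if (k < dst_cap)
			dst[k] = src[k];
	}
	return k;
}
```

Consuming every word up to the next magic, while storing only as many as the destination can hold, keeps the read index aligned even when a newer file carries more entries than the reading tool knows about.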
Mike Leach Jan. 13, 2021, midnight UTC | #4
Hi Leo,

On Mon, 11 Jan 2021 at 15:06, Leo Yan <leo.yan@linaro.org> wrote:
>
> Hi Mike,
>
> On Mon, Jan 11, 2021 at 12:09:12PM +0000, Mike Leach wrote:
> > Hi Leo,
> >
> > I think there is an issue here in that your modification assumes that
> > all cpus in the system are of the same ETM type. The original routine
> > allowed for differing ETM types, thus differing cpu ETM field lengths
> > between ETMv4 / ETMv3, the field size was used after the relevant
> > magic number for the cpu ETM was read.
> >
> > You have replaced two different sizes - with a single calculated size.
>
> Thanks for pointing out this.
>
> > Moving forwards we are seeing the newer FEAT_ETE protocol drivers
> > appearing on the list, which will ultimately need a new metadata
> > structure.
> >
> > We have had discussions within ARM regarding the changing of the
> > format to be more self describing - which should probably be opened
> > out to the CS mailing list.
>
> I think here have two options.  One option is I think we can use
> __perf_cs_etmv3_magic/__perf_cs_etmv4_magic as indicator for the
> starting of next metadata array; when copy the metadata, always check
> the next item in the buffer, if it's __perf_cs_etmv3_magic or
> __perf_cs_etmv4_magic, will break loop and start copying metadata
> array for next CPU.  The suggested change is pasted in below.
>
> Another option is I drop patches 03,05/07 in the series and leave the
> backward compatibility fixing for a saperate patch series with self
> describing method.  Especially, if you think the first option will
> introduce trouble for enabling self describing later, then I am happy
> to drop patches 03,05.
>
> How about you think for this?
>
> Thanks,
> Leo
>
> ---8<---
>
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index a2a369e2fbb6..edaec57362f0 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -2558,12 +2558,19 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
>                                 err = -ENOMEM;
>                                 goto err_free_metadata;
>                         }
> -                       for (k = 0; k < CS_ETM_PRIV_MAX; k++)
> +                       for (k = 0; k < CS_ETM_PRIV_MAX; k++) {
>                                 metadata[j][k] = ptr[i + k];
>
> +                               if (ptr[i + k + 1] == __perf_cs_etmv3_magic ||
> +                                   ptr[i + k + 1] == __perf_cs_etmv4_magic) {
> +                                       k++;
> +                                       break;
> +                               }
> +                       }
> +
>                         /* The traceID is our handle */
>                         idx = metadata[j][CS_ETM_ETMTRACEIDR];
> -                       i += CS_ETM_PRIV_MAX;
> +                       i += k;
>                 } else if (ptr[i] == __perf_cs_etmv4_magic) {
>                         metadata[j] = zalloc(sizeof(*metadata[j]) *
>                                              CS_ETMV4_PRIV_MAX);
> @@ -2571,12 +2578,19 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
>                                 err = -ENOMEM;
>                                 goto err_free_metadata;
>                         }
> -                       for (k = 0; k < CS_ETMV4_PRIV_MAX; k++)
> +                       for (k = 0; k < CS_ETMV4_PRIV_MAX; k++) {
>                                 metadata[j][k] = ptr[i + k];
>
> +                               if (ptr[i + k + 1] == __perf_cs_etmv3_magic ||
> +                                   ptr[i + k + 1] == __perf_cs_etmv4_magic) {
> +                                       k++;
> +                                       break;
> +                               }
> +                       }
> +
>                         /* The traceID is our handle */
>                         idx = metadata[j][CS_ETMV4_TRCTRACEIDR];
> -                       i += CS_ETMV4_PRIV_MAX;
> +                       i += k;
>                 }
>
>                 /* Get an RB node for this CPU */

That would be a spot fix for the read/copy case, but it will not fix
the print routine, which will still bail out on older versions of the
format (when using perf report --dump).

The "self describing" format I have been looking at will add an
NR_PARAMS value to the common block in the CPU metadata parameter
list, increment the header version to '1' and update the format writer
to use the version 1 format while having the reader understand both v0
and v1 formats.

i.e. in perf's cs-etm.h I add:
/*
 * Update the version for new format.
 *
 * New version 1 format adds a param count to the per cpu metadata.
 * This allows easy adding of new metadata parameters.
 * Requires that new params always added after current ones.
 * Also allows client reader to handle file versions that are different by
 * checking the number of params in the file vs the number expected.
 */
#define CS_HEADER_CURRENT_VERSION 1

/* Beginning of header common to both ETMv3 and V4 */
enum {
    CS_ETM_MAGIC,
    CS_ETM_CPU,
    CS_ETM_NR_PARAMS, /* number of parameters to follow in this block */
};

where in version 1, NR_PARAMS indicates the total number of params
that follow. New parameters can then be added to the metadata enums
and the tool will adjust automatically; it will handle v0 files, plus
older and newer files that have differing numbers of parameters, as
long as parameters are only ever added to the end of the list.

I have been working on a patch for this today, which took a little
longer than expected as it was a little more complex than I anticipated
(the printing routines for the --dump command!).

I will post this tomorrow when tested - and if we agree it works it
could be rolled into your set - it would make adding the PID parameter
easier, and ensure that this new format is available for the upcoming
developments.

Regards


Mike


--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
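
How a reader might use the NR_PARAMS count described in this reply can be sketched as follows. This is only an illustration of the idea, not the patch that was posted later: block_words() and usable_params() are hypothetical helpers, and the v0 layout (magic and CPU followed by a fixed, type-dependent number of parameters) is an assumption made for the example. The enum mirrors the one quoted in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Common block header indices, mirroring the enum quoted in the message. */
enum { CS_ETM_MAGIC, CS_ETM_CPU, CS_ETM_NR_PARAMS };

/*
 * Hypothetical helper: length in u64 words of the CPU block at blk[0].
 * v0_params is the fixed parameter count a v0 file implies for this ETM
 * type; it is ignored for version 1, where the count is read from the file.
 * Returns 0 if the block does not fit in the remaining buffer.
 */
static size_t block_words(const uint64_t *blk, size_t remaining,
			  int version, size_t v0_params)
{
	size_t hdr, params;

	assert(blk != NULL);
	if (version == 0) {
		hdr = CS_ETM_CPU + 1;		/* magic, cpu */
		params = v0_params;
	} else {
		hdr = CS_ETM_NR_PARAMS + 1;	/* magic, cpu, nr_params */
		if (remaining < hdr)
			return 0;
		params = (size_t)blk[CS_ETM_NR_PARAMS];
	}
	if (remaining < hdr || params > remaining - hdr)
		return 0;
	return hdr + params;
}

/* Parameters the reader can interpret: the smaller of file and tool counts. */
static size_t usable_params(size_t in_file, size_t known_to_tool)
{
	return in_file < known_to_tool ? in_file : known_to_tool;
}
```

Because the block length comes from the file, a reader can skip parameters it does not know and tolerate files with fewer parameters than it expects, provided new parameters are only ever appended.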
Leo Yan Jan. 13, 2021, 2:27 a.m. UTC | #5
Hi Mike,

On Wed, Jan 13, 2021 at 12:00:10AM +0000, Mike Leach wrote:

[...]

> > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> > index a2a369e2fbb6..edaec57362f0 100644
> > --- a/tools/perf/util/cs-etm.c
> > +++ b/tools/perf/util/cs-etm.c
> > @@ -2558,12 +2558,19 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
> >                                 err = -ENOMEM;
> >                                 goto err_free_metadata;
> >                         }
> > -                       for (k = 0; k < CS_ETM_PRIV_MAX; k++)
> > +                       for (k = 0; k < CS_ETM_PRIV_MAX; k++) {
> >                                 metadata[j][k] = ptr[i + k];
> >
> > +                               if (ptr[i + k + 1] == __perf_cs_etmv3_magic ||
> > +                                   ptr[i + k + 1] == __perf_cs_etmv4_magic) {
> > +                                       k++;
> > +                                       break;
> > +                               }
> > +                       }
> > +
> >                         /* The traceID is our handle */
> >                         idx = metadata[j][CS_ETM_ETMTRACEIDR];
> > -                       i += CS_ETM_PRIV_MAX;
> > +                       i += k;
> >                 } else if (ptr[i] == __perf_cs_etmv4_magic) {
> >                         metadata[j] = zalloc(sizeof(*metadata[j]) *
> >                                              CS_ETMV4_PRIV_MAX);
> > @@ -2571,12 +2578,19 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
> >                                 err = -ENOMEM;
> >                                 goto err_free_metadata;
> >                         }
> > -                       for (k = 0; k < CS_ETMV4_PRIV_MAX; k++)
> > +                       for (k = 0; k < CS_ETMV4_PRIV_MAX; k++) {
> >                                 metadata[j][k] = ptr[i + k];
> >
> > +                               if (ptr[i + k + 1] == __perf_cs_etmv3_magic ||
> > +                                   ptr[i + k + 1] == __perf_cs_etmv4_magic) {
> > +                                       k++;
> > +                                       break;
> > +                               }
> > +                       }
> > +
> >                         /* The traceID is our handle */
> >                         idx = metadata[j][CS_ETMV4_TRCTRACEIDR];
> > -                       i += CS_ETMV4_PRIV_MAX;
> > +                       i += k;
> >                 }
> >
> >                 /* Get an RB node for this CPU */
> 
> That would be a spot fix for the read /copy case, but will not fix the
> print routine which will still bail out on older versions of the
> format. (when using perf report --dump).
> 
> The "self describing" format I have been looking at will add an
> NR_PARAMS value to the common block in the CPU metadata parameter
> list, increment the header version to '1' and update the format writer
> to use the version 1 format while having the reader understand both v0
> and v1 formats.
> 
> i..e in cs-etm.h perf I add:
> /*
>  * Update the version for new format.
>  *
>  * New version 1 format adds a param count to the per cpu metadata.
>  * This allows easy adding of new metadata parameters.
>  * Requires that new params always added after current ones.
>  * Also allows client reader to handle file versions that are different by
>  * checking the number of params in the file vs the number expected.
>  */
> #define CS_HEADER_CURRENT_VERSION 1
> 
> /* Beginning of header common to both ETMv3 and V4 */
> enum {
>     CS_ETM_MAGIC,
>     CS_ETM_CPU,
>     CS_ETM_NR_PARAMS, /* number of parameters to follow in this block */
> };
> 
> where in verison 1, NR_PARAMS indicates the total number of params
> that follow - so adding new parameters can be added to the metadata
> enums and the tool will automatically adjust, and will handle v0
> files, plus older and newer files that have differing numbers of
> parameters, as long as the parameters are only ever added to the end
> of the list.
> 
> I have been working on a patch for this today, which took a little
> longer than expected as it was a little more complex than expected
> (the printing routines in for the --dump command!).
> 
> I will post this tomorrow when tested - and if we agree it works it
> could be rolled into your set - it would make adding the PID parameter
> easier, and ensure that this new format is available for the upcoming
> developments.

Thanks for the info.  I will look at your patch and see how to fit
with it.

Thanks,
Leo
Mathieu Poirier Jan. 15, 2021, 10:46 p.m. UTC | #6
On Mon, Jan 11, 2021 at 12:09:12PM +0000, Mike Leach wrote:
> Hi Leo,
> 
> I think there is an issue here in that your modification assumes that
> all cpus in the system are of the same ETM type. The original routine
> allowed for differing ETM types, thus differing cpu ETM field lengths
> between ETMv4 / ETMv3, the field size was used after the relevant
> magic number for the cpu ETM was read.
> 
> You have replaced two different sizes - with a single calculated size.

I usually go through an entire patchset before looking at the comments people
have made.  In this case Mike and I are coming to the exact same conclusion.

I will look at Mike's patch on Monday.

> 
> Moving forwards we are seeing the newer FEAT_ETE protocol drivers
> appearing on the list, which will ultimately need a new metadata
> structure.
> 
> We have had discussions within ARM regarding the changing of the
> format to be more self describing - which should probably be opened
> out to the CS mailing list.
> 
> Regards
> 
> Mike
> 
> 
> On Mon, 11 Jan 2021 at 07:29, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> >
> > On 1/9/21 7:44 AM, Leo Yan wrote:
> > > The metadata array can be extended over time and the tool, if using the
> > > predefined macro (like CS_ETMV4_PRIV_MAX for ETMv4) as metadata array
> > > size to copy data, it can cause compatible issue within different
> > > versions of perf tool.
> > >
> > > E.g. we recorded a data file with an old version tool, afterwards if
> > > use the new version perf tool to parse the file, since the metadata
> > > array has been extended and the macro CS_ETMV4_PRIV_MAX has been
> > > altered, if use it to parse the perf data with old format, this will
> > > lead to mismatch.
> > >
> > > To maintain backward compatibility, this patch calculates per CPU
> > > metadata array size on the runtime, the calculation is based on the
> > > info stored in the data file so that it's reliable.
> > >
> > > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> >
> > Looks good to me.
> >
> > Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> >
> 
> 
> --
> Mike Leach
> Principal Engineer, ARM Ltd.
> Manchester Design Centre. UK
Leo Yan Jan. 16, 2021, 12:50 a.m. UTC | #7
Hi Mathieu,

On Fri, Jan 15, 2021 at 03:46:58PM -0700, Mathieu Poirier wrote:
> On Mon, Jan 11, 2021 at 12:09:12PM +0000, Mike Leach wrote:
> > Hi Leo,
> > 
> > I think there is an issue here in that your modification assumes that
> > all cpus in the system are of the same ETM type. The original routine
> > allowed for differing ETM types, thus differing cpu ETM field lengths
> > between ETMv4 / ETMv3, the field size was used after the relevant
> > magic number for the cpu ETM was read.
> > 
> > You have replaced two different sizes - with a single calculated size.
> 
> I usually go through an entire patchset before looking at the comments people
> have made.  In this case Mike and I are coming to the exact same conclusion.

Agreed. This work now depends on Mike's patch for extending the
metadata version; without Mike's patch, it will cause a compatibility
issue.

> I will look at Mike's patch on Monday.

Cool!

Thanks for review,
Leo

Patch

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index a2a369e2fbb6..5e284725dceb 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -2497,6 +2497,7 @@  int cs_etm__process_auxtrace_info(union perf_event *event,
 	int i, j, k;
 	u64 *ptr, *hdr = NULL;
 	u64 **metadata = NULL;
+	int metadata_cpu_array_size;
 
 	/*
 	 * sizeof(auxtrace_info_event::type) +
@@ -2544,6 +2545,19 @@  int cs_etm__process_auxtrace_info(union perf_event *event,
 		goto err_free_traceid_list;
 	}
 
+	/*
+	 * The metadata is a two dimensional array, the first dimension uses CPU
+	 * number as index and the second dimension is the metadata array per
+	 * CPU.  Since the metadata array can be extended over time, the
+	 * predefined macros (CS_ETM_PRIV_MAX or CS_ETMV4_PRIV_MAX) might
+	 * mismatch within different versions of tool, this can lead to copy
+	 * wrong data.  To maintain backward compatibility, calculate CPU's
+	 * metadata array size on the runtime.
+	 */
+	metadata_cpu_array_size =
+		(auxtrace_info->header.size -
+		 sizeof(struct perf_record_auxtrace_info)) / num_cpu / sizeof(u64);
+
 	/*
 	 * The metadata is stored in the auxtrace_info section and encodes
 	 * the configuration of the ARM embedded trace macrocell which is
@@ -2558,12 +2572,12 @@  int cs_etm__process_auxtrace_info(union perf_event *event,
 				err = -ENOMEM;
 				goto err_free_metadata;
 			}
-			for (k = 0; k < CS_ETM_PRIV_MAX; k++)
+			for (k = 0; k < metadata_cpu_array_size; k++)
 				metadata[j][k] = ptr[i + k];
 
 			/* The traceID is our handle */
 			idx = metadata[j][CS_ETM_ETMTRACEIDR];
-			i += CS_ETM_PRIV_MAX;
+			i += metadata_cpu_array_size;
 		} else if (ptr[i] == __perf_cs_etmv4_magic) {
 			metadata[j] = zalloc(sizeof(*metadata[j]) *
 					     CS_ETMV4_PRIV_MAX);
@@ -2571,12 +2585,12 @@  int cs_etm__process_auxtrace_info(union perf_event *event,
 				err = -ENOMEM;
 				goto err_free_metadata;
 			}
-			for (k = 0; k < CS_ETMV4_PRIV_MAX; k++)
+			for (k = 0; k < metadata_cpu_array_size; k++)
 				metadata[j][k] = ptr[i + k];
 
 			/* The traceID is our handle */
 			idx = metadata[j][CS_ETMV4_TRCTRACEIDR];
-			i += CS_ETMV4_PRIV_MAX;
+			i += metadata_cpu_array_size;
 		}
 
 		/* Get an RB node for this CPU */