diff mbox series

[v3,4/5] trace-cmd split: Enable support for buffer selection

Message ID 20240123134215.385415-5-pierre.gondois@arm.com (mailing list archive)
State Superseded
Headers show
Series trace-cmd: split: Handle splitting files with multiple instances | expand

Commit Message

Pierre Gondois Jan. 23, 2024, 1:42 p.m. UTC
The 'trace-cmd split' command conserves all buffers/instances
of the input .dat file. Add support to select the instances to
keep and to place as the top instance in an output .dat file.

Multiple .dat files can be generated with different combinations
of the instances to keep. The first instance in the command line
is selected as the top instance in an output .dat file.

For example, with a trace recorded with:
  $ trace-cmd record -e sched_wakeup -B switch_instance \
    -e sched_switch -B timer_instance -e timer

Creating a test.dat file containing the top instance and
the switch_instance:
  $ trace-cmd split --top -B switch_instance -o test.dat

Creating a test.dat file containing the switch_instance as
the top instance, and the initial top instance as an instance
named 'old_top':
  $ trace-cmd split -B switch_instance --top=old_top -o test.dat

Splitting all instances in different .dat files:
  $ trace-cmd split --top -o top.dat -B switch_instance \
    -o switch.dat -B timer_instance -o timer.dat

To achieve this, new 'struct name_list', 'struct output_file_list'
structures are created, with associated functions.

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
---
 tracecmd/trace-split.c | 307 +++++++++++++++++++++++++++++++++++------
 1 file changed, 262 insertions(+), 45 deletions(-)

Comments

Steven Rostedt Jan. 24, 2024, 10:37 p.m. UTC | #1
[ Replying to the correct version of the patch I tested ]

On Tue, 23 Jan 2024 14:42:14 +0100
Pierre Gondois <pierre.gondois@arm.com> wrote:

> The 'trace-cmd split' command conserves all buffers/instances
> of the input .dat file. Add support to select the instances to
> keep and to place as the top instance in an output .dat file.
> 
> Multiple .dat files can be generated with different combinations
> of the instances to keep. The first instance in the command line
> is selected as the top instance in an output .dat file.
> 
> For example, with a trace recorded with:
>   $ trace-cmd record -e sched_wakeup -B switch_instance \
>     -e sched_switch -B timer_instance -e timer

I tried this with a trace.dat file that has a instance called "tast" and it
didn't work. Note, the top instance had no data.

  $ trace-cmd split -i /work/traces/trace-tast.dat -o /tmp/trace-tast2.dat -B tast

  $ trace-cmd report /tmp/trace-tast2.dat
cpus=2
tast:        trace-cmd-3559  [001] 59413.154884: sched_waking:         comm=kworker/u4:2 pid=1324 prio=120 target_cpu=000
tast:        trace-cmd-3564  [000] 59413.154900: sched_switch:         trace-cmd:3564 [120] R ==> kworker/u4:2:1324 [120]
tast:        trace-cmd-3559  [001] 59413.154907: sched_switch:         trace-cmd:3559 [120] S ==> trace-cmd:3565 [120]
tast:     kworker/u4:2-1324  [000] 59413.154911: sched_waking:         comm=sshd pid=18725 prio=120 target_cpu=001
tast:     kworker/u4:2-1324  [000] 59413.154929: sched_switch:         kworker/u4:2:1324 [120] I ==> sshd:18725 [120]
tast:             sshd-18725 [000] 59413.155120: sched_waking:         comm=sslh-fork pid=18724 prio=120 target_cpu=001
tast:             sshd-18725 [000] 59413.155178: sched_waking:         comm=kworker/0:1 pid=815 prio=120 target_cpu=000
tast:             sshd-18725 [000] 59413.155189: sched_switch:         sshd:18725 [120] S ==> sslh-fork:18724 [120]
tast:        sslh-fork-18724 [000] 59413.155303: sched_switch:         sslh-fork:18724 [120] S ==> kworker/0:1:815 [120]

Still has "tast" as an instance and not the top.

-- Steve

> 
> Creating a test.dat file containing the top instance and
> the switch_instance:
>   $ trace-cmd split --top -B switch_instance -o test.dat
> 
> Creating a test.dat file containing the switch_instance as
> the top instance, and the initial top instance as an instance
> named 'old_top':
>   $ trace-cmd split -B switch_instance --top=old_top -o test.dat
> 
> Splitting all instances in different .dat files:
>   $ trace-cmd split --top -o top.dat -B switch_instance \
>     -o switch.dat -B timer_instance -o timer.dat
> 
> To achieve this, new 'struct name_list', 'struct output_file_list'
> structures are created, with associated functions.
> 
> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
> ---
Pierre Gondois Jan. 25, 2024, 3:16 p.m. UTC | #2
Hello Steven,

On 1/24/24 23:37, Steven Rostedt wrote:
> 
> [ Replying to the correct version of the patch I tested ]
> 
> On Tue, 23 Jan 2024 14:42:14 +0100
> Pierre Gondois <pierre.gondois@arm.com> wrote:
> 
>> The 'trace-cmd split' command conserves all buffers/instances
>> of the input .dat file. Add support to select the instances to
>> keep and to place as the top instance in an output .dat file.
>>
>> Multiple .dat files can be generated with different combinations
>> of the instances to keep. The first instance in the command line
>> is selected as the top instance in an output .dat file.
>>
>> For example, with a trace recorded with:
>>    $ trace-cmd record -e sched_wakeup -B switch_instance \
>>      -e sched_switch -B timer_instance -e timer
> 
> I tried this with a trace.dat file that has a instance called "tast" and it
> didn't work. Note, the top instance had no data.
> 
>    $ trace-cmd split -i /work/traces/trace-tast.dat -o /tmp/trace-tast2.dat -B tast

With this command, the expected behaviour should be:
trace-tast2.dat:
- Contains all the instances of the input trace-tast.dat file
trace-tast.dat.1:
- Contains the tast instance only, as the top instance

If the 2 files don't match the above description, there is an issue.

---

Otherwise, I believe the command line should be parsed as:
1-
Select all the desired instances with -B or --top,
the first instance is placed as the top instance.
2-
When parsing a '-o', select all the previously selected instances and
put them in the output file.
3-
Start all over again from 1-. If there is no output file described
at the end of the command line, place the result in [input_file].dat.X

So placing the output file at the end of the command line is important
here. If you think the command line should be parsed another way, please
let me know,

> 
>    $ trace-cmd report /tmp/trace-tast2.dat
> cpus=2
> tast:        trace-cmd-3559  [001] 59413.154884: sched_waking:         comm=kworker/u4:2 pid=1324 prio=120 target_cpu=000
> tast:        trace-cmd-3564  [000] 59413.154900: sched_switch:         trace-cmd:3564 [120] R ==> kworker/u4:2:1324 [120]
> tast:        trace-cmd-3559  [001] 59413.154907: sched_switch:         trace-cmd:3559 [120] S ==> trace-cmd:3565 [120]
> tast:     kworker/u4:2-1324  [000] 59413.154911: sched_waking:         comm=sshd pid=18725 prio=120 target_cpu=001
> tast:     kworker/u4:2-1324  [000] 59413.154929: sched_switch:         kworker/u4:2:1324 [120] I ==> sshd:18725 [120]
> tast:             sshd-18725 [000] 59413.155120: sched_waking:         comm=sslh-fork pid=18724 prio=120 target_cpu=001
> tast:             sshd-18725 [000] 59413.155178: sched_waking:         comm=kworker/0:1 pid=815 prio=120 target_cpu=000
> tast:             sshd-18725 [000] 59413.155189: sched_switch:         sshd:18725 [120] S ==> sslh-fork:18724 [120]
> tast:        sslh-fork-18724 [000] 59413.155303: sched_switch:         sslh-fork:18724 [120] S ==> kworker/0:1:815 [120]
> 
> Still has "tast" as an instance and not the top.> 
> -- Steve
> 
>>
>> Creating a test.dat file containing the top instance and
>> the switch_instance:
>>    $ trace-cmd split --top -B switch_instance -o test.dat
>>
>> Creating a test.dat file containing the switch_instance as
>> the top instance, and the initial top instance as an instance
>> named 'old_top':
>>    $ trace-cmd split -B switch_instance --top=old_top -o test.dat
>>
>> Splitting all instances in different .dat files:
>>    $ trace-cmd split --top -o top.dat -B switch_instance \
>>      -o switch.dat -B timer_instance -o timer.dat
>>
>> To achieve this, new 'struct name_list', 'struct output_file_list'
>> structures are created, with associated functions.
>>
>> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
>> ---

Regards,
Pierre
Steven Rostedt Jan. 25, 2024, 4:28 p.m. UTC | #3
On Thu, 25 Jan 2024 16:16:42 +0100
Pierre Gondois <pierre.gondois@arm.com> wrote:

> >> For example, with a trace recorded with:
> >>    $ trace-cmd record -e sched_wakeup -B switch_instance \
> >>      -e sched_switch -B timer_instance -e timer  
> > 
> > I tried this with a trace.dat file that has a instance called "tast" and it
> > didn't work. Note, the top instance had no data.
> > 
> >    $ trace-cmd split -i /work/traces/trace-tast.dat -o /tmp/trace-tast2.dat -B tast  
> 
> With this command, the expected behaviour should be:
> trace-tast2.dat:
> - Contains all the instances of the input trace-tast.dat file

Wait what? -i should *not* be used for output if -o is specified!

Now looking at it, I have a bunch of unwanted files in the /work/traces/
directory. That directory is only for traces that I took from test boxes
that I was debugging. All "split" output was suppose to go into /tmp!

> trace-tast.dat.1:
> - Contains the tast instance only, as the top instance

No, that makes no sense to me. The above is me saying:

 "I want to split out the 'tast' instance and make it the top level
 instance in trace-tast2.dat."

-i just mean "input source", and is only used if -o is not specified.

From the man page of 'trace-cmd split'

 -i file
       If this option is not specified, then the split command will look
       for the file named trace.dat. This options will allow the reading of
       another file other than trace.dat.

  -o file
       By default, the split command will use the input file name as a
       basis of where to write the split files. The output file will be the
       input file with an attached '.#\' to the end: trace.dat.1,
       trace.dat.2, etc.

           This option will change the name of the base file used.

           -o file  will create file.1, file.2, etc.


> 
> If the 2 files don't match the above description, there is an issue.

No, it shouldn't do that. It did match your description, but I never
noticed because it was writing files into my "input" directory, which I was
never looking at. :-(

Using your method means we can't use -i for just an input source.

> 
> ---
> 
> Otherwise, I believe the command line should be parsed as:
> 1-
> Select all the desired instances with -B or --top,
> the first instance is placed as the top instance.
> 2-
> When parsing a '-o', select all the previously selected instances and
> put them in the output file.



> 3-
> Start all over again from 1-. If there is no output file described
> at the end of the command line, place the result in [input_file].dat.X

No. input_file should only be used if no output file is specified. If it
had done that before your patches, it was a bug and needs to be fixed.

> 
> So placing the output file at the end of the command line is important
> here. If you think the command line should be parsed another way, please
> let me know,

Now, placement can be important, and perhaps we can use the input_file as
default as if the output file is specified later. I was thinking of doing
this as a separate patch, but if we do allow for multiple output files,
then we can have the commands to do for those files after that.

That is if we have:

  trace-cmd split -i trace.dat --top -o trace-foo.dat -B foo -o trace-bar.dat -B bar

It could then split out the top instance by itself in trace.dat.1, have
trace-foo.dat have the 'foo' instance as its main instance, and
trace-bar.dat have 'bar' instance as its main instance. So order would matter then.

But if nothing is between -i and -o, then nothing should be done against
-i. That is, if the above was:

  trace-cmd split -i trace.dat -o trace-foo.dat -B foo -o trace-bar.dat -B bar

(removed --top) then it would not create the trace.dat.1 file.

And we need to update the man pages and possibly the usage to explicitly
state how order matters.

-- Steve
Steven Rostedt Jan. 25, 2024, 4:51 p.m. UTC | #4
On Thu, 25 Jan 2024 11:28:34 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Thu, 25 Jan 2024 16:16:42 +0100
> Pierre Gondois <pierre.gondois@arm.com> wrote:
> 
> > >> For example, with a trace recorded with:
> > >>    $ trace-cmd record -e sched_wakeup -B switch_instance \
> > >>      -e sched_switch -B timer_instance -e timer    
> > > 
> > > I tried this with a trace.dat file that has a instance called "tast" and it
> > > didn't work. Note, the top instance had no data.
> > > 
> > >    $ trace-cmd split -i /work/traces/trace-tast.dat -o /tmp/trace-tast2.dat -B tast    
> > 
> > With this command, the expected behaviour should be:
> > trace-tast2.dat:
> > - Contains all the instances of the input trace-tast.dat file  
> 
> Wait what? -i should *not* be used for output if -o is specified!

OK, I see what you did.

As if I wrote:

trace-cmd split -i /work/traces/trace-tast.dat  -B tast -o /tmp/trace-tast2.dat

I got what I want.

But once -o is used, it should repeat for the -B. the -B should *not* use
the input file unless -o is not specified.

  trace-cmd split -i /work/traces/trace-tast.dat -o /tmp/trace-tast2.dat -B tast

What the above should have done the same. But if we added:

  trace-cmd split -i /work/traces/trace-tast.dat --top -o /tmp/trace-tast2.dat -B tast

Then /tmp/trace-tast2.dat would just have the top instance, and it should create:

 /tmp/trace-tast2.dat.tast

to hold the tast image.

Now if I had had:

  trace-cmd split -i /work/traces/trace-tast.dat --top  -B tast -o /tmp/trace-tast2.dat

Then because there was a command "--top" before the -B but it had no -o
assigned for it, then the input_file would be considered the output file
and that should act like you described.

-- Steve
Steven Rostedt Jan. 25, 2024, 5:10 p.m. UTC | #5
On Thu, 25 Jan 2024 11:51:47 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> Now if I had had:
> 
>   trace-cmd split -i /work/traces/trace-tast.dat --top  -B tast -o /tmp/trace-tast2.dat
> 
> Then because there was a command "--top" before the -B but it had no -o
> assigned for it, then the input_file would be considered the output file
> and that should act like you described.

Of course then things get confusing if we were to have:

trace-cmd split -i /work/traces/trace-tast.dat  -B foo -o /tmp/trace-foo.dat -B bar

We could specify that each -B will just use the top level by default. So
the above would create:

 /tmp/trace-foo.dat

With the instance "foo"

But the "bar" would be in:

 /work/traces/trace-tast.dat.bar

because the top level didn't specify a -o.

So to make it more specific. Each -B will default to the toplevel output
with ".<instance>" appended to it.

That is:

  split <top-level-commands> -B instance1 <instance1-commands> -B <instance2-commands>

If a -o is specified in the <top-level-commands> it becomes the default top
level output file.  If a -o is not specified in any of the
<instance*-commands> then, it will default to the top level output file
with ".<instance-name>" attached to it unless it has its own -o specified.

For multi-file splits, it will append ".0001", ".0002", etc to that file.

-- Steve
Pierre Gondois Jan. 26, 2024, 8:54 a.m. UTC | #6
Hello Steven,

On 1/25/24 18:10, Steven Rostedt wrote:
> On Thu, 25 Jan 2024 11:51:47 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
>> Now if I had had:
>>
>>    trace-cmd split -i /work/traces/trace-tast.dat --top  -B tast -o /tmp/trace-tast2.dat
>>
>> Then because there was a command "--top" before the -B but it had no -o
>> assigned for it, then the input_file would be considered the output file
>> and that should act like you described.
> 
> Of course then things get confusing if we were to have:
> 
> trace-cmd split -i /work/traces/trace-tast.dat  -B foo -o /tmp/trace-foo.dat -B bar
> 
> We could specify that each -B will just use the top level by default. So
> the above would create:
> 
>   /tmp/trace-foo.dat
> 
> With the instance "foo"
> 
> But the "bar" would be in:
> 
>   /work/traces/trace-tast.dat.bar
> 
> because the top level didn't specify a -o.
> 
> So to make it more specific. Each -B will default to the toplevel output
> with ".<instance>" appended to it.
> 
> That is:
> 
>    split <top-level-commands> -B instance1 <instance1-commands> -B <instance2-commands>
> 
> If a -o is specified in the <top-level-commands> it becomes the default top
> level output file.  If a -o is not specified in any of the
> <instance*-commands> then, it will default to the top level output file
> with ".<instance-name>" attached to it unless it has its own -o specified.

I also wanted to handle the case where multiple instances could be placed
in an output file. Meaning that with the patches:
- -B/--top options are parsed to select the instances to keep,
- a -o option ends the parsing of instances and place them in the given output
   file. If no -o option is parsed, then the default output file or the input
   file is used as a base name for the last generated output file (i.e. trace.dat.1
   if no input file is specified)

---

For example, with a trace recorded with:
	$ trace-cmd record -e sched_wakeup -B switch_instance -e sched_switch -B timer_instance -e timer

Creating a test.dat file containing the top instance and
the switch_instance:
	$ trace-cmd split --top -B switch_instance -o test.dat

Creating a test.dat file containing the switch_instance as
the top instance, and the initial top instance as an instance
named 'old_top':
	$ trace-cmd split -B switch_instance --top=old_top -o test.dat

Splitting all instances in different .dat files:
	$ trace-cmd split --top -o top.dat -B switch_instance -o switch.dat -B timer_instance -o timer.dat

And by default, if no -B/--top is specified for an output file,
keep all the instances (but of course all the other options provided
to the split command are applied, i.e. start/end timings):
Keep all the instances and place them in test.dat:
	$ trace-cmd split -o test.dat
Keep all the instances and place them in the default file trace.dat.1:
	$ trace-cmd split

---

Maybe the list of generated files should be printed to avoid what happened
to you, i.e. generating files that were not expected.
I think that having a default name with a suffix being the top instance
is a good idea, but I don't think it would be possible to parse the command
line to have 2 instances in one file in such case.

> 
> For multi-file splits, it will append ".0001", ".0002", etc to that file.
> 

Regards,
Pierre
Steven Rostedt Feb. 2, 2024, 2:17 a.m. UTC | #7
On Fri, 26 Jan 2024 09:54:53 +0100
Pierre Gondois <pierre.gondois@arm.com> wrote:

> Hello Steven,
> 

Hi Pierre,

Sorry for the late reply. I've been putting out fires (you can read LWN or
The Register to see what exactly I was doing).

> On 1/25/24 18:10, Steven Rostedt wrote:
> > On Thu, 25 Jan 2024 11:51:47 -0500
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> >   
> >> Now if I had had:
> >>
> >>    trace-cmd split -i /work/traces/trace-tast.dat --top  -B tast -o /tmp/trace-tast2.dat
> >>
> >> Then because there was a command "--top" before the -B but it had no -o
> >> assigned for it, then the input_file would be considered the output file
> >> and that should act like you described.  
> > 
> > Of course then things get confusing if we were to have:
> > 
> > trace-cmd split -i /work/traces/trace-tast.dat  -B foo -o /tmp/trace-foo.dat -B bar
> > 
> > We could specify that each -B will just use the top level by default. So
> > the above would create:
> > 
> >   /tmp/trace-foo.dat
> > 
> > With the instance "foo"
> > 
> > But the "bar" would be in:
> > 
> >   /work/traces/trace-tast.dat.bar
> > 
> > because the top level didn't specify a -o.
> > 
> > So to make it more specific. Each -B will default to the toplevel output
> > with ".<instance>" appended to it.
> > 
> > That is:
> > 
> >    split <top-level-commands> -B instance1 <instance1-commands> -B <instance2-commands>
> > 
> > If a -o is specified in the <top-level-commands> it becomes the default top
> > level output file.  If a -o is not specified in any of the
> > <instance*-commands> then, it will default to the top level output file
> > with ".<instance-name>" attached to it unless it has its own -o specified.  
> 
> I also wanted to handle the case where multiple instances could be placed
> in an output file. Meaning that with the patches:
> - -B/--top options are parsed to select the instances to keep,
> - a -o option ends the parsing of instances and place them in the given output
>    file. If no -o option is parsed, then the default output file or the input
>    file is used as a base name for the last generated output file (i.e. trace.dat.1
>    if no input file is specified)

OK, I see what you want. You want to be able to have multiple buffers in a
single file.

> 
> ---
> 
> For example, with a trace recorded with:
> 	$ trace-cmd record -e sched_wakeup -B switch_instance -e sched_switch -B timer_instance -e timer
> 
> Creating a test.dat file containing the top instance and
> the switch_instance:
> 	$ trace-cmd split --top -B switch_instance -o test.dat
> 
> Creating a test.dat file containing the switch_instance as
> the top instance, and the initial top instance as an instance
> named 'old_top':
> 	$ trace-cmd split -B switch_instance --top=old_top -o test.dat
> 
> Splitting all instances in different .dat files:
> 	$ trace-cmd split --top -o top.dat -B switch_instance -o switch.dat -B timer_instance -o timer.dat
> 
> And by default, if no -B/--top is specified for an output file,
> keep all the instances (but of course all the other options provided
> to the split command are applied, i.e. start/end timings):
> Keep all the instances and place them in test.dat:
> 	$ trace-cmd split -o test.dat
> Keep all the instances and place them in the default file trace.dat.1:
> 	$ trace-cmd split
> 
> ---
> 
> Maybe the list of generated files should be printed to avoid what happened
> to you, i.e. generating files that were not expected.
> I think that having a default name with a suffix being the top instance
> is a good idea, but I don't think it would be possible to parse the command
> line to have 2 instances in one file in such case.
> 
> > 
> > For multi-file splits, it will append ".0001", ".0002", etc to that file.

So what I'm thinking is that the -B buffers default to the same buffer as
the top buffer unless specified differently.

Here's what I think would work. We add four new options.

 --top
 -B <instance>
 -t
 -b

Where --top represents the "top" instance.

-B represents the <instance>

-t can be used after -B to make it a top instance.

-b can be used before any -B to make the top instance into its own instance.

Rules:

  --top can not come after -B.
  -t must come after -B
  -b must come before -B

There can only be one top instance, and all named instances must be unique.
That is, you can't have two instances called "foo". 

We have: trace-cmd split <TOP_COMMANDS> <INSTANCE_COMMANDS>

  TOP_COMMANDS :: nil | TOP_PARAMS

  INSTANCE_COMMANDS :: nil | -B name INSTANCE_PARAMS INSTANCE_COMMANDS

  TOP_PARAMS :: nil | TOP_OPTIONS TOP_PARAMS

  TOP_OPTIONS :: --top | -b name | -o file

  INSTANCE_PARAMS :: nil | INSTANCE_OPTIONS INSTANCE_PARAMS

  INSTANCE_OPTIONS :: nil | -t | -o file

Examples:

 trace-cmd split --top -B foo

Will make a trace.dat.1 that has top and foo in it.

 trace-cmd split --top

Will make a trace.dat.1 that only has the top instance

 trace-cmd split -B foo

Will make a trace.dat.1 that only has foo in it.

 trace-cmd split -B foo -t

Will make a trace.dat.1 that only has foo in it. It will also promote the
instance foo to the top instance (it will lose its "foo" name).

 trace-cmd split --top -o buba.dat -B foo

Will create a buba.dat that has both top and foo in it

 trace-cmd split --top -B foo -o buba.dat

Will create a trace-dat.1 with just top in it, and buba.dat with just foo
in it.

 trace-cmd split --top -B foo -o buba.dat -t

Will create a trace-dat.1 with just top in it, and buba.dat with just foo
in it (promoting foo to top instance)

 trace-cmd split --top -b foo -B foo -t

Will make a trace.dat.1 file where the top instance becomes "foo" and the
foo instance becomes the top instance.

  trace-cmd split -o buba.dat -B foo -t

Will create just a buba.dat file with only foo in it, promoting it as the
top instance.

  trace-cmd split -o buba.dat --top -B foo -o fooba.dat

Will create a buba.dat with just the top instance in it (note --top can
come before or after -o and that does not make any difference, but the
order of -B and -o does matter). It will also create a fooba.dat file with
just foo in it.

  trace-cmd split --top -B foo -t -o fooba.dat -B bar

This will create a trace.dat.1 with the top instance and the bar instance,
but also make a fooba.dat file with just "foo", and promoting it as the top
instance.

  trace-cmd split --top -o buba.dat -B foo -o foobar.dat -B bar -o foobar.dat

This will make two files. One with buba.dat that holds the top instance,
and foobar.dat that holds both the foo and bar instances.

  trace-cmd split --top -o buba.dat -B foo -o foobar.dat -B bar -t -o foobar.dat

This will make two files. One with buba.dat that holds just the top
instance and a foobar.dat file that has both foo and bar, but bar gets
promoted to the top instance.

Does that make sense?

-- Steve
Pierre Gondois Feb. 5, 2024, 1:38 p.m. UTC | #8
Hello Steven,

On 2/2/24 03:17, Steven Rostedt wrote:
> On Fri, 26 Jan 2024 09:54:53 +0100
> Pierre Gondois <pierre.gondois@arm.com> wrote:
> 
>> Hello Steven,
>>
> 
> Hi Pierre,
> 
> Sorry for the late reply. I've been putting out fires (you can read LWN or
> The Register to see what exactly I was doing).

Ok yes, no worries

> 
>> On 1/25/24 18:10, Steven Rostedt wrote:
>>> On Thu, 25 Jan 2024 11:51:47 -0500
>>> Steven Rostedt <rostedt@goodmis.org> wrote:
>>>    
>>>> Now if I had had:
>>>>
>>>>     trace-cmd split -i /work/traces/trace-tast.dat --top  -B tast -o /tmp/trace-tast2.dat
>>>>
>>>> Then because there was a command "--top" before the -B but it had no -o
>>>> assigned for it, then the input_file would be considered the output file
>>>> and that should act like you described.
>>>
>>> Of course then things get confusing if we were to have:
>>>
>>> trace-cmd split -i /work/traces/trace-tast.dat  -B foo -o /tmp/trace-foo.dat -B bar
>>>
>>> We could specify that each -B will just use the top level by default. So
>>> the above would create:
>>>
>>>    /tmp/trace-foo.dat
>>>
>>> With the instance "foo"
>>>
>>> But the "bar" would be in:
>>>
>>>    /work/traces/trace-tast.dat.bar
>>>
>>> because the top level didn't specify a -o.
>>>
>>> So to make it more specific. Each -B will default to the toplevel output
>>> with ".<instance>" appended to it.
>>>
>>> That is:
>>>
>>>     split <top-level-commands> -B instance1 <instance1-commands> -B <instance2-commands>
>>>
>>> If a -o is specified in the <top-level-commands> it becomes the default top
>>> level output file.  If a -o is not specified in any of the
>>> <instance*-commands> then, it will default to the top level output file
>>> with ".<instance-name>" attached to it unless it has its own -o specified.
>>
>> I also wanted to handle the case where multiple instances could be placed
>> in an output file. Meaning that with the patches:
>> - -B/--top options are parsed to select the instances to keep,
>> - a -o option ends the parsing of instances and place them in the given output
>>     file. If no -o option is parsed, then the default output file or the input
>>     file is used as a base name for the last generated output file (i.e. trace.dat.1
>>     if no input file is specified)
> 
> OK, I see what you want. You want to be able to have multiple buffers in a
> single file.
> 
>>
>> ---
>>
>> For example, with a trace recorded with:
>> 	$ trace-cmd record -e sched_wakeup -B switch_instance -e sched_switch -B timer_instance -e timer
>>
>> Creating a test.dat file containing the top instance and
>> the switch_instance:
>> 	$ trace-cmd split --top -B switch_instance -o test.dat
>>
>> Creating a test.dat file containing the switch_instance as
>> the top instance, and the initial top instance as an instance
>> named 'old_top':
>> 	$ trace-cmd split -B switch_instance --top=old_top -o test.dat
>>
>> Splitting all instances in different .dat files:
>> 	$ trace-cmd split --top -o top.dat -B switch_instance -o switch.dat -B timer_instance -o timer.dat
>>
>> And by default, if no -B/--top is specified for an output file,
>> keep all the instances (but of course all the other options provided
>> to the split command are applied, i.e. start/end timings):
>> Keep all the instances and place them in test.dat:
>> 	$ trace-cmd split -o test.dat
>> Keep all the instances and place them in the default file trace.dat.1:
>> 	$ trace-cmd split
>>
>> ---
>>
>> Maybe the list of generated files should be printed to avoid what happened
>> to you, i.e. generating files that were not expected.
>> I think that having a default name with a suffix being the top instance
>> is a good idea, but I don't think it would be possible to parse the command
>> line to have 2 instances in one file in such case.
>>
>>>
>>> For multi-file splits, it will append ".0001", ".0002", etc to that file.
> 
> So what I'm thinking is that the -B buffers default to the same buffer as
> the top buffer unless specified differently.
> 
> Here's what I think would work. We add four new options.
> 
>   --top
>   -B <instance>
>   -t
>   -b
> 
> Where --top represents the "top" instance.
> 
> -B represents the <instance>
> 
> -t can be used after -B to make it a top instance.
> 
> -b can be used before any -B to make the top instance into its own instance.

This set of parameters sounds good to me.

> 
> Rules:
> 
>    --top can not come after -B.
>    -t must come after -B
>    -b must come before -B

IIUC, these rules are described in the syntax below.

> 
> There can only be one top instance, and all named instances must be unique.
> That is, you can't have two instances called "foo".
> 
> We have: trace-cmd split <TOP_COMMANDS> <INSTANCE_COMMANDS>
> 
>    TOP_COMMANDS :: nil | TOP_PARAMS
> 
>    INSTANCE_COMMANDS :: nil | -B name INSTANCE_PARAMS INSTANCE_COMMANDS
> 
>    TOP_PARAMS :: nil | TOP_OPTIONS TOP_PARAMS
> 
>    TOP_OPTIONS :: --top | -b name | -o file
> 
>    INSTANCE_PARAMS :: nil | INSTANCE_OPTIONS INSTANCE_PARAMS
> 
>    INSTANCE_OPTIONS :: nil | -t | -o file

I think it might also be difficult to place the top instance in multiple output
files (or the intention was to do it in multiple steps ?). For instance, from a
trace containing a top instance and foo/bar instances, what would be the command
to obtain:
- a foo.dat file containing the top + foo instances
- a bar.dat file containing the top + bar instances

If this was:
   trace-cmd split --top -o foo.dat --top -o bar.dat -B foo -o foo.dat -B bar -o bar.dat
I think it would be clearer to revolve around the output file:
   trace-cmd split --top -B foo -o foo.dat --top -B bar -o bar.dat
but this is open to debate.

---

Also, if one wants to place all the instance in one output.dat file, IIUC all the
instances will have to be nominatively selected right ?

> 
> Examples:
> 
>   trace-cmd split --top -B foo
> 
> Will make a trace.dat.1 that has top and foo in it.
> 
>   trace-cmd split --top
> 
> Will make a trace.dat.1 that only has the top instance
> 
>   trace-cmd split -B foo
> 
> Will make a trace.dat.1 that only has foo in it.
> 
>   trace-cmd split -B foo -t
> 
> Will make a trace.dat.1 that only has foo in it. It will also promote the
> instance foo to the top instance (it will lose its "foo" name).
> 
>   trace-cmd split --top -o buba.dat -B foo
> 
> Will create a buba.dat that has both top and foo in it

 From the syntax above, I would have thought that this would be parsed as:
- <TOP_COMMANDS>: '--top -o buba.dat'
- <INSTANCE_COMMANDS>: '-B foo'

and so that the 'foo' instance would end up in the default trace.dat.1
as there is no output file specified.

> 
>   trace-cmd split --top -B foo -o buba.dat
> 
> Will create a trace-dat.1 with just top in it, and buba.dat with just foo
> in it.
> 
>   trace-cmd split --top -B foo -o buba.dat -t
> 
> Will create a trace-dat.1 with just top in it, and buba.dat with just foo
> in it (promoting foo to top instance)
> 
>   trace-cmd split --top -b foo -B foo -t
> 
> Will make a trace.dat.1 file where the top instance becomes "foo" and the
> foo instance becomes the top instance.
> 
>    trace-cmd split -o buba.dat -B foo -t

I think this would be parsed (cf. the syntax above) as:
- <TOP_COMMANDS>: '-o buba.dat'
- <INSTANCE_COMMANDS>: '-B foo -t'

So there would be
- a buba.dat file containing all the possible instances and the
   top instance
- a default trace.data.1 containing the foo instance as the top instance

> 
> Will create just a buba.dat file with only foo in it, promoting it as the
> top instance.
> 
>    trace-cmd split -o buba.dat --top -B foo -o fooba.dat
> 
> Will create a buba.dat with just the top instance in it (note --top can
> come before or after -o and that does not make any difference, but the
> order of -B and -o does matter). It will also create a fooba.dat file with
> just foo in it.
> 
>    trace-cmd split --top -B foo -t -o fooba.dat -B bar
> 
> This will create a trace.dat.1 with the top instance and the bar instance,
> but also make a fooba.dat file with just "foo", and promoting it as the top
> instance.
> 
>    trace-cmd split --top -o buba.dat -B foo -o foobar.dat -B bar -o foobar.dat
> 
> This will make two files. One with buba.dat that holds the top instance,
> and foobar.dat that holds both the foo and bar instances.
> 
>    trace-cmd split --top -o buba.dat -B foo -o foobar.dat -B bar -t -o foobar.dat
> 
> This will make two files. One with buba.dat that holds just the top
> instance and a foobar.dat file that has both foo and bar, but bar gets
> promoted to the top instance.
> 
> Does that make sense?
> 
> -- Steve

Regards,
Pierre
Steven Rostedt Feb. 11, 2024, 11:35 p.m. UTC | #9
On Mon, 5 Feb 2024 14:38:38 +0100
Pierre Gondois <pierre.gondois@arm.com> wrote:
 > 
> > So what I'm thinking is that the -B buffers default to the same buffer as
> > the top buffer unless specified differently.
> > 
> > Here's what I think would work. We add four new options.
> > 
> >   --top
> >   -B <instance>
> >   -t
> >   -b
> > 
> > Where --top represents the "top" instance.
> > 
> > -B represents the <instance>
> > 
> > -t can be used after -B to make it a top instance.
> > 
> > -b can be used before any -B to make the top instance into its own instance.  
> 
> This set of parameters sounds good to me.
> 
> > 
> > Rules:
> > 
> >    --top can not come after -B.
> >    -t must come after -B
> >    -b must come before -B  
> 
> IIUC, these rules are described in the syntax below.
> 
> > 
> > There can only be one top instance, and all named instances must be unique.
> > That is, you can't have two instances called "foo".
> > 
> > We have: trace-cmd split <TOP_COMMANDS> <INSTANCE_COMMANDS>
> > 
> >    TOP_COMMANDS :: nil | TOP_PARAMS
> > 
> >    INSTANCE_COMMANDS :: nil | -B name INSTANCE_PARAMS INSTANCE_COMMANDS
> > 
> >    TOP_PARAMS :: nil | TOP_OPTIONS TOP_PARAMS
> > 
> >    TOP_OPTIONS :: --top | -b name | -o file
> > 
> >    INSTANCE_PARAMS :: nil | INSTANCE_OPTIONS INSTANCE_PARAMS
> > 
> >    INSTANCE_OPTIONS :: nil | -t | -o file  
> 
> I think it might also be difficult to place the top instance in multiple output
> files (or the intention was to do it in multiple steps ?). For instance, from a
> trace containing a top instance and foo/bar instances, what would be the command
> to obtain:
> - a foo.dat file containing the top + foo instances
> - a bar.dat file containing the top + bar instances

Does that need to be done in a single command, or two commands?

  trace-cmd split --top -B foo
  trace-cmd split --top -B bar

?

> 
> If this was:
>    trace-cmd split --top -o foo.dat --top -o bar.dat -B foo -o foo.dat -B bar -o bar.dat
> I think it would be clearer to revolve around the output file:
>    trace-cmd split --top -B foo -o foo.dat --top -B bar -o bar.dat
> but this is open to debate.

My concern is that if we want to do that in one command it can cause
the code and parameters to become too complex. But I'm not against the
idea. I personally think that if we allow too much it may become too
confusing.

> 
> ---
> 
> Also, if one wants to place all the instance in one output.dat file, IIUC all the
> instances will have to be nominatively selected right ?
> 
> > 
> > Examples:
> > 
> >   trace-cmd split --top -B foo
> > 
> > Will make a trace.dat.1 that has top and foo in it.
> > 
> >   trace-cmd split --top
> > 
> > Will make a trace.dat.1 that only has the top instance
> > 
> >   trace-cmd split -B foo
> > 
> > Will make a trace.dat.1 that only has foo in it.
> > 
> >   trace-cmd split -B foo -t
> > 
> > Will make a trace.dat.1 that only has foo in it. It will also promote the
> > instance foo to the top instance (it will lose its "foo" name).
> > 
> >   trace-cmd split --top -o buba.dat -B foo
> > 
> > Will create a buba.dat that has both top and foo in it  
> 
>  From the syntax above, I would have thought that this would be parsed as:
> - <TOP_COMMANDS>: '--top -o buba.dat'
> - <INSTANCE_COMMANDS>: '-B foo'
> 
> and so that the 'foo' instance would end up in the default trace.dat.1
> as there is no output file specified.

No, I was saying that -B will default to the top file, not the default
file. If the top file is specified, then the -B will also be in that file.

> 
> > 
> >   trace-cmd split --top -B foo -o buba.dat
> > 
> > Will create a trace-dat.1 with just top in it, and buba.dat with just foo
> > in it.
> > 
> >   trace-cmd split --top -B foo -o buba.dat -t
> > 
> > Will create a trace-dat.1 with just top in it, and buba.dat with just foo
> > in it (promoting foo to top instance)
> > 
> >   trace-cmd split --top -b foo -B foo -t
> > 
> > Will make a trace.dat.1 file where the top instance becomes "foo" and the
> > foo instance becomes the top instance.
> > 
> >    trace-cmd split -o buba.dat -B foo -t  
> 
> I think this would be parsed (cf. the syntax above) as:
> - <TOP_COMMANDS>: '-o buba.dat'
> - <INSTANCE_COMMANDS>: '-B foo -t'
> 
> So there would be
> - a buba.dat file containing all the possible instances and the
>    top instance
> - a default trace.data.1 containing the foo instance as the top instance

Again, -o of the top file becomes the default for all the rest, unless
they specify their own output file.

-- Steve
Pierre Gondois Feb. 12, 2024, 9:11 a.m. UTC | #10
On 2/12/24 00:35, Steven Rostedt wrote:
> On Mon, 5 Feb 2024 14:38:38 +0100
> Pierre Gondois <pierre.gondois@arm.com> wrote:
>   >
>>> So what I'm thinking is that the -B buffers default to the same buffer as
>>> the top buffer unless specified differently.
>>>
>>> Here's what I think would work. We add four new options.
>>>
>>>    --top
>>>    -B <instance>
>>>    -t
>>>    -b
>>>
>>> Where --top represents the "top" instance.
>>>
>>> -B represents the <instance>
>>>
>>> -t can be used after -B to make it a top instance.
>>>
>>> -b can be used before any -B to make the top instance into its own instance.
>>
>> This set of parameters sounds good to me.
>>
>>>
>>> Rules:
>>>
>>>     --top can not come after -B.
>>>     -t must come after -B
>>>     -b must come before -B
>>
>> IIUC, these rules are described in the syntax below.
>>
>>>
>>> There can only be one top instance, and all named instances must be unique.
>>> That is, you can't have two instances called "foo".
>>>
>>> We have: trace-cmd split <TOP_COMMANDS> <INSTANCE_COMMANDS>
>>>
>>>     TOP_COMMANDS :: nil | TOP_PARAMS
>>>
>>>     INSTANCE_COMMANDS :: nil | -B name INSTANCE_PARAMS INSTANCE_COMMANDS
>>>
>>>     TOP_PARAMS :: nil | TOP_OPTIONS TOP_PARAMS
>>>
>>>     TOP_OPTIONS :: --top | -b name | -o file
>>>
>>>     INSTANCE_PARAMS :: nil | INSTANCE_OPTIONS INSTANCE_PARAMS
>>>
>>>     INSTANCE_OPTIONS :: nil | -t | -o file
>>
>> I think it might also be difficult to place the top instance in multiple output
>> files (or the intention was to do it in multiple steps ?). For instance, from a
>> trace containing a top instance and foo/bar instances, what would be the command
>> to obtain:
>> - a foo.dat file containing the top + foo instances
>> - a bar.dat file containing the top + bar instances
> 
> Does that need to be done in a single command, or two commands?
> 
>    trace-cmd split --top -B foo
>    trace-cmd split --top -B bar
> 
> ?

Yes ok, we can do that.

> 
>>
>> If this was:
>>     trace-cmd split --top -o foo.dat --top -o bar.dat -B foo -o foo.dat -B bar -o bar.dat
>> I think it would be clearer to revolve around the output file:
>>     trace-cmd split --top -B foo -o foo.dat --top -B bar -o bar.dat
>> but this is open to debate.
> 
> My concern is that if we want to do that in one command it can cause
> the code and parameters to become too complex. But I'm not against the
> idea. I personally think that if we allow too much it may become too
> confusing.

Ok ok, we can limit the command to one output file.

> 
>>
>> ---
>>
>> Also, if one wants to place all the instance in one output.dat file, IIUC all the
>> instances will have to be nominatively selected right ?
>>
>>>
>>> Examples:
>>>
>>>    trace-cmd split --top -B foo
>>>
>>> Will make a trace.dat.1 that has top and foo in it.
>>>
>>>    trace-cmd split --top
>>>
>>> Will make a trace.dat.1 that only has the top instance
>>>
>>>    trace-cmd split -B foo
>>>
>>> Will make a trace.dat.1 that only has foo in it.
>>>
>>>    trace-cmd split -B foo -t
>>>
>>> Will make a trace.dat.1 that only has foo in it. It will also promote the
>>> instance foo to the top instance (it will lose its "foo" name).
>>>
>>>    trace-cmd split --top -o buba.dat -B foo
>>>
>>> Will create a buba.dat that has both top and foo in it
>>
>>   From the syntax above, I would have thought that this would be parsed as:
>> - <TOP_COMMANDS>: '--top -o buba.dat'
>> - <INSTANCE_COMMANDS>: '-B foo'
>>
>> and so that the 'foo' instance would end up in the default trace.dat.1
>> as there is no output file specified.
> 
> No, I was saying that -B will default to the top file, not the default
> file. If the top file is specified, then the -B will also be in that file.

Ok right. I think I got confused due to the possibility to have multiple
output files when expanding 'trace-cmd split <TOP_COMMANDS> <INSTANCE_COMMANDS>'

I think the description should match this then:
   trace-cmd split [OTHER_OPTIONS] [--top [-b name]] [-o output_file] [-B instance_name [-t]]*
     [start-time [end-time]]
# The '-t' option can only be used once per command line

or:
   trace-cmd split <TOP_COMMANDS> <OUTPUT_COMMAND> <INSTANCE_COMMANDS>

   TOP_COMMANDS :: nil | --top TOP_PARAMS
   TOP_PARAMS :: nil | -b name

   OUTPUT_COMMAND :: nil | -o output_file

   INSTANCE_COMMANDS :: nil | -B name INSTANCE_PARAMS INSTANCE_COMMANDS
   INSTANCE_PARAMS :: nil | -t
# INSTANCE_PARAMS can only expand to '-t' once.

I'm also not sure it is necessary to have a specific order for the
--top/-o/-B parameters. I should also work when they are unordered
if we limit the command to one output file:
- trace-cmd split -B foo -t --top -b old_top -o output.dat
- trace-cmd split -o output.dat -B foo -t --top -b old_top
- trace-cmd split -o output.dat --top -b old_top -B foo -t
should all be equivalent.

Regards,
Pierre

> 
>>
>>>
>>>    trace-cmd split --top -B foo -o buba.dat
>>>
>>> Will create a trace-dat.1 with just top in it, and buba.dat with just foo
>>> in it.
>>>
>>>    trace-cmd split --top -B foo -o buba.dat -t
>>>
>>> Will create a trace-dat.1 with just top in it, and buba.dat with just foo
>>> in it (promoting foo to top instance)
>>>
>>>    trace-cmd split --top -b foo -B foo -t
>>>
>>> Will make a trace.dat.1 file where the top instance becomes "foo" and the
>>> foo instance becomes the top instance.
>>>
>>>     trace-cmd split -o buba.dat -B foo -t
>>
>> I think this would be parsed (cf. the syntax above) as:
>> - <TOP_COMMANDS>: '-o buba.dat'
>> - <INSTANCE_COMMANDS>: '-B foo -t'
>>
>> So there would be
>> - a buba.dat file containing all the possible instances and the
>>     top instance
>> - a default trace.data.1 containing the foo instance as the top instance
> 
> Again, -o of the top file becomes the default for all the rest, unless
> they specify their own output file.
> 
> -- Steve
Steven Rostedt Feb. 12, 2024, 9:28 p.m. UTC | #11
On Mon, 12 Feb 2024 10:11:52 +0100
Pierre Gondois <pierre.gondois@arm.com> wrote:

> Ok right. I think I got confused due to the possibility to have multiple
> output files when expanding 'trace-cmd split <TOP_COMMANDS> <INSTANCE_COMMANDS>'

Yeah, I think we got on the tangent of having split become too powerful for
one line. Having it only have one output name probably is the least
confusing. It can be called multiple times if someone wants more complex
parsing.

> 
> I think the description should match this then:
>    trace-cmd split [OTHER_OPTIONS] [--top [-b name]] [-o output_file] [-B instance_name [-t]]*
>      [start-time [end-time]]
> # The '-t' option can only be used once per command line
> 
> or:
>    trace-cmd split <TOP_COMMANDS> <OUTPUT_COMMAND> <INSTANCE_COMMANDS>
> 
>    TOP_COMMANDS :: nil | --top TOP_PARAMS
>    TOP_PARAMS :: nil | -b name
> 
>    OUTPUT_COMMAND :: nil | -o output_file
> 
>    INSTANCE_COMMANDS :: nil | -B name INSTANCE_PARAMS INSTANCE_COMMANDS
>    INSTANCE_PARAMS :: nil | -t
> # INSTANCE_PARAMS can only expand to '-t' once.
> 
> I'm also not sure it is necessary to have a specific order for the
> --top/-o/-B parameters. I should also work when they are unordered
> if we limit the command to one output file:
> - trace-cmd split -B foo -t --top -b old_top -o output.dat
> - trace-cmd split -o output.dat -B foo -t --top -b old_top
> - trace-cmd split -o output.dat --top -b old_top -B foo -t
> should all be equivalent.

Yeah, that's fine. Let's go with that.

-- Steve
diff mbox series

Patch

diff --git a/tracecmd/trace-split.c b/tracecmd/trace-split.c
index 0820a3bb..73ef6e97 100644
--- a/tracecmd/trace-split.c
+++ b/tracecmd/trace-split.c
@@ -51,11 +51,89 @@  struct cpu_data {
 	char				*file;
 };
 
+struct name_list {
+	struct list_head		list;
+	const char			*name;
+
+	/* Identify the top instance in the input trace. */
+	bool				was_top_instance;
+
+	/* Identify the top instance in each output trace. */
+	bool				is_top_instance;
+};
+
+static void free_names(struct list_head *list) {
+	struct name_list *item;
+
+	while (!list_empty(list)) {
+		item = container_of(list->next, struct name_list, list);
+		list_del(&item->list);
+		free((char *)item->name);
+		free(item);
+	}
+}
+
+struct output_file_list {
+	struct list_head list;
+	const char *output;
+	char *output_file;
+	struct list_head instance_list;
+	struct list_head handle_list;
+};
+
+static struct list_head output_file_list;
+
+static struct output_file_list *add_output_file(void) {
+	struct output_file_list *item;
+
+	item = calloc(1, sizeof(*item));
+	if (!item)
+		die("Failed to allocate output_file item");
+
+	list_head_init(&item->instance_list);
+	list_head_init(&item->handle_list);
+	list_add_tail(&item->list, &output_file_list);
+	return item;
+}
+
+static void add_output_file_instance(struct output_file_list *output_file, const char *instance,
+				     bool was_top_instance, bool is_top_instance) {
+	struct name_list *item;
+
+	item = calloc(1, sizeof(*item));
+	if (!item)
+		die("Failed to allocate output_file item");
+
+	item->name = strdup(instance);
+	if (!item->name)
+		die("Failed to duplicate %s", instance);
+
+	item->was_top_instance = was_top_instance;
+	item->is_top_instance = is_top_instance;
+	list_add_tail(&item->list, &output_file->instance_list);
+}
+
+static void free_handles(struct list_head *list);
+
+static void free_output_files(struct list_head *list) {
+	struct output_file_list *item;
+
+	while (!list_empty(list)) {
+		item = container_of(list->next, struct output_file_list, list);
+		list_del(&item->list);
+		free_names(&item->instance_list);
+		free_handles(&item->handle_list);
+		free((char *)item->output);
+		free((char *)item->output_file);
+		free(item);
+	}
+}
+
 struct handle_list {
 	struct list_head		list;
 	const char			*name;
 	int				index;
-	struct tracecmd_input 		*handle;
+	struct tracecmd_input		*handle;
 
 	/* Identify the top instance in the input trace. */
 	bool				was_top_instance;
@@ -103,10 +181,28 @@  static void add_handle(const char *name, int index, bool was_top_instance)
 
 	item->index = index;
 	item->was_top_instance = was_top_instance;
-	item->handle = get_handle(item);
+
+	/* Don't need to request a real handle for handle_list nodes. */
+	item->handle = NULL;
 	list_add_tail(&item->list, &handle_list);
 }
 
+static void add_handle_copy(struct handle_list *item, struct name_list *name_entry,
+			    struct list_head *list) {
+	struct handle_list *copied_item;
+
+	copied_item = calloc(1, sizeof(*copied_item));
+	copied_item->name = strdup(name_entry->name);
+	if (!copied_item->name)
+		die("Failed to duplicate %s", name_entry->name);
+
+	copied_item->index = item->index;
+	copied_item->was_top_instance = item->was_top_instance;
+	copied_item->is_top_instance = name_entry->is_top_instance;
+	copied_item->handle = get_handle(copied_item);
+	list_add_tail(&copied_item->list, list);
+}
+
 static void free_handles(struct list_head *list)
 {
 	struct handle_list *item;
@@ -115,7 +211,8 @@  static void free_handles(struct list_head *list)
 		item = container_of(list->next, struct handle_list, list);
 		list_del(&item->list);
 		free((char *)item->name);
-		tracecmd_close(item->handle);
+		if (item->handle)
+			tracecmd_close(item->handle);
 		free(item);
 	}
 }
@@ -473,13 +570,10 @@  static void touch_file(const char *file)
 }
 
 static unsigned long long parse_file(struct tracecmd_input *handle,
-				     const char *output_file,
-				     unsigned long long start,
-				     unsigned long long end, int percpu,
-				     int only_cpu, int count,
-				     enum split_types type,
-				     bool *end_reached)
-{
+				     struct output_file_list *output_file_entry,
+				     unsigned long long start, unsigned long long end, int percpu,
+				     int only_cpu, int count, enum split_types type,
+				     bool *end_reached) {
 	unsigned long long current;
 	struct handle_list *handle_entry;
 	struct tracecmd_output *ohandle;
@@ -493,10 +587,11 @@  static unsigned long long parse_file(struct tracecmd_input *handle,
 	int ret;
 	int fd;
 
-	ohandle = tracecmd_copy(handle, output_file, TRACECMD_FILE_CMD_LINES, 0, NULL);
+	ohandle = tracecmd_copy(handle, output_file_entry->output_file,
+				TRACECMD_FILE_CMD_LINES, 0, NULL);
 	tracecmd_set_out_clock(ohandle, tracecmd_get_trace_clock(handle));
 
-	list_for_each_entry(handle_entry, &handle_list, list) {
+	list_for_each_entry(handle_entry, &output_file_entry->handle_list, list) {
 		struct tracecmd_input *curr_handle;
 		bool curr_end_reached = false;
 
@@ -507,7 +602,8 @@  static unsigned long long parse_file(struct tracecmd_input *handle,
 			die("Failed to allocate cpu_data for %d cpus", cpus);
 
 		for (cpu = 0; cpu < cpus; cpu++) {
-			file = get_temp_file(output_file, handle_entry->name, cpu);
+			file = get_temp_file(output_file_entry->output_file,
+					     handle_entry->name, cpu);
 			touch_file(file);
 
 			fd = open(file, O_WRONLY | O_CREAT | O_TRUNC | O_LARGEFILE, 0644);
@@ -540,7 +636,7 @@  static unsigned long long parse_file(struct tracecmd_input *handle,
 		for (cpu = 0; cpu < cpus; cpu++)
 			cpu_list[cpu] = cpu_data[cpu].file;
 
-		if (handle_entry->was_top_instance)
+		if (handle_entry->is_top_instance)
 			ret = tracecmd_append_cpu_data(ohandle, cpus, cpu_list);
 		else
 			ret = tracecmd_append_buffer_cpu_data(ohandle, handle_entry->name, cpus,
@@ -573,16 +669,76 @@  static unsigned long long parse_file(struct tracecmd_input *handle,
 	return current;
 }
 
+/**
+ * map_handle_output - Map the buffer/instances to the output_files.
+ *
+ * The trace-cmd split command can create multiple .dat files, each containing
+ * various combinations of the buffers available in the input .dat file.
+ * Associate the list of buffers/instances to keep to each output file. A
+ * buffer/instance can be present in multiple output files.
+ */
+static void map_handle_output(void) {
+	struct output_file_list *output_file_entry;
+	struct handle_list *handle_entry;
+	struct name_list *name_entry;
+	bool found_match = false;
+
+	list_for_each_entry(output_file_entry, &output_file_list, list) {
+		/*
+		 * No specific instance were given for this output file.
+		 * Select all the instances.
+		 */
+		if (list_empty(&output_file_entry->instance_list)) {
+			list_for_each_entry(handle_entry, &handle_list, list) {
+				/* Keep the old top instance as the top instance. */
+				add_output_file_instance(output_file_entry, handle_entry->name,
+							 handle_entry->was_top_instance,
+							 handle_entry->was_top_instance);
+			}
+		}
+
+		list_for_each_entry(name_entry, &output_file_entry->instance_list, list) {
+			list_for_each_entry(handle_entry, &handle_list, list) {
+				if ((name_entry->was_top_instance &&
+				     handle_entry->was_top_instance) ||
+				    !strcmp(handle_entry->name, name_entry->name)) {
+					found_match = true;
+					goto found;
+				}
+			}
+
+		found:
+			if (!found_match) {
+				warning("Buffer %s requested, but not found.", name_entry->name);
+				continue;
+			}
+
+			/*
+			 * As a handle can be requested in multiple output files,
+			 * the name_entry must be copied (and not de/en-queued).
+			 */
+			add_handle_copy(handle_entry, name_entry, &output_file_entry->handle_list);
+			found_match = false;
+		}
+	}
+}
+
+enum { OPT_top = 237,
+};
+
 void trace_split (int argc, char **argv)
 {
 	struct tracecmd_input *handle;
+	struct output_file_list *output_file_param = NULL;
+	struct output_file_list *output_file_entry;
 	unsigned long long start_ns = 0, end_ns = 0;
 	unsigned long long current;
+	bool was_top_instance;
+	bool is_top_instance;
 	bool end_reached = false;
 	double start, end;
 	char *endptr;
 	char *output = NULL;
-	char *output_file;
 	enum split_types split_type = SPLIT_NONE;
 	enum split_types type = SPLIT_NONE;
 	int instances;
@@ -593,12 +749,26 @@  void trace_split (int argc, char **argv)
 	int ac;
 	int c;
 
+	static struct option long_options[] = {
+		{"top", optional_argument, NULL, OPT_top},
+		{NULL, 0, NULL, 0},
+	};
+	int option_index = 0;
+
+	/* The first instance is the new top buffer of the output file. */
+	is_top_instance = true;
+	was_top_instance = false;
+
 	list_head_init(&handle_list);
+	list_head_init(&output_file_list);
+
+	output_file_param = add_output_file();
 
 	if (strcmp(argv[1], "split") != 0)
 		usage(argv);
 
-	while ((c = getopt(argc-1, argv+1, "+ho:i:s:m:u:e:p:rcC:")) >= 0) {
+	while ((c = getopt_long(argc - 1, argv + 1, "+ho:i:s:m:u:e:p:rcC:B:",
+				long_options, &option_index)) >= 0) {
 		switch (c) {
 		case 'h':
 			usage(argv);
@@ -634,13 +804,48 @@  void trace_split (int argc, char **argv)
 			cpu = atoi(optarg);
 			break;
 		case 'o':
-			if (output)
+			if (!output_file_param) {
+				/* The first name is the main instance. */
+				is_top_instance = true;
+				output_file_param = add_output_file();
+			}
+
+			if (output_file_param->output)
 				die("only one output file allowed");
-			output = strdup(optarg);
+
+			output_file_param->output = strdup(optarg);
+			if (!output_file_param->output)
+				die("Failed to duplicate %s", optarg);
+
+			/* New output file. */
+			output_file_param = NULL;
 			break;
 		case 'i':
 			input_file = optarg;
 			break;
+		case OPT_top:
+			was_top_instance = true;
+			if (!optarg)
+				output = (char *)default_top_instance_name;
+			else
+				output = optarg;
+			/* Fall through */
+		case 'B':
+			if (!output_file_param) {
+				/* The first name is the main instance. */
+				is_top_instance = true;
+				output_file_param = add_output_file();
+			}
+			if (!output)
+				output = optarg;
+
+			add_output_file_instance(output_file_param, output,
+						 was_top_instance, is_top_instance);
+
+			output = NULL;
+			was_top_instance = false;
+			is_top_instance = false;
+			break;
 		default:
 			usage(argv);
 		}
@@ -686,19 +891,6 @@  void trace_split (int argc, char **argv)
 
 	page_size = tracecmd_page_size(handle);
 
-	if (!output)
-		output = strdup(input_file);
-
-	if (!repeat && strcmp(output, input_file) == 0) {
-		output = realloc(output, strlen(output) + 3);
-		strcat(output, ".1");
-	}
-
-	output_file = malloc(strlen(output) + 50);
-	if (!output_file)
-		die("Failed to allocate for %s", output);
-	c = 1;
-
 	add_handle(default_top_instance_name, -1, true);
 	instances = tracecmd_buffer_instances(handle);
 	if (instances) {
@@ -713,25 +905,50 @@  void trace_split (int argc, char **argv)
 		}
 	}
 
-	do {
-		if (repeat)
-			sprintf(output_file, "%s.%04d", output, c++);
-		else
-			strcpy(output_file, output);
-			
-		current = parse_file(handle, output_file, start_ns, end_ns,
-				     percpu, cpu, count, type, &end_reached);
+	if (output_file_param && !output_file_param->output) {
+		output_file_param->output = strdup(input_file);
+		if (!output_file_param->output)
+			die("Failed to duplicate %s", input_file);
+	}
 
-		if (!repeat)
-			break;
-		start_ns = 0;
-	} while (!end_reached && (current && (!end_ns || current < end_ns)));
+	map_handle_output();
 
-	free(output);
-	free(output_file);
+	list_for_each_entry(output_file_entry, &output_file_list, list) {
+		output = (char *)output_file_entry->output;
+
+		if (!repeat && strcmp(output, input_file) == 0) {
+			output = realloc(output, strlen(output) + 3);
+			strcat(output, ".1");
+		}
+
+		output_file_entry->output_file = malloc(strlen(output) + 50);
+		if (!output_file_entry->output_file)
+			die("Failed to allocate for %s", output);
+
+		output_file_entry->output = output;
+	}
+
+	list_for_each_entry(output_file_entry, &output_file_list, list) {
+		c = 1;
+		do {
+			if (repeat)
+				sprintf(output_file_entry->output_file, "%s.%04d",
+					output_file_entry->output, c++);
+			else
+				strcpy(output_file_entry->output_file, output_file_entry->output);
+
+			current = parse_file(handle, output_file_entry, start_ns, end_ns, percpu,
+					     cpu, count, type, &end_reached);
+
+			if (!repeat)
+				break;
+			start_ns = 0;
+		} while (!end_reached && (current && (!end_ns || current < end_ns)));
+	}
 
 	tracecmd_close(handle);
 	free_handles(&handle_list);
+	free_output_files(&output_file_list);
 
 	return;
 }