diff mbox

trace: fix group name generation

Message ID 147648037766.14770.15179642645810274907.stgit@bahia (mailing list archive)
State New, archived
Headers show

Commit Message

Greg Kurz Oct. 14, 2016, 9:26 p.m. UTC
Since commit "80dd5c4918ab trace: introduce a formal group name for trace
events", tracetool generates C variable names and macro definitions out
of the path to the trace-events-all file.

The current code takes care of turning '/' and '-' characters into
underscores so that the resulting names are valid C tokens. This is
enough because these are the only illegal characters that appear in
a relative path within the QEMU source tree.

Things are different for out of tree builds where the path may contain
arbitrary character combinations, causing tracetool to generate invalid
names.

This may cause a variety of build breaks like below:

trace/generated-tracers.h:4:15: warning: ISO C99 requires whitespace after the macro name
 #define TRACE_.BUILD_GENERATED_TRACERS_H

or

trace/generated-tracers.c:17497:13: error: variable or field ‘trace_’ declared void
 static void trace_=build_register_events(void)

or

trace/generated-tracers.c: In function ‘trace_2build_register_event’:
trace/generated-tracers.c:17499:32: error: invalid suffix "build_trace_events" on integer constant
     trace_event_register_group(2build_trace_events);

This patch ensures that only letters [A-Za-z], digits [0-9] and underscores
are kept. All other characters are turned into underscores. Also, since the
first character of C symbol names cannot be a digit, an underscore is
prepended to the group name.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 scripts/tracetool.py |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Eric Blake Oct. 14, 2016, 9:31 p.m. UTC | #1
On 10/14/2016 04:26 PM, Greg Kurz wrote:
> Since commit "80dd5c4918ab trace: introduce a formal group name for trace
> events", tracetool generates C variable names and macro definitions out
> of the path to the trace-events-all file.
> 
> The current code takes care of turning '/' and '-' characters into
> underscores so that the resulting names are valid C tokens. This is
> enough because these are the only illegal characters that appear in
> a relative path within the QEMU source tree.
> 
> Things are different for out of tree builds where the path may contain
> arbitrary character combinations, causing tracetool to generate invalid
> names.
> 

> This patch ensures that only letters [A-Za-z], digits [0-9] and underscores
> are kept. All other characters are turned into underscores. Also, since the
> first character of C symbol names cannot be a digit, an underscore is
> prepended to the group name.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  scripts/tracetool.py |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/scripts/tracetool.py b/scripts/tracetool.py
> index 629b2593c846..b81b834db924 100755
> --- a/scripts/tracetool.py
> +++ b/scripts/tracetool.py
> @@ -70,7 +70,7 @@ def make_group_name(filename):
>  
>      if dirname == "":
>          return "common"
> -    return re.sub(r"/|-", "_", dirname)
> +    return "_" + re.sub(r"[^\w]", "_", dirname)

This STILL doesn't solve the complaint that the build is now dependent
on the location.  Why can't we STRIP off any prefix prior to the in-tree
portion of the naming that we know is sane, instead of munging the
prefix but in the process creating source code that generates with
different lengths?

Ideally, compiling twice, once in directory 'a', and the second time in
directory 'aaaaaaaaaaaaaaaaaaaaaaaaaaaa', should not make a noticeable
difference in the final size of the executable due to the difference in
lengths of the debugging symbols used to record the longer name of the
second directory being encoded into lots of macro names.
Greg Kurz Oct. 14, 2016, 10:06 p.m. UTC | #2
On Fri, 14 Oct 2016 16:31:01 -0500
Eric Blake <eblake@redhat.com> wrote:

> On 10/14/2016 04:26 PM, Greg Kurz wrote:
> > Since commit "80dd5c4918ab trace: introduce a formal group name for trace
> > events", tracetool generates C variable names and macro definitions out
> > of the path to the trace-events-all file.
> > 
> > The current code takes care of turning '/' and '-' characters into
> > underscores so that the resulting names are valid C tokens. This is
> > enough because these are the only illegal characters that appear in
> > a relative path within the QEMU source tree.
> > 
> > Things are different for out of tree builds where the path may contain
> > arbitrary character combinations, causing tracetool to generate invalid
> > names.
> >   
> 
> > This patch ensures that only letters [A-Za-z], digits [0-9] and underscores
> > are kept. All other characters are turned into underscores. Also, since the
> > first character of C symbol names cannot be a digit, an underscore is
> > prepended to the group name.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  scripts/tracetool.py |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/scripts/tracetool.py b/scripts/tracetool.py
> > index 629b2593c846..b81b834db924 100755
> > --- a/scripts/tracetool.py
> > +++ b/scripts/tracetool.py
> > @@ -70,7 +70,7 @@ def make_group_name(filename):
> >  
> >      if dirname == "":
> >          return "common"
> > -    return re.sub(r"/|-", "_", dirname)
> > +    return "_" + re.sub(r"[^\w]", "_", dirname)  
> 
> This STILL doesn't solve the complaint that the build is now dependent
> on the location.  Why can't we STRIP off any prefix prior to the in-tree
> portion of the naming that we know is sane, instead of munging the
> prefix but in the process creating source code that generates with
> different lengths?
> 

Heh, because the complaint was about the build break :)

> Ideally, compiling twice, once in directory 'a', and the second time in
> directory 'aaaaaaaaaaaaaaaaaaaaaaaaaaaa', should not make a noticeable
> difference in the final size of the executable due to the difference in
> lengths of the debugging symbols used to record the longer name of the
> second directory being encoded into lots of macro names.
> 

You're right. I'm okay to look for a way to fix the symbol variable size
issue, but I think this patch still makes sense as it makes tracetool
less fragile in case we add a directory with an illegal character to
the QEMU tree... and it fixes annoying build breaks (Paolo also hit
this and asked to revert 80dd5c4918ab in another mail).

Cc'ing Peter...
Peter Maydell Oct. 15, 2016, 12:03 p.m. UTC | #3
On 14 October 2016 at 23:06, Greg Kurz <groug@kaod.org> wrote:
> On Fri, 14 Oct 2016 16:31:01 -0500
> Eric Blake <eblake@redhat.com> wrote:
>
>> On 10/14/2016 04:26 PM, Greg Kurz wrote:
>> > Since commit "80dd5c4918ab trace: introduce a formal group name for trace
>> > events", tracetool generates C variable names and macro definitions out
>> > of the path to the trace-events-all file.

>> > diff --git a/scripts/tracetool.py b/scripts/tracetool.py
>> > index 629b2593c846..b81b834db924 100755
>> > --- a/scripts/tracetool.py
>> > +++ b/scripts/tracetool.py
>> > @@ -70,7 +70,7 @@ def make_group_name(filename):
>> >
>> >      if dirname == "":
>> >          return "common"
>> > -    return re.sub(r"/|-", "_", dirname)
>> > +    return "_" + re.sub(r"[^\w]", "_", dirname)
>>
>> This STILL doesn't solve the complaint that the build is now dependent
>> on the location.  Why can't we STRIP off any prefix prior to the in-tree
>> portion of the naming that we know is sane, instead of munging the
>> prefix but in the process creating source code that generates with
>> different lengths?
>>
>
> Heh, because the complaint was about the build break :)
>
>> Ideally, compiling twice, once in directory 'a', and the second time in
>> directory 'aaaaaaaaaaaaaaaaaaaaaaaaaaaa', should not make a noticeable
>> difference in the final size of the executable due to the difference in
>> lengths of the debugging symbols used to record the longer name of the
>> second directory being encoded into lots of macro names.
>>
>
> You're right. I'm okay to look for a way to fix the symbol variable size
> issue, but I think this patch still makes sense as it makes tracetool
> less fragile in case we add a directory with an illegal character to
> the QEMU tree... and it fixes annoying build breaks (Paolo also hit
> this and asked to revert 80dd5c4918ab in another mail).

I agree with Eric that we should just not be putting
the build directory name in these variables -- this
looks like it's simply a bug to me, and we should fix it.

I think the chances of us adding a directory to the QEMU
tree itself with a silly name are quite low (not least because
if you do that then you get a handy build failure to tell
you not to do that ;-))

thanks
-- PMM
Greg Kurz Oct. 16, 2016, 9:20 a.m. UTC | #4
On Sat, 15 Oct 2016 13:03:10 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:

> On 14 October 2016 at 23:06, Greg Kurz <groug@kaod.org> wrote:
> > On Fri, 14 Oct 2016 16:31:01 -0500
> > Eric Blake <eblake@redhat.com> wrote:
> >  
> >> On 10/14/2016 04:26 PM, Greg Kurz wrote:  
> >> > Since commit "80dd5c4918ab trace: introduce a formal group name for trace
> >> > events", tracetool generates C variable names and macro definitions out
> >> > of the path to the trace-events-all file.  
> 
> >> > diff --git a/scripts/tracetool.py b/scripts/tracetool.py
> >> > index 629b2593c846..b81b834db924 100755
> >> > --- a/scripts/tracetool.py
> >> > +++ b/scripts/tracetool.py
> >> > @@ -70,7 +70,7 @@ def make_group_name(filename):
> >> >
> >> >      if dirname == "":
> >> >          return "common"
> >> > -    return re.sub(r"/|-", "_", dirname)
> >> > +    return "_" + re.sub(r"[^\w]", "_", dirname)  
> >>
> >> This STILL doesn't solve the complaint that the build is now dependent
> >> on the location.  Why can't we STRIP off any prefix prior to the in-tree
> >> portion of the naming that we know is sane, instead of munging the
> >> prefix but in the process creating source code that generates with
> >> different lengths?
> >>  
> >
> > Heh, because the complaint was about the build break :)
> >  
> >> Ideally, compiling twice, once in directory 'a', and the second time in
> >> directory 'aaaaaaaaaaaaaaaaaaaaaaaaaaaa', should not make a noticeable
> >> difference in the final size of the executable due to the difference in
> >> lengths of the debugging symbols used to record the longer name of the
> >> second directory being encoded into lots of macro names.
> >>  
> >
> > You're right. I'm okay to look for a way to fix the symbol variable size
> > issue, but I think this patch still makes sense as it makes tracetool
> > less fragile in case we add a directory with an illegal character to
> > the QEMU tree... and it fixes annoying build breaks (Paolo also hit
> > this and asked to revert 80dd5c4918ab in another mail).  
> 
> I agree with Eric that we should just not be putting
> the build directory name in these variables -- this
> looks like it's simply a bug to me, and we should fix it.
> 

As I was saying in the previous mail, I also agree with Eric :)

And it isn't even about the variable size of the debugging info,
but rather about the random names of the generated symbols...

> I think the chances of us adding a directory to the QEMU
> tree itself with a silly name are quite low (not least because
> if you do that then you get a handy build failure to tell
> you not to do that ;-))
> 

Fair enough, even if the error is a bit obscure at first sight... and
we already have a silly named directory (9pfs), which would be supported
with a leading '_'... but hopefully, it is not at the top level ;)

> thanks
> -- PMM

Cheers.

--
Greg
Daniel P. Berrangé Oct. 18, 2016, 3:16 p.m. UTC | #5
On Fri, Oct 14, 2016 at 04:31:01PM -0500, Eric Blake wrote:
> On 10/14/2016 04:26 PM, Greg Kurz wrote:
> > Since commit "80dd5c4918ab trace: introduce a formal group name for trace
> > events", tracetool generates C variable names and macro definitions out
> > of the path to the trace-events-all file.
> > 
> > The current code takes care of turning '/' and '-' characters into
> > underscores so that the resulting names are valid C tokens. This is
> > enough because these are the only illegal characters that appear in
> > a relative path within the QEMU source tree.
> > 
> > Things are different for out of tree builds where the path may contain
> > arbitrary character combinations, causing tracetool to generate invalid
> > names.
> > 
> 
> > This patch ensures that only letters [A-Za-z], digits [0-9] and underscores
> > are kept. All other characters are turned into underscores. Also, since the
> > first character of C symbol names cannot be a digit, an underscore is
> > prepended to the group name.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  scripts/tracetool.py |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/scripts/tracetool.py b/scripts/tracetool.py
> > index 629b2593c846..b81b834db924 100755
> > --- a/scripts/tracetool.py
> > +++ b/scripts/tracetool.py
> > @@ -70,7 +70,7 @@ def make_group_name(filename):
> >  
> >      if dirname == "":
> >          return "common"
> > -    return re.sub(r"/|-", "_", dirname)
> > +    return "_" + re.sub(r"[^\w]", "_", dirname)
> 
> This STILL doesn't solve the complaint that the build is now dependent
> on the location.  Why can't we STRIP off any prefix prior to the in-tree
> portion of the naming that we know is sane, instead of munging the
> prefix but in the process creating source code that generates with
> different lengths?
> 
> Ideally, compiling twice, once in directory 'a', and the second time in
> directory 'aaaaaaaaaaaaaaaaaaaaaaaaaaaa', should not make a noticeable
> difference in the final size of the executable due to the difference in
> lengths of the debugging symbols used to record the longer name of the
> second directory being encoded into lots of macro names.

This is a mistake on my part - the code was supposed to be stripping
off the build directory prefix, leaving only the relative path to the
file wrt source directory. The code is simply wrong as is.

Regards,
Daniel
Daniel P. Berrangé Oct. 18, 2016, 3:31 p.m. UTC | #6
On Tue, Oct 18, 2016 at 04:16:46PM +0100, Daniel P. Berrange wrote:
> On Fri, Oct 14, 2016 at 04:31:01PM -0500, Eric Blake wrote:
> > On 10/14/2016 04:26 PM, Greg Kurz wrote:
> > > Since commit "80dd5c4918ab trace: introduce a formal group name for trace
> > > events", tracetool generates C variable names and macro definitions out
> > > of the path to the trace-events-all file.
> > > 
> > > The current code takes care of turning '/' and '-' characters into
> > > underscores so that the resulting names are valid C tokens. This is
> > > enough because these are the only illegal characters that appear in
> > > a relative path within the QEMU source tree.
> > > 
> > > Things are different for out of tree builds where the path may contain
> > > arbitrary character combinations, causing tracetool to generate invalid
> > > names.
> > > 
> > 
> > > This patch ensures that only letters [A-Za-z], digits [0-9] and underscores
> > > are kept. All other characters are turned into underscores. Also, since the
> > > first character of C symbol names cannot be a digit, an underscore is
> > > prepended to the group name.
> > > 
> > > Signed-off-by: Greg Kurz <groug@kaod.org>
> > > ---
> > >  scripts/tracetool.py |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/scripts/tracetool.py b/scripts/tracetool.py
> > > index 629b2593c846..b81b834db924 100755
> > > --- a/scripts/tracetool.py
> > > +++ b/scripts/tracetool.py
> > > @@ -70,7 +70,7 @@ def make_group_name(filename):
> > >  
> > >      if dirname == "":
> > >          return "common"
> > > -    return re.sub(r"/|-", "_", dirname)
> > > +    return "_" + re.sub(r"[^\w]", "_", dirname)
> > 
> > This STILL doesn't solve the complaint that the build is now dependent
> > on the location.  Why can't we STRIP off any prefix prior to the in-tree
> > portion of the naming that we know is sane, instead of munging the
> > prefix but in the process creating source code that generates with
> > different lengths?
> > 
> > Ideally, compiling twice, once in directory 'a', and the second time in
> > directory 'aaaaaaaaaaaaaaaaaaaaaaaaaaaa', should not make a noticeable
> > difference in the final size of the executable due to the difference in
> > lengths of the debugging symbols used to record the longer name of the
> > second directory being encoded into lots of macro names.
> 
> This is a mistake on my part - the code was supposed to be stripping
> off the build directory prefix, leaving only the relative path to the
> file wrt source directory. The code is simply wrong as is.

Ah ha, I realize what the issue is.

Currently in git master we have multiple trace-events files and we merge
them into a single trace-events-all file, then generate the various
bits we need. This trace-events-all file is naturally in the build
dir, not the source dir

In my trace events patch build refactor series though, I have stopped
creating trace-events-all, and we instead generate bits directly from
the trace-events files in source dir. So this problem only appeared
because we've only merge part of my series into master.

IOW, I think Greg's proposed fix is fine as a workaround - once the
rest of my patches merge, build dir should not pollute this at all.

Regards,
Daniel
Greg Kurz Oct. 18, 2016, 3:36 p.m. UTC | #7
On Tue, 18 Oct 2016 16:16:46 +0100
"Daniel P. Berrange" <berrange@redhat.com> wrote:

> On Fri, Oct 14, 2016 at 04:31:01PM -0500, Eric Blake wrote:
> > On 10/14/2016 04:26 PM, Greg Kurz wrote:  
> > > Since commit "80dd5c4918ab trace: introduce a formal group name for trace
> > > events", tracetool generates C variable names and macro definitions out
> > > of the path to the trace-events-all file.
> > > 
> > > The current code takes care of turning '/' and '-' characters into
> > > underscores so that the resulting names are valid C tokens. This is
> > > enough because these are the only illegal characters that appear in
> > > a relative path within the QEMU source tree.
> > > 
> > > Things are different for out of tree builds where the path may contain
> > > arbitrary character combinations, causing tracetool to generate invalid
> > > names.
> > >   
> >   
> > > This patch ensures that only letters [A-Za-z], digits [0-9] and underscores
> > > are kept. All other characters are turned into underscores. Also, since the
> > > first character of C symbol names cannot be a digit, an underscore is
> > > prepended to the group name.
> > > 
> > > Signed-off-by: Greg Kurz <groug@kaod.org>
> > > ---
> > >  scripts/tracetool.py |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/scripts/tracetool.py b/scripts/tracetool.py
> > > index 629b2593c846..b81b834db924 100755
> > > --- a/scripts/tracetool.py
> > > +++ b/scripts/tracetool.py
> > > @@ -70,7 +70,7 @@ def make_group_name(filename):
> > >  
> > >      if dirname == "":
> > >          return "common"
> > > -    return re.sub(r"/|-", "_", dirname)
> > > +    return "_" + re.sub(r"[^\w]", "_", dirname)  
> > 
> > This STILL doesn't solve the complaint that the build is now dependent
> > on the location.  Why can't we STRIP off any prefix prior to the in-tree
> > portion of the naming that we know is sane, instead of munging the
> > prefix but in the process creating source code that generates with
> > different lengths?
> > 
> > Ideally, compiling twice, once in directory 'a', and the second time in
> > directory 'aaaaaaaaaaaaaaaaaaaaaaaaaaaa', should not make a noticeable
> > difference in the final size of the executable due to the difference in
> > lengths of the debugging symbols used to record the longer name of the
> > second directory being encoded into lots of macro names.  
> 
> This is a mistake on my part - the code was supposed to be stripping
> off the build directory prefix, leaving only the relative path to the
> file wrt source directory. The code is simply wrong as is.
> 

I don't know how that can be done without tracetool having knowledge of
what the build directory actually is. My first thought was to rely on
os.getcwd() since tracetool is invoked in the build directory, but I'm
now inclined to implement --group and let the caller (i.e. the makefile)
decide for the group name.

Cc'ing Peter who had some thoughts on the question.

> Regards,
> Daniel

Cheers.

--
Greg
Greg Kurz Oct. 18, 2016, 3:52 p.m. UTC | #8
On Tue, 18 Oct 2016 16:31:24 +0100
"Daniel P. Berrange" <berrange@redhat.com> wrote:

> On Tue, Oct 18, 2016 at 04:16:46PM +0100, Daniel P. Berrange wrote:
> > On Fri, Oct 14, 2016 at 04:31:01PM -0500, Eric Blake wrote:  
> > > On 10/14/2016 04:26 PM, Greg Kurz wrote:  
> > > > Since commit "80dd5c4918ab trace: introduce a formal group name for trace
> > > > events", tracetool generates C variable names and macro definitions out
> > > > of the path to the trace-events-all file.
> > > > 
> > > > The current code takes care of turning '/' and '-' characters into
> > > > underscores so that the resulting names are valid C tokens. This is
> > > > enough because these are the only illegal characters that appear in
> > > > a relative path within the QEMU source tree.
> > > > 
> > > > Things are different for out of tree builds where the path may contain
> > > > arbitrary character combinations, causing tracetool to generate invalid
> > > > names.
> > > >   
> > >   
> > > > This patch ensures that only letters [A-Za-z], digits [0-9] and underscores
> > > > are kept. All other characters are turned into underscores. Also, since the
> > > > first character of C symbol names cannot be a digit, an underscore is
> > > > prepended to the group name.
> > > > 
> > > > Signed-off-by: Greg Kurz <groug@kaod.org>
> > > > ---
> > > >  scripts/tracetool.py |    2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/scripts/tracetool.py b/scripts/tracetool.py
> > > > index 629b2593c846..b81b834db924 100755
> > > > --- a/scripts/tracetool.py
> > > > +++ b/scripts/tracetool.py
> > > > @@ -70,7 +70,7 @@ def make_group_name(filename):
> > > >  
> > > >      if dirname == "":
> > > >          return "common"
> > > > -    return re.sub(r"/|-", "_", dirname)
> > > > +    return "_" + re.sub(r"[^\w]", "_", dirname)  
> > > 
> > > This STILL doesn't solve the complaint that the build is now dependent
> > > on the location.  Why can't we STRIP off any prefix prior to the in-tree
> > > portion of the naming that we know is sane, instead of munging the
> > > prefix but in the process creating source code that generates with
> > > different lengths?
> > > 
> > > Ideally, compiling twice, once in directory 'a', and the second time in
> > > directory 'aaaaaaaaaaaaaaaaaaaaaaaaaaaa', should not make a noticeable
> > > difference in the final size of the executable due to the difference in
> > > lengths of the debugging symbols used to record the longer name of the
> > > second directory being encoded into lots of macro names.  
> > 
> > This is a mistake on my part - the code was supposed to be stripping
> > off the build directory prefix, leaving only the relative path to the
> > file wrt source directory. The code is simply wrong as is.  
> 
> Ah ha, I realize what the issue is.
> 
> Currently in git master we have multiple trace-events files and we merge
> them into a single trace-events-all file, then generate the various
> bits we need. This trace-events-all file is naturally in the build
> dir, not the source dir
> 
> In my trace events patch build refactor series though, I have stopped
> creating trace-events-all, and we instead generate bits directly from
> the trace-events files in source dir. So this problem only appeared
> because we've only merge part of my series into master.
> 

Heh commit 80dd5c4918ab then makes sense in this scenario, except perhaps
the nit about --group in the changelog.

> IOW, I think Greg's proposed fix is fine as a workaround - once the
> rest of my patches merge, build dir should not pollute this at all.
> 

I have two other patches ready to fix the current situation:
- one using os.getcwd() to guess the build directory
- one implementing --group as mentioned in my other mail

But the one that filters unwanted characters is a less intrusive
workaround.

> Regards,
> Daniel

Cheers.

---
Greg
Stefan Hajnoczi Oct. 20, 2016, 2:25 p.m. UTC | #9
On Tue, Oct 18, 2016 at 05:52:00PM +0200, Greg Kurz wrote:
> On Tue, 18 Oct 2016 16:31:24 +0100
> "Daniel P. Berrange" <berrange@redhat.com> wrote:
> 
> > On Tue, Oct 18, 2016 at 04:16:46PM +0100, Daniel P. Berrange wrote:
> > > On Fri, Oct 14, 2016 at 04:31:01PM -0500, Eric Blake wrote:  
> > > > On 10/14/2016 04:26 PM, Greg Kurz wrote:  
> > > > > Since commit "80dd5c4918ab trace: introduce a formal group name for trace
> > > > > events", tracetool generates C variable names and macro definitions out
> > > > > of the path to the trace-events-all file.
> > > > > 
> > > > > The current code takes care of turning '/' and '-' characters into
> > > > > underscores so that the resulting names are valid C tokens. This is
> > > > > enough because these are the only illegal characters that appear in
> > > > > a relative path within the QEMU source tree.
> > > > > 
> > > > > Things are different for out of tree builds where the path may contain
> > > > > arbitrary character combinations, causing tracetool to generate invalid
> > > > > names.
> > > > >   
> > > >   
> > > > > This patch ensures that only letters [A-Za-z], digits [0-9] and underscores
> > > > > are kept. All other characters are turned into underscores. Also, since the
> > > > > first character of C symbol names cannot be a digit, an underscore is
> > > > > prepended to the group name.
> > > > > 
> > > > > Signed-off-by: Greg Kurz <groug@kaod.org>
> > > > > ---
> > > > >  scripts/tracetool.py |    2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/scripts/tracetool.py b/scripts/tracetool.py
> > > > > index 629b2593c846..b81b834db924 100755
> > > > > --- a/scripts/tracetool.py
> > > > > +++ b/scripts/tracetool.py
> > > > > @@ -70,7 +70,7 @@ def make_group_name(filename):
> > > > >  
> > > > >      if dirname == "":
> > > > >          return "common"
> > > > > -    return re.sub(r"/|-", "_", dirname)
> > > > > +    return "_" + re.sub(r"[^\w]", "_", dirname)  
> > > > 
> > > > This STILL doesn't solve the complaint that the build is now dependent
> > > > on the location.  Why can't we STRIP off any prefix prior to the in-tree
> > > > portion of the naming that we know is sane, instead of munging the
> > > > prefix but in the process creating source code that generates with
> > > > different lengths?
> > > > 
> > > > Ideally, compiling twice, once in directory 'a', and the second time in
> > > > directory 'aaaaaaaaaaaaaaaaaaaaaaaaaaaa', should not make a noticeable
> > > > difference in the final size of the executable due to the difference in
> > > > lengths of the debugging symbols used to record the longer name of the
> > > > second directory being encoded into lots of macro names.  
> > > 
> > > This is a mistake on my part - the code was supposed to be stripping
> > > off the build directory prefix, leaving only the relative path to the
> > > file wrt source directory. The code is simply wrong as is.  
> > 
> > Ah ha, I realize what the issue is.
> > 
> > Currently in git master we have multiple trace-events files and we merge
> > them into a single trace-events-all file, then generate the various
> > bits we need. This trace-events-all file is naturally in the build
> > dir, not the source dir
> > 
> > In my trace events patch build refactor series though, I have stopped
> > creating trace-events-all, and we instead generate bits directly from
> > the trace-events files in source dir. So this problem only appeared
> > because we've only merge part of my series into master.
> > 
> 
> Heh commit 80dd5c4918ab then makes sense in this scenario, except perhaps
> the nit about --group in the changelog.
> 
> > IOW, I think Greg's proposed fix is fine as a workaround - once the
> > rest of my patches merge, build dir should not pollute this at all.
> > 
> 
> I have two other patches ready to fix the current situation:
> - one using os.getcwd() to guess the build directory
> - one implementing --group as mentioned in my other mail
> 
> But the one that filters unwanted characters is a less intrusive
> workaround.

If Dan's patches will eliminate the issue then we can take a workaround.

Any more comments about Greg's patch before I merge it?

Stefan
Fam Zheng Nov. 17, 2016, 8:07 a.m. UTC | #10
On Thu, 10/20 15:25, Stefan Hajnoczi wrote:
> > 
> > I have two other patches ready to fix the current situation:
> > - one using os.getcwd() to guess the build directory
> > - one implementing --group as mentioned in my other mail
> > 
> > But the one that filters unwanted characters is a less intrusive
> > workaround.
> 
> If Dan's patches will eliminate the issue then we can take a workaround.
> 
> Any more comments about Greg's patch before I merge it?

Should we include this in -rc1? I still see a build error today.

Fam
Greg Kurz Nov. 17, 2016, 9:10 a.m. UTC | #11
On Thu, 17 Nov 2016 16:07:38 +0800
Fam Zheng <famz@redhat.com> wrote:

> On Thu, 10/20 15:25, Stefan Hajnoczi wrote:
> > > 
> > > I have two other patches ready to fix the current situation:
> > > - one using os.getcwd() to guess the build directory
> > > - one implementing --group as mentioned in my other mail
> > > 
> > > But the one that filters unwanted characters is a less intrusive
> > > workaround.  
> > 
> > If Dan's patches will eliminate the issue then we can take a workaround.
> > 
> > Any more comments about Greg's patch before I merge it?  
> 
> Should we include this in -rc1? I still see a build error today.
> 
> Fam
> 

Hi Fam,

My patch was partly superseded by this commit:

commit 630b210b9abbf362905a2096c22c5eb1d6224e77
Author: Stefan Weil <sw@weilnetz.de>
Date:   Thu Oct 13 20:29:30 2016 +0200

    Fix build for less common build directories names

which does:

-    return re.sub(r"/|-", "_", dirname)
+    return re.sub(r"[^A-Za-z0-9]", "_", dirname)

What is the build error you're hitting ?

--
Greg
Fam Zheng Nov. 17, 2016, 9:34 a.m. UTC | #12
On Thu, 11/17 10:10, Greg Kurz wrote:
> On Thu, 17 Nov 2016 16:07:38 +0800
> Fam Zheng <famz@redhat.com> wrote:
> 
> > On Thu, 10/20 15:25, Stefan Hajnoczi wrote:
> > > > 
> > > > I have two other patches ready to fix the current situation:
> > > > - one using os.getcwd() to guess the build directory
> > > > - one implementing --group as mentioned in my other mail
> > > > 
> > > > But the one that filters unwanted characters is a less intrusive
> > > > workaround.  
> > > 
> > > If Dan's patches will eliminate the issue then we can take a workaround.
> > > 
> > > Any more comments about Greg's patch before I merge it?  
> > 
> > Should we include this in -rc1? I still see a build error today.
> > 
> > Fam
> > 
> 
> Hi Fam,
> 
> My patch was partly superseded by this commit:
> 
> commit 630b210b9abbf362905a2096c22c5eb1d6224e77
> Author: Stefan Weil <sw@weilnetz.de>
> Date:   Thu Oct 13 20:29:30 2016 +0200
> 
>     Fix build for less common build directories names
> 
> which does:
> 
> -    return re.sub(r"/|-", "_", dirname)
> +    return re.sub(r"[^A-Za-z0-9]", "_", dirname)
> 
> What is the build error you're hitting ?

The substracted parts (basedir and dirname) are different paths for out-of-tree
build, and the "dirname" after the nonsense substraction as fed into re.sub()
could still start with a number. In my case the source is at

    /var/tmp/aaa-qemu-clone

and the build dir is

    /var/tmp/qemu-aio-poll-v2

Then I get an error as:

trace/generated-tracers.c:15950:13: error: invalid suffix "_trace_events" on integer constant
 TraceEvent *2_trace_events[] = {
             ^
trace/generated-tracers.c:15950:13: error: expected identifier or ‘(’ before numeric constant
trace/generated-tracers.c: In function ‘trace_2_register_events’:
trace/generated-tracers.c:17949:32: error: invalid suffix "_trace_events" on integer constant
     trace_event_register_group(2_trace_events);
                                ^
make: *** [trace/generated-tracers.o] Error 1
Greg Kurz Nov. 17, 2016, 9:48 a.m. UTC | #13
On Thu, 17 Nov 2016 17:34:44 +0800
Fam Zheng <famz@redhat.com> wrote:

> On Thu, 11/17 10:10, Greg Kurz wrote:
> > On Thu, 17 Nov 2016 16:07:38 +0800
> > Fam Zheng <famz@redhat.com> wrote:
> >   
> > > On Thu, 10/20 15:25, Stefan Hajnoczi wrote:  
> > > > > 
> > > > > I have two other patches ready to fix the current situation:
> > > > > - one using os.getcwd() to guess the build directory
> > > > > - one implementing --group as mentioned in my other mail
> > > > > 
> > > > > But the one that filters unwanted characters is a less intrusive
> > > > > workaround.    
> > > > 
> > > > If Dan's patches will eliminate the issue then we can take a workaround.
> > > > 
> > > > Any more comments about Greg's patch before I merge it?    
> > > 
> > > Should we include this in -rc1? I still see a build error today.
> > > 
> > > Fam
> > >   
> > 
> > Hi Fam,
> > 
> > My patch was partly superseded by this commit:
> > 
> > commit 630b210b9abbf362905a2096c22c5eb1d6224e77
> > Author: Stefan Weil <sw@weilnetz.de>
> > Date:   Thu Oct 13 20:29:30 2016 +0200
> > 
> >     Fix build for less common build directories names
> > 
> > which does:
> > 
> > -    return re.sub(r"/|-", "_", dirname)
> > +    return re.sub(r"[^A-Za-z0-9]", "_", dirname)
> > 
> > What is the build error you're hitting ?  
> 
> The substracted parts (basedir and dirname) are different paths for out-of-tree
> build, and the "dirname" after the nonsense substraction as fed into re.sub()
> could still start with a number. In my case the source is at
> 
>     /var/tmp/aaa-qemu-clone
> 
> and the build dir is
> 
>     /var/tmp/qemu-aio-poll-v2
> 
> Then I get an error as:
> 
> trace/generated-tracers.c:15950:13: error: invalid suffix "_trace_events" on integer constant
>  TraceEvent *2_trace_events[] = {
>              ^
> trace/generated-tracers.c:15950:13: error: expected identifier or ‘(’ before numeric constant
> trace/generated-tracers.c: In function ‘trace_2_register_events’:
> trace/generated-tracers.c:17949:32: error: invalid suffix "_trace_events" on integer constant
>      trace_event_register_group(2_trace_events);
>                                 ^
> make: *** [trace/generated-tracers.o] Error 1

My patch was addressing this as well... it is unfortunate it got dumped. :-\

I guess the following should be enough to fix your issue... but I'm no tracetool
expert and I cannot assure it doesn't break anything elsewhere...

-    return re.sub(r"[^A-Za-z0-9]", "_", dirname)
+    return "_" + re.sub(r"[^A-Za-z0-9]", "_", dirname)

Cheers.

--
Greg
Fam Zheng Nov. 17, 2016, 10 a.m. UTC | #14
On Thu, 11/17 10:48, Greg Kurz wrote:
> On Thu, 17 Nov 2016 17:34:44 +0800
> Fam Zheng <famz@redhat.com> wrote:
> 
> > On Thu, 11/17 10:10, Greg Kurz wrote:
> > > On Thu, 17 Nov 2016 16:07:38 +0800
> > > Fam Zheng <famz@redhat.com> wrote:
> > >   
> > > > On Thu, 10/20 15:25, Stefan Hajnoczi wrote:  
> > > > > > 
> > > > > > I have two other patches ready to fix the current situation:
> > > > > > - one using os.getcwd() to guess the build directory
> > > > > > - one implementing --group as mentioned in my other mail
> > > > > > 
> > > > > > But the one that filters unwanted characters is a less intrusive
> > > > > > workaround.    
> > > > > 
> > > > > If Dan's patches will eliminate the issue then we can take a workaround.
> > > > > 
> > > > > Any more comments about Greg's patch before I merge it?    
> > > > 
> > > > Should we include this in -rc1? I still see a build error today.
> > > > 
> > > > Fam
> > > >   
> > > 
> > > Hi Fam,
> > > 
> > > My patch was partly superseded by this commit:
> > > 
> > > commit 630b210b9abbf362905a2096c22c5eb1d6224e77
> > > Author: Stefan Weil <sw@weilnetz.de>
> > > Date:   Thu Oct 13 20:29:30 2016 +0200
> > > 
> > >     Fix build for less common build directories names
> > > 
> > > which does:
> > > 
> > > -    return re.sub(r"/|-", "_", dirname)
> > > +    return re.sub(r"[^A-Za-z0-9]", "_", dirname)
> > > 
> > > What is the build error you're hitting ?  
> > 
> > The substracted parts (basedir and dirname) are different paths for out-of-tree
> > build, and the "dirname" after the nonsense substraction as fed into re.sub()
> > could still start with a number. In my case the source is at
> > 
> >     /var/tmp/aaa-qemu-clone
> > 
> > and the build dir is
> > 
> >     /var/tmp/qemu-aio-poll-v2
> > 
> > Then I get an error as:
> > 
> > trace/generated-tracers.c:15950:13: error: invalid suffix "_trace_events" on integer constant
> >  TraceEvent *2_trace_events[] = {
> >              ^
> > trace/generated-tracers.c:15950:13: error: expected identifier or ‘(’ before numeric constant
> > trace/generated-tracers.c: In function ‘trace_2_register_events’:
> > trace/generated-tracers.c:17949:32: error: invalid suffix "_trace_events" on integer constant
> >      trace_event_register_group(2_trace_events);
> >                                 ^
> > make: *** [trace/generated-tracers.o] Error 1
> 
> My patch was addressing this as well... it is unfortunate it got dumped. :-\
> 
> I guess the following should be enough to fix your issue... but I'm no tracetool
> expert and I cannot assure it doesn't break anything elsewhere...
> 
> -    return re.sub(r"[^A-Za-z0-9]", "_", dirname)
> +    return "_" + re.sub(r"[^A-Za-z0-9]", "_", dirname)

Yes that does make a workaround for me.

Fam
Stefan Hajnoczi Nov. 17, 2016, 10:05 a.m. UTC | #15
On Thu, Nov 17, 2016 at 10:48:34AM +0100, Greg Kurz wrote:
> On Thu, 17 Nov 2016 17:34:44 +0800
> Fam Zheng <famz@redhat.com> wrote:
> 
> > On Thu, 11/17 10:10, Greg Kurz wrote:
> > > On Thu, 17 Nov 2016 16:07:38 +0800
> > > Fam Zheng <famz@redhat.com> wrote:
> > >   
> > > > On Thu, 10/20 15:25, Stefan Hajnoczi wrote:  
> > > > > > 
> > > > > > I have two other patches ready to fix the current situation:
> > > > > > - one using os.getcwd() to guess the build directory
> > > > > > - one implementing --group as mentioned in my other mail
> > > > > > 
> > > > > > But the one that filters unwanted characters is a less intrusive
> > > > > > workaround.    
> > > > > 
> > > > > If Dan's patches will eliminate the issue then we can take a workaround.
> > > > > 
> > > > > Any more comments about Greg's patch before I merge it?    
> > > > 
> > > > Should we include this in -rc1? I still see a build error today.
> > > > 
> > > > Fam
> > > >   
> > > 
> > > Hi Fam,
> > > 
> > > My patch was partly superseded by this commit:
> > > 
> > > commit 630b210b9abbf362905a2096c22c5eb1d6224e77
> > > Author: Stefan Weil <sw@weilnetz.de>
> > > Date:   Thu Oct 13 20:29:30 2016 +0200
> > > 
> > >     Fix build for less common build directories names
> > > 
> > > which does:
> > > 
> > > -    return re.sub(r"/|-", "_", dirname)
> > > +    return re.sub(r"[^A-Za-z0-9]", "_", dirname)
> > > 
> > > What is the build error you're hitting ?  
> > 
> > The substracted parts (basedir and dirname) are different paths for out-of-tree
> > build, and the "dirname" after the nonsense substraction as fed into re.sub()
> > could still start with a number. In my case the source is at
> > 
> >     /var/tmp/aaa-qemu-clone
> > 
> > and the build dir is
> > 
> >     /var/tmp/qemu-aio-poll-v2
> > 
> > Then I get an error as:
> > 
> > trace/generated-tracers.c:15950:13: error: invalid suffix "_trace_events" on integer constant
> >  TraceEvent *2_trace_events[] = {
> >              ^
> > trace/generated-tracers.c:15950:13: error: expected identifier or ‘(’ before numeric constant
> > trace/generated-tracers.c: In function ‘trace_2_register_events’:
> > trace/generated-tracers.c:17949:32: error: invalid suffix "_trace_events" on integer constant
> >      trace_event_register_group(2_trace_events);
> >                                 ^
> > make: *** [trace/generated-tracers.o] Error 1
> 
> My patch was addressing this as well... it is unfortunate it got dumped. :-\
> 
> I guess the following should be enough to fix your issue... but I'm no tracetool
> expert and I cannot assure it doesn't break anything elsewhere...
> 
> -    return re.sub(r"[^A-Za-z0-9]", "_", dirname)
> +    return "_" + re.sub(r"[^A-Za-z0-9]", "_", dirname)

I think that will solve the problem for QEMU 2.8.  Please send a patch.

Thanks,
Stefan
diff mbox

Patch

diff --git a/scripts/tracetool.py b/scripts/tracetool.py
index 629b2593c846..b81b834db924 100755
--- a/scripts/tracetool.py
+++ b/scripts/tracetool.py
@@ -70,7 +70,7 @@  def make_group_name(filename):
 
     if dirname == "":
         return "common"
-    return re.sub(r"/|-", "_", dirname)
+    return "_" + re.sub(r"[^\w]", "_", dirname)
 
 def main(args):
     global _SCRIPT