diff mbox series

[XEN,v4,14/18] xen,symbols: rework file symbols selection

Message ID 20200331103102.1105674-15-anthony.perard@citrix.com (mailing list archive)
State Superseded
Headers show
Series xen: Build system improvements | expand

Commit Message

Anthony PERARD March 31, 2020, 10:30 a.m. UTC
We want to use the same rune to build mm/*/guest_*.o as the one use to
build every other *.o object. The consequence it that file symbols that
the program ./symbols prefer changes with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y.

(1) Currently we have those two file symbols:
    guest_walk.c
    guest_walk_2.o
(2) with CONFIG_ENFORCE_UNIQUE_SYMBOLS used on guest_walk.c, we will have:
    arch/x86/mm/guest_walk.c
    guest_walk_2.o

The order in which those symbols are present may be different.

Currently, in case (1) ./symbols chooses the *.o symbol (object file
name). But in case (2), may choose the *.c symbol (source file name with
path component) if it is first

We want to have ./symbols choose the object file name symbol in both
cases. So this patch changes that ./symbols prefer the "object file
name" symbol over the "source file name with path component" symbols.

The new intended order of preference is:
    - first object file name symbol
    - first source file name with path components symbol
    - last source file name without any path component symbol

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---

Notes:
    v4:
    - rescope enum symbol_type
    - remove setting values to enums, as it's not needed.
    - rename the enumeration symbols
    
    commmit rewriting:
    
    We want to use the same rune to build mm/*/guest_*.o as the one use to
    build every other *.o object. The consequence it that file symbols that
    the program ./symbols prefere changes with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y.
    
    (1) Currently we have those two file symboles:
        guest_walk.c
        guest_walk_2.o
    (2) with CONFIG_ENFORCE_UNIQUE_SYMBOLS used on guest_walk.c, we will have:
        arch/x86/mm/guest_walk.c
        guest_walk_2.o
    
    The order in which those symbols are present may be different.
    
    Currently, in case (1) ./symbols chooses the *.o symbol (object file
    name). But in case (2), may choose the *.c symbol (source file name with
    path component) if it is first.
    
    This patch changes that ./symbols prefere the "object file name" symbol over
    the "source file name with path component" symbols.

 xen/tools/symbols.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

Comments

Jan Beulich April 8, 2020, 12:54 p.m. UTC | #1
On 31.03.2020 12:30, Anthony PERARD wrote:
> We want to use the same rune to build mm/*/guest_*.o as the one use to
> build every other *.o object. The consequence it that file symbols that
> the program ./symbols prefer changes with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y.
> 
> (1) Currently we have those two file symbols:
>     guest_walk.c
>     guest_walk_2.o
> (2) with CONFIG_ENFORCE_UNIQUE_SYMBOLS used on guest_walk.c, we will have:
>     arch/x86/mm/guest_walk.c
>     guest_walk_2.o
> 
> The order in which those symbols are present may be different.
> 
> Currently, in case (1) ./symbols chooses the *.o symbol (object file
> name). But in case (2), may choose the *.c symbol (source file name with
> path component) if it is first
> 
> We want to have ./symbols choose the object file name symbol in both
> cases.

I guess the reason for wanting this is somehow connected to the
statement at the beginning of the description, but I can't seem
to be able to make the connection.

> So this patch changes that ./symbols prefer the "object file
> name" symbol over the "source file name with path component" symbols.
> 
> The new intended order of preference is:
>     - first object file name symbol
>     - first source file name with path components symbol
>     - last source file name without any path component symbol

Isn't this going to lead to ambiguities again when
CONFIG_ENFORCE_UNIQUE_SYMBOLS? Several object files (in different
dirs) are named the same, after all. Static symbols with the same
name in such objects would hence resolve to the same kallsyms
name.

Jan
Anthony PERARD April 16, 2020, 12:44 p.m. UTC | #2
On Wed, Apr 08, 2020 at 02:54:35PM +0200, Jan Beulich wrote:
> On 31.03.2020 12:30, Anthony PERARD wrote:
> > We want to use the same rune to build mm/*/guest_*.o as the one use to
> > build every other *.o object. The consequence it that file symbols that
> > the program ./symbols prefer changes with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y.
> > 
> > (1) Currently we have those two file symbols:
> >     guest_walk.c
> >     guest_walk_2.o
> > (2) with CONFIG_ENFORCE_UNIQUE_SYMBOLS used on guest_walk.c, we will have:
> >     arch/x86/mm/guest_walk.c
> >     guest_walk_2.o
> > 
> > The order in which those symbols are present may be different.
> > 
> > Currently, in case (1) ./symbols chooses the *.o symbol (object file
> > name). But in case (2), may choose the *.c symbol (source file name with
> > path component) if it is first
> > 
> > We want to have ./symbols choose the object file name symbol in both
> > cases.
> 
> I guess the reason for wanting this is somehow connected to the
> statement at the beginning of the description, but I can't seem
> to be able to make the connection.

I'm not sure I can explain it better.

The "object file name" file symbol is used to distinguish between symbols
from all mm/*/guest_* objects. The other file symbol present in those
object is a "source file name without any path component symbol".

But building those objects with the same rune as any other objects, and
having CONFIG_ENFORCE_UNIQUE_SYMBOLS=y, changes the file symbols present
in the resulting object. We still have the "object file name" symbol,
but now we also have "source file name with path components" symbol.
Unfortunately, all mm/*/guest_*.o in one directory are built from the
same source file, and thus have the same "source file name" symbol, but
have different "object file name" symbol. We still want to be able to
distinguish between guest_*.o in one dir, and the only way for that is
to use the "object file name" symbol.

> > So this patch changes that ./symbols prefer the "object file
> > name" symbol over the "source file name with path component" symbols.
> > 
> > The new intended order of preference is:
> >     - first object file name symbol
> >     - first source file name with path components symbol
> >     - last source file name without any path component symbol
> 
> Isn't this going to lead to ambiguities again when
> CONFIG_ENFORCE_UNIQUE_SYMBOLS? Several object files (in different
> dirs) are named the same, after all. Static symbols with the same
> name in such objects would hence resolve to the same kallsyms
> name.

"object file name" symbol are only present in mm/*/guest_*.o objects,
they all have different basenames. There is no ambiguity here.

Thanks,
Jan Beulich April 16, 2020, 2:22 p.m. UTC | #3
On 16.04.2020 14:44, Anthony PERARD wrote:
> On Wed, Apr 08, 2020 at 02:54:35PM +0200, Jan Beulich wrote:
>> On 31.03.2020 12:30, Anthony PERARD wrote:
>>> We want to use the same rune to build mm/*/guest_*.o as the one use to
>>> build every other *.o object. The consequence it that file symbols that
>>> the program ./symbols prefer changes with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y.
>>>
>>> (1) Currently we have those two file symbols:
>>>     guest_walk.c
>>>     guest_walk_2.o
>>> (2) with CONFIG_ENFORCE_UNIQUE_SYMBOLS used on guest_walk.c, we will have:
>>>     arch/x86/mm/guest_walk.c
>>>     guest_walk_2.o
>>>
>>> The order in which those symbols are present may be different.
>>>
>>> Currently, in case (1) ./symbols chooses the *.o symbol (object file
>>> name). But in case (2), may choose the *.c symbol (source file name with
>>> path component) if it is first
>>>
>>> We want to have ./symbols choose the object file name symbol in both
>>> cases.
>>
>> I guess the reason for wanting this is somehow connected to the
>> statement at the beginning of the description, but I can't seem
>> to be able to make the connection.
> 
> I'm not sure I can explain it better.
> 
> The "object file name" file symbol is used to distinguish between symbols
> from all mm/*/guest_* objects. The other file symbol present in those
> object is a "source file name without any path component symbol".
> 
> But building those objects with the same rune as any other objects, and
> having CONFIG_ENFORCE_UNIQUE_SYMBOLS=y, changes the file symbols present
> in the resulting object. We still have the "object file name" symbol,
> but now we also have "source file name with path components" symbol.
> Unfortunately, all mm/*/guest_*.o in one directory are built from the
> same source file, and thus have the same "source file name" symbol, but
> have different "object file name" symbol. We still want to be able to
> distinguish between guest_*.o in one dir, and the only way for that is
> to use the "object file name" symbol.

So where's the difference from how things work right now? The "same rune"
aspect doesn't really change - right now we also build with effectively
the same logic, just that -DGUEST_PAGING_LEVELS=... gets added. I guess
it might help if you showed (for one particular example) how the set of
file symbols changes from what we have now (with and without
CONFIG_ENFORCE_UNIQUE_SYMBOLS=y) to what there would be with your changes
to the symbols utility to what there will be with those changes.

>>> So this patch changes that ./symbols prefer the "object file
>>> name" symbol over the "source file name with path component" symbols.
>>>
>>> The new intended order of preference is:
>>>     - first object file name symbol
>>>     - first source file name with path components symbol
>>>     - last source file name without any path component symbol
>>
>> Isn't this going to lead to ambiguities again when
>> CONFIG_ENFORCE_UNIQUE_SYMBOLS? Several object files (in different
>> dirs) are named the same, after all. Static symbols with the same
>> name in such objects would hence resolve to the same kallsyms
>> name.
> 
> "object file name" symbol are only present in mm/*/guest_*.o objects,
> they all have different basenames. There is no ambiguity here.

At least not right now, I see. Could you make this aspect more explicit
by adding something like "(present only in object files produced from
multiply compiled sources)" to the first bullet point?

Jan
Anthony PERARD April 16, 2020, 3:09 p.m. UTC | #4
On Thu, Apr 16, 2020 at 04:22:05PM +0200, Jan Beulich wrote:
> On 16.04.2020 14:44, Anthony PERARD wrote:
> > On Wed, Apr 08, 2020 at 02:54:35PM +0200, Jan Beulich wrote:
> >> On 31.03.2020 12:30, Anthony PERARD wrote:
> >>> We want to use the same rune to build mm/*/guest_*.o as the one use to
> >>> build every other *.o object. The consequence it that file symbols that
> >>> the program ./symbols prefer changes with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y.
> >>>
> >>> (1) Currently we have those two file symbols:
> >>>     guest_walk.c
> >>>     guest_walk_2.o
> >>> (2) with CONFIG_ENFORCE_UNIQUE_SYMBOLS used on guest_walk.c, we will have:
> >>>     arch/x86/mm/guest_walk.c
> >>>     guest_walk_2.o
> >>>
> >>> The order in which those symbols are present may be different.
> >>>
> >>> Currently, in case (1) ./symbols chooses the *.o symbol (object file
> >>> name). But in case (2), may choose the *.c symbol (source file name with
> >>> path component) if it is first
> >>>
> >>> We want to have ./symbols choose the object file name symbol in both
> >>> cases.
> >>
> >> I guess the reason for wanting this is somehow connected to the
> >> statement at the beginning of the description, but I can't seem
> >> to be able to make the connection.
> > 
> > I'm not sure I can explain it better.
> > 
> > The "object file name" file symbol is used to distinguish between symbols
> > from all mm/*/guest_* objects. The other file symbol present in those
> > object is a "source file name without any path component symbol".
> > 
> > But building those objects with the same rune as any other objects, and
> > having CONFIG_ENFORCE_UNIQUE_SYMBOLS=y, changes the file symbols present
> > in the resulting object. We still have the "object file name" symbol,
> > but now we also have "source file name with path components" symbol.
> > Unfortunately, all mm/*/guest_*.o in one directory are built from the
> > same source file, and thus have the same "source file name" symbol, but
> > have different "object file name" symbol. We still want to be able to
> > distinguish between guest_*.o in one dir, and the only way for that is
> > to use the "object file name" symbol.
> 
> So where's the difference from how things work right now? The "same rune"
> aspect doesn't really change - right now we also build with effectively
> the same logic, just that -DGUEST_PAGING_LEVELS=... gets added. I guess
> it might help if you showed (for one particular example) how the set of
> file symbols changes from what we have now (with and without
> CONFIG_ENFORCE_UNIQUE_SYMBOLS=y) to what there would be with your changes
> to the symbols utility to what there will be with those changes.

The logic to build objects from C files changed in 81ecb38b83b0 ("build:
provide option to disambiguate symbol names"), with objects build with
__OBJECT_FILE__ explicitly left alone. So the logic is different now (at
least when CONFIG_ENFORCE_UNIQUE_SYMBOLS=y).

I did add the example of building arch/x86/mm/guest_walk_2.o to the
commit message, reworded below:

For example, when building arch/x86/mm/guest_walk_2.o from guest_walk.c,
this would be the difference of file symbol present in the object when
building with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y:

(1) Currently we have those two file symbols:
    guest_walk.c
    guest_walk_2.o
(2) When building with the same rune, we will have:
    arch/x86/mm/guest_walk.c
    guest_walk_2.o

The order in which those symbols are present may be different. Building
without CONFIG_ENFORCE_UNIQUE_SYMBOLS will result in (1).


> >>> So this patch changes that ./symbols prefer the "object file
> >>> name" symbol over the "source file name with path component" symbols.
> >>>
> >>> The new intended order of preference is:
> >>>     - first object file name symbol
> >>>     - first source file name with path components symbol
> >>>     - last source file name without any path component symbol
> >>
> >> Isn't this going to lead to ambiguities again when
> >> CONFIG_ENFORCE_UNIQUE_SYMBOLS? Several object files (in different
> >> dirs) are named the same, after all. Static symbols with the same
> >> name in such objects would hence resolve to the same kallsyms
> >> name.
> > 
> > "object file name" symbol are only present in mm/*/guest_*.o objects,
> > they all have different basenames. There is no ambiguity here.
> 
> At least not right now, I see. Could you make this aspect more explicit
> by adding something like "(present only in object files produced from
> multiply compiled sources)" to the first bullet point?

Will do.
Jan Beulich April 17, 2020, 7:12 a.m. UTC | #5
On 16.04.2020 17:09, Anthony PERARD wrote:
> On Thu, Apr 16, 2020 at 04:22:05PM +0200, Jan Beulich wrote:
>> On 16.04.2020 14:44, Anthony PERARD wrote:
>>> On Wed, Apr 08, 2020 at 02:54:35PM +0200, Jan Beulich wrote:
>>>> On 31.03.2020 12:30, Anthony PERARD wrote:
>>>>> We want to use the same rune to build mm/*/guest_*.o as the one use to
>>>>> build every other *.o object. The consequence it that file symbols that
>>>>> the program ./symbols prefer changes with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y.
>>>>>
>>>>> (1) Currently we have those two file symbols:
>>>>>     guest_walk.c
>>>>>     guest_walk_2.o
>>>>> (2) with CONFIG_ENFORCE_UNIQUE_SYMBOLS used on guest_walk.c, we will have:
>>>>>     arch/x86/mm/guest_walk.c
>>>>>     guest_walk_2.o
>>>>>
>>>>> The order in which those symbols are present may be different.
>>>>>
>>>>> Currently, in case (1) ./symbols chooses the *.o symbol (object file
>>>>> name). But in case (2), may choose the *.c symbol (source file name with
>>>>> path component) if it is first
>>>>>
>>>>> We want to have ./symbols choose the object file name symbol in both
>>>>> cases.
>>>>
>>>> I guess the reason for wanting this is somehow connected to the
>>>> statement at the beginning of the description, but I can't seem
>>>> to be able to make the connection.
>>>
>>> I'm not sure I can explain it better.
>>>
>>> The "object file name" file symbol is used to distinguish between symbols
>>> from all mm/*/guest_* objects. The other file symbol present in those
>>> object is a "source file name without any path component symbol".
>>>
>>> But building those objects with the same rune as any other objects, and
>>> having CONFIG_ENFORCE_UNIQUE_SYMBOLS=y, changes the file symbols present
>>> in the resulting object. We still have the "object file name" symbol,
>>> but now we also have "source file name with path components" symbol.
>>> Unfortunately, all mm/*/guest_*.o in one directory are built from the
>>> same source file, and thus have the same "source file name" symbol, but
>>> have different "object file name" symbol. We still want to be able to
>>> distinguish between guest_*.o in one dir, and the only way for that is
>>> to use the "object file name" symbol.
>>
>> So where's the difference from how things work right now? The "same rune"
>> aspect doesn't really change - right now we also build with effectively
>> the same logic, just that -DGUEST_PAGING_LEVELS=... gets added. I guess
>> it might help if you showed (for one particular example) how the set of
>> file symbols changes from what we have now (with and without
>> CONFIG_ENFORCE_UNIQUE_SYMBOLS=y) to what there would be with your changes
>> to the symbols utility to what there will be with those changes.
> 
> The logic to build objects from C files changed in 81ecb38b83b0 ("build:
> provide option to disambiguate symbol names"), with objects build with
> __OBJECT_FILE__ explicitly left alone. So the logic is different now (at
> least when CONFIG_ENFORCE_UNIQUE_SYMBOLS=y).
> 
> I did add the example of building arch/x86/mm/guest_walk_2.o to the
> commit message, reworded below:
> 
> For example, when building arch/x86/mm/guest_walk_2.o from guest_walk.c,
> this would be the difference of file symbol present in the object when
> building with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y:
> 
> (1) Currently we have those two file symbols:
>     guest_walk.c
>     guest_walk_2.o
> (2) When building with the same rune, we will have:
>     arch/x86/mm/guest_walk.c
>     guest_walk_2.o

Ah, yes, the changed introductory paragraph makes clear (to me)
what presence and what future (1) and (2) are talking about. Yet
what I then still don't understand - what is it that makes the
path appear when switching to the common rune? Oh - I finally
figured it: It's the objcopy step that will apply to all targets
uniformly then. Perhaps it's indeed obvious, but it clearly
wasn't to me when merely looking at the patch.

With this I'd then wonder whether it wouldn't be a far smaller
adjustment to simply skip that --redefine-sym step in case the
object file already has a file symbol naming the object file,
thus simply retaining the status quo.

Jan
Anthony PERARD April 17, 2020, 1:19 p.m. UTC | #6
On Fri, Apr 17, 2020 at 09:12:11AM +0200, Jan Beulich wrote:
> On 16.04.2020 17:09, Anthony PERARD wrote:
> > On Thu, Apr 16, 2020 at 04:22:05PM +0200, Jan Beulich wrote:
> >> On 16.04.2020 14:44, Anthony PERARD wrote:
> >>> On Wed, Apr 08, 2020 at 02:54:35PM +0200, Jan Beulich wrote:
> >>>> On 31.03.2020 12:30, Anthony PERARD wrote:
> >>>>> We want to use the same rune to build mm/*/guest_*.o as the one use to
> >>>>> build every other *.o object. The consequence it that file symbols that
> >>>>> the program ./symbols prefer changes with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y.
> >>>>>
> >>>>> (1) Currently we have those two file symbols:
> >>>>>     guest_walk.c
> >>>>>     guest_walk_2.o
> >>>>> (2) with CONFIG_ENFORCE_UNIQUE_SYMBOLS used on guest_walk.c, we will have:
> >>>>>     arch/x86/mm/guest_walk.c
> >>>>>     guest_walk_2.o
> >>>>>
> >>>>> The order in which those symbols are present may be different.
> >>>>>
> >>>>> Currently, in case (1) ./symbols chooses the *.o symbol (object file
> >>>>> name). But in case (2), may choose the *.c symbol (source file name with
> >>>>> path component) if it is first
> >>>>>
> >>>>> We want to have ./symbols choose the object file name symbol in both
> >>>>> cases.
> >>>>
> >>>> I guess the reason for wanting this is somehow connected to the
> >>>> statement at the beginning of the description, but I can't seem
> >>>> to be able to make the connection.
> >>>
> >>> I'm not sure I can explain it better.
> >>>
> >>> The "object file name" file symbol is used to distinguish between symbols
> >>> from all mm/*/guest_* objects. The other file symbol present in those
> >>> object is a "source file name without any path component symbol".
> >>>
> >>> But building those objects with the same rune as any other objects, and
> >>> having CONFIG_ENFORCE_UNIQUE_SYMBOLS=y, changes the file symbols present
> >>> in the resulting object. We still have the "object file name" symbol,
> >>> but now we also have "source file name with path components" symbol.
> >>> Unfortunately, all mm/*/guest_*.o in one directory are built from the
> >>> same source file, and thus have the same "source file name" symbol, but
> >>> have different "object file name" symbol. We still want to be able to
> >>> distinguish between guest_*.o in one dir, and the only way for that is
> >>> to use the "object file name" symbol.
> >>
> >> So where's the difference from how things work right now? The "same rune"
> >> aspect doesn't really change - right now we also build with effectively
> >> the same logic, just that -DGUEST_PAGING_LEVELS=... gets added. I guess
> >> it might help if you showed (for one particular example) how the set of
> >> file symbols changes from what we have now (with and without
> >> CONFIG_ENFORCE_UNIQUE_SYMBOLS=y) to what there would be with your changes
> >> to the symbols utility to what there will be with those changes.
> > 
> > The logic to build objects from C files changed in 81ecb38b83b0 ("build:
> > provide option to disambiguate symbol names"), with objects build with
> > __OBJECT_FILE__ explicitly left alone. So the logic is different now (at
> > least when CONFIG_ENFORCE_UNIQUE_SYMBOLS=y).
> > 
> > I did add the example of building arch/x86/mm/guest_walk_2.o to the
> > commit message, reworded below:
> > 
> > For example, when building arch/x86/mm/guest_walk_2.o from guest_walk.c,
> > this would be the difference of file symbol present in the object when
> > building with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y:
> > 
> > (1) Currently we have those two file symbols:
> >     guest_walk.c
> >     guest_walk_2.o
> > (2) When building with the same rune, we will have:
> >     arch/x86/mm/guest_walk.c
> >     guest_walk_2.o
> 
> Ah, yes, the changed introductory paragraph makes clear (to me)
> what presence and what future (1) and (2) are talking about. Yet
> what I then still don't understand - what is it that makes the
> path appear when switching to the common rune? Oh - I finally
> figured it: It's the objcopy step that will apply to all targets
> uniformly then. Perhaps it's indeed obvious, but it clearly
> wasn't to me when merely looking at the patch.
>
> With this I'd then wonder whether it wouldn't be a far smaller
> adjustment to simply skip that --redefine-sym step in case the
> object file already has a file symbol naming the object file,
> thus simply retaining the status quo.

So, we should call `nm' thousands of time, to find out if calling
`objcopy' is needed, for the 9 objects that doesn't need the extra step?

Or do you mean keeping exception to the rule? And hope that when someone
changes the rule, it doesn't forget to check if the exception needs
changing as well?

Also, I'm going to have to use this patch later anyway as sometime CC
use a full path to the source as file symbol. So this is going to be
important when we will run for example
`clang -o arch/x86/mm/guest_walk_2.o arch/x86/mm/guest_walk.c`.
(There isn't a patch for that yet.)
Jan Beulich April 17, 2020, 1:39 p.m. UTC | #7
On 17.04.2020 15:19, Anthony PERARD wrote:
> On Fri, Apr 17, 2020 at 09:12:11AM +0200, Jan Beulich wrote:
>> On 16.04.2020 17:09, Anthony PERARD wrote:
>>> On Thu, Apr 16, 2020 at 04:22:05PM +0200, Jan Beulich wrote:
>>>> On 16.04.2020 14:44, Anthony PERARD wrote:
>>>>> On Wed, Apr 08, 2020 at 02:54:35PM +0200, Jan Beulich wrote:
>>>>>> On 31.03.2020 12:30, Anthony PERARD wrote:
>>>>>>> We want to use the same rune to build mm/*/guest_*.o as the one use to
>>>>>>> build every other *.o object. The consequence it that file symbols that
>>>>>>> the program ./symbols prefer changes with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y.
>>>>>>>
>>>>>>> (1) Currently we have those two file symbols:
>>>>>>>     guest_walk.c
>>>>>>>     guest_walk_2.o
>>>>>>> (2) with CONFIG_ENFORCE_UNIQUE_SYMBOLS used on guest_walk.c, we will have:
>>>>>>>     arch/x86/mm/guest_walk.c
>>>>>>>     guest_walk_2.o
>>>>>>>
>>>>>>> The order in which those symbols are present may be different.
>>>>>>>
>>>>>>> Currently, in case (1) ./symbols chooses the *.o symbol (object file
>>>>>>> name). But in case (2), may choose the *.c symbol (source file name with
>>>>>>> path component) if it is first
>>>>>>>
>>>>>>> We want to have ./symbols choose the object file name symbol in both
>>>>>>> cases.
>>>>>>
>>>>>> I guess the reason for wanting this is somehow connected to the
>>>>>> statement at the beginning of the description, but I can't seem
>>>>>> to be able to make the connection.
>>>>>
>>>>> I'm not sure I can explain it better.
>>>>>
>>>>> The "object file name" file symbol is used to distinguish between symbols
>>>>> from all mm/*/guest_* objects. The other file symbol present in those
>>>>> object is a "source file name without any path component symbol".
>>>>>
>>>>> But building those objects with the same rune as any other objects, and
>>>>> having CONFIG_ENFORCE_UNIQUE_SYMBOLS=y, changes the file symbols present
>>>>> in the resulting object. We still have the "object file name" symbol,
>>>>> but now we also have "source file name with path components" symbol.
>>>>> Unfortunately, all mm/*/guest_*.o in one directory are built from the
>>>>> same source file, and thus have the same "source file name" symbol, but
>>>>> have different "object file name" symbol. We still want to be able to
>>>>> distinguish between guest_*.o in one dir, and the only way for that is
>>>>> to use the "object file name" symbol.
>>>>
>>>> So where's the difference from how things work right now? The "same rune"
>>>> aspect doesn't really change - right now we also build with effectively
>>>> the same logic, just that -DGUEST_PAGING_LEVELS=... gets added. I guess
>>>> it might help if you showed (for one particular example) how the set of
>>>> file symbols changes from what we have now (with and without
>>>> CONFIG_ENFORCE_UNIQUE_SYMBOLS=y) to what there would be with your changes
>>>> to the symbols utility to what there will be with those changes.
>>>
>>> The logic to build objects from C files changed in 81ecb38b83b0 ("build:
>>> provide option to disambiguate symbol names"), with objects build with
>>> __OBJECT_FILE__ explicitly left alone. So the logic is different now (at
>>> least when CONFIG_ENFORCE_UNIQUE_SYMBOLS=y).
>>>
>>> I did add the example of building arch/x86/mm/guest_walk_2.o to the
>>> commit message, reworded below:
>>>
>>> For example, when building arch/x86/mm/guest_walk_2.o from guest_walk.c,
>>> this would be the difference of file symbol present in the object when
>>> building with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y:
>>>
>>> (1) Currently we have those two file symbols:
>>>     guest_walk.c
>>>     guest_walk_2.o
>>> (2) When building with the same rune, we will have:
>>>     arch/x86/mm/guest_walk.c
>>>     guest_walk_2.o
>>
>> Ah, yes, the changed introductory paragraph makes clear (to me)
>> what presence and what future (1) and (2) are talking about. Yet
>> what I then still don't understand - what is it that makes the
>> path appear when switching to the common rune? Oh - I finally
>> figured it: It's the objcopy step that will apply to all targets
>> uniformly then. Perhaps it's indeed obvious, but it clearly
>> wasn't to me when merely looking at the patch.
>>
>> With this I'd then wonder whether it wouldn't be a far smaller
>> adjustment to simply skip that --redefine-sym step in case the
>> object file already has a file symbol naming the object file,
>> thus simply retaining the status quo.
> 
> So, we should call `nm' thousands of time, to find out if calling
> `objcopy' is needed, for the 9 objects that doesn't need the extra step?

Well that (or rather objdump) was what I was thinking while writing
the earlier reply, but you have a point - I can see how treating
the bigger change for less build time might be worth it. Yet ...

> Or do you mean keeping exception to the rule? And hope that when someone
> changes the rule, it doesn't forget to check if the exception needs
> changing as well?

... "exception" like you put it (requiring special care to keep
multiple instances in sync) is not the only way this can be done
(and indeed I'd not want something like this). Since you have
(in patch 15) e.g.

guest_walk_%.o: guest_walk.c FORCE
	$(call if_changed_rule,cc_o_c)

anyway, the desire to skip the objcopy step could be communicated
to the command from here, without needing to clone the command.
One way might be a special (phony) dependency, another might be to
set some variable along the lines of

guest_walk_%.o: SPECIAL := y

> Also, I'm going to have to use this patch later anyway as sometime CC
> use a full path to the source as file symbol. So this is going to be
> important when we will run for example
> `clang -o arch/x86/mm/guest_walk_2.o arch/x86/mm/guest_walk.c`.
> (There isn't a patch for that yet.)

That's interesting - what will be the goal of that future adjustment?

Jan
Anthony PERARD April 17, 2020, 2:42 p.m. UTC | #8
On Fri, Apr 17, 2020 at 03:39:48PM +0200, Jan Beulich wrote:
> On 17.04.2020 15:19, Anthony PERARD wrote:
> > Or do you mean keeping exception to the rule? And hope that when someone
> > changes the rule, it doesn't forget to check if the exception needs
> > changing as well?
> 
> ... "exception" like you put it (requiring special care to keep
> multiple instances in sync) is not the only way this can be done
> (and indeed I'd not want something like this). Since you have
> (in patch 15) e.g.
> 
> guest_walk_%.o: guest_walk.c FORCE
> 	$(call if_changed_rule,cc_o_c)
> 
> anyway, the desire to skip the objcopy step could be communicated
> to the command from here, without needing to clone the command.
> One way might be a special (phony) dependency, another might be to
> set some variable along the lines of
> 
> guest_walk_%.o: SPECIAL := y

I guess something like that could be done. But if possible, I'd like to
avoid that.

> > Also, I'm going to have to use this patch later anyway as sometime CC
> > use a full path to the source as file symbol. So this is going to be
> > important when we will run for example
> > `clang -o arch/x86/mm/guest_walk_2.o arch/x86/mm/guest_walk.c`.
> > (There isn't a patch for that yet.)
> 
> That's interesting - what will be the goal of that future adjustment?

It's a step toward my goal of been able to have out-of-tree build for
xen, as stated in my cover letter. In order to do that, I try to adapt
Kbuild to build Xen.

Kbuild is building the linux kernel without changing directory, so I'd
like to do the same, as it probably makes it easier to do out-of-tree
build.

Another tool I'd like to use from Kbuild is ./fixdep, it's a small
program that run after running CC and fix the dependency file that CC
generates. The main thing it does is to add a dependency on
Kconfig options that a source file uses instead of having a dependency
on whether any unrelated Kconfig has change at all. But ./fixdep from
Linux only works if we build without changing directory. ([1] for more
on fixdep)

I guess one advantage of never changing directory is that we can always
use relative path in global *FLAGS. There isn't a need to use absolute
path, which is an issue when the source tree is moved to a different
location. That can easily happen when for example you try to build in a
container (mapping the source tree inside it) then try to rebuild from
outside. (After using automation/scripts/containerize for example.)
And we don't need tricks like the .*.d2 files (which isn't needed in the
hypervisor anyway, so far at least).


[1], copied from Linux's scripts/basic/fixdep.c introduction:
    If the user re-runs make *config, autoconf.h will be
    regenerated.  make notices that and will rebuild every file which
    includes autoconf.h, i.e. basically all files. This is extremely
    annoying if the user just changed CONFIG_HIS_DRIVER from n to m.

    So we play the same trick that "mkdep" played before. We replace
    the dependency on autoconf.h by a dependency on every config
    option which is mentioned in any of the listed prerequisites.

    kconfig populates a tree in include/config/ with an empty file
    for each config symbol and when the configuration is updated
    the files representing changed config options are touched
    which then let make pick up the changes and the files that use
    the config symbols are rebuilt.

    So if the user changes his CONFIG_HIS_DRIVER option, only the objects
    which depend on "include/config/his/driver.h" will be rebuilt,
    so most likely only his driver ;-)
Jan Beulich April 20, 2020, 1:39 p.m. UTC | #9
On 17.04.2020 16:42, Anthony PERARD wrote:
> On Fri, Apr 17, 2020 at 03:39:48PM +0200, Jan Beulich wrote:
>> On 17.04.2020 15:19, Anthony PERARD wrote:
>>> Or do you mean keeping exception to the rule? And hope that when someone
>>> changes the rule, it doesn't forget to check if the exception needs
>>> changing as well?
>>
>> ... "exception" like you put it (requiring special care to keep
>> multiple instances in sync) is not the only way this can be done
>> (and indeed I'd not want something like this). Since you have
>> (in patch 15) e.g.
>>
>> guest_walk_%.o: guest_walk.c FORCE
>> 	$(call if_changed_rule,cc_o_c)
>>
>> anyway, the desire to skip the objcopy step could be communicated
>> to the command from here, without needing to clone the command.
>> One way might be a special (phony) dependency, another might be to
>> set some variable along the lines of
>>
>> guest_walk_%.o: SPECIAL := y
> 
> I guess something like that could be done. But if possible, I'd like to
> avoid that.
> 
>>> Also, I'm going to have to use this patch later anyway as sometime CC
>>> use a full path to the source as file symbol. So this is going to be
>>> important when we will run for example
>>> `clang -o arch/x86/mm/guest_walk_2.o arch/x86/mm/guest_walk.c`.
>>> (There isn't a patch for that yet.)
>>
>> That's interesting - what will be the goal of that future adjustment?
> 
> It's a step toward my goal of been able to have out-of-tree build for
> xen, as stated in my cover letter. In order to do that, I try to adapt
> Kbuild to build Xen.
> 
> Kbuild is building the linux kernel without changing directory, so I'd
> like to do the same, as it probably makes it easier to do out-of-tree
> build.
> 
> Another tool I'd like to use from Kbuild is ./fixdep, it's a small
> program that run after running CC and fix the dependency file that CC
> generates. The main thing it does is to add a dependency on
> Kconfig options that a source file uses instead of having a dependency
> on whether any unrelated Kconfig has change at all. But ./fixdep from
> Linux only works if we build without changing directory. ([1] for more
> on fixdep)
> 
> I guess one advantage of never changing directory is that we can always
> use relative path in global *FLAGS. There isn't a need to use absolute
> path, which is an issue when the source tree is moved to a different
> location. That can easily happen when for example you try to build in a
> container (mapping the source tree inside it) then try to rebuild from
> outside. (After using automation/scripts/containerize for example.)
> And we don't need tricks like the .*.d2 files (which isn't needed in the
> hypervisor anyway, so far at least).

Ah, I see. Out-of-tree builds don't necessarily imply source trees
that can also be moved, so you want to actually go a step further.

Jan
diff mbox series

Patch

diff --git a/xen/tools/symbols.c b/xen/tools/symbols.c
index 9f9e2c990061..b3a9465b32d3 100644
--- a/xen/tools/symbols.c
+++ b/xen/tools/symbols.c
@@ -84,7 +84,12 @@  static int read_symbol(FILE *in, struct sym_entry *s)
 {
 	char str[500], type[20] = "";
 	char *sym, stype;
-	static enum { symbol, single_source, multi_source } last;
+	static enum symbol_type {
+		symbol,
+		file_source,
+		path_source,
+		obj_file,
+	} last;
 	static char *filename;
 	int rc = -1;
 
@@ -125,13 +130,20 @@  static int read_symbol(FILE *in, struct sym_entry *s)
 		 * prefer the first one if that names an object file or has a
 		 * directory component (to cover multiply compiled files).
 		 */
-		bool multi = strchr(str, '/') || (sym && sym[1] == 'o');
+		enum symbol_type current;
 
-		if (multi || last != multi_source) {
+		if (sym && sym[1] == 'o')
+		    current = obj_file;
+		else if (strchr(str, '/'))
+		    current = path_source;
+		else
+		    current = file_source;
+
+		if (current > last || last == file_source) {
 			free(filename);
 			filename = *str ? strdup(str) : NULL;
+			last = current;
 		}
-		last = multi ? multi_source : single_source;
 		goto skip_tail;
 	}