diff mbox

[v6,01/15] x86: properly calculate ELF end of image address

Message ID 1473711511-11931-2-git-send-email-daniel.kiper@oracle.com (mailing list archive)
State New, archived
Headers show

Commit Message

Daniel Kiper Sept. 12, 2016, 8:18 p.m. UTC
Currently ELF end of image address is calculated using first line from
"nm -nr xen/xen-syms" output. However, today usually it contains random
symbol address not related to end of image in any way. So, it looks
that for years that stuff have been working just by lucky coincidence.
Hence, it have to be changed to something more reliable. So, let's take
ELF end of image address by reading _end symbol address from nm output.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
---
 xen/arch/x86/Makefile |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Konrad Rzeszutek Wilk Sept. 16, 2016, 8:43 p.m. UTC | #1
On Mon, Sep 12, 2016 at 10:18:16PM +0200, Daniel Kiper wrote:
> Currently ELF end of image address is calculated using first line from
> "nm -nr xen/xen-syms" output. However, today usually it contains random
> symbol address not related to end of image in any way. So, it looks

Weird. The -n says:

"  -n
       -v
       --numeric-sort
           Sort symbols numerically by their addresses, rather than alphabetically by their names.
"

And you are right. It is ignoring it:

[konrad@char xen]$ nm -nr xen/xen-syms| sort | head -1
ffff82d080000000 T __image_base__
[konrad@char xen]$ nm -nr xen/xen-syms | head -1
ffff82d08033d000 B efi

[konrad@char xen]$ nm -nr xen/xen-syms| sort | tail -5
ffff82d08033cb00 B _end
ffff82d08033cb00 B __per_cpu_data_end
ffff82d08033d000 B __2M_rwdata_end
ffff82d08033d000 B efi
                 U _GLOBAL_OFFSET_TABLE_

> that for years that stuff have been working just by lucky coincidence.
> Hence, it have to be changed to something more reliable. So, let's take
> ELF end of image address by reading _end symbol address from nm output.
> 
> Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
> ---
>  xen/arch/x86/Makefile |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
> index d3875c5..a4fe740 100644
> --- a/xen/arch/x86/Makefile
> +++ b/xen/arch/x86/Makefile
> @@ -91,7 +91,7 @@ endif
>  
>  $(TARGET): $(TARGET)-syms $(efi-y) boot/mkelf32
>  	./boot/mkelf32 $(notes_phdrs) $(TARGET)-syms $(TARGET) 0x100000 \
> -	`$(NM) -nr $(TARGET)-syms | head -n 1 | sed -e 's/^\([^ ]*\).*/0x\1/'`
> +		`$(NM) -nr $(TARGET)-syms | awk '$$3 == "_end" {print "0x"$$1}'`
>  

Something is off with your tabs/spaces.


I would also modify the arch/x86/xen.lds.S and put a comment
around _end = .; to mention this dependency - in case somebody adds some 
extra things after _end.


>  .PHONY: tests
>  tests:
> -- 
> 1.7.10.4
>
Jan Beulich Sept. 19, 2016, 11:14 a.m. UTC | #2
>>> On 16.09.16 at 22:43, <konrad.wilk@oracle.com> wrote:
> On Mon, Sep 12, 2016 at 10:18:16PM +0200, Daniel Kiper wrote:
>> Currently ELF end of image address is calculated using first line from
>> "nm -nr xen/xen-syms" output. However, today usually it contains random
>> symbol address not related to end of image in any way. So, it looks
> 
> Weird. The -n says:
> 
> "  -n
>        -v
>        --numeric-sort
>            Sort symbols numerically by their addresses, rather than 
> alphabetically by their names.
> "
> 
> And you are right. It is ignoring it:
> 
> [konrad@char xen]$ nm -nr xen/xen-syms| sort | head -1
> ffff82d080000000 T __image_base__
> [konrad@char xen]$ nm -nr xen/xen-syms | head -1
> ffff82d08033d000 B efi

So what is it ignoring, you think? The -n? Surely not. Are you perhaps
overlooking that -r means "reverse" (and hence that piping through
"sort" inverts all the sorting done by nm)?

> [konrad@char xen]$ nm -nr xen/xen-syms| sort | tail -5
> ffff82d08033cb00 B _end
> ffff82d08033cb00 B __per_cpu_data_end
> ffff82d08033d000 B __2M_rwdata_end
> ffff82d08033d000 B efi
>                  U _GLOBAL_OFFSET_TABLE_
> 
>> that for years that stuff have been working just by lucky coincidence.
>> Hence, it have to be changed to something more reliable. So, let's take
>> ELF end of image address by reading _end symbol address from nm output.

So before taking this patch I'd really like to see proof that what gets
done currently does actually go wrong in at least one case. So far I
can't see things working as "just by lucky coincidence". In particular
I can't see why __2M_rwdata_end (aliased to efi) does not relate to
the intended image end. Once we switch back to using large pages
even when not using xen.efi I'd even question whether preferring
_end over __2M_rwdata_end is actually correct.

The only real (but reasonably slim) risk we have right now is that an
absolute symbol might appear with a value larger than
__2M_rwdata_end. nm would certainly benefit from an option
allowing to filter out absolute symbols (just like one can filter out
undefined ones).

Jan
Daniel Kiper Sept. 19, 2016, 11:18 a.m. UTC | #3
On Fri, Sep 16, 2016 at 04:43:21PM -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Sep 12, 2016 at 10:18:16PM +0200, Daniel Kiper wrote:
> > Currently ELF end of image address is calculated using first line from
> > "nm -nr xen/xen-syms" output. However, today usually it contains random
> > symbol address not related to end of image in any way. So, it looks
>
> Weird. The -n says:
>
> "  -n
>        -v
>        --numeric-sort
>            Sort symbols numerically by their addresses, rather than alphabetically by their names.
> "
>
> And you are right. It is ignoring it:
>
> [konrad@char xen]$ nm -nr xen/xen-syms| sort | head -1
> ffff82d080000000 T __image_base__
> [konrad@char xen]$ nm -nr xen/xen-syms | head -1
> ffff82d08033d000 B efi
>
> [konrad@char xen]$ nm -nr xen/xen-syms| sort | tail -5
> ffff82d08033cb00 B _end
> ffff82d08033cb00 B __per_cpu_data_end
> ffff82d08033d000 B __2M_rwdata_end
> ffff82d08033d000 B efi
>                  U _GLOBAL_OFFSET_TABLE_

Well, TBH, I have never checked what "-nr" means. However, it looks
that it works at least with nm from binutils 2.22. Please look below:

ffff82d080345000 A efi
ffff82d080345000 A __2M_rwdata_end
ffff82d080344b80 A _end
ffff82d080344b80 B __per_cpu_data_end
ffff82d080344b80 B __bss_end
ffff82d080344b28 b per_cpu__vmxon_region
ffff82d080344b20 b per_cpu__root_vmcb
ffff82d080344b18 b per_cpu__hsa

[...]

Anyway, I think that we should apply my fix. Though I can agree that we
do not need "-nr" for nm here any more.

> > that for years that stuff have been working just by lucky coincidence.
> > Hence, it have to be changed to something more reliable. So, let's take
> > ELF end of image address by reading _end symbol address from nm output.
> >
> > Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
> > ---
> >  xen/arch/x86/Makefile |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
> > index d3875c5..a4fe740 100644
> > --- a/xen/arch/x86/Makefile
> > +++ b/xen/arch/x86/Makefile
> > @@ -91,7 +91,7 @@ endif
> >
> >  $(TARGET): $(TARGET)-syms $(efi-y) boot/mkelf32
> >  	./boot/mkelf32 $(notes_phdrs) $(TARGET)-syms $(TARGET) 0x100000 \
> > -	`$(NM) -nr $(TARGET)-syms | head -n 1 | sed -e 's/^\([^ ]*\).*/0x\1/'`
> > +		`$(NM) -nr $(TARGET)-syms | awk '$$3 == "_end" {print "0x"$$1}'`
> >
>
> Something is off with your tabs/spaces.

I think that it is OK. I added second tab to mark that it is a continuation.

> I would also modify the arch/x86/xen.lds.S and put a comment
> around _end = .; to mention this dependency - in case somebody adds some
> extra things after _end.

I am not sure it is needed. However, if Andrew and Jan do not object I can add that.

Daniel
Daniel Kiper Sept. 19, 2016, 1:56 p.m. UTC | #4
On Mon, Sep 19, 2016 at 05:14:07AM -0600, Jan Beulich wrote:
> >>> On 16.09.16 at 22:43, <konrad.wilk@oracle.com> wrote:
> > On Mon, Sep 12, 2016 at 10:18:16PM +0200, Daniel Kiper wrote:
> >> Currently ELF end of image address is calculated using first line from
> >> "nm -nr xen/xen-syms" output. However, today usually it contains random
> >> symbol address not related to end of image in any way. So, it looks
> >
> > Weird. The -n says:
> >
> > "  -n
> >        -v
> >        --numeric-sort
> >            Sort symbols numerically by their addresses, rather than
> > alphabetically by their names.
> > "
> >
> > And you are right. It is ignoring it:
> >
> > [konrad@char xen]$ nm -nr xen/xen-syms| sort | head -1
> > ffff82d080000000 T __image_base__
> > [konrad@char xen]$ nm -nr xen/xen-syms | head -1
> > ffff82d08033d000 B efi
>
> So what is it ignoring, you think? The -n? Surely not. Are you perhaps
> overlooking that -r means "reverse" (and hence that piping through
> "sort" inverts all the sorting done by nm)?
>
> > [konrad@char xen]$ nm -nr xen/xen-syms| sort | tail -5
> > ffff82d08033cb00 B _end
> > ffff82d08033cb00 B __per_cpu_data_end
> > ffff82d08033d000 B __2M_rwdata_end
> > ffff82d08033d000 B efi
> >                  U _GLOBAL_OFFSET_TABLE_
> >
> >> that for years that stuff have been working just by lucky coincidence.
> >> Hence, it have to be changed to something more reliable. So, let's take
> >> ELF end of image address by reading _end symbol address from nm output.
>
> So before taking this patch I'd really like to see proof that what gets
> done currently does actually go wrong in at least one case. So far I

During initial work on this patch series I discovered that p_memsz in xen
ELF file is set to unreasonably huge value, IIRC, ~1 GiB. That is why at
first I enforced 16 MiB here (just temporary workaround) and later proposed
this patch. Sadly, right now I am not able to find commit which introduced
this issue. However, it seems that it was "fixed" later by another commit.

Anyway, I am not sure why are you against, IMO, more reliable solution.
This way we would avoid in the future similar issues which I described above.

> can't see things working as "just by lucky coincidence". In particular
> I can't see why __2M_rwdata_end (aliased to efi) does not relate to
> the intended image end. Once we switch back to using large pages
> even when not using xen.efi I'd even question whether preferring
> _end over __2M_rwdata_end is actually correct.

This is good question. I was thinking about that after posting v6. It seems
that from image POV _end is correct. However, taking into account that Xen
image mapping covers 16 MiB then I think we should use __2M_rwdata_end (or
define __end_of_image__ symbol equal to __2M_rwdata_end). This way boot
loader will allocate memory region for Xen image later covered properly by
mapping, nothing will be put in memory immediately after the Xen image and
later incorrectly mapped as Xen image.

Daniel
Jan Beulich Sept. 19, 2016, 2:52 p.m. UTC | #5
>>> On 19.09.16 at 15:56, <daniel.kiper@oracle.com> wrote:
> On Mon, Sep 19, 2016 at 05:14:07AM -0600, Jan Beulich wrote:
>> >>> On 16.09.16 at 22:43, <konrad.wilk@oracle.com> wrote:
>> > On Mon, Sep 12, 2016 at 10:18:16PM +0200, Daniel Kiper wrote:
>> >> Currently ELF end of image address is calculated using first line from
>> >> "nm -nr xen/xen-syms" output. However, today usually it contains random
>> >> symbol address not related to end of image in any way. So, it looks
>> >
>> > Weird. The -n says:
>> >
>> > "  -n
>> >        -v
>> >        --numeric-sort
>> >            Sort symbols numerically by their addresses, rather than
>> > alphabetically by their names.
>> > "
>> >
>> > And you are right. It is ignoring it:
>> >
>> > [konrad@char xen]$ nm -nr xen/xen-syms| sort | head -1
>> > ffff82d080000000 T __image_base__
>> > [konrad@char xen]$ nm -nr xen/xen-syms | head -1
>> > ffff82d08033d000 B efi
>>
>> So what is it ignoring, you think? The -n? Surely not. Are you perhaps
>> overlooking that -r means "reverse" (and hence that piping through
>> "sort" inverts all the sorting done by nm)?
>>
>> > [konrad@char xen]$ nm -nr xen/xen-syms| sort | tail -5
>> > ffff82d08033cb00 B _end
>> > ffff82d08033cb00 B __per_cpu_data_end
>> > ffff82d08033d000 B __2M_rwdata_end
>> > ffff82d08033d000 B efi
>> >                  U _GLOBAL_OFFSET_TABLE_
>> >
>> >> that for years that stuff have been working just by lucky coincidence.
>> >> Hence, it have to be changed to something more reliable. So, let's take
>> >> ELF end of image address by reading _end symbol address from nm output.
>>
>> So before taking this patch I'd really like to see proof that what gets
>> done currently does actually go wrong in at least one case. So far I
> 
> During initial work on this patch series I discovered that p_memsz in xen
> ELF file is set to unreasonably huge value, IIRC, ~1 GiB. That is why at
> first I enforced 16 MiB here (just temporary workaround) and later proposed
> this patch. Sadly, right now I am not able to find commit which introduced
> this issue. However, it seems that it was "fixed" later by another commit.
> 
> Anyway, I am not sure why are you against, IMO, more reliable solution.
> This way we would avoid in the future similar issues which I described 
> above.

I'm not against anything if it gets properly explained. Here,
however, you present some vague statements which even you
can't verify right now as it looks.

And then I'm not sure going from one way of using nm to another
is all that much of an improvement. A true improvement might be
to do away with the nm use and e.g. also consider the section
table.

Jan

>> can't see things working as "just by lucky coincidence". In particular
>> I can't see why __2M_rwdata_end (aliased to efi) does not relate to
>> the intended image end. Once we switch back to using large pages
>> even when not using xen.efi I'd even question whether preferring
>> _end over __2M_rwdata_end is actually correct.
> 
> This is good question. I was thinking about that after posting v6. It seems
> that from image POV _end is correct. However, taking into account that Xen
> image mapping covers 16 MiB then I think we should use __2M_rwdata_end (or
> define __end_of_image__ symbol equal to __2M_rwdata_end). This way boot
> loader will allocate memory region for Xen image later covered properly by
> mapping, nothing will be put in memory immediately after the Xen image and
> later incorrectly mapped as Xen image.
> 
> Daniel
Daniel Kiper Sept. 20, 2016, 11:43 a.m. UTC | #6
On Mon, Sep 19, 2016 at 08:52:02AM -0600, Jan Beulich wrote:
> >>> On 19.09.16 at 15:56, <daniel.kiper@oracle.com> wrote:
> > On Mon, Sep 19, 2016 at 05:14:07AM -0600, Jan Beulich wrote:
> >> >>> On 16.09.16 at 22:43, <konrad.wilk@oracle.com> wrote:
> >> > On Mon, Sep 12, 2016 at 10:18:16PM +0200, Daniel Kiper wrote:
> >> >> Currently ELF end of image address is calculated using first line from
> >> >> "nm -nr xen/xen-syms" output. However, today usually it contains random
> >> >> symbol address not related to end of image in any way. So, it looks
> >> >
> >> > Weird. The -n says:
> >> >
> >> > "  -n
> >> >        -v
> >> >        --numeric-sort
> >> >            Sort symbols numerically by their addresses, rather than
> >> > alphabetically by their names.
> >> > "
> >> >
> >> > And you are right. It is ignoring it:
> >> >
> >> > [konrad@char xen]$ nm -nr xen/xen-syms| sort | head -1
> >> > ffff82d080000000 T __image_base__
> >> > [konrad@char xen]$ nm -nr xen/xen-syms | head -1
> >> > ffff82d08033d000 B efi
> >>
> >> So what is it ignoring, you think? The -n? Surely not. Are you perhaps
> >> overlooking that -r means "reverse" (and hence that piping through
> >> "sort" inverts all the sorting done by nm)?
> >>
> >> > [konrad@char xen]$ nm -nr xen/xen-syms| sort | tail -5
> >> > ffff82d08033cb00 B _end
> >> > ffff82d08033cb00 B __per_cpu_data_end
> >> > ffff82d08033d000 B __2M_rwdata_end
> >> > ffff82d08033d000 B efi
> >> >                  U _GLOBAL_OFFSET_TABLE_
> >> >
> >> >> that for years that stuff have been working just by lucky coincidence.
> >> >> Hence, it have to be changed to something more reliable. So, let's take
> >> >> ELF end of image address by reading _end symbol address from nm output.
> >>
> >> So before taking this patch I'd really like to see proof that what gets
> >> done currently does actually go wrong in at least one case. So far I
> >
> > During initial work on this patch series I discovered that p_memsz in xen
> > ELF file is set to unreasonably huge value, IIRC, ~1 GiB. That is why at
> > first I enforced 16 MiB here (just temporary workaround) and later proposed
> > this patch. Sadly, right now I am not able to find commit which introduced
> > this issue. However, it seems that it was "fixed" later by another commit.
> >
> > Anyway, I am not sure why are you against, IMO, more reliable solution.
> > This way we would avoid in the future similar issues which I described
> > above.
>
> I'm not against anything if it gets properly explained. Here,
> however, you present some vague statements which even you
> can't verify right now as it looks.
>
> And then I'm not sure going from one way of using nm to another
> is all that much of an improvement. A true improvement might be
> to do away with the nm use and e.g. also consider the section
> table.

However, it looks that this requires changes in mkelf32.c and
I do not think that we should play with it right now.

> >> can't see things working as "just by lucky coincidence". In particular
> >> I can't see why __2M_rwdata_end (aliased to efi) does not relate to
> >> the intended image end. Once we switch back to using large pages
> >> even when not using xen.efi I'd even question whether preferring
> >> _end over __2M_rwdata_end is actually correct.
> >
> > This is good question. I was thinking about that after posting v6. It seems
> > that from image POV _end is correct. However, taking into account that Xen
> > image mapping covers 16 MiB then I think we should use __2M_rwdata_end (or
> > define __end_of_image__ symbol equal to __2M_rwdata_end). This way boot
> > loader will allocate memory region for Xen image later covered properly by
> > mapping, nothing will be put in memory immediately after the Xen image and
> > later incorrectly mapped as Xen image.

Current xen image p_memsz does not end at 16 MiB. It seems to me that this is
incorrect. Should I fix it? It looks that we just have to move out .pad of
#ifdef EFI. Are you OK with it?

Daniel
Jan Beulich Sept. 20, 2016, 1:28 p.m. UTC | #7
>>> On 20.09.16 at 13:43, <daniel.kiper@oracle.com> wrote:
> On Mon, Sep 19, 2016 at 08:52:02AM -0600, Jan Beulich wrote:
>> >>> On 19.09.16 at 15:56, <daniel.kiper@oracle.com> wrote:
>> > On Mon, Sep 19, 2016 at 05:14:07AM -0600, Jan Beulich wrote:
>> >> >>> On 16.09.16 at 22:43, <konrad.wilk@oracle.com> wrote:
>> >> > On Mon, Sep 12, 2016 at 10:18:16PM +0200, Daniel Kiper wrote:
>> >> In particular
>> >> I can't see why __2M_rwdata_end (aliased to efi) does not relate to
>> >> the intended image end. Once we switch back to using large pages
>> >> even when not using xen.efi I'd even question whether preferring
>> >> _end over __2M_rwdata_end is actually correct.
>> >
>> > This is good question. I was thinking about that after posting v6. It seems
>> > that from image POV _end is correct. However, taking into account that Xen
>> > image mapping covers 16 MiB then I think we should use __2M_rwdata_end (or
>> > define __end_of_image__ symbol equal to __2M_rwdata_end). This way boot
>> > loader will allocate memory region for Xen image later covered properly by
>> > mapping, nothing will be put in memory immediately after the Xen image and
>> > later incorrectly mapped as Xen image.
> 
> Current xen image p_memsz does not end at 16 MiB. It seems to me that this is
> incorrect. Should I fix it? It looks that we just have to move out .pad of
> #ifdef EFI. Are you OK with it?

Just to emphasize again: I'm okay with any fix which actually fixes
something. I continue to be unconvinced there is something that
actually needs fixing. So as long as you can properly explain what
is wrong, I won't stand in the way of your change going in. For the
case here this means - if the value is e.g. off by a few bytes, then
I don't see an issue. This 16Mb boundary isn't a hard one anyway.
If otoh the value is off by an arbitrary amount, which just happens
to be small enough in most cases, then I do see a need for a change.

Jan
Daniel Kiper Sept. 20, 2016, 6 p.m. UTC | #8
On Mon, Sep 19, 2016 at 08:52:02AM -0600, Jan Beulich wrote:
> >>> On 19.09.16 at 15:56, <daniel.kiper@oracle.com> wrote:
> > On Mon, Sep 19, 2016 at 05:14:07AM -0600, Jan Beulich wrote:

[...]

> >> So before taking this patch I'd really like to see proof that what gets
> >> done currently does actually go wrong in at least one case. So far I
> >
> > During initial work on this patch series I discovered that p_memsz in xen
> > ELF file is set to unreasonably huge value, IIRC, ~1 GiB. That is why at
> > first I enforced 16 MiB here (just temporary workaround) and later proposed
> > this patch. Sadly, right now I am not able to find commit which introduced
> > this issue. However, it seems that it was "fixed" later by another commit.
> >
> > Anyway, I am not sure why are you against, IMO, more reliable solution.
> > This way we would avoid in the future similar issues which I described
> > above.
>
> I'm not against anything if it gets properly explained. Here,
> however, you present some vague statements which even you
> can't verify right now as it looks.

OK, I did some more tests and found out that after patch "efi: build
xen.gz with EFI code" we have following xen ELF file:

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz  MemSiz     Flg Align
  LOAD           0x000080 0x00100000 0x00100000 0x20c120 0x3ff00000 RWE 0x40
                                                         ^^^^^^^^^^

because "nm -nr xen/xen-syms" emits:

ffff82d0c0000000 A ALT_START
ffff82d08034d000 A __2M_rwdata_end
ffff82d08034cc00 A _end
ffff82d08034cc00 B __per_cpu_data_end
ffff82d08034cc00 B __bss_end

[...]

ALT_START lives in xen/arch/x86/efi/relocs-dummy.S. relocs-dummy.S
provides __base_relocs_start and __base_relocs_end symbols which
are referenced in xen/arch/x86/efi/efi-boot.h:efi_arch_relocate_image().
Of course one can argue that maybe we should do some changes in
efi_arch_relocate_image() and/or xen/arch/x86/efi/relocs-dummy.S.
This is true. However, this is separate issue and we should consider
it as such.

"efi: build xen.gz with EFI code" patch clearly shows that current
method used to calculate <final-exec-addr> mkelf32 argument is based
on bogus assumptions and very fragile. So, IMO, it should be changed
to something which is more reliable. And my proposal looks quite good.

Daniel
Jan Beulich Sept. 21, 2016, 9:37 a.m. UTC | #9
>>> On 20.09.16 at 20:00, <daniel.kiper@oracle.com> wrote:
> OK, I did some more tests and found out that after patch "efi: build
> xen.gz with EFI code" we have following xen ELF file:
> 
> Program Headers:
>   Type           Offset   VirtAddr   PhysAddr   FileSiz  MemSiz     Flg Align
>   LOAD           0x000080 0x00100000 0x00100000 0x20c120 0x3ff00000 RWE 0x40
>                                                          ^^^^^^^^^^
> 
> because "nm -nr xen/xen-syms" emits:
> 
> ffff82d0c0000000 A ALT_START
> ffff82d08034d000 A __2M_rwdata_end
> ffff82d08034cc00 A _end
> ffff82d08034cc00 B __per_cpu_data_end
> ffff82d08034cc00 B __bss_end

So it is your own change that breaks this.

> ALT_START lives in xen/arch/x86/efi/relocs-dummy.S. relocs-dummy.S
> provides __base_relocs_start and __base_relocs_end symbols which
> are referenced in xen/arch/x86/efi/efi-boot.h:efi_arch_relocate_image().
> Of course one can argue that maybe we should do some changes in
> efi_arch_relocate_image() and/or xen/arch/x86/efi/relocs-dummy.S.
> This is true. However, this is separate issue and we should consider
> it as such.
> 
> "efi: build xen.gz with EFI code" patch clearly shows that current
> method used to calculate <final-exec-addr> mkelf32 argument is based
> on bogus assumptions and very fragile. So, IMO, it should be changed
> to something which is more reliable. And my proposal looks quite good.

What about the alternative of simply stripping ALT_START (and then
also VIRT_START) from the image? They're getting screwed up in
xen.efi already anyway (due to not being representable in a COFF
symbol table entry), and hence can't possibly be useful to retain.

But no matter what route we go: Such a change needs to have a
description that properly explains the issue. Vague statements are
not enough.

And btw, vaguely recalling earlier issues (long ago) with awk usage
(on non-Linux platforms) I'd also prefer if such an adjustment would
get away without introducing another use of awk. I think what you
want could easily be achieved by using sed.

Jan
Daniel Kiper Sept. 22, 2016, 11:45 a.m. UTC | #10
On Wed, Sep 21, 2016 at 03:37:52AM -0600, Jan Beulich wrote:
> >>> On 20.09.16 at 20:00, <daniel.kiper@oracle.com> wrote:
> > OK, I did some more tests and found out that after patch "efi: build
> > xen.gz with EFI code" we have following xen ELF file:
> >
> > Program Headers:
> >   Type           Offset   VirtAddr   PhysAddr   FileSiz  MemSiz     Flg Align
> >   LOAD           0x000080 0x00100000 0x00100000 0x20c120 0x3ff00000 RWE 0x40
> >                                                          ^^^^^^^^^^
> >
> > because "nm -nr xen/xen-syms" emits:
> >
> > ffff82d0c0000000 A ALT_START
> > ffff82d08034d000 A __2M_rwdata_end
> > ffff82d08034cc00 A _end
> > ffff82d08034cc00 B __per_cpu_data_end
> > ffff82d08034cc00 B __bss_end
>
> So it is your own change that breaks this.

Yes, it is but it happens because current method calculating end of
image address is weak. IMO, we should not blindly assume that first
line from "nm -nr" contains proper end of image address. However,
I can agree that maybe ALT_START at consortes are not needed here.
Though I think this is separate issue.

> > ALT_START lives in xen/arch/x86/efi/relocs-dummy.S. relocs-dummy.S
> > provides __base_relocs_start and __base_relocs_end symbols which
> > are referenced in xen/arch/x86/efi/efi-boot.h:efi_arch_relocate_image().
> > Of course one can argue that maybe we should do some changes in
> > efi_arch_relocate_image() and/or xen/arch/x86/efi/relocs-dummy.S.
> > This is true. However, this is separate issue and we should consider
> > it as such.
> >
> > "efi: build xen.gz with EFI code" patch clearly shows that current
> > method used to calculate <final-exec-addr> mkelf32 argument is based
> > on bogus assumptions and very fragile. So, IMO, it should be changed
> > to something which is more reliable. And my proposal looks quite good.
>
> What about the alternative of simply stripping ALT_START (and then
> also VIRT_START) from the image? They're getting screwed up in
> xen.efi already anyway (due to not being representable in a COFF
> symbol table entry), and hence can't possibly be useful to retain.

Do you think about "strip -N ALT_START xen/xen-syms"? I can add that.
However, I am still not sure why do not you want change currently
existing solution used to calculate end of image address. I showed
that it is easy to break. So, why we must live with it?

> But no matter what route we go: Such a change needs to have a
> description that properly explains the issue. Vague statements are
> not enough.

Could you be more specific? Where are my description(s) vague? What
should be added or removed?

> And btw, vaguely recalling earlier issues (long ago) with awk usage
> (on non-Linux platforms) I'd also prefer if such an adjustment would
> get away without introducing another use of awk. I think what you
> want could easily be achieved by using sed.

OK, I can use sed instead of awk if you wish.

Daniel
Jan Beulich Sept. 22, 2016, 12:02 p.m. UTC | #11
>>> On 22.09.16 at 13:45, <daniel.kiper@oracle.com> wrote:
> However, I am still not sure why do not you want change currently
> existing solution used to calculate end of image address. I showed
> that it is easy to break. So, why we must live with it?

I didn't say it needs to remain that way. All I said is that I don't
want it changed without sufficient reason.

>> But no matter what route we go: Such a change needs to have a
>> description that properly explains the issue. Vague statements are
>> not enough.
> 
> Could you be more specific? Where are my description(s) vague? What
> should be added or removed?

Excuse me - it was _you_ (the producer of the patch) who needed a
while to figure out what the actual problem was. If, from looking at
your own patch + description, you can't figure this out immediately,
how can you expect the reader of it to see clearly the issue that
needs fixing?

Jan
diff mbox

Patch

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index d3875c5..a4fe740 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -91,7 +91,7 @@  endif
 
 $(TARGET): $(TARGET)-syms $(efi-y) boot/mkelf32
 	./boot/mkelf32 $(notes_phdrs) $(TARGET)-syms $(TARGET) 0x100000 \
-	`$(NM) -nr $(TARGET)-syms | head -n 1 | sed -e 's/^\([^ ]*\).*/0x\1/'`
+		`$(NM) -nr $(TARGET)-syms | awk '$$3 == "_end" {print "0x"$$1}'`
 
 .PHONY: tests
 tests: