mbox series

[v7,0/2] x86/boot: Reduce assembly code

Message ID 20241001102239.2609631-1-frediano.ziglio@cloud.com (mailing list archive)
Headers show
Series x86/boot: Reduce assembly code | expand

Message

Frediano Ziglio Oct. 1, 2024, 10:22 a.m. UTC
This series came from part of the work of removing duplications between
boot code and rewriting part of code from assembly to C.
Rewrites EFI code in pure C.

Changes since v1, more details in specific commits:
- style updates;
- comments and descriptions improvements;
- other improvements.

Changes since v2:
- rebased on master, resolved conflicts;
- add comment on trampoline section.

Changes since v3:
- changed new function name;
- declare efi_multiboot2 in a separate header;
- distinguish entry point from using magic number;
- other minor changes (see commens in commits).

Changes since v4:
- rebase on staging;
- set %fs and %gs as other segment registers;
- style and other changes.

Changes since v5:
- fixed a typo.

Changes since v6:
- remove merged patch;
- comment and style;
- change some pointer checks to avoid overflows;
- rename parse-mbi2.c to mbi2.c.

Frediano Ziglio (2):
  x86/boot: Rewrite EFI/MBI2 code partly in C
  x86/boot: Improve MBI2 structure check

 xen/arch/x86/boot/head.S       | 146 +++++++--------------------------
 xen/arch/x86/efi/Makefile      |   1 +
 xen/arch/x86/efi/efi-boot.h    |   7 +-
 xen/arch/x86/efi/mbi2.c        |  66 +++++++++++++++
 xen/arch/x86/efi/stub.c        |  10 +--
 xen/arch/x86/include/asm/efi.h |  18 ++++
 6 files changed, 123 insertions(+), 125 deletions(-)
 create mode 100644 xen/arch/x86/efi/mbi2.c
 create mode 100644 xen/arch/x86/include/asm/efi.h

Comments

Marek Marczykowski-Górecki Oct. 2, 2024, 2:04 p.m. UTC | #1
On Tue, Oct 01, 2024 at 11:22:37AM +0100, Frediano Ziglio wrote:
> This series came from part of the work of removing duplications between
> boot code and rewriting part of code from assembly to C.
> Rewrites EFI code in pure C.

The MB2+EFI tests on Adler Lake fail with this series:
https://gitlab.com/xen-project/people/marmarek/xen/-/pipelines/1478766782
Looking at the VGA output (unfortunately not collected by the test
itself) it hangs just after bootloader, before printing anything on the
screen (or even clearing it after bootloader). The serial is silent too.

It does pass on Zen 3+ runners.

Since there were some issues with the ADL runner today on plain staging,
I'm not 100% sure if it isn't some infrastructure issue yet. But the
symptoms look different than usual infra issues (and different than
todays failures on staging), so I think it's more likely an issue with
the patches here.

> Changes since v1, more details in specific commits:
> - style updates;
> - comments and descriptions improvements;
> - other improvements.
> 
> Changes since v2:
> - rebased on master, resolved conflicts;
> - add comment on trampoline section.
> 
> Changes since v3:
> - changed new function name;
> - declare efi_multiboot2 in a separate header;
> - distinguish entry point from using magic number;
> - other minor changes (see commens in commits).
> 
> Changes since v4:
> - rebase on staging;
> - set %fs and %gs as other segment registers;
> - style and other changes.
> 
> Changes since v5:
> - fixed a typo.
> 
> Changes since v6:
> - remove merged patch;
> - comment and style;
> - change some pointer checks to avoid overflows;
> - rename parse-mbi2.c to mbi2.c.
> 
> Frediano Ziglio (2):
>   x86/boot: Rewrite EFI/MBI2 code partly in C
>   x86/boot: Improve MBI2 structure check
> 
>  xen/arch/x86/boot/head.S       | 146 +++++++--------------------------
>  xen/arch/x86/efi/Makefile      |   1 +
>  xen/arch/x86/efi/efi-boot.h    |   7 +-
>  xen/arch/x86/efi/mbi2.c        |  66 +++++++++++++++
>  xen/arch/x86/efi/stub.c        |  10 +--
>  xen/arch/x86/include/asm/efi.h |  18 ++++
>  6 files changed, 123 insertions(+), 125 deletions(-)
>  create mode 100644 xen/arch/x86/efi/mbi2.c
>  create mode 100644 xen/arch/x86/include/asm/efi.h
> 
> -- 
> 2.34.1
>
Frediano Ziglio Oct. 2, 2024, 3:27 p.m. UTC | #2
On Wed, Oct 2, 2024 at 3:04 PM Marek Marczykowski-Górecki
<marmarek@invisiblethingslab.com> wrote:
>
> On Tue, Oct 01, 2024 at 11:22:37AM +0100, Frediano Ziglio wrote:
> > This series came from part of the work of removing duplications between
> > boot code and rewriting part of code from assembly to C.
> > Rewrites EFI code in pure C.
>
> The MB2+EFI tests on Adler Lake fail with this series:
> https://gitlab.com/xen-project/people/marmarek/xen/-/pipelines/1478766782
> Looking at the VGA output (unfortunately not collected by the test
> itself) it hangs just after bootloader, before printing anything on the
> screen (or even clearing it after bootloader). The serial is silent too.
>

I tried multiple times to take a look at the logs, but I keep getting error 500.

> It does pass on Zen 3+ runners.
>
> Since there were some issues with the ADL runner today on plain staging,
> I'm not 100% sure if it isn't some infrastructure issue yet. But the
> symptoms look different than usual infra issues (and different than
> todays failures on staging), so I think it's more likely an issue with
> the patches here.
>
> > Changes since v1, more details in specific commits:
> > - style updates;
> > - comments and descriptions improvements;
> > - other improvements.
> >
> > Changes since v2:
> > - rebased on master, resolved conflicts;
> > - add comment on trampoline section.
> >
> > Changes since v3:
> > - changed new function name;
> > - declare efi_multiboot2 in a separate header;
> > - distinguish entry point from using magic number;
> > - other minor changes (see commens in commits).
> >
> > Changes since v4:
> > - rebase on staging;
> > - set %fs and %gs as other segment registers;
> > - style and other changes.
> >
> > Changes since v5:
> > - fixed a typo.
> >
> > Changes since v6:
> > - remove merged patch;
> > - comment and style;
> > - change some pointer checks to avoid overflows;
> > - rename parse-mbi2.c to mbi2.c.
> >
> > Frediano Ziglio (2):
> >   x86/boot: Rewrite EFI/MBI2 code partly in C
> >   x86/boot: Improve MBI2 structure check
> >
> >  xen/arch/x86/boot/head.S       | 146 +++++++--------------------------
> >  xen/arch/x86/efi/Makefile      |   1 +
> >  xen/arch/x86/efi/efi-boot.h    |   7 +-
> >  xen/arch/x86/efi/mbi2.c        |  66 +++++++++++++++
> >  xen/arch/x86/efi/stub.c        |  10 +--
> >  xen/arch/x86/include/asm/efi.h |  18 ++++
> >  6 files changed, 123 insertions(+), 125 deletions(-)
> >  create mode 100644 xen/arch/x86/efi/mbi2.c
> >  create mode 100644 xen/arch/x86/include/asm/efi.h
> >

Frediano
Marek Marczykowski-Górecki Oct. 3, 2024, 1:11 a.m. UTC | #3
On Wed, Oct 02, 2024 at 04:27:19PM +0100, Frediano Ziglio wrote:
> On Wed, Oct 2, 2024 at 3:04 PM Marek Marczykowski-Górecki
> <marmarek@invisiblethingslab.com> wrote:
> >
> > On Tue, Oct 01, 2024 at 11:22:37AM +0100, Frediano Ziglio wrote:
> > > This series came from part of the work of removing duplications between
> > > boot code and rewriting part of code from assembly to C.
> > > Rewrites EFI code in pure C.
> >
> > The MB2+EFI tests on Adler Lake fail with this series:
> > https://gitlab.com/xen-project/people/marmarek/xen/-/pipelines/1478766782
> > Looking at the VGA output (unfortunately not collected by the test
> > itself) it hangs just after bootloader, before printing anything on the
> > screen (or even clearing it after bootloader). The serial is silent too.
> >
> 
> I tried multiple times to take a look at the logs, but I keep getting error 500.

500? That's weird. Anyway, serial log is empty, so you haven't lost
much.
But also, I've ran it a couple more times and it is some regression.
Staging reliably passes while staging+this series fails.

Unfortunately I don't have any more info besides it hangs before
printing anything on serial or VGA. Maybe it hanging only on Intel but
not AMD is some hint? Or maybe it's some memory layout or firmware
differences that matter here (bootloader is exactly the same)?
FWIW some links:
successful staging run on ADL: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980146338
failed staging+this run on ADL: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980330394
successful staging run on Zen3+: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980146359
successful staging+this run on Zen3+: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980146359

> > It does pass on Zen 3+ runners.
> >
> > Since there were some issues with the ADL runner today on plain staging,
> > I'm not 100% sure if it isn't some infrastructure issue yet. But the
> > symptoms look different than usual infra issues (and different than
> > todays failures on staging), so I think it's more likely an issue with
> > the patches here.
> >
> > > Changes since v1, more details in specific commits:
> > > - style updates;
> > > - comments and descriptions improvements;
> > > - other improvements.
> > >
> > > Changes since v2:
> > > - rebased on master, resolved conflicts;
> > > - add comment on trampoline section.
> > >
> > > Changes since v3:
> > > - changed new function name;
> > > - declare efi_multiboot2 in a separate header;
> > > - distinguish entry point from using magic number;
> > > - other minor changes (see commens in commits).
> > >
> > > Changes since v4:
> > > - rebase on staging;
> > > - set %fs and %gs as other segment registers;
> > > - style and other changes.
> > >
> > > Changes since v5:
> > > - fixed a typo.
> > >
> > > Changes since v6:
> > > - remove merged patch;
> > > - comment and style;
> > > - change some pointer checks to avoid overflows;
> > > - rename parse-mbi2.c to mbi2.c.
> > >
> > > Frediano Ziglio (2):
> > >   x86/boot: Rewrite EFI/MBI2 code partly in C
> > >   x86/boot: Improve MBI2 structure check
> > >
> > >  xen/arch/x86/boot/head.S       | 146 +++++++--------------------------
> > >  xen/arch/x86/efi/Makefile      |   1 +
> > >  xen/arch/x86/efi/efi-boot.h    |   7 +-
> > >  xen/arch/x86/efi/mbi2.c        |  66 +++++++++++++++
> > >  xen/arch/x86/efi/stub.c        |  10 +--
> > >  xen/arch/x86/include/asm/efi.h |  18 ++++
> > >  6 files changed, 123 insertions(+), 125 deletions(-)
> > >  create mode 100644 xen/arch/x86/efi/mbi2.c
> > >  create mode 100644 xen/arch/x86/include/asm/efi.h
> > >
> 
> Frediano
Andrew Cooper Oct. 3, 2024, 7:46 a.m. UTC | #4
On 03/10/2024 2:11 am, Marek Marczykowski-Górecki wrote:
> On Wed, Oct 02, 2024 at 04:27:19PM +0100, Frediano Ziglio wrote:
>> On Wed, Oct 2, 2024 at 3:04 PM Marek Marczykowski-Górecki
>> <marmarek@invisiblethingslab.com> wrote:
>>> On Tue, Oct 01, 2024 at 11:22:37AM +0100, Frediano Ziglio wrote:
>>>> This series came from part of the work of removing duplications between
>>>> boot code and rewriting part of code from assembly to C.
>>>> Rewrites EFI code in pure C.
>>> The MB2+EFI tests on Adler Lake fail with this series:
>>> https://gitlab.com/xen-project/people/marmarek/xen/-/pipelines/1478766782
>>> Looking at the VGA output (unfortunately not collected by the test
>>> itself) it hangs just after bootloader, before printing anything on the
>>> screen (or even clearing it after bootloader). The serial is silent too.
>>>
>> I tried multiple times to take a look at the logs, but I keep getting error 500.
> 500? That's weird. Anyway, serial log is empty, so you haven't lost
> much.
> But also, I've ran it a couple more times and it is some regression.
> Staging reliably passes while staging+this series fails.
>
> Unfortunately I don't have any more info besides it hangs before
> printing anything on serial or VGA. Maybe it hanging only on Intel but
> not AMD is some hint? Or maybe it's some memory layout or firmware
> differences that matter here (bootloader is exactly the same)?
> FWIW some links:
> successful staging run on ADL: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980146338
> failed staging+this run on ADL: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980330394
> successful staging run on Zen3+: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980146359
> successful staging+this run on Zen3+: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980146359

Ok, we should do several things.

Frediano, can you split out the efi_multiboot2() prototype in asm/efi.h
and submit it separately from introducing efi_multiboot2_prelude(). 
Mechanically, it's a somewhat-large part of the patch, and won't be
related to whatever other failure is going on here.

Everyone, is it time to consider earlyprintk() ?  This is not the first
time something like this has happened, and it wont be the last.  I for
one am quite bored of doing ad-hoc debugging each time...

~Andrew
Frediano Ziglio Oct. 3, 2024, 9:27 a.m. UTC | #5
On Thu, Oct 3, 2024 at 2:11 AM Marek Marczykowski-Górecki
<marmarek@invisiblethingslab.com> wrote:
>
> On Wed, Oct 02, 2024 at 04:27:19PM +0100, Frediano Ziglio wrote:
> > On Wed, Oct 2, 2024 at 3:04 PM Marek Marczykowski-Górecki
> > <marmarek@invisiblethingslab.com> wrote:
> > >
> > > On Tue, Oct 01, 2024 at 11:22:37AM +0100, Frediano Ziglio wrote:
> > > > This series came from part of the work of removing duplications between
> > > > boot code and rewriting part of code from assembly to C.
> > > > Rewrites EFI code in pure C.
> > >
> > > The MB2+EFI tests on Adler Lake fail with this series:
> > > https://gitlab.com/xen-project/people/marmarek/xen/-/pipelines/1478766782
> > > Looking at the VGA output (unfortunately not collected by the test
> > > itself) it hangs just after bootloader, before printing anything on the
> > > screen (or even clearing it after bootloader). The serial is silent too.
> > >
> >
> > I tried multiple times to take a look at the logs, but I keep getting error 500.
>
> 500? That's weird. Anyway, serial log is empty, so you haven't lost
> much.

I'm getting pretty consistently. I can see the full pipeline, but not
the single logs. I tried with both Firefox and Chrome, I also tried
from home (just to check for something like firewall issues), always
error 500.

> But also, I've ran it a couple more times and it is some regression.
> Staging reliably passes while staging+this series fails.
>
> Unfortunately I don't have any more info besides it hangs before
> printing anything on serial or VGA. Maybe it hanging only on Intel but
> not AMD is some hint? Or maybe it's some memory layout or firmware
> differences that matter here (bootloader is exactly the same)?
> FWIW some links:
> successful staging run on ADL: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980146338
> failed staging+this run on ADL: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980330394
> successful staging run on Zen3+: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980146359
> successful staging+this run on Zen3+: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980146359
>

Furthermore, I tried with 2 additional machines in our Lab, one Intel,
another AMD, both working for me.
Either your compiler did something different or something special on
the firmware.

I could try downloading your executables and machines there, but as I
said, I'm getting error 500 (not sure if we package artifacts).

Can you try without last commit ?

Frediano

> > > It does pass on Zen 3+ runners.
> > >
> > > Since there were some issues with the ADL runner today on plain staging,
> > > I'm not 100% sure if it isn't some infrastructure issue yet. But the
> > > symptoms look different than usual infra issues (and different than
> > > todays failures on staging), so I think it's more likely an issue with
> > > the patches here.
> > >
> > > > Changes since v1, more details in specific commits:
> > > > - style updates;
> > > > - comments and descriptions improvements;
> > > > - other improvements.
> > > >
> > > > Changes since v2:
> > > > - rebased on master, resolved conflicts;
> > > > - add comment on trampoline section.
> > > >
> > > > Changes since v3:
> > > > - changed new function name;
> > > > - declare efi_multiboot2 in a separate header;
> > > > - distinguish entry point from using magic number;
> > > > - other minor changes (see commens in commits).
> > > >
> > > > Changes since v4:
> > > > - rebase on staging;
> > > > - set %fs and %gs as other segment registers;
> > > > - style and other changes.
> > > >
> > > > Changes since v5:
> > > > - fixed a typo.
> > > >
> > > > Changes since v6:
> > > > - remove merged patch;
> > > > - comment and style;
> > > > - change some pointer checks to avoid overflows;
> > > > - rename parse-mbi2.c to mbi2.c.
> > > >
> > > > Frediano Ziglio (2):
> > > >   x86/boot: Rewrite EFI/MBI2 code partly in C
> > > >   x86/boot: Improve MBI2 structure check
> > > >
> > > >  xen/arch/x86/boot/head.S       | 146 +++++++--------------------------
> > > >  xen/arch/x86/efi/Makefile      |   1 +
> > > >  xen/arch/x86/efi/efi-boot.h    |   7 +-
> > > >  xen/arch/x86/efi/mbi2.c        |  66 +++++++++++++++
> > > >  xen/arch/x86/efi/stub.c        |  10 +--
> > > >  xen/arch/x86/include/asm/efi.h |  18 ++++
> > > >  6 files changed, 123 insertions(+), 125 deletions(-)
> > > >  create mode 100644 xen/arch/x86/efi/mbi2.c
> > > >  create mode 100644 xen/arch/x86/include/asm/efi.h
> > > >
Marek Marczykowski-Górecki Oct. 3, 2024, 10:46 a.m. UTC | #6
On Thu, Oct 03, 2024 at 10:27:15AM +0100, Frediano Ziglio wrote:
> On Thu, Oct 3, 2024 at 2:11 AM Marek Marczykowski-Górecki
> <marmarek@invisiblethingslab.com> wrote:
> >
> > On Wed, Oct 02, 2024 at 04:27:19PM +0100, Frediano Ziglio wrote:
> > > On Wed, Oct 2, 2024 at 3:04 PM Marek Marczykowski-Górecki
> > > <marmarek@invisiblethingslab.com> wrote:
> > > >
> > > > On Tue, Oct 01, 2024 at 11:22:37AM +0100, Frediano Ziglio wrote:
> > > > > This series came from part of the work of removing duplications between
> > > > > boot code and rewriting part of code from assembly to C.
> > > > > Rewrites EFI code in pure C.
> > > >
> > > > The MB2+EFI tests on Adler Lake fail with this series:
> > > > https://gitlab.com/xen-project/people/marmarek/xen/-/pipelines/1478766782
> > > > Looking at the VGA output (unfortunately not collected by the test
> > > > itself) it hangs just after bootloader, before printing anything on the
> > > > screen (or even clearing it after bootloader). The serial is silent too.
> > > >
> > >
> > > I tried multiple times to take a look at the logs, but I keep getting error 500.
> >
> > 500? That's weird. Anyway, serial log is empty, so you haven't lost
> > much.
> 
> I'm getting pretty consistently. I can see the full pipeline, but not
> the single logs. I tried with both Firefox and Chrome, I also tried
> from home (just to check for something like firewall issues), always
> error 500.
> 
> > But also, I've ran it a couple more times and it is some regression.
> > Staging reliably passes while staging+this series fails.
> >
> > Unfortunately I don't have any more info besides it hangs before
> > printing anything on serial or VGA. Maybe it hanging only on Intel but
> > not AMD is some hint? Or maybe it's some memory layout or firmware
> > differences that matter here (bootloader is exactly the same)?
> > FWIW some links:
> > successful staging run on ADL: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980146338
> > failed staging+this run on ADL: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980330394
> > successful staging run on Zen3+: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980146359
> > successful staging+this run on Zen3+: https://gitlab.com/xen-project/people/marmarek/xen/-/jobs/7980146359
> >
> 
> Furthermore, I tried with 2 additional machines in our Lab, one Intel,
> another AMD, both working for me.
> Either your compiler did something different or something special on
> the firmware.
> 
> I could try downloading your executables and machines there, but as I
> said, I'm getting error 500 (not sure if we package artifacts).
> 
> Can you try without last commit ?

Yes, this seems to work:
https://gitlab.com/xen-project/people/marmarek/xen/-/pipelines/1480052345