diff mbox series

[for-4.0,v4,4/4] i386: allow to load initrd below 4G for recent linux

Message ID 1544063533-10139-5-git-send-email-lizhijian@cn.fujitsu.com (mailing list archive)
State New, archived
Series allow to load initrd below 4G for recent kernel

Commit Message

Li Zhijian Dec. 6, 2018, 2:32 a.m. UTC
A new field, xloadflags, was added to recent x86 Linux, and bit 1,
XLF_CAN_BE_LOADED_ABOVE_4G, is used to tell the bootloader where the initrd
can be loaded safely.

Current QEMU/BIOS always loads the initrd below below_4g_mem_size, which is
always less than 4G, so simply limiting initrd_max to 4G - 1 is enough if
this bit is set.

CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>

---
V3: correct grammar and check XLF_CAN_BE_LOADED_ABOVE_4G first (Michael S. Tsirkin)

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 hw/i386/pc.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Comments

Michael S. Tsirkin Dec. 21, 2018, 4:10 p.m. UTC | #1
On Thu, Dec 06, 2018 at 10:32:13AM +0800, Li Zhijian wrote:
> a new field xloadflags was added to recent x86 linux, and BIT 1:
> XLF_CAN_BE_LOADED_ABOVE_4G is used to tell bootload that where initrd can be
> loaded safely.
> 
> Current QEMU/BIOS always loads initrd below below_4g_mem_size which is always
> less than 4G, so here limiting initrd_max to 4G - 1 simply is enough if
> this bit is set.
> 
> CC: Paolo Bonzini <pbonzini@redhat.com>
> CC: Richard Henderson <rth@twiddle.net>
> CC: Eduardo Habkost <ehabkost@redhat.com>
> CC: "Michael S. Tsirkin" <mst@redhat.com>
> CC: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> 
> ---
> V3: correct grammar and check XLF_CAN_BE_LOADED_ABOVE_4G first (Michael S. Tsirkin)
> 
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> ---
>  hw/i386/pc.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 3b10726..baa99c0 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -904,7 +904,15 @@ static void load_linux(PCMachineState *pcms,
>  #endif
>  
>      /* highest address for loading the initrd */
> -    if (protocol >= 0x203) {
> +    if (protocol >= 0x20c &&
> +        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
> +        /*
> +         * Although kernel allows initrd loading to above 4G,
> +         * it just makes it as large as possible while still staying below 4G
> +         * since current BIOS always loads initrd below pcms->below_4g_mem_size
> +         */
> +        initrd_max = UINT32_MAX;
> +    } else if (protocol >= 0x203) {
>          initrd_max = ldl_p(header+0x22c);
>      } else {
>          initrd_max = 0x37ffffff;


I still have trouble understanding the above.
Does anyone else want to comment / help rephrase the comment
and commit log so it's readable?

> -- 
> 2.7.4
Eduardo Habkost Dec. 27, 2018, 8:31 p.m. UTC | #2
On Fri, Dec 21, 2018 at 11:10:30AM -0500, Michael S. Tsirkin wrote:
> On Thu, Dec 06, 2018 at 10:32:13AM +0800, Li Zhijian wrote:
> > a new field xloadflags was added to recent x86 linux, and BIT 1:
> > XLF_CAN_BE_LOADED_ABOVE_4G is used to tell bootload that where initrd can be
> > loaded safely.
> > 
> > Current QEMU/BIOS always loads initrd below below_4g_mem_size which is always
> > less than 4G, so here limiting initrd_max to 4G - 1 simply is enough if
> > this bit is set.
> > 
> > CC: Paolo Bonzini <pbonzini@redhat.com>
> > CC: Richard Henderson <rth@twiddle.net>
> > CC: Eduardo Habkost <ehabkost@redhat.com>
> > CC: "Michael S. Tsirkin" <mst@redhat.com>
> > CC: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> > Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> > 
> > ---
> > V3: correct grammar and check XLF_CAN_BE_LOADED_ABOVE_4G first (Michael S. Tsirkin)
> > 
> > Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> > ---
> >  hw/i386/pc.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 3b10726..baa99c0 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -904,7 +904,15 @@ static void load_linux(PCMachineState *pcms,
> >  #endif
> >  
> >      /* highest address for loading the initrd */
> > -    if (protocol >= 0x203) {
> > +    if (protocol >= 0x20c &&
> > +        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
> > +        /*
> > +         * Although kernel allows initrd loading to above 4G,
> > +         * it just makes it as large as possible while still staying below 4G
> > +         * since current BIOS always loads initrd below pcms->below_4g_mem_size
> > +         */
> > +        initrd_max = UINT32_MAX;
> > +    } else if (protocol >= 0x203) {
> >          initrd_max = ldl_p(header+0x22c);
> >      } else {
> >          initrd_max = 0x37ffffff;
> 
> 
> I still have trouble understanding the above.
> Anyone else wants to comment / help rephrase the comment
> and commit log so it's readable?


The comment seems to contradict what I see in the code:

| Although kernel allows initrd loading to above 4G,

Sounds correct.


| it just makes it as large as possible while still staying below 4G

I'm not a native English speaker, but I believe "it" here should
be interpreted as "the kernel", which would be incorrect.  It's
this QEMU function that limits initrd_max to a uint32 value, not
the kernel.


| since current BIOS always loads initrd below pcms->below_4g_mem_size

I don't know why the BIOS is mentioned here.  The
below_4g_mem_size limit comes from these 2 lines inside
load_linux():

    if (initrd_max >= pcms->below_4g_mem_size - pcmc->acpi_data_size) {
        initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
    }

In addition to that, initrd_max is uint32_t simply because QEMU
doesn't support the 64-bit boot protocol (specifically the
ext_ramdisk_image field), so all talk about below_4g_mem_size
seems to be just a distraction.

All that said, I'm missing one piece of information here: is
XLF_CAN_BE_LOADED_ABOVE_4G really supposed to override
header+0x22c?  linux/Documentation/x86/boot.txt isn't clear about
that.  Is there any reference that can help us confirm this?
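
To illustrate the ext_ramdisk_image point above: in the 64-bit boot protocol, a ramdisk address is split into a low 32-bit half (ramdisk_image) and a high 32-bit half (ext_ramdisk_image). The following is a minimal sketch under that assumption, not QEMU or kernel code; the helper names are hypothetical. A loader that only ever writes the low half, as described for QEMU in this thread, can only express addresses whose high half is zero, which is why a uint32_t initrd_max is sufficient there.

```c
#include <stdint.h>

/* Hypothetical helper: low 32 bits, as stored in the ramdisk_image field. */
static uint32_t ramdisk_low32(uint64_t ramdisk_addr)
{
    return (uint32_t)(ramdisk_addr & 0xffffffffu);
}

/* Hypothetical helper: high 32 bits, as stored in ext_ramdisk_image. */
static uint32_t ramdisk_high32(uint64_t ramdisk_addr)
{
    return (uint32_t)(ramdisk_addr >> 32);
}

/* Nonzero when the address is expressible without ext_ramdisk_image. */
static int fits_without_ext(uint64_t ramdisk_addr)
{
    return ramdisk_high32(ramdisk_addr) == 0;
}
```

For example, an address of 0x123456000 would need ext_ramdisk_image set to 1, so a loader without that field has to stay below 4G regardless of what the kernel permits.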
Li Zhijian Dec. 28, 2018, 7:20 a.m. UTC | #3
On 12/28/2018 4:31 AM, Eduardo Habkost wrote:
> On Fri, Dec 21, 2018 at 11:10:30AM -0500, Michael S. Tsirkin wrote:
>> On Thu, Dec 06, 2018 at 10:32:13AM +0800, Li Zhijian wrote:
>>>   
>>>       /* highest address for loading the initrd */
>>> -    if (protocol >= 0x203) {
>>> +    if (protocol >= 0x20c &&
>>> +        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
>>> +        /*
>>> +         * Although kernel allows initrd loading to above 4G,
>>> +         * it just makes it as large as possible while still staying below 4G
>>> +         * since current BIOS always loads initrd below pcms->below_4g_mem_size
>>> +         */
>>> +        initrd_max = UINT32_MAX;
>>> +    } else if (protocol >= 0x203) {
>>>           initrd_max = ldl_p(header+0x22c);
>>>       } else {
>>>           initrd_max = 0x37ffffff;
>>
>> I still have trouble understanding the above.
>> Anyone else wants to comment / help rephrase the comment
>> and commit log so it's readable?
>
> The comment seems to contradict what I see on the code:
>
> | Although kernel allows initrd loading to above 4G,
>
> Sounds correct.
>
>
> | it just makes it as large as possible while still staying below 4G
>
> I'm not a native English speaker, but I believe "it" here should
> be interpreted as "the kernel", which would be incorrect.  It's
> this QEMU function that limits initrd_max to a uint32 value, not
> the kernel.
>
>
> | since current BIOS always loads initrd below pcms->below_4g_mem_size
>
> I don't know why the BIOS is mentioned here.  The
> below_4g_mem_size limit comes from these 2 lines inside
> load_linux():
>
>      if (initrd_max >= pcms->below_4g_mem_size - pcmc->acpi_data_size) {
>          initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
>      }
> In addition to that, initrd_max is uint32_t simply because QEMU
> doesn't support the 64-bit boot protocol (specifically the
> ext_ramdisk_image field),

Thanks for explaining this :), it wasn't clear to me before.



> so all talk about below_4g_mem_size
> seems to be just a distraction.
>
> All that said, I miss one piece of information here: is
> XLF_CAN_BE_LOADED_ABOVE_4G really supposed to override
> header+0x22c?  linux/Documentation/x86/boot.txt isn't clear about
> that.  Is there any reference that can help us confirm this?

Good question.

Since I'm not familiar with the boot protocol, I also CCed LKML for help at the beginning:
https://lkml.org/lkml/2018/11/10/82

Ingo said:
>> > If XLF_CAN_BE_LOADED_ABOVE_4G is not set, then you most likely are on
>> > a 32-bit kernel and there are more fundamental limits (even if you were
>> > to load it above the 2 GB mark, you would be limited by the size of
>> > kernel memory.)
>> >
>> > So, in case you are wondering: the bootloader that broke when setting
>> > the initrd_max field above 2 GB was, of course, Grub.
>> >
>> > So just use XLF_CAN_BE_LOADED_ABOVE_4G. There is no need for a new flag
>> > or new field.
>>
>> That's nice, and that's the best solution!

That makes me believe that if XLF_CAN_BE_LOADED_ABOVE_4G is set, loading below 4G is allowed too.

If the above is credible, maybe we can update the comment like this:
-------
QEMU doesn't support the 64-bit boot protocol (specifically the
ext_ramdisk_image field).

In addition, the kernel allows loading the initrd above 4G if XLF_CAN_BE_LOADED_ABOVE_4G is set,
so we believe that loading the initrd below 4G is allowed too.

For simplicity, just setting initrd_max to UINT32_MAX is enough and safe.
-------
  
Thanks
Zhijian
Stefano Garzarella Jan. 7, 2019, 12:11 p.m. UTC | #4
Hi,

On Thu, Dec 27, 2018 at 9:32 PM Eduardo Habkost <ehabkost@redhat.com> wrote:
>
> On Fri, Dec 21, 2018 at 11:10:30AM -0500, Michael S. Tsirkin wrote:
> > On Thu, Dec 06, 2018 at 10:32:13AM +0800, Li Zhijian wrote:
> > > a new field xloadflags was added to recent x86 linux, and BIT 1:
> > > XLF_CAN_BE_LOADED_ABOVE_4G is used to tell bootload that where initrd can be
> > > loaded safely.
> > >
> > > Current QEMU/BIOS always loads initrd below below_4g_mem_size which is always
> > > less than 4G, so here limiting initrd_max to 4G - 1 simply is enough if
> > > this bit is set.
> > >
> > > CC: Paolo Bonzini <pbonzini@redhat.com>
> > > CC: Richard Henderson <rth@twiddle.net>
> > > CC: Eduardo Habkost <ehabkost@redhat.com>
> > > CC: "Michael S. Tsirkin" <mst@redhat.com>
> > > CC: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> > > Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> > >
> > > ---
> > > V3: correct grammar and check XLF_CAN_BE_LOADED_ABOVE_4G first (Michael S. Tsirkin)
> > >
> > > Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> > > ---
> > >  hw/i386/pc.c | 10 +++++++++-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > index 3b10726..baa99c0 100644
> > > --- a/hw/i386/pc.c
> > > +++ b/hw/i386/pc.c
> > > @@ -904,7 +904,15 @@ static void load_linux(PCMachineState *pcms,
> > >  #endif
> > >
> > >      /* highest address for loading the initrd */
> > > -    if (protocol >= 0x203) {
> > > +    if (protocol >= 0x20c &&
> > > +        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
> > > +        /*
> > > +         * Although kernel allows initrd loading to above 4G,
> > > +         * it just makes it as large as possible while still staying below 4G
> > > +         * since current BIOS always loads initrd below pcms->below_4g_mem_size
> > > +         */
> > > +        initrd_max = UINT32_MAX;
> > > +    } else if (protocol >= 0x203) {
> > >          initrd_max = ldl_p(header+0x22c);
> > >      } else {
> > >          initrd_max = 0x37ffffff;
> >
> >
> > I still have trouble understanding the above.
> > Anyone else wants to comment / help rephrase the comment
> > and commit log so it's readable?
>
>
> The comment seems to contradict what I see on the code:
>
> | Although kernel allows initrd loading to above 4G,
>
> Sounds correct.
>
>
> | it just makes it as large as possible while still staying below 4G
>
> I'm not a native English speaker, but I believe "it" here should
> be interpreted as "the kernel", which would be incorrect.  It's
> this QEMU function that limits initrd_max to a uint32 value, not
> the kernel.
>
>
> | since current BIOS always loads initrd below pcms->below_4g_mem_size
>
> I don't know why the BIOS is mentioned here.  The
> below_4g_mem_size limit comes from these 2 lines inside
> load_linux():
>
>     if (initrd_max >= pcms->below_4g_mem_size - pcmc->acpi_data_size) {
>         initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
>     }
> In addition to that, initrd_max is uint32_t simply because QEMU
> doesn't support the 64-bit boot protocol (specifically the
> ext_ramdisk_image field), so all talk about below_4g_mem_size
> seems to be just a distraction.
>
> All that said, I miss one piece of information here: is
> XLF_CAN_BE_LOADED_ABOVE_4G really supposed to override
> header+0x22c?  linux/Documentation/x86/boot.txt isn't clear about
> that.  Is there any reference that can help us confirm this?

Looking at the following patch, it seems that we can confirm the assumption:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4bf7111f50167133a71c23530ca852a41355e739

Note: the patch was reverted due to bugs in some firmwares, but IMHO
the assumption is correct.
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=47226ad4f4cfd1e91ded7f2ec42f83ff1c624663

Cheers,
Stefano

>
> --
> Eduardo
>
Paolo Bonzini Jan. 7, 2019, 11:35 p.m. UTC | #5
On 27/12/18 21:31, Eduardo Habkost wrote:
> All that said, I miss one piece of information here: is
> XLF_CAN_BE_LOADED_ABOVE_4G really supposed to override
> header+0x22c?  linux/Documentation/x86/boot.txt isn't clear about
> that.  Is there any reference that can help us confirm this?

Linux has supported initrd up to 4 GB for a very long time (2007, long
before XLF_CAN_BE_LOADED_ABOVE_4G which was added in 2013), though it
only sets initrd_max to 2 GB to "work around bootloader bugs".  So I
guess the flag can be taken as a hint that you can load at any address,
and perhaps could be renamed.

Paolo
Li Zhijian Jan. 9, 2019, 6:22 a.m. UTC | #6
On 1/7/19 20:11, Stefano Garzarella wrote:
> Hi,
>
> On Thu, Dec 27, 2018 at 9:32 PM Eduardo Habkost <ehabkost@redhat.com> wrote:
>> On Fri, Dec 21, 2018 at 11:10:30AM -0500, Michael S. Tsirkin wrote:
>>> On Thu, Dec 06, 2018 at 10:32:13AM +0800, Li Zhijian wrote:
>>>> a new field xloadflags was added to recent x86 linux, and BIT 1:
>>>> XLF_CAN_BE_LOADED_ABOVE_4G is used to tell bootload that where initrd can be
>>>> loaded safely.
>>>>
>>>> Current QEMU/BIOS always loads initrd below below_4g_mem_size which is always
>>>> less than 4G, so here limiting initrd_max to 4G - 1 simply is enough if
>>>> this bit is set.
>>>>
>>>> CC: Paolo Bonzini <pbonzini@redhat.com>
>>>> CC: Richard Henderson <rth@twiddle.net>
>>>> CC: Eduardo Habkost <ehabkost@redhat.com>
>>>> CC: "Michael S. Tsirkin" <mst@redhat.com>
>>>> CC: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
>>>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>>>>
>>>> ---
>>>> V3: correct grammar and check XLF_CAN_BE_LOADED_ABOVE_4G first (Michael S. Tsirkin)
>>>>
>>>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>>>> ---
>>>>   hw/i386/pc.c | 10 +++++++++-
>>>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>>>> index 3b10726..baa99c0 100644
>>>> --- a/hw/i386/pc.c
>>>> +++ b/hw/i386/pc.c
>>>> @@ -904,7 +904,15 @@ static void load_linux(PCMachineState *pcms,
>>>>   #endif
>>>>
>>>>       /* highest address for loading the initrd */
>>>> -    if (protocol >= 0x203) {
>>>> +    if (protocol >= 0x20c &&
>>>> +        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
>>>> +        /*
>>>> +         * Although kernel allows initrd loading to above 4G,
>>>> +         * it just makes it as large as possible while still staying below 4G
>>>> +         * since current BIOS always loads initrd below pcms->below_4g_mem_size
>>>> +         */
>>>> +        initrd_max = UINT32_MAX;
>>>> +    } else if (protocol >= 0x203) {
>>>>           initrd_max = ldl_p(header+0x22c);
>>>>       } else {
>>>>           initrd_max = 0x37ffffff;
>>>
>>> I still have trouble understanding the above.
>>> Anyone else wants to comment / help rephrase the comment
>>> and commit log so it's readable?
>>
>> The comment seems to contradict what I see on the code:
>>
>> | Although kernel allows initrd loading to above 4G,
>>
>> Sounds correct.
>>
>>
>> | it just makes it as large as possible while still staying below 4G
>>
>> I'm not a native English speaker, but I believe "it" here should
>> be interpreted as "the kernel", which would be incorrect.  It's
>> this QEMU function that limits initrd_max to a uint32 value, not
>> the kernel.
>>
>>
>> | since current BIOS always loads initrd below pcms->below_4g_mem_size
>>
>> I don't know why the BIOS is mentioned here.  The
>> below_4g_mem_size limit comes from these 2 lines inside
>> load_linux():
>>
>>      if (initrd_max >= pcms->below_4g_mem_size - pcmc->acpi_data_size) {
>>          initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
>>      }
>> In addition to that, initrd_max is uint32_t simply because QEMU
>> doesn't support the 64-bit boot protocol (specifically the
>> ext_ramdisk_image field), so all talk about below_4g_mem_size
>> seems to be just a distraction.
>>
>> All that said, I miss one piece of information here: is
>> XLF_CAN_BE_LOADED_ABOVE_4G really supposed to override
>> header+0x22c?  linux/Documentation/x86/boot.txt isn't clear about
>> that.  Is there any reference that can help us confirm this?
> Looking at the following patch seems that we can confirm the assumption:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4bf7111f50167133a71c23530ca852a41355e739
>
> Note: the patch was reverted due to bugs in some firmwares, but IMHO
> the assumption is correct.
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=47226ad4f4cfd1e91ded7f2ec42f83ff1c624663

Thanks for your info.

When '-kernel vmlinux -initrd initrd.cgz' is used to launch a VM,
the firmware (it could be linuxboot_dma.bin) reads the initrd
contents into guest memory (below ramdisk_max) and jumps to the kernel.
That's similar to what a bootloader such as grub does.

And the firmware works well with some fixes in this patchset.

Zhijian
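
As a rough illustration of what loading "below ramdisk_max" means for the placement address, the sketch below puts the initrd as high as possible under the limit, rounded down to a 4 KiB page boundary. This is an assumption about the placement policy made for illustration, not a copy of the loader; place_initrd() and its parameters are hypothetical names.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns the load address for an initrd of
 * initrd_size bytes that must end at or below initrd_max, or -1 when
 * the image does not fit under the limit.
 */
static int64_t place_initrd(uint32_t initrd_max, uint64_t initrd_size)
{
    if (initrd_size >= initrd_max) {
        return -1; /* initrd is too large for the allowed range */
    }
    /* Highest page-aligned start address that keeps the image below the limit. */
    return (int64_t)((initrd_max - (uint32_t)initrd_size) & ~(uint32_t)4095);
}
```

With the limit raised from the header's initrd_addr_max to UINT32_MAX (and then clamped to guest RAM below 4G), the same computation yields a higher address and can accept a larger initrd.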


>
> Cheers,
> Stefano
>
>> --
>> Eduardo
>>
>
> .
>
Patch

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 3b10726..baa99c0 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -904,7 +904,15 @@  static void load_linux(PCMachineState *pcms,
 #endif
 
     /* highest address for loading the initrd */
-    if (protocol >= 0x203) {
+    if (protocol >= 0x20c &&
+        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
+        /*
+         * Although kernel allows initrd loading to above 4G,
+         * it just makes it as large as possible while still staying below 4G
+         * since current BIOS always loads initrd below pcms->below_4g_mem_size
+         */
+        initrd_max = UINT32_MAX;
+    } else if (protocol >= 0x203) {
         initrd_max = ldl_p(header+0x22c);
     } else {
         initrd_max = 0x37ffffff;
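
For readers following the thread, the selection logic in the hunk above, together with the below_4g_mem_size clamp quoted in the review, can be sketched as a standalone function. This is an illustrative sketch, not QEMU code: pick_initrd_max() and its parameters are hypothetical names, the header fields are passed in as plain integers instead of being read with lduw_p()/ldl_p(), the flag value (1 << 1) is taken from the commit message's "BIT 1", and below_4g_mem_size is assumed to be larger than acpi_data_size.

```c
#include <stdint.h>

/* Bit 1 of xloadflags, per the commit message ("BIT 1"). */
#define XLF_CAN_BE_LOADED_ABOVE_4G (1u << 1)

/*
 * Hypothetical helper mirroring the patched branch in load_linux().
 *   protocol:        boot protocol version from the setup header
 *   xloadflags:      16-bit field at header+0x236 (protocol >= 0x20c)
 *   initrd_addr_max: 32-bit field at header+0x22c (protocol >= 0x203)
 *   below_4g_mem_size, acpi_data_size: stand-ins for the pcms/pcmc fields
 */
static uint32_t pick_initrd_max(uint16_t protocol, uint16_t xloadflags,
                                uint32_t initrd_addr_max,
                                uint64_t below_4g_mem_size,
                                uint64_t acpi_data_size)
{
    uint32_t initrd_max;

    if (protocol >= 0x20c && (xloadflags & XLF_CAN_BE_LOADED_ABOVE_4G)) {
        /* Kernel accepts any address; the loader itself is limited to 32 bits. */
        initrd_max = UINT32_MAX;
    } else if (protocol >= 0x203) {
        /* Older protocol: honor the limit advertised in the header. */
        initrd_max = initrd_addr_max;
    } else {
        /* Pre-2.03 kernels: fixed legacy limit. */
        initrd_max = 0x37ffffff;
    }

    /* Clamp to guest RAM below 4G, leaving room for ACPI data. */
    if (initrd_max >= below_4g_mem_size - acpi_data_size) {
        initrd_max = (uint32_t)(below_4g_mem_size - acpi_data_size - 1);
    }
    return initrd_max;
}
```

With 3 GiB of RAM below 4G and the flag set, the result is bounded by the clamp rather than by the header's initrd_addr_max, which is the behavior change the patch is after.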