diff mbox series

Mini-OS: x86: zero out .bss segment at boot

Message ID 20240207103138.26901-1-jgross@suse.com (mailing list archive)
State New, archived
Headers show
Series Mini-OS: x86: zero out .bss segment at boot | expand

Commit Message

Jürgen Groß Feb. 7, 2024, 10:31 a.m. UTC
The .bss segment should be zeroed at very early boot.

While adding the extern declaration of __bss_start for x86, make it
together with the other linker table defined section boundaries
common for all architectures.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 arch/x86/setup.c      | 2 ++
 include/arm/arch_mm.h | 1 -
 include/mm.h          | 2 ++
 include/x86/arch_mm.h | 1 -
 4 files changed, 4 insertions(+), 2 deletions(-)

Comments

Samuel Thibault Feb. 7, 2024, 10:38 a.m. UTC | #1
Juergen Gross, le mer. 07 févr. 2024 11:31:38 +0100, a ecrit:
> The .bss segment should be zeroed at very early boot.

Is that not done by the elf loader of Xen?

> While adding the extern declaration of __bss_start for x86, make it
> together with the other linker table defined section boundaries
> common for all architectures.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>  arch/x86/setup.c      | 2 ++
>  include/arm/arch_mm.h | 1 -
>  include/mm.h          | 2 ++
>  include/x86/arch_mm.h | 1 -
>  4 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/setup.c b/arch/x86/setup.c
> index b27bbed7..3dddf4ad 100644
> --- a/arch/x86/setup.c
> +++ b/arch/x86/setup.c
> @@ -184,6 +184,8 @@ arch_init(void *par)
>  {
>  	static char hello[] = "Bootstrapping...\n";
>  
> +	memset(&__bss_start, 0, &_end - &__bss_start);
> +
>  	hpc_init();
>  	(void)HYPERVISOR_console_io(CONSOLEIO_write, strlen(hello), hello);
>  
> diff --git a/include/arm/arch_mm.h b/include/arm/arch_mm.h
> index 79d9e05b..335eb4ff 100644
> --- a/include/arm/arch_mm.h
> +++ b/include/arm/arch_mm.h
> @@ -3,7 +3,6 @@
>  
>  typedef uint64_t paddr_t;
>  
> -extern char _text, _etext, _erodata, _edata, _end, __bss_start;
>  extern int _boot_stack[];
>  extern int _boot_stack_end[];
>  extern uint32_t physical_address_offset;	/* Add this to a virtual address to get the physical address (wraps at 4GB) */
> diff --git a/include/mm.h b/include/mm.h
> index 4fc364ff..e02e080b 100644
> --- a/include/mm.h
> +++ b/include/mm.h
> @@ -46,6 +46,8 @@
>  #define round_pgdown(_p)  ((_p) & PAGE_MASK)
>  #define round_pgup(_p)    (((_p) + (PAGE_SIZE - 1)) & PAGE_MASK)
>  
> +extern char _text, _etext, _erodata, _edata, _end, __bss_start;
> +
>  extern unsigned long nr_free_pages;
>  
>  extern unsigned long *mm_alloc_bitmap;
> diff --git a/include/x86/arch_mm.h b/include/x86/arch_mm.h
> index a1b975dc..6b398cef 100644
> --- a/include/x86/arch_mm.h
> +++ b/include/x86/arch_mm.h
> @@ -225,7 +225,6 @@ extern unsigned long *phys_to_machine_mapping;
>  #else
>  extern pgentry_t page_table_base[];
>  #endif
> -extern char _text, _etext, _erodata, _edata, _end;
>  extern unsigned long mfn_zero;
>  static __inline__ maddr_t phys_to_machine(paddr_t phys)
>  {
> -- 
> 2.35.3
Jan Beulich Feb. 7, 2024, 10:39 a.m. UTC | #2
On 07.02.2024 11:31, Juergen Gross wrote:
> --- a/arch/x86/setup.c
> +++ b/arch/x86/setup.c
> @@ -184,6 +184,8 @@ arch_init(void *par)
>  {
>  	static char hello[] = "Bootstrapping...\n";
>  
> +	memset(&__bss_start, 0, &_end - &__bss_start);

Doesn't / shouldn't the loader guarantee this? I ask in particular
because doing this here implies earlier assembly code may not write
to any variable in .bss (nothing like that looks to exist right now,
but who knows what may be added at some point).

> --- a/include/mm.h
> +++ b/include/mm.h
> @@ -46,6 +46,8 @@
>  #define round_pgdown(_p)  ((_p) & PAGE_MASK)
>  #define round_pgup(_p)    (((_p) + (PAGE_SIZE - 1)) & PAGE_MASK)
>  
> +extern char _text, _etext, _erodata, _edata, _end, __bss_start;

Maybe use the more conventional array form? Thus also eliminating
the need for & at the use sites?

Jan
Jürgen Groß Feb. 7, 2024, 10:42 a.m. UTC | #3
On 07.02.24 11:38, Samuel Thibault wrote:
> Juergen Gross, le mer. 07 févr. 2024 11:31:38 +0100, a ecrit:
>> The .bss segment should be zeroed at very early boot.
> 
> Is that not done by the elf loader of Xen?

It might be done by Xen tools today, but I'm quite sure it is not part
of the ABI. The hypervisor doesn't clear .bss when loading a domain (e.g.
dom0).

I stumbled over it while implementing kexec in Mini-OS. For that I need
it for sure.


Juergen
Jan Beulich Feb. 7, 2024, 10:46 a.m. UTC | #4
On 07.02.2024 11:42, Jürgen Groß wrote:
> On 07.02.24 11:38, Samuel Thibault wrote:
>> Juergen Gross, le mer. 07 févr. 2024 11:31:38 +0100, a ecrit:
>>> The .bss segment should be zeroed at very early boot.
>>
>> Is that not done by the elf loader of Xen?
> 
> It might be done by Xen tools today, but I'm quite sure it is not part
> of the ABI. The hypervisor doesn't clear .bss when loading a domain (e.g.
> dom0).

libelf's elf_memcpy() looks to say otherwise.


> I stumbled over it while implementing kexec in Mini-OS. For that I need
> it for sure.

Hmm, yes, if the protocol there doesn't guarantee zeroing. But then you
probably want to give this as the reason for the change?

Jan
Jürgen Groß Feb. 7, 2024, 10:46 a.m. UTC | #5
On 07.02.24 11:39, Jan Beulich wrote:
> On 07.02.2024 11:31, Juergen Gross wrote:
>> --- a/arch/x86/setup.c
>> +++ b/arch/x86/setup.c
>> @@ -184,6 +184,8 @@ arch_init(void *par)
>>   {
>>   	static char hello[] = "Bootstrapping...\n";
>>   
>> +	memset(&__bss_start, 0, &_end - &__bss_start);
> 
> Doesn't / shouldn't the loader guarantee this? I ask in particular
> because doing this here implies earlier assembly code may not write
> to any variable in .bss (nothing like that looks to exist right now,
> but who knows what may be added at some point).

Oh, am I misremembering that the hypervisor doesn't do the clearing?

I do remember a bug when dom0 Linux crashed due to a not zeroed .bss,
so I added the zeroing in the paravirt startup code of the kernel.

But maybe this was due to a bug in the linker script of the kernel.

> 
>> --- a/include/mm.h
>> +++ b/include/mm.h
>> @@ -46,6 +46,8 @@
>>   #define round_pgdown(_p)  ((_p) & PAGE_MASK)
>>   #define round_pgup(_p)    (((_p) + (PAGE_SIZE - 1)) & PAGE_MASK)
>>   
>> +extern char _text, _etext, _erodata, _edata, _end, __bss_start;
> 
> Maybe use the more conventional array form? Thus also eliminating
> the need for & at the use sites?

This was basically just a movement of code to another file.

But I can make this change, of course.


Juergen
Andrew Cooper Feb. 7, 2024, 10:47 a.m. UTC | #6
On 07/02/2024 10:42 am, Jürgen Groß wrote:
> On 07.02.24 11:38, Samuel Thibault wrote:
>> Juergen Gross, le mer. 07 févr. 2024 11:31:38 +0100, a ecrit:
>>> The .bss segment should be zeroed at very early boot.
>>
>> Is that not done by the elf loader of Xen?
>
> It might be done by Xen tools today, but I'm quite sure it is not part
> of the ABI. The hypervisor doesn't clear .bss when loading a domain (e.g.
> dom0).

It is a requirement of loading an ELF file.

If we're not doing it for dom0, that's a bug.

> I stumbled over it while implementing kexec in Mini-OS. For that I need
> it for sure.

A kexec kernel does need to do this, unless the kexec-loader is going to
guarantee to do it, and I don't believe it does.

~Andrew
Jan Beulich Feb. 7, 2024, 10:48 a.m. UTC | #7
On 07.02.2024 11:46, Juergen Gross wrote:
> On 07.02.24 11:39, Jan Beulich wrote:
>> On 07.02.2024 11:31, Juergen Gross wrote:
>>> --- a/arch/x86/setup.c
>>> +++ b/arch/x86/setup.c
>>> @@ -184,6 +184,8 @@ arch_init(void *par)
>>>   {
>>>   	static char hello[] = "Bootstrapping...\n";
>>>   
>>> +	memset(&__bss_start, 0, &_end - &__bss_start);
>>
>> Doesn't / shouldn't the loader guarantee this? I ask in particular
>> because doing this here implies earlier assembly code may not write
>> to any variable in .bss (nothing like that looks to exist right now,
>> but who knows what may be added at some point).
> 
> Oh, am I misremembering that the hypervisor doesn't do the clearing?
> 
> I do remember a bug when dom0 Linux crashed due to a not zeroed .bss,
> so I added the zeroing in the paravirt startup code of the kernel.
> 
> But maybe this was due to a bug in the linker script of the kernel.

That's what I recall.

Jan
Samuel Thibault Feb. 7, 2024, 11 a.m. UTC | #8
Jürgen Groß, le mer. 07 févr. 2024 11:42:20 +0100, a ecrit:
> while implementing kexec in Mini-OS.

Oh, nice :D

> For that I need it for sure.

It needs to be done by kexec itself then.

Samuel
Jürgen Groß Feb. 7, 2024, 11:16 a.m. UTC | #9
On 07.02.24 12:00, Samuel Thibault wrote:
> Jürgen Groß, le mer. 07 févr. 2024 11:42:20 +0100, a ecrit:
>> while implementing kexec in Mini-OS.
> 
> Oh, nice :D
> 
>> For that I need it for sure.
> 
> It needs to be done by kexec itself then.

That's another option, yes.

The question is whether we want to support to be kexec-ed from other
systems, too. Then I'd rather do it on both sides.


Juergen
Samuel Thibault Feb. 7, 2024, 11:34 a.m. UTC | #10
Jürgen Groß, le mer. 07 févr. 2024 12:16:44 +0100, a ecrit:
> On 07.02.24 12:00, Samuel Thibault wrote:
> > Jürgen Groß, le mer. 07 févr. 2024 11:42:20 +0100, a ecrit:
> > > while implementing kexec in Mini-OS.
> > 
> > Oh, nice :D
> > 
> > > For that I need it for sure.
> > 
> > It needs to be done by kexec itself then.
> 
> That's another option, yes.
> 
> The question is whether we want to support to be kexec-ed from other
> systems, too.

But aren't other systems' kexec supports supposed to do the memset?

They really should.

Samuel
Jürgen Groß Feb. 7, 2024, 11:43 a.m. UTC | #11
On 07.02.24 12:34, Samuel Thibault wrote:
> Jürgen Groß, le mer. 07 févr. 2024 12:16:44 +0100, a ecrit:
>> On 07.02.24 12:00, Samuel Thibault wrote:
>>> Jürgen Groß, le mer. 07 févr. 2024 11:42:20 +0100, a ecrit:
>>>> while implementing kexec in Mini-OS.
>>>
>>> Oh, nice :D
>>>
>>>> For that I need it for sure.
>>>
>>> It needs to be done by kexec itself then.
>>
>> That's another option, yes.
>>
>> The question is whether we want to support to be kexec-ed from other
>> systems, too.
> 
> But aren't other systems' kexec supports supposed to do the memset?
> 
> They really should.

I guess there is a reason why the Linux kernel does clear its .bss section
in early boot. Maybe it is due to how the boot process works (the ELF file
is encapsulated in vmlinuz), but following your reasoning they should have
cleared their .bss while unpacking the ELF contents, not while booting the
contents.

I'm not sure they do the .bss clearing in kexec either, as the kernel is
clearing it anyway.


Juergen
Samuel Thibault Feb. 7, 2024, 12:02 p.m. UTC | #12
Jürgen Groß, le mer. 07 févr. 2024 12:43:03 +0100, a ecrit:
> On 07.02.24 12:34, Samuel Thibault wrote:
> > Jürgen Groß, le mer. 07 févr. 2024 12:16:44 +0100, a ecrit:
> > > On 07.02.24 12:00, Samuel Thibault wrote:
> > > > Jürgen Groß, le mer. 07 févr. 2024 11:42:20 +0100, a ecrit:
> > > > > while implementing kexec in Mini-OS.
> > > > 
> > > > Oh, nice :D
> > > > 
> > > > > For that I need it for sure.
> > > > 
> > > > It needs to be done by kexec itself then.
> > > 
> > > That's another option, yes.
> > > 
> > > The question is whether we want to support to be kexec-ed from other
> > > systems, too.
> > 
> > But aren't other systems' kexec supports supposed to do the memset?
> > 
> > They really should.
> 
> I guess there is a reason why the Linux kernel does clear its .bss section
> in early boot. Maybe it is due to how the boot process works (the ELF file
> is encapsulated in vmlinuz),

Yes, the unpack prevents grub etc. from doing it.

> but following your reasoning they should have cleared their .bss while
> unpacking the ELF contents, not while booting the contents.

AIUI the decompressor itself doesn't actually know about ELF.
decompress_kernel() does call handle_relocations(), but it should indeed
clear bss itself and not leave it out to assembly indeed.

> I'm not sure they do the .bss clearing in kexec either,

AIUI they do, see kimage_load_normal_segment() which clears the page
before possibly loading some file piece into it.

Really, this is part of what "loading an ELF" means.

Samuel
Jürgen Groß Feb. 7, 2024, 12:12 p.m. UTC | #13
On 07.02.24 13:02, Samuel Thibault wrote:
> Jürgen Groß, le mer. 07 févr. 2024 12:43:03 +0100, a ecrit:
>> On 07.02.24 12:34, Samuel Thibault wrote:
>>> Jürgen Groß, le mer. 07 févr. 2024 12:16:44 +0100, a ecrit:
>>>> On 07.02.24 12:00, Samuel Thibault wrote:
>>>>> Jürgen Groß, le mer. 07 févr. 2024 11:42:20 +0100, a ecrit:
>>>>>> while implementing kexec in Mini-OS.
>>>>>
>>>>> Oh, nice :D
>>>>>
>>>>>> For that I need it for sure.
>>>>>
>>>>> It needs to be done by kexec itself then.
>>>>
>>>> That's another option, yes.
>>>>
>>>> The question is whether we want to support to be kexec-ed from other
>>>> systems, too.
>>>
>>> But aren't other systems' kexec supports supposed to do the memset?
>>>
>>> They really should.
>>
>> I guess there is a reason why the Linux kernel does clear its .bss section
>> in early boot. Maybe it is due to how the boot process works (the ELF file
>> is encapsulated in vmlinuz),
> 
> Yes, the unpack prevents grub etc. from doing it.
> 
>> but following your reasoning they should have cleared their .bss while
>> unpacking the ELF contents, not while booting the contents.
> 
> AIUI the decompressor itself doesn't actually know about ELF.
> decompress_kernel() does call handle_relocations(), but it should indeed
> clear bss itself and not leave it out to assembly indeed.
> 
>> I'm not sure they do the .bss clearing in kexec either,
> 
> AIUI they do, see kimage_load_normal_segment() which clears the page
> before possibly loading some file piece into it.
> 
> Really, this is part of what "loading an ELF" means.

Okay, I'll do the clearing only in the kexec code then.


Juergen
diff mbox series

Patch

diff --git a/arch/x86/setup.c b/arch/x86/setup.c
index b27bbed7..3dddf4ad 100644
--- a/arch/x86/setup.c
+++ b/arch/x86/setup.c
@@ -184,6 +184,8 @@  arch_init(void *par)
 {
 	static char hello[] = "Bootstrapping...\n";
 
+	memset(&__bss_start, 0, &_end - &__bss_start);
+
 	hpc_init();
 	(void)HYPERVISOR_console_io(CONSOLEIO_write, strlen(hello), hello);
 
diff --git a/include/arm/arch_mm.h b/include/arm/arch_mm.h
index 79d9e05b..335eb4ff 100644
--- a/include/arm/arch_mm.h
+++ b/include/arm/arch_mm.h
@@ -3,7 +3,6 @@ 
 
 typedef uint64_t paddr_t;
 
-extern char _text, _etext, _erodata, _edata, _end, __bss_start;
 extern int _boot_stack[];
 extern int _boot_stack_end[];
 extern uint32_t physical_address_offset;	/* Add this to a virtual address to get the physical address (wraps at 4GB) */
diff --git a/include/mm.h b/include/mm.h
index 4fc364ff..e02e080b 100644
--- a/include/mm.h
+++ b/include/mm.h
@@ -46,6 +46,8 @@ 
 #define round_pgdown(_p)  ((_p) & PAGE_MASK)
 #define round_pgup(_p)    (((_p) + (PAGE_SIZE - 1)) & PAGE_MASK)
 
+extern char _text, _etext, _erodata, _edata, _end, __bss_start;
+
 extern unsigned long nr_free_pages;
 
 extern unsigned long *mm_alloc_bitmap;
diff --git a/include/x86/arch_mm.h b/include/x86/arch_mm.h
index a1b975dc..6b398cef 100644
--- a/include/x86/arch_mm.h
+++ b/include/x86/arch_mm.h
@@ -225,7 +225,6 @@  extern unsigned long *phys_to_machine_mapping;
 #else
 extern pgentry_t page_table_base[];
 #endif
-extern char _text, _etext, _erodata, _edata, _end;
 extern unsigned long mfn_zero;
 static __inline__ maddr_t phys_to_machine(paddr_t phys)
 {