diff mbox series

[2/2] x86: fix setup of brk area

Message ID 20220622161048.4483-3-jgross@suse.com (mailing list archive)
State Superseded
Headers show
Series x86: fix brk area initialization | expand

Commit Message

Jürgen Groß June 22, 2022, 4:10 p.m. UTC
Commit e32683c6f7d2 ("x86/mm: Fix RESERVE_BRK() for older binutils")
put the brk area into the .bss segment, causing it not to be cleared
initially. As the brk area is used to allocate early page tables, these
might contain garbage in not explicitly written entries.

This is especially a problem for Xen PV guests, as the hypervisor will
validate page tables (check for writable page tables and hypervisor
private bits) before accepting them to be used. There have been reports
of early crashes of PV guests due to illegal page table contents.

Fix that by letting clear_bss() clear the brk area, too.

Fixes: e32683c6f7d2 ("x86/mm: Fix RESERVE_BRK() for older binutils")
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 arch/x86/kernel/head64.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Jan Beulich June 23, 2022, 8:09 a.m. UTC | #1
On 22.06.2022 18:10, Juergen Gross wrote:
> Commit e32683c6f7d2 ("x86/mm: Fix RESERVE_BRK() for older binutils")
> put the brk area into the .bss segment, causing it not to be cleared
> initially.

This reads contradictively: If the area was put in .bss, it would be
cleared. Thing is it is put in .bss..brk in the object files, while
the linker script puts it in .brk (i.e. outside of .bss).

> As the brk area is used to allocate early page tables, these
> might contain garbage in not explicitly written entries.

I'm surprised this lack of zero-initialization didn't cause any issue
outside of PV Xen. Unless of course there never was the intention for
users of the facility to assume blank pages coming from there, in
which case Xen's use for early page tables would have been wrong (in
not explicitly zeroing the space first).

Jan
Jürgen Groß June 23, 2022, 8:14 a.m. UTC | #2
On 23.06.22 10:09, Jan Beulich wrote:
> On 22.06.2022 18:10, Juergen Gross wrote:
>> Commit e32683c6f7d2 ("x86/mm: Fix RESERVE_BRK() for older binutils")
>> put the brk area into the .bss segment, causing it not to be cleared
>> initially.
> 
> This reads contradictively: If the area was put in .bss, it would be
> cleared. Thing is it is put in .bss..brk in the object files, while
> the linker script puts it in .brk (i.e. outside of .bss).

Hmm, yes, this should be reworded.

> 
>> As the brk area is used to allocate early page tables, these
>> might contain garbage in not explicitly written entries.
> 
> I'm surprised this lack of zero-initialization didn't cause any issue
> outside of PV Xen. Unless of course there never was the intention for
> users of the facility to assume blank pages coming from there, in
> which case Xen's use for early page tables would have been wrong (in
> not explicitly zeroing the space first).

Fun fact: Its not Xen's use for early page tables, but the kernel's
init code. It is used for bare metal, too.

The use case for initial page tables is the problematic one. Only the
needed page table entries are written by the kernel, so the other ones
keep their initial garbage values. As normally no uninitialized entries
are ever referenced, this will have no real impact.

With Xen, however, ALL entries are being validated, which led to the
early crash of dom0.


Juergen
Jan Beulich June 23, 2022, 8:50 a.m. UTC | #3
On 23.06.2022 10:14, Juergen Gross wrote:
> On 23.06.22 10:09, Jan Beulich wrote:
>> On 22.06.2022 18:10, Juergen Gross wrote:
>>> Commit e32683c6f7d2 ("x86/mm: Fix RESERVE_BRK() for older binutils")
>>> put the brk area into the .bss segment, causing it not to be cleared
>>> initially.
>>
>> This reads contradictively: If the area was put in .bss, it would be
>> cleared. Thing is it is put in .bss..brk in the object files, while
>> the linker script puts it in .brk (i.e. outside of .bss).
> 
> Hmm, yes, this should be reworded.
> 
>>
>>> As the brk area is used to allocate early page tables, these
>>> might contain garbage in not explicitly written entries.
>>
>> I'm surprised this lack of zero-initialization didn't cause any issue
>> outside of PV Xen. Unless of course there never was the intention for
>> users of the facility to assume blank pages coming from there, in
>> which case Xen's use for early page tables would have been wrong (in
>> not explicitly zeroing the space first).
> 
> Fun fact: Its not Xen's use for early page tables, but the kernel's
> init code. It is used for bare metal, too.
> 
> The use case for initial page tables is the problematic one. Only the
> needed page table entries are written by the kernel, so the other ones
> keep their initial garbage values. As normally no uninitialized entries
> are ever referenced, this will have no real impact.

Are you sure there couldn't surface user-mode accessible page table
entries pointing at random pages?

Jan
Jürgen Groß June 23, 2022, 9:03 a.m. UTC | #4
On 23.06.22 10:50, Jan Beulich wrote:
> On 23.06.2022 10:14, Juergen Gross wrote:
>> On 23.06.22 10:09, Jan Beulich wrote:
>>> On 22.06.2022 18:10, Juergen Gross wrote:
>>>> Commit e32683c6f7d2 ("x86/mm: Fix RESERVE_BRK() for older binutils")
>>>> put the brk area into the .bss segment, causing it not to be cleared
>>>> initially.
>>>
>>> This reads contradictively: If the area was put in .bss, it would be
>>> cleared. Thing is it is put in .bss..brk in the object files, while
>>> the linker script puts it in .brk (i.e. outside of .bss).
>>
>> Hmm, yes, this should be reworded.
>>
>>>
>>>> As the brk area is used to allocate early page tables, these
>>>> might contain garbage in not explicitly written entries.
>>>
>>> I'm surprised this lack of zero-initialization didn't cause any issue
>>> outside of PV Xen. Unless of course there never was the intention for
>>> users of the facility to assume blank pages coming from there, in
>>> which case Xen's use for early page tables would have been wrong (in
>>> not explicitly zeroing the space first).
>>
>> Fun fact: Its not Xen's use for early page tables, but the kernel's
>> init code. It is used for bare metal, too.
>>
>> The use case for initial page tables is the problematic one. Only the
>> needed page table entries are written by the kernel, so the other ones
>> keep their initial garbage values. As normally no uninitialized entries
>> are ever referenced, this will have no real impact.
> 
> Are you sure there couldn't surface user-mode accessible page table
> entries pointing at random pages?

No, I'm not sure this can't happen.


Juergen
diff mbox series

Patch

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index e7e233209a8c..6a3cfaf6b72a 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -430,6 +430,8 @@  void __init clear_bss(void)
 {
 	memset(__bss_start, 0,
 	       (unsigned long) __bss_stop - (unsigned long) __bss_start);
+	memset(__brk_base, 0,
+	       (unsigned long) __brk_limit - (unsigned long) __brk_base);
 }
 
 static unsigned long get_cmd_line_ptr(void)