diff mbox series

[4/4] EFI: free unused boot mem in at least some cases

Message ID 4e2d0d0a-9d85-d704-5d61-ae9efc69256a@suse.com (mailing list archive)
State Superseded
Headers show
Series build: corrections to .init.o generation logic | expand

Commit Message

Jan Beulich Aug. 6, 2020, 9:06 a.m. UTC
Address at least the primary reason why 52bba67f8b87 ("efi/boot: Don't
free ebmalloc area at all") was put in place: Make xen_in_range() aware
of the freed range. This is in particular relevant for EFI-enabled
builds not actually running on EFI, as the entire range will be unused
in this case.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
The remaining issue could be addressed too, by making the area 2M in
size and 2M-aligned.

Comments

Andrew Cooper Aug. 10, 2020, 5:09 p.m. UTC | #1
On 06/08/2020 10:06, Jan Beulich wrote:
> Address at least the primary reason why 52bba67f8b87 ("efi/boot: Don't
> free ebmalloc area at all") was put in place: Make xen_in_range() aware
> of the freed range. This is in particular relevant for EFI-enabled
> builds not actually running on EFI, as the entire range will be unused
> in this case.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> The remaining issue could be addressed too, by making the area 2M in
> size and 2M-aligned.

This memory range is only used for relocating the (synthesized?)
multiboot strings, is it not?

I'm not actually convinced that this is a sensible tradeoff.

For one, you've broken setup.c's:

    /* This needs to remain in sync with xen_in_range(). */
    reserve_e820_ram(&boot_e820, __pa(_stext), __pa(__2M_rwdata_end));

which covers the runtime aspect of what xen_in_range() covers during boot.


I think the better course of action is to go with David Woodhouse's work
to not relocate the trampoline until later on boot (if even necessary),
at which point both of the custom allocators can disappear.

~Andrew
David Woodhouse Aug. 14, 2020, 3:14 p.m. UTC | #2
On Mon, 2020-08-10 at 18:09 +0100, Andrew Cooper wrote:
> I think the better course of action is to go with David Woodhouse's work
> to not relocate the trampoline until later on boot (if even necessary),
> at which point both of the custom allocators can disappear.

I confess I had mostly given up on cleaning up the boot code. But if
you like I can dust those patches off and post them one last time.

It certainly was an overall improvement on the current setup.
Jan Beulich Aug. 17, 2020, 3:42 p.m. UTC | #3
On 10.08.2020 19:09, Andrew Cooper wrote:
> On 06/08/2020 10:06, Jan Beulich wrote:
>> Address at least the primary reason why 52bba67f8b87 ("efi/boot: Don't
>> free ebmalloc area at all") was put in place: Make xen_in_range() aware
>> of the freed range. This is in particular relevant for EFI-enabled
>> builds not actually running on EFI, as the entire range will be unused
>> in this case.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> The remaining issue could be addressed too, by making the area 2M in
>> size and 2M-aligned.
> 
> This memory range is only used for relocating the (synthesized?)
> multiboot strings, is it not?

No. Afaict it has nothing to do with multiboot strings. There are
exactly two uses afaics - in place_string() and in
efi_arch_allocate_mmap_buffer(). The former is used to record
command line pieces, e.g. that parsed from the config file, while
the latter is what allocates the memory for the EFI memory map.

> I'm not actually convinced that this is a sensible tradeoff.
> 
> For one, you've broken setup.c's:
> 
>     /* This needs to remain in sync with xen_in_range(). */
>     reserve_e820_ram(&boot_e820, __pa(_stext), __pa(__2M_rwdata_end));
> 
> which covers the runtime aspect of what xen_in_range() covers during boot.

Hmm, I did specifically look at that and thought it wouldn't need
changing. But now that you point it out (again), it looks like I
was wrong.

> I think the better course of action is to go with David Woodhouse's work
> to not relocate the trampoline until later on boot (if even necessary),
> at which point both of the custom allocators can disappear.

Well, in the light of my response above I'd like to express that
I can't see how David's work would make this allocator go away.

Jan
Jan Beulich Aug. 25, 2020, 10:57 a.m. UTC | #4
On 10.08.2020 19:09, Andrew Cooper wrote:
> On 06/08/2020 10:06, Jan Beulich wrote:
>> Address at least the primary reason why 52bba67f8b87 ("efi/boot: Don't
>> free ebmalloc area at all") was put in place: Make xen_in_range() aware
>> of the freed range. This is in particular relevant for EFI-enabled
>> builds not actually running on EFI, as the entire range will be unused
>> in this case.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> The remaining issue could be addressed too, by making the area 2M in
>> size and 2M-aligned.
> 
> This memory range is only used for relocating the (synthesized?)
> multiboot strings, is it not?
> 
> I'm not actually convinced that this is a sensible tradeoff.
> 
> For one, you've broken setup.c's:
> 
>     /* This needs to remain in sync with xen_in_range(). */
>     reserve_e820_ram(&boot_e820, __pa(_stext), __pa(__2M_rwdata_end));
> 
> which covers the runtime aspect of what xen_in_range() covers during boot.

I'm afraid this wasn't a good suggestion here (it was still helpful
to notice that tboot.c also needs adjustment): By not reserving the
range here, it'll get freed by end_boot_allocator(), and hence may
not (again) be freed by free_ebmalloc_unused_mem() (kind of putting
its name under question). Immediately up from the quoted place we
also reserve the space where the modules live, which also gets
freed later. I'm having difficulty to see why this particular
aspect needs to remain in sync between the reservation done here
and xen_in_range().

v2 definitely is broken because of me not having noticed this in
time. I'll first try to fix it without reverting to the v1 model,
but I'd prefer to go back to the earlier approach (keeping merely
the other v2 adjustments). Unless of course you see some breakage
from this that I don't see.

Jan
diff mbox series

Patch

--- a/xen/arch/x86/efi/stub.c
+++ b/xen/arch/x86/efi/stub.c
@@ -52,6 +52,12 @@  bool efi_enabled(unsigned int feature)
 
 void __init efi_init_memory(void) { }
 
+bool __hwdom_init efi_boot_mem_unused(unsigned long *start, unsigned long *end)
+{
+    *start = *end = (unsigned long)_end;
+    return false;
+}
+
 void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t l4e) { }
 
 bool efi_rs_using_pgtables(void)
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -608,7 +608,7 @@  static void __init kexec_reserve_area(st
 #endif
 }
 
-static inline bool using_2M_mapping(void)
+bool using_2M_mapping(void)
 {
     return !l1_table_offset((unsigned long)__2M_text_end) &&
            !l1_table_offset((unsigned long)__2M_rodata_start) &&
@@ -1970,7 +1970,7 @@  int __hwdom_init xen_in_range(unsigned l
     paddr_t start, end;
     int i;
 
-    enum { region_s3, region_ro, region_rw, nr_regions };
+    enum { region_s3, region_ro, region_rw, region_bss, nr_regions };
     static struct {
         paddr_t s, e;
     } xen_regions[nr_regions] __hwdom_initdata;
@@ -1994,7 +1994,14 @@  int __hwdom_init xen_in_range(unsigned l
         xen_regions[region_ro].e = __pa(&__2M_rodata_end);
         /* hypervisor .data + .bss */
         xen_regions[region_rw].s = __pa(&__2M_rwdata_start);
-        xen_regions[region_rw].e = __pa(&__2M_rwdata_end);
+        if ( efi_boot_mem_unused(&start, &end) )
+        {
+            xen_regions[region_rw].e = __pa(start);
+            xen_regions[region_bss].s = __pa(end);
+            xen_regions[region_bss].e = __pa(&__2M_rwdata_end);
+        }
+        else
+            xen_regions[region_rw].e = __pa(&__2M_rwdata_end);
     }
 
     start = (paddr_t)mfn << PAGE_SHIFT;
--- a/xen/common/efi/ebmalloc.c
+++ b/xen/common/efi/ebmalloc.c
@@ -1,5 +1,9 @@ 
 #include "efi.h"
 #include <xen/init.h>
+#include <xen/mm.h>
+#ifdef CONFIG_X86
+#include <asm/setup.h>
+#endif
 
 #ifdef CONFIG_ARM
 /*
@@ -21,7 +25,7 @@ 
 
 static char __section(".bss.page_aligned") __aligned(PAGE_SIZE)
     ebmalloc_mem[EBMALLOC_SIZE];
-static unsigned long __initdata ebmalloc_allocated;
+static unsigned long __read_mostly ebmalloc_allocated;
 
 /* EFI boot allocator. */
 void __init *ebmalloc(size_t size)
@@ -36,17 +40,32 @@  void __init *ebmalloc(size_t size)
     return ptr;
 }
 
+bool __hwdom_init efi_boot_mem_unused(unsigned long *start, unsigned long *end)
+{
+    *start = (unsigned long)ebmalloc_mem + PAGE_ALIGN(ebmalloc_allocated);
+    *end = (unsigned long)ebmalloc_mem + sizeof(ebmalloc_mem);
+
+    return *start < *end;
+}
+
 void __init free_ebmalloc_unused_mem(void)
 {
-#if 0 /* FIXME: Putting a hole in the BSS breaks the IOMMU mappings for dom0. */
     unsigned long start, end;
 
-    start = (unsigned long)ebmalloc_mem + PAGE_ALIGN(ebmalloc_allocated);
-    end = (unsigned long)ebmalloc_mem + sizeof(ebmalloc_mem);
+#ifdef CONFIG_X86
+    /* FIXME: Putting a hole in .bss would shatter the large page mapping. */
+    if ( using_2M_mapping() )
+    {
+        ebmalloc_allocated = sizeof(ebmalloc_mem);
+        return;
+    }
+#endif
+
+    if ( !efi_boot_mem_unused(&start, &end) )
+        return;
 
     destroy_xen_mappings(start, end);
     init_xenheap_pages(__pa(start), __pa(end));
 
     printk(XENLOG_INFO "Freed %lukB unused BSS memory\n", (end - start) >> 10);
-#endif
 }
--- a/xen/include/asm-x86/setup.h
+++ b/xen/include/asm-x86/setup.h
@@ -9,6 +9,8 @@  extern const char __2M_rodata_start[], _
 extern char __2M_init_start[], __2M_init_end[];
 extern char __2M_rwdata_start[], __2M_rwdata_end[];
 
+bool using_2M_mapping(void);
+
 extern unsigned long xenheap_initial_phys_start;
 extern uint64_t boot_tsc_stamp;
 
--- a/xen/include/xen/efi.h
+++ b/xen/include/xen/efi.h
@@ -33,6 +33,7 @@  struct compat_pf_efi_runtime_call;
 
 bool efi_enabled(unsigned int feature);
 void efi_init_memory(void);
+bool efi_boot_mem_unused(unsigned long *start, unsigned long *end);
 bool efi_rs_using_pgtables(void);
 unsigned long efi_get_time(void);
 void efi_halt_system(void);