diff mbox series

[02/10] mini-os: sort and sanitize e820 memory map

Message ID 20211206072337.9517-3-jgross@suse.com (mailing list archive)
State Superseded
Headers show
Series mini-os: add missing PVH features | expand

Commit Message

Juergen Gross Dec. 6, 2021, 7:23 a.m. UTC
Do some processing of the E820 memory map obtained from the hypervisor:

- align the entries to page boundaries
- sort the entries by their start address
- merge adjacent entries of same type

This is relevant for PVH mode only.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 e820.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

Comments

Samuel Thibault Dec. 12, 2021, 12:05 a.m. UTC | #1
Hello,

Juergen Gross, le lun. 06 déc. 2021 08:23:29 +0100, a ecrit:
> - align the entries to page boundaries

> +    /* Adjust map entries to page boundaries. */
> +    for ( i = 0; i < e820_entries; i++ )
> +    {
> +        end = (e820_map[i].addr + e820_map[i].size + PAGE_SIZE - 1) & PAGE_MASK;
> +        e820_map[i].addr &= PAGE_MASK;
> +        e820_map[i].size = end - e820_map[i].addr;
> +    }

Mmm, what if the previous entry ends after the aligned start?

On real machines that does happen, and you'd rather round up the start
address of usable areas, rather than rounding it down (and conversely
for the end).

> +    /* Sort entries by start address. */
> +    for ( i = 0; i < e820_entries - 1; i++ )
> +    {
> +        if ( e820_map[i].addr > e820_map[i + 1].addr )
> +        {
> +            e820_swap_entries(i, i + 1);
> +            i = -1;
> +        }
> +    }

This looks O(n^3) to me? A bubble sort like this should be fine:

    /* Sort entries by start address. */
    for ( last = e820_entries; last > 1; last-- )
    {
        for ( i = 0; i < last - 1; i++ )
        {
            if ( e820_map[i].addr > e820_map[i + 1].addr )
            {
                e820_swap_entries(i, i + 1);
            }
        }
    }

Samuel
Juergen Gross Dec. 13, 2021, 2:56 p.m. UTC | #2
On 12.12.21 01:05, Samuel Thibault wrote:
> Hello,
> 
> Juergen Gross, le lun. 06 déc. 2021 08:23:29 +0100, a ecrit:
>> - align the entries to page boundaries
> 
>> +    /* Adjust map entries to page boundaries. */
>> +    for ( i = 0; i < e820_entries; i++ )
>> +    {
>> +        end = (e820_map[i].addr + e820_map[i].size + PAGE_SIZE - 1) & PAGE_MASK;
>> +        e820_map[i].addr &= PAGE_MASK;
>> +        e820_map[i].size = end - e820_map[i].addr;
>> +    }
> 
> Mmm, what if the previous entry ends after the aligned start?
> 
> On real machines that does happen, and you'd rather round up the start
> address of usable areas, rather than rounding it down (and conversely
> for the end).

I think you are partially right. :-)

Entries for resources managed by Mini-OS (RAM, maybe NVME?) should be
rounded to cover only complete pages (start rounded up, end rounded
down), but all other entries should be rounded to cover the complete
area (start rounded down, end rounded up) in order not to use any
partial used page for e.g. mapping foreign pages.

> 
>> +    /* Sort entries by start address. */
>> +    for ( i = 0; i < e820_entries - 1; i++ )
>> +    {
>> +        if ( e820_map[i].addr > e820_map[i + 1].addr )
>> +        {
>> +            e820_swap_entries(i, i + 1);
>> +            i = -1;
>> +        }
>> +    }
> 
> This looks O(n^3) to me? A bubble sort like this should be fine:
> 
>      /* Sort entries by start address. */
>      for ( last = e820_entries; last > 1; last-- )
>      {
>          for ( i = 0; i < last - 1; i++ )
>          {
>              if ( e820_map[i].addr > e820_map[i + 1].addr )
>              {
>                  e820_swap_entries(i, i + 1);
>              }
>          }
>      }

Hmm, depends.

Assuming a rather well sorted map my version is O(n), while yours
is still O(n^2).

In the end it won't matter that much, because a normal map will have
only very few entries (usually 5 before merging consecutive entries).

I'm fine both ways, whatever you prefer.


Juergen
Samuel Thibault Dec. 13, 2021, 9:19 p.m. UTC | #3
Juergen Gross, le lun. 13 déc. 2021 15:56:21 +0100, a ecrit:
> On 12.12.21 01:05, Samuel Thibault wrote:
> > Hello,
> > 
> > Juergen Gross, le lun. 06 déc. 2021 08:23:29 +0100, a ecrit:
> > > - align the entries to page boundaries
> > 
> > > +    /* Adjust map entries to page boundaries. */
> > > +    for ( i = 0; i < e820_entries; i++ )
> > > +    {
> > > +        end = (e820_map[i].addr + e820_map[i].size + PAGE_SIZE - 1) & PAGE_MASK;
> > > +        e820_map[i].addr &= PAGE_MASK;
> > > +        e820_map[i].size = end - e820_map[i].addr;
> > > +    }
> > 
> > Mmm, what if the previous entry ends after the aligned start?
> > 
> > On real machines that does happen, and you'd rather round up the start
> > address of usable areas, rather than rounding it down (and conversely
> > for the end).
> 
> I think you are partially right. :-)
> 
> Entries for resources managed by Mini-OS (RAM, maybe NVME?) should be
> rounded to cover only complete pages (start rounded up, end rounded
> down), but all other entries should be rounded to cover the complete
> area (start rounded down, end rounded up) in order not to use any
> partial used page for e.g. mapping foreign pages.

Right!

> > > +    /* Sort entries by start address. */
> > > +    for ( i = 0; i < e820_entries - 1; i++ )
> > > +    {
> > > +        if ( e820_map[i].addr > e820_map[i + 1].addr )
> > > +        {
> > > +            e820_swap_entries(i, i + 1);
> > > +            i = -1;
> > > +        }
> > > +    }
> > 
> > This looks O(n^3) to me? A bubble sort like this should be fine:
> > 
> >      /* Sort entries by start address. */
> >      for ( last = e820_entries; last > 1; last-- )
> >      {
> >          for ( i = 0; i < last - 1; i++ )
> >          {
> >              if ( e820_map[i].addr > e820_map[i + 1].addr )
> >              {
> >                  e820_swap_entries(i, i + 1);
> >              }
> >          }
> >      }
> 
> Hmm, depends.
> 
> Assuming a rather well sorted map my version is O(n), while yours
> is still O(n^2).

Right, I was a bit lazy :)

This should be fine:

/* Sort entries by start address. */
for ( i = 1; i < e820_entries; i++ )
    for ( j = i; j > 0 && e820_map[j-1].addr > e820_map[j].addr ) ; j-- )
        e820_swap_entries(j - 1, j);

> I'm fine both ways, whatever you prefer.

I really prefer for loops which don't unexpectedly modify their loop
index, that's much less scary :)

Samuel
Juergen Gross Dec. 14, 2021, 6:33 a.m. UTC | #4
On 13.12.21 22:19, Samuel Thibault wrote:
> Juergen Gross, le lun. 13 déc. 2021 15:56:21 +0100, a ecrit:
>> On 12.12.21 01:05, Samuel Thibault wrote:
>>> Hello,
>>>
>>> Juergen Gross, le lun. 06 déc. 2021 08:23:29 +0100, a ecrit:
>>>> - align the entries to page boundaries
>>>
>>>> +    /* Adjust map entries to page boundaries. */
>>>> +    for ( i = 0; i < e820_entries; i++ )
>>>> +    {
>>>> +        end = (e820_map[i].addr + e820_map[i].size + PAGE_SIZE - 1) & PAGE_MASK;
>>>> +        e820_map[i].addr &= PAGE_MASK;
>>>> +        e820_map[i].size = end - e820_map[i].addr;
>>>> +    }
>>>
>>> Mmm, what if the previous entry ends after the aligned start?
>>>
>>> On real machines that does happen, and you'd rather round up the start
>>> address of usable areas, rather than rounding it down (and conversely
>>> for the end).
>>
>> I think you are partially right. :-)
>>
>> Entries for resources managed by Mini-OS (RAM, maybe NVME?) should be
>> rounded to cover only complete pages (start rounded up, end rounded
>> down), but all other entries should be rounded to cover the complete
>> area (start rounded down, end rounded up) in order not to use any
>> partial used page for e.g. mapping foreign pages.
> 
> Right!
> 
>>>> +    /* Sort entries by start address. */
>>>> +    for ( i = 0; i < e820_entries - 1; i++ )
>>>> +    {
>>>> +        if ( e820_map[i].addr > e820_map[i + 1].addr )
>>>> +        {
>>>> +            e820_swap_entries(i, i + 1);
>>>> +            i = -1;
>>>> +        }
>>>> +    }
>>>
>>> This looks O(n^3) to me? A bubble sort like this should be fine:
>>>
>>>       /* Sort entries by start address. */
>>>       for ( last = e820_entries; last > 1; last-- )
>>>       {
>>>           for ( i = 0; i < last - 1; i++ )
>>>           {
>>>               if ( e820_map[i].addr > e820_map[i + 1].addr )
>>>               {
>>>                   e820_swap_entries(i, i + 1);
>>>               }
>>>           }
>>>       }
>>
>> Hmm, depends.
>>
>> Assuming a rather well sorted map my version is O(n), while yours
>> is still O(n^2).
> 
> Right, I was a bit lazy :)
> 
> This should be fine:
> 
> /* Sort entries by start address. */
> for ( i = 1; i < e820_entries; i++ )
>      for ( j = i; j > 0 && e820_map[j-1].addr > e820_map[j].addr ) ; j-- )
>          e820_swap_entries(j - 1, j);
> 
>> I'm fine both ways, whatever you prefer.
> 
> I really prefer for loops which don't unexpectedly modify their loop
> index, that's much less scary :)

Agreed, I'll take your version.


Juergen
diff mbox series

Patch

diff --git a/e820.c b/e820.c
index 2165280..336a8b8 100644
--- a/e820.c
+++ b/e820.c
@@ -57,6 +57,60 @@  static char *e820_types[E820_TYPES] = {
     [E820_PMEM]     = "PMEM"
 };
 
+static void e820_remove_entry(int idx)
+{
+    int i;
+
+    e820_entries--;
+    for ( i = idx; i < e820_entries; i++ )
+        e820_map[i] = e820_map[i + 1];
+}
+
+static void e820_swap_entries(int idx1, int idx2)
+{
+    struct e820entry entry;
+
+    entry = e820_map[idx1];
+    e820_map[idx1] = e820_map[idx2];
+    e820_map[idx2] = entry;
+}
+
+static void e820_sanitize(void)
+{
+    int i;
+    unsigned long end;
+
+    /* Adjust map entries to page boundaries. */
+    for ( i = 0; i < e820_entries; i++ )
+    {
+        end = (e820_map[i].addr + e820_map[i].size + PAGE_SIZE - 1) & PAGE_MASK;
+        e820_map[i].addr &= PAGE_MASK;
+        e820_map[i].size = end - e820_map[i].addr;
+    }
+
+    /* Sort entries by start address. */
+    for ( i = 0; i < e820_entries - 1; i++ )
+    {
+        if ( e820_map[i].addr > e820_map[i + 1].addr )
+        {
+            e820_swap_entries(i, i + 1);
+            i = -1;
+        }
+    }
+
+    /* Merge adjacent entries of same type. */
+    for ( i = 0; i < e820_entries - 1; i++ )
+    {
+        if ( e820_map[i].type == e820_map[i + 1].type &&
+             e820_map[i].addr + e820_map[i].size == e820_map[i + 1].addr )
+        {
+            e820_map[i].size += e820_map[i + 1].size;
+            e820_remove_entry(i + 1);
+            i--;
+        }
+    }
+}
+
 static void e820_get_memmap(void)
 {
     long ret;
@@ -71,6 +125,8 @@  static void e820_get_memmap(void)
         do_exit();
     }
     e820_entries = memmap.nr_entries;
+
+    e820_sanitize();
 }
 
 void arch_print_memmap(void)