[kvm-unit-tests,v1,00/12] Fix and improve the page allocator

Message ID: 20201216201200.255172-1-imbrenda@linux.ibm.com
Series: Fix and improve the page allocator

Message

Claudio Imbrenda Dec. 16, 2020, 8:11 p.m. UTC
My previous patch series was rushed and not polished enough. Furthermore,
it introduced some regressions.

This patch series hopefully fixes all the reported issues and introduces
some new features.

It also simplifies the code and hopefully makes it more readable.

Fixed:
* allocated memory is now zeroed by default

New features:
* per-allocation flags to specify not just the area (like before) but also
  other parameters
  - zero flag: the allocation will be zeroed
  - fresh flag: the returned memory has never been read or written to before
* default flags: it's possible to specify which flags should be enabled
  automatically for each allocation; by default, the zero flag is set
  (see the usage sketch below).
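
A minimal usage sketch of the new interface (the function and flag names
below are assumptions based on this cover letter and the patch titles,
not verified API):

  /* Sketch only: alloc_pages_flags(), AREA_ANY and FLAG_FRESH are
   * names assumed from this cover letter and the patch titles, not
   * verified against lib/alloc_page.h. */
  #include <alloc_page.h>

  static void flags_example(void)
  {
          void *p, *q;

          p = alloc_page();       /* zeroed by default with this series */
          /* order-0 page from any area, never read or written before */
          q = alloc_pages_flags(0, AREA_ANY | FLAG_FRESH);

          free_page(q);
          free_page(p);
  }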


I would appreciate it if people could test these patches, especially on
strange, unusual, or exotic hardware (I have only tested on s390x).


GitLab:
  https://gitlab.com/imbrenda/kvm-unit-tests/-/tree/page_allocator_fixes
CI:
  https://gitlab.com/imbrenda/kvm-unit-tests/-/pipelines/229689192


Claudio Imbrenda (12):
  lib/x86: fix page.h to include the generic header
  lib/list.h: add list_add_tail
  lib/vmalloc: add some asserts and improvements
  lib/asm: Fix definitions of memory areas
  lib/alloc_page: fix and improve the page allocator
  lib/alloc.h: remove align_min from struct alloc_ops
  lib/alloc_page: Optimization to skip known empty freelists
  lib/alloc_page: rework metadata format
  lib/alloc: replace areas with more generic flags
  lib/alloc_page: Wire up ZERO_FLAG
  lib/alloc_page: Properly handle requests for fresh blocks
  lib/alloc_page: default flags and zero pages by default

 lib/asm-generic/memory_areas.h |   9 +-
 lib/arm/asm/memory_areas.h     |  11 +-
 lib/arm64/asm/memory_areas.h   |  11 +-
 lib/powerpc/asm/memory_areas.h |  11 +-
 lib/ppc64/asm/memory_areas.h   |  11 +-
 lib/s390x/asm/memory_areas.h   |  13 +-
 lib/x86/asm/memory_areas.h     |  27 +--
 lib/x86/asm/page.h             |   4 +-
 lib/alloc.h                    |   1 -
 lib/alloc_page.h               |  66 ++++++--
 lib/list.h                     |   9 +
 lib/alloc_page.c               | 291 ++++++++++++++++++++-------------
 lib/alloc_phys.c               |   9 +-
 lib/s390x/smp.c                |   2 +-
 lib/vmalloc.c                  |  21 +--
 15 files changed, 286 insertions(+), 210 deletions(-)

Comments

Nadav Amit Dec. 17, 2020, 7:41 p.m. UTC | #1
> On Dec 16, 2020, at 12:11 PM, Claudio Imbrenda <imbrenda@linux.ibm.com> wrote:
> 
> [...]
> 
> Fixed:
> * allocated memory is now zeroed by default

Thanks for doing that. Before I test it, did you also fix the issue of x86’s
setup_vm() [1]?

[1] https://www.spinics.net/lists/kvm/msg230470.html
Claudio Imbrenda Dec. 18, 2020, 2:19 p.m. UTC | #2
On Thu, 17 Dec 2020 11:41:05 -0800
Nadav Amit <nadav.amit@gmail.com> wrote:

> > [...]
> 
> Thanks for doing that. Before I test it, did you also fix the issue
> of x86’s setup_vm() [1]?
> 
> [1] https://www.spinics.net/lists/kvm/msg230470.html

Unfortunately no, because I could not reproduce it.

In theory setup_vm should just work: it is called twice, but on
different address ranges.

The only issue I can think of is if the second call overlaps multiple
areas.
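
A hypothetical illustration of that failure mode (the function name,
the boundary value and the check are illustrative assumptions, not the
actual lib/alloc_page.c code):

  /* Hypothetical sketch: an init range that straddles the boundary
   * between two fixed areas would break a single-area assumption. */
  #include <assert.h>

  #define AREA_LOW_TOP_PFN 0x100000UL     /* assumed 4 GiB boundary */

  static void init_one_area(unsigned long base_pfn, unsigned long top_pfn)
  {
          /* fails exactly when the range crosses the area boundary */
          assert(top_pfn <= AREA_LOW_TOP_PFN ||
                 base_pfn >= AREA_LOW_TOP_PFN);
  }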

Can you send me the memory map of the machine you're running the tests
on? (/proc/iomem)


Claudio
Krish Sadhukhan Dec. 24, 2020, 6:19 p.m. UTC | #3
On 12/16/20 12:11 PM, Claudio Imbrenda wrote:
> [...]

For patches 1, 2, 7, 8, 9, 10 and 11:

     Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Nadav Amit Dec. 28, 2020, 6:31 a.m. UTC | #4
> On Dec 18, 2020, at 6:19 AM, Claudio Imbrenda <imbrenda@linux.ibm.com> wrote:
> 
> On Thu, 17 Dec 2020 11:41:05 -0800
> Nadav Amit <nadav.amit@gmail.com> wrote:
> 
>>> [...]
>> 
>> Thanks for doing that. Before I test it, did you also fix the issue
>> of x86’s setup_vm() [1]?
>> 
>> [1] https://www.spinics.net/lists/kvm/msg230470.html
> 
> [...]
> 
> Can you send me the memory map of the machine you're running the tests
> on? (/proc/iomem)

Sorry for the late response.

I see two calls to _page_alloc_init_area() with AREA_LOW_NUMBER, one with
(base_pfn=621, top_pfn=bfdd0) and one with (base_pfn=100000, top_pfn=240000).
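
Converting those pfns to byte addresses (assuming 4 KiB pages) lines
the two calls up with the map below; a quick cross-check sketch:

  /* Cross-check only, assuming 4 KiB pages (address = pfn << 12). */
  #include <stdio.h>

  int main(void)
  {
          /* first call: inside the System RAM range below 4 GiB */
          printf("%#llx-%#llx\n", 0x621ULL << 12, (0xbfdd0ULL << 12) - 1);
          /* second call: exactly the System RAM range above 4 GiB */
          printf("%#llx-%#llx\n", 0x100000ULL << 12, (0x240000ULL << 12) - 1);
          return 0;
  }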

As for /proc/iomem:

00000000-00000fff : Reserved
00001000-0009e7ff : System RAM
0009e800-0009ffff : Reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c7fff : Video ROM
000ca000-000cafff : Adapter ROM
000cb000-000ccfff : Adapter ROM
000d0000-000d3fff : PCI Bus 0000:00
000d4000-000d7fff : PCI Bus 0000:00
000d8000-000dbfff : PCI Bus 0000:00
000dc000-000fffff : Reserved
  000f0000-000fffff : System ROM
00100000-bfecffff : System RAM
  01000000-01c031d0 : Kernel code
  01c031d1-0266f3bf : Kernel data
  028ef000-02b97fff : Kernel bss
bfed0000-bfefefff : ACPI Tables
bfeff000-bfefffff : ACPI Non-volatile Storage
bff00000-bfffffff : System RAM
c0000000-febfffff : PCI Bus 0000:00
  c0008000-c000bfff : 0000:00:10.0
  e5b00000-e5bfffff : PCI Bus 0000:22
  e5c00000-e5cfffff : PCI Bus 0000:1a
  e5d00000-e5dfffff : PCI Bus 0000:12
  e5e00000-e5efffff : PCI Bus 0000:0a
  e5f00000-e5ffffff : PCI Bus 0000:21
  e6000000-e60fffff : PCI Bus 0000:19
  e6100000-e61fffff : PCI Bus 0000:11
  e6200000-e62fffff : PCI Bus 0000:09
  e6300000-e63fffff : PCI Bus 0000:20
  e6400000-e64fffff : PCI Bus 0000:18
  e6500000-e65fffff : PCI Bus 0000:10
  e6600000-e66fffff : PCI Bus 0000:08
  e6700000-e67fffff : PCI Bus 0000:1f
  e6800000-e68fffff : PCI Bus 0000:17
  e6900000-e69fffff : PCI Bus 0000:0f
  e6a00000-e6afffff : PCI Bus 0000:07
  e6b00000-e6bfffff : PCI Bus 0000:1e
  e6c00000-e6cfffff : PCI Bus 0000:16
  e6d00000-e6dfffff : PCI Bus 0000:0e
  e6e00000-e6efffff : PCI Bus 0000:06
  e6f00000-e6ffffff : PCI Bus 0000:1d
  e7000000-e70fffff : PCI Bus 0000:15
  e7100000-e71fffff : PCI Bus 0000:0d
  e7200000-e72fffff : PCI Bus 0000:05
  e7300000-e73fffff : PCI Bus 0000:1c
  e7400000-e74fffff : PCI Bus 0000:14
  e7500000-e75fffff : PCI Bus 0000:0c
  e7600000-e76fffff : PCI Bus 0000:04
  e7700000-e77fffff : PCI Bus 0000:1b
  e7800000-e78fffff : PCI Bus 0000:13
  e7900000-e79fffff : PCI Bus 0000:0b
  e7a00000-e7afffff : PCI Bus 0000:03
  e7b00000-e7ffffff : PCI Bus 0000:02
  e8000000-efffffff : 0000:00:0f.0
    e8000000-efffffff : vmwgfx probe
  f0000000-f7ffffff : PCI MMCONFIG 0000 [bus 00-7f]
    f0000000-f7ffffff : Reserved
      f0000000-f7ffffff : pnp 00:08
  fb500000-fb5fffff : PCI Bus 0000:22
  fb600000-fb6fffff : PCI Bus 0000:1a
  fb700000-fb7fffff : PCI Bus 0000:12
  fb800000-fb8fffff : PCI Bus 0000:0a
  fb900000-fb9fffff : PCI Bus 0000:21
  fba00000-fbafffff : PCI Bus 0000:19
  fbb00000-fbbfffff : PCI Bus 0000:11
  fbc00000-fbcfffff : PCI Bus 0000:09
  fbd00000-fbdfffff : PCI Bus 0000:20
  fbe00000-fbefffff : PCI Bus 0000:18
  fbf00000-fbffffff : PCI Bus 0000:10
  fc000000-fc0fffff : PCI Bus 0000:08
  fc100000-fc1fffff : PCI Bus 0000:1f
  fc200000-fc2fffff : PCI Bus 0000:17
  fc300000-fc3fffff : PCI Bus 0000:0f
  fc400000-fc4fffff : PCI Bus 0000:07
  fc500000-fc5fffff : PCI Bus 0000:1e
  fc600000-fc6fffff : PCI Bus 0000:16
  fc700000-fc7fffff : PCI Bus 0000:0e
  fc800000-fc8fffff : PCI Bus 0000:06
  fc900000-fc9fffff : PCI Bus 0000:1d
  fca00000-fcafffff : PCI Bus 0000:15
  fcb00000-fcbfffff : PCI Bus 0000:0d
  fcc00000-fccfffff : PCI Bus 0000:05
  fcd00000-fcdfffff : PCI Bus 0000:1c
  fce00000-fcefffff : PCI Bus 0000:14
  fcf00000-fcffffff : PCI Bus 0000:0c
  fd000000-fd0fffff : PCI Bus 0000:04
  fd100000-fd1fffff : PCI Bus 0000:1b
  fd200000-fd2fffff : PCI Bus 0000:13
  fd300000-fd3fffff : PCI Bus 0000:0b
  fd400000-fd4fffff : PCI Bus 0000:03
  fd500000-fdffffff : PCI Bus 0000:02
    fd500000-fd50ffff : 0000:02:01.0
    fd510000-fd51ffff : 0000:02:05.0
    fd5c0000-fd5dffff : 0000:02:01.0
      fd5c0000-fd5dffff : e1000
    fd5ee000-fd5eefff : 0000:02:05.0
      fd5ee000-fd5eefff : ahci
    fd5ef000-fd5effff : 0000:02:03.0
      fd5ef000-fd5effff : ehci_hcd
    fdff0000-fdffffff : 0000:02:01.0
      fdff0000-fdffffff : e1000
  fe000000-fe7fffff : 0000:00:0f.0
    fe000000-fe7fffff : vmwgfx probe
  fe800000-fe9fffff : pnp 00:08
  feba0000-febbffff : 0000:00:10.0
    feba0000-febbffff : mpt
  febc0000-febdffff : 0000:00:10.0
    febc0000-febdffff : mpt
  febfe000-febfffff : 0000:00:07.7
fec00000-fec0ffff : Reserved
  fec00000-fec003ff : IOAPIC 0
fec10000-fec10fff : dmar0
fed00000-fed003ff : HPET 0
  fed00000-fed003ff : pnp 00:04
fee00000-fee00fff : Local APIC
  fee00000-fee00fff : Reserved
fffe0000-ffffffff : Reserved
100000000-23fffffff : System RAM
Claudio Imbrenda Jan. 5, 2021, 3:26 p.m. UTC | #5
On Sun, 27 Dec 2020 22:31:56 -0800
Nadav Amit <nadav.amit@gmail.com> wrote:

[...]

> >> Thanks for doing that. Before I test it, did you also fix the issue
> >> of x86’s setup_vm() [1]?
> >> 
> >> [1] https://www.spinics.net/lists/kvm/msg230470.html  
> > 
> > Unfortunately no, because I could not reproduce it.
> > 
> > In theory setup_vm should just work: it is called twice, but on
> > different address ranges.
> > 
> > The only issue I can think of is if the second call overlaps
> > multiple areas.
> > 
> > Can you send me the memory map of the machine you're running the
> > tests on? (/proc/iomem)
> 
> Sorry for the late response.
> 
> I see two calls to _page_alloc_init_area() with AREA_LOW_NUMBER, one
> with (base_pfn=621, top_pfn=bfdd0) and one with (base_pfn=100000,
> top_pfn=240000).

OK, I could reproduce the issue now.

To put it simply, the old code is broken.

I could not reproduce the issue with the new code, so you should be
able to test it.