[v2,0/3] memblock, arm: fixes for freeing of the memory map

Message ID 20210519141436.11961-1-rppt@kernel.org (mailing list archive)

Message

Mike Rapoport May 19, 2021, 2:14 p.m. UTC
From: Mike Rapoport <rppt@linux.ibm.com>

Hi,

The coordination between freeing of the unused memory map, pfn_valid(), and
core mm assumptions about the validity of the memory map in various ranges
was not designed for complex physical memory layouts with many holes
scattered across the address space.

Kefeng Wang reported crashes in move_freepages() on a system with the
following memory layout [1]:

  node   0: [mem 0x0000000080a00000-0x00000000855fffff]
  node   0: [mem 0x0000000086a00000-0x0000000087dfffff]
  node   0: [mem 0x000000008bd00000-0x000000008c4fffff]
  node   0: [mem 0x000000008e300000-0x000000008ecfffff]
  node   0: [mem 0x0000000090d00000-0x00000000bfffffff]
  node   0: [mem 0x00000000cc000000-0x00000000dc9fffff]
  node   0: [mem 0x00000000de700000-0x00000000de9fffff]
  node   0: [mem 0x00000000e0800000-0x00000000e0bfffff]
  node   0: [mem 0x00000000f4b00000-0x00000000f6ffffff]
  node   0: [mem 0x00000000fda00000-0x00000000ffffefff]

The crashes can be mitigated by enabling CONFIG_HOLES_IN_ZONE, which
essentially turns pfn_valid_within() into pfn_valid() instead of having it
hardwired to 1.
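
To illustrate the difference, here is a minimal user-space sketch of how the
two flavours of pfn_valid_within() behave. It is not the kernel code: the
toy_* names, the TOY_HOLES_IN_ZONE switch and the PFN range 0x100..0x1ff are
made up for the example; only the shape of the #ifdef mirrors the generic
definition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for pfn_valid(): only PFNs 0x100..0x1ff exist. */
static bool toy_pfn_valid(unsigned long pfn)
{
	return pfn >= 0x100 && pfn < 0x200;
}

/*
 * Same shape as the generic pfn_valid_within(): a real check when holes
 * inside a zone are possible, a constant 1 otherwise.
 */
#ifdef TOY_HOLES_IN_ZONE
#define toy_pfn_valid_within(pfn) toy_pfn_valid(pfn)
#else
#define toy_pfn_valid_within(pfn) (1)
#endif

/* Count the PFNs in [start, end) that a pageblock walker would touch. */
static unsigned long toy_count_walked(unsigned long start, unsigned long end)
{
	unsigned long pfn, n = 0;

	assert(start <= end);
	for (pfn = start; pfn < end; pfn++)
		if (toy_pfn_valid_within(pfn))
			n++;
	return n;
}
```

Built without TOY_HOLES_IN_ZONE, a walk over 0xf0..0x10f touches all 32 PFNs,
including the 16 that have no backing memory in this toy model. Built with
-DTOY_HOLES_IN_ZONE, the same walk skips them.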

Alternatively, we can update ARM's implementation of pfn_valid() to take
into account the rounding of the freed memory map to pageblock boundaries
and make sure it returns true for PFNs that have memory map entries even if
there is no physical memory behind them.
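
A rough user-space model of that idea, for illustration only: a PFN counts as
valid when the pageblock containing it intersects any memory bank. The
sketch_* names are hypothetical, the 4 KiB page size and 4 MiB pageblock size
are assumptions, and only the first two banks of the layout above are
included.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT     12                   /* assumed 4 KiB pages */
#define SKETCH_PAGEBLOCK_SIZE (UINT64_C(4) << 20)  /* assumed 4 MiB pageblocks */

struct sketch_region {
	uint64_t base;
	uint64_t end;	/* inclusive, as printed in the layout */
};

/* First two banks from the reported layout. */
static const struct sketch_region sketch_memory[] = {
	{ 0x80a00000, 0x855fffff },
	{ 0x86a00000, 0x87dfffff },
};

/* True if [base, base + size) intersects any bank. */
static bool sketch_overlaps_memory(uint64_t base, uint64_t size)
{
	size_t i;

	assert(size > 0);
	for (i = 0; i < sizeof(sketch_memory) / sizeof(sketch_memory[0]); i++)
		if (base <= sketch_memory[i].end &&
		    sketch_memory[i].base < base + size)
			return true;
	return false;
}

/* A PFN is valid if the pageblock containing it intersects some bank. */
static bool sketch_pfn_valid(uint64_t pfn)
{
	uint64_t addr = pfn << SKETCH_PAGE_SHIFT;
	uint64_t block = addr & ~(SKETCH_PAGEBLOCK_SIZE - 1);

	return sketch_overlaps_memory(block, SKETCH_PAGEBLOCK_SIZE);
}
```

Under these assumptions, the PFN for address 0x80900000 is reported valid
although it lies in the hole below the first bank, because its pageblock
(0x80800000-0x80bfffff) also covers the start of that bank. The PFN for
0x86000000 is not, because no bank intersects its pageblock.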

I can take the entire series via the memblock tree.

@Kefeng, I didn't add your Tested-by yet because the patch is slightly
different from the version you've tested.

v2:
* Use a single memblock_overlaps_region() call instead of several
memblock_is_map_memory() lookups. This makes the series depend on the
update of MEMBLOCK_NOMAP handling in the memory map [2].

v1: https://lore.kernel.org/lkml/20210518090613.21519-1-rppt@kernel.org

[1] https://lore.kernel.org/lkml/2a1592ad-bc9d-4664-fd19-f7448a37edc0@huawei.com
[2] https://lore.kernel.org/lkml/20210511100550.28178-1-rppt@kernel.org

Mike Rapoport (3):
  memblock: free_unused_memmap: use pageblock units instead of MAX_ORDER
  memblock: align freed memory map on pageblock boundaries with SPARSEMEM
  arm: extend pfn_valid to take into account freed memory map alignment

 arch/arm/mm/init.c | 13 ++++++++++++-
 mm/memblock.c      | 23 ++++++++++++-----------
 2 files changed, 24 insertions(+), 12 deletions(-)


base-commit: d07f6ca923ea0927a1024dfccafc5b53b61cfecc

Comments

Kefeng Wang May 20, 2021, 6:21 a.m. UTC | #1
On 2021/5/19 22:14, Mike Rapoport wrote:
> @Kefeng, I didn't add your Tested-by yet because the patch is slightly
> different from the version you've tested.

I backported this version (together with the series from link [2]), and the
OOM test passes with it as well.
