
[v2,0/1] memory_hotplug: fix the panic when memory end is not

Message ID 20181105150401.97287-1-zaslonko@linux.ibm.com (mailing list archive)

Message

Zaslonko Mikhail Nov. 5, 2018, 3:04 p.m. UTC
This patch refers to the older thread:
https://marc.info/?t=153658306400001&r=1&w=2

I have tried the approaches suggested in that discussion, namely ignoring
memory that is not section-aligned much earlier, or initializing struct
pages beyond the "end", but both had issues.

First I tried to ignore unaligned memory early by adjusting the memory_end
value. But the kernel mem parameter parsing and the memory_end calculation
take place in the architecture code, and adjusting it afterwards in common
code might be too late in my view. Also, with this approach we might lose
up to an entire section of memory (256 MB on s390) just because of
unfortunate alignment.
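
To illustrate the cost of this first approach, here is a minimal userspace
sketch (not kernel code). The helper names and the SECTION_SIZE_BYTES
constant are assumptions made for the example; only the 256 MB figure comes
from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 256 MB sections, the s390 figure quoted above. */
#define SECTION_SIZE_BYTES (256ULL << 20)

/* Round a physical end address down to the previous section boundary. */
static uint64_t align_end_down(uint64_t memory_end)
{
	/* The mask trick below only works for power-of-two section sizes. */
	assert((SECTION_SIZE_BYTES & (SECTION_SIZE_BYTES - 1)) == 0);
	return memory_end & ~(SECTION_SIZE_BYTES - 1);
}

/* Bytes that become unusable when the unaligned tail is ignored. */
static uint64_t lost_bytes(uint64_t memory_end)
{
	return memory_end - align_end_down(memory_end);
}
```

With memory_end one page short of 512 MB, lost_bytes() reports almost a full
section, which is the worst case mentioned above.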

Another approach was to fix memmap_init() and initialize struct pages
beyond the end. Since struct pages are allocated section-wise we can try to
round the size parameter passed to the memmap_init() function up to the
section boundary thus forcing the mapping initialization for the entire
section. But then it leads to another VM_BUG_ON panic due to
zone_spans_pfn() sanity check triggered for the first page of each page
block from set_pageblock_migratetype() function:
 page dumped because: VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn))
      Call Trace:
      ([<00000000003013f8>] set_pfnblock_flags_mask+0xe8/0x140)
       [<00000000003014aa>] set_pageblock_migratetype+0x5a/0x70
       [<0000000000bef706>] memmap_init_zone+0x25e/0x2e0
       [<00000000010fc3d8>] free_area_init_node+0x530/0x558
       [<00000000010fcf02>] free_area_init_nodes+0x81a/0x8f0
       [<00000000010e7fdc>] paging_init+0x124/0x130
       [<00000000010e4dfa>] setup_arch+0xbf2/0xcc8
       [<00000000010de9e6>] start_kernel+0x7e/0x588
       [<000000000010007c>] startup_continue+0x7c/0x300
      Last Breaking-Event-Address:
       [<00000000003013f8>] set_pfnblock_flags_mask+0xe8/0x140
We might ignore this check for the struct pages beyond the "end" but I'm not
sure about further implications.
For now I suggest staying with my original proposal of fixing the specific
functions used by the memory hotplug sysfs handlers.
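
The zone_spans_pfn() failure described above can be modeled in userspace
roughly as follows. This is only a sketch: struct zone_model and the helper
names are hypothetical, and PAGES_PER_SECTION assumes 256 MB sections with
4 KB pages. It shows that once the initialized range is rounded up to the
section boundary, every pfn past the zone end fails the span check.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed: 256 MB sections and 4 KB pages. */
#define PAGES_PER_SECTION 65536UL

/* Hypothetical stand-in for the two struct zone fields used here. */
struct zone_model {
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;
};

/* Model of the span check: is pfn inside [zone start, zone end)? */
static bool zone_spans_pfn_model(const struct zone_model *z, unsigned long pfn)
{
	return z->zone_start_pfn <= pfn &&
	       pfn < z->zone_start_pfn + z->spanned_pages;
}

/* Round a pfn up to the next section boundary. */
static unsigned long section_round_up(unsigned long pfn)
{
	return (pfn + PAGES_PER_SECTION - 1) / PAGES_PER_SECTION *
	       PAGES_PER_SECTION;
}

/* Count pfns in the rounded-up range that lie outside the zone span. */
static unsigned long pfns_outside_zone(const struct zone_model *z)
{
	unsigned long end = section_round_up(z->zone_start_pfn + z->spanned_pages);
	unsigned long pfn, outside = 0;

	for (pfn = z->zone_start_pfn; pfn < end; pfn++)
		if (!zone_spans_pfn_model(z, pfn))
			outside++;
	return outside;
}
```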

Changes v1 -> v2:
* Expanded commit message to show both failing scenarios.
* Use 'pfn + i' instead of 'pfn' for zone_spans_pfn() check within
test_pages_in_a_zone() function thus taking CONFIG_HOLES_IN_ZONE into
consideration.
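
The difference between checking 'pfn' and 'pfn + i' can be sketched in
userspace as below. This is not the patch itself: the struct and function
names are hypothetical, the section size is an assumption (256 MB sections,
4 KB pages), and holes are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed: 256 MB sections and 4 KB pages. */
#define PAGES_PER_SECTION 65536UL

/* Hypothetical stand-in for the zone fields needed by the span check. */
struct zone_model {
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;
};

static bool spans(const struct zone_model *z, unsigned long pfn)
{
	return z->zone_start_pfn <= pfn &&
	       pfn < z->zone_start_pfn + z->spanned_pages;
}

/* Per-section check: only the first pfn of the section is tested. */
static bool section_ok_first_pfn_only(const struct zone_model *z,
				      unsigned long pfn)
{
	return spans(z, pfn);
}

/* Per-page check: every 'pfn + i' in the section is tested. */
static bool section_ok_every_pfn(const struct zone_model *z,
				 unsigned long pfn)
{
	unsigned long i;

	for (i = 0; i < PAGES_PER_SECTION; i++)
		if (!spans(z, pfn + i))
			return false;
	return true;
}
```

For a zone that ends 100 pages into its last section, the first variant
accepts that section while the second one rejects it.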

Mikhail Zaslonko (1):
  memory_hotplug: fix the panic when memory end is not on the section
    boundary

 mm/memory_hotplug.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

Comments

Michal Hocko Nov. 5, 2018, 6:35 p.m. UTC | #1
On Mon 05-11-18 16:04:00, Mikhail Zaslonko wrote:
[...]
> Another approach was to fix memmap_init() and initialize struct pages
> beyond the end.

Yes, I still do not want to give up on at least this option. We do have
struct pages for the full section. Leaving some of them uninitialized is
just asking for problems. And adding special cases to some hotplug paths
just makes the code harder to follow and maintain.

So

> Since struct pages are allocated section-wise we can try to
> round the size parameter passed to the memmap_init() function up to the
> section boundary thus forcing the mapping initialization for the entire
> section. But then it leads to another VM_BUG_ON panic due to
> zone_spans_pfn() sanity check triggered for the first page of each page
> block from set_pageblock_migratetype() function:
>  page dumped because: VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn))
>       Call Trace:
>       ([<00000000003013f8>] set_pfnblock_flags_mask+0xe8/0x140)
>        [<00000000003014aa>] set_pageblock_migratetype+0x5a/0x70
>        [<0000000000bef706>] memmap_init_zone+0x25e/0x2e0
>        [<00000000010fc3d8>] free_area_init_node+0x530/0x558
>        [<00000000010fcf02>] free_area_init_nodes+0x81a/0x8f0
>        [<00000000010e7fdc>] paging_init+0x124/0x130
>        [<00000000010e4dfa>] setup_arch+0xbf2/0xcc8
>        [<00000000010de9e6>] start_kernel+0x7e/0x588
>        [<000000000010007c>] startup_continue+0x7c/0x300
>       Last Breaking-Event-Address:
>        [<00000000003013f8>] set_pfnblock_flags_mask+0xe8/0x140
> We might ignore this check for the struct pages beyond the "end" but I'm not
> sure about further implications.

find out all these implications or do something like the below (untested)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a919ba5cb3c8..a3f9ad8e40ee 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5544,6 +5544,21 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			cond_resched();
 		}
 	}
+
+#ifdef CONFIG_SPARSEMEM
+	/*
+	 * If we have a zone which doesn't span the rest of the
+	 * section then we should at least initialize those pages. We
+	 * could blow up on a poisoned page in some paths which depend
+	 * on full pageblocks being allocated (e.g. memory hotplug).
+	 */
+	while (end_pfn % PAGES_PER_SECTION) {
+		__init_single_page(pfn_to_page(end_pfn), end_pfn, zone, nid);
+		end_pfn++;
+	}
+
+#endif
+
 }
 
 #ifdef CONFIG_ZONE_DEVICE
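
The padding loop in the untested diff above can be exercised in userspace
with a small model. This is a sketch under stated assumptions, not kernel
code: struct page_model, the memmap array and the tiny PAGES_PER_SECTION
value are all hypothetical and chosen only to keep the example small.

```c
#include <assert.h>
#include <stdbool.h>

/* Tiny hypothetical values; real sections are far larger. */
#define PAGES_PER_SECTION 8UL
#define NR_SECTIONS 2UL

/* Hypothetical stand-in for struct page: only tracks initialization. */
struct page_model {
	bool initialized;
};

static struct page_model memmap[PAGES_PER_SECTION * NR_SECTIONS];

/*
 * Initialize [start_pfn, end_pfn), then keep initializing pages until
 * end_pfn reaches a section boundary, mirroring the loop in the diff.
 * Returns the number of padding pages that were initialized.
 */
static unsigned long init_range_padded(unsigned long start_pfn,
				       unsigned long end_pfn)
{
	unsigned long pfn, padded = 0;

	assert(end_pfn <= PAGES_PER_SECTION * NR_SECTIONS);

	for (pfn = start_pfn; pfn < end_pfn; pfn++)
		memmap[pfn].initialized = true;

	while (end_pfn % PAGES_PER_SECTION) {
		memmap[end_pfn].initialized = true;
		end_pfn++;
		padded++;
	}
	return padded;
}
```

For a range ending at pfn 11 with 8 pages per section, the loop pads pfns
11 through 15, so no page of the last section is left uninitialized.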