[for,4.19-stable,v2,22/24] mm/memunmap: don't access uninitialized memmap in memunmap_pages()
diff mbox series

Message ID 20200121180150.37454-23-david@redhat.com
State New
Headers show
  • mm/memory_hotplug: backport of pending stable fixes
Related show

Commit Message

David Hildenbrand Jan. 21, 2020, 6:01 p.m. UTC
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>

commit 77e080e7680e1e615587352f70c87b9e98126d03 upstream.

-- snip --

- Missing mm/hmm.c and kernel/memremap.c unification.
-- hmm code does not need fixes (no altmap)
- Missing 7cc7867fb061 ("mm/devm_memremap_pages: enable sub-section remap")

-- snip --

Patch series "mm/memory_hotplug: Shrink zones before removing memory",

This series fixes the access of uninitialized memmaps when shrinking
zones/nodes and when removing memory.  Also, it contains all fixes for
crashes that can be triggered when removing certain namespace using
memunmap_pages() - ZONE_DEVICE, reported by Aneesh.

We stop trying to shrink ZONE_DEVICE, as it's buggy, fixing it would be
more involved (we don't have SECTION_IS_ONLINE as an indicator), and
shrinking is only of limited use (set_zone_contiguous() cannot detect
the ZONE_DEVICE as contiguous).

We continue shrinking !ZONE_DEVICE zones, however, I reduced the amount
of code to a minimum.  Shrinking is especially necessary to keep
zone->contiguous set where possible, especially, on memory unplug of
DIMMs at zone boundaries.


Zones are now properly shrunk when offlining memory blocks or when
onlining failed.  This allows to properly shrink zones on memory unplug
even if the separate memory blocks of a DIMM were onlined to different
zones or re-onlined to a different zone after offlining.


  :/# cat /proc/zoneinfo
  Node 1, zone  Movable
          spanned  0
          present  0
          managed  0
  :/# echo "online_movable" > /sys/devices/system/memory/memory41/state
  :/# echo "online_movable" > /sys/devices/system/memory/memory43/state
  :/# cat /proc/zoneinfo
  Node 1, zone  Movable
          spanned  98304
          present  65536
          managed  65536
  :/# echo 0 > /sys/devices/system/memory/memory43/online
  :/# cat /proc/zoneinfo
  Node 1, zone  Movable
          spanned  32768
          present  32768
          managed  32768
  :/# echo 0 > /sys/devices/system/memory/memory41/online
  :/# cat /proc/zoneinfo
  Node 1, zone  Movable
          spanned  0
          present  0
          managed  0

This patch (of 10):

With an altmap, the memmap falling into the reserved altmap space are not
initialized and, therefore, contain a garbage NID and a garbage zone.
Make sure to read the NID/zone from a memmap that was initialized.

This fixes a kernel crash that is observed when destroying a namespace:

  kernel BUG at include/linux/mm.h:1107!
  cpu 0x1: Vector: 700 (Program Check) at [c000000274087890]
      pc: c0000000004b9728: memunmap_pages+0x238/0x340
      lr: c0000000004b9724: memunmap_pages+0x234/0x340
      pid   = 3669, comm = ndctl
  kernel BUG at include/linux/mm.h:1107!

The "page_zone(pfn_to_page(pfn)" was introduced by 69324b8f4833 ("mm,
devm_memremap_pages: add MEMORY_DEVICE_PRIVATE support"), however, I
think we will never have driver reserved memory with

[david@redhat.com: minimze code changes, rephrase description]
Link: http://lkml.kernel.org/r/20191006085646.5768-2-david@redhat.com
Fixes: 2c2a5af6fed2 ("mm, memory_hotplug: add nid parameter to arch_remove_memory")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Damian Tometzki <damian.tometzki@gmail.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jun Yao <yaojun8558363@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pankaj Gupta <pagupta@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Rich Felker <dalias@libc.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Yu Zhao <yuzhao@google.com>
Cc: <stable@vger.kernel.org>	[5.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
 kernel/memremap.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff mbox series

diff --git a/kernel/memremap.c b/kernel/memremap.c
index 2ee2e672d5fc..1ec1f8fd97f5 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -120,6 +120,7 @@  static void devm_memremap_pages_release(void *data)
 	struct device *dev = pgmap->dev;
 	struct resource *res = &pgmap->res;
 	resource_size_t align_start, align_size;
+	struct page *first_page;
 	unsigned long pfn;
 	int nid;
@@ -132,13 +133,16 @@  static void devm_memremap_pages_release(void *data)
 	align_size = ALIGN(res->start + resource_size(res), SECTION_SIZE)
 		- align_start;
-	nid = page_to_nid(pfn_to_page(align_start >> PAGE_SHIFT));
+	/* make sure to access a memmap that was actually initialized */
+	first_page = pfn_to_page(pfn_first(pgmap));
+	nid = page_to_nid(first_page);
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
 		pfn = align_start >> PAGE_SHIFT;
-		__remove_pages(page_zone(pfn_to_page(pfn)), pfn,
-				align_size >> PAGE_SHIFT, NULL);
+		__remove_pages(page_zone(first_page), pfn,
+			       align_size >> PAGE_SHIFT, NULL);
 	} else {
 		arch_remove_memory(nid, align_start, align_size,
 				pgmap->altmap_valid ? &pgmap->altmap : NULL);