From patchwork Fri Jul 27 16:54:54 2018
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 10547445
From: David Hildenbrand
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, David Hildenbrand, Greg Kroah-Hartman,
    Ingo Molnar, Pavel Tatashin, Andrew Morton, Dan Williams, Michal Hocko,
    Jan Kara, Matthew Wilcox, Jérôme Glisse, Souptick Joarder,
    "Kirill A. Shutemov", Vlastimil Babka, Oscar Salvador, YASUAKI ISHIMATSU,
    Mathieu Malaterre, Mel Gorman, Joonsoo Kim
Subject: [PATCH v1] mm: initialize struct pages when adding a section
Date: Fri, 27 Jul 2018 18:54:54 +0200
Message-Id: <20180727165454.27292-1-david@redhat.com>

Right now, struct pages are initialized when memory is onlined, not when
it is added (since commit d0dc12e86b31 ("mm/memory_hotplug: optimize
memory hotplug")).

remove_memory() will call arch_remove_memory(), where we usually access
the struct pages to get the zone of the pages. So effectively, we access
stale struct pages in case we remove memory that was never onlined.

So let's simply initialize them earlier, when the memory is added. The
only thing left to take care of is updating the zone once we know it; a
dummy zone is used until then. Effectively, all pages are then already
initialized and set to reserved once memory has been added, before it is
onlined (and even before the corresponding memory block device is
registered). We only initialize pages once, so performance is not
degraded.

This also means that user space dump tools will always see sane struct
pages as soon as a memory block pops up.
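
To make the failure mode concrete, here is a minimal sketch (illustrative
only; the wrapper function is made up, while pfn_to_page() and page_zone()
are the existing helpers) of the zone lookup done on the removal path,
which reads stale page->flags when the memory was added but never onlined:

	/* Illustrative only: how the removal path derives the zone. */
	static struct zone *zone_of_removed_range(unsigned long start_pfn)
	{
		/*
		 * page_zone() decodes the zone/node bits from page->flags;
		 * without this patch those bits were never initialized for
		 * memory that was added but not onlined.
		 */
		return page_zone(pfn_to_page(start_pfn));
	}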

Cc: Greg Kroah-Hartman
Cc: Ingo Molnar
Cc: Pavel Tatashin
Cc: Andrew Morton
Cc: David Hildenbrand
Cc: Dan Williams
Cc: Michal Hocko
Cc: Jan Kara
Cc: Matthew Wilcox
Cc: "Jérôme Glisse"
Cc: Souptick Joarder
Cc: "Kirill A. Shutemov"
Cc: Vlastimil Babka
Cc: Oscar Salvador
Cc: YASUAKI ISHIMATSU
Cc: Mathieu Malaterre
Cc: Mel Gorman
Cc: Joonsoo Kim
Signed-off-by: David Hildenbrand
---
 drivers/base/node.c    |  1 -
 include/linux/memory.h |  1 -
 include/linux/mm.h     | 10 ++++++++++
 mm/memory_hotplug.c    | 27 +++++++++++++++++++--------
 mm/page_alloc.c        | 23 +++++++++++------------
 5 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index a5e821d09656..3ec78f80afe2 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -408,7 +408,6 @@ int register_mem_sect_under_node(struct memory_block *mem_blk, int nid,
 	if (!mem_blk)
 		return -EFAULT;
 
-	mem_blk->nid = nid;
 	if (!node_online(nid))
 		return 0;
 
diff --git a/include/linux/memory.h b/include/linux/memory.h
index a6ddefc60517..8a0864a65a98 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -33,7 +33,6 @@ struct memory_block {
 	void *hw;			/* optional pointer to fw/hw data */
 	int (*phys_callback)(struct memory_block *);
 	struct device dev;
-	int nid;			/* NID for this memory block */
 };
 
 int arch_get_memory_phys_device(unsigned long start_pfn);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d3a3842316b8..e6bf3527b7a2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1162,7 +1162,15 @@ static inline void set_page_address(struct page *page, void *address)
 {
 	page->virtual = address;
 }
+static inline void set_page_virtual(struct page *page, unsigned long pfn, enum zone_type zone)
+{
+	/* The shift won't overflow because ZONE_NORMAL is below 4G. */
+	if (!is_highmem_idx(zone))
+		set_page_address(page, __va(pfn << PAGE_SHIFT));
+}
 #define page_address_init()  do { } while(0)
+#else
+#define set_page_virtual(page, pfn, zone)  do { } while(0)
 #endif
 
 #if defined(HASHED_PAGE_VIRTUAL)
@@ -2116,6 +2124,8 @@ extern unsigned long find_min_pfn_with_active_regions(void);
 extern void free_bootmem_with_active_regions(int nid,
 						unsigned long max_low_pfn);
 extern void sparse_memory_present_with_active_regions(int nid);
+extern void __meminit init_single_page(struct page *page, unsigned long pfn,
+				       unsigned long zone, int nid);
 
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7deb49f69e27..3f28ca3c3a33 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -250,6 +250,7 @@ static int __meminit __add_section(int nid, unsigned long phys_start_pfn,
 		struct vmem_altmap *altmap, bool want_memblock)
 {
 	int ret;
+	int i;
 
 	if (pfn_valid(phys_start_pfn))
 		return -EEXIST;
@@ -258,6 +259,23 @@ static int __meminit __add_section(int nid, unsigned long phys_start_pfn,
 	if (ret < 0)
 		return ret;
 
+	/*
+	 * Initialize all pages in the section before fully exposing them to
+	 * the system so nobody will stumble over a half initialized state.
+	 */
+	for (i = 0; i < PAGES_PER_SECTION; i++) {
+		unsigned long pfn = phys_start_pfn + i;
+		struct page *page;
+
+		if (!pfn_valid(pfn))
+			continue;
+		page = pfn_to_page(pfn);
+
+		/* dummy zone, the actual one will be set when onlining pages */
+		init_single_page(page, pfn, ZONE_NORMAL, nid);
+		SetPageReserved(page);
+	}
+
 	if (!want_memblock)
 		return 0;
 
@@ -891,15 +909,8 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_type)
 	int nid;
 	int ret;
 	struct memory_notify arg;
-	struct memory_block *mem;
-
-	/*
-	 * We can't use pfn_to_nid() because nid might be stored in struct page
-	 * which is not yet initialized. Instead, we find nid from memory block.
-	 */
-	mem = find_memory_block(__pfn_to_section(pfn));
-	nid = mem->nid;
 
+	nid = pfn_to_nid(pfn);
 	/* associate pfn range with the zone */
 	zone = move_pfn_range(online_type, nid, pfn, nr_pages);
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a790ef4be74e..8d81df4c40ab 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1168,7 +1168,7 @@ static void free_one_page(struct zone *zone,
 	spin_unlock(&zone->lock);
 }
 
-static void __meminit __init_single_page(struct page *page, unsigned long pfn,
+void __meminit init_single_page(struct page *page, unsigned long pfn,
 		unsigned long zone, int nid)
 {
 	mm_zero_struct_page(page);
@@ -1178,11 +1178,7 @@ static void __meminit __init_single_page(struct page *page, unsigned long pfn,
 	page_cpupid_reset_last(page);
 	INIT_LIST_HEAD(&page->lru);
 
-#ifdef WANT_PAGE_VIRTUAL
-	/* The shift won't overflow because ZONE_NORMAL is below 4G. */
-	if (!is_highmem_idx(zone))
-		set_page_address(page, __va(pfn << PAGE_SHIFT));
-#endif
+	set_page_virtual(page, pfn, zone);
 }
 
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
@@ -1203,7 +1199,7 @@ static void __meminit init_reserved_page(unsigned long pfn)
 		if (pfn >= zone->zone_start_pfn && pfn < zone_end_pfn(zone))
 			break;
 	}
-	__init_single_page(pfn_to_page(pfn), pfn, zid, nid);
+	init_single_page(pfn_to_page(pfn), pfn, zid, nid);
 }
 #else
 static inline void init_reserved_page(unsigned long pfn)
@@ -1520,7 +1516,7 @@ static unsigned long __init deferred_init_pages(int nid, int zid,
 		} else {
 			page++;
 		}
-		__init_single_page(page, pfn, zid, nid);
+		init_single_page(page, pfn, zid, nid);
 		nr_pages++;
 	}
 	return (nr_pages);
@@ -5519,9 +5515,12 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 not_early:
 		page = pfn_to_page(pfn);
-		__init_single_page(page, pfn, zone, nid);
-		if (context == MEMMAP_HOTPLUG)
-			SetPageReserved(page);
+		if (context == MEMMAP_HOTPLUG) {
+			/* everything but the zone was initialized */
+			set_page_zone(page, zone);
+			set_page_virtual(page, pfn, zone);
+		} else
+			init_single_page(page, pfn, zone, nid);
 
 		/*
 		 * Mark the block movable so that blocks are reserved for
@@ -6386,7 +6385,7 @@ void __paginginit free_area_init_node(int nid, unsigned long *zones_size,
 #if defined(CONFIG_HAVE_MEMBLOCK) && !defined(CONFIG_FLAT_NODE_MEM_MAP)
 	/*
 	 * Only struct pages that are backed by physical memory are zeroed and
-	 * initialized by going through __init_single_page(). But, there are some
+	 * initialized by going through init_single_page(). But, there are some
 	 * struct pages which are reserved in memblock allocator and their fields
 	 * may be accessed (for example page_to_pfn() on some configuration accesses
 	 * flags). We must explicitly zero those struct pages.
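
A minimal sketch of the invariant this patch establishes (illustrative
only; the helper below is made up, while pfn_valid(), pfn_to_page(),
PageReserved(), page_zone() and zone_idx() are existing kernel
interfaces): once a section has been added, every valid pfn in it should
pass this check even if the memory is never onlined.

	/* Illustrative only: expected page state right after __add_section(). */
	static bool added_page_is_initialized(unsigned long pfn)
	{
		struct page *page;

		if (!pfn_valid(pfn))
			return false;
		page = pfn_to_page(pfn);

		/* Reserved, with the dummy zone until onlining sets the real one. */
		return PageReserved(page) &&
		       zone_idx(page_zone(page)) == ZONE_NORMAL;
	}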