From patchwork Thu Dec 17 13:07:53 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Oscar Salvador
X-Patchwork-Id: 11979779
From: Oscar Salvador
To: akpm@linux-foundation.org
Cc: david@redhat.com, mhocko@kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, vbabka@suse.cz, pasha.tatashin@soleen.com,
 Oscar Salvador
Subject: [PATCH 0/5] Allocate memmap from hotadded memory (per device)
Date: Thu, 17 Dec 2020 14:07:53 +0100
Message-Id: <20201217130758.11565-1-osalvador@suse.de>
X-Mailer: git-send-email 2.13.7
Sender: owner-linux-mm@kvack.org
Precedence: bulk

I figured I would send a new version before going on vacation, so I can
work on it when I am back.

Changes from RFCv3 to Patchv1:
 - Addressed feedback from David
 - Re-order patches

Changes from v2 -> v3:
 - Re-order patches (Michal)
 - Fold "mm,memory_hotplug: Introduce MHP_MEMMAP_ON_MEMORY" in patch#1
 - Add kernel boot option to enable this feature (Michal)

Changes from v1 -> v2:
 - Addressed feedback provided by David
 - Add an arch_support_memmap_on_memory to be called
   from mhp_supports_memmap_on_memory, as atm,
   only ARM, powerpc and x86_64 have altmap support.

Original cover letter:

----

The primary goal of this patchset is to reduce memory overhead of the
hot-added memory (at least for SPARSEMEM_VMEMMAP memory model).
The current way we use to populate memmap (struct page array) has three
main drawbacks:

a) it consumes additional memory until the hotadded memory itself is
   onlined and
b) memmap might end up on a different numa node which is especially
   true for movable_node configuration.
c) due to fragmentation we might end up populating memmap with base
   pages

One way to mitigate all these issues is to simply allocate memmap array
(which is the largest memory footprint of the physical memory hotplug)
from the hot-added memory itself. SPARSEMEM_VMEMMAP memory model allows
us to map any pfn range so the memory doesn't need to be online to be
usable for the array. See patch 3 for more details.
This feature is only usable when CONFIG_SPARSEMEM_VMEMMAP is set.

[Overall design]:

Implementation wise we reuse vmem_altmap infrastructure to override
the default allocator used by vmemmap_populate.
memory_block structure gained a new field called nr_vmemmap_pages.
This plays well for two reasons:

 1) {offline/online}_pages know the difference between start_pfn and
    buddy_start_pfn, which is start_pfn + nr_vmemmap_pages.
    In this way all isolation/migration operations are done
    within the right range of memory without vmemmap pages.
    This allows for much cleaner handling.

 2) In try_remove_memory, we construct a new vmem_altmap struct with the
    right information based on memory_block->nr_vmemmap_pages, so we end up
    calling vmem_altmap_free instead of free_pagetable when removing the
    memory.

Oscar Salvador (5):
  mm: Introduce ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE
  mm,memory_hotplug: Allocate memmap from the added memory range
  acpi,memhotplug: Enable MHP_MEMMAP_ON_MEMORY when supported
  powerpc/memhotplug: Enable MHP_MEMMAP_ON_MEMORY when supported
  mm,memory_hotplug: Add kernel boot option to enable memmap_on_memory

 .../admin-guide/kernel-parameters.txt         |  14 ++
 arch/arm64/Kconfig                            |   4 +
 arch/powerpc/Kconfig                          |   4 +
 .../platforms/pseries/hotplug-memory.c        |   5 +-
 arch/x86/Kconfig                              |   4 +
 drivers/acpi/acpi_memhotplug.c                |   5 +-
 drivers/base/memory.c                         |  20 ++-
 include/linux/memory.h                        |   8 +-
 include/linux/memory_hotplug.h                |  21 ++-
 include/linux/memremap.h                      |   2 +-
 include/linux/mmzone.h                        |   5 +
 mm/Kconfig                                    |   3 +
 mm/Makefile                                   |   5 +-
 mm/memory_hotplug.c                           | 158 +++++++++++++++---
 mm/page_alloc.c                               |   4 +-
 15 files changed, 224 insertions(+), 38 deletions(-)