From patchwork Wed Jul 29 03:34:21 2020
X-Patchwork-Submitter: Jia He
X-Patchwork-Id: 11690303
From: Jia He
To: Dan Williams, Vishal Verma, Mike Rapoport, David Hildenbrand
Cc: Mark Rutland, "Rafael J. Wysocki", Catalin Marinas, Dave Hansen,
 linux-mm@kvack.org, Ira Weiny, Dave Jiang, Jason Gunthorpe, Will Deacon,
 Kaly Xin, Kees Cook, Anshuman Khandual, Hsin-Yi Wang, Jia He,
 linux-arm-kernel@lists.infradead.org, Pankaj Gupta, Steve Capper,
 Greg Kroah-Hartman, linux-nvdimm@lists.01.org,
 linux-kernel@vger.kernel.org, Wei Yang, Andrew Morton, Logan Gunthorpe
Subject: [RFC PATCH 3/6] mm/memory_hotplug: allow pmem kmem not to align
 with memory_block_size
Date: Wed, 29 Jul 2020 11:34:21 +0800
Message-Id: <20200729033424.2629-4-justin.he@arm.com>
In-Reply-To: <20200729033424.2629-1-justin.he@arm.com>
References: <20200729033424.2629-1-justin.he@arm.com>

When dax pmem is probed as a RAM device on arm64, kmem_start in
dev_dax_kmem_probe() previously had to be aligned to the 1G memory block
size, because SECTION_SIZE_BITS is 30 on arm64. There is some metadata at
the beginning/end of the iomem space, e.g. namespace info and nvdimm
labels:

240000000-33fdfffff : Persistent Memory
  240000000-2403fffff : namespace0.0
  280000000-2bfffffff : dax0.0
    280000000-2bfffffff : System RAM

As a result, neither the start address nor the end address of the kmem
space is aligned to memory_block_size. Adding kmem to memory blocks
therefore leaves a big gap, which wastes a large amount of memory.

Fix this by relaxing the alignment check for dax pmem kmem in the
online/offline paths of memory blocks.
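The alignment cost described above can be made concrete with a small
userspace calculation. This is only a sketch, not kernel code: BLOCK_SIZE
is assumed to be 1 GiB (SECTION_SIZE_BITS = 30), and struct byte_range,
align_up(), align_down(), block_aligned_subrange() and bytes_lost() are
hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 1 GiB memory block size, matching SECTION_SIZE_BITS = 30. */
#define BLOCK_SIZE (UINT64_C(1) << 30)

/* Hypothetical helper type: a half-open byte range [start, end). */
struct byte_range {
	uint64_t start;
	uint64_t end;
};

/* Round x up to the next multiple of align (align must be a power of 2). */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Round x down to a multiple of align (align must be a power of 2). */
static uint64_t align_down(uint64_t x, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/* Largest block-aligned sub-range of r; empty (start == end) if none fits. */
static struct byte_range block_aligned_subrange(struct byte_range r)
{
	struct byte_range out;

	out.start = align_up(r.start, BLOCK_SIZE);
	out.end = align_down(r.end, BLOCK_SIZE);
	if (out.end < out.start)
		out.end = out.start;
	return out;
}

/* Bytes of r that fall outside its block-aligned sub-range. */
static uint64_t bytes_lost(struct byte_range r)
{
	struct byte_range a = block_aligned_subrange(r);

	return (r.end - r.start) - (a.end - a.start);
}
```

As sample input, take a region that starts at 0x240400000 (an assumed
start just past a 4 MiB metadata area) and ends at 0x33fdfffff inclusive,
i.e. { 0x240400000, 0x33fe00000 } as a half-open range. Passing it to
bytes_lost() shows how much of the region a strict 1 GiB alignment
requirement leaves unused.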
Signed-off-by: Jia He
---
 drivers/base/memory.c | 16 ++++++++++++++++
 mm/memory_hotplug.c   | 39 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 4a1691664c6c..3d2a94f3b1d9 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -334,6 +334,22 @@ static ssize_t valid_zones_show(struct device *dev,
 	 * online nodes otherwise the page_zone is not reliable
 	 */
 	if (mem->state == MEM_ONLINE) {
+#ifdef CONFIG_ZONE_DEVICE
+		struct resource res;
+		int ret;
+
+		/* adjust start_pfn for dax pmem kmem */
+		ret = find_next_iomem_res(start_pfn << PAGE_SHIFT,
+					((start_pfn + nr_pages) << PAGE_SHIFT) - 1,
+					IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
+					IORES_DESC_PERSISTENT_MEMORY,
+					false, &res);
+		if (!ret && PFN_UP(res.start) > start_pfn) {
+			nr_pages -= PFN_UP(res.start) - start_pfn;
+			start_pfn = PFN_UP(res.start);
+		}
+#endif
+
 		/*
 		 * The block contains more than one zone can not be offlined.
 		 * This can happen e.g. for ZONE_DMA and ZONE_DMA32
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a53103dc292b..25745f67b680 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -999,6 +999,20 @@ int try_online_node(int nid)

 static int check_hotplug_memory_range(u64 start, u64 size)
 {
+#ifdef CONFIG_ZONE_DEVICE
+	struct resource res;
+	int ret;
+
+	/* Allow pmem kmem not to align with block size */
+	ret = find_next_iomem_res(start, start + size - 1,
+				IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
+				IORES_DESC_PERSISTENT_MEMORY,
+				false, &res);
+	if (!ret) {
+		return 0;
+	}
+#endif
+
 	/* memory range must be block size aligned */
 	if (!size || !IS_ALIGNED(start, memory_block_size_bytes()) ||
 	    !IS_ALIGNED(size, memory_block_size_bytes())) {
@@ -1481,19 +1495,42 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	mem_hotplug_begin();

 	/*
-	 * Don't allow to offline memory blocks that contain holes.
+	 * Don't allow to offline memory blocks that contain holes except
+	 * for pmem.
 	 * Consequently, memory blocks with holes can never get onlined
 	 * via the hotplug path - online_pages() - as hotplugged memory has
 	 * no holes. This way, we e.g., don't have to worry about marking
 	 * memory holes PG_reserved, don't need pfn_valid() checks, and can
 	 * avoid using walk_system_ram_range() later.
+	 * When dax pmem is used as RAM (kmem), holes at the beginning is
+	 * allowed.
 	 */
 	walk_system_ram_range(start_pfn, end_pfn - start_pfn, &nr_pages,
 			      count_system_ram_pages_cb);
 	if (nr_pages != end_pfn - start_pfn) {
+#ifdef CONFIG_ZONE_DEVICE
+		struct resource res;
+
+		/* Allow pmem kmem not to align with block size */
+		ret = find_next_iomem_res(start_pfn << PAGE_SHIFT,
+					(end_pfn << PAGE_SHIFT) - 1,
+					IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
+					IORES_DESC_PERSISTENT_MEMORY,
+					false, &res);
+		if (ret) {
+			ret = -EINVAL;
+			reason = "memory holes";
+			goto failed_removal;
+		}
+
+		/* adjust start_pfn for dax pmem kmem */
+		start_pfn = PFN_UP(res.start);
+		end_pfn = PFN_DOWN(res.end + 1);
+#else
 		ret = -EINVAL;
 		reason = "memory holes";
 		goto failed_removal;
+#endif
 	}

 	/* This makes hotplug much easier...and readable.
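The pfn adjustment in the __offline_pages() hunk above can be modelled in
userspace roughly as follows. This is a sketch under stated assumptions,
not the kernel implementation: a 4 KiB page size is assumed
(SKETCH_PAGE_SHIFT = 12; arm64 also supports other page sizes), pfn_up()
and pfn_down() follow the usual PFN_UP()/PFN_DOWN() rounding, and struct
pfn_range and trim_block_to_resource() are hypothetical names. The sketch
clamps the resource to the block explicitly, on the assumption that the
resource lookup reports only the part that overlaps the requested range.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

/* Round a byte address up to a page frame number, like PFN_UP(). */
static uint64_t pfn_up(uint64_t addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/* Round a byte address down to a page frame number, like PFN_DOWN(). */
static uint64_t pfn_down(uint64_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/* Hypothetical helper type: a half-open pfn range [start_pfn, end_pfn). */
struct pfn_range {
	uint64_t start_pfn;
	uint64_t end_pfn;
};

/*
 * Shrink a memory block to the pages fully covered by a resource given as
 * inclusive byte addresses [res_start, res_end].
 * Precondition: the resource overlaps the block.
 */
static struct pfn_range trim_block_to_resource(struct pfn_range block,
					       uint64_t res_start,
					       uint64_t res_end)
{
	uint64_t blk_first = block.start_pfn << SKETCH_PAGE_SHIFT;
	uint64_t blk_last = (block.end_pfn << SKETCH_PAGE_SHIFT) - 1;
	struct pfn_range out;

	/* Clamp the resource to the block before converting to pfns. */
	if (res_start < blk_first)
		res_start = blk_first;
	if (res_end > blk_last)
		res_end = blk_last;

	out.start_pfn = pfn_up(res_start);       /* first fully covered page */
	out.end_pfn = pfn_down(res_end + 1);     /* one past the last page */
	return out;
}
```

For a 1 GiB block whose first 4 MiB are not part of the resource, the
returned range starts 1024 pages after the block start and ends at the
block end, which corresponds to the "holes at the beginning" case
mentioned in the updated comment.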