From patchwork Wed Jul 29 03:34:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jia He X-Patchwork-Id: 11690273 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4D42C1575 for ; Wed, 29 Jul 2020 03:35:22 +0000 (UTC) Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 315F82074B for ; Wed, 29 Jul 2020 03:35:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 315F82074B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-nvdimm-bounces@lists.01.org Received: from ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 199DA12693ECE; Tue, 28 Jul 2020 20:35:22 -0700 (PDT) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=217.140.110.172; helo=foss.arm.com; envelope-from=justin.he@arm.com; receiver= Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by ml01.01.org (Postfix) with ESMTP id CB1E2124DA51E for ; Tue, 28 Jul 2020 20:35:20 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 288FB31B; Tue, 28 Jul 2020 20:35:20 -0700 (PDT) Received: from localhost.localdomain (entos-thunderx2-02.shanghai.arm.com [10.169.212.213]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id AC23E3F66E; Tue, 28 Jul 2020 20:35:12 -0700 (PDT) From: Jia He To: Dan Williams , Vishal Verma , Mike Rapoport , David Hildenbrand Subject: [RFC PATCH 1/6] mm/memory_hotplug: remove redundant memory block size alignment check Date: Wed, 29 Jul 2020 11:34:19 +0800 Message-Id: 
<20200729033424.2629-2-justin.he@arm.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20200729033424.2629-1-justin.he@arm.com>
References: <20200729033424.2629-1-justin.he@arm.com>
Message-ID-Hash: ZBYAH4VYQVDAGDBJDSJ675TNIDYLCENW
X-Message-ID-Hash: ZBYAH4VYQVDAGDBJDSJ675TNIDYLCENW
X-MailFrom: justin.he@arm.com
X-Mailman-Rule-Hits: nonmember-moderation
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency;
 loop; banned-address; member-moderation
CC: Catalin Marinas, Will Deacon, Greg Kroah-Hartman, "Rafael J. Wysocki",
 Andrew Morton, Steve Capper, Mark Rutland, Anshuman Khandual,
 Hsin-Yi Wang, Jason Gunthorpe, Dave Hansen, Kees Cook,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-nvdimm@lists.01.org, linux-mm@kvack.org, Pankaj Gupta, Kaly Xin,
 Jia He
X-Mailman-Version: 3.1.1
Precedence: list
List-Id: "Linux-nvdimm developer list."
MIME-Version: 1.0

check_hotplug_memory_range() already checks that the range is aligned to
the memory block size. The same check in create_memory_block_devices()
is therefore redundant and can be removed. The equivalent redundant
check in remove_memory_block_devices() is removed as well.

Signed-off-by: Jia He --- drivers/base/memory.c | 8 -------- 1 file changed, 8 deletions(-) diff --git a/drivers/base/memory.c b/drivers/base/memory.c index 2b09b68b9f78..4a1691664c6c 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -642,10 +642,6 @@ int create_memory_block_devices(unsigned long start, unsigned long size) unsigned long block_id; int ret = 0; - if (WARN_ON_ONCE(!IS_ALIGNED(start, memory_block_size_bytes()) || - !IS_ALIGNED(size, memory_block_size_bytes()))) - return -EINVAL; - for (block_id = start_block_id; block_id != end_block_id; block_id++) { ret = init_memory_block(&mem, block_id, MEM_OFFLINE); if (ret) @@ -678,10 +674,6 @@ void remove_memory_block_devices(unsigned long start, unsigned long size) struct memory_block *mem; unsigned long block_id; - if (WARN_ON_ONCE(!IS_ALIGNED(start, memory_block_size_bytes()) || - !IS_ALIGNED(size, memory_block_size_bytes()))) - return; - for (block_id = start_block_id; block_id != end_block_id; block_id++) { mem = find_memory_block_by_id(block_id); if (WARN_ON_ONCE(!mem)) From patchwork Wed Jul 29 03:34:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jia He X-Patchwork-Id: 11690279 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 84B7A1575 for ; Wed, 29 Jul 2020 03:35:32 +0000 (UTC) Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6D48820809 for ; Wed, 29 Jul 2020 03:35:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6D48820809 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-nvdimm-bounces@lists.01.org Received: from 
ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 334C111460B21; Tue, 28 Jul 2020 20:35:32 -0700 (PDT) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=217.140.110.172; helo=foss.arm.com; envelope-from=justin.he@arm.com; receiver= Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by ml01.01.org (Postfix) with ESMTP id 060AF11001AB9 for ; Tue, 28 Jul 2020 20:35:28 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 22CD631B; Tue, 28 Jul 2020 20:35:28 -0700 (PDT) Received: from localhost.localdomain (entos-thunderx2-02.shanghai.arm.com [10.169.212.213]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id A76983F66E; Tue, 28 Jul 2020 20:35:20 -0700 (PDT) From: Jia He To: Dan Williams , Vishal Verma , Mike Rapoport , David Hildenbrand Subject: [RFC PATCH 2/6] resource: export find_next_iomem_res() helper Date: Wed, 29 Jul 2020 11:34:20 +0800 Message-Id: <20200729033424.2629-3-justin.he@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200729033424.2629-1-justin.he@arm.com> References: <20200729033424.2629-1-justin.he@arm.com> Message-ID-Hash: 7RP4CSUI4DOCLVXRJ7SS3YX6AR55NS3Q X-Message-ID-Hash: 7RP4CSUI4DOCLVXRJ7SS3YX6AR55NS3Q X-MailFrom: justin.he@arm.com X-Mailman-Rule-Hits: nonmember-moderation X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation CC: Catalin Marinas , Will Deacon , Greg Kroah-Hartman , "Rafael J. Wysocki" , Andrew Morton , Steve Capper , Mark Rutland , Anshuman Khandual , Hsin-Yi Wang , Jason Gunthorpe , Dave Hansen , Kees Cook , linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org, linux-mm@kvack.org, Pankaj Gupta , Kaly Xin , Jia He X-Mailman-Version: 3.1.1 Precedence: list List-Id: "Linux-nvdimm developer list." 
Archived-At: List-Archive: List-Help: List-Post: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The helper is to find the lowest iomem resource that covers part of [@start..@end] It is useful when relaxing the alignment check for dax pmem kmem. Signed-off-by: Jia He --- include/linux/ioport.h | 3 +++ kernel/resource.c | 3 ++- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/include/linux/ioport.h b/include/linux/ioport.h index 6c2b06fe8beb..203fd16c9f45 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -247,6 +247,9 @@ extern struct resource * __request_region(struct resource *, extern void __release_region(struct resource *, resource_size_t, resource_size_t); +extern int find_next_iomem_res(resource_size_t start, resource_size_t end, + unsigned long flags, unsigned long desc, + bool first_lvl, struct resource *res); #ifdef CONFIG_MEMORY_HOTREMOVE extern int release_mem_region_adjustable(struct resource *, resource_size_t, resource_size_t); diff --git a/kernel/resource.c b/kernel/resource.c index 841737bbda9e..57e6a6802a3d 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -338,7 +338,7 @@ EXPORT_SYMBOL(release_resource); * @first_lvl: walk only the first level children, if set * @res: return ptr, if resource found */ -static int find_next_iomem_res(resource_size_t start, resource_size_t end, +int find_next_iomem_res(resource_size_t start, resource_size_t end, unsigned long flags, unsigned long desc, bool first_lvl, struct resource *res) { @@ -391,6 +391,7 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end, read_unlock(&resource_lock); return p ? 
0 : -ENODEV; } +EXPORT_SYMBOL(find_next_iomem_res); static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end, unsigned long flags, unsigned long desc, From patchwork Wed Jul 29 03:34:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jia He X-Patchwork-Id: 11690281 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 86421722 for ; Wed, 29 Jul 2020 03:35:38 +0000 (UTC) Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6D85C20825 for ; Wed, 29 Jul 2020 03:35:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6D85C20825 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-nvdimm-bounces@lists.01.org Received: from ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 515FB12664311; Tue, 28 Jul 2020 20:35:38 -0700 (PDT) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=217.140.110.172; helo=foss.arm.com; envelope-from=justin.he@arm.com; receiver= Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by ml01.01.org (Postfix) with ESMTP id 7B59B12664311 for ; Tue, 28 Jul 2020 20:35:36 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1CA4631B; Tue, 28 Jul 2020 20:35:36 -0700 (PDT) Received: from localhost.localdomain (entos-thunderx2-02.shanghai.arm.com [10.169.212.213]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id A161E3F66E; Tue, 28 Jul 2020 20:35:28 -0700 (PDT) From: Jia He To: Dan Williams , Vishal Verma , Mike Rapoport , David Hildenbrand 
Subject: [RFC PATCH 3/6] mm/memory_hotplug: allow pmem kmem not to align with memory_block_size
Date: Wed, 29 Jul 2020 11:34:21 +0800
Message-Id: <20200729033424.2629-4-justin.he@arm.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20200729033424.2629-1-justin.he@arm.com>
References: <20200729033424.2629-1-justin.he@arm.com>
Message-ID-Hash: A226WWADFYG5MWIPXQML4YTBEJ6ZTBZV
X-Message-ID-Hash: A226WWADFYG5MWIPXQML4YTBEJ6ZTBZV
X-MailFrom: justin.he@arm.com
X-Mailman-Rule-Hits: nonmember-moderation
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency;
 loop; banned-address; member-moderation
CC: Catalin Marinas, Will Deacon, Greg Kroah-Hartman, "Rafael J. Wysocki",
 Andrew Morton, Steve Capper, Mark Rutland, Anshuman Khandual,
 Hsin-Yi Wang, Jason Gunthorpe, Dave Hansen, Kees Cook,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-nvdimm@lists.01.org, linux-mm@kvack.org, Pankaj Gupta, Kaly Xin,
 Jia He
X-Mailman-Version: 3.1.1
Precedence: list
List-Id: "Linux-nvdimm developer list."
MIME-Version: 1.0

When dax pmem is probed as a RAM device on arm64, kmem_start in
dev_dax_kmem_probe() previously had to be aligned to the 1G memory block
size, because SECTION_SIZE_BITS is 30 on arm64.

The iomem space holds some metadata at its beginning and end, e.g.
namespace info and the nvdimm label:

240000000-33fdfffff : Persistent Memory
  240000000-2403fffff : namespace0.0
  280000000-2bfffffff : dax0.0
    280000000-2bfffffff : System RAM

As a result, neither the start nor the end address of the kmem space is
aligned to memory_block_size. Rounding to memory block boundaries leaves
a large gap when kmem is added as memory blocks, which wastes a lot of
memory.

Relax the alignment check for dax pmem kmem in the memory block
online/offline paths.

Signed-off-by: Jia He --- drivers/base/memory.c | 16 ++++++++++++++++ mm/memory_hotplug.c | 39 ++++++++++++++++++++++++++++++++++++++- 2 files changed, 54 insertions(+), 1 deletion(-) diff --git a/drivers/base/memory.c b/drivers/base/memory.c index 4a1691664c6c..3d2a94f3b1d9 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -334,6 +334,22 @@ static ssize_t valid_zones_show(struct device *dev, * online nodes otherwise the page_zone is not reliable */ if (mem->state == MEM_ONLINE) { +#ifdef CONFIG_ZONE_DEVICE + struct resource res; + int ret; + + /* adjust start_pfn for dax pmem kmem */ + ret = find_next_iomem_res(start_pfn << PAGE_SHIFT, + ((start_pfn + nr_pages) << PAGE_SHIFT) - 1, + IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY, + IORES_DESC_PERSISTENT_MEMORY, + false, &res); + if (!ret && PFN_UP(res.start) > start_pfn) { + nr_pages -= PFN_UP(res.start) - start_pfn; + start_pfn = PFN_UP(res.start); + } +#endif + /* * The block contains more than one zone can not be offlined. * This can happen e.g. for ZONE_DMA and ZONE_DMA32 diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index a53103dc292b..25745f67b680 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -999,6 +999,20 @@ int try_online_node(int nid) static int check_hotplug_memory_range(u64 start, u64 size) { +#ifdef CONFIG_ZONE_DEVICE + struct resource res; + int ret; + + /* Allow pmem kmem not to align with block size */ + ret = find_next_iomem_res(start, start + size - 1, + IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY, + IORES_DESC_PERSISTENT_MEMORY, + false, &res); + if (!ret) { + return 0; + } +#endif + /* memory range must be block size aligned */ if (!size || !IS_ALIGNED(start, memory_block_size_bytes()) || !IS_ALIGNED(size, memory_block_size_bytes())) { @@ -1481,19 +1495,42 @@ static int __ref __offline_pages(unsigned long start_pfn, mem_hotplug_begin(); /* - * Don't allow to offline memory blocks that contain holes. 
+ * Don't allow to offline memory blocks that contain holes except + * for pmem. * Consequently, memory blocks with holes can never get onlined * via the hotplug path - online_pages() - as hotplugged memory has * no holes. This way, we e.g., don't have to worry about marking * memory holes PG_reserved, don't need pfn_valid() checks, and can * avoid using walk_system_ram_range() later. + * When dax pmem is used as RAM (kmem), holes at the beginning is + * allowed. */ walk_system_ram_range(start_pfn, end_pfn - start_pfn, &nr_pages, count_system_ram_pages_cb); if (nr_pages != end_pfn - start_pfn) { +#ifdef CONFIG_ZONE_DEVICE + struct resource res; + + /* Allow pmem kmem not to align with block size */ + ret = find_next_iomem_res(start_pfn << PAGE_SHIFT, + (end_pfn << PAGE_SHIFT) - 1, + IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY, + IORES_DESC_PERSISTENT_MEMORY, + false, &res); + if (ret) { + ret = -EINVAL; + reason = "memory holes"; + goto failed_removal; + } + + /* adjust start_pfn for dax pmem kmem */ + start_pfn = PFN_UP(res.start); + end_pfn = PFN_DOWN(res.end + 1); +#else ret = -EINVAL; reason = "memory holes"; goto failed_removal; +#endif } /* This makes hotplug much easier...and readable. 
From patchwork Wed Jul 29 03:34:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jia He X-Patchwork-Id: 11690287 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8ABCC722 for ; Wed, 29 Jul 2020 03:35:47 +0000 (UTC) Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7254F2075F for ; Wed, 29 Jul 2020 03:35:47 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7254F2075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-nvdimm-bounces@lists.01.org Received: from ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 6D6EC12664311; Tue, 28 Jul 2020 20:35:47 -0700 (PDT) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=217.140.110.172; helo=foss.arm.com; envelope-from=justin.he@arm.com; receiver= Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by ml01.01.org (Postfix) with ESMTP id 8B3111143B072 for ; Tue, 28 Jul 2020 20:35:44 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1581D31B; Tue, 28 Jul 2020 20:35:44 -0700 (PDT) Received: from localhost.localdomain (entos-thunderx2-02.shanghai.arm.com [10.169.212.213]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 9ACF53F66E; Tue, 28 Jul 2020 20:35:36 -0700 (PDT) From: Jia He To: Dan Williams , Vishal Verma , Mike Rapoport , David Hildenbrand Subject: [RFC PATCH 4/6] mm/page_alloc: adjust the start,end in dax pmem kmem case Date: Wed, 29 Jul 2020 11:34:22 +0800 Message-Id: <20200729033424.2629-5-justin.he@arm.com> 
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20200729033424.2629-1-justin.he@arm.com>
References: <20200729033424.2629-1-justin.he@arm.com>
Message-ID-Hash: YNDPXKMQGVGRJJYRDQSV7UGV6YCC6VOK
X-Message-ID-Hash: YNDPXKMQGVGRJJYRDQSV7UGV6YCC6VOK
X-MailFrom: justin.he@arm.com
X-Mailman-Rule-Hits: nonmember-moderation
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency;
 loop; banned-address; member-moderation
CC: Catalin Marinas, Will Deacon, Greg Kroah-Hartman, "Rafael J. Wysocki",
 Andrew Morton, Steve Capper, Mark Rutland, Anshuman Khandual,
 Hsin-Yi Wang, Jason Gunthorpe, Dave Hansen, Kees Cook,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-nvdimm@lists.01.org, linux-mm@kvack.org, Pankaj Gupta, Kaly Xin,
 Jia He
X-Mailman-Version: 3.1.1
Precedence: list
List-Id: "Linux-nvdimm developer list."
MIME-Version: 1.0

There are 3 cases when onlining pages:
 - normal RAM, which should be aligned to the memory block size
 - persistent memory with ZONE_DEVICE
 - persistent memory used as normal RAM (kmem) with ZONE_NORMAL; for
   this case, this patch adjusts start_pfn/end_pfn after finding the
   corresponding resource range.

Without this patch, the check in __init_single_page() fails when
onlining memory, because those pages have not been mapped in the MMU
(they are not present from the MMU's point of view).

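For illustration only (not part of the patch): a minimal userspace
sketch of the start_pfn/end_pfn adjustment described above. It assumes
4K pages, and that the resource lookup returns the intersection of the
queried range with the matching resource. All sketch_* names are
hypothetical stand-ins for PFN_UP(), PFN_DOWN() and the resource lookup.

```c
#include <stdint.h>

/* Assumption: 4K pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ull << SKETCH_PAGE_SHIFT)

struct sketch_pfn_range {
	uint64_t start_pfn;	/* first pfn to initialise */
	uint64_t end_pfn;	/* one past the last pfn to initialise */
};

/* Round an address up to the next page frame number, like PFN_UP(). */
static uint64_t sketch_pfn_up(uint64_t addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/* Round an address down to a page frame number, like PFN_DOWN(). */
static uint64_t sketch_pfn_down(uint64_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/*
 * Clamp the memory block [blk_start, blk_end] (inclusive byte addresses)
 * to the kmem resource [res_start, res_end] and convert the result to a
 * pfn range, rounding inwards so only whole pages are covered.
 * The caller is assumed to pass ranges that overlap.
 */
static struct sketch_pfn_range
sketch_adjust_pfns(uint64_t blk_start, uint64_t blk_end,
		   uint64_t res_start, uint64_t res_end)
{
	uint64_t start = blk_start > res_start ? blk_start : res_start;
	uint64_t end = blk_end < res_end ? blk_end : res_end;
	struct sketch_pfn_range r;

	r.start_pfn = sketch_pfn_up(start);
	r.end_pfn = sketch_pfn_down(end + 1);
	return r;
}
```

For a 1G memory block at 0x240000000 and a kmem resource of
242400000-2bfffffff, this yields the pfn range [0x242400, 0x280000), so
the pages in front of the resource are skipped.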
Signed-off-by: Jia He --- mm/page_alloc.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index e028b87ce294..13216ab3623f 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5971,6 +5971,20 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone, if (start_pfn == altmap->base_pfn) start_pfn += altmap->reserve; end_pfn = altmap->base_pfn + vmem_altmap_offset(altmap); + } else { + struct resource res; + int ret; + + /* adjust the start,end in dax pmem kmem case */ + ret = find_next_iomem_res(start_pfn << PAGE_SHIFT, + (end_pfn << PAGE_SHIFT) - 1, + IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY, + IORES_DESC_PERSISTENT_MEMORY, + false, &res); + if (!ret) { + start_pfn = PFN_UP(res.start); + end_pfn = PFN_DOWN(res.end + 1); + } } #endif From patchwork Wed Jul 29 03:34:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jia He X-Patchwork-Id: 11690291 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AA2791575 for ; Wed, 29 Jul 2020 03:35:55 +0000 (UTC) Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8E5E42075F for ; Wed, 29 Jul 2020 03:35:55 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8E5E42075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-nvdimm-bounces@lists.01.org Received: from ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 88D2512693ECC; Tue, 28 Jul 2020 20:35:55 -0700 (PDT) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=217.140.110.172; helo=foss.arm.com; 
envelope-from=justin.he@arm.com; receiver= Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by ml01.01.org (Postfix) with ESMTP id 8801A11460B21 for ; Tue, 28 Jul 2020 20:35:52 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 0F46A31B; Tue, 28 Jul 2020 20:35:52 -0700 (PDT) Received: from localhost.localdomain (entos-thunderx2-02.shanghai.arm.com [10.169.212.213]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 93FEF3F66E; Tue, 28 Jul 2020 20:35:44 -0700 (PDT) From: Jia He To: Dan Williams , Vishal Verma , Mike Rapoport , David Hildenbrand Subject: [RFC PATCH 5/6] device-dax: relax the memblock size alignment for kmem_start Date: Wed, 29 Jul 2020 11:34:23 +0800 Message-Id: <20200729033424.2629-6-justin.he@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200729033424.2629-1-justin.he@arm.com> References: <20200729033424.2629-1-justin.he@arm.com> Message-ID-Hash: OEULDM2NPRBIUIZTNTITSSNHABF2VJGU X-Message-ID-Hash: OEULDM2NPRBIUIZTNTITSSNHABF2VJGU X-MailFrom: justin.he@arm.com X-Mailman-Rule-Hits: nonmember-moderation X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation CC: Catalin Marinas , Will Deacon , Greg Kroah-Hartman , "Rafael J. Wysocki" , Andrew Morton , Steve Capper , Mark Rutland , Anshuman Khandual , Hsin-Yi Wang , Jason Gunthorpe , Dave Hansen , Kees Cook , linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org, linux-mm@kvack.org, Pankaj Gupta , Kaly Xin , Jia He X-Mailman-Version: 3.1.1 Precedence: list List-Id: "Linux-nvdimm developer list." Archived-At: List-Archive: List-Help: List-Post: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Previously, kmem_start in dev_dax_kmem_probe should be aligned with SECTION_SIZE_BITS(30), i.e. 1G memblock size on arm64. 
Even Dan Williams' sub-section patch series did not help when adding the
dax pmem kmem to memblock:

$ ndctl create-namespace -e namespace0.0 --mode=devdax --map=dev -s 2g -f -a 2M
$ echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
$ echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
$ cat /proc/iomem
...
23c000000-23fffffff : System RAM
  23dd40000-23fecffff : reserved
  23fed0000-23fffffff : reserved
240000000-33fdfffff : Persistent Memory
  240000000-2403fffff : namespace0.0
  280000000-2bfffffff : dax0.0          <- boundaries are aligned to 1G
    280000000-2bfffffff : System RAM (kmem)
$ lsmem
RANGE                                  SIZE  STATE REMOVABLE BLOCK
0x0000000040000000-0x000000023fffffff    8G online       yes   1-8
0x0000000280000000-0x00000002bfffffff    1G online       yes    10

Memory block size:         1G
Total online memory:       9G
Total offline memory:      0B
...

Hence there is a big gap between 0x2403fffff and 0x280000000 due to the
1G alignment on arm64. Moreover, only 1G of memory is returned although
2G was requested.

On x86, the gap is relatively small because SECTION_SIZE_BITS is 27.

Besides decreasing SECTION_SIZE_BITS on arm64, we can relax the
alignment when adding the kmem.

After this patch:
240000000-33fdfffff : Persistent Memory
  240000000-2421fffff : namespace0.0
  242400000-2bfffffff : dax0.0
    242400000-2bfffffff : System RAM (kmem)
$ lsmem
RANGE                                  SIZE  STATE REMOVABLE BLOCK
0x0000000040000000-0x00000002bfffffff   10G online       yes  1-10

Memory block size:         1G
Total online memory:      10G
Total offline memory:      0B

Note: blocks 9-10 are the newly hot-added ones.

This patch removes the tight memory_block_size_bytes() alignment
constraint, but still keeps the constraint from online_pages_range().

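For illustration only (not part of the patch): a userspace sketch of the
relaxed alignment arithmetic, assuming 4K pages and MAX_ORDER 11, so the
alignment is capped at 4M. The sketch_* names are hypothetical, and the
resource start 0x242200000 used in the note below is an assumed value;
the log above only shows the resulting dax0.0 range.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for this sketch: 4K pages and MAX_ORDER 11. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_MAX_ORDER  11

struct sketch_kmem_range {
	uint64_t start;	/* first byte to hotplug */
	uint64_t end;	/* one past the last byte to hotplug */
};

/* Smallest order such that (1 << order) pages cover size bytes. */
static int sketch_get_order(uint64_t size)
{
	uint64_t pages = (size + (1ull << SKETCH_PAGE_SHIFT) - 1)
			 >> SKETCH_PAGE_SHIFT;
	int order = 0;

	while ((1ull << order) < pages)
		order++;
	return order;
}

/*
 * Align the inclusive resource [res_start, res_end] to the largest
 * buddy order (capped at MAX_ORDER - 1) instead of the memory block
 * size: round the start up, then round the remaining size down.
 */
static struct sketch_kmem_range
sketch_kmem_align(uint64_t res_start, uint64_t res_end)
{
	uint64_t size = res_end - res_start + 1;
	int order = sketch_get_order(size);
	struct sketch_kmem_range r;
	uint64_t align;

	if (order > SKETCH_MAX_ORDER - 1)
		order = SKETCH_MAX_ORDER - 1;
	align = 1ull << (order + SKETCH_PAGE_SHIFT);
	assert((align & (align - 1)) == 0);	/* must be a power of two */

	r.start = (res_start + align - 1) & ~(align - 1);
	size -= r.start - res_start;	/* compensate for moving start up */
	size &= ~(align - 1);		/* keep only complete chunks */
	r.end = r.start + size;
	return r;
}
```

With an assumed resource of 242200000-2bfffffff, the sketch gives a
start of 0x242400000 and an end of 0x2c0000000, matching the dax0.0
range shown after the patch. With 1G alignment the same resource would
start at 0x280000000.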
Signed-off-by: Jia He --- drivers/dax/kmem.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c index d77786dc0d92..849d0706dfe0 100644 --- a/drivers/dax/kmem.c +++ b/drivers/dax/kmem.c @@ -30,9 +30,20 @@ int dev_dax_kmem_probe(struct device *dev) const char *new_res_name; int numa_node; int rc; + int order; - /* Hotplug starting at the beginning of the next block: */ - kmem_start = ALIGN(res->start, memory_block_size_bytes()); + /* kmem_start needn't be aligned with memory_block_size_bytes(). + * But given the constraint in online_pages_range(), adjust the + * alignment of kmem_start and kmem_size + */ + kmem_size = resource_size(res); + order = min_t(int, MAX_ORDER - 1, get_order(kmem_size)); + kmem_start = ALIGN(res->start, 1ul << (order + PAGE_SHIFT)); + /* Adjust the size down to compensate for moving up kmem_start: */ + kmem_size -= kmem_start - res->start; + /* Align the size down to cover only complete blocks: */ + kmem_size &= ~((1ul << (order + PAGE_SHIFT)) - 1); + kmem_end = kmem_start + kmem_size; /* * Ensure good NUMA information for the persistent memory. 
@@ -48,13 +59,6 @@ int dev_dax_kmem_probe(struct device *dev) numa_node, res); } - kmem_size = resource_size(res); - /* Adjust the size down to compensate for moving up kmem_start: */ - kmem_size -= kmem_start - res->start; - /* Align the size down to cover only complete blocks: */ - kmem_size &= ~(memory_block_size_bytes() - 1); - kmem_end = kmem_start + kmem_size; - new_res_name = kstrdup(dev_name(dev), GFP_KERNEL); if (!new_res_name) return -ENOMEM; From patchwork Wed Jul 29 03:34:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jia He X-Patchwork-Id: 11690295 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C5B3A1575 for ; Wed, 29 Jul 2020 03:36:02 +0000 (UTC) Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A9C6E2075F for ; Wed, 29 Jul 2020 03:36:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A9C6E2075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-nvdimm-bounces@lists.01.org Received: from ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id A347112693ECE; Tue, 28 Jul 2020 20:36:02 -0700 (PDT) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=217.140.110.172; helo=foss.arm.com; envelope-from=justin.he@arm.com; receiver= Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by ml01.01.org (Postfix) with ESMTP id AB1C712664311 for ; Tue, 28 Jul 2020 20:36:00 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 14AB831B; Tue, 28 Jul 2020 20:36:00 -0700 (PDT) 
Received: from localhost.localdomain (entos-thunderx2-02.shanghai.arm.com [10.169.212.213])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 8D9883F66E;
 Tue, 28 Jul 2020 20:35:52 -0700 (PDT)
From: Jia He
To: Dan Williams, Vishal Verma, Mike Rapoport, David Hildenbrand
Subject: [RFC PATCH 6/6] arm64: fall back to vmemmap_populate_basepages if not aligned with PMD_SIZE
Date: Wed, 29 Jul 2020 11:34:24 +0800
Message-Id: <20200729033424.2629-7-justin.he@arm.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20200729033424.2629-1-justin.he@arm.com>
References: <20200729033424.2629-1-justin.he@arm.com>
Message-ID-Hash: M6W4VTZYOE7GTXSK4S5COXA6S74EG24S
X-Message-ID-Hash: M6W4VTZYOE7GTXSK4S5COXA6S74EG24S
X-MailFrom: justin.he@arm.com
X-Mailman-Rule-Hits: nonmember-moderation
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency;
 loop; banned-address; member-moderation
CC: Catalin Marinas, Will Deacon, Greg Kroah-Hartman, "Rafael J. Wysocki",
 Andrew Morton, Steve Capper, Mark Rutland, Anshuman Khandual,
 Hsin-Yi Wang, Jason Gunthorpe, Dave Hansen, Kees Cook,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-nvdimm@lists.01.org, linux-mm@kvack.org, Pankaj Gupta, Kaly Xin,
 Jia He
X-Mailman-Version: 3.1.1
Precedence: list
List-Id: "Linux-nvdimm developer list."
MIME-Version: 1.0

In the dax pmem kmem case (dax pmem used as a RAM device), the start
address might not be aligned to PMD_SIZE, e.g.:

240000000-33fdfffff : Persistent Memory
  240000000-2421fffff : namespace0.0
  242400000-2bfffffff : dax0.0
    242400000-2bfffffff : System RAM (kmem)

pfn_to_page(0x242400000) is fffffe0007e90000. Without this patch,
vmemmap_populate(fffffe0007e90000, ...) incorrectly creates a pmd
mapping [fffffe0007e00000, fffffe0008000000], which contains
fffffe0007e90000.

This adds the check and then falls back to vmemmap_populate_basepages() Signed-off-by: Jia He --- arch/arm64/mm/mmu.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index d69feb2cfb84..3b21bd47e801 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1102,6 +1102,10 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node, do { next = pmd_addr_end(addr, end); + if (next - addr < PMD_SIZE) { + vmemmap_populate_basepages(start, next, node, altmap); + continue; + } pgdp = vmemmap_pgd_populate(addr, node); if (!pgdp)