From patchwork Wed Jul 29 03:34:23 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jia He
X-Patchwork-Id: 11690307
From: Jia He
To: Dan Williams, Vishal Verma, Mike Rapoport, David Hildenbrand
Subject: [RFC PATCH 5/6] device-dax: relax the memblock size alignment for kmem_start
Date: Wed, 29 Jul 2020 11:34:23 +0800
Message-Id: <20200729033424.2629-6-justin.he@arm.com>
In-Reply-To: <20200729033424.2629-1-justin.he@arm.com>
References: <20200729033424.2629-1-justin.he@arm.com>
Cc: Mark Rutland, "Rafael J. Wysocki", Catalin Marinas, Dave Hansen,
 linux-mm@kvack.org, Ira Weiny, Dave Jiang, Jason Gunthorpe, Will Deacon,
 Kaly Xin, Kees Cook, Anshuman Khandual, Hsin-Yi Wang, Jia He,
 linux-arm-kernel@lists.infradead.org, Pankaj Gupta, Steve Capper,
 Greg Kroah-Hartman, linux-nvdimm@lists.01.org,
 linux-kernel@vger.kernel.org, Wei Yang, Andrew Morton, Logan Gunthorpe

Previously, kmem_start in dev_dax_kmem_probe() had to be aligned to the
memory block size, which is 1G on arm64 (SECTION_SIZE_BITS is 30). Even
with Dan Williams' sub-section patch series, this did not help when
adding the dax pmem kmem to memblock:

$ ndctl create-namespace -e namespace0.0 --mode=devdax --map=dev -s 2g -f -a 2M
$ echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
$ echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
$ cat /proc/iomem
...
23c000000-23fffffff : System RAM
  23dd40000-23fecffff : reserved
  23fed0000-23fffffff : reserved
240000000-33fdfffff : Persistent Memory
  240000000-2403fffff : namespace0.0
  280000000-2bfffffff : dax0.0          <- boundaries are aligned to 1G
    280000000-2bfffffff : System RAM (kmem)
$ lsmem
RANGE                                  SIZE  STATE REMOVABLE BLOCK
0x0000000040000000-0x000000023fffffff    8G online       yes   1-8
0x0000000280000000-0x00000002bfffffff    1G online       yes    10

Memory block size:         1G
Total online memory:       9G
Total offline memory:      0B
...

Hence there is a big gap between 0x2403fffff and 0x280000000 due to the
1G alignment on arm64. Worse, only 1G of memory is returned although 2G
was requested.

On x86 the gap is relatively small because SECTION_SIZE_BITS is 27
(128M).

Besides decreasing SECTION_SIZE_BITS on arm64, we can relax the
alignment when adding the kmem.
After this patch:
240000000-33fdfffff : Persistent Memory
  240000000-2421fffff : namespace0.0
  242400000-2bfffffff : dax0.0
    242400000-2bfffffff : System RAM (kmem)
$ lsmem
RANGE                                  SIZE  STATE REMOVABLE BLOCK
0x0000000040000000-0x00000002bfffffff   10G online       yes  1-10

Memory block size:         1G
Total online memory:      10G
Total offline memory:      0B

Note that blocks 9-10 are the newly hotplugged ones.

This patch removes the tight alignment constraint of
memory_block_size_bytes(), but still keeps the constraint from
online_pages_range().

Signed-off-by: Jia He
---
 drivers/dax/kmem.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index d77786dc0d92..849d0706dfe0 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -30,9 +30,20 @@ int dev_dax_kmem_probe(struct device *dev)
 	const char *new_res_name;
 	int numa_node;
 	int rc;
+	int order;
 
-	/* Hotplug starting at the beginning of the next block: */
-	kmem_start = ALIGN(res->start, memory_block_size_bytes());
+	/* kmem_start needn't be aligned with memory_block_size_bytes().
+	 * But given the constraint in online_pages_range(), adjust the
+	 * alignment of kmem_start and kmem_size
+	 */
+	kmem_size = resource_size(res);
+	order = min_t(int, MAX_ORDER - 1, get_order(kmem_size));
+	kmem_start = ALIGN(res->start, 1ul << (order + PAGE_SHIFT));
+	/* Adjust the size down to compensate for moving up kmem_start: */
+	kmem_size -= kmem_start - res->start;
+	/* Align the size down to cover only complete blocks: */
+	kmem_size &= ~((1ul << (order + PAGE_SHIFT)) - 1);
+	kmem_end = kmem_start + kmem_size;
 
 	/*
 	 * Ensure good NUMA information for the persistent memory.
@@ -48,13 +59,6 @@ int dev_dax_kmem_probe(struct device *dev)
 			 numa_node, res);
 	}
 
-	kmem_size = resource_size(res);
-	/* Adjust the size down to compensate for moving up kmem_start: */
-	kmem_size -= kmem_start - res->start;
-	/* Align the size down to cover only complete blocks: */
-	kmem_size &= ~(memory_block_size_bytes() - 1);
-	kmem_end = kmem_start + kmem_size;
-
 	new_res_name = kstrdup(dev_name(dev), GFP_KERNEL);
 	if (!new_res_name)
 		return -ENOMEM;
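
For readers who want to check the alignment arithmetic outside the kernel, here is a minimal userspace sketch of the two calculations in the diff above. It is not part of the patch. The 4K page size, the MAX_ORDER value of 11, the sketch_* helper names, and the resource start and size used in the usage note are all assumptions chosen for illustration; the test machine's real values are not given in this mail. The guard against a region smaller than the alignment padding is an addition of the sketch and has no counterpart in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration only: 4K pages and MAX_ORDER of 11. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_MAX_ORDER  11

struct kmem_range {
	uint64_t start;	/* aligned-up start address */
	uint64_t size;	/* usable size, rounded down to the alignment */
};

/* Userspace stand-in for get_order(): smallest order whose page count covers size. */
static int sketch_get_order(uint64_t size)
{
	uint64_t pages = (size + (1ull << SKETCH_PAGE_SHIFT) - 1) >> SKETCH_PAGE_SHIFT;
	int order = 0;

	while ((1ull << order) < pages)
		order++;
	return order;
}

/* Alignment used by the patched code: order capped at MAX_ORDER - 1. */
static uint64_t sketch_relaxed_align(uint64_t res_size)
{
	int order = sketch_get_order(res_size);

	if (order > SKETCH_MAX_ORDER - 1)
		order = SKETCH_MAX_ORDER - 1;
	return 1ull << (order + SKETCH_PAGE_SHIFT);
}

/*
 * Align the start up, shrink the size by the skipped bytes, then round
 * the size down. Passing the memory block size as align models the old
 * code; passing sketch_relaxed_align() models the patched code.
 */
static struct kmem_range sketch_kmem_range(uint64_t res_start, uint64_t res_size,
					   uint64_t align)
{
	struct kmem_range r = { 0, 0 };
	uint64_t skipped;

	assert(align && (align & (align - 1)) == 0);	/* power of two */
	r.start = (res_start + align - 1) & ~(align - 1);
	skipped = r.start - res_start;
	if (skipped >= res_size)
		return r;	/* sketch-only guard: nothing left to hotplug */
	r.size = (res_size - skipped) & ~(align - 1);
	return r;
}
```

With a hypothetical resource starting at 0x242200000 and spanning 0x7de00000 bytes, passing a 1G alignment gives a start of 0x280000000 and a size of 0x40000000, while passing sketch_relaxed_align() (4M under the assumptions above) gives a start of 0x242400000 and a size of 0x7dc00000.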