diff mbox series

[RFC,5/6] device-dax: relax the memblock size alignment for kmem_start

Message ID 20200729033424.2629-6-justin.he@arm.com (mailing list archive)
State New, archived
Headers show
Series decrease unnecessary gap due to pmem kmem alignment | expand

Commit Message

Jia He July 29, 2020, 3:34 a.m. UTC
Previously, kmem_start in dev_dax_kmem_probe should be aligned with
SECTION_SIZE_BITS(30), i.e. 1G memblock size on arm64. Even with Dan
Williams' sub-section patch series, it was not helpful when adding the
dax pmem kmem to memblock:
$ndctl create-namespace -e namespace0.0 --mode=devdax --map=dev -s 2g -f -a 2M
$echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
$echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
$cat /proc/iomem
...
23c000000-23fffffff : System RAM
  23dd40000-23fecffff : reserved
  23fed0000-23fffffff : reserved
240000000-33fdfffff : Persistent Memory
  240000000-2403fffff : namespace0.0
  280000000-2bfffffff : dax0.0          <- boundary are aligned with 1G
    280000000-2bfffffff : System RAM (kmem)
$ lsmem
RANGE                                 SIZE  STATE REMOVABLE BLOCK
0x0000000040000000-0x000000023fffffff   8G online       yes   1-8
0x0000000280000000-0x00000002bfffffff   1G online       yes    10

Memory block size:         1G
Total online memory:       9G
Total offline memory:      0B
...
Hence there is a big gap between 0x2403fffff and 0x280000000 due to the 1G
alignment on arm64. More than that, only 1G memory is returned while 2G is
requested.

On x86, the gap is relatively small due to SECTION_SIZE_BITS(27).

Besides descreasing SECTION_SIZE_BITS on arm64, we can relax the alignment
when adding the kmem.
After this patch:
240000000-33fdfffff : Persistent Memory
  240000000-2421fffff : namespace0.0
  242400000-2bfffffff : dax0.0
    242400000-2bfffffff : System RAM (kmem)
$ lsmem
RANGE                                 SIZE  STATE REMOVABLE BLOCK
0x0000000040000000-0x00000002bfffffff  10G online       yes  1-10

Memory block size:         1G
Total online memory:      10G
Total offline memory:      0B

Notes, block 9-10 are the newly hotplug added.

This patches remove the tight alignment constraint of
memory_block_size_bytes(), but still keep the constraint from
online_pages_range().

Signed-off-by: Jia He <justin.he@arm.com>
---
 drivers/dax/kmem.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)
diff mbox series

Patch

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index d77786dc0d92..849d0706dfe0 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -30,9 +30,20 @@  int dev_dax_kmem_probe(struct device *dev)
 	const char *new_res_name;
 	int numa_node;
 	int rc;
+	int order;
 
-	/* Hotplug starting at the beginning of the next block: */
-	kmem_start = ALIGN(res->start, memory_block_size_bytes());
+	/* kmem_start needn't be aligned with memory_block_size_bytes().
+	 * But given the constraint in online_pages_range(), adjust the
+	 * alignment of kmem_start and kmem_size
+	 */
+	kmem_size = resource_size(res);
+	order = min_t(int, MAX_ORDER - 1, get_order(kmem_size));
+	kmem_start = ALIGN(res->start, 1ul << (order + PAGE_SHIFT));
+	/* Adjust the size down to compensate for moving up kmem_start: */
+	kmem_size -= kmem_start - res->start;
+	/* Align the size down to cover only complete blocks: */
+	kmem_size &= ~((1ul << (order + PAGE_SHIFT)) - 1);
+	kmem_end = kmem_start + kmem_size;
 
 	/*
 	 * Ensure good NUMA information for the persistent memory.
@@ -48,13 +59,6 @@  int dev_dax_kmem_probe(struct device *dev)
 			numa_node, res);
 	}
 
-	kmem_size = resource_size(res);
-	/* Adjust the size down to compensate for moving up kmem_start: */
-	kmem_size -= kmem_start - res->start;
-	/* Align the size down to cover only complete blocks: */
-	kmem_size &= ~(memory_block_size_bytes() - 1);
-	kmem_end = kmem_start + kmem_size;
-
 	new_res_name = kstrdup(dev_name(dev), GFP_KERNEL);
 	if (!new_res_name)
 		return -ENOMEM;