diff mbox

libnvdimm: Stop using HPAGE_SIZE

Message ID 20170627095633.6508-1-oohall@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Oliver O'Halloran June 27, 2017, 9:56 a.m. UTC
Currently libnvdimm uses HPAGE_SIZE as the default alignment for DAX and
PFN devices. HPAGE_SIZE is the default hugetlbfs page size and when
hugetlbfs is disabled it defaults to PAGE_SIZE. Given DAX has more
in common with THP than hugetlbfs we should proably be using
HPAGE_PMD_SIZE, but this is undefined when THP is disabled so lets just
give it a new name.

The other usage of HPAGE_SIZE in libnvdimm is when determining how large
the altmap should be. For the reasons mentioned above it doesn't really
make sense to use HPAGE_SIZE here either. PMD_SIZE seems to be safe to
use in generic code and it happens to match the vmemmap allocation block
on x86 and Power. It's still a hack, but it's a slightly nicer hack.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 drivers/nvdimm/nd.h       | 7 +++++++
 drivers/nvdimm/pfn_devs.c | 9 +++++----
 2 files changed, 12 insertions(+), 4 deletions(-)

Comments

Dan Williams July 10, 2017, 11:44 p.m. UTC | #1
On Tue, Jun 27, 2017 at 2:56 AM, Oliver O'Halloran <oohall@gmail.com> wrote:
> Currently libnvdimm uses HPAGE_SIZE as the default alignment for DAX and
> PFN devices. HPAGE_SIZE is the default hugetlbfs page size and when
> hugetlbfs is disabled it defaults to PAGE_SIZE. Given DAX has more
> in common with THP than hugetlbfs we should proably be using
> HPAGE_PMD_SIZE, but this is undefined when THP is disabled so lets just
> give it a new name.
>
> The other usage of HPAGE_SIZE in libnvdimm is when determining how large
> the altmap should be. For the reasons mentioned above it doesn't really
> make sense to use HPAGE_SIZE here either. PMD_SIZE seems to be safe to
> use in generic code and it happens to match the vmemmap allocation block
> on x86 and Power. It's still a hack, but it's a slightly nicer hack.

Looks ok, but I seem to remember some 0day kbuild robot reports the
last time I tried to use PMD_SIZE. I'll give this a spin in my tree
and see if that's the still the case or I'm mis-remembering.
Oliver O'Halloran July 11, 2017, 5:09 a.m. UTC | #2
On Tue, Jul 11, 2017 at 9:44 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Tue, Jun 27, 2017 at 2:56 AM, Oliver O'Halloran <oohall@gmail.com> wrote:
>> Currently libnvdimm uses HPAGE_SIZE as the default alignment for DAX and
>> PFN devices. HPAGE_SIZE is the default hugetlbfs page size and when
>> hugetlbfs is disabled it defaults to PAGE_SIZE. Given DAX has more
>> in common with THP than hugetlbfs we should proably be using
>> HPAGE_PMD_SIZE, but this is undefined when THP is disabled so lets just
>> give it a new name.
>>
>> The other usage of HPAGE_SIZE in libnvdimm is when determining how large
>> the altmap should be. For the reasons mentioned above it doesn't really
>> make sense to use HPAGE_SIZE here either. PMD_SIZE seems to be safe to
>> use in generic code and it happens to match the vmemmap allocation block
>> on x86 and Power. It's still a hack, but it's a slightly nicer hack.
>
> Looks ok, but I seem to remember some 0day kbuild robot reports the
> last time I tried to use PMD_SIZE. I'll give this a spin in my tree
> and see if that's the still the case or I'm mis-remembering.

If PMD_SIZE is causing issues we should probably look at fixing the
arch that provides it rather than trying to work around it. The kernel
expects a three level page table structure to be provided by the arch
even if the HW doesn't provide one so it *should* be well defined
everywhere.

That said, pfn_devs.c is only built when ZONE_DEVICE is enabled. So in
practice we only really need to make sure it works on x86 and power.
If another arch wants to support it then the onus is on them to make
sure PMD_SIZE exists.

Oliver
diff mbox

Patch

diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index 8cabd836df0e..714e3337b609 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -281,6 +281,13 @@  static inline struct device *nd_btt_create(struct nd_region *nd_region)
 
 struct nd_pfn *to_nd_pfn(struct device *dev);
 #if IS_ENABLED(CONFIG_NVDIMM_PFN)
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define PFN_DEFAULT_ALIGNMENT HPAGE_PMD_SIZE
+#else
+#define PFN_DEFAULT_ALIGNMENT PAGE_SIZE
+#endif
+
 int nd_pfn_probe(struct device *dev, struct nd_namespace_common *ndns);
 bool is_nd_pfn(struct device *dev);
 struct device *nd_pfn_create(struct nd_region *nd_region);
diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index 5fcb6f5b22a2..2ae9a000b090 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -290,7 +290,7 @@  struct device *nd_pfn_devinit(struct nd_pfn *nd_pfn,
 		return NULL;
 
 	nd_pfn->mode = PFN_MODE_NONE;
-	nd_pfn->align = HPAGE_SIZE;
+	nd_pfn->align = PFN_DEFAULT_ALIGNMENT;
 	dev = &nd_pfn->dev;
 	device_initialize(&nd_pfn->dev);
 	if (ndns && !__nd_attach_ndns(&nd_pfn->dev, ndns, &nd_pfn->ndns)) {
@@ -638,11 +638,12 @@  static int nd_pfn_init(struct nd_pfn *nd_pfn)
 			/ PAGE_SIZE);
 	if (nd_pfn->mode == PFN_MODE_PMEM) {
 		/*
-		 * vmemmap_populate_hugepages() allocates the memmap array in
-		 * HPAGE_SIZE chunks.
+		 * The altmap should be padded out to the block size used
+		 * when populating the vmemmap. This *should* be equal to
+		 * PMD_SIZE for most architectures.
 		 */
 		offset = ALIGN(start + SZ_8K + 64 * npfns + dax_label_reserve,
-				max(nd_pfn->align, HPAGE_SIZE)) - start;
+				max(nd_pfn->align, PMD_SIZE)) - start;
 	} else if (nd_pfn->mode == PFN_MODE_RAM)
 		offset = ALIGN(start + SZ_8K + dax_label_reserve,
 				nd_pfn->align) - start;