Message ID | 20170627095633.6508-1-oohall@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Jun 27, 2017 at 2:56 AM, Oliver O'Halloran <oohall@gmail.com> wrote: > Currently libnvdimm uses HPAGE_SIZE as the default alignment for DAX and > PFN devices. HPAGE_SIZE is the default hugetlbfs page size and when > hugetlbfs is disabled it defaults to PAGE_SIZE. Given DAX has more > in common with THP than hugetlbfs we should proably be using > HPAGE_PMD_SIZE, but this is undefined when THP is disabled so lets just > give it a new name. > > The other usage of HPAGE_SIZE in libnvdimm is when determining how large > the altmap should be. For the reasons mentioned above it doesn't really > make sense to use HPAGE_SIZE here either. PMD_SIZE seems to be safe to > use in generic code and it happens to match the vmemmap allocation block > on x86 and Power. It's still a hack, but it's a slightly nicer hack. Looks ok, but I seem to remember some 0day kbuild robot reports the last time I tried to use PMD_SIZE. I'll give this a spin in my tree and see if that's the still the case or I'm mis-remembering.
On Tue, Jul 11, 2017 at 9:44 AM, Dan Williams <dan.j.williams@intel.com> wrote: > On Tue, Jun 27, 2017 at 2:56 AM, Oliver O'Halloran <oohall@gmail.com> wrote: >> Currently libnvdimm uses HPAGE_SIZE as the default alignment for DAX and >> PFN devices. HPAGE_SIZE is the default hugetlbfs page size and when >> hugetlbfs is disabled it defaults to PAGE_SIZE. Given DAX has more >> in common with THP than hugetlbfs we should proably be using >> HPAGE_PMD_SIZE, but this is undefined when THP is disabled so lets just >> give it a new name. >> >> The other usage of HPAGE_SIZE in libnvdimm is when determining how large >> the altmap should be. For the reasons mentioned above it doesn't really >> make sense to use HPAGE_SIZE here either. PMD_SIZE seems to be safe to >> use in generic code and it happens to match the vmemmap allocation block >> on x86 and Power. It's still a hack, but it's a slightly nicer hack. > > Looks ok, but I seem to remember some 0day kbuild robot reports the > last time I tried to use PMD_SIZE. I'll give this a spin in my tree > and see if that's the still the case or I'm mis-remembering. If PMD_SIZE is causing issues we should probably look at fixing the arch that provides it rather than trying to work around it. The kernel expects a three level page table structure to be provided by the arch even if the HW doesn't provide one so it *should* be well defined everywhere. That said, pfn_devs.c is only built when ZONE_DEVICE is enabled. So in practice we only really need to make sure it works on x86 and power. If another arch wants to support it then the onus is on them to make sure PMD_SIZE exists. Oliver
diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h index 8cabd836df0e..714e3337b609 100644 --- a/drivers/nvdimm/nd.h +++ b/drivers/nvdimm/nd.h @@ -281,6 +281,13 @@ static inline struct device *nd_btt_create(struct nd_region *nd_region) struct nd_pfn *to_nd_pfn(struct device *dev); #if IS_ENABLED(CONFIG_NVDIMM_PFN) + +#ifdef CONFIG_TRANSPARENT_HUGEPAGE +#define PFN_DEFAULT_ALIGNMENT HPAGE_PMD_SIZE +#else +#define PFN_DEFAULT_ALIGNMENT PAGE_SIZE +#endif + int nd_pfn_probe(struct device *dev, struct nd_namespace_common *ndns); bool is_nd_pfn(struct device *dev); struct device *nd_pfn_create(struct nd_region *nd_region); diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c index 5fcb6f5b22a2..2ae9a000b090 100644 --- a/drivers/nvdimm/pfn_devs.c +++ b/drivers/nvdimm/pfn_devs.c @@ -290,7 +290,7 @@ struct device *nd_pfn_devinit(struct nd_pfn *nd_pfn, return NULL; nd_pfn->mode = PFN_MODE_NONE; - nd_pfn->align = HPAGE_SIZE; + nd_pfn->align = PFN_DEFAULT_ALIGNMENT; dev = &nd_pfn->dev; device_initialize(&nd_pfn->dev); if (ndns && !__nd_attach_ndns(&nd_pfn->dev, ndns, &nd_pfn->ndns)) { @@ -638,11 +638,12 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn) / PAGE_SIZE); if (nd_pfn->mode == PFN_MODE_PMEM) { /* - * vmemmap_populate_hugepages() allocates the memmap array in - * HPAGE_SIZE chunks. + * The altmap should be padded out to the block size used + * when populating the vmemmap. This *should* be equal to + * PMD_SIZE for most architectures. */ offset = ALIGN(start + SZ_8K + 64 * npfns + dax_label_reserve, - max(nd_pfn->align, HPAGE_SIZE)) - start; + max(nd_pfn->align, PMD_SIZE)) - start; } else if (nd_pfn->mode == PFN_MODE_RAM) offset = ALIGN(start + SZ_8K + dax_label_reserve, nd_pfn->align) - start;
Currently libnvdimm uses HPAGE_SIZE as the default alignment for DAX and PFN devices. HPAGE_SIZE is the default hugetlbfs page size and when hugetlbfs is disabled it defaults to PAGE_SIZE. Given DAX has more in common with THP than hugetlbfs we should proably be using HPAGE_PMD_SIZE, but this is undefined when THP is disabled so lets just give it a new name. The other usage of HPAGE_SIZE in libnvdimm is when determining how large the altmap should be. For the reasons mentioned above it doesn't really make sense to use HPAGE_SIZE here either. PMD_SIZE seems to be safe to use in generic code and it happens to match the vmemmap allocation block on x86 and Power. It's still a hack, but it's a slightly nicer hack. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> --- drivers/nvdimm/nd.h | 7 +++++++ drivers/nvdimm/pfn_devs.c | 9 +++++---- 2 files changed, 12 insertions(+), 4 deletions(-)