[RFC,1/2] mmap: Define a new MAP_PMEM_AWARE mmap flag

Message ID 56C9EE14.9090003@plexistor.com (mailing list archive)
State New, archived

Commit Message

Boaz Harrosh Feb. 21, 2016, 5:04 p.m. UTC
In dax.c we go to great lengths to keep track of
write-faulted pages, so that at msync/fsync time we can
clflush all these "dirty" pages and make them durable.

This is heavy on locking and resources and slows down
write-mmap performance considerably.

But some applications might already be PMEM-aware and
use the fast movnt instructions to persist directly to
pmem storage, bypassing the CPU caches.

For these applications we define a new MAP_PMEM_AWARE mmap flag.

In a later patch we use this flag in fs/dax.c to optimize
for these applications.

NOTE: In the current code we also want/need the vma to
carry this flag, so a new VM_PMEM_AWARE flag is defined
as well, and do_mmap() translates between the two constants.

NOTE2: vm_flags has already exhausted its 32 bits, but a
hole was left at value 0x00800000 (after VM_HUGETLB and
before VM_ARCH_1). I hope this does not step on anyone's toes?

CC: Dan Williams <dan.j.williams@intel.com>
CC: Ross Zwisler <ross.zwisler@linux.intel.com>
CC: Matthew Wilcox <willy@linux.intel.com>
CC: linux-nvdimm <linux-nvdimm@ml01.01.org>
CC: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Mel Gorman <mgorman@suse.de>
CC: Johannes Weiner <hannes@cmpxchg.org>
CC: linux-mm@kvack.org (open list:MEMORY MANAGEMENT)

Signed-off-by: Boaz Harrosh <boaz@plexistor.com>
 include/linux/mm.h              | 1 +
 include/uapi/asm-generic/mman.h | 1 +
 mm/mmap.c                       | 2 ++
 3 files changed, 4 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 376f373..fe992c0 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -155,6 +155,7 @@  extern unsigned int kobjsize(const void *objp);
 #define VM_ACCOUNT	0x00100000	/* Is a VM accounted object */
 #define VM_NORESERVE	0x00200000	/* should the VM suppress accounting */
 #define VM_HUGETLB	0x00400000	/* Huge TLB Page VM */
+#define VM_PMEM_AWARE	0x00800000	/* Carries MAP_PMEM_AWARE */
 #define VM_ARCH_1	0x01000000	/* Architecture-specific flag */
 #define VM_ARCH_2	0x02000000
 #define VM_DONTDUMP	0x04000000	/* Do not include in the core dump */
diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
index 7162cd4..0dc14d7 100644
--- a/include/uapi/asm-generic/mman.h
+++ b/include/uapi/asm-generic/mman.h
@@ -12,6 +12,7 @@ 
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
+#define MAP_PMEM_AWARE	0x80000		/* dax.c: Do not cl_flush dirty pages */
 /* Bits [26:31] are reserved, see mman-common.h for MAP_HUGETLB usage */
diff --git a/mm/mmap.c b/mm/mmap.c
index 76d1ec2..5ebc525 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1402,6 +1402,8 @@  unsigned long do_mmap(struct file *file, unsigned long addr,
 		if (file && is_file_hugepages(file))
 			vm_flags |= VM_NORESERVE;
+	if (flags & MAP_PMEM_AWARE)
+		vm_flags |= VM_PMEM_AWARE;
 	addr = mmap_region(file, addr, len, vm_flags, pgoff);
 	if (!IS_ERR_VALUE(addr) &&