diff mbox series

[2/8] mm: Split huge pages on write-notify or COW

Message ID 20191203132239.5910-3-thomas_os@shipmail.org (mailing list archive)
State New, archived
Headers show
Series Huge page-table entries for TTM | expand

Commit Message

Thomas Hellström (Intel) Dec. 3, 2019, 1:22 p.m. UTC
From: Thomas Hellstrom <thellstrom@vmware.com>

We currently only do COW and write-notify on the PTE level, so if the
huge_fault() handler returns VM_FAULT_FALLBACK on wp faults,
split the huge pages and page-table entries. Also do this for huge PUDs
if there is no huge_fault() handler and the vma is not anonymous, similar
to how it's done for PMDs.

Note that fs/dax.c does the splitting in the huge_fault() handler, but as
huge_fault() is implemented by modules we need to consider whether to
export the splitting functions for use in the modules or whether to try
to keep calls in the core. Opt for the latter. A follow-up patch can
remove the dax.c split_huge_pmd() if needed.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 mm/memory.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

Comments

Andrew Morton March 1, 2020, 4:04 a.m. UTC | #1
On Tue,  3 Dec 2019 14:22:33 +0100 Thomas Hellström (VMware) <thomas_os@shipmail.org> wrote:

> We currently only do COW and write-notify on the PTE level, so if the
> huge_fault() handler returns VM_FAULT_FALLBACK on wp faults,
> split the huge pages and page-table entries. Also do this for huge PUDs
> if there is no huge_fault() handler and the vma is not anonymous, similar
> to how it's done for PMDs.

There is no description here of *why* we are making this change?

> Note that fs/dax.c does the splitting in the huge_fault() handler, but as
> huge_fault() is implemented by modules we need to consider whether to
> export the splitting functions for use in the modules or whether to try
> to keep calls in the core. Opt for the latter. A follow-up patch can
> remove the dax.c split_huge_pmd() if needed.
diff mbox series

Patch

diff --git a/mm/memory.c b/mm/memory.c
index 78517a5bb8ac..7ede2456a76f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3851,11 +3851,14 @@  static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd)
 {
 	if (vma_is_anonymous(vmf->vma))
 		return do_huge_pmd_wp_page(vmf, orig_pmd);
-	if (vmf->vma->vm_ops->huge_fault)
-		return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD);
+	if (vmf->vma->vm_ops->huge_fault) {
+		vm_fault_t ret = vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD);
 
-	/* COW handled on pte level: split pmd */
-	VM_BUG_ON_VMA(vmf->vma->vm_flags & VM_SHARED, vmf->vma);
+		if (!(ret & VM_FAULT_FALLBACK))
+			return ret;
+	}
+
+	/* COW or write-notify handled on pte level: split pmd. */
 	__split_huge_pmd(vmf->vma, vmf->pmd, vmf->address, false, NULL);
 
 	return VM_FAULT_FALLBACK;
@@ -3883,9 +3886,16 @@  static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud)
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	/* No support for anonymous transparent PUD pages yet */
 	if (vma_is_anonymous(vmf->vma))
-		return VM_FAULT_FALLBACK;
-	if (vmf->vma->vm_ops->huge_fault)
-		return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PUD);
+		goto split;
+	if (vmf->vma->vm_ops->huge_fault) {
+		vm_fault_t ret = vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PUD);
+
+		if (!(ret & VM_FAULT_FALLBACK))
+			return ret;
+	}
+split:
+	/* COW or write-notify not handled on PUD level: split pud.*/
+	__split_huge_pud(vmf->vma, vmf->pud, vmf->address);
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 	return VM_FAULT_FALLBACK;
 }