diff mbox series

[v2,4/5] mm/cow: optimise pte dirty bit handling in fork

Message ID 20181016131343.20556-5-npiggin@gmail.com (mailing list archive)
State New, archived
Series mm: dirty/accessed pte optimisations

Commit Message

Nicholas Piggin Oct. 16, 2018, 1:13 p.m. UTC
fork clears dirty/accessed bits from new ptes in the child. This logic
dates from when mapped page reclaim was done by scanning ptes, when it
may have been quite important. Today, with physical-based pte scanning,
there is less reason to clear these bits, so this patch avoids clearing
the dirty bit in the child.

Dirty bits are all tested and cleared together, and one dirty bit has
the same effect as many, so from a correctness and writeback bandwidth
point of view it does not matter if the child gets a dirty bit.
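
As a rough illustration of the argument above, here is a toy userspace
model. It is not kernel code: struct toy_pte and the toy_* helpers are
hypothetical names invented for this sketch, and the real rmap walk is
far more involved. It only shows that, when the parent pte is dirty,
a test-and-clear over all ptes of a page reports the same result
whether or not the child's copy was cleaned at fork.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not kernel code: a pte reduced to its dirty bit. */
struct toy_pte {
	bool dirty;
};

/*
 * Test and clear the dirty bit of every pte mapping one page.
 * Returns true if at least one pte was dirty: one dirty bit has
 * the same effect as many.
 */
static bool toy_page_test_clear_dirty(struct toy_pte *ptes, size_t n)
{
	bool dirty = false;
	size_t i;

	assert(ptes != NULL || n == 0);

	for (i = 0; i < n; i++) {
		dirty |= ptes[i].dirty;
		ptes[i].dirty = false;
	}
	return dirty;
}

/* Copy a parent pte for the child; clean_child models the old behaviour. */
static struct toy_pte toy_copy_pte(struct toy_pte parent, bool clean_child)
{
	struct toy_pte child = parent;

	if (clean_child)
		child.dirty = false;
	return child;
}

/* Fork a dirty parent pte, then report whether the page is seen as dirty. */
static bool toy_dirty_after_fork(bool clean_child)
{
	struct toy_pte ptes[2];

	ptes[0].dirty = true;				/* parent */
	ptes[1] = toy_copy_pte(ptes[0], clean_child);	/* child */
	return toy_page_test_clear_dirty(ptes, 2);
}
```

In this model toy_dirty_after_fork() returns the same value for both
settings of clean_child, because the parent's dirty bit alone is enough
for the page to be treated as dirty.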

Dirty ptes are more costly to unmap because they require flushing
under the page table lock, but it is pretty rare to have a shared
dirty mapping that is copied on fork, so just simplify the code and
avoid this dirty clearing logic.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 mm/memory.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Patch

diff --git a/mm/memory.c b/mm/memory.c
index 0387ee1e3582..9e314339a0bd 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1028,11 +1028,12 @@  copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	}
 
 	/*
-	 * If it's a shared mapping, mark it clean in
-	 * the child
+	 * Child inherits dirty and young bits from parent. There is no
+	 * point clearing them because any cleaning or aging has to walk
+	 * all ptes anyway, and it will notice the bits set in the parent.
+	 * Leaving them set avoids stalls and even page faults on CPUs that
+	 * handle these bits in software.
 	 */
-	if (vm_flags & VM_SHARED)
-		pte = pte_mkclean(pte);
 
 	page = vm_normal_page(vma, addr, pte);
 	if (page) {