[RFC,34/37] mm: rcu safe vma freeing only for multithreaded user space

Message ID 20210407014502.24091-35-michel@lespinasse.org
State New, archived
Series [RFC,01/37] mmap locking API: mmap_lock_is_contended returns a bool

Commit Message

Michel Lespinasse April 7, 2021, 1:44 a.m. UTC
Performance tuning: as single-threaded userspace does not use
speculative page faults, it does not require RCU-safe vma freeing.
Turn this off to avoid the related (small) extra overhead.

For multi-threaded userspace, we often see a performance benefit from
RCU-safe vma freeing, even in tests that do not have any frequent
concurrent page faults! This is because RCU-safe vma freeing prevents
recently released vmas from being immediately reused in a new thread.

Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
---
 kernel/fork.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Matthew Wilcox April 7, 2021, 2:50 a.m. UTC | #1
On Tue, Apr 06, 2021 at 06:44:59PM -0700, Michel Lespinasse wrote:
> Performance tuning: as single threaded userspace does not use
> speculative page faults, it does not require rcu safe vma freeing.
> Turn this off to avoid the related (small) extra overheads.
> 
> For multi threaded userspace, we often see a performance benefit from
> the rcu safe vma freeing - even in tests that do not have any frequent
> concurrent page faults ! This is because rcu safe vma freeing prevents
> recently released vmas from being immediately reused in a new thread.

Why does that provide a performance benefit?  Recently released
VMAs are cache-hot, and NUMA-local.  I'd expect the RCU delay to be
performance-negative.

Michel Lespinasse April 8, 2021, 7:53 a.m. UTC | #2
On Wed, Apr 07, 2021 at 03:50:06AM +0100, Matthew Wilcox wrote:
> On Tue, Apr 06, 2021 at 06:44:59PM -0700, Michel Lespinasse wrote:
> > Performance tuning: as single threaded userspace does not use
> > speculative page faults, it does not require rcu safe vma freeing.
> > Turn this off to avoid the related (small) extra overheads.
> > 
> > For multi threaded userspace, we often see a performance benefit from
> > the rcu safe vma freeing - even in tests that do not have any frequent
> > concurrent page faults ! This is because rcu safe vma freeing prevents
> > recently released vmas from being immediately reused in a new thread.
> 
> Why does that provide a performance benefit?  Recently released
> VMAs are cache-hot, and NUMA-local.  I'd expect the RCU delay to be
> performance-negative.

I only have the observation and no full explanation for it.
Just try it on wis-mmap and wis-malloc threaded cases. Of course this
all washes away when dealing with more realistic macro benchmarks.

Patch

diff --git a/kernel/fork.c b/kernel/fork.c
index 2f20a5c5fed8..623875e8e742 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -389,10 +389,12 @@  static void __vm_area_free(struct rcu_head *head)
 void vm_area_free(struct vm_area_struct *vma)
 {
 #ifdef CONFIG_SPECULATIVE_PAGE_FAULT
-	call_rcu(&vma->vm_rcu, __vm_area_free);
-#else
-	____vm_area_free(vma);
+	if (atomic_read(&vma->vm_mm->mm_users) > 1) {
+		call_rcu(&vma->vm_rcu, __vm_area_free);
+		return;
+	}
 #endif
+	____vm_area_free(vma);
 }
 
 static void account_kernel_stack(struct task_struct *tsk, int account)
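
Read as a whole, the patched vm_area_free() defers the free through RCU only
when speculative page faults are configured in and the mm has more than one
user; in every other case it frees the vma immediately. The following is a
minimal user-space sketch of that decision, not kernel code. The names
SPECULATIVE_PAGE_FAULT, mm_stub, vma_stub, free_path and vm_area_free_path
are hypothetical stand-ins introduced for illustration, and a plain int
replaces the kernel's atomic_t mm_users. Note that mm_users counts every
holder of a reference to the address space, so "mm_users > 1" is a proxy
for "multithreaded" rather than an exact test.

```c
/* User-space model of the patched vm_area_free() decision; not kernel code. */
#include <assert.h>
#include <stddef.h>

/* Stand-in for CONFIG_SPECULATIVE_PAGE_FAULT; define as 0 to model it off. */
#ifndef SPECULATIVE_PAGE_FAULT
#define SPECULATIVE_PAGE_FAULT 1
#endif

/* Hypothetical stubs holding only the fields the decision reads. */
struct mm_stub {
	int mm_users;		/* atomic_t in the kernel */
};

struct vma_stub {
	struct mm_stub *vm_mm;
};

enum free_path {
	FREE_IMMEDIATE,			/* models ____vm_area_free(vma) */
	FREE_AFTER_GRACE_PERIOD,	/* models call_rcu(&vma->vm_rcu, ...) */
};

/* Report which path the patched function would take for this vma. */
static enum free_path vm_area_free_path(const struct vma_stub *vma)
{
	assert(vma != NULL && vma->vm_mm != NULL);
#if SPECULATIVE_PAGE_FAULT
	if (vma->vm_mm->mm_users > 1)
		return FREE_AFTER_GRACE_PERIOD;
#endif
	return FREE_IMMEDIATE;
}
```

With the option modelled as enabled, a vma whose mm has mm_users == 1 takes
the immediate path and one with mm_users == 2 takes the deferred path. With
it modelled as disabled, both take the immediate path, matching the
behaviour before the patch.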