[v7,0/2] mm/vmalloc: lock contention optimization under multi-threading

Message ID 20240301155417.1852290-1-rulin.huang@intel.com (mailing list archive)

Message

Huang, Rulin March 1, 2024, 3:54 p.m. UTC
Hi,

This version rearranges the macros relative to the previous one.

We are not sure whether we have moved all of these macros and their
corresponding helpers to the correct position. Could you please check
whether the placement is correct?

~

1. Motivation

When allocating a new memory area whose mapping address range is
known, vmap_node->busy.lock is observed to be acquired twice, although
one of the acquisitions is unnecessary.

2. Design

Of the two acquisitions, the first occurs in alloc_vmap_area() when
the vm area is inserted into the vm mapping red-black tree, and the
second occurs in setup_vmalloc_vm() when the properties of the vm,
such as its flags and address, are updated.

Combine these two operations in alloc_vmap_area() so that the lock is
acquired once instead of twice, which improves scalability when
vmap_node->busy.lock is contended.
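
For illustration only, the two locking patterns described above can be
sketched in userspace C with a pthread mutex. All names below
(busy_tree, area, alloc_area_two_locks, alloc_area_one_lock) are
hypothetical stand-ins and are not the code in mm/vmalloc.c; a counter
stands in for the red-black tree, and lock_count only records how often
the lock is taken.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a busy tree protected by one lock. */
struct busy_tree {
	pthread_mutex_t lock;
	unsigned long lock_count;	/* number of lock acquisitions */
	size_t nr_areas;		/* stands in for the red-black tree */
};

/* Hypothetical stand-in for an area and its properties. */
struct area {
	unsigned long addr;
	unsigned long flags;
	int inserted;
};

static void tree_lock(struct busy_tree *t)
{
	pthread_mutex_lock(&t->lock);
	t->lock_count++;		/* updated while holding the lock */
}

static void tree_unlock(struct busy_tree *t)
{
	pthread_mutex_unlock(&t->lock);
}

/* Before: insertion and property setup each take the lock. */
static void alloc_area_two_locks(struct busy_tree *t, struct area *a,
				 unsigned long addr, unsigned long flags)
{
	assert(t && a);

	tree_lock(t);			/* first acquisition: insert */
	a->inserted = 1;
	t->nr_areas++;
	tree_unlock(t);

	tree_lock(t);			/* second acquisition: set properties */
	a->addr = addr;
	a->flags = flags;
	tree_unlock(t);
}

/* After: both steps share a single critical section. */
static void alloc_area_one_lock(struct busy_tree *t, struct area *a,
				unsigned long addr, unsigned long flags)
{
	assert(t && a);

	tree_lock(t);			/* single acquisition */
	a->inserted = 1;
	t->nr_areas++;
	a->addr = addr;
	a->flags = flags;
	tree_unlock(t);
}
```

Calling each function once on a freshly initialized tree leaves
lock_count at 2 for the first variant and 1 for the second, while the
resulting area state is the same in both cases.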

3. Test results

With the above change, tested on an Intel Sapphire Rapids platform
(224 vCPUs), a 4% performance improvement is observed on
stress-ng/pthread (https://github.com/ColinIanKing/stress-ng),
which stress-tests thread creation.

rulinhuang

[v1] https://lore.kernel.org/all/20240207033059.1565623-1-rulin.huang@intel.com/
[v2] https://lore.kernel.org/all/20240220090521.3316345-1-rulin.huang@intel.com/
[v3] https://lore.kernel.org/all/20240221032905.11392-1-rulin.huang@intel.com/
[v4] https://lore.kernel.org/all/20240222120536.216166-1-rulin.huang@intel.com/
[v5] https://lore.kernel.org/all/20240223130318.112198-2-rulin.huang@intel.com/
[v6] https://lore.kernel.org/lkml/aa8f0413-d055-4b49-bcd3-401e93e01c6d@intel.com/


rulinhuang (2):
  mm/vmalloc: Moved macros with no functional change happened
  mm/vmalloc: Eliminated the lock contention from twice to once

 mm/vmalloc.c | 314 +++++++++++++++++++++++++--------------------------
 1 file changed, 155 insertions(+), 159 deletions(-)


base-commit: 10c2cf5fe97647d68ee89b1f921e982e71519f20

Comments

Huang, Rulin March 6, 2024, 9:18 a.m. UTC | #1
Hello, are there any issues with this patch that need to be addressed?
If there are, we will fix them as soon as possible. Thank you.
