[RFC,bpf-next,v5,0/2] Handle immediate reuse in bpf memory allocator

Message ID 20230619143231.222536-1-houtao@huaweicloud.com (mailing list archive)

Hou Tao June 19, 2023, 2:32 p.m. UTC
From: Hou Tao <houtao1@huawei.com>

Hi,

V5 incorporates suggestions from Alexei and Paul (Big thanks for that).
The main changes include:
*) Use per-cpu lists for the reusable list and the freeing list to reduce
   lock contention and retain the NUMA-aware attribute
*) Use multiple RCU callbacks for reuse as v3 did
*) Use rcu_momentary_dyntick_idle() to reduce the peak memory footprint
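
For readers new to the series, the reuse-after-one-grace-period idea with
per-cpu lists can be sketched in userspace as below. This is only a toy
model under stated assumptions: every name in it (toy_obj, toy_cache,
toy_gp_seq, TOY_NR_CPUS and the toy_* helpers) is hypothetical, the grace
period is simulated with a plain counter instead of real RCU callbacks, and
nothing here is taken from kernel/bpf/memalloc.c.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_CPUS 2	/* hypothetical CPU count for the model */

struct toy_obj {
	struct toy_obj *next;
	unsigned long freed_gp;	/* simulated GP counter at free time */
};

/* One cache per CPU, so the common paths need no shared lock. */
struct toy_cache {
	struct toy_obj *free_list;	/* safe to hand out right now */
	struct toy_obj *wait_list;	/* freed, still waiting for a GP */
};

static unsigned long toy_gp_seq;	/* stand-in for RCU grace periods */
static struct toy_cache toy_caches[TOY_NR_CPUS];

static struct toy_cache *toy_cache_for(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	return &toy_caches[cpu];
}

/* Put a fresh object on a CPU's free list. */
static void toy_seed(int cpu, struct toy_obj *o)
{
	struct toy_cache *c = toy_cache_for(cpu);

	o->next = c->free_list;
	c->free_list = o;
}

/* Allocate only from the free list, never from the waiting list. */
static struct toy_obj *toy_alloc(int cpu)
{
	struct toy_cache *c = toy_cache_for(cpu);
	struct toy_obj *o = c->free_list;

	if (o)
		c->free_list = o->next;
	return o;
}

/* A freed object is parked until a simulated grace period has passed. */
static void toy_free(int cpu, struct toy_obj *o)
{
	struct toy_cache *c = toy_cache_for(cpu);

	o->freed_gp = toy_gp_seq;
	o->next = c->wait_list;
	c->wait_list = o;
}

/* Simulate the end of one grace period. */
static void toy_grace_period_elapsed(void)
{
	toy_gp_seq++;
}

/* Move objects whose grace period has elapsed back to the free list. */
static int toy_reclaim(int cpu)
{
	struct toy_cache *c = toy_cache_for(cpu);
	struct toy_obj **pp = &c->wait_list;
	int moved = 0;

	while (*pp) {
		struct toy_obj *o = *pp;

		if (o->freed_gp < toy_gp_seq) {
			*pp = o->next;		/* unlink from wait list */
			o->next = c->free_list;
			c->free_list = o;
			moved++;
		} else {
			pp = &o->next;
		}
	}
	return moved;
}
```

In this model an object freed on a CPU cannot be returned by toy_alloc()
until toy_grace_period_elapsed() has been called and toy_reclaim() has run
on that CPU, which is the property the series is after. Shortening the
grace period shrinks the waiting list, which is the intuition behind the
peak memory footprint point above.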

Please see individual patches for more details. As usual, comments and
suggestions are always welcome.

Change Log:
v5:
  * remove prepare_reuse_head and prepare_reuse_tail
  * use 32 as both low_watermark and high_watermark
  * use per-cpu list for reusable list and freeing list
  * use multiple RCU callbacks to do object reuse
  * remove *_tail for all lists
  * use rcu_momentary_dyntick_idle() to shorten RCU grace period

v4: https://lore.kernel.org/bpf/20230606035310.4026145-1-houtao@huaweicloud.com/
 * no kworker (Alexei)
 * Use a global reusable list in bpf memory allocator (Alexei)
 * Remove BPF_MA_FREE_AFTER_RCU_GP flag and do reuse-after-rcu-gp
   by default in bpf memory allocator (Alexei)
 * add benchmark results from map_perf_test (Alexei)

v3: https://lore.kernel.org/bpf/20230429101215.111262-1-houtao@huaweicloud.com/
 * add BPF_MA_FREE_AFTER_RCU_GP flag to bpf memory allocator
 * Update htab memory benchmark
   * move the benchmark patch to the last patch
   * remove array and useless bpf_map_lookup_elem(&array, ...) in bpf
     programs
   * add synchronization between addition CPU and deletion CPU for
     add_del_on_diff_cpu case to prevent unnecessary loop
   * add the benchmark result for "extra call_rcu + bpf ma"

v2: https://lore.kernel.org/bpf/20230408141846.1878768-1-houtao@huaweicloud.com/
 * add a benchmark for bpf memory allocator to compare different
   flavors of bpf memory allocator.
 * implement BPF_MA_REUSE_AFTER_RCU_GP for bpf memory allocator.

v1: https://lore.kernel.org/bpf/20221230041151.1231169-1-houtao@huaweicloud.com/
 
Hou Tao (2):
  bpf: Only reuse after one RCU GP in bpf memory allocator
  bpf: Call rcu_momentary_dyntick_idle() in task work periodically

 kernel/bpf/memalloc.c | 371 ++++++++++++++++++++++++++++--------------
 1 file changed, 250 insertions(+), 121 deletions(-)