[14/16] mm/percpu: optimize pcpu_alloc_area()

Message ID 20220718192844.1805158-15-yury.norov@gmail.com (mailing list archive)
State New
Series Introduce DEBUG_BITMAP config option and bitmap_check_params()

Commit Message

Yury Norov July 18, 2022, 7:28 p.m. UTC
Don't call bitmap_clear() to clear 0 bits.

bitmap_clear() handles 0-length requests correctly, but such requests are
not covered by its compile-time optimizations and fall through to the
out-of-line __bitmap_clear(). So we pay for a function call plus prologue
work for nothing.

Caught with CONFIG_DEBUG_BITMAP:
[   45.571799]  <TASK>
[   45.571801]  pcpu_alloc_area+0x194/0x340
[   45.571806]  pcpu_alloc+0x2fb/0x8b0
[   45.571811]  ? kmem_cache_alloc_trace+0x177/0x2a0
[   45.571815]  __percpu_counter_init+0x22/0xa0
[   45.571819]  fprop_local_init_percpu+0x14/0x30
[   45.571823]  wb_get_create+0x15d/0x5f0
[   45.571828]  cleanup_offline_cgwb+0x73/0x210
[   45.571831]  cleanup_offline_cgwbs_workfn+0xcf/0x200
[   45.571835]  process_one_work+0x1e5/0x3b0
[   45.571839]  worker_thread+0x50/0x3a0
[   45.571843]  ? rescuer_thread+0x390/0x390
[   45.571846]  kthread+0xe8/0x110
[   45.571849]  ? kthread_complete_and_exit+0x20/0x20
[   45.571853]  ret_from_fork+0x22/0x30
[   45.571858]  </TASK>
[   45.571859] ---[ end trace 0000000000000000 ]---
[   45.571860] b1:		ffffa8d5002e1000
[   45.571861] b2:		0
[   45.571861] b3:		0
[   45.571862] nbits:	44638
[   45.571863] start:	44638
[   45.571864] off:	0
[   45.571864] percpu: Bitmap: parameters check failed
[   45.571865] percpu: include/linux/bitmap.h [538]: bitmap_clear

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 mm/percpu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Dennis Zhou July 19, 2022, 4:25 a.m. UTC | #1
Hello,

On Mon, Jul 18, 2022 at 12:28:42PM -0700, Yury Norov wrote:
> Don't call bitmap_clear() to clear 0 bits.
> 
> bitmap_clear() handles 0-length requests correctly, but such requests are
> not covered by its compile-time optimizations and fall through to the
> out-of-line __bitmap_clear(). So we pay for a function call plus prologue
> work for nothing.
> 
> Caught with CONFIG_DEBUG_BITMAP:
> [   45.571799]  <TASK>
> [   45.571801]  pcpu_alloc_area+0x194/0x340
> [   45.571806]  pcpu_alloc+0x2fb/0x8b0
> [   45.571811]  ? kmem_cache_alloc_trace+0x177/0x2a0
> [   45.571815]  __percpu_counter_init+0x22/0xa0
> [   45.571819]  fprop_local_init_percpu+0x14/0x30
> [   45.571823]  wb_get_create+0x15d/0x5f0
> [   45.571828]  cleanup_offline_cgwb+0x73/0x210
> [   45.571831]  cleanup_offline_cgwbs_workfn+0xcf/0x200
> [   45.571835]  process_one_work+0x1e5/0x3b0
> [   45.571839]  worker_thread+0x50/0x3a0
> [   45.571843]  ? rescuer_thread+0x390/0x390
> [   45.571846]  kthread+0xe8/0x110
> [   45.571849]  ? kthread_complete_and_exit+0x20/0x20
> [   45.571853]  ret_from_fork+0x22/0x30
> [   45.571858]  </TASK>
> [   45.571859] ---[ end trace 0000000000000000 ]---
> [   45.571860] b1:		ffffa8d5002e1000
> [   45.571861] b2:		0
> [   45.571861] b3:		0
> [   45.571862] nbits:	44638
> [   45.571863] start:	44638
> [   45.571864] off:	0
> [   45.571864] percpu: Bitmap: parameters check failed
> [   45.571865] percpu: include/linux/bitmap.h [538]: bitmap_clear
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  mm/percpu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/percpu.c b/mm/percpu.c
> index 3633eeefaa0d..f720f7c36b91 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -1239,7 +1239,8 @@ static int pcpu_alloc_area(struct pcpu_chunk *chunk, int alloc_bits,
>  
>  	/* update boundary map */
>  	set_bit(bit_off, chunk->bound_map);
> -	bitmap_clear(chunk->bound_map, bit_off + 1, alloc_bits - 1);
> +	if (alloc_bits > 1)
> +		bitmap_clear(chunk->bound_map, bit_off + 1, alloc_bits - 1);
>  	set_bit(bit_off + alloc_bits, chunk->bound_map);
>  
>  	chunk->free_bytes -= alloc_bits * PCPU_MIN_ALLOC_SIZE;
> -- 
> 2.34.1
> 

Acked-by: Dennis Zhou <dennis@kernel.org>

Thanks,
Dennis
Patch

diff --git a/mm/percpu.c b/mm/percpu.c
index 3633eeefaa0d..f720f7c36b91 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1239,7 +1239,8 @@  static int pcpu_alloc_area(struct pcpu_chunk *chunk, int alloc_bits,
 
 	/* update boundary map */
 	set_bit(bit_off, chunk->bound_map);
-	bitmap_clear(chunk->bound_map, bit_off + 1, alloc_bits - 1);
+	if (alloc_bits > 1)
+		bitmap_clear(chunk->bound_map, bit_off + 1, alloc_bits - 1);
 	set_bit(bit_off + alloc_bits, chunk->bound_map);
 
 	chunk->free_bytes -= alloc_bits * PCPU_MIN_ALLOC_SIZE;
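
The effect of the guard can be modelled in userspace. The following is only a sketch under stated assumptions, not the kernel implementation: toy_set_bit(), toy_test_bit(), toy_bitmap_clear_slow(), toy_update_bound_map() and the slow_calls counter are hypothetical names that stand in for set_bit(), test_bit(), the out-of-line __bitmap_clear() and the boundary map update in pcpu_alloc_area(). It shows that with alloc_bits == 1 the guarded update never reaches the out-of-line helper, while the resulting boundary map is the same.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical counter: how often the out-of-line helper is reached. */
static unsigned int slow_calls;

/* Stand-in for the out-of-line __bitmap_clear(); clears bit by bit. */
static void toy_bitmap_clear_slow(unsigned long *map, unsigned int start,
				  unsigned int nbits)
{
	slow_calls++;
	for (unsigned int i = start; i < start + nbits; i++)
		map[i / BITS_PER_LONG] &= ~(1UL << (i % BITS_PER_LONG));
}

/* Non-atomic stand-in for set_bit(). */
static void toy_set_bit(unsigned int nr, unsigned long *map)
{
	map[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

/* Stand-in for test_bit(); returns 0 or 1. */
static int toy_test_bit(unsigned int nr, const unsigned long *map)
{
	return (map[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1;
}

/* Models the patched boundary map update from the hunk above. */
static void toy_update_bound_map(unsigned long *bound_map,
				 unsigned int bit_off, unsigned int alloc_bits)
{
	assert(alloc_bits >= 1);

	toy_set_bit(bit_off, bound_map);
	/* Only clear the interior when there is an interior to clear. */
	if (alloc_bits > 1)
		toy_bitmap_clear_slow(bound_map, bit_off + 1, alloc_bits - 1);
	toy_set_bit(bit_off + alloc_bits, bound_map);
}
```

Calling toy_update_bound_map(map, 3, 1) sets bits 3 and 4 and leaves slow_calls untouched; a 4-bit allocation at offset 8 sets bits 8 and 12, clears bits 9 through 11, and reaches the helper once.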