diff mbox series

[08/16] smp: optimize smp_call_function_many_cond() for more

Message ID 20220718192844.1805158-9-yury.norov@gmail.com (mailing list archive)
State Not Applicable
Delegated to: Netdev Maintainers
Headers show
Series Introduce DEBUG_BITMAP config option and bitmap_check_params() | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Yury Norov July 18, 2022, 7:28 p.m. UTC
smp_call_function_many_cond() is often passed with cpu_online_mask.
If it's the case, we can use cpumask_copy instead of cpumask_and, which
is faster.

Caught with CONFIG_DEBUG_BITMAP:
[    7.830337] Call trace:
[    7.830397]  __bitmap_check_params+0x1d8/0x260
[    7.830499]  smp_call_function_many_cond+0x1e8/0x45c
[    7.830607]  kick_all_cpus_sync+0x44/0x80
[    7.830698]  bpf_int_jit_compile+0x34c/0x5cc
[    7.830796]  bpf_prog_select_runtime+0x118/0x190
[    7.830900]  bpf_prepare_filter+0x3dc/0x51c
[    7.830995]  __get_filter+0xd4/0x170
[    7.831145]  sk_attach_filter+0x18/0xb0
[    7.831236]  sock_setsockopt+0x5b0/0x1214
[    7.831330]  __sys_setsockopt+0x144/0x170
[    7.831431]  __arm64_sys_setsockopt+0x2c/0x40
[    7.831541]  invoke_syscall+0x48/0x114
[    7.831634]  el0_svc_common.constprop.0+0x44/0xfc
[    7.831745]  do_el0_svc+0x30/0xc0
[    7.831825]  el0_svc+0x2c/0x84
[    7.831899]  el0t_64_sync_handler+0xbc/0x140
[    7.831999]  el0t_64_sync+0x18c/0x190
[    7.832086] ---[ end trace 0000000000000000 ]---
[    7.832375] b1:		ffff24d1ffd98a48
[    7.832385] b2:		ffffa65533a29a38
[    7.832393] b3:		ffffa65533a29a38
[    7.832400] nbits:	256
[    7.832407] start:	0
[    7.832412] off:	0
[    7.832418] smp: Bitmap: parameters check failed
[    7.832432] smp: include/linux/bitmap.h [363]: bitmap_and

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 kernel/smp.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Peter Zijlstra July 18, 2022, 9:29 p.m. UTC | #1
On Mon, Jul 18, 2022 at 12:28:36PM -0700, Yury Norov wrote:

> ---
>  kernel/smp.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 7ed2b9b12f74..f96fdf944b4a 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -942,7 +942,11 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
>  
>  	if (run_remote) {
>  		cfd = this_cpu_ptr(&cfd_data);
> -		cpumask_and(cfd->cpumask, mask, cpu_online_mask);
> +		if (mask == cpu_online_mask)
> +			cpumask_copy(cfd->cpumask, cpu_online_mask);
> +		else
> +			cpumask_and(cfd->cpumask, mask, cpu_online_mask);
> +

Or... you could optimize cpumask_and() to detect the src1p == src2p case?

>  		__cpumask_clear_cpu(this_cpu, cfd->cpumask);
>  
>  		cpumask_clear(cfd->cpumask_ipi);
> -- 
> 2.34.1
>
Andy Shevchenko July 18, 2022, 9:37 p.m. UTC | #2
On Mon, Jul 18, 2022 at 12:28:36PM -0700, Yury Norov wrote:
> smp_call_function_many_cond() is often passed with cpu_online_mask.
> If it's the case, we can use cpumask_copy instead of cpumask_and, which
> is faster.
> 
> Caught with CONFIG_DEBUG_BITMAP:
> [    7.830337] Call trace:
> [    7.830397]  __bitmap_check_params+0x1d8/0x260
> [    7.830499]  smp_call_function_many_cond+0x1e8/0x45c
> [    7.830607]  kick_all_cpus_sync+0x44/0x80
> [    7.830698]  bpf_int_jit_compile+0x34c/0x5cc
> [    7.830796]  bpf_prog_select_runtime+0x118/0x190
> [    7.830900]  bpf_prepare_filter+0x3dc/0x51c
> [    7.830995]  __get_filter+0xd4/0x170
> [    7.831145]  sk_attach_filter+0x18/0xb0
> [    7.831236]  sock_setsockopt+0x5b0/0x1214
> [    7.831330]  __sys_setsockopt+0x144/0x170
> [    7.831431]  __arm64_sys_setsockopt+0x2c/0x40
> [    7.831541]  invoke_syscall+0x48/0x114
> [    7.831634]  el0_svc_common.constprop.0+0x44/0xfc
> [    7.831745]  do_el0_svc+0x30/0xc0
> [    7.831825]  el0_svc+0x2c/0x84
> [    7.831899]  el0t_64_sync_handler+0xbc/0x140
> [    7.831999]  el0t_64_sync+0x18c/0x190
> [    7.832086] ---[ end trace 0000000000000000 ]---
> [    7.832375] b1:		ffff24d1ffd98a48
> [    7.832385] b2:		ffffa65533a29a38
> [    7.832393] b3:		ffffa65533a29a38
> [    7.832400] nbits:	256
> [    7.832407] start:	0
> [    7.832412] off:	0
> [    7.832418] smp: Bitmap: parameters check failed
> [    7.832432] smp: include/linux/bitmap.h [363]: bitmap_and

Same for long commit message noise.
Yury Norov July 20, 2022, 5:06 p.m. UTC | #3
On Mon, Jul 18, 2022 at 11:29:06PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 18, 2022 at 12:28:36PM -0700, Yury Norov wrote:
> 
> > ---
> >  kernel/smp.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/smp.c b/kernel/smp.c
> > index 7ed2b9b12f74..f96fdf944b4a 100644
> > --- a/kernel/smp.c
> > +++ b/kernel/smp.c
> > @@ -942,7 +942,11 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
> >  
> >  	if (run_remote) {
> >  		cfd = this_cpu_ptr(&cfd_data);
> > -		cpumask_and(cfd->cpumask, mask, cpu_online_mask);
> > +		if (mask == cpu_online_mask)
> > +			cpumask_copy(cfd->cpumask, cpu_online_mask);
> > +		else
> > +			cpumask_and(cfd->cpumask, mask, cpu_online_mask);
> > +
> 
> Or... you could optimize cpumask_and() to detect the src1p == src2p case?

This is not what I would consider as optimization. For vast majority
of users this check is useless because they know for sure that
cpumasks are different.

For this case I can invent something like cpumask_and_check_eq(), so
that there'll be minimal impact on user code. (Suggestions for a better
name are very welcome.)

Thanks,
Yury
diff mbox series

Patch

diff --git a/kernel/smp.c b/kernel/smp.c
index 7ed2b9b12f74..f96fdf944b4a 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -942,7 +942,11 @@  static void smp_call_function_many_cond(const struct cpumask *mask,
 
 	if (run_remote) {
 		cfd = this_cpu_ptr(&cfd_data);
-		cpumask_and(cfd->cpumask, mask, cpu_online_mask);
+		if (mask == cpu_online_mask)
+			cpumask_copy(cfd->cpumask, cpu_online_mask);
+		else
+			cpumask_and(cfd->cpumask, mask, cpu_online_mask);
+
 		__cpumask_clear_cpu(this_cpu, cfd->cpumask);
 
 		cpumask_clear(cfd->cpumask_ipi);