Message ID | 20230317114832.13622-1-laoar.shao@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | [bpf-next] selftests/bpf: Filter out preempt_count_ functions from kprobe_multi bench | expand |
On Fri, Mar 17, 2023 at 4:49 AM Yafang Shao <laoar.shao@gmail.com> wrote: > > It hits below warning on my test machine when running test_progs, > > [ 702.223611] ------------[ cut here ]------------ > [ 702.224168] RCU not on for: preempt_count_sub+0x0/0xa0 > [ 702.224770] WARNING: CPU: 14 PID: 5267 at include/linux/trace_recursion.h:162 fprobe_handler.part.0+0x1b8/0x1c0 > [ 702.231740] CPU: 14 PID: 5267 Comm: main_amd64 Kdump: loaded Tainted: G O 6.2.0+ #584 > [ 702.233169] RIP: 0010:fprobe_handler.part.0+0x1b8/0x1c0 > [ 702.241388] Call Trace: > [ 702.241615] <TASK> > [ 702.241811] fprobe_handler+0x22/0x30 > [ 702.242129] 0xffffffffc04710f7 > [ 702.242417] RIP: 0010:preempt_count_sub+0x5/0xa0 > [ 702.242809] Code: c8 50 68 94 42 0e b5 48 cf e9 f9 fd ff ff 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 e8 4b cd 38 0b <55> 8b 0d 9c d0 cf 02 48 89 e5 85 c9 75 1b 65 8b 05 be 78 f4 4a 89 > [ 702.244752] RSP: 0018:ffffaf6187d27f10 EFLAGS: 00000082 ORIG_RAX: 0000000000000000 > [ 702.245801] RAX: 000000000000000e RBX: 0000000001b6ab72 RCX: 0000000000000000 > [ 702.246804] RDX: 0000000000000000 RSI: ffffffffb627967d RDI: 0000000000000001 > [ 702.247801] RBP: ffffaf6187d27f30 R08: 0000000000000000 R09: 0000000000000000 > [ 702.248786] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000000ca > [ 702.249782] R13: ffffaf6187d27f58 R14: 0000000000000000 R15: 0000000000000000 > [ 702.250785] ? preempt_count_sub+0x5/0xa0 > [ 702.251540] ? syscall_enter_from_user_mode+0x96/0xc0 > [ 702.252368] ? preempt_count_sub+0x5/0xa0 > [ 702.253104] ? syscall_enter_from_user_mode+0x96/0xc0 > [ 702.253918] do_syscall_64+0x16/0x90 > [ 702.254613] entry_SYSCALL_64_after_hwframe+0x72/0xdc > [ 702.255422] RIP: 0033:0x46b793 > > It's caused by bench test attaching kprobe_multi link to preempt_count_sub > function, which is not executed in rcu safe context so the kprobe handler > on top of it will trigger the rcu warning. > > Filtering out preempt_count_ functions from the bench test. > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com> > --- > tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c > index 22be0a9..5561b93 100644 > --- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c > +++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c > @@ -379,6 +379,8 @@ static int get_syms(char ***symsp, size_t *cntp, bool kernel) > if (!strncmp(name, "__ftrace_invalid_address__", > sizeof("__ftrace_invalid_address__") - 1)) > continue; > + if (!strncmp(name, "preempt_count_", strlen("preempt_count_"))) > + continue; > let's add str_has_pfx() helper macro from libbpf_internal.h to test_progs.h and use that instead of repeating each substring twice? Here's what libbpf is doing: /* Check whether a string `str` has prefix `pfx`, regardless if `pfx` is * a string literal known at compilation time or char * pointer known only at * runtime. */ #define str_has_pfx(str, pfx) \ (strncmp(str, pfx, __builtin_constant_p(pfx) ? sizeof(pfx) - 1 : strlen(pfx)) == 0) > err = hashmap__add(map, name, 0); > if (err == -EEXIST) > -- > 1.8.3.1 >
On Fri, Mar 17, 2023 at 4:49 AM Yafang Shao <laoar.shao@gmail.com> wrote: > > It hits below warning on my test machine when running test_progs, > > [ 702.223611] ------------[ cut here ]------------ > [ 702.224168] RCU not on for: preempt_count_sub+0x0/0xa0 > [ 702.224770] WARNING: CPU: 14 PID: 5267 at include/linux/trace_recursion.h:162 fprobe_handler.part.0+0x1b8/0x1c0 > [ 702.231740] CPU: 14 PID: 5267 Comm: main_amd64 Kdump: loaded Tainted: G O 6.2.0+ #584 > [ 702.233169] RIP: 0010:fprobe_handler.part.0+0x1b8/0x1c0 > [ 702.241388] Call Trace: > [ 702.241615] <TASK> > [ 702.241811] fprobe_handler+0x22/0x30 > [ 702.242129] 0xffffffffc04710f7 > [ 702.242417] RIP: 0010:preempt_count_sub+0x5/0xa0 > [ 702.242809] Code: c8 50 68 94 42 0e b5 48 cf e9 f9 fd ff ff 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 e8 4b cd 38 0b <55> 8b 0d 9c d0 cf 02 48 89 e5 85 c9 75 1b 65 8b 05 be 78 f4 4a 89 > [ 702.244752] RSP: 0018:ffffaf6187d27f10 EFLAGS: 00000082 ORIG_RAX: 0000000000000000 > [ 702.245801] RAX: 000000000000000e RBX: 0000000001b6ab72 RCX: 0000000000000000 > [ 702.246804] RDX: 0000000000000000 RSI: ffffffffb627967d RDI: 0000000000000001 > [ 702.247801] RBP: ffffaf6187d27f30 R08: 0000000000000000 R09: 0000000000000000 > [ 702.248786] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000000ca > [ 702.249782] R13: ffffaf6187d27f58 R14: 0000000000000000 R15: 0000000000000000 > [ 702.250785] ? preempt_count_sub+0x5/0xa0 > [ 702.251540] ? syscall_enter_from_user_mode+0x96/0xc0 > [ 702.252368] ? preempt_count_sub+0x5/0xa0 > [ 702.253104] ? syscall_enter_from_user_mode+0x96/0xc0 > [ 702.253918] do_syscall_64+0x16/0x90 > [ 702.254613] entry_SYSCALL_64_after_hwframe+0x72/0xdc > [ 702.255422] RIP: 0033:0x46b793 > > It's caused by bench test attaching kprobe_multi link to preempt_count_sub > function, which is not executed in rcu safe context so the kprobe handler > on top of it will trigger the rcu warning. Why is that? preempt_count itself is fine. The problem is elsewhere. Since !rcu_is_watching() it some sort of idle or some other issue.
On Sat, Mar 18, 2023 at 12:52 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Fri, Mar 17, 2023 at 4:49 AM Yafang Shao <laoar.shao@gmail.com> wrote: > > > > It hits below warning on my test machine when running test_progs, > > > > [ 702.223611] ------------[ cut here ]------------ > > [ 702.224168] RCU not on for: preempt_count_sub+0x0/0xa0 > > [ 702.224770] WARNING: CPU: 14 PID: 5267 at include/linux/trace_recursion.h:162 fprobe_handler.part.0+0x1b8/0x1c0 > > [ 702.231740] CPU: 14 PID: 5267 Comm: main_amd64 Kdump: loaded Tainted: G O 6.2.0+ #584 > > [ 702.233169] RIP: 0010:fprobe_handler.part.0+0x1b8/0x1c0 > > [ 702.241388] Call Trace: > > [ 702.241615] <TASK> > > [ 702.241811] fprobe_handler+0x22/0x30 > > [ 702.242129] 0xffffffffc04710f7 > > [ 702.242417] RIP: 0010:preempt_count_sub+0x5/0xa0 > > [ 702.242809] Code: c8 50 68 94 42 0e b5 48 cf e9 f9 fd ff ff 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 e8 4b cd 38 0b <55> 8b 0d 9c d0 cf 02 48 89 e5 85 c9 75 1b 65 8b 05 be 78 f4 4a 89 > > [ 702.244752] RSP: 0018:ffffaf6187d27f10 EFLAGS: 00000082 ORIG_RAX: 0000000000000000 > > [ 702.245801] RAX: 000000000000000e RBX: 0000000001b6ab72 RCX: 0000000000000000 > > [ 702.246804] RDX: 0000000000000000 RSI: ffffffffb627967d RDI: 0000000000000001 > > [ 702.247801] RBP: ffffaf6187d27f30 R08: 0000000000000000 R09: 0000000000000000 > > [ 702.248786] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000000ca > > [ 702.249782] R13: ffffaf6187d27f58 R14: 0000000000000000 R15: 0000000000000000 > > [ 702.250785] ? preempt_count_sub+0x5/0xa0 > > [ 702.251540] ? syscall_enter_from_user_mode+0x96/0xc0 > > [ 702.252368] ? preempt_count_sub+0x5/0xa0 > > [ 702.253104] ? syscall_enter_from_user_mode+0x96/0xc0 > > [ 702.253918] do_syscall_64+0x16/0x90 > > [ 702.254613] entry_SYSCALL_64_after_hwframe+0x72/0xdc > > [ 702.255422] RIP: 0033:0x46b793 > > > > It's caused by bench test attaching kprobe_multi link to preempt_count_sub > > function, which is not executed in rcu safe context so the kprobe handler > > on top of it will trigger the rcu warning. > > Why is that? It is caused by CONFIG_CONTEXT_TRACKING_USER=y, and it seems the preempt_count_sub is executed before the RCU is watching. user_exit_irqoff if (context_tracking_enabled()) // CONFIG_CONTEXT_TRACKING_USER=y __ct_user_exit(CONTEXT_USER); ct_kernel_enter ... // RCU is not watching here ... ct_kernel_enter_state(offset); // ... but is watching here. It can be reproduced with a simple bpf code as follows when CONFIG_CONTEXT_TRACKING_USER=y, SEC("kprobe.multi/preempt_count_sub") int kprobe_multi_trace() { return 0; } > preempt_count itself is fine. > The problem is elsewhere. > Since !rcu_is_watching() it some sort of idle or some other issue. Not sure if we need to improve the code under CONFIG_CONTEXT_TRACKING_USER=y, but it seems skipping "preempt_count_" in kprobe_multi test case would be a quick fix.
On Sat, Mar 18, 2023 at 12:41 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Fri, Mar 17, 2023 at 4:49 AM Yafang Shao <laoar.shao@gmail.com> wrote: > > > > It hits below warning on my test machine when running test_progs, > > > > [ 702.223611] ------------[ cut here ]------------ > > [ 702.224168] RCU not on for: preempt_count_sub+0x0/0xa0 > > [ 702.224770] WARNING: CPU: 14 PID: 5267 at include/linux/trace_recursion.h:162 fprobe_handler.part.0+0x1b8/0x1c0 > > [ 702.231740] CPU: 14 PID: 5267 Comm: main_amd64 Kdump: loaded Tainted: G O 6.2.0+ #584 > > [ 702.233169] RIP: 0010:fprobe_handler.part.0+0x1b8/0x1c0 > > [ 702.241388] Call Trace: > > [ 702.241615] <TASK> > > [ 702.241811] fprobe_handler+0x22/0x30 > > [ 702.242129] 0xffffffffc04710f7 > > [ 702.242417] RIP: 0010:preempt_count_sub+0x5/0xa0 > > [ 702.242809] Code: c8 50 68 94 42 0e b5 48 cf e9 f9 fd ff ff 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 e8 4b cd 38 0b <55> 8b 0d 9c d0 cf 02 48 89 e5 85 c9 75 1b 65 8b 05 be 78 f4 4a 89 > > [ 702.244752] RSP: 0018:ffffaf6187d27f10 EFLAGS: 00000082 ORIG_RAX: 0000000000000000 > > [ 702.245801] RAX: 000000000000000e RBX: 0000000001b6ab72 RCX: 0000000000000000 > > [ 702.246804] RDX: 0000000000000000 RSI: ffffffffb627967d RDI: 0000000000000001 > > [ 702.247801] RBP: ffffaf6187d27f30 R08: 0000000000000000 R09: 0000000000000000 > > [ 702.248786] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000000ca > > [ 702.249782] R13: ffffaf6187d27f58 R14: 0000000000000000 R15: 0000000000000000 > > [ 702.250785] ? preempt_count_sub+0x5/0xa0 > > [ 702.251540] ? syscall_enter_from_user_mode+0x96/0xc0 > > [ 702.252368] ? preempt_count_sub+0x5/0xa0 > > [ 702.253104] ? syscall_enter_from_user_mode+0x96/0xc0 > > [ 702.253918] do_syscall_64+0x16/0x90 > > [ 702.254613] entry_SYSCALL_64_after_hwframe+0x72/0xdc > > [ 702.255422] RIP: 0033:0x46b793 > > > > It's caused by bench test attaching kprobe_multi link to preempt_count_sub > > function, which is not executed in rcu safe context so the kprobe handler > > on top of it will trigger the rcu warning. > > > > Filtering out preempt_count_ functions from the bench test. > > > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com> > > --- > > tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c > > index 22be0a9..5561b93 100644 > > --- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c > > +++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c > > @@ -379,6 +379,8 @@ static int get_syms(char ***symsp, size_t *cntp, bool kernel) > > if (!strncmp(name, "__ftrace_invalid_address__", > > sizeof("__ftrace_invalid_address__") - 1)) > > continue; > > + if (!strncmp(name, "preempt_count_", strlen("preempt_count_"))) > > + continue; > > > > let's add str_has_pfx() helper macro from libbpf_internal.h to > test_progs.h and use that instead of repeating each substring twice? > Thanks for the suggestion. > Here's what libbpf is doing: > > /* Check whether a string `str` has prefix `pfx`, regardless if `pfx` is > * a string literal known at compilation time or char * pointer known only at > * runtime. > */ > #define str_has_pfx(str, pfx) \ > (strncmp(str, pfx, __builtin_constant_p(pfx) ? sizeof(pfx) - 1 > : strlen(pfx)) == 0) > > > > err = hashmap__add(map, name, 0); > > if (err == -EEXIST) > > -- > > 1.8.3.1 > >
On Sun, Mar 19, 2023 at 03:36:57PM +0800, Yafang Shao wrote: > On Sat, Mar 18, 2023 at 12:52 AM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > > > On Fri, Mar 17, 2023 at 4:49 AM Yafang Shao <laoar.shao@gmail.com> wrote: > > > > > > It hits below warning on my test machine when running test_progs, > > > > > > [ 702.223611] ------------[ cut here ]------------ > > > [ 702.224168] RCU not on for: preempt_count_sub+0x0/0xa0 > > > [ 702.224770] WARNING: CPU: 14 PID: 5267 at include/linux/trace_recursion.h:162 fprobe_handler.part.0+0x1b8/0x1c0 > > > [ 702.231740] CPU: 14 PID: 5267 Comm: main_amd64 Kdump: loaded Tainted: G O 6.2.0+ #584 > > > [ 702.233169] RIP: 0010:fprobe_handler.part.0+0x1b8/0x1c0 > > > [ 702.241388] Call Trace: > > > [ 702.241615] <TASK> > > > [ 702.241811] fprobe_handler+0x22/0x30 > > > [ 702.242129] 0xffffffffc04710f7 > > > [ 702.242417] RIP: 0010:preempt_count_sub+0x5/0xa0 > > > [ 702.242809] Code: c8 50 68 94 42 0e b5 48 cf e9 f9 fd ff ff 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 e8 4b cd 38 0b <55> 8b 0d 9c d0 cf 02 48 89 e5 85 c9 75 1b 65 8b 05 be 78 f4 4a 89 > > > [ 702.244752] RSP: 0018:ffffaf6187d27f10 EFLAGS: 00000082 ORIG_RAX: 0000000000000000 > > > [ 702.245801] RAX: 000000000000000e RBX: 0000000001b6ab72 RCX: 0000000000000000 > > > [ 702.246804] RDX: 0000000000000000 RSI: ffffffffb627967d RDI: 0000000000000001 > > > [ 702.247801] RBP: ffffaf6187d27f30 R08: 0000000000000000 R09: 0000000000000000 > > > [ 702.248786] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000000ca > > > [ 702.249782] R13: ffffaf6187d27f58 R14: 0000000000000000 R15: 0000000000000000 > > > [ 702.250785] ? preempt_count_sub+0x5/0xa0 > > > [ 702.251540] ? syscall_enter_from_user_mode+0x96/0xc0 > > > [ 702.252368] ? preempt_count_sub+0x5/0xa0 > > > [ 702.253104] ? syscall_enter_from_user_mode+0x96/0xc0 > > > [ 702.253918] do_syscall_64+0x16/0x90 > > > [ 702.254613] entry_SYSCALL_64_after_hwframe+0x72/0xdc > > > [ 702.255422] RIP: 0033:0x46b793 > > > > > > It's caused by bench test attaching kprobe_multi link to preempt_count_sub > > > function, which is not executed in rcu safe context so the kprobe handler > > > on top of it will trigger the rcu warning. > > > > Why is that? > > It is caused by CONFIG_CONTEXT_TRACKING_USER=y, and it seems the > preempt_count_sub is executed before the RCU is watching. > user_exit_irqoff > if (context_tracking_enabled()) // CONFIG_CONTEXT_TRACKING_USER=y > __ct_user_exit(CONTEXT_USER); > ct_kernel_enter > ... > // RCU is not watching here ... > ct_kernel_enter_state(offset); > // ... but is watching here. > > It can be reproduced with a simple bpf code as follows when > CONFIG_CONTEXT_TRACKING_USER=y, > SEC("kprobe.multi/preempt_count_sub") > int kprobe_multi_trace() > { > return 0; > } > > > preempt_count itself is fine. > > The problem is elsewhere. > > Since !rcu_is_watching() it some sort of idle or some other issue. > > Not sure if we need to improve the code under > CONFIG_CONTEXT_TRACKING_USER=y, but it seems skipping "preempt_count_" > in kprobe_multi test case would be a quick fix. It's not a fix. Only moving a goal post. We probably need if (!rcu_is_watching()) return; in [kf]probe handler instead.
On Sun, Mar 19, 2023 at 09:54:56AM -0700, Alexei Starovoitov wrote: > On Sun, Mar 19, 2023 at 03:36:57PM +0800, Yafang Shao wrote: > > On Sat, Mar 18, 2023 at 12:52 AM Alexei Starovoitov > > <alexei.starovoitov@gmail.com> wrote: > > > > > > On Fri, Mar 17, 2023 at 4:49 AM Yafang Shao <laoar.shao@gmail.com> wrote: > > > > > > > > It hits below warning on my test machine when running test_progs, > > > > > > > > [ 702.223611] ------------[ cut here ]------------ > > > > [ 702.224168] RCU not on for: preempt_count_sub+0x0/0xa0 > > > > [ 702.224770] WARNING: CPU: 14 PID: 5267 at include/linux/trace_recursion.h:162 fprobe_handler.part.0+0x1b8/0x1c0 > > > > [ 702.231740] CPU: 14 PID: 5267 Comm: main_amd64 Kdump: loaded Tainted: G O 6.2.0+ #584 > > > > [ 702.233169] RIP: 0010:fprobe_handler.part.0+0x1b8/0x1c0 > > > > [ 702.241388] Call Trace: > > > > [ 702.241615] <TASK> > > > > [ 702.241811] fprobe_handler+0x22/0x30 > > > > [ 702.242129] 0xffffffffc04710f7 > > > > [ 702.242417] RIP: 0010:preempt_count_sub+0x5/0xa0 > > > > [ 702.242809] Code: c8 50 68 94 42 0e b5 48 cf e9 f9 fd ff ff 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 e8 4b cd 38 0b <55> 8b 0d 9c d0 cf 02 48 89 e5 85 c9 75 1b 65 8b 05 be 78 f4 4a 89 > > > > [ 702.244752] RSP: 0018:ffffaf6187d27f10 EFLAGS: 00000082 ORIG_RAX: 0000000000000000 > > > > [ 702.245801] RAX: 000000000000000e RBX: 0000000001b6ab72 RCX: 0000000000000000 > > > > [ 702.246804] RDX: 0000000000000000 RSI: ffffffffb627967d RDI: 0000000000000001 > > > > [ 702.247801] RBP: ffffaf6187d27f30 R08: 0000000000000000 R09: 0000000000000000 > > > > [ 702.248786] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000000ca > > > > [ 702.249782] R13: ffffaf6187d27f58 R14: 0000000000000000 R15: 0000000000000000 > > > > [ 702.250785] ? preempt_count_sub+0x5/0xa0 > > > > [ 702.251540] ? syscall_enter_from_user_mode+0x96/0xc0 > > > > [ 702.252368] ? preempt_count_sub+0x5/0xa0 > > > > [ 702.253104] ? syscall_enter_from_user_mode+0x96/0xc0 > > > > [ 702.253918] do_syscall_64+0x16/0x90 > > > > [ 702.254613] entry_SYSCALL_64_after_hwframe+0x72/0xdc > > > > [ 702.255422] RIP: 0033:0x46b793 > > > > > > > > It's caused by bench test attaching kprobe_multi link to preempt_count_sub > > > > function, which is not executed in rcu safe context so the kprobe handler > > > > on top of it will trigger the rcu warning. > > > > > > Why is that? > > > > It is caused by CONFIG_CONTEXT_TRACKING_USER=y, and it seems the > > preempt_count_sub is executed before the RCU is watching. > > user_exit_irqoff > > if (context_tracking_enabled()) // CONFIG_CONTEXT_TRACKING_USER=y > > __ct_user_exit(CONTEXT_USER); > > ct_kernel_enter > > ... > > // RCU is not watching here ... > > ct_kernel_enter_state(offset); > > // ... but is watching here. > > > > It can be reproduced with a simple bpf code as follows when > > CONFIG_CONTEXT_TRACKING_USER=y, > > SEC("kprobe.multi/preempt_count_sub") > > int kprobe_multi_trace() > > { > > return 0; > > } > > > > > preempt_count itself is fine. > > > The problem is elsewhere. > > > Since !rcu_is_watching() it some sort of idle or some other issue. > > > > Not sure if we need to improve the code under > > CONFIG_CONTEXT_TRACKING_USER=y, but it seems skipping "preempt_count_" > > in kprobe_multi test case would be a quick fix. > > It's not a fix. Only moving a goal post. > We probably need > if (!rcu_is_watching()) > return; > in [kf]probe handler instead. that might also help with some of the warnings we saw from idle functions that we skip at the moment jirka
On Mon, Mar 20, 2023 at 12:55 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Sun, Mar 19, 2023 at 03:36:57PM +0800, Yafang Shao wrote: > > On Sat, Mar 18, 2023 at 12:52 AM Alexei Starovoitov > > <alexei.starovoitov@gmail.com> wrote: > > > > > > On Fri, Mar 17, 2023 at 4:49 AM Yafang Shao <laoar.shao@gmail.com> wrote: > > > > > > > > It hits below warning on my test machine when running test_progs, > > > > > > > > [ 702.223611] ------------[ cut here ]------------ > > > > [ 702.224168] RCU not on for: preempt_count_sub+0x0/0xa0 > > > > [ 702.224770] WARNING: CPU: 14 PID: 5267 at include/linux/trace_recursion.h:162 fprobe_handler.part.0+0x1b8/0x1c0 > > > > [ 702.231740] CPU: 14 PID: 5267 Comm: main_amd64 Kdump: loaded Tainted: G O 6.2.0+ #584 > > > > [ 702.233169] RIP: 0010:fprobe_handler.part.0+0x1b8/0x1c0 > > > > [ 702.241388] Call Trace: > > > > [ 702.241615] <TASK> > > > > [ 702.241811] fprobe_handler+0x22/0x30 > > > > [ 702.242129] 0xffffffffc04710f7 > > > > [ 702.242417] RIP: 0010:preempt_count_sub+0x5/0xa0 > > > > [ 702.242809] Code: c8 50 68 94 42 0e b5 48 cf e9 f9 fd ff ff 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 e8 4b cd 38 0b <55> 8b 0d 9c d0 cf 02 48 89 e5 85 c9 75 1b 65 8b 05 be 78 f4 4a 89 > > > > [ 702.244752] RSP: 0018:ffffaf6187d27f10 EFLAGS: 00000082 ORIG_RAX: 0000000000000000 > > > > [ 702.245801] RAX: 000000000000000e RBX: 0000000001b6ab72 RCX: 0000000000000000 > > > > [ 702.246804] RDX: 0000000000000000 RSI: ffffffffb627967d RDI: 0000000000000001 > > > > [ 702.247801] RBP: ffffaf6187d27f30 R08: 0000000000000000 R09: 0000000000000000 > > > > [ 702.248786] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000000ca > > > > [ 702.249782] R13: ffffaf6187d27f58 R14: 0000000000000000 R15: 0000000000000000 > > > > [ 702.250785] ? preempt_count_sub+0x5/0xa0 > > > > [ 702.251540] ? syscall_enter_from_user_mode+0x96/0xc0 > > > > [ 702.252368] ? preempt_count_sub+0x5/0xa0 > > > > [ 702.253104] ? syscall_enter_from_user_mode+0x96/0xc0 > > > > [ 702.253918] do_syscall_64+0x16/0x90 > > > > [ 702.254613] entry_SYSCALL_64_after_hwframe+0x72/0xdc > > > > [ 702.255422] RIP: 0033:0x46b793 > > > > > > > > It's caused by bench test attaching kprobe_multi link to preempt_count_sub > > > > function, which is not executed in rcu safe context so the kprobe handler > > > > on top of it will trigger the rcu warning. > > > > > > Why is that? > > > > It is caused by CONFIG_CONTEXT_TRACKING_USER=y, and it seems the > > preempt_count_sub is executed before the RCU is watching. > > user_exit_irqoff > > if (context_tracking_enabled()) // CONFIG_CONTEXT_TRACKING_USER=y > > __ct_user_exit(CONTEXT_USER); > > ct_kernel_enter > > ... > > // RCU is not watching here ... > > ct_kernel_enter_state(offset); > > // ... but is watching here. > > > > It can be reproduced with a simple bpf code as follows when > > CONFIG_CONTEXT_TRACKING_USER=y, > > SEC("kprobe.multi/preempt_count_sub") > > int kprobe_multi_trace() > > { > > return 0; > > } > > > > > preempt_count itself is fine. > > > The problem is elsewhere. > > > Since !rcu_is_watching() it some sort of idle or some other issue. > > > > Not sure if we need to improve the code under > > CONFIG_CONTEXT_TRACKING_USER=y, but it seems skipping "preempt_count_" > > in kprobe_multi test case would be a quick fix. > > It's not a fix. Only moving a goal post. > We probably need > if (!rcu_is_watching()) > return; > in [kf]probe handler instead. Good suggestion. I will think about it. BTW, we can't kprobe preempt_count_sub, because it is a nokprobe symbol. NOKPROBE_SYMBOL(preempt_count_sub);
diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c index 22be0a9..5561b93 100644 --- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c +++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c @@ -379,6 +379,8 @@ static int get_syms(char ***symsp, size_t *cntp, bool kernel) if (!strncmp(name, "__ftrace_invalid_address__", sizeof("__ftrace_invalid_address__") - 1)) continue; + if (!strncmp(name, "preempt_count_", strlen("preempt_count_"))) + continue; err = hashmap__add(map, name, 0); if (err == -EEXIST)
It hits below warning on my test machine when running test_progs, [ 702.223611] ------------[ cut here ]------------ [ 702.224168] RCU not on for: preempt_count_sub+0x0/0xa0 [ 702.224770] WARNING: CPU: 14 PID: 5267 at include/linux/trace_recursion.h:162 fprobe_handler.part.0+0x1b8/0x1c0 [ 702.231740] CPU: 14 PID: 5267 Comm: main_amd64 Kdump: loaded Tainted: G O 6.2.0+ #584 [ 702.233169] RIP: 0010:fprobe_handler.part.0+0x1b8/0x1c0 [ 702.241388] Call Trace: [ 702.241615] <TASK> [ 702.241811] fprobe_handler+0x22/0x30 [ 702.242129] 0xffffffffc04710f7 [ 702.242417] RIP: 0010:preempt_count_sub+0x5/0xa0 [ 702.242809] Code: c8 50 68 94 42 0e b5 48 cf e9 f9 fd ff ff 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 e8 4b cd 38 0b <55> 8b 0d 9c d0 cf 02 48 89 e5 85 c9 75 1b 65 8b 05 be 78 f4 4a 89 [ 702.244752] RSP: 0018:ffffaf6187d27f10 EFLAGS: 00000082 ORIG_RAX: 0000000000000000 [ 702.245801] RAX: 000000000000000e RBX: 0000000001b6ab72 RCX: 0000000000000000 [ 702.246804] RDX: 0000000000000000 RSI: ffffffffb627967d RDI: 0000000000000001 [ 702.247801] RBP: ffffaf6187d27f30 R08: 0000000000000000 R09: 0000000000000000 [ 702.248786] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000000ca [ 702.249782] R13: ffffaf6187d27f58 R14: 0000000000000000 R15: 0000000000000000 [ 702.250785] ? preempt_count_sub+0x5/0xa0 [ 702.251540] ? syscall_enter_from_user_mode+0x96/0xc0 [ 702.252368] ? preempt_count_sub+0x5/0xa0 [ 702.253104] ? syscall_enter_from_user_mode+0x96/0xc0 [ 702.253918] do_syscall_64+0x16/0x90 [ 702.254613] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 702.255422] RIP: 0033:0x46b793 It's caused by bench test attaching kprobe_multi link to preempt_count_sub function, which is not executed in rcu safe context so the kprobe handler on top of it will trigger the rcu warning. Filtering out preempt_count_ functions from the bench test. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> --- tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c | 2 ++ 1 file changed, 2 insertions(+)